Can Sorting Ever Be Done in Time Proportional to n?

In summary, the conversation discusses the concept of "naive algorithms" and their time complexity in solving problems, specifically in the case of sorting. There is a question about whether notions such as "naive algorithm" can be made rigorous in general terms, and whether it is possible to prove lower bounds on the time complexity of sorting. It is noted that there is a significant difference between the time complexities of ##O(n\cdot \log(n))## and ##O(n^2)## in practical applications.
  • #1
SSequence
The question(s) here can be asked for many problems (and not just sorting).

Let's consider the standard case of sorting an array of size ##n## containing numbers. If we consider a naive algorithm, its time would be proportional to ##n^2##. For example, the naive algorithm could proceed with two loops. In the outer loop we select the ##i##-th element and in the inner loop we assign a "rank" to that element via a counter (and the rank corresponds exactly to where we want to place that element in the sorted array). For example, if we adopt the convention of starting the array at index ##1##, then the smallest element of the array will get the rank ##1## (because the counter gets initialised to ##1## but is never incremented). The largest element will get the rank ##n## (because the counter keeps getting incremented at each comparison after being initialised).

For example, here is the pseudo-code for doing that:
C:
/* For each A[i], count how many elements of A are smaller than it;
   that count (starting from 1) is exactly A[i]'s position in the
   sorted array B. Arrays are 1-based and all values are assumed distinct. */
for (int i = 1; i <= n; i++) {
    int rank = 1;
    for (int j = 1; j <= n; j++) {
        if (A[i] > A[j])
            rank = rank + 1;
    }
    B[rank] = A[i];
}
Just a couple of points. It is assumed that the array ##B## has been initialised appropriately (based on the conventions of the specific programming language) and is of size ##n##. The array ##A## is given as input. Further, I have assumed all numbers in the array ##A## to be distinct. It is not difficult to see that we can modify the above segment for the more general case (where repetition is allowed), but I just wanted to keep it simple.

Now here are the questions:
(1) A lot of the time, when we are given a typical problem (at least among the ones that seem to be of more interest in CS), there seems to be a clear worst-case time associated with the naive or exhaustive algorithm. For example, in the case of sorting it seems to be ##n^2##. A time of ##n^2## is bad, but it does seem like one would really have to make a deliberate effort to make it any worse.

The question is: can notions such as "naive algorithm" (or "exhausting all possibilities" for a given problem) be made rigorous in reasonably general terms (that is, at least for some classes of problems)?

(2) Can we prove that sorting can never be done in time proportional to ##n##? Can we prove that it can never be done better than ##n \cdot f(n)##, where ##f## is presumably some function which grows very slowly?

From the little I know, proving these kinds of bounds is generally quite difficult. But since these are very well-studied problems, perhaps more is known?
 
  • #2
SSequence said:
Now here are the questions:
(1) A lot of the time, when we are given a typical problem (at least among the ones that seem to be of more interest in CS), there seems to be a clear worst-case time associated with the naive or exhaustive algorithm. For example, in the case of sorting it seems to be ##n^2##. A time of ##n^2## is bad, but it does seem like one would really have to make a deliberate effort to make it any worse.

The question is: can notions such as "naive algorithm" (or "exhausting all possibilities" for a given problem) be made rigorous in reasonably general terms (that is, at least for some classes of problems)?

I'm not sure that "rigorous in reasonably general terms" is well defined...

A lot of the time a "naive algorithm" is the "obvious one" though properly converting ##\text{obvious} \to \text{rigorous}## seems like an exercise for Wittgenstein. A nice example of this would be comparing "naive / obvious" matrix -matrix multiplication with that in the Strassen family of algorithms.

Note that what you've suggested is rather different, in my view, from "exhausting all possibilities", as that would seem to be ##O\big(n\cdot n!\big)## - - i.e. doing a linear scan against each permutation and keeping the one that scores best. The point of the proofs underlying sorting algorithms is that there is no need for exhaustive search. But again, linguistic nits remain.
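Just to make that ##O\big(n\cdot n!\big)## figure concrete, here's a rough sketch (the function names are made up for illustration; it generates permutations recursively and stops at the first one that passes the ##O(n)## "is sorted" scan):
C:
#include <stdbool.h>

/* O(n) linear scan to check whether a candidate permutation is sorted. */
static bool is_sorted(const int *a, int n) {
    for (int i = 1; i < n; i++)
        if (a[i - 1] > a[i])
            return false;
    return true;
}

static void swap_ints(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Try every permutation of a[k..n-1]; return true once the whole array
   is sorted. Up to n! candidates, each checked with an O(n) scan. */
static bool permute_until_sorted(int *a, int k, int n) {
    if (k == n)
        return is_sorted(a, n);
    for (int i = k; i < n; i++) {
        swap_ints(&a[k], &a[i]);
        if (permute_until_sorted(a, k + 1, n))
            return true;              /* leave the sorted arrangement in place */
        swap_ints(&a[k], &a[i]);      /* undo and try the next choice */
    }
    return false;
}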

SSequence said:
(2) Can we prove that sorting can never be done in time proportional to ##n##? Can we prove that it can never be done better than ##n \cdot f(n)##, where ##f## is presumably some function which grows very slowly?

From the little I know, proving these kinds of bounds is generally quite difficult. But since these are very well-studied problems, perhaps more is known?

There are a few ways to interpret this -- one that worries me is that saying you can never sort in ##O(n)## is reminiscent of people saying you can "never" factor a degree 5 polynomial exactly, which is of course wrong. If there is special structure, you can use bubble sort, insertion sort or whatever is appropriate to exploit said structure and get a sort faster than ##O\big(n \cdot \log(n)\big)##. There are proofs that in general (i.e. without some known special structure) you cannot beat ##O\big(n \cdot \log(n)\big)##, though I don't know them particularly well. As a simplistic case: if you know that the array is "backwards" (or in any other known permutation) you can reverse said thing in ##O\big(n\big)##.
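A quick sketch of that "known to be backwards" case (nothing clever, just two indices walking toward each other, ##O(n)## swaps; the function name is just for illustration):
C:
/* Reverse a[0..n-1] in place: O(n) when the input is known to be
   exactly the reverse of sorted order (any other known permutation
   can likewise be undone with a single O(n) pass). */
void reverse_in_place(int *a, int n) {
    for (int i = 0, j = n - 1; i < j; i++, j--) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}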

To muddy the waters a bit, I'd point out that you can actually heapify something in ##O\big(n\big)##, but to extract / view all items in order requires ##n## operations, each taking ##O\big(\log(n)\big)##, I'm afraid.
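Here's a rough sketch of that ##O(n)## heap construction (Floyd's bottom-up method; the function names are just for illustration):
C:
/* Build a max-heap over a[0..n-1] in O(n) by sifting each internal node
   down, starting from the last one. Extracting all n items in sorted
   order afterwards still costs n pops of O(log n) each. */
static void sift_down(int *a, int n, int i) {
    for (;;) {
        int largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && a[l] > a[largest]) largest = l;
        if (r < n && a[r] > a[largest]) largest = r;
        if (largest == i)
            return;
        int t = a[i]; a[i] = a[largest]; a[largest] = t;
        i = largest;
    }
}

static void build_max_heap(int *a, int n) {
    for (int i = n / 2 - 1; i >= 0; i--)
        sift_down(a, n, i);
}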

- - - -

It seems worth pointing out that there is in fact a massive difference between ##O\big(n \cdot \log(n)\big)## and ##O\big(n^2\big)##... they may seem close but they aren't if you're talking about day-to-day workhorse engines -- e.g. consider sorting an array with, say, 10 million items in it. There's a similar issue with the FFT being ##O\big(n \cdot \log(n)\big)## vs vanilla matrix-vector multiplication being ##O\big(n^2\big)##.
 
  • #3
StoneTemplePython said:
I'm not sure that "rigorous in reasonably general terms" is well defined...

A lot of the time a "naive algorithm" is the "obvious one" though properly converting ##\text{obvious} \to \text{rigorous}## seems like an exercise for Wittgenstein. A nice example of this would be comparing "naive / obvious" matrix -matrix multiplication with that in the Strassen family of algorithms.
When I said "reasonably general terms", I meant in the sense that most problems that are typically encountered by computer scientists should be covered (and this forms a very small subset of the general case, since usually the running times are limited by rather obvious and moderately growing functions).

One of the reasons for asking this was that there doesn't seem to be a shortage of similar questions. For example:
(i) When are two algorithms (with the same functionality) the same or different?
(ii) When are two proofs of the same statement the same or different?
(iii) Which program, among those that compute a given function, is the simplest?

I think there are many similar questions which can be asked. For example, perhaps one can even ask "in a specific well-defined context, when is a given problem harder or easier than another given problem?".

I would be surprised if any single one of these questions (and many more which can be asked) had a fully general solution. The general theme is that there seems to be some "common language expression" in each case which one often uses informally ("algorithm", "same/different proof", "simplest program", "harder/easier problem", etc.), but defining it rigorously in a fully general sense doesn't seem to be possible.

It does seem that the question I asked (when put more generally) isn't too different from (iii) above, since (iii) can perhaps be roughly phrased as "which algorithm that accomplishes a given task is the simplest?". And "simplest" presumably has some vague correspondence to "naive".

It does seem that one can usefully, if not rigorously in a fully general sense, analyse many questions of this kind, though mathematicians don't usually seem interested in that kind of analysis. Perhaps any analysis of this kind becomes more useful when some kind of real-life correspondence or importance exists.

StoneTemplePython said:
Note that what you've suggested is rather different, in my view, from "exhausting all possibilities", as that would seem to be ##O\big(n\cdot n!\big)## - - i.e. doing a linear scan against each permutation and keeping the one that scores best. The point of the proofs underlying sorting algorithms is that there is no need for exhaustive search. But again, linguistic nits remain.
Interesting. So one could just go through all ##n!## permutations of the array and check which one is sorted.

StoneTemplePython said:
There are proofs that in general (i.e. without some known special structure) you cannot beat ##O\big(n \cdot \log(n)\big)##, though I don't know them particularly well.
Yes, this was more or less my second question, since mathematically one doesn't assume anything other than what is given (in this case, just an array of numbers), unless explicitly specified otherwise. Any reference for this? Perhaps one could find it in computational complexity texts (I have never read any ... nor do I recall seeing anything similar in casual browsing) ...
 
  • #4
A classic text is Knuth, "The Art of Computer Programming: Volume 3, Sorting and Searching". Although it is old, the entire series of 4 volumes is still a "bible" of computer algorithms.
 
  • #5
FactChecker said:
the entire series of 4 volumes
In looking it up I just discovered that volume 4 has 4A and 4B, and that volume 1 has a fascicle on the MMIX RISC processor. I have the first three volumes from back when they were published and they are truly gold. Would love to have the other three but, being retired, I find it hard to rationalize the $180 at Amazon. Oh well.
 
  • #6
Not mentioned yet is that while comparison-based sorts can do no better than ##O(n \cdot \log(n))## on random data, non-comparison-based sorts such as radix sort have time complexity ##O(n)##.
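For instance, here's a rough sketch of an LSD radix sort on 32-bit unsigned keys, one byte per pass (the function name is just for illustration). Each of the 4 passes is a counting sort over the ##n## items, so the total work is ##O(n)## and no element is ever compared against another element:
C:
#include <stdlib.h>
#include <string.h>

void radix_sort_u32(unsigned *a, size_t n) {
    unsigned *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return;
    for (int shift = 0; shift < 32; shift += 8) {
        size_t count[257] = {0};
        for (size_t i = 0; i < n; i++)        /* histogram of this byte */
            count[((a[i] >> shift) & 0xFF) + 1]++;
        for (int b = 0; b < 256; b++)         /* prefix sums -> bucket offsets */
            count[b + 1] += count[b];
        for (size_t i = 0; i < n; i++)        /* stable scatter into tmp */
            tmp[count[(a[i] >> shift) & 0xFF]++] = a[i];
        memcpy(a, tmp, n * sizeof *a);        /* copy back for the next pass */
    }
    free(tmp);
}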
 
  • #7
FactChecker said:
A classic text is Knuth, "The Art of Computer Programming: Volume 3, Sorting and Searching". Although it is old, the entire series of 4 volumes is still a "bible" of computer algorithms.
My second question in the OP was whether we can prove that sorting (with some standard assumptions on the computational model and time-cost) can't be done in time ##f(n)## where the function ##f \in O(n)##. A stronger (and somewhat plausible) statement might be to show that sorting can't be done in time ##f(n)## where we have ##O(f) \subset O(n\log(n))##.

Any result of this kind seems to me more like a computational-complexity result, though it wouldn't be surprising to find it in a comprehensive text like the one you mentioned. An exact reference would be good to find.

===========

For example, one of the questions in earlier times was whether multiplication (of two ##n##-digit numbers, I think) could be done in time ##f(n)## where ##O(f) \subset O(n^2)##. This was answered in the positive: it was shown that it could be done in ##O(n^a)## time (where ##1<a<2## is some constant).
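(For concreteness, the trick behind that bound is, if I recall correctly, Karatsuba's: split each number in half as ##x = x_1 B^m + x_0## and ##y = y_1 B^m + y_0##, compute only the three half-size products ##z_2 = x_1 y_1##, ##z_0 = x_0 y_0## and ##z_1 = (x_1+x_0)(y_1+y_0) - z_2 - z_0##, and combine them as ##x y = z_2 B^{2m} + z_1 B^m + z_0##. The running time then satisfies ##T(n) = 3\,T(n/2) + O(n)##, which gives ##T(n) = O(n^{\log_2 3}) \approx O(n^{1.585})##, i.e. ##a = \log_2 3##.)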

But now an analogous question (to the second one in the OP) that can be asked is whether addition of two ##n##-digit numbers is strictly easier than multiplication of two ##n##-digit numbers. In more mathematical terms, this could be settled by proving that multiplication can't be done in ##O(n)## time. I think something along these lines was shown too (but I only vaguely remember reading about it years ago ...).
 
  • #8
SSequence said:
My second question in the OP was whether we can prove that sorting (with some standard assumptions on the computational model and time-cost) can't be done in time ##f(n)## where the function ##f \in O(n)##. A stronger (and somewhat plausible) statement might be to show that sorting can't be done in time ##f(n)## where we have ##O(f) \subset O(n\log(n))##.
Assume an algorithm based on comparisons. There are n factorial possible permutations of the elements to be sorted. How many binary comparisons are required to distinguish between n factorial possibilities?

A useful approximation is: https://en.wikipedia.org/wiki/Stirling's_approximation
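(Sketched out: each comparison has two outcomes, so ##f(n)## comparisons can distinguish at most ##2^{f(n)}## different input orderings, and a correct comparison sort must therefore satisfy ##2^{f(n)} \ge n!##, i.e. ##f(n) \ge \log_2(n!)##. By Stirling's approximation, ##\log_2(n!) = n\log_2(n) - n\log_2(e) + O(\log(n))##, which is ##\Omega(n\log(n))##.)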
 
  • #9
Thanks, it appears that the relevant result can be found via the link you mentioned. The crucial inequality seems to be ##2^{f(n)} \ge n!## ... Of course ##n!## is the number of possible arrays (when all elements are distinct), but I haven't fully worked through the inequality yet, so I will probably need some time to think about it. Anyway, that does seem to answer the second question in the OP.
 
  • #10
Tom.G said:
In looking it up I just discovered that volume 4 has 4A and 4B, and that volume 1 has a fascicle on the MMIX RISC processor. I have the first three volumes from back when they were published and they are truly gold. Would love to have the other three but, being retired, I find it hard to rationalize the $180 at Amazon. Oh well.
You can get the ebooks (pdf) for free. Professor Knuth is apparently ok with that -- they're mirrored on various .edu sites. Here's a link for the vol 1 fascicle and the vol 4 stuff:
http://www.cs.utsa.edu/~wagner/knuth/
 
  • #11
Tom.G said:
In looking it up I just discovered that volume 4 has 4A and 4B, and that volume 1 has a fascicle on the MMIX RISC processor. I have the first three volumes from back when they were published and they are truly gold. Would love to have the other three but, being retired, I find it hard to rationalize the $180 at Amazon. Oh well.
When I retired, I gave my set of volumes 1-3 to a coworker. In a prior lifetime, I loved them and studied them. I occasionally get the urge to buy the full set, but I know that I'm too old and tired to study it.
 
  • #12
Off topic reply here:

SSequence said:
But now an analogous question (to the second one in the OP) that can be asked is whether addition of two ##n##-digit numbers is strictly easier than multiplication of two ##n##-digit numbers. In more mathematical terms, this could be settled by proving that multiplication can't be done in ##O(n)## time. I think something along these lines was shown too (but I only vaguely remember reading about it years ago ...).
With a Wallace Tree, hardware multiply can be done in O(log(n)).

https://en.wikipedia.org/wiki/Wallace_tree

For current x86 64-bit processors, some of the 64-bit multiplies take only 2 clocks.

https://www.agner.org/optimize/instruction_tables.pdf
 
  • #13
SSequence said:
For example, in the case of sorting it seems to be ##n^2##. A time of ##n^2## is bad, but it does seem like one would really have to make a deliberate effort to make it any worse.

The question is: can notions such as "naive algorithm" (or "exhausting all possibilities" for a given problem) be made rigorous in reasonably general terms (that is, at least for some classes of problems)?

When I took Algorithms, a fellow student developed an ##n^3## algorithm. We spent a whole day analyzing it, looking for waste, and we never found any.

It's important to distinguish between average and worst-case times. Quicksort is ##n^2## in the worst case, but usually runs in ##n\log n##. I also assume we're talking about a computer-science definition of "average" and "usually". In the real world, you almost never sort (or index) an array from scratch -- you insert elements into an already sorted array.

It takes ##n## time just to verify that an array is sorted, so one cannot sort any faster than that. Reaching that bound would require being able to immediately find the right place for each next item in the array; a binary search takes ##\log n## time, and so there's your ##n\log n##.
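A rough sketch of that insert-into-an-already-sorted-array case (the function name is just for illustration, and it assumes the caller has left room for one more element; the binary search is ##O(\log n)##, though shifting the tail of a plain array is still ##O(n)##):
C:
#include <stddef.h>
#include <string.h>

/* Insert x into the sorted array a[0..*n-1], keeping it sorted. */
void insert_sorted(int *a, size_t *n, int x) {
    size_t lo = 0, hi = *n;
    while (lo < hi) {                  /* find first position whose element > x */
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] <= x)
            lo = mid + 1;
        else
            hi = mid;
    }
    memmove(&a[lo + 1], &a[lo], (*n - lo) * sizeof a[0]);   /* shift the tail */
    a[lo] = x;
    (*n)++;
}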
 

1. What is sorting?

Sorting is the process of arranging a collection of data elements in a specific order, typically from smallest to largest or in alphabetical order.

2. What is worst-case time analysis in sorting?

Worst-case time analysis, also known as worst-case time complexity, is a measure of the maximum amount of time a sorting algorithm takes to complete on the worst possible input of a given size.

3. Why is worst time analysis important in sorting?

Worst time analysis is important because it helps us understand the efficiency and performance of different sorting algorithms. It allows us to compare and choose the most efficient algorithm for a given set of data.

4. What is the difference between best, average, and worst-case time complexity in sorting?

Best-case time complexity refers to the minimum amount of time it takes for a sorting algorithm to complete when given a set of data that is already sorted. Average-case time complexity is the expected amount of time for a sorting algorithm to complete on a random set of data. Worst-case time complexity, as mentioned earlier, is the maximum amount of time it takes for a sorting algorithm to complete on a given set of data.

5. How can we improve the worst-case time complexity of a sorting algorithm?

The worst-case time complexity of a sorting algorithm can be improved by choosing a more efficient algorithm, such as merge sort or heapsort, or by implementing optimizations such as using a different data structure or adding early-exit conditions.
