Find the Fastest Sort Method with O(n*log(n)) Time Complexity

  • Thread starter: Alkatran
  • Tags: Method, Sort

Summary
The discussion focuses on identifying the fastest sorting method with O(n*log(n)) time complexity, highlighting the use of tree structures for sorting. It explains how elements are added and removed from a tree, resulting in an overall complexity of O(2*log[2](n)*n), while questioning how to eliminate the constant factor. Various sorting algorithms are compared, noting that quicksort and treesort are efficient, while radix and bucket sorts can achieve O(n) under specific conditions but may degrade with larger datasets. The conversation also touches on the limitations of bucket sort for certain data types and the efficiency of heapsort compared to quicksort. Overall, the exploration of sorting algorithms emphasizes the trade-offs between speed and data type suitability.
Alkatran
I know that the fastest sort has O(n*log(n)) but I can't figure out what it is!

Here's what I have...

Use a tree structure: place the first value in the tree. If the next is less than the first, move the first to a child position and place the next in the main position. Otherwise just put the next in the child position. Basically repeat this for every item, with every Node having two children, always moving higher values downwards.

When all values are placed in the tree, start taking them off.

Remove element 1, then check which of its children is lowest: move that child up and do the same for the now-empty node; repeat until you hit a node with no children. Then remove element 1 again...

So... that comes out to:
O(n) items added * O(log[2](n)) per insertion into the heap
O(n) items removed * O(log[2](n)) per removal from the heap

That comes to:
O(2 * n * log[2](n))

Is that it? How do they eliminate the 2?
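Just to make the counting concrete, here is a minimal Python sketch of that insert-then-remove approach (assuming the standard library's heapq as the heap in place of a hand-rolled tree); the 2 in front of n*log(n) falls out exactly as described:

```python
# Minimal sketch of the approach above: push every value into a binary
# min-heap (O(log n) each), then pop them all back off in order
# (O(log n) each) -- roughly 2 * n * log2(n) comparisons in total.
import heapq
import random

def heap_tree_sort(values):
    heap = []
    for v in values:                  # n insertions
        heapq.heappush(heap, v)
    return [heapq.heappop(heap) for _ in range(len(heap))]   # n removals

data = [random.randint(0, 999) for _ in range(20)]
print(heap_tree_sort(data) == sorted(data))   # True
```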
 
Some sorting algorithms run in O(n) time! (Radix sort, bucket sort... ones not based on comparisons)

Anyways, remember that O(c f(n)) = O(f(n)) for any positive constant c...
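As a quick illustration of a non-comparison sort, here is a rough LSD radix sort sketch for non-negative integers (base 10 is an arbitrary choice); it does O(d) passes of O(n) work for d-digit keys:

```python
# Rough LSD radix sort sketch for non-negative integers: distribute by one
# digit per pass (stable), then concatenate; O(d * n) for d-digit keys.
def radix_sort(nums, base=10):
    if not nums:
        return nums
    place, largest = 1, max(nums)
    while place <= largest:
        buckets = [[] for _ in range(base)]
        for n in nums:
            buckets[(n // place) % base].append(n)
        nums = [n for bucket in buckets for n in bucket]
        place *= base
    return nums

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```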
 
Isn't bucket sort another name for bubble sort? Complexity O(n^2).
Quicksort and treesort are maybe the fastest, with complexity O(n log n).
 
Radix sort and bucket sort are only O(n) in the best-case scenario. It is more common to find them running in O(n log log n), which is better than quicksort's O(n log n). The problem with radix and bucket sorts is that they are only good for small amounts of data.

Here is an example where you want to use bucket or radix sorts:

4.3, 2.4, 6.4, 3.3, 7.5, 2.5, 5.3

You'll basically create a holding place (a bucket, if you will) for the range of integer parts: 2, 3, 4, 5, 6, 7. Then you go through the list in one pass, dumping each number into its appropriate holding place. Once you're done going through the list, you concatenate all the buckets.

If each bucket only has one key when you are done going through the list then it is O(n), but in most cases you'll have multiple keys inside each bucket, so you have to sort each bucket that holds more than one key. The more keys you have per bucket, the more inefficient the algorithm is. Quicksort doesn't degenerate as badly as bucket or radix sort when the key size gets large. For strings you'll get O(n^2) for bucket sort and O(n log n) for quicksort. You might as well use bubble sort, at O(n^2), if you're going to use bucket sort for strings.
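A rough Python sketch of the bucket sort walked through above: bucket by integer part, sort any bucket that ends up holding more than one key, then concatenate:

```python
# Rough sketch of the bucket sort described above: one pass to drop each
# value into the holding place for its integer part, a per-bucket sort for
# buckets with more than one key, then concatenate the buckets in order.
def bucket_sort(values):
    lo, hi = int(min(values)), int(max(values))
    buckets = [[] for _ in range(hi - lo + 1)]
    for v in values:
        buckets[int(v) - lo].append(v)
    result = []
    for b in buckets:
        result.extend(sorted(b))      # only costs extra when a bucket has several keys
    return result

print(bucket_sort([4.3, 2.4, 6.4, 3.3, 7.5, 2.5, 5.3]))
# [2.4, 2.5, 3.3, 4.3, 5.3, 6.4, 7.5]
```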
 
Introsort, incidentally, is the fastest comparison-based sorting algorithm. It runs quicksort unless it starts performing poorly, then switches over to heapsort.
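A rough sketch of that idea in Python (not a tuned implementation): a plain Lomuto-partition quicksort that falls back to a heapq-based heapsort for any slice once the recursion depth passes about 2*log2(n):

```python
# Rough introsort sketch: quicksort until the recursion depth looks
# degenerate, then heapsort the offending slice instead.
import heapq
import random

def introsort(a):
    _introsort(a, 0, len(a), 2 * len(a).bit_length())   # depth limit ~ 2*log2(n)
    return a

def _introsort(a, lo, hi, depth):
    if hi - lo <= 1:
        return
    if depth == 0:                          # quicksort is degenerating: switch
        a[lo:hi] = _heapsorted(a[lo:hi])
        return
    p = _partition(a, lo, hi)
    _introsort(a, lo, p, depth - 1)
    _introsort(a, p + 1, hi, depth - 1)

def _partition(a, lo, hi):
    # Plain Lomuto partition, last element as pivot.
    pivot, i = a[hi - 1], lo
    for j in range(lo, hi - 1):
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi - 1] = a[hi - 1], a[i]
    return i

def _heapsorted(chunk):
    heapq.heapify(chunk)
    return [heapq.heappop(chunk) for _ in range(len(chunk))]

data = [random.randint(0, 999) for _ in range(1000)]
expected = sorted(data)
print(introsort(data) == expected)   # True
```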
 
I like the idea of bucket sort, but it is definitely limited because you have to define the range. It would be completely useless for double-type variables.

In any case I wrote up my idea in Visual Basic. 45 seconds to sort 100,000 random strings (varying between 5 and 10 characters, A-Z) with 2 children per node. Increasing the number of children per node seems to have no noticeable effect on the end result. It only takes 2 seconds to sort 10,000 strings, which is amazing IMO.

Anyways, what is the difference between quicksort and heapsort? Heapsort is what I'm doing, correct?
 
What you're doing is almost heapsort. When you remove the root of the tree, the usual algorithm moves the last child node to the root, then sifts it down. It's slightly more efficient because you can keep the tree balanced this way.

Also, for additional speed, there's a clever way to store the tree as an array.
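For illustration, a small sketch of that removal step on a list-backed min-heap in Python (0-indexed, so the children of position i sit at 2i+1 and 2i+2, equivalent to the 1-indexed 2N / 2N+1 rule mentioned below):

```python
# Sketch of root removal on a list-backed min-heap: move the last leaf to
# the root, then sift it down; the tree stays balanced and gap-free.
def pop_root(heap):
    root = heap[0]
    last = heap.pop()                 # detach the final leaf
    if heap:
        heap[0] = last                # put it at the root...
        _sift_down(heap, 0)           # ...and sift it down into place
    return root

def _sift_down(heap, i):
    n = len(heap)
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and heap[left] < heap[smallest]:
            smallest = left
        if right < n and heap[right] < heap[smallest]:
            smallest = right
        if smallest == i:
            return
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest

h = [1, 3, 2, 7, 4]                   # already a valid min-heap
print(pop_root(h), h)                 # 1 [2, 3, 4, 7]
```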
 
That's how I used to store it, but I find this method much easier to understand (object oriented).

The idea is to store the value for a Node's children at the 2^(Node's position) (+1 for second child) position, correct?


I tried to sort a million strings, but I stopped once I realized it was taking up 200 megabytes of memory and rising.
 
The problem is probably that it's making a new string every time you do an assignment. (I don't know VBASIC.) One common solution to this problem is to maintain an array of pointers, and only shuffle around this auxiliary array.

The children of array position N would be 2N and 2N+1.
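As a small illustration of that pointer-array idea (using Python's built-in sort just to show the indirection; the same trick applies to the heap above): sort an array of indices and only dereference into the original strings when comparing, so the strings themselves are never copied around.

```python
# Sketch of the auxiliary "pointer" array: shuffle indices during the sort
# and read the strings through them, instead of moving the strings.
import random
import string

strings = ["".join(random.choices(string.ascii_uppercase, k=random.randint(5, 10)))
           for _ in range(10)]

order = list(range(len(strings)))         # one index ("pointer") per string
order.sort(key=lambda i: strings[i])      # only the indices get moved

print([strings[i] for i in order] == sorted(strings))   # True
```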
 
