Find the Fastest Sort Method with O(n*log(n)) Time Complexity

  • Thread starter Thread starter Alkatran
  • Start date Start date
  • Tags Tags
    Method Sort
Click For Summary

Discussion Overview

The discussion revolves around identifying the fastest sorting method with O(n*log(n)) time complexity. Participants explore various sorting algorithms, their efficiencies, and specific implementations, including heapsort, quicksort, radix sort, and bucket sort. The conversation includes theoretical considerations, practical implementations, and performance comparisons.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant describes a tree-based sorting method and calculates its time complexity, questioning how to eliminate constants in big O notation.
  • Another participant mentions that some sorting algorithms, like radix sort and bucket sort, can achieve O(n) time but are not based on comparisons.
  • There is a claim that bucket sort is synonymous with bubble sort, which is contested by others who assert that quicksort and treesort are faster algorithms with O(n*log(n)) complexity.
  • A participant argues that radix sort and bucket sort are typically O(n log log n) in practice and discusses their limitations with larger datasets.
  • Introsort is introduced as the fastest comparison-based sorting algorithm, which adapts between quicksort and heapsort based on performance.
  • One participant shares their implementation of bucket sort in Visual Basic, noting its speed with random strings but also its limitations with defining ranges for sorting.
  • Another participant clarifies the heapsort algorithm, emphasizing the importance of maintaining a balanced tree during sorting.
  • There is a discussion about memory usage when sorting large datasets and suggestions for optimizing memory management through pointer arrays.

Areas of Agreement / Disagreement

Participants express differing views on the efficiency and applicability of various sorting algorithms, particularly regarding the performance of radix and bucket sorts compared to quicksort and heapsort. No consensus is reached on the best sorting method, and multiple competing views remain throughout the discussion.

Contextual Notes

Some participants note that the performance of certain sorting algorithms can vary significantly based on the dataset characteristics and that assumptions about data types (e.g., integers vs. strings) can impact efficiency. Limitations regarding memory usage and the need for defining ranges in bucket sort are also highlighted.

Alkatran
Science Advisor
Homework Helper
Messages
959
Reaction score
0
I know that the fastest sort has O(n*log(n)) but I can't figure out what it is!

Here's what I have...

Use a tree structure: place the first value in the tree. If the next is less than the first, move the first to a child position and place the next in the main position. Otherwise just put the next in the child position. Basically repeat this for every item, with every Node having two children, always moving higher values downwards.

When all values are placed in the list, start taking them off.

remove element 1, and then check which of it's children are highest: move that child up and do the same computation for the now empty node: repeat until you hit a node with no children. Then remove element 1 again...

So... that comes out to:
O(n) items added * O(log[2](n)) sorting new items into heap
O(n) items removed * O(log[2](n)) sorting new items out of heap

That comes to:
O(2*log[2](n)*n)

Is that it? How do they eliminate the 2?
 
Computer science news on Phys.org
Some sorting algorithms run in O(n) time! (Radix sort, bucket sort... ones not based on comparisons)

Anyways, remember that O(c f(n)) = O(f(n)) for any positive constant c...
 
Isn't bucket sort another name for bubble sort? Complexity O(n^2).
Quicksort and treesort are the fastest maybe, of complexity O(nlogn).
 
Radix sort and bucket sort is only O(n) in the best case scenario. It is more common to find it to be O(n log log n), which is better than quicksort which is O(n log n). The problem with radix and bucket sorts is that it is only good for small amount of data.

Here is an example where you want to use bucket or radix sorts:

4.3, 2.4, 6.4, 3.3, 7.5, 2.5, 5,3

You'll basically create a holding place (bucket if you will) for the range of integers: 2,3,4,5,6,7. Then you only go through the list in one pass dumping the numbers in their apporiate holding place. Once your done going through the list you concatanate all the buckets.

If each bucket only has one key when you are done going through the list then it is O(n), but in most cases you'll have multiple keys inside each bucket, therefore you have to sort each bucket that has more than one key. The more keys you have per bucket the moe ineffient the algorithm is. Qucksort doesn't degenerate as bad as bucket or radix sort when the key size gets large. For strings you'll get O(n^2) for bucket sort and O(n log n) for quicksort. You might as well use bubble sort O(n^2) if your going to use bucket sort for strings.
 
Introsort, incidentally, is the fastest comparison based sorting algorithm. It runs quicksort unless it starts performing poorly, then switches over to heapsort.
 
I like the idea of the bucket sort. But it is definitely limited because you have to define the range. It would be completely useless for double type variables.

In any case I wrote up my idea in visual basic. 45 seconds to sort 100 000 random strings (vary between 5 and 10 characters from A-Z) and 2 children per node. Increasing childs per node seems to have no noticeable effect on the end result. It only takes 2 seconds to sort 10 000 strings, which is amazing IMO.

Anyways, what is the difference between quicksort and heapsort? Heapsort is what I'm doing correct?
 
What you're doing is almost heapsort. When you remove the root of the tree, the usual algorithm moves the last child node to the root, then sifts it down. It's slightly more efficient because you can keep the tree balanced this way.

Also, for additional speed, there's a clever way to store the tree as an array.
 
That's how I used to store it, but I find this method much easier to understand (object oriented).

The idea is to store the value for a Node's children at the 2^(Node's position) (+1 for second child) position, correct?


I tried to sort a millikon strings, but I stopped once I realized it was taking up 200 megabytes of memory and rising.
 
The problem is probably that it's making a new string every time you do an assignment. (I don't know VBASIC) One common solution to this problem is to maintain an array of pointers, and only shuffle arouund this auxilliary array.

The children of array position N would be 2N and 2N+1.
 

Similar threads

  • · Replies 1 ·
Replies
1
Views
3K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 7 ·
Replies
7
Views
3K
Replies
5
Views
2K
  • · Replies 36 ·
2
Replies
36
Views
7K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 3 ·
Replies
3
Views
2K
Replies
1
Views
3K
Replies
2
Views
3K
  • · Replies 65 ·
3
Replies
65
Views
10K