How does radix sort perform on partially sorted data?

  • Thread starter Thread starter rcgldr
  • Start date Start date
  • Tags Tags
    Sorting
AI Thread Summary
Radix sort has been added to a collection of sorting algorithms, with performance results shared for sorting 4,194,304 64-bit elements of pseudo-random data on a Windows XP64 system using an Intel i7 2600K processor. The radix sort achieved the fastest time at 203 ms, outperforming hybrid (radix+merge) sort at 265 ms, merge sort at 297 ms, Microsoft sort at 344 ms, and Microsoft stable_sort at 375 ms. The radix sort implementation involves a pre-pass to create histograms and eight sorting passes based on the least significant to most significant byte, utilizing a single temporary array for efficiency. The discussion also suggests exploring the performance of these algorithms on partially sorted data, referencing previous work done with pointer-based text file sorts. The updated code for the radix sort function is provided, demonstrating its structure and methodology.
rcgldr
Homework Helper
Messages
8,922
Reaction score
674
Added radix sort to sort examples in:

http://rcgldr.net/misc/sortv.zip

Results on my system:

Code:
Time to sort 4,194,304 64 bit elements of psuedo random data:

radix sort                 rsortv.cpp    203 ms
hybrid (radix+merge) sort  hsortv.cpp    265 ms
merge sort                 msortv.cpp    297 ms
microsoft sort             sortv.cpp     344 ms
microsoft stable_sort      sortv.cpp     375 ms

64 bit mode, Windows XP64, Visual Studio 2005 compiler.
Intel i7 2600K 3.4ghz, DP67BG motherboard, 4GB ram

Previous threads:

https://www.physicsforums.com/showthread.php?t=218369

https://www.physicsforums.com/showthread.php?t=181589

Code for radix sort function example - it makes one pre-pass over the data array, generating an array of histograms (256 entries for each histogram), then coverts the histograms into the starting indices for each "bucket". Each pass places data into "buckets" based on the 8 bit value of the data, and the passes go from least signficant byte to most significant byte. In this example, 1 pre pass and 8 sort passes are used to sort an array of 64 bit unsigned integers. Only one temporary array (the same size as the original array) is required for the radix sort, using variable size "buckets".

Code:
typedef unsigned __int64    UI64;
typedef unsigned __int64 *  PUI64;

PUI64 RadixSort(PUI64 pData, PUI64 pTemp, size_t count)
{
size_t mIndex[8][256] = {0};            // index matrix
PUI64 pDst, pSrc, pTmp;
size_t i,j,m,n;
UI64 u;

    for(i = 0; i < count; i++){         // generate histograms
        u = pData[i];
        for(j = 0; j < 8; j++){
            mIndex[j][(size_t)(u & 0xff)]++;
            u >>= 8;
        }       
    }
    for(j = 0; j < 8; j++){             // convert to indices
        n = 0;
        for(i = 0; i < 256; i++){
            m = mIndex[j][i];
            mIndex[j][i] = n;
            n += m;
        }       
    }

    pDst = pTemp;                       // radix sort
    pSrc = pData;
    for(j = 0; j < 8; j++){
        for(i = 0; i < count; i++){
            u = pSrc[i];
            m = (size_t)(u >> (j<<3)) & 0xff;
            pDst[mIndex[j][m]++] = u;
        }
        pTmp = pSrc;
        pSrc = pDst;
        pDst = pTmp;
    }

    return(pSrc);
}
 
Last edited:
Technology news on Phys.org
You might want to try comparing how the algorithms perform on partially sorted data. It's just a thought.
 
MisterX said:
You might want to try comparing how the algorithms perform on partially sorted data. It's just a thought.
This was previously done with one of the pointer based text file sorts, msortp.cpp, that optionally takes advantage of ascending or descending sequences (change "#define VARSZGRP 0" to "#define VARSZGRP 1" in msortp.cpp). If the data is already sorted, it determines this with one read pass of the data and completes.

http://rcgldr.net/misc/sortp.zip

This wasn't done with the vector / array test sort programs, since the goal at that time was to benchmark the sorts with pseudo random data.

The previous thread didn't have a proper radix sort, so I added it to the zip file and created this thread.
 
Dear Peeps I have posted a few questions about programing on this sectio of the PF forum. I want to ask you veterans how you folks learn program in assembly and about computer architecture for the x86 family. In addition to finish learning C, I am also reading the book From bits to Gates to C and Beyond. In the book, it uses the mini LC3 assembly language. I also have books on assembly programming and computer architecture. The few famous ones i have are Computer Organization and...
I had a Microsoft Technical interview this past Friday, the question I was asked was this : How do you find the middle value for a dataset that is too big to fit in RAM? I was not able to figure this out during the interview, but I have been look in this all weekend and I read something online that said it can be done at O(N) using something called the counting sort histogram algorithm ( I did not learn that in my advanced data structures and algorithms class). I have watched some youtube...

Similar threads

Replies
1
Views
2K
Back
Top