How does radix sort perform on partially sorted data?

  • Thread starter rcgldr
  • Start date
  • Tags
    Sorting
In summary, radix sort is a sorting algorithm that performs well on partially sorted data. This is because it sorts the data by examining one digit at a time, starting from the least significant digit. This allows it to efficiently sort data that is already partially sorted, as it only needs to look at the remaining digits to determine the correct order. However, if the data is not partially sorted and contains a large number of unique values, radix sort may not be the most efficient sorting algorithm.
  • #1
rcgldr
Homework Helper
8,855
632
Added radix sort to sort examples in:

http://rcgldr.net/misc/sortv.zip

Results on my system:

Code:
Time to sort 4,194,304 64 bit elements of psuedo random data:

radix sort                 rsortv.cpp    203 ms
hybrid (radix+merge) sort  hsortv.cpp    265 ms
merge sort                 msortv.cpp    297 ms
microsoft sort             sortv.cpp     344 ms
microsoft stable_sort      sortv.cpp     375 ms

64 bit mode, Windows XP64, Visual Studio 2005 compiler.
Intel i7 2600K 3.4ghz, DP67BG motherboard, 4GB ram

Previous threads:

https://www.physicsforums.com/showthread.php?t=218369

https://www.physicsforums.com/showthread.php?t=181589

Code for radix sort function example - it makes one pre-pass over the data array, generating an array of histograms (256 entries for each histogram), then coverts the histograms into the starting indices for each "bucket". Each pass places data into "buckets" based on the 8 bit value of the data, and the passes go from least signficant byte to most significant byte. In this example, 1 pre pass and 8 sort passes are used to sort an array of 64 bit unsigned integers. Only one temporary array (the same size as the original array) is required for the radix sort, using variable size "buckets".

Code:
typedef unsigned __int64    UI64;
typedef unsigned __int64 *  PUI64;

PUI64 RadixSort(PUI64 pData, PUI64 pTemp, size_t count)
{
size_t mIndex[8][256] = {0};            // index matrix
PUI64 pDst, pSrc, pTmp;
size_t i,j,m,n;
UI64 u;

    for(i = 0; i < count; i++){         // generate histograms
        u = pData[i];
        for(j = 0; j < 8; j++){
            mIndex[j][(size_t)(u & 0xff)]++;
            u >>= 8;
        }       
    }
    for(j = 0; j < 8; j++){             // convert to indices
        n = 0;
        for(i = 0; i < 256; i++){
            m = mIndex[j][i];
            mIndex[j][i] = n;
            n += m;
        }       
    }

    pDst = pTemp;                       // radix sort
    pSrc = pData;
    for(j = 0; j < 8; j++){
        for(i = 0; i < count; i++){
            u = pSrc[i];
            m = (size_t)(u >> (j<<3)) & 0xff;
            pDst[mIndex[j][m]++] = u;
        }
        pTmp = pSrc;
        pSrc = pDst;
        pDst = pTmp;
    }

    return(pSrc);
}
 
Last edited:
Technology news on Phys.org
  • #2
You might want to try comparing how the algorithms perform on partially sorted data. It's just a thought.
 
  • #3
MisterX said:
You might want to try comparing how the algorithms perform on partially sorted data. It's just a thought.
This was previously done with one of the pointer based text file sorts, msortp.cpp, that optionally takes advantage of ascending or descending sequences (change "#define VARSZGRP 0" to "#define VARSZGRP 1" in msortp.cpp). If the data is already sorted, it determines this with one read pass of the data and completes.

http://rcgldr.net/misc/sortp.zip

This wasn't done with the vector / array test sort programs, since the goal at that time was to benchmark the sorts with pseudo random data.

The previous thread didn't have a proper radix sort, so I added it to the zip file and created this thread.
 

1. What is the purpose of sorting revisited again?

The purpose of sorting revisited again is to review and analyze different sorting algorithms and techniques in order to determine the most efficient and effective methods for organizing data.

2. How does sorting revisited again differ from traditional sorting techniques?

Sorting revisited again involves a more in-depth analysis and comparison of sorting algorithms, while traditional sorting techniques typically focus on implementing a single algorithm to organize data.

3. What are some common examples of sorting algorithms discussed in sorting revisited again?

Some common examples of sorting algorithms discussed in sorting revisited again include bubble sort, merge sort, quicksort, insertion sort, and selection sort.

4. How can sorting revisited again benefit my research or work as a scientist?

Sorting revisited again can benefit your research or work as a scientist by providing a deeper understanding of sorting techniques and their applications, which can help you make more informed decisions when organizing and analyzing data.

5. Is sorting revisited again relevant to all fields of science?

Yes, sorting revisited again is relevant to all fields of science as data organization and analysis is a crucial aspect of scientific research and experimentation.

Similar threads

  • Programming and Computer Science
Replies
1
Views
2K
  • Programming and Computer Science
Replies
2
Views
3K
Back
Top