Stack Overflow with Quicksort

1. Apr 9, 2009

KLoux

Hello. I've been reading up on sorting algorithms, and decided to try writing a quicksort algorithm. After a bit of debugging, I got it working, but when I increase the size of my data set, I get a stack overflow. To sort an array with 100 elements, I sometimes get a stack overflow, but with 1000 elements, I consistently get an overflow.

I know that pivot choice can have an effect on how many recursive calls to my sort I need to make, but I am choosing a random pivot, which many sources say is a good way to reduce the chance of hitting the worst case performance. Is this something anyone else has run into with quicksort? I quick google search suggests I am not the only one to have this problem, but beyond "there is probably something wrong with your algorithm" it hasn't helped much.

Is there something wrong here? What is the size of the stack? Can I change it? Should I change it? I understand if you don't want to debug my code, but I would be very appreciative if you can offer some general suggestions.

Here is my code:
Code (Text):

void DoQuickSort(int *Array, const int ArraySize, bool HighToLow)
{
// If there is nothing to sort, don't do anything
if (ArraySize <= 1)
return;

// Pick a random pivot
int PivotLocation = rand() % ArraySize;
int Pivot = Array[PivotLocation];

// Swap the pivot with the number at the beginning of the array
int SwapValue = Array[ArraySize - 1];
Array[ArraySize - 1] = Pivot;
Array[PivotLocation] = SwapValue;

// Split the array into two groups:  Higher than pivot and lower than pivot
int SortedElements = 0;
int NumberOfTopElements = 0;
while (SortedElements < ArraySize)
{
// Check to see if the element is larger or smaller than the pivot
// and if we are sorting in ascending or descending order
if ((Array[SortedElements] > Pivot && HighToLow) ||
(Array[SortedElements] < Pivot && !HighToLow))
{
// Swap this element with the next element toward the top of the list
SwapValue = Array[NumberOfTopElements];
Array[NumberOfTopElements] = Array[SortedElements];
Array[SortedElements] = SwapValue;
NumberOfTopElements++;
}

// Increment the number of elements we've sorted
SortedElements++;
}

// Place our pivot in the right location (this is the last time the pivot will be moved)
SwapValue = Array[NumberOfTopElements];
Array[NumberOfTopElements] = Array[ArraySize - 1];
Array[ArraySize - 1] = SwapValue;

// Call this recursively on the two new lists
DoQuickSort(Array, NumberOfTopElements, HighToLow);
DoQuickSort(Array + NumberOfTopElements, ArraySize - NumberOfTopElements, HighToLow);

return;
}

-Kerry

2. Apr 9, 2009

Staff: Mentor

First of all - trace how deep is the recursion. Add external variable int RecursionDepth, increase it on entry, decrese it on exit, output the value after incrementation.

And yes, you can change stack depth. Details will depend on the compiler/linker, but it is possible.

3. Apr 9, 2009

KLoux

Ok, I tried this for a set of 100 random numbers. Roughly 90% of the time, the recursion depth maxes out ~150-170, but the rest of the time it climbs as high as ~7360 before I get the stack overflow.

Does this imply that I'm just unlucky? Worst case for quicksort is n^2, which would be 10,000, right? So long as I'm not exceeding that, there's nothing obviously wrong with the code I guess.

What are the pros/cons for increasing the size of the stack?

-Kerry

4. Apr 9, 2009

Mark512

An important point to note -- your implementation is not O(n^2). As is tipped off by the experiments just performed - having recursion depths of greater than 100 for a 100 element array means you are trying to sort the same sized arrays multiple times.

Observe that you recursively sort the lists of the elements [1,2,...,i-1] and [i,i+1,...,n]. That is, you aren't actually reducing the problem size in the recursive step (which is pretty much always what you want to do with a recursive algorithm). Your comments correctly note that the pivot element (which I call element i) is always placed in the final place, so you should not be recursively sorting on it. That is, (quicksort is designed to) recursively sort [1,2,...,i-1] and [i+1,...,n].

EDIT:
Once making these changes, you shouldn't get any more stack size issues for problems of your size. I encourage you to measure the recursion depths after correcting your code. You should note that the depth should definitely be smaller than the problem size and can get as small as around lg(n) - around 7 for 100 elements.

In general, you should not be increasing the stack size. When performing recursion that causes stack overflows, it is in general important to look for buggy code before trying to increase the stack size. Furthermore, if you come up against stack size limitations due to recursion, you will likely be at a point where the overhead of making recursive calls is no longer negligible and will be adversely affecting the efficiency of your algorithm.

Last edited: Apr 10, 2009
5. Apr 10, 2009

Coin

It seems clear to me you have a bug where occasionally your sort will enter an infinite loop and never stop. If 150-170 is normal, then I don't think you could get 7360 just through bad luck-- even with quicksort. The times that it overflowed at 7360, the actual number of iterations for that run was probably just "infinite".

Without looking at the code too closely what jumps out at me is this NumberOfTopElements variable. Is it possible that it could be ever equal to either 0, or ArraySize? It seems like in that case you'd have DoQuickSort calling itself with the exact same arguments it was itself originally called with. This isn't necessarily the problem but it could be something of this sort. I suggest adding printf()s before your recursive DoQuickSort calls flagging what arguments you're about to call them with so if there is a case where it gets stuck doing one thing over and over it will be more obvious...

6. Apr 10, 2009

KLoux

Thanks for your help! Mark was right - I didn't separate my pivot from the both sub-lists and was not effectively reducing the problem size. I fixed the problem and the recursion depth is ~70 for lists with 100 elements and I can sort much larger arrays without overflowing the stack. I appreciate your help!

-Kerry

7. Apr 10, 2009

rcgldr

8. Apr 10, 2009

KLoux

Thanks, Jeff - those are some really great threads...