Troubleshooting Quicksort: Stack Overflow and Random Pivots

  • Thread starter Thread starter KLoux
  • Start date Start date
AI Thread Summary
The discussion revolves around a quicksort implementation that experiences stack overflow issues when sorting larger datasets. The user, Kerry, initially encounters stack overflow errors with arrays of 100 and 1000 elements. Despite using a random pivot to mitigate worst-case performance, the recursion depth becomes problematic, indicating potential inefficiencies in the algorithm.Key insights reveal that the implementation does not effectively reduce the problem size during recursion, as it attempts to sort the same elements multiple times. The suggestion is made to modify the recursive calls to exclude the pivot element, which should resolve the overflow issue. After making these adjustments, Kerry reports a significant reduction in recursion depth, achieving around 70 for 100 elements and successfully sorting larger arrays without overflow. The discussion emphasizes the importance of ensuring that recursive algorithms properly reduce their problem size to avoid inefficiencies and stack overflows.
KLoux
Messages
174
Reaction score
1
Hello. I've been reading up on sorting algorithms, and decided to try writing a quicksort algorithm. After a bit of debugging, I got it working, but when I increase the size of my data set, I get a stack overflow. To sort an array with 100 elements, I sometimes get a stack overflow, but with 1000 elements, I consistently get an overflow.

I know that pivot choice can have an effect on how many recursive calls to my sort I need to make, but I am choosing a random pivot, which many sources say is a good way to reduce the chance of hitting the worst case performance. Is this something anyone else has run into with quicksort? I quick google search suggests I am not the only one to have this problem, but beyond "there is probably something wrong with your algorithm" it hasn't helped much.

Is there something wrong here? What is the size of the stack? Can I change it? Should I change it? I understand if you don't want to debug my code, but I would be very appreciative if you can offer some general suggestions.

Here is my code:
Code:
void DoQuickSort(int *Array, const int ArraySize, bool HighToLow)
{
	// If there is nothing to sort, don't do anything
	if (ArraySize <= 1)
		return;

	// Pick a random pivot
	int PivotLocation = rand() % ArraySize;
	int Pivot = Array[PivotLocation];

	// Swap the pivot with the number at the beginning of the array
	int SwapValue = Array[ArraySize - 1];
	Array[ArraySize - 1] = Pivot;
	Array[PivotLocation] = SwapValue;

	// Split the array into two groups:  Higher than pivot and lower than pivot
	int SortedElements = 0;
	int NumberOfTopElements = 0;
	while (SortedElements < ArraySize)
	{
		// Check to see if the element is larger or smaller than the pivot
		// and if we are sorting in ascending or descending order
		if ((Array[SortedElements] > Pivot && HighToLow) ||
			(Array[SortedElements] < Pivot && !HighToLow))
		{
			// Swap this element with the next element toward the top of the list
			SwapValue = Array[NumberOfTopElements];
			Array[NumberOfTopElements] = Array[SortedElements];
			Array[SortedElements] = SwapValue;
			NumberOfTopElements++;
		}

		// Increment the number of elements we've sorted
		SortedElements++;
	}

	// Place our pivot in the right location (this is the last time the pivot will be moved)
	SwapValue = Array[NumberOfTopElements];
	Array[NumberOfTopElements] = Array[ArraySize - 1];
	Array[ArraySize - 1] = SwapValue;

	// Call this recursively on the two new lists
	DoQuickSort(Array, NumberOfTopElements, HighToLow);
	DoQuickSort(Array + NumberOfTopElements, ArraySize - NumberOfTopElements, HighToLow);

	return;
}

Thanks in advance!

-Kerry
 
Technology news on Phys.org
First of all - trace how deep is the recursion. Add external variable int RecursionDepth, increase it on entry, decrese it on exit, output the value after incrementation.

And yes, you can change stack depth. Details will depend on the compiler/linker, but it is possible.
 
Ok, I tried this for a set of 100 random numbers. Roughly 90% of the time, the recursion depth maxes out ~150-170, but the rest of the time it climbs as high as ~7360 before I get the stack overflow.

Does this imply that I'm just unlucky? Worst case for quicksort is n^2, which would be 10,000, right? So long as I'm not exceeding that, there's nothing obviously wrong with the code I guess.

What are the pros/cons for increasing the size of the stack?

-Kerry
 
An important point to note -- your implementation is not O(n^2). As is tipped off by the experiments just performed - having recursion depths of greater than 100 for a 100 element array means you are trying to sort the same sized arrays multiple times.

Observe that you recursively sort the lists of the elements [1,2,...,i-1] and [i,i+1,...,n]. That is, you aren't actually reducing the problem size in the recursive step (which is pretty much always what you want to do with a recursive algorithm). Your comments correctly note that the pivot element (which I call element i) is always placed in the final place, so you should not be recursively sorting on it. That is, (quicksort is designed to) recursively sort [1,2,...,i-1] and [i+1,...,n].

EDIT:
Once making these changes, you shouldn't get any more stack size issues for problems of your size. I encourage you to measure the recursion depths after correcting your code. You should note that the depth should definitely be smaller than the problem size and can get as small as around lg(n) - around 7 for 100 elements.

In general, you should not be increasing the stack size. When performing recursion that causes stack overflows, it is in general important to look for buggy code before trying to increase the stack size. Furthermore, if you come up against stack size limitations due to recursion, you will likely be at a point where the overhead of making recursive calls is no longer negligible and will be adversely affecting the efficiency of your algorithm.
 
Last edited:
It seems clear to me you have a bug where occasionally your sort will enter an infinite loop and never stop. If 150-170 is normal, then I don't think you could get 7360 just through bad luck-- even with quicksort. The times that it overflowed at 7360, the actual number of iterations for that run was probably just "infinite".

Without looking at the code too closely what jumps out at me is this NumberOfTopElements variable. Is it possible that it could be ever equal to either 0, or ArraySize? It seems like in that case you'd have DoQuickSort calling itself with the exact same arguments it was itself originally called with. This isn't necessarily the problem but it could be something of this sort. I suggest adding printf()s before your recursive DoQuickSort calls flagging what arguments you're about to call them with so if there is a case where it gets stuck doing one thing over and over it will be more obvious...
 
Thanks for your help! Mark was right - I didn't separate my pivot from the both sub-lists and was not effectively reducing the problem size. I fixed the problem and the recursion depth is ~70 for lists with 100 elements and I can sort much larger arrays without overflowing the stack. I appreciate your help!

-Kerry
 
Thanks, Jeff - those are some really great threads...
 
Back
Top