Troubleshooting Quicksort: Stack Overflow and Random Pivots

  • Thread starter Thread starter KLoux
  • Start date Start date
Click For Summary

Discussion Overview

The discussion revolves around troubleshooting a quicksort implementation that experiences stack overflow issues when sorting larger data sets. Participants explore the effects of pivot selection, recursion depth, and potential bugs in the code, while sharing insights on algorithm performance and debugging strategies.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Kerry describes encountering stack overflow when sorting arrays of 100 and 1000 elements, despite using a random pivot to mitigate worst-case performance.
  • One participant suggests tracing the recursion depth to understand the behavior of the algorithm better and mentions that stack depth can be adjusted depending on the compiler/linker.
  • Kerry reports varying recursion depths, with most cases maxing out around 150-170, but occasionally reaching as high as 7360, raising questions about the algorithm's efficiency.
  • Another participant argues that the implementation is not O(n^2) and points out that the recursion may not be effectively reducing the problem size, leading to repeated sorting of the same elements.
  • Concerns are raised about a potential infinite loop in the code, particularly regarding the handling of the NumberOfTopElements variable, which could lead to recursive calls with unchanged arguments.
  • Kerry acknowledges the identified bug related to not separating the pivot from the sub-lists and reports improved performance after making corrections to the code.

Areas of Agreement / Disagreement

Participants generally agree that there were issues with the implementation that led to stack overflow, particularly regarding the handling of the pivot and recursion. However, there remains some uncertainty about the implications of recursion depth and the overall efficiency of the algorithm.

Contextual Notes

The discussion highlights limitations in the original code related to recursion and pivot handling, which may have contributed to stack overflow issues. The effectiveness of the quicksort algorithm is contingent on properly reducing problem size during recursive calls.

Who May Find This Useful

This discussion may be useful for individuals interested in sorting algorithms, debugging recursive functions, and understanding the implications of pivot selection in quicksort implementations.

KLoux
Messages
174
Reaction score
1
Hello. I've been reading up on sorting algorithms, and decided to try writing a quicksort algorithm. After a bit of debugging, I got it working, but when I increase the size of my data set, I get a stack overflow. To sort an array with 100 elements, I sometimes get a stack overflow, but with 1000 elements, I consistently get an overflow.

I know that pivot choice can have an effect on how many recursive calls to my sort I need to make, but I am choosing a random pivot, which many sources say is a good way to reduce the chance of hitting the worst case performance. Is this something anyone else has run into with quicksort? I quick google search suggests I am not the only one to have this problem, but beyond "there is probably something wrong with your algorithm" it hasn't helped much.

Is there something wrong here? What is the size of the stack? Can I change it? Should I change it? I understand if you don't want to debug my code, but I would be very appreciative if you can offer some general suggestions.

Here is my code:
Code:
void DoQuickSort(int *Array, const int ArraySize, bool HighToLow)
{
	// If there is nothing to sort, don't do anything
	if (ArraySize <= 1)
		return;

	// Pick a random pivot
	int PivotLocation = rand() % ArraySize;
	int Pivot = Array[PivotLocation];

	// Swap the pivot with the number at the beginning of the array
	int SwapValue = Array[ArraySize - 1];
	Array[ArraySize - 1] = Pivot;
	Array[PivotLocation] = SwapValue;

	// Split the array into two groups:  Higher than pivot and lower than pivot
	int SortedElements = 0;
	int NumberOfTopElements = 0;
	while (SortedElements < ArraySize)
	{
		// Check to see if the element is larger or smaller than the pivot
		// and if we are sorting in ascending or descending order
		if ((Array[SortedElements] > Pivot && HighToLow) ||
			(Array[SortedElements] < Pivot && !HighToLow))
		{
			// Swap this element with the next element toward the top of the list
			SwapValue = Array[NumberOfTopElements];
			Array[NumberOfTopElements] = Array[SortedElements];
			Array[SortedElements] = SwapValue;
			NumberOfTopElements++;
		}

		// Increment the number of elements we've sorted
		SortedElements++;
	}

	// Place our pivot in the right location (this is the last time the pivot will be moved)
	SwapValue = Array[NumberOfTopElements];
	Array[NumberOfTopElements] = Array[ArraySize - 1];
	Array[ArraySize - 1] = SwapValue;

	// Call this recursively on the two new lists
	DoQuickSort(Array, NumberOfTopElements, HighToLow);
	DoQuickSort(Array + NumberOfTopElements, ArraySize - NumberOfTopElements, HighToLow);

	return;
}

Thanks in advance!

-Kerry
 
Technology news on Phys.org
First of all - trace how deep is the recursion. Add external variable int RecursionDepth, increase it on entry, decrese it on exit, output the value after incrementation.

And yes, you can change stack depth. Details will depend on the compiler/linker, but it is possible.
 
Ok, I tried this for a set of 100 random numbers. Roughly 90% of the time, the recursion depth maxes out ~150-170, but the rest of the time it climbs as high as ~7360 before I get the stack overflow.

Does this imply that I'm just unlucky? Worst case for quicksort is n^2, which would be 10,000, right? So long as I'm not exceeding that, there's nothing obviously wrong with the code I guess.

What are the pros/cons for increasing the size of the stack?

-Kerry
 
An important point to note -- your implementation is not O(n^2). As is tipped off by the experiments just performed - having recursion depths of greater than 100 for a 100 element array means you are trying to sort the same sized arrays multiple times.

Observe that you recursively sort the lists of the elements [1,2,...,i-1] and [i,i+1,...,n]. That is, you aren't actually reducing the problem size in the recursive step (which is pretty much always what you want to do with a recursive algorithm). Your comments correctly note that the pivot element (which I call element i) is always placed in the final place, so you should not be recursively sorting on it. That is, (quicksort is designed to) recursively sort [1,2,...,i-1] and [i+1,...,n].

EDIT:
Once making these changes, you shouldn't get any more stack size issues for problems of your size. I encourage you to measure the recursion depths after correcting your code. You should note that the depth should definitely be smaller than the problem size and can get as small as around lg(n) - around 7 for 100 elements.

In general, you should not be increasing the stack size. When performing recursion that causes stack overflows, it is in general important to look for buggy code before trying to increase the stack size. Furthermore, if you come up against stack size limitations due to recursion, you will likely be at a point where the overhead of making recursive calls is no longer negligible and will be adversely affecting the efficiency of your algorithm.
 
Last edited:
It seems clear to me you have a bug where occasionally your sort will enter an infinite loop and never stop. If 150-170 is normal, then I don't think you could get 7360 just through bad luck-- even with quicksort. The times that it overflowed at 7360, the actual number of iterations for that run was probably just "infinite".

Without looking at the code too closely what jumps out at me is this NumberOfTopElements variable. Is it possible that it could be ever equal to either 0, or ArraySize? It seems like in that case you'd have DoQuickSort calling itself with the exact same arguments it was itself originally called with. This isn't necessarily the problem but it could be something of this sort. I suggest adding printf()s before your recursive DoQuickSort calls flagging what arguments you're about to call them with so if there is a case where it gets stuck doing one thing over and over it will be more obvious...
 
Thanks for your help! Mark was right - I didn't separate my pivot from the both sub-lists and was not effectively reducing the problem size. I fixed the problem and the recursion depth is ~70 for lists with 100 elements and I can sort much larger arrays without overflowing the stack. I appreciate your help!

-Kerry
 
Thanks, Jeff - those are some really great threads...
 

Similar threads

  • · Replies 6 ·
Replies
6
Views
3K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 31 ·
2
Replies
31
Views
3K
  • · Replies 5 ·
Replies
5
Views
14K
  • · Replies 17 ·
Replies
17
Views
4K
  • · Replies 29 ·
Replies
29
Views
4K
Replies
12
Views
2K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 3 ·
Replies
3
Views
2K