# Homework Help: Code: Return random number less than specified value

1. Feb 26, 2014

### collinsmark

Code: Random number less than specified value

This isn't really a homework problem. I'm just doing this for fun and giggles. But given the nature of the forum rules, I'll post it here.

1. The problem statement, all variables and given/known data

Create a method (function) that returns a random number less than the specified input value. Both the input and return value are to be unsigned, 64 bit values (UInt64).

UInt64 NextUInt64(UInt64 maxValue)

The functionality is to be similar to that of MSDN System.Random.Next(Int32 maxValue) method, except for data type UInt64. Also, the following requirements must be met:

Requirements:
• The returned value must be a random number uniform across [0, maxValue - 1]. Non-uniformity due to rounding/truncation errors is not acceptable.
• Returned values must be uncorrelated with each other.
• You are given an existing method NextByte() that returns a random byte, uniform across [0, 255]. It can be assumed for the purposes of this exercise that NextByte() is perfectly uniform with arbitrarily large periodicity (it can be assumed that the returned NextByte() numbers are completely uncorrelated). It is acceptable to utilize this NextByte() method.
• Given the choice between speed and memory, speed is preferred. However, it must not violate the other requirements under any circumstances.

2. Relevant equations

Normal coding stuff.

3. The attempt at a solution

This problem would be easy if maxValue was guaranteed to be a power of 2. Simply load up a UInt64 value with random bytes, mask off any unnecessary most significant bits (msbs), and we're finished. But what if maxValue isn't a power of 2?

The first idea that ran through my mind was to scale it. For example, one could code up
$$\mathrm{returned \ value} = \mathrm{(64 \ bit \ random \ number)} \left( \frac{\mathrm{maxValue}}{2^{64}} \right).$$

Of course we couldn't introduce floating point numbers for the scaling factor. Even double precision floating point numbers are not precise enough, and uniformity cannot be guaranteed.

But even with fixed point manipulation there remains a major problem. This is most easily demonstrated by examining a smaller data type. Consider for a moment that we are working with a 2-bit data type. Suppose such a data type existed, UInt2. Let's name the method UInt2 NextUInt2(UInt2 maxValue). Values of this data type can be 0, 1, 2 and 3. Now consider for a moment that maxValue is 3. That means the method must return uniformly distributed random numbers 0, 1, and 2. Using the scaling approach, we would load in two random bits, and scale. The scaling involves multiplying by 3 and dividing by $2^2 = 4$. Using that method would produce the following for all combinations of our two random bits:
$$\begin{array}{c|c} \mathrm{random \ bits} & \mathrm{scaling \ result} \\ \hline 0 \ 0 & 0 \\ 0 \ 1 & 0 \\ 1 \ 0 & 1 \\ 1 \ 1 & 2 \\ \end{array}$$
As you can see, since the initial 0, 1, 2, and 3 (bit patterns 00, 01, 10, 11) are equally likely to occur, the scaling approach, with maxValue == 3, produces 0 half the time, and 1 and 2 a quarter of the time. That's definitely not uniform distribution.

Although this scaling error is significantly reduced when using larger data types, it is never eliminated. So this scaling approach will not meet the requirements set out in the problem statement.
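The 2-bit example can be verified exhaustively. Here's a quick Python check (my own sketch, just to confirm the table above; the fixed-point scaling is the integer truncation of value × maxValue / 2²):

```python
# Exhaustively enumerate the 2-bit scaling example: floor(r * maxValue / 2**bits)
# for every possible 2-bit draw r, with maxValue = 3.
from collections import Counter

bits, max_value = 2, 3
histogram = Counter((r * max_value) >> bits for r in range(2**bits))
print(dict(histogram))  # {0: 2, 1: 1, 2: 1} -> 0 occurs twice as often
```

The histogram confirms the bias: 0 shows up half the time, 1 and 2 a quarter each.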

---------------------

So that leaves me with a totally new approach: build the candidate random number from just enough random bits to represent maxValue - 1. For example, if maxValue is 5, we create a random number with 3 random bits (it takes 3 bits to represent the number 4); if maxValue is 10, we create a number with 4 random bits (it takes 4 bits to represent 8 and 9). Then we compare the random number to maxValue. If the random number is greater than or equal to maxValue, we throw it away and start over. We continue until the random number is less than maxValue.

I had considered that if the random value is greater than or equal to maxValue, one could simply rearrange the bits, or maybe partially reload the random number rather than load a completely new one. But this approach would place some conditional probability on the next random number. For example, if the rejected number was greater than or equal to maxValue, the reused bits would be more likely to contain '1's (in binary representation). This would lead to non-uniformity in the end result. Therefore, if the random number is greater than or equal to maxValue, it's best to start with a completely new random number.

Code (Text):
/// <summary>
/// Returns a nonnegative random unsigned integer that is
/// less than the specified maximum. Format is UInt64 for
/// both the input parameter and the return value.
/// </summary>
/// <param name="maxValue">Specified maximum value,
/// exclusive</param>
/// <returns>Random number of type UInt64</returns>
public UInt64 NextUInt64(UInt64 maxValue)
{
    int numRandomBits = 1;    // number of random bits needed.
    UInt64 bitMask = 1;       // mask covering maxValue - 1.
    UInt64 returnVal = 0;     // Random number to eventually return.
    bool randNotReady = true; // Flag for checking number.

    // Check for situation where maxValue makes little/no sense.
    if (maxValue <= 1)
        return returnVal;

    // Create the bitMask and calculate number of random bits.
    while (bitMask < (maxValue - 1))
    {
        bitMask = (bitMask << 1) | 1;
        numRandomBits++;
    }

    // Do this as many times as it takes, until random number < maxValue.
    while (randNotReady)
    {
        // Load up the number with random bytes.
        returnVal = (UInt64)NextByte(); // load a byte.
        if (numRandomBits > 8)
            returnVal |= (UInt64)NextByte() << 8; // load another.
        if (numRandomBits > 16)
            returnVal |= (UInt64)NextByte() << 16; // load another.
        if (numRandomBits > 24)
            returnVal |= (UInt64)NextByte() << 24; // load another.
        if (numRandomBits > 32)
            returnVal |= (UInt64)NextByte() << 32; // load another.
        if (numRandomBits > 40)
            returnVal |= (UInt64)NextByte() << 40; // load another.
        if (numRandomBits > 48)
            returnVal |= (UInt64)NextByte() << 48; // load another.
        if (numRandomBits > 56)
            returnVal |= (UInt64)NextByte() << 56; // load another.

        // Remove any unnecessary msbs.
        returnVal &= bitMask;

        // Check if maxValue requirement satisfied.
        if (returnVal < maxValue)
            randNotReady = false; // Done.
    }
    return returnVal; // finished.
}
The method seems to work. But it just seems kind of wasteful to occasionally just throw numbers away like that.

Can anyone think of a better approach that meets the requirements?

Last edited: Feb 26, 2014
2. Feb 26, 2014

### Staff: Mentor

Being confined within a loop until the number suits you could see you stuck there for some unpredictable duration.

Suppose you seek a number between 0 and 21. So, in line with your justification, you generate one between 0 and 31. You could see this as providing you with 2 candidates for what you seek: the range when you count up from 0, or when you count down from 31. So if the number returned is 27, it's too large for your purposes, so counting down from 31 you'd call it 4. (Or do something equivalent to map it to 0→21.)

Calling these the lower and upper candidates, keep choosing the lower until it falls out of range, then swap to using the uppers until you hit a snag there, whereupon you return to choosing the lowers.

It sounds promising? At least there won't be any unnavigable loops to get mired in.

3. Feb 26, 2014

### collinsmark

That is correct. And that is my primary apprehension for choosing such an algorithm.

On the other hand [after thinking about it all day], it may not be so bad an option! I am already using such an approach for my Gaussian normal number method. It uses the Marsaglia polar approach, which in principle can also be stuck indefinitely, since it throws away values outside the unit circle, and continues until it gets a value within.

http://en.wikipedia.org/wiki/Marsaglia_polar_method

Even though in principle it might be stuck indefinitely, the Marsaglia polar approach is more efficient than the more deterministic Box–Muller transform.

----

Anyway, back to this specific approach in this thread.

The fraction of candidates that get discarded is dependent upon how close maxValue is to a power of 2. Worst case, if maxValue is $2^n + 1$, about half the candidates get discarded. On the other hand, if maxValue is a power of 2, none of the candidates get discarded.

So one might say that, on average, only about 1/4 of the candidate numbers get discarded. So it's not unduly expensive.
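That 1/4 estimate can be sanity-checked. Assuming each candidate uses the minimum number of bits needed to represent maxValue - 1, a quick Python tally over every 8-bit maxValue (my own sketch) gives the average rejection probability:

```python
# For maxValue m, a candidate uses k = bit_length(m - 1) random bits,
# and is rejected with probability 1 - m / 2**k.
# Average that rejection probability over all maxValues from 2 to 256.
def rejection_probability(m):
    k = (m - 1).bit_length()
    return 1 - m / 2**k

avg = sum(rejection_probability(m) for m in range(2, 257)) / 255
print(round(avg, 3))  # 0.242 -- consistent with the rough 1/4 estimate
```

The worst cases (just above a power of 2) reject nearly half, the best cases (powers of 2) reject nothing, and the average lands near one quarter.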

I think I can prove mathematically that *any* deterministic reuse of the too-large-candidate number will cause a non-uniformity. And I'm pretty sure that even partial reuse risks biases, although that would be harder for me to prove. [Edit: Assuming we're using a finite number of resources, each with finite precision]

Let's examine your example. Let's assume that maxValue is 28 and we're using a [STRIKE]32-bit system[/STRIKE] [Edit: I mean a 5-bit system; a system that can represent numbers from 0 to 31; it has 32 representable numbers] (That's sort of how I understood your example).
• There is a probability of 28/32 that the first candidate will "pass" and not be rejected. From these, the numbers 27 through 0 will be uniformly distributed.
• There is a probability of 4/32 that the first candidate will "fail." Now, if we take that same number and count down from 31 in order to make it pass, numbers in this category will be uniformly distributed from 0 to 3.

And then Bayes' theorem comes to bite us in the behind. The overall probability of the end result being 0 to 3 is higher than that of any particular value from 4 to 27. That's because there are two ways that a number 0 to 3 can be obtained: either from the first candidate number passing, or from one that fails. Numbers 4 to 27 only occur if the first candidate passes.
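Since the 5-bit system only has 32 states, the fold-down mapping can be enumerated exhaustively. A quick Python check (my sketch of the "count down from 31" rule):

```python
# Enumerate the fold-down for a 5-bit system with maxValue = 28:
# a candidate c < 28 passes unchanged; c >= 28 is mapped to 31 - c.
from collections import Counter

max_value = 28
hist = Counter(c if c < max_value else 31 - c for c in range(32))
print(hist[0], hist[10])  # 2 1 -- values 0..3 occur twice, 4..27 only once
```

So the small values 0 through 3 are exactly twice as likely as the rest, as the Bayes argument predicts.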

Last edited: Feb 26, 2014
4. Feb 27, 2014

### Staff: Mentor

I'll ruminate further...

5. Feb 27, 2014

### collinsmark

Take your time. There's no deadline for this. Again, it's just for fun.

I'm still working on it too. I'll post again if I have any updates with further testing.

But I'll allow modification of the 64-bit version to a simpler 8-bit implementation, if one wishes. Using a smaller data type is where the scaling errors are more pronounced. Using scaling:
$$\mathrm{returned \ value} = \mathrm{(8 \ bit \ random \ number)} \left( \frac{\mathrm{maxValue}}{256} \right)$$
The scaling errors would be very pronounced when using fixed point arithmetic. The 8 bit version would also be easier to code up and test.

In the 8 bit version, maxValue can range from 0 to 255 (which makes it a lot easier to test, as opposed to 0 to $2^{64} - 1$).

Here is an implementation that does not use the scaling method. Rather, the code below creates a random number candidate with the same number of bits necessary to represent maxValue - 1, throws away candidates that are greater than or equal to maxValue, and keeps trying until it finds one less than maxValue. Doing it that way avoids all scaling errors. But it does throw away values occasionally, if maxValue is not a power of 2. It is essentially the same as the code posted in a previous post, but modified for type byte instead of type UInt64.

Code (Text):
/// <summary>
/// Returns a nonnegative random byte (8-bit value) that is
/// less than the specified maximum. Format is byte for
/// both the input parameter and the return value.
/// </summary>
/// <param name="maxValue">Specified maximum value,
/// exclusive. Range is from 0 to 255</param>
/// <returns>Random number of type byte, which is less than
/// the specified value maxValue. Returns maxValue if
/// maxValue is 0.</returns>
public byte NextByte(byte maxValue)
{
    byte bitMask = 1;         // mask covering maxValue - 1.
    byte returnVal = 0;       // Random number to eventually return.
    bool randNotReady = true; // Flag for checking number.

    // Check for situation where maxValue makes little/no sense.
    if (maxValue <= 1)
        return returnVal;

    // Create the bitMask: starting from the right, shift in a '1'
    // for each possible bit in the representation of maxValue - 1,
    // and all smaller numbers.
    while (bitMask < (maxValue - 1))
    {
        bitMask = (byte)((bitMask << 1) | 1);
    }

    // Do this as many times as it takes, until random number < maxValue.
    while (randNotReady)
    {
        // Load up the number with a random byte.
        returnVal = NextByte(); // load a byte.

        // Remove any unnecessary msbs.
        returnVal &= bitMask;

        // Check if maxValue requirement satisfied.
        if (returnVal < maxValue)
            randNotReady = false; // Done.
    }
    return returnVal; // finished.
}
[Edit: Changed the code slightly. As it turns out, there is no need for the "numRandomBits" in this particular 8-bit version. Removed the variable since it was not being used.]

Btw, this whole thread reminds me of this:

Ayn Random

[Source: https://xkcd.com/1277/]

Last edited: Feb 27, 2014
6. Feb 27, 2014

### .Scott

Perhaps you would prefer this (using integer arithmetic of course):
1) Call the input argument "nMax".
2) Call the maximum 64-bit value (hex FFFFFFFFFFFFFFFF) nMax64.
3) Compute limit "nLmt" as follows: (nMax64-nMax+1)/nMax.
4) Get a random 64-bit value "nR64".
5) If nR64/nMax == nLmt, get back to step 4.
6) Return nR64 modulo nMax.

7. Feb 28, 2014

### collinsmark

Hi .Scott, I think your pseudo code might have a bug in it.

When analyzing these algorithms, it's usually easiest to start by transforming the algorithm to one that uses a smaller data type. If there are errors/biases, these errors/biases will be more pronounced when implemented with smaller data types. Also, with smaller data types, we can examine all possible outcomes.

So let's look at your algorithm with smaller data types. In each case, the "specified" value I'll use will always be 3 here. That means we want a random number, equally distributed, between 0, 1, and 2, inclusive (but not 3 or greater).

/////////////////////////
// 2-bit data type.
////////////////////////

Okay, let's say nMax = 3. So we're looking for the program to return 0, 1, or 2, uniformly distributed.

For this analysis, we're only using 2 bits, so "nMax2" is the maximum 2 bit value, 3 (hex 0x3)

rewriting as: (nMax2-nMax+1)/nMax

(3 - 3 + 1)/3

which is evaluated as 0.

So nLmt here is 0.

Since we're only getting 2 bit values here, our initial possibilities for "nR2" are
0
1
2
3
all with equal probability.

So here are our possibilities for nR2/nMax == nLmt comparison:
0/3 == 0 // succeeds!
1/3 == 0 // succeeds!
2/3 == 0 // succeeds!
3/3 == 0 // fails.

So 3 out of 4 times, we go back to step 4. We keep doing that until our nR2 = 3. So after this point it is guaranteed that nR2 = 3.

Returned value is 3 mod 3, which is 0

The returned value is always 0. That doesn't quite work.

Let's try it again, but this time with 4 bits.

////////////////////////
// 4-bit data type
////////////////////////

Okay, let's say nMax = 3. So we're looking for the program to return 0, 1, or 2, uniformly distributed.

For this analysis, we're only using 4 bits, so "nMax4" is the maximum 4 bit value, 15 (hex 0xF)

rewriting as: (nMax4-nMax+1)/nMax

(15 - 3 + 1)/3

which is evaluated as 4.

So nLmt here is 4.

Since we're only getting 4 bit values here, our initial possibilities for "nR4" are
0 // (hex 0x0)
1 // (hex 0x1)
2 // (hex 0x2)
3 // (hex 0x3)
4 // (hex 0x4)
5 // (hex 0x5)
6 // (hex 0x6)
7 // (hex 0x7)
8 // (hex 0x8)
9 // (hex 0x9)
10 // (hex 0xa)
11 // (hex 0xb)
12 // (hex 0xc)
13 // (hex 0xd)
14 // (hex 0xe)
15 // (hex 0xf)
all with equal probability.

So here are our possibilities for nR4/nMax == nLmt comparison:
0/3 == 4 // fails -- 0/3 is 0
1/3 == 4 // fails -- 1/3 is 0
2/3 == 4 // fails -- 2/3 is 0
3/3 == 4 // fails -- 3/3 is 1
4/3 == 4 // fails -- 4/3 is 1
5/3 == 4 // fails -- 5/3 is 1
6/3 == 4 // fails -- 6/3 is 2
7/3 == 4 // fails -- 7/3 is 2
8/3 == 4 // fails -- 8/3 is 2
9/3 == 4 // fails -- 9/3 is 3
10/3 == 4 // fails -- 10/3 is 3
11/3 == 4 // fails -- 11/3 is 3
12/3 == 4 // succeeds! -- 12/3 is 4
13/3 == 4 // succeeds! -- 13/3 is 4
14/3 == 4 // succeeds! -- 14/3 is 4
15/3 == 4 // fails -- 15/3 is 5

So 3 out of 16 times, we go back to step 4. We keep doing that, eliminating the chance that nR4 = 12, 13, or 14. So after this point it is guaranteed that nR4 is one of these values:
0
1
2
3
4
5
6
7
8
9
10
11
15
all with equal probability

finding nR4 mod nMax

0 mod 3 is 0
1 mod 3 is 1
2 mod 3 is 2
3 mod 3 is 0
4 mod 3 is 1
5 mod 3 is 2
6 mod 3 is 0
7 mod 3 is 1
8 mod 3 is 2
9 mod 3 is 0
10 mod 3 is 1
11 mod 3 is 2
15 mod 3 is 0

Histogram[0] = 5
Histogram[1] = 4
Histogram[2] = 4

So that case is better, but still not uniformly distributed.
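The whole 4-bit tally above can be reproduced mechanically. A quick Python check (my own sketch of the pseudocode exactly as written, with the `==` comparison in step 5):

```python
# Reproduce the 4-bit analysis: reject nR when nR // nMax == nLmt,
# then return nR % nMax, for nMax = 3 on a 4-bit data type.
from collections import Counter

n_max = 3
n_max4 = 15                            # largest 4-bit value
n_lmt = (n_max4 - n_max + 1) // n_max  # (15 - 3 + 1) / 3 = 4
kept = [r for r in range(16) if r // n_max != n_lmt]  # rejects 12, 13, 14
hist = Counter(r % n_max for r in kept)
print(dict(hist))  # {0: 5, 1: 4, 2: 4} -> still not uniform
```

It matches the histogram worked out by hand: the stray value 15 survives the check and pushes an extra count onto 0.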

----

I think with some minor tweaks you can fix the bugs. But your idea is in principle the same as mine: we throw away candidate numbers that don't meet a certain criterion.

8. Feb 28, 2014

### .Scott

You're right. Step 5 should be:
5) If nR64/nMax > nLmt, get back to step 4.

The point of this algorithm is to reduce the likelihood that you will have to loop, by starting out with a random number much larger than what you need.

9. Feb 28, 2014

### collinsmark

Right.

On a related note, I've just implemented a new algorithm. In my previous algorithms (shown in this thread) I was shifting in the number of bytes required to represent nMax-1. But then I was masking off any unnecessary bits of the most significant byte. So I was wasting bits, in addition to any numbers that fail the comparison check.

The new algorithm shifts in only the number of bits required for the comparison. For example, if nMax - 1 needs 13 bits to be represented, then only 13 random bits are shifted in (rather than a full two bytes = 16 bits. The previous algorithm would mask off the 3 msbits of the 16 to make 13 bits before the comparison, but those masked bits would then be wasted). It still throws away candidates that are greater than or equal to nMax, but at least no bits are wasted due to the previous byte constraint.
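A rough sketch of that bit-level idea in Python (the bit-buffer design and names here are my own approximation, not the actual C# implementation from the thread):

```python
import random

class BitSource:
    """Buffers random bytes and hands out an arbitrary number of bits,
    so no bits are wasted to byte-boundary masking."""
    def __init__(self, next_byte=lambda: random.getrandbits(8)):
        self.next_byte = next_byte  # stand-in for the thread's NextByte()
        self.buffer = 0
        self.count = 0              # bits currently buffered

    def next_bits(self, n):
        while self.count < n:
            self.buffer = (self.buffer << 8) | self.next_byte()
            self.count += 8
        self.count -= n
        bits = self.buffer >> self.count          # take the top n bits
        self.buffer &= (1 << self.count) - 1      # keep the rest for later
        return bits

def next_below(source, max_value):
    """Rejection sampling using exactly bit_length(max_value - 1) bits per try."""
    k = (max_value - 1).bit_length()
    while True:
        candidate = source.next_bits(k)
        if candidate < max_value:
            return candidate

src = BitSource()
samples = [next_below(src, 13) for _ in range(1000)]
print(all(0 <= s < 13 for s in samples))  # True
```

For maxValue = 13 each attempt draws exactly 4 bits; leftover bits from each byte are saved for the next call rather than masked away.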

I'm running some accuracy/profiling tests on it now, comparing it against the scaling approach, and also against Microsoft's .NET System.Random.Next(Int32) method. I should have some results over the weekend.

Last edited: Feb 28, 2014
10. Feb 28, 2014

### Staff: Mentor

I can't see anything wrong with generating your random integer by sequentially setting individual bits of your 64-bit word to random 1 or 0. Though for an example, I'll consider an 8-bit processor, because I can comfortably manage that on my fingers.

Let's try for an even spread of integers 0..103. Starting with all 0's, set the MSB to random [01]. Pretend it's 1.
Is unsigned 10000000 >103? Yes, so reset the MSB to 0.
Set the 2nd bit to random [01]. Assume it's 1.
Is 01000000 > 103? No, so let it stand.
Set the 3rd bit to random [01]. Assume it's 1.
Is 01100000 > 103? No, so let it stand.
Set the 4th bit to random [01]. Assume it's 1.
Is 01110000 > 103? Yes, so reset the 4th bit to 0.
... proceed through the remaining 4 bits, randomly setting each bit but immediately resetting it to a 0 if the resultant integer would lie outside the desired range 0..103

In an actual implementation you would speed it up by doing most of this in 16-bit blocks when that's what your language's random function returns, but the idea remains the same.

11. Feb 28, 2014

### collinsmark

That would still cause a bias.

Again, suppose the specified input parameter is 3, making the desired outputs uniformly distributed, 0, 1 and 2.

So, using your method it would eventually work its way down to
000000aa where 'a' represents a random bit. This is before we do the comparison check of the final two bits. There are 4 possibilities of the random configuration at this point:
00000000
00000001
00000010
00000011
Each of these cases has an equal probability of occurring. We'll take each one individually.

////////////////////
// Case 00000000
////////////////////
is 00000000 > 2? No, so let it stand. (Returned result: 0)

////////////////////
// Case 00000001
////////////////////
is 00000001 > 2? No, so let it stand. (returned result: 1)

////////////////////
// Case 00000010
////////////////////
is 00000010 > 2? No, so let it stand. (returned result: 2)

////////////////////
// Case 00000011
////////////////////
is 00000011 > 2? Yes, so set second bit to 0.
is 00000001 > 2? No, so let it stand. (Returned result: 1)

Histogram[0] = 1
Histogram[1] = 2
Histogram[2] = 1

Which fails the uniformity test. The algorithm is biased toward '1'.

-------------------------------------------------------------------

Let's try again. Now let's say that our specified input parameter is 6, meaning we want equal probability of returning 0, 1, 2, 3, 4, or 5 (binary, 000 through 101).

Again, we work our way down to the following
00000aaa
where'a' is a random bit. Here are the possibilities with equal probability:
00000000
00000001
00000010
00000011
00000100
00000101
00000110
00000111

////////////////////
// Cases:
// 00000000
// 00000001
// 00000010
// 00000011
// 00000100
// 00000101
//////////////////
Without going into detail, these all pass right through. So they each get '1' in their corresponding histograms from this step.

////////////////////
// Case 00000110
////////////////////
is 00000110 > 5? Yes, so set third bit to 0.
is 00000010 > 5? No, so let it stand. (Returned result: 2)

////////////////////
// Case 00000111
////////////////////
is 00000111 > 5? Yes, so set third bit to 0.
is 00000011 > 5? No, so let it stand. (Returned result: 3)

Histogram[0] = 1
Histogram[1] = 1
Histogram[2] = 2
Histogram[3] = 2
Histogram[4] = 1
Histogram[5] = 1

Biased again.

-----------------------------------------

Initial value 5, making our desired numbers 0, 1, 2, 3, and 4

00000000
00000001
00000010
00000011
00000100
all pass through.

00000101 goes to 1
00000110 goes to 2
00000111 goes to 3

Histogram[0] = 1
Histogram[1] = 2
Histogram[2] = 2
Histogram[3] = 2
Histogram[4] = 1

Biased again.

We can show that using this method will produce a bias as long as the initial, specified input parameter is not a power of 2.
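All three cases can be enumerated in a few lines. Here's a quick Python check (my own reading of the reset step: while the candidate is out of range, clear its highest set bit, which reproduces the walkthroughs above):

```python
# Enumerate the set-then-reset method: draw all k-bit patterns with equal
# probability; while a value is out of range, clear its highest set bit.
from collections import Counter

def fold(value, max_value):
    while value > max_value - 1:
        value &= ~(1 << (value.bit_length() - 1))  # clear highest set bit
    return value

print(dict(Counter(fold(r, 3) for r in range(4))))  # {0: 1, 1: 2, 2: 1}
print(dict(Counter(fold(r, 6) for r in range(8))))  # middle values doubled
print(dict(Counter(fold(r, 5) for r in range(8))))  # middle values doubled
```

In each non-power-of-2 case the out-of-range patterns fold onto the middle of the range, so those values pick up extra counts, exactly as in the hand-worked histograms.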

12. Feb 28, 2014

### .Scott

But that's the opposite of what you want to do. If you pass the number 9 and you only use four bits, you have a 7/16 chance of having to loop at least a second time. If you get all 64 random bits, then you have less than a 0.0001% chance of having to loop.

The modulo function will work as long as you discard the very highest 64-bit random values, as I showed above.

13. Feb 28, 2014

### collinsmark

Take a look at my code included in previous posts. I think you have misunderstood my algorithm.

The way I've done it (by way of partial masking of the most significant byte), is as follows. Suppose the specified input value is 9, meaning we want the method to return numbers 0 through 8. We create a bitmask for the number of bits needed to represent 9 - 1. In this case the mask is
...0000001111 (binary)

Since that can be represented with one byte, I only loaded in one byte from the NextByte() method.
aaaaaaaa (binary, where 'a' is a random bit).

Then I perform a bitwise AND operation of the mask with the random byte which produces:
0000aaaa
And that's what gets compared with original input value.

The probability that that gets through on the first pass (for this example is)
00000000 pass
00000001 pass
00000010 pass
00000011 pass
00000100 pass
00000101 pass
00000110 pass
00000111 pass
00001000 pass
00001001 fail. Get another number
00001010 fail. Get another number
00001011 fail. Get another number
00001100 fail. Get another number
00001101 fail. Get another number
00001110 fail. Get another number
00001111 fail. Get another number

Chance of passing: 9/16 in this case.

In the new algorithm, I can specify the number of bits to get from the random number generator rather than load things in one byte at a time. So the new algorithm doesn't even need the masking anymore.

Last edited: Feb 28, 2014
14. Feb 28, 2014

### .Scott

What would be better would be to use the 1-byte random number - a value of 0 to 255. If it is 252 or more, get another byte. Otherwise, perform a modulo 9 operation and return the result. Your chance of having to get another byte would be 4/256, not 7/16.

The reason you need to discard 252, 253, 254, and 255 is that they would cause 0, 1, 2, and 3 to be biased slightly higher than 4, 5, 6, 7, and 8.

Of course, better yet is to grab 8 bytes and check them against FFFFFFFFFFFFFFF4. Then your chance of having to go back for another 8 bytes would be very tiny.

15. Feb 28, 2014

### collinsmark

I'm not sure I understand what you are saying.

Could you give a more detailed example?

I don't know what you mean by checking the 8 bytes "against FFFFFFFFFFFFFFF4."
That does not make any sense to me.

[Edit: Btw, if the original input value is 9, then in a sense I am discarding 252, 253, 254 and 255 by virtue of the bitmask. All numbers greater than 15 are discarded -- well, actually converted to numbers 0 to 15. This happens automatically by ANDing the random byte with the bitmask, 00001111 (binary) in this case.]

Last edited: Feb 28, 2014
16. Feb 28, 2014

### .Scott

Let's say you have a random number (nBRand) that has a range of nBig (0 to nBig-1) and you want one that has a range of nSmall (0 to nSmall-1).
You can use modulo arithmetic, but you need to be careful of bias.

You seem to be using C++, so consider this:
Code (Text):

nSRand = nBRand % nSmall;

Here is where the bias comes in. Taking our example above, let's say we don't discard 252 to 255. We would get this:
Code (Text):

0   1   2   3   4   5   6   7   8
9  10  11  12  13  14  15  16  17
18  19  20  21  22  23  24  25  26
...
243 244 245 246 247 248 249 250 251
252 253 254 255
Notice that all the values in the first column will give you a zero, and the other columns produce other nSRand results. But the first four columns are 1 value too long.

So you need to set a limit on nBRand; we will call it nBLimit. nBLimit must be less than or equal to nBig and must be a multiple of nSmall. The largest number that meets this criterion is 252. So after we get nBRand, if it's 252 or greater, we throw it away and try again.
That gives us this table:
Code (Text):

0   1   2   3   4   5   6   7   8
9  10  11  12  13  14  15  16  17
18  19  20  21  22  23  24  25  26
...
243 244 245 246 247 248 249 250 251

With all the columns of equal length, all nSRand values have an equal chance of being returned.
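For the one-byte case this is small enough to verify exhaustively. A quick Python sketch of the method (my code; `next_byte` stands in for the byte generator):

```python
# .Scott's method for one byte: discard draws at or above the largest
# multiple of nSmall that fits in 0..255, then reduce modulo nSmall.
from collections import Counter

def next_small(n_small, next_byte):
    n_limit = (256 // n_small) * n_small  # 252 when n_small == 9
    while True:
        b = next_byte()
        if b < n_limit:                   # discard 252..255
            return b % n_small

# Exhaustive uniformity check: feed each surviving byte value through once.
hist = Counter(b % 9 for b in range(252))
print(set(hist.values()))  # {28} -> every residue 0..8 occurs exactly 28 times
```

Each of the 9 "columns" has exactly 252/9 = 28 entries, so the modulo result is perfectly uniform.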

17. Feb 28, 2014

### collinsmark

Thank you! That's a wonderfully insightful approach. I never thought of it that way. And you have eliminated non-uniformity! Very nice. You've certainly at least met the criteria of the original post.

While your proposed method greatly reduces the chance of having to throw away a random number, the number of random bits used up front is greater in your method (assuming nSmall < nBig/2, roughly; otherwise your method and mine are identical). And this is particularly important to me since I am no longer constrained to loading random bits in chunks of bytes -- I can now load an arbitrary number of random bits from the generator at any stage.

So if we associate each random bit with a computational "cost" the problem now becomes one of weighted probabilities. Will it cost more to load in more bits up front with a smaller probability of reload (and avoid any overhead with the bit-queue), or less up front with a higher probability of reload.

It looks like I have some math to do. And maybe I'll code this implementation up this weekend and compare it empirically against other methods.

Definitely worthy of investigation though!

Last edited: Feb 28, 2014
18. Feb 28, 2014

### .Scott

The best case is when nSmall is a power of two. Then you are best using exactly the right number of bits.
The worst case is when nSmall is one more than a power of two; then you have about a 50/50 shot at having to ask for another set of bits.

19. Mar 2, 2014

### collinsmark

I have an update!

I've coded up and tested various methods. Since I'm using Microsoft's free version of Visual Studio, I don't have access to fancy profiling tools. But I did some rough profiling by timing how long it takes to perform long operations. Here are the results.

Note that the horizontal axis is not the actual input parameter, but rather n, where the input parameter maxValue is $2^n$. In other words, the x-axis actually represents the input parameter maxValue as it varies from 0 to 4294967295 (shown on a logarithmic scale).

Operations were done using a 32-bit data type. (Yes, I know the original post specified a 64-bit data type, and we've been doing examples with much smaller data types, but I figured 32-bit data type is a pretty good place to start taking data.)

[Edit: I'm not sure why there is a large jump in cost toward the end. It might be because I got home before the test was finished and watched an episode of Star Trek on the same computer running the test. But I'm not sure if that's the cause. The computer has several cores, and the test program only has one thread. So I don't know.]

Here is a summary of the algorithms.

Minimal Bits (throw away method) ______________
This is essentially my original approach, modified only such that masking is eliminated and only the minimum number of random bits are loaded that are necessary to represent maxValue - 1 and smaller numbers. This method uses the smallest amount of bits up front, but has a higher probability of having to throw them away, if the set of random bits corresponds to a number ≥ maxValue.
o Best approach available for many, relatively small values of maxValue.
o Not efficient for moderate to larger values of maxValue.
o Blows chunks if maxValue is slightly higher than a power of 2.
o Highest amount of overhead, involving shifting numbers around, sometimes on a bit-wise basis.

Modulo, nearest byte (.Scott's method) ______________
This is the approach that was essentially introduced by .Scott in post #16. Random bits are loaded in chunks of bytes. The number of bytes loaded are the minimum number that can represent maxValue - 1. Then a modulo operation is performed. This approach uses a larger number of bits up front, but has a smaller probability of having to throw them away (random bits are thrown away only if they correspond to a number greater than {the largest integer multiple of maxValue} minus 1, that can be represented by the loaded bytes).
o Performs very well overall, particularly with smaller values of maxValue.
o Performs relatively poorly if maxValue is smaller than a whole-byte power of 2 but larger than half of it; i.e., if 128 (0x00000080) < maxValue < 256 (0x00000100), 32768 (0x00008000) < maxValue < 65536 (0x00010000), or 8388608 (0x00800000) < maxValue < 16777216 (0x01000000).

Modulo, full 32 bits (.Scott's method). ______________
This is the same as above, except that a full 32 bit random number is initially loaded in, regardless of how large the input parameter maxValue happens to be. After that though, the algorithm is identical to that described above.
o Performs very well for larger values of maxValue.
o Low overhead, in terms of shifting numbers around.
o Performs relatively poorly for most smaller values of maxValue.

Scaling method. ______________
This is the
$$\mathrm{returned \ value} = \mathrm{(32 \ bit \ random \ number)} \left( \frac{\mathrm{maxValue}}{2^{32}} \right).$$
approach.
o Superior computational performance for moderate to large values of maxValue.
o Simple, and straightforward implementation, with almost no overhead in terms of shifting numbers around.
o Due to rounding errors, produces results that are biased toward certain numbers. This effect is arguably negligible when using 32 bit datatypes as done here, but would be significant if smaller data types were used.
o Not the most efficient for small values of maxValue, since it uses an entire 32 bit word of data every call.
o Does not meet the criteria of the original post, but is shown anyway for comparison.

#### Attached Files:

• RandomComparison.png (21.4 KB)
Last edited: Mar 2, 2014
20. Mar 2, 2014

### Staff: Mentor

Collinsmark, I'd like your opinion of another method I've been tinkering with. Here's an example.

You can produce a random number by summing a handful of smaller random binary numbers. For example, the sum of
a 10 bit number + a 9 bit number + a 4 bit number + another 4 bit number will produce a result that lies in the range 0..1564

What I'd like confirmed is that there is an equal probability of occurrence for every integer in that range.

21. Mar 2, 2014

### collinsmark

It turns out that no, there wouldn't be an equal probability of all numbers in that range. You would end up with some numbers being more likely than others.

For example, let's add a random 1 bit number with another random 1 bit number. Possible outcomes are:

0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 2

Which is biased toward 1.

Generally speaking, adding two uniform binary random numbers of equal size -- whatever that size happens to be -- will produce a result with a triangle-shaped probability distribution.

As you increase the number of equally sized values added together, the distribution of the sum approaches a Gaussian (normal) distribution. The central limit theorem is at play in this case.

Adding together random binary numbers of different sizes complicates the problem a little. But suffice it to say that the sum of uniformly distributed random numbers is not uniform.

22. Mar 2, 2014

### Staff: Mentor

Thanks. It would have been a neat method, nonetheless.

I have one more technique; it's probably basically similar to the one in your initial discussion. I'll use decimals to illustrate.

You want integers randomly over 000000..347751
Pick any random 4-digit number, then tack on a prefix selected at random from the range 00..34
But if the maximum prefix is selected (here it's 34), you must perform this test:
if prefix == 34 and num > 7751, discard this attempt

It would be more straightforward if it were coded to tack on just a single-digit prefix, 0..3 in this case, but you'll end up discarding a lot of attempts. The 2-digit prefix seems a reasonable compromise. I'm confident there is no bias here. It has the same drawback, that of getting waylaid in a black hole for an indeterminate time.

23. Mar 3, 2014

### .Scott

Let's work out what is optimal if the only thing that "costs" is generating a random bit:
We can express our range as (1+f)*2^n where 0<=f<1.
If f==0, then we are working with a power of 2. We extract "n" bits and return them.

When f>0, we need at least n+1 random bits, but we may want to take n+2 or more. Here's how we decide:

If f>0 and we extract n+1 bits, we will call this strategy "1". Then there is a (1-f)/2 chance that we will need to draw bits again. Assuming we stick to this strategy, we will be looking at this series of attempts:
1st try: try rate: 1 (100%); cost="n+1"; success rate=(1+f)/2
2nd try: try rate: (1-f)/2; cost="n+1"; success rate=(1+f)/2
3rd try: try rate: ((1-f)/2)^2; cost="n+1"; success rate=(1+f)/2
...
To compute the average cost for strategy #1 (C1), we can add up the try rates and multiply by the cost per try.
So, with the "retry rate" r=(1-f)/2, the average cost will be:
C1 = (1+r+r^2+r^3+r^4...) (n+1).
Letting q = 1/r = 2/(1-f):
C1 = (1+(q^-1)+(q^-2)+(q^-3)+(q^-4)...) (n+1).
C1 = (1+(1/ (q-1) )) (n+1)
Looking at q-1 and 1/(q-1):
q-1 = ( 2/(1-f) ) -1 = (2-(1-f))/(1-f) = (1+f)/(1-f)
(1/(q-1)) = (1-f)/(1+f)

C1 = (1+ (1/(q-1)) ) (n+1)
C1 = (1+ ((1-f)/(1+f)) ) (n+1)
C1 = ( ((1+f)+(1-f)) / (1+f) ) (n+1)
C1 = ( 2/(1+f) ) (n+1)
C1 = 2(n+1)/(f+1)

If f>0 and we extract n+2 bits, we will call this strategy "2". In this case, our success rate will be:
for f<=(1/3): n+2 bits gives us triple the range we need, so s=(3+3f)/4
for f>(1/3): n+2 bits give us only double the range we need, so s=(2+2f)/4
Assuming we stick to this strategy #2, we will be looking at this series of attempts:
1st try: try rate: 1 (100%); cost="n+2"; success rate=(3+3f)/4 or (2+2f)/4
2nd try: try rate: (1-3f)/4 or (2-2f)/4; cost="n+2"; success rate=(3+3f)/4 or (2+2f)/4
3rd try: try rate: ((1-3f)/4)^2 or ((2-2f)/4)^2; cost="n+2"; success rate=(3+3f)/4 or (2+2f)/4
...
To compute the average cost for strategy #2 (C2), we can add up the try rates and multiply by the cost per try.
So, with the "retry rates" ra=(1-3f)/4 and rb=(2-2f)/4, the average cost will be:
C2 = (1+r+r^2+r^3+r^4...) (n+2).
Letting q = 1/r = 4/(1-3f) or 4/(2-2f):
C2 = (1+(q^-1)+(q^-2)+(q^-3)+(q^-4)...) (n+2).
C2 = (1+(1/ (q-1) )) (n+2)
Looking at q-1 and 1/(q-1):
for f<=(1/3):
qa-1 = ( 4/(1-3f) ) -1 = (4-(1-3f))/(1-3f) = (3+3f)/(1-3f)
(1/(qa-1)) = (1-3f)/(3+3f)
for f>(1/3):
qb-1 = ( 4/(2-2f) ) -1 = (4-(2-2f))/(2-2f) = (2+2f)/(2-2f) = (1+f)/(1-f)
(1/(qb-1)) = (1-f)/(1+f)

We're looking to hold costs to a minimum. When f>(1/3), we have:
C1 = (1+ (1/(q-1)) ) (n+1)
C2 = (1+ (1/(qb-1)) ) (n+2)
But q==qb, so C1 will always win out over C2 when f>(1/3).

When f<(1/3):
C2 = (1+ ( (1-3f)/(3+3f) ) ) (n+2)
C2 = ( ((3+3f)+(1-3f))/(3+3f) ) (n+2)
C2 = ( 4/(3+3f) ) (n+2)
C2 = 4(n+2)/(3+3f)

When f<(1/3), compare:
C1 = 2(n+1)/(f+1)
C2 = 4(n+2)/(3+3f)

Multiplying both by positive value (f+1)/2:
C1(f+1)/2 = (n+1)
C2(f+1)/2 = (2n+4)/3
Subtracting (n+1)
(C1(f+1)/2) - (n+1) = 0
(C2(f+1)/2) - (n+1) = (-n+1)/3

So C2 is no more costly when (-n+1)/3<=0, (-n+1)<=0, (n-1)>=0, n>=1.
Since n>=1 is true when the range is 2 or greater and unimportant when the range is 1, we should use the second method whenever f<(1/3).

Without going through the arithmetic again, I'm guessing the same pattern will follow when we pick n+3 bits. If so, stick with n+2 until you reach f<(1/7); stick with n+3 until you reach f<(1/15); ...

Code that spares every random bit - but does not check for integer overflow.
Code (Text):

//
// For an input nRange, find the number of bits "nNRBits" and the smallest
// power of two "nRBit0" with nRBit0 >= nRange.
for(nNRBits=0,nRBit0=1; nRBit0<nRange; nNRBits++,nRBit0*=2);
if(nRBit0==nRange) {
    return GetRandomBits(nNRBits);   // power of 2: every draw is usable
}
nRBit1 = nRBit0/2;
//
// Compute "f", or actually nRBit1*f.
nRangeF = nRange - nRBit1;
//
// Right now "nNRBits" specifies our "n+1".  Check if collecting more bits
// will reduce our costs.
for(nFraction=3;;) {
    if((nFraction*nRangeF)>=nRBit1) break;
    nNRBits++;
    nRBit0 *= 2;
    nFraction += nFraction+1; // sequence 3, 7, 15, ...
}
//
// Compute our random input limit (for preventing bias).
nLimit = nRBit0 - (nRBit0%nRange);
//
// We have our strategy: draw until the candidate falls below the limit.
// GetRandomBits(b) stands in for "get b random bits".
do {
    nRand = GetRandomBits(nNRBits);
} while(nRand>=nLimit);
return nRand % nRange;

24. Mar 8, 2014

### collinsmark

Sorry I haven't already responded by now. I was swamped this week.

I believe the pattern fully breaks down when the Limit value (I call it maxValue) is 342 (meaning, when asking for a set of numbers from 0 to 341, inclusive). (And assuming we're really talking about the same thing.) I'll explain more later.

Honestly, I lost you after the first half of your explanation. But I suspect we're pretty much on the same page; the first half is good, at least. I'll explain my method below. But if it helps, I think your usage of "(n+1)" is the same number that I am using for 'n', below.

------------------------

Here is the approach to find the optimal number of "extra" bits. First, allow me to define a few terms.
• maxValue: This is the specified input parameter. The returned random number should be greater than or equal to 0 and less than maxValue. It's roughly the same idea as Microsoft's .NET random number terminology here: http://msdn.microsoft.com/en-us/library/9b3ta19y%28v=vs.110%29.aspx
• n: The number of bits it takes to represent the largest possible random number that we might need to return as our output. Put another way, it's the number of bits it takes to represent maxValue - 1.
• k: The number of "extra" bits, beyond n, that can make the overall process more efficient. For a given maxValue, there exists an optimal k.
• p[n+k]: The probability that a candidate random number, containing n+k bits, will pass the uniformity test on a single try.

I find it helps with examples, so let's try a few.

// ********************
// Example: maxValue = 5.
// ********************

Here, n = 3 since it takes three bits to represent 5-1 = 4. In other words, our final output numbers, in binary, can be 000, 001, 010, 011, and 100. The binary values 101 (decimal 5), 110 (decimal 6), and 111 (decimal 7) will not be output because they are too big. But we still need 3 bits to represent the valid numbers we might return.

If k = 0, the candidate, 3 bit, random numbers are:
Code (Text):

0   1   2   3   4
5   6   7

So if the random number candidate is 0 through 4, we keep it. If it is 5 to 7, we throw it away to maintain uniformity. The probability of success of a single candidate in this case is p[n+k] = 5/8.

If k = 1, the candidate, 3+1 bit, random numbers are:
Code (Text):

0    1    2    3    4
5    6    7    8    9
10   11   12   13   14
15

In this case, if the candidate random number is 15, we throw it away. Otherwise, we take (random number candidate) mod maxValue, and that is our output. The probability of success of a single random number candidate is p[n+k] = 15/16.

// ********************
// Example: maxValue = 3.
// ********************

It takes two bits to represent 00, 01, and 10. So n = 2 here.

With k = 0, candidates are:
Code (Text):

0   1   2
3

And p[n+k] = 3/4.

With k = 1, candidates are:
Code (Text):

0   1   2
3   4   5
6   7

And p[n+k] = 6/8 = 3/4 (doesn't change in this case).

// ********************
// Example: maxValue = 17.
// ********************

Here, n = 5, because it takes five bits to represent 17-1 = 16 (binary 1 0000).

With k = 0, our candidates are from 0 to 2^5 - 1 = 31
Code (Text):

0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
17  18  19  20  21  22  23  24  25  26  27  28  29  30  31

This gives a probability of success on the first candidate of 17/32 = 0.53125. Not a very good probability.

With k = 1, our candidates are from 0 to 2^6 - 1 = 63
Code (Text):

0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33
34  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50
51  52  53  54  55  56  57  58  59  60  61  62  63

With that, our probability of success for a single candidate rises to p[n+k] = (64 - 13)/64 = 0.796875. Not great, but much better.

With k = 2, our candidates are from 0 to 2^7 - 1 = 127
Code (Text):

0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
17   18   19   20   21   22   23   24   25   26   27   28   29   30   31   32   33
34   35   36   37   38   39   40   41   42   43   44   45   46   47   48   49   50
51   52   53   54   55   56   57   58   59   60   61   62   63   64   65   66   67
68   69   70   71   72   73   74   75   76   77   78   79   80   81   82   83   84
85   86   87   88   89   90   91   92   93   94   95   96   97   98   99  100  101
102  103  104  105  106  107  108  109  110  111  112  113  114  115  116  117  118
119  120  121  122  123  124  125  126  127

So if our candidates range from 119 to 127, we throw them away. Probability of success of a single candidate is p[n+k] = (128 - 9)/128 = 0.9296875. That's a much better probability. But it came with a cost, the extra two bits.

The probability of success of a single candidate is
$$p[n+k] = \frac{\mathrm{number \ of \ valid \ candidates}}{\mathrm{number \ of \ total \ candidates}}$$
Noting that the number of candidates in the last (partial) line of each set above is (number of total candidates) mod (maxValue),
$$p[n+k] = \frac{(\mathrm {number \ of \ total \ candidates}) - ( \mathrm{number \ of \ total \ candidates \ mod \ maxValue)}}{\mathrm{number \ of \ total \ candidates}}$$
And since the total number of candidates is $2^{n+k}$,
$$p[n+k] = \frac{2^{n+k} - \left( 2^{n+k} \mathrm{mod \ maxValue} \right)}{2^{n+k}}$$

We'll come back to that in a moment. First we need to discuss cost.

We've already defined p as the probability of success of a single candidate. So the probability of failure of a single candidate is 1 - p. Let's define that as r = 1 - p.

The cost of a single candidate is n + k bits.
The cost of two candidates is 2(n + k) bits.
The cost of three candidates is 3(n + k) bits.
and so on.

So the total cost of the operation is

$$\mathrm{Cost} = (\mathrm{probability \ of \ success})(\mathrm{cost \ of \ candidate}) + (\mathrm{probability \ of \ failure})(\mathrm{cost \ of \ failure})$$
Noting that probability of failure of a single candidate is r = 1 - p, and what the cost of candidates are, we can put it all together as,
$$\mathrm{Cost} = p(n+k) + r\left( 2p(n+k) + r\left( 3p(n+k) + r\left( 4p(n+k) + \cdots \right)\right)\right)$$
multiplying the rs through,
$$\mathrm{Cost} = p(n+k) + 2rp(n+k) + 3r^2p(n+k) + 4r^3p(n+k) + \cdots$$
factoring out (n+k)p,
$$\mathrm{Cost} = (n+k)p \left(1 + 2r + 3r^2 + 4r^3 + 5r^4 + ... \right)$$
And that's obviously
$$\mathrm{Cost} = \frac{n+k}{p}$$
Ha! I'm kidding about it being obvious. So how did I get that? For that I need to step back a little.

Consider the infinite series polynomial,
$$S = 1 + X + X^2 + X^3 + X^4 + X^5 ....$$
We can factor out an X on some of the terms:
$$S = 1 + X(1 + X + X^2 + X^3 + X^4 + X^5 ....)$$
But the stuff in the parentheses is the same thing as S.
$$S = 1 + XS$$
Solving for S gives
$$S = \frac{1}{1 - X}$$
Therefore,
$$1 + X + X^2 + X^3 + X^4 + X^5 .... = \frac{1}{1 - X}$$
Neato. But where does that get us? Well here's the trick. Take the derivative of both sides of the equation with respect to X.
$$\frac{\partial}{\partial X} \left\{ 1 + X + X^2 + X^3 + X^4 + X^5 .... \right\}= \frac{\partial}{\partial X} \left\{ \frac{1}{1 - X} \right\}$$
$$1 + 2X + 3X^2 + 4X^3 + 5X^4 + \cdots = \frac{1}{\left( 1 - X \right)^2}$$
Now we can use that in our total cost equation.
$$\mathrm{Cost} = \frac{(n+k)p}{(1 - r)^2}$$
And noting that 1 - r = p
$$\mathrm{Cost} = \frac{(n+k)}{p}$$
But we can't stop there. We already know what p is, so let's throw that in.

$$\mathrm{Cost} = \frac{(n+k)2^{n+k}}{2^{n+k} - \left( 2^{n+k} \ \mathrm{mod \ maxValue} \right)}$$

----------
If you go through the numbers on a few examples, you'll find that the denominator seems to reduce to the form (1)maxValue, (3)maxValue, (7)maxValue, (15)maxValue, and so on. One might be tempted to simplify the denominator to $\left(2^{k+1}-1 \right) \mathrm{maxValue}$. But that would be a mistake: the pattern breaks down as n + k becomes larger. The first instance where the pattern breaks and the optimal k is chosen with the broken pattern is maxValue = 342 (n = 9, k = 3).
-----------

So I've coded it up. The problem is that if the optimal k is chosen on every call, it takes more overhead than it would just to use a full word of random numbers. Even with optimizations such as coding $2^{n+k}$ as (1 << NplusK), and other performance tweaks, it's not enough.

Instead of calculating k on every call, I've made a static function that returns a short set of parameters, in the form of a small structure, for a given maxValue. These parameters include maxValue, n+k, and the largest multiple of maxValue, above which candidate numbers are thrown away. The calling function then passes the structure as the input parameter to the Next() function/method (instead of passing maxValue alone). This is typically fine, since most applications reuse the same maxValue repeatedly. And even if not, the calling function will know when it needs a different maxValue (it can even store a structure for each maxValue it plans to use, if it wishes).

Preliminary results show it works well: lightning fast compared to the other approaches, particularly for small values of maxValue. I'm still doing some minor performance optimizations, and I'll post back later after running some numbers if it looks interesting.

Last edited: Mar 9, 2014