- #1
- 3,405
- 2,910
Code: Random number less than specifed value
This isn't really a homework problem. I'm just doing this for fun and giggles. But given the nature of the forum rules, I'll post it here.
Create a method (function) that returns a random number less than the specified input value. Both the input and return value are to be unsigned, 64 bit values (UInt64).
UInt64 NextUInt64(UInt64 maxValue)
The functionality is to be similar to that of MSDN System.Random.Next(Int32 maxValue) method, except for data type UInt64. Also, the following requirements must be met:
Requirements:
Normal coding stuff.
This problem would be easy if maxValue was guaranteed to be power of 2. Simply load up a UInt64 value with random bytes, mask of any unnecessary most significant bits (msbs), and we're finished. But what if maxValue isn't a power of 2?
The first idea that ran through my mind was to scale it. For example, one could code up
[tex] \mathrm{returned \ value} = \mathrm{(64 \ bit \ random \ number)} \left( \frac{\mathrm{maxValue}}{2^{64}} \right). [/tex]
Of course we couldn't introduce floating point numbers for the scaling factor. Even double precision floating point numbers are not precise enough, and uniformity cannot be guaranteed.
But even with fixed point manipulation there remains a major problem. This is most easily demonstrated by examining a smaller data type. Consider for a moment that we are working with a 2-bit data type. Suppose such a data type existed, UInt2. Let's name the method, UInt2 NextUInt2(UInt2 maxValue). Values of this data type can be 0, 1, 2 and 3. Now consider for a moment that maxValue is 3. That means the method must return uniformly distributed random numbers 0, 1, and 2. Using the scaling approach, we would load in two, random bits, and scale. The scaling involves multiplying times 3 and dividing by 22 = 4. Using that method would produce the following for all combinations of our two random bits:
[tex] \begin{cases}
\mathrm{random \ bits} & \mathrm{scaling \ result} \\
0 \ 0 & 0 \\
0 \ 1 & 0 \\
1 \ 0 & 1 \\
1 \ 1 & 2 \\
\end{cases}[/tex]
As you can see, since the initial 0, 1, 2, and 3 (bit patterns 00, 01, 10, 11) are equally likely to occur, the scaling approach, with maxValue == 3, produces 0 half the time, and 1 and 2 a quarter of the time. That's definitely not uniform distribution.
Although this scaling error is significantly reduced when using larger data types, it is never eliminated. So this scaling approach will not meet the requirements of set out in the problem statement.
---------------------
So that leaves me with a totally new approach: load up the bits in the candidate random number to the nearest power 2 to maxValue. For example, if maxValue is 5, we create a random number with 3 random bits (it takes 3 bits to represent the number 4); If maxValue is 10, we create a number with 4 random bits (it takes 4 bits to represent 8 and 9). Then we compare the random number to maxValue. If the random number is greater than or equal to maxValue, we throw it away and start over. We continue until the random number is less than maxValue.
I had considered that if the random value is greater than or equal to maxValue, one could simply rearrange the bits, or maybe partially reload the random number rather than load a completely new one. But this approach would place some conditional probability on the next random number. For example if the new number is greater than maxValue, the next number would be more likely to contain '1' (binary representation). This would lead to non-uniformity in the end result. Therefore, if the random number is greater than or equal to maxValue, it's best to start with a completely new random number.
The method seems to work. But it just seems kind of wasteful to occasionally just throw numbers away like that.
Can anyone think of a better approach that meets the requirements?
This isn't really a homework problem. I'm just doing this for fun and giggles. But given the nature of the forum rules, I'll post it here.
Homework Statement
Create a method (function) that returns a random number less than the specified input value. Both the input and return value are to be unsigned, 64 bit values (UInt64).
UInt64 NextUInt64(UInt64 maxValue)
The functionality is to be similar to that of MSDN System.Random.Next(Int32 maxValue) method, except for data type UInt64. Also, the following requirements must be met:
Requirements:
- The returned value must be a random number uniform across [0, maxValue - 1]. Non-uniformity due to rounding/truncation errors is not acceptable.
- Returned values must be uncorrelated with each other.
- You are given an existing method NextByte() that returns a random byte, uniform across [0, 255]. It can be assumed for the purposes of this exercise that NextByte() is perfectly uniform with arbitrarily large periodicity (it can be assumed that the returned NextByte() numbers are completely uncorrelated). It is acceptable to utilize this NextByte() method.
- Given the choice between speed and memory, speed is preferred. However, it must not violate the other requirements under any circumstances.
Homework Equations
Normal coding stuff.
The Attempt at a Solution
This problem would be easy if maxValue was guaranteed to be power of 2. Simply load up a UInt64 value with random bytes, mask of any unnecessary most significant bits (msbs), and we're finished. But what if maxValue isn't a power of 2?
The first idea that ran through my mind was to scale it. For example, one could code up
[tex] \mathrm{returned \ value} = \mathrm{(64 \ bit \ random \ number)} \left( \frac{\mathrm{maxValue}}{2^{64}} \right). [/tex]
Of course we couldn't introduce floating point numbers for the scaling factor. Even double precision floating point numbers are not precise enough, and uniformity cannot be guaranteed.
But even with fixed point manipulation there remains a major problem. This is most easily demonstrated by examining a smaller data type. Consider for a moment that we are working with a 2-bit data type. Suppose such a data type existed, UInt2. Let's name the method, UInt2 NextUInt2(UInt2 maxValue). Values of this data type can be 0, 1, 2 and 3. Now consider for a moment that maxValue is 3. That means the method must return uniformly distributed random numbers 0, 1, and 2. Using the scaling approach, we would load in two, random bits, and scale. The scaling involves multiplying times 3 and dividing by 22 = 4. Using that method would produce the following for all combinations of our two random bits:
[tex] \begin{cases}
\mathrm{random \ bits} & \mathrm{scaling \ result} \\
0 \ 0 & 0 \\
0 \ 1 & 0 \\
1 \ 0 & 1 \\
1 \ 1 & 2 \\
\end{cases}[/tex]
As you can see, since the initial 0, 1, 2, and 3 (bit patterns 00, 01, 10, 11) are equally likely to occur, the scaling approach, with maxValue == 3, produces 0 half the time, and 1 and 2 a quarter of the time. That's definitely not uniform distribution.
Although this scaling error is significantly reduced when using larger data types, it is never eliminated. So this scaling approach will not meet the requirements of set out in the problem statement.
---------------------
So that leaves me with a totally new approach: load up the bits in the candidate random number to the nearest power 2 to maxValue. For example, if maxValue is 5, we create a random number with 3 random bits (it takes 3 bits to represent the number 4); If maxValue is 10, we create a number with 4 random bits (it takes 4 bits to represent 8 and 9). Then we compare the random number to maxValue. If the random number is greater than or equal to maxValue, we throw it away and start over. We continue until the random number is less than maxValue.
I had considered that if the random value is greater than or equal to maxValue, one could simply rearrange the bits, or maybe partially reload the random number rather than load a completely new one. But this approach would place some conditional probability on the next random number. For example if the new number is greater than maxValue, the next number would be more likely to contain '1' (binary representation). This would lead to non-uniformity in the end result. Therefore, if the random number is greater than or equal to maxValue, it's best to start with a completely new random number.
Code:
/// <summary>
/// Returns a nonnegative random unsigned integer that is
/// less than the specified maximum. Format is UInt64 for
/// both the input parameter and the return value.
/// </summary>
/// <param name="maxValue">Specified maximum value,
/// exclusive</param>
/// <returns>Random number of type UInt64</returns>
public UInt64 NextUInt64(UInt64 maxValue)
{
int numRandomBits = 1; // number of random bits needed.
UInt64 bitMask = 1; // bit mask
UInt64 returnVal = 0; // Random number to eventually return.
bool randNotReady = true; // Flag for checking number.
// Check for situation where maxValue makes little/no sense.
if (maxValue <= 1)
return returnVal;
// Create the bitMask and calculate number of random bits.
while (bitMask < (maxValue - 1))
{
bitMask = (UInt64)((UInt64)((UInt64)(bitMask << 1)) | 1);
numRandomBits++;
}
//Do this as many times as it takes, until random number < maxValue.
while (randNotReady)
{
// Load up the number with random bytes.
returnVal = (UInt64)NextByte(); // load a byte.
if (numRandomBits > 8)
returnVal |= (UInt64)((UInt64)NextByte() << 8); // load another.
if (numRandomBits > 16)
returnVal |= (UInt64)((UInt64)NextByte() << 16); // load another.
if (numRandomBits > 24)
returnVal |= (UInt64)((UInt64)NextByte() << 24); // load another.
if (numRandomBits > 32)
returnVal |= (UInt64)((UInt64)NextByte() << 32); // load another.
if (numRandomBits > 40)
returnVal |= (UInt64)((UInt64)NextByte() << 40); // load another.
if (numRandomBits > 48)
returnVal |= (UInt64)((UInt64)NextByte() << 48); // load another.
if (numRandomBits > 56)
returnVal |= (UInt64)((UInt64)NextByte() << 56); // load another.
// Remove any unnessesary msbs.
returnVal &= bitMask;
// Check if maxValue requirment satisfied
if (returnVal < maxValue)
randNotReady = false; // now ready to return.
}
return returnVal; // finished.
}
The method seems to work. But it just seems kind of wasteful to occasionally just throw numbers away like that.
Can anyone think of a better approach that meets the requirements?
Last edited: