Generating a random sample with a standard deviation

SUMMARY

This discussion focuses on generating a random list of numbers that conform to a bell curve, utilizing Python programming. The user initially created a list with varying frequencies of numbers but seeks a more sophisticated method. Key techniques mentioned include the Box-Muller transformation and rejection sampling, both of which address the challenges of pseudo-random number generation and achieving the desired distribution. References to authoritative sources such as Knuth's "The Art of Computer Programming" and MATLAB's normrnd function provide additional context and guidance.

PREREQUISITES
  • Understanding of Python programming
  • Familiarity with statistical concepts such as mean and standard deviation
  • Knowledge of probability distributions, particularly the normal distribution
  • Basic comprehension of numerical methods for solving equations
NEXT STEPS
  • Research the Box-Muller transformation for generating normally distributed random numbers
  • Explore rejection sampling techniques for sampling from complex distributions
  • Learn about the inverse transform method for generating random samples
  • Investigate MATLAB's normrnd function for practical applications in random number generation
USEFUL FOR

Data scientists, statisticians, and software developers interested in advanced random number generation techniques and statistical modeling.

gamow99
I'm trying to write a computer program that generates a random list of numbers whose values form a bell curve, that is, they have a mean and a standard deviation about that mean. I'm not interested in a library function that just gets the job done; rather, I'm trying to understand how to generate a list of numbers that are not entirely random but conform to a bell curve. I have already done the following in Python:

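import random

# Stack progressively narrower ranges so values near 5 are repeated more often.
list5 = [5] * 8
list4 = [4, 5, 6] * 4
list3 = [3, 4, 5, 6, 7] * 2
list2 = list(range(2, 9))
list1 = list(range(1, 11))
list6 = list1 + list2 + list3 + list4 + list5

sample = random.choice(list6)  # draw one value from the weighted list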

So in the combined list, 5 appears 16 times, 4 and 6 appear 8 times each, 3 and 7 appear 4 times each, 2 and 8 appear twice each, and 1, 9 and 10 appear once each, which roughly forms a bell curve, and then I just select randomly from list6. But I don't like that solution.
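As a quick check on the list-based approach, here is a minimal sketch that rebuilds the combined list and tallies how often each value actually occurs. The helper name build_weighted_list and the fixed seed are assumptions made for illustration; they are not part of the original post.

```python
import random
from collections import Counter

def build_weighted_list():
    """Rebuild the combined list from the post (hypothetical helper name)."""
    list5 = [5] * 8
    list4 = [4, 5, 6] * 4
    list3 = [3, 4, 5, 6, 7] * 2
    list2 = list(range(2, 9))
    list1 = list(range(1, 11))
    return list1 + list2 + list3 + list4 + list5

pool = build_weighted_list()
counts = Counter(pool)

# Print a crude text histogram of the frequencies.
for value in sorted(counts):
    print(f"{value:2d} {'#' * counts[value]}")

# Draw from the pool; the seed is only there to make runs repeatable.
rng = random.Random(0)
draws = [rng.choice(pool) for _ in range(1000)]
```

The tally shows that the shape is only roughly bell-like: it is limited to the integers 1 to 10, and it is not symmetric about 5 because 10 has no counterpart on the left.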
 
There are two problems in pseudo-random number generation that can be handled independently. The first is that the series of numbers should have as few detectable autocorrelations as possible. The second is to get the desired sample distribution. If the first problem is solved by a generator that produces uniformly distributed numbers in [0,1), then there are several ways to use that generator to solve the second problem.

There has been a great deal of work done to solve both problems. An excellent reference is Knuth, The Art of Computer Programming, Vol 2: Seminumerical Algorithms. Chapter 3, Random Numbers. (Knuth's series of books is almost a bible for computer programmers.)

I do not advise you to try writing your own uniform random number generator unless you are prepared to learn a lot of number theory.
The easiest, most versatile, brute-force method to solve the second problem is "rejection sampling" (see https://en.wikipedia.org/wiki/Rejection_sampling). For the special case of the normal distribution there are several other techniques. A popular one is the Box-Muller transformation (see http://www.design.caltech.edu/erik/Misc/Gaussian.html). MathWorks uses other techniques in its MATLAB normrnd function, which it documents reasonably well (see https://www.mathworks.com/company/newsletters/articles/normal-behavior.html).
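To make the two named techniques concrete, here is a minimal Python sketch of each, built only on a uniform generator from the standard random module. The function names, the parameter names, and the choice to truncate the rejection sampler at 5 standard deviations are assumptions made for illustration; this is the basic textbook form of each method, not the code from the linked pages.

```python
import math
import random
import statistics

def box_muller(mean=0.0, sd=1.0, rng=random):
    """Return one normal deviate using the basic Box-Muller transform."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is defined
    u2 = rng.random()
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return mean + sd * z

def rejection_normal(mean=0.0, sd=1.0, width=5.0, rng=random):
    """Return one deviate from a normal truncated to mean +/- width*sd.

    Proposal: uniform on [-width, width] (assumed cutoff).
    Accept z with probability exp(-z^2 / 2), the bell shape scaled to peak 1.
    """
    while True:
        z = rng.uniform(-width, width)
        if rng.random() < math.exp(-0.5 * z * z):
            return mean + sd * z

# Usage: draw samples with mean 100 and standard deviation 15.
rng = random.Random(42)
bm = [box_muller(100, 15, rng) for _ in range(10_000)]
rj = [rejection_normal(100, 15, rng=rng) for _ in range(10_000)]
print(statistics.mean(bm), statistics.stdev(bm))
print(statistics.mean(rj), statistics.stdev(rj))
```

The rejection sampler discards roughly three out of four proposals with this cutoff, which is the price of its generality: the same loop works for any bounded density on a finite interval by swapping the acceptance test.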
 
Suppose $f(x) = f(x, \mu, \sigma)$ is your curve with known mean $\mu$ and standard deviation $\sigma$, and $f(x) \ge 0$. Find $C = \int_{-\infty}^{\infty} f(x)\,dx$. Draw a random number with, say, three digits and put a decimal point before it. Let this fraction be $R$. Find $x$ by solving $\frac{1}{C}\int_{-\infty}^{x} f(t)\,dt = R$. Then $x$ is a sample from $f(x)$.
 
ssd said:
Suppose $f(x) = f(x, \mu, \sigma)$ is your curve with known mean $\mu$ and standard deviation $\sigma$, and $f(x) \ge 0$. Find $C = \int_{-\infty}^{\infty} f(x)\,dx$. Draw a random number with, say, three digits and put a decimal point before it. Let this fraction be $R$. Find $x$ by solving $\frac{1}{C}\int_{-\infty}^{x} f(t)\,dt = R$. Then $x$ is a sample from $f(x)$.
If the cumulative distribution function is invertible, this is a great method. It's called the inverse transform method (see https://en.wikipedia.org/wiki/Inverse_transform_sampling )
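For a case where the CDF can be inverted by hand, here is a minimal sketch of the inverse transform method. The exponential distribution is used purely as an illustration (it is not discussed in this thread): its CDF is F(x) = 1 - exp(-rate * x), so solving F(x) = R gives x = -ln(1 - R) / rate. The function and parameter names are assumptions.

```python
import math
import random

def sample_exponential(rate, rng=random):
    """Inverse transform sampling for an exponential distribution."""
    r = rng.random()                  # uniform R in [0, 1)
    return -math.log(1.0 - r) / rate  # x such that F(x) = R

# Usage: the sample mean should come out close to 1 / rate.
rng = random.Random(7)
xs = [sample_exponential(2.0, rng) for _ in range(10_000)]
print(sum(xs) / len(xs))
```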
 
More often than not, the inverse of a CDF cannot be expressed analytically in terms of simple functions, so we have to use a computer program to solve for it numerically.
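For the normal distribution, one way to carry out that numerical solution is bisection on the CDF, sketched below. The CDF is written with math.erf from the standard library; the function names, the search bracket of mean ± 10 standard deviations, and the fixed count of 60 bisection steps are assumptions chosen for illustration, not a tuned implementation.

```python
import math
import random

def normal_cdf(x, mean=0.0, sd=1.0):
    """CDF of the normal distribution, expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def invert_normal_cdf(r, mean=0.0, sd=1.0, steps=60):
    """Solve normal_cdf(x) = r for x by bisection (assumed bracket: 10 sd)."""
    lo, hi = mean - 10.0 * sd, mean + 10.0 * sd
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if normal_cdf(mid, mean, sd) < r:
            lo = mid   # the solution lies to the right of mid
        else:
            hi = mid   # the solution lies at or to the left of mid
    return 0.5 * (lo + hi)

def sample_normal(mean=0.0, sd=1.0, rng=random):
    """Draw R uniformly from [0, 1) and map it through the inverse CDF."""
    return invert_normal_cdf(rng.random(), mean, sd)

# Usage: draw a few values with mean 100 and standard deviation 15.
rng = random.Random(11)
print([round(sample_normal(100, 15, rng), 2) for _ in range(5)])
```

Bisection is slow compared with specialised approximations of the inverse normal CDF, but it only needs the CDF to be increasing, so the same loop works for other distributions by swapping normal_cdf.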
 