Generating a random sample with a standard deviation

In summary, generating a random sample with a given mean and standard deviation means drawing values from a probability distribution, typically the normal (bell-curve) distribution, with those parameters. The usual approach starts from a uniform pseudo-random number generator and transforms its output, for example by rejection sampling, the Box-Muller transformation, or inverse transform sampling.
  • #1
gamow99
I'm trying to write a computer program which generates a random list of numbers whose values form a bell curve, that is, there is a mean and a standard deviation around that mean. I'm not interested in some function that just gets the job done; rather, I'm trying to understand how you generate a list of numbers which are random but still conform to a bell curve. I have already done the following in Python:

# Weighted pool: values near 5 are repeated more often than values near the edges.
list5 = [5] * 8
list4 = [4, 5, 6] * 4
list3 = [3, 4, 5, 6, 7] * 2
list2 = list(range(2, 9))   # 2..8
list1 = list(range(1, 11))  # 1..10
list6 = list1 + list2 + list3 + list4 + list5

So in the above, 5 appears 16 times as often as 1, 9 and 10, 8 times as often as 2 and 8, 4 times as often as 3 and 7, and twice as often as 4 and 6, which does form a bell curve, and then I just select randomly from list6. But I don't like that solution.
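For reference, the selection step described above could be sketched as follows. This is only an illustration, assuming Python's standard `random` module; the names `pool`, `pool_counts` and `sample`, the seed, and the sample size are not from the original post. Because `choice` picks uniformly from the pool, each value's probability is proportional to the number of times it appears there.

```python
import random
from collections import Counter

# Weighted pool from the post: values near 5 are repeated more often.
pool = (
    list(range(1, 11))
    + list(range(2, 9))
    + [3, 4, 5, 6, 7] * 2
    + [4, 5, 6] * 4
    + [5] * 8
)

# How often each value occurs in the pool; this fixes its selection probability.
pool_counts = Counter(pool)

# Draw a sample by picking uniformly from the pool (the seed is an illustrative choice).
rng = random.Random(0)
sample = [rng.choice(pool) for _ in range(1000)]
```

The shape of the resulting histogram is fixed by the hand-built weights, so changing the mean or standard deviation means rebuilding the pool.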
 
  • #3
There are two problems with pseudo-random numbers that can be handled independently. The first is that the series of numbers should have as little detectable autocorrelation as possible. The second is to get the desired sample distribution. If the first problem is solved for generating a uniform distribution of numbers in [0,1), then there are several ways to use that to solve the second problem.

There has been a great deal of work done to solve both problems. An excellent reference is Knuth, The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, Chapter 3, "Random Numbers". (Knuth's series of books is almost a bible for computer programmers.)

I do not advise you to try writing your own uniform random number generator unless you are prepared to learn a lot of number theory concepts.
The easiest, most versatile, brute-force method to solve the second problem is to use "rejection sampling" (see https://en.wikipedia.org/wiki/Rejection_sampling). For the special case of the normal distribution there are several other techniques. A popular one is the Box-Muller transformation (see http://www.design.caltech.edu/erik/Misc/Gaussian.html). MathWorks uses other techniques in their MATLAB normrnd function, which they document reasonably well (see https://www.mathworks.com/company/newsletters/articles/normal-behavior.html).
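As a rough illustration of the two techniques named above, here is a minimal Python sketch built on the standard library's uniform generator. The function names, the seed, the sample size, and the choice to truncate the rejection sampler to the interval mu ± 5 sigma are all assumptions made for this example; they do not come from the linked pages.

```python
import math
import random
import statistics

def box_muller(rng, mu=0.0, sigma=1.0):
    """Return one normal variate built from two uniform variates (basic Box-Muller form)."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is defined
    u2 = rng.random()
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return mu + sigma * z

def rejection_normal(rng, mu=0.0, sigma=1.0, width=5.0):
    """Rejection sampling for a normal truncated to mu +/- width * sigma.

    Proposal: uniform on that interval. A candidate x is accepted with
    probability f(x) / f(mu), where f is the unnormalised bell curve.
    """
    while True:
        x = rng.uniform(mu - width * sigma, mu + width * sigma)
        accept_prob = math.exp(-0.5 * ((x - mu) / sigma) ** 2)
        if rng.random() < accept_prob:
            return x

rng = random.Random(12345)  # illustrative seed
bm = [box_muller(rng, mu=5.0, sigma=2.0) for _ in range(20000)]
rj = [rejection_normal(rng, mu=5.0, sigma=2.0) for _ in range(20000)]

# Both samples should have mean near 5 and standard deviation near 2.
bm_mean, bm_sd = statistics.mean(bm), statistics.stdev(bm)
rj_mean, rj_sd = statistics.mean(rj), statistics.stdev(rj)
```

With a uniform proposal over ten standard deviations, the rejection sampler keeps only about a quarter of its candidates, which is the price of its generality; Box-Muller uses every pair of uniform draws.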
 
  • #4
Suppose $f(x) = f(x; \mu, \sigma)$ is your curve with known mean $\mu$ and standard deviation $\sigma$, and $f(x) \ge 0$. Find $C = \int_{-\infty}^{\infty} f(x)\,dx$. Draw a three-digit (say) random number and put a decimal point in front of it; let this fraction be $R$. Find $x$ by solving $\frac{1}{C}\int_{-\infty}^{x} f(t)\,dt = R$. This $x$ is now a sample from $f(x)$.
 
  • #5
ssd said:
Suppose $f(x) = f(x; \mu, \sigma)$ is your curve with known mean $\mu$ and standard deviation $\sigma$, and $f(x) \ge 0$. Find $C = \int_{-\infty}^{\infty} f(x)\,dx$. Draw a three-digit (say) random number and put a decimal point in front of it; let this fraction be $R$. Find $x$ by solving $\frac{1}{C}\int_{-\infty}^{x} f(t)\,dt = R$. This $x$ is now a sample from $f(x)$.
If the cumulative distribution function is invertible, this is a great method. It's called the inverse transform method (see https://en.wikipedia.org/wiki/Inverse_transform_sampling).
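A minimal sketch of the inverse transform method for the normal case, assuming Python 3.8 or later, where the standard library's `statistics.NormalDist` provides an `inv_cdf` method. The function name `inverse_transform_normal`, the seed, and the parameter values are illustrative only.

```python
import random
import statistics
from statistics import NormalDist

def inverse_transform_normal(rng, mu, sigma):
    """Draw R uniformly in (0, 1) and return the x whose CDF value equals R."""
    r = rng.random()
    while r == 0.0:  # inv_cdf is only defined for 0 < r < 1
        r = rng.random()
    return NormalDist(mu, sigma).inv_cdf(r)

rng = random.Random(7)  # illustrative seed
xs = [inverse_transform_normal(rng, mu=5.0, sigma=2.0) for _ in range(20000)]
xs_mean, xs_sd = statistics.mean(xs), statistics.stdev(xs)
```

Each uniform draw maps to exactly one output value, so nothing is discarded; the cost lies entirely in evaluating the inverse CDF.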
 
  • #6
More often than not, the inverse of a CDF cannot be written analytically in terms of simple functions. In that case we have to use a computer program to solve for x numerically.
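One way such a numerical solution might look is a bisection search on the CDF. The sketch below is an assumption-laden example rather than a recommended implementation: it uses the normal CDF written with `math.erf`, searches only within mu ± 8 sigma, and the names `normal_cdf`, `invert_cdf` and `sample_normal_numeric` are made up for illustration.

```python
import math
import random

def normal_cdf(x, mu, sigma):
    """CDF of the normal distribution, expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def invert_cdf(cdf, r, lo, hi, tol=1e-9):
    """Find x in [lo, hi] with cdf(x) close to r by bisection.

    Assumes cdf is non-decreasing on the interval.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < r:
            lo = mid  # the solution lies to the right of mid
        else:
            hi = mid  # the solution lies at or to the left of mid
    return 0.5 * (lo + hi)

def sample_normal_numeric(rng, mu, sigma, width=8.0):
    """Draw R uniformly and solve CDF(x) = R numerically."""
    r = rng.random()
    return invert_cdf(lambda x: normal_cdf(x, mu, sigma), r,
                      mu - width * sigma, mu + width * sigma)

rng = random.Random(99)  # illustrative seed
draws = [sample_normal_numeric(rng, mu=5.0, sigma=2.0) for _ in range(2000)]
```

The same `invert_cdf` helper works for any non-decreasing CDF that can be evaluated numerically, which is the situation described above.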
 

FAQ: Generating a random sample with a standard deviation

1. How do I generate a random sample with a specific standard deviation?

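To generate a random sample with a specific standard deviation σ and mean μ, draw values z from a standard normal generator (mean 0, standard deviation 1) and compute x = μ + σz. The sample's standard deviation will then be close to σ but not exactly equal to it; to match σ exactly, standardize the sample and rescale it. This can be done using statistical software or programming languages such as Python or R.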
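A short Python sketch of both steps, assuming the standard library's `random.gauss` as the normal generator. The helper names `normal_sample` and `rescale_to`, the seed, and the target values 100 and 15 are hypothetical choices for this example.

```python
import random
import statistics

def normal_sample(n, mu, sigma, seed=None):
    """Shift and scale standard normal draws: x = mu + sigma * z."""
    rng = random.Random(seed)
    return [mu + sigma * rng.gauss(0.0, 1.0) for _ in range(n)]

def rescale_to(sample, mu, sigma):
    """Linearly rescale so the sample mean and sample SD equal mu and sigma
    (up to floating-point rounding). Requires a sample with non-zero spread."""
    m = statistics.mean(sample)
    s = statistics.stdev(sample)
    return [mu + sigma * (x - m) / s for x in sample]

raw = normal_sample(500, mu=100.0, sigma=15.0, seed=1)
exact = rescale_to(raw, mu=100.0, sigma=15.0)

raw_sd = statistics.stdev(raw)      # close to 15, but not exactly 15
exact_sd = statistics.stdev(exact)  # 15 up to rounding error
```

Rescaling forces the sample statistics to the target values, so the result is no longer an independent random draw in the strict sense; whether that matters depends on what the sample is used for.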

2. What is the importance of using a random sample with a standard deviation?

The standard deviation describes how widely a sample's values are spread around their mean. Drawing the sample at random helps to reduce selection bias, and generating it with the population's mean and standard deviation lets the sample reflect the variability of the population rather than only its average.

3. Can a random sample have a negative standard deviation?

No, a random sample cannot have a negative standard deviation. The standard deviation is the square root of the variance, which is an average of squared deviations, so it cannot be negative. It is possible for a sample to have a standard deviation of zero, meaning every value in the sample is identical.

4. How does the sample size affect the standard deviation of a random sample?

A larger sample size does not systematically shrink the standard deviation of a random sample. As the sample grows, the sample standard deviation becomes a more precise estimate of the population standard deviation and settles near that value. What does decrease with sample size is the standard error of the mean, which is the standard deviation divided by the square root of the sample size.
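A small simulation can make the difference between the sample standard deviation and the standard error of the mean concrete. This is a sketch only: the population parameters (mean 50, standard deviation 10), the two sample sizes, and the seed are arbitrary illustrative values.

```python
import math
import random
import statistics

rng = random.Random(2024)  # illustrative seed
mu, sigma = 50.0, 10.0     # assumed population parameters

results = {}
for n in (100, 10000):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    s = statistics.stdev(sample)
    # Store (sample SD, standard error of the mean) for this sample size.
    results[n] = (s, s / math.sqrt(n))

for n, (sd, sem) in results.items():
    print(f"n={n}: sample SD = {sd:.2f}, standard error of mean = {sem:.3f}")
```

In runs like this, the sample SD for both sizes should land near 10, while the standard error drops by roughly a factor of ten when the sample size grows a hundredfold.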

5. Is it possible to generate a random sample with a standard deviation of zero?

Yes, but only if every data point in the sample is equal, so that there is no variability in the data. For a continuous distribution such as the normal with σ > 0, this essentially never happens by chance, and such a sample would not be representative of a population that varies.
