Generated random data: how can I check if it is Gaussian?

raisin_raisin · Oct 15, 2008

Hi, I have used some code that generates random numbers from a Gaussian distribution.
When I plot, say, 100 numbers, I don't get the bell curve, although all the data seem to be around the correct mean value and changing the variance affects the spread.

Am I wrong to expect the bell curve, or am I not plotting it right? I am not entirely sure what to put on the other axis. Thanks.

Pere Callahan · Oct 15, 2008

In order to obtain the bell-shaped curve you expect, you need to plot the relative frequencies of your sampled data. For this you have to define categories, for example [itex]I_k=(k,k+1][/itex]. Then you count how many of the numbers you generated fall in each category and divide the result by the size of your sample; call this number [itex]n_k[/itex]. Then you can plot the points [itex](k+1/2, n_k)[/itex] and you will hopefully see what you expect.

Note that you can choose the categories as you like. You will get better results if you work with a larger sample and smaller (and thus more) categories.
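The counting procedure above can be sketched in a few lines of Python. This is only an illustration, not the original poster's code: the function name `relative_frequencies`, the `width` parameter, the seed, and the sample parameters (mean 0, standard deviation 1, 10,000 draws) are all assumptions made for the example.

```python
import math
import random
from collections import Counter

def relative_frequencies(samples, width=1.0):
    """Map each bin midpoint to the fraction of samples in that bin.

    Bins are the half-open intervals (k*width, (k+1)*width], which
    reduces to I_k = (k, k+1] when width is 1.
    """
    # ceil(x / width) - 1 is the index k of the bin (k*width, (k+1)*width]
    counts = Counter(math.ceil(x / width) - 1 for x in samples)
    n = len(samples)
    return {(k + 0.5) * width: c / n for k, c in sorted(counts.items())}

random.seed(0)  # fixed seed so the run is repeatable (assumed value)
data = [random.gauss(0.0, 1.0) for _ in range(10_000)]  # assumed mean 0, sd 1
freqs = relative_frequencies(data, width=0.5)

# Crude text plot: one '#' per 0.5% of the sample
for midpoint, freq in freqs.items():
    print(f"{midpoint:6.2f} {'#' * round(freq * 200)}")
```

If you want to overlay the Gaussian density on these points, divide each relative frequency by the bin width first; with bins of width 1 the two coincide.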

statdad · Oct 15, 2008

Even if you follow the above suggestions, you will never get a perfect bell curve with your sample, or any sample, whether you graph a histogram or a type of density plot (for example, in R) to represent your data.
Visual evidence from those graphs will be, at best, as you describe: you can see the general outline of the "bell curve", and the center of your data seems to be close to the mean of the simulated distribution. From graphs alone, you will never have "proof" that your data come from a normal distribution rather than from a distribution that is "normal like" in the center but has slightly longer tails.
There are certain tests you could do to examine the hypothesis of normality; however, each one has its own assumptions, strengths, and drawbacks.
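As one example of such a test (the post above does not name a specific one, so this choice is an assumption), here is a minimal sketch of the Jarque-Bera test in plain Python. It compares the sample skewness and kurtosis with the normal values of 0 and 3. Its p-value relies on a large-sample chi-square approximation, which is exactly the kind of drawback mentioned above: with only 100 numbers the p-value is rough.

```python
import math
import random

def jarque_bera(samples):
    """Return (JB statistic, approximate p-value) for a list of numbers."""
    n = len(samples)
    mean = sum(samples) / n
    # Central moments of the sample
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square with 2 degrees of freedom is exp(-x/2);
    # using it for JB is a large-sample approximation.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

random.seed(1)  # assumed seed, only for repeatability
normal_data = [random.gauss(0.0, 1.0) for _ in range(5_000)]
uniform_data = [random.uniform(-1.0, 1.0) for _ in range(5_000)]

jb_normal, p_normal = jarque_bera(normal_data)
jb_uniform, p_uniform = jarque_bera(uniform_data)
print(f"normal sample:  JB = {jb_normal:.2f}, p = {p_normal:.3f}")
print(f"uniform sample: JB = {jb_uniform:.2f}, p = {p_uniform:.3g}")
```

A small p-value is evidence against normality. A large one only means this particular test found no evidence against it, not that the data are proven normal.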

stewartcs · Oct 15, 2008

The typical way to check whether data are approximately normally distributed is to do one of three things:

1. Create a relative frequency histogram and visually check whether its shape is that of a bell curve.

2. Find the ratio of the interquartile range to the standard deviation. If the ratio is approximately 1.34, then the data are approximately normally distributed.

3. Construct a normal probability plot for the data. If the points fall on an approximately straight line, then the data are approximately normally distributed.
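Checks 2 and 3 are easy to compute with Python's standard library. The sketch below is illustrative only: the function names, the seed, the sample parameters, and the plotting positions (i - 0.5)/n are assumptions (other plotting-position conventions exist). Instead of drawing the probability plot, it returns the points and measures how straight they are with a correlation coefficient.

```python
import math
import random
import statistics

def iqr_to_sd_ratio(samples):
    """Interquartile range divided by the sample standard deviation."""
    q1, _, q3 = statistics.quantiles(samples, n=4)
    return (q3 - q1) / statistics.stdev(samples)

def probability_plot_points(samples):
    """Pairs (theoretical normal quantile, ordered sample value)."""
    n = len(samples)
    std_normal = statistics.NormalDist()  # mean 0, sd 1
    # Plotting positions (i - 0.5) / n: one common convention, assumed here
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(samples)))

def straightness(points):
    """Pearson correlation of the plotted points; near 1 means a straight line."""
    xs, ys = zip(*points)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(2)  # assumed seed, only for repeatability
normal_data = [random.gauss(5.0, 2.0) for _ in range(10_000)]
skewed_data = [random.expovariate(1.0) for _ in range(10_000)]

ratio = iqr_to_sd_ratio(normal_data)
r_normal = straightness(probability_plot_points(normal_data))
r_skewed = straightness(probability_plot_points(skewed_data))
print(f"IQR/SD for the normal sample: {ratio:.3f}")
print(f"plot correlation, normal sample: {r_normal:.4f}")
print(f"plot correlation, skewed sample: {r_skewed:.4f}")
```

With a sample as small as 100 numbers, both the ratio and the correlation fluctuate noticeably from run to run, so treat them as rough indicators.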

Keep in mind (as statdad alluded to) that the checks for normality given above are only descriptive in nature. It is possible (but unlikely) that the data are non-normal even when the checks are reasonably satisfied. Thus, one should be careful not to claim that the data are in fact normally distributed. One can only say that it is reasonable to believe that the data are from a normal distribution.

CS
