Generated random data: how can I check if it is Gaussian?

  • Context: Undergrad
  • Thread starter: raisin_raisin
  • Tags: Data, Gaussian, Random

Discussion Overview

The discussion revolves around the generation of random numbers in a Gaussian distribution and the challenges participants face in visualizing this data as a bell curve. It includes considerations of plotting techniques, statistical tests for normality, and the interpretation of results.

Discussion Character

  • Exploratory, Technical explanation, Conceptual clarification, Debate/contested

Main Points Raised

  • One participant expresses confusion about not seeing a bell curve when plotting 100 generated random numbers, despite them being centered around the correct mean.
  • Another participant suggests that to achieve the expected bell-shaped curve, one should plot relative frequencies of the sampled data by defining categories and counting occurrences within those categories.
  • A third participant notes that even with proper plotting methods, a perfect bell curve is unattainable, and visual representations can only suggest a normal-like distribution without definitive proof.
  • A later reply outlines three methods for assessing normality: creating a relative frequency histogram, calculating the ratio of the interquartile range to standard deviation, and constructing a normal probability plot, while cautioning that these methods are descriptive and do not guarantee normality.
  • There is a reminder that it is possible for data to appear normal under these checks while still being non-normal, emphasizing the need for careful interpretation of results.

Areas of Agreement / Disagreement

Participants generally agree on the challenges of confirming normality through visual methods and statistical tests, but there is no consensus on the effectiveness of specific methods or the implications of their results.

Contextual Notes

Participants mention various assumptions and limitations regarding the methods for checking normality, including sample size and the nature of the data distribution.

raisin_raisin
Hi, I have used some code that generates random numbers from a Gaussian distribution.
When I plot, say, 100 numbers, I don't get the bell curve, although all the data seem to be around the correct mean value and changing the variance affects the spread.

Am I wrong to expect the bell curve, or am I not plotting it right? I am not entirely sure what to put on the other axis. Thanks.
 
In order to obtain the bell-shaped curve you expect, you need to plot the relative frequencies of your sampled data. For this you have to define categories, for example I_k = (k, k+1]. Then you count how many of the numbers you generated fall in each category and divide that count by the size of your sample; call this number n_k. Finally, plot the points (k + 1/2, n_k), and you will hopefully see what you expect.

Note that you can choose the categories as you like. You will notice that you get better results if you work with a larger sample and smaller (and thus more) categories.
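The counting procedure described above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function name `relative_frequencies`, the `width` parameter, and the mean of 10 and standard deviation of 2 are assumptions made for the example, not anything from the original poster's code.

```python
import math
import random
from collections import Counter

def relative_frequencies(samples, width=1.0):
    """Return {bin midpoint: fraction of samples falling in that bin}.

    Bin k is the half-open interval (k*width, (k+1)*width], which matches
    the categories I_k = (k, k+1] when width is 1.
    """
    # ceil(x / width) - 1 gives the index k with k*width < x <= (k+1)*width
    counts = Counter(math.ceil(x / width) - 1 for x in samples)
    n = len(samples)
    return {(k + 0.5) * width: c / n for k, c in sorted(counts.items())}

# Hypothetical example data: mean 10, standard deviation 2 (assumed values).
rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.gauss(10.0, 2.0) for _ in range(10_000)]

freqs = relative_frequencies(samples, width=1.0)

# Crude text plot: bin midpoint on one axis, relative frequency on the other.
for midpoint, freq in freqs.items():
    print(f"{midpoint:6.1f} {'#' * round(200 * freq)}")
```

Rerunning this with only 100 samples shows a much more ragged outline, which is the effect the original question describes.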
 
Even if you follow the above suggestions, you will never get a perfect bell curve with your sample, or with any sample, whether you graph a histogram or a type of density plot (from R) to represent your data.
Visual evidence from those graphs will be, at best, as you describe: you can see the general outline of the "bell curve", and the center of your data seems to be close to the mean of the simulated distribution. From graphs alone, you will never have "proof" that your data come from a normal distribution rather than from a distribution that is "normal-like" in the center but has slightly longer tails.
There are certain tests you could do to examine the hypothesis of normality; however, each one has its own assumptions, drawbacks, pros and cons.
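The post does not name a particular test. As one illustration only, here is a minimal sketch of the Jarque-Bera test, which compares the sample skewness and kurtosis with the values 0 and 3 expected for a normal distribution. The p-value below relies on a large-sample chi-square approximation with 2 degrees of freedom, so it is rough for samples as small as 100. The function name and the example data are assumptions.

```python
import math
import random

def jarque_bera(samples):
    """Return (JB statistic, approximate p-value) for the Jarque-Bera test."""
    n = len(samples)
    mean = sum(samples) / n
    # Central moments, each divided by n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # Survival function of a chi-square distribution with 2 degrees of freedom
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

rng = random.Random(1)  # fixed seed so the run is repeatable
gaussian = [rng.gauss(0.0, 1.0) for _ in range(2000)]
uniform = [rng.uniform(-1.0, 1.0) for _ in range(2000)]

print("gaussian:", jarque_bera(gaussian))
print("uniform: ", jarque_bera(uniform))
```

A small p-value is evidence against normality. A large one does not prove normality; it only means this particular test found nothing to object to.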
 
raisin_raisin said:
Hi, I have used some code that generates random numbers from a Gaussian distribution.
When I plot, say, 100 numbers, I don't get the bell curve, although all the data seem to be around the correct mean value and changing the variance affects the spread.

Am I wrong to expect the bell curve, or am I not plotting it right? I am not entirely sure what to put on the other axis. Thanks.

The typical determination of an approximately normal distribution is to do one of three things:

1. Create a relative frequency histogram and visually check that its shape is that of a bell curve.

2. Find the ratio of the interquartile range to the standard deviation. If the ratio is approximately 1.34, then the data are approximately normally distributed.

3. Construct a normal probability plot for the data. If the points fall on an approximately straight line, then the data are approximately normally distributed.
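Checks 2 and 3 can be sketched with the Python standard library alone. For an exact normal distribution the ratio in check 2 is about 1.349. For check 3, this sketch uses the plotting positions (i - 0.5)/n, which is one common convention among several, and it summarises "approximately straight" with a correlation coefficient, which is an added convenience rather than part of the check as stated. All function names and example data are assumptions.

```python
import math
import random
import statistics

def iqr_to_sd_ratio(samples):
    """Check 2: interquartile range divided by the sample standard deviation."""
    q1, _, q3 = statistics.quantiles(samples, n=4)
    return (q3 - q1) / statistics.stdev(samples)

def normal_probability_points(samples):
    """Check 3: pair each sorted sample with the standard normal quantile for its rank."""
    n = len(samples)
    std_normal = statistics.NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(samples)))

def pearson_r(points):
    """Correlation of the plotted points; values near 1 mean a nearly straight line."""
    xs, ys = zip(*points)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(2)  # fixed seed so the run is repeatable
gaussian = [rng.gauss(10.0, 2.0) for _ in range(5000)]
skewed = [rng.expovariate(1.0) for _ in range(5000)]  # deliberately non-normal

for name, data in (("gaussian", gaussian), ("exponential", skewed)):
    ratio = iqr_to_sd_ratio(data)
    r = pearson_r(normal_probability_points(data))
    print(f"{name:12s} IQR/SD = {ratio:.3f}   probability-plot r = {r:.4f}")
```

To draw the actual probability plot, put the first element of each pair on the horizontal axis and the second on the vertical axis.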

Keep in mind (as statdad alluded to) that the checks for normality given above are only descriptive in nature. It is possible (but unlikely) that the data are non-normal even when the checks are reasonably satisfied. Thus, one should be careful not to claim that the data are in fact normally distributed. One can only say that it is reasonable to believe that the data come from a normal distribution.

CS
 