Confusion about normal probability plots

This thread discusses the normal probability plot as a check of whether a data set is normally distributed. The main questions are which values go on each axis, whether the scales are linear, and how to calculate the theoretical values. The approach proposed in the thread is to put the observed values on the horizontal axis and, for each value, plot the probability given by the normal distribution. A z-score is used to find that probability, but the plotted points cannot be (v, z-score of v), since those always lie on a straight line whether or not the data are normal.
  • #1
nomadreid
The expositions I have found of the normal probability plot (aka normal quantile plot) are not too clear to me. In such a plot, observed probabilities are plotted against theoretical probabilities, or sometimes the other way around, and the linearity of the result gives a rough check of whether a set of data is normally distributed.
To make this easy to answer, I will put my doubts into four succinct questions:
First, which one standardly goes on the vertical axis: the observed or theoretical values?
Secondly, does one put probabilities, or the values, on the axes? If probabilities, are the observed probabilities just calculated as per a frequency table?
Thirdly (and this depends on the answer to the previous question), are the scales linear, a concatenation of logarithmic scales, or what?
Fourthly, how does one calculate the theoretical value that goes to a given observed value?
Thanks for answering any (or all) of these.
 
  • #2
http://analyse-it.com/blog/2008/11/normal-quantile-probability-plots
First, which one standardly goes on the vertical axis: the observed or theoretical values?
Doesn't matter. I haven't heard of a standard. Use the one that makes the math easiest.

Secondly, does one put probabilities, or the values, on the axes? If probabilities, are the observed probabilities just calculated as per a frequency table?
All the ones I've seen are built from frequency data.
The idea is to use the theoretical distribution to generate a theoretical data set and compare it with the actual data set. So the observed side is whatever the data give you.

Thirdly (and this depends on the answer to the previous question), are the scales linear, a concatenation of logarithmic scales, or what?
They are usually linear.

Fourthly, how does one calculate the theoretical value that goes to a given observed value?
You use the theory of probabilities. You know how a normal distribution works, right?
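As a rough illustration of the fourth answer, here is a minimal Python sketch of one common way to get the theoretical values: sort the data, give the i-th smallest value the cumulative probability (i - 0.5)/n, and look up the standard normal quantile for that probability. The function name `normal_plot_points`, the sample values, and the (i - 0.5)/n plotting position are assumptions made for this example; other plotting positions are in use.

```python
from statistics import NormalDist


def normal_plot_points(data):
    """Return (theoretical z, observed value) pairs for a normal quantile plot."""
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(sorted(data), start=1):
        p = (i - 0.5) / n          # assumed plotting position; other conventions exist
        z = std_normal.inv_cdf(p)  # theoretical standard normal quantile for p
        points.append((z, value))
    return points


# Illustrative sample, not taken from the thread.
sample = [4.1, 5.3, 4.8, 6.0, 5.1]
for z, v in normal_plot_points(sample):
    print(f"{z:+.3f}  {v}")
```

If the data are roughly normal, these pairs should fall close to a straight line whose slope and intercept estimate the standard deviation and the mean.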
 
  • #3
Thanks, Stephen Bridge. So, following your answers and the link you sent, I would proceed as follows:
put the observed values on one axis (say, the horizontal one).
Then for each observed value v, I find Prob(X < v), and plot that on the other axis.
Right?
 
  • #4
Thanks, Stephen Bridge. So, following your answers and the link you sent, I would proceed as follows:
put the observed values on one axis, say the horizontal one.
Then for each observed value v, I find Prob(X < v) from the normal curve, and plot (v, Prob(X < v)).
Right?
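
One caveat on this proposal: if Prob(X < v) is plotted against v itself, the points simply trace the normal cumulative curve, whatever the data are. A variant that does test the data pairs the normal probability with the observed cumulative proportion, sometimes called a P-P plot. A minimal sketch, in which `pp_plot_points`, the sample values, and the (i - 0.5)/n convention are assumptions for illustration:

```python
from statistics import NormalDist, mean, stdev


def pp_plot_points(data):
    """Return (fitted normal probability, empirical cumulative proportion) pairs."""
    n = len(data)
    # Normal distribution with the sample's own mean and standard deviation.
    fitted = NormalDist(mean(data), stdev(data))
    points = []
    for i, v in enumerate(sorted(data), start=1):
        theoretical_p = fitted.cdf(v)  # Prob(X < v) under the fitted normal
        empirical_p = (i - 0.5) / n    # assumed plotting position for the i-th smallest value
        points.append((theoretical_p, empirical_p))
    return points


# Illustrative sample, not taken from the thread.
for tp, ep in pp_plot_points([4.1, 5.3, 4.8, 6.0, 5.1]):
    print(f"{tp:.3f}  {ep:.3f}")
```

For normal data the pairs should lie near the line y = x; both axes are probabilities, so both scales are linear.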
 
  • #5
You would compute the z-score.
http://www.measuringusability.com/zcalc.htm

... once you have the key words, you can look them up ;)
 
  • #6
Thanks. Of course I would calculate the z-score in order to find the probability, but I don't think you mean that the points are (v, z-score of v), because that will always give you the straight line
y=(1/σ)x - (μ/σ),
regardless of whether your data is normally distributed or not.
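
This can be checked numerically. In the sketch below (the data values and the helper `pearson` are made up for illustration), the z-score of each value is perfectly correlated with the value itself even for strongly skewed data, while rank-based normal quantiles, which depend only on each value's position in the sorted sample, are not.

```python
from statistics import NormalDist, mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Strongly right-skewed, clearly non-normal data (illustrative values).
skewed = sorted([1, 1, 2, 2, 3, 4, 6, 9, 15, 40])
mu, sigma = mean(skewed), stdev(skewed)

# (v, z-score of v): a linear function of v, so always a straight line.
z_of_v = [(v - mu) / sigma for v in skewed]

# Rank-based theoretical quantiles: depend on the position i, not on v.
n = len(skewed)
z_by_rank = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

r_zscore = pearson(skewed, z_of_v)   # 1 up to rounding, for any data
r_rank = pearson(skewed, z_by_rank)  # noticeably below 1 for this sample
print(r_zscore)
print(r_rank)
```

So it is the rank-based quantiles, not the z-scores of the observations, that have to go on the theoretical axis for the plot to say anything about normality.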
 

1. What is a normal probability plot and why is it used?

A normal probability plot is a graph that compares the ordered values of a data set with the values expected from a normal distribution. It is used to judge whether the data are approximately normally distributed, that is, whether they follow the symmetric bell-shaped curve. Many statistical procedures assume normality, so the plot is a quick way to check that assumption before applying them.

2. How do you interpret a normal probability plot?

The closer the points lie to a straight reference line, the more closely the data follow a normal distribution. Some random scatter around the line is expected, especially in small samples. Systematic departures, such as curvature or tails that bend away from the line, suggest that the data are not normally distributed.

3. Can a normal probability plot be used for non-normal data?

Yes. The plot can be drawn for any set of data, whether or not it follows a normal distribution. For non-normal data the points bend away from a straight line, and the shape of the bend points to features such as skewness or heavy tails. The plot can also reveal outliers or other unusual patterns in the data.

4. How is a normal probability plot different from a histogram?

A histogram is a bar graph that shows how many data values fall into each of a set of bins, so its appearance depends on the choice of bins. A normal probability plot uses every individual data value and compares the values with a straight reference line, which makes departures from normality easier to judge than comparing a histogram with a bell curve by eye.

5. Are there any limitations of using a normal probability plot?

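Yes, there are a few limitations. Firstly, the plot assesses the distribution of only one variable at a time. Secondly, it is more informative for larger samples, because with few points the scatter can be considerable even when the data are normal. In addition, reading the plot is a visual judgment rather than a formal test, and it compares the data only with the normal distribution: heavy skew or extreme outliers show up as departures from the line, but the plot does not identify which other distribution would fit.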
