T-test: normal probability plot

In summary, the book "Design of Experiments" by Montgomery states that, before a t-test, one should check with a normal probability plot whether the samples are described by a normal distribution. The y-scale on this plot is arranged so that if the hypothesized distribution adequately describes the data, the plotted points fall approximately along a straight line. If the points deviate significantly from a straight line, the hypothesized model is not appropriate. However, deciding whether the data plot as a straight line is subjective. The vertical axis is usually some representation of the percentiles of the standard normal distribution, and different software packages use different representations that require different scales. Overall, a normal probability plot is used to check whether a sample is consistent with a normal distribution.
  • #1
serbring
I'm studying statistics from the book "Design of Experiments" by Montgomery. In the discussion of the t-test, it states that one must check that the samples are described by a normal distribution by means of a normal probability plot. I have noticed that the y-scale is not familiar to me: it is neither linear nor logarithmic. The book says:

the cumulative frequency scale has been arranged so that if the hypothesized distribution adequately describes the data, the plotted points will fall approximately along a straight line; if the plotted points deviate significantly from a straight line, the hypothesized model is not appropriate. Usually, the determination of whether or not the data plot as a straight line is subjective.


How is the y-scale chosen?



 
  • #2
Hey serbring.

You should picture a graph with your x and y data points where an average line that minimizes the sum of squared residuals is plotted. Some points will be above and others below.

If the sum of squared residuals is too large within some particular confidence measure, then what that means is that the correlation is too low and you can't use a simple linear fit to describe the variation present in the model itself.

When you fit a simple linear regression and test correlation, the correlation measure is actually the linear coefficient where y = cx + b and c is the correlation value. If you don't have a linear model then basically either your c is insignificant or you have to use a more complicated model to capture the variation of the data.

Testing whether a sample fits a distribution is usually done with goodness of fit or specific tests that look at specific distributions in one form or another.

Usually the scale depends on how you scale the variables themselves and without context it is hard to really evaluate.

In a simple linear model, the usual assumption is that if you have two sets of data Y and X (both random variables) where Y lies on the real line, then Y = a + bX + e where e is Normally distributed with 0 mean and some constant variance. This is the simplest regression model and is called a simple linear regression.
 
  • #3
A normal probability plot is not a regression plot (by the way: in linear correlation, the correlation IS NOT, in general, the slope in the equation y = cx + b).

I don't know what the graph you refer to looks like. A common way to create a normal probability plot is to arrange the Yi in order (smallest to largest) and plot them on the horizontal axis. The vertical axis is often taken to be some representation of the percentiles of the standard normal distribution. If the actual percentiles are plotted, ordinary scales can be used; some software packages use a different representation of the percentiles, and those require different scales. As stated, without seeing the plot you reference it is impossible to say specifically what is going on in your book.

If the points lie along a straight line, you have evidence that the "model" (the hypothesized normal distribution for your data) is a good fit (no regression involved). Note that it is very common for these plots to show a strong linear pattern in the center of the graph but have the points stray from the line at the extremes: that simply reflects the fact that data often "appear normal" in the middle of the distribution but deviate from normality in the tails.

A short but readable discussion of normal probability plots can be found here.
http://www.statit.com/support/quality_practice_tips/testingfornearnormality.shtml
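
The construction described in post #3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the plotting position (j - 0.5)/n is one common convention (others exist, and software packages differ), the function name `normal_probability_points` is made up for this example, and the data are hypothetical.

```python
from statistics import NormalDist


def normal_probability_points(sample):
    """Return (ordered value, plotting position, normal score) triples.

    Assumes the plotting position (j - 0.5) / n for the j-th smallest
    observation; this is a convention, not the only possible choice.
    """
    ordered = sorted(sample)          # arrange the Yi from smallest to largest
    n = len(ordered)
    std_normal = NormalDist()         # standard normal: mean 0, sd 1
    points = []
    for j, y in enumerate(ordered, start=1):
        p = (j - 0.5) / n             # cumulative frequency of the j-th value
        z = std_normal.inv_cdf(p)     # matching standard normal percentile
        points.append((y, p, z))
    return points


# Hypothetical measurements, for illustration only.
sample = [9.8, 10.4, 10.1, 9.5, 10.9, 10.0, 10.6, 9.9, 10.2, 10.3]
for y, p, z in normal_probability_points(sample):
    print(f"value={y:6.2f}  cum. freq.={p:5.3f}  normal score={z:+6.3f}")
```

Plotting the ordered values against the normal scores on ordinary linear axes gives the same picture as plotting them against the cumulative frequencies on special probability paper. If the sample is roughly normal, the points should fall close to a straight line.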
 

Related to T-test: normal probability plot

What is a T-test?

A t-test is a statistical method used to determine whether a sample mean differs significantly from a reference value, or whether the means of two groups differ significantly from each other.

What is a normal probability plot?

A normal probability plot is a graphical representation of the data that helps to determine if the data follows a normal distribution.

How is a T-test used with a normal probability plot?

A normal probability plot is used before a t-test to check whether the data are approximately normally distributed, which is an assumption underlying the t-test. The t-test itself does not test for normality.

What does a normal probability plot look like?

When the data are normally distributed, the points on a normal probability plot fall approximately along a straight line.
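
The unfamiliar vertical scale asked about in the thread can be understood the same way. On normal probability paper, a cumulative percentage p is typically drawn at a height proportional to the standard normal quantile of p, which is why the scale is neither linear nor logarithmic. The sketch below assumes that convention; the function name `probability_axis_positions` and the chosen tick labels are illustrative only and are not taken from the book.

```python
from statistics import NormalDist


def probability_axis_positions(percent_labels):
    """Map each percentage label to its position on a normal probability axis.

    Assumes the label p% is drawn at the standard normal quantile of p/100.
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    return [(pct, std_normal.inv_cdf(pct / 100.0)) for pct in percent_labels]


# Illustrative tick labels; real probability paper may use a different set.
labels = [1, 5, 20, 50, 80, 95, 99]
for pct, position in probability_axis_positions(labels):
    print(f"{pct:3d}% is drawn at position {position:+.3f}")
```

The positions are symmetric about the 50% mark and spread out toward 1% and 99%, so equal steps in percentage take up more room in the tails than in the middle.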

What does it mean if a normal probability plot is not a straight line?

If the points on a normal probability plot do not follow a straight line, the data may not be normally distributed. This could affect the results of a t-test, and alternative statistical methods may be needed.
