P value from z statistic

Hey All,

Can someone please explain to me why the p value is obtained by taking the integral under the z curve from the z statistic you calculate to the end of the tail ?

Thanks

What do you mean by "why"? Do you already know the methodology? By the way, I don't think you mean "Z curve." The Z statistic measures the distance between the mean and some point to the right or left of it on the x axis under the PDF (the bell curve). It is expressed in standard deviation (SD) units of the Standard Normal Distribution (SND).

Do you know what a probability density or a probability density function (PDF) is?

Sorry for the questions. I just don't want to try to explain things you already know.

Hey SW VandeCarr,

yep I know what a pdf is - and I am comfortable with the idea that if you integrate between bounds of a pdf that it gives you the probability of your random variable being between those bounds.

As I understood it you get this thing called a z statistic from that formula
$$z = \frac{\bar{X}-\mu}{\sigma}$$

and then you integrate from this value of z you work out to the tail (I guess here I am assuming a one sided test).

I thought that what you were doing with the above z formula, was scaling your probability distribution to a gaussian function with mean mu and variance sigma, which we have tables for the integration bounds.

I guess thinking about your questions about what I know, I don't really understand the connection between that z curve (if it is indeed called that) and the z statistic.

Then I also don't understand why the p value, which as I understand it is the probability of getting the result you got IF the null hypothesis was true, is given by getting the area under a standard normal curve from the z statistic value to the tail.

Thanks

The correct language is "rejecting or failing to reject" the null hypothesis. You probably know that you select some value $$\alpha$$ such that the point where the cumulative probability reaches $$1-\alpha$$ is the upper limit of integration for the one tailed test. Any value of the test statistic that falls below this limit is the basis for "failing to reject" the null hypothesis. Typically $$\alpha = 0.05$$ or $$0.025$$. With alpha=0.025 the Z score* for this limit is 1.96 SD, which corresponds to a cumulative probability of 0.975 under the PDF; with alpha=0.05 it is 1.645 SD, which corresponds to 0.95. Any value beyond the limit (Z>1.645 with alpha=0.05) is the basis for "rejecting the null hypothesis".

You probably know all this. What is key is that this is a model and it's reasonably useful for a lot of applications. The model can be adapted through transformations (semi log for instance) to skewed distributions. The Central Limit Theorem supports the concept that all kinds of populations which are amenable to good random sampling techniques will yield approximately normally distributed sample means. In any case there are plenty of other distribution models for special situations.

In short, the "p" value is the probability, under the null hypothesis and given the model, of data at least as extreme as what was observed; $$\alpha$$ is the cutoff you compare it against. This is often called the "frequentist" interpretation of statistical inference.

*The Z score is not measured on a curve. It is measured on the x axis in SD units from the mean to a one sided limit of integration. The probability is the area under the PDF between the limits of integration.

EDIT: In standard functional notation, the Z score is x in SD units and F(x) is the cumulative probability up to the limit of integration (x). The one tailed p value is the difference between F(x) and one; at the critical value that difference equals alpha.
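
To make the "area under the PDF" idea concrete, here is a minimal Python sketch (my own illustration, not something from the posts above). It integrates the standard normal PDF numerically from a hypothetical z value out to the tail and compares the result with 1 - F(z). The z value, the cutoff of 10 SD standing in for infinity, and the function names are all assumptions.

```python
from statistics import NormalDist


def upper_tail_by_integration(z, upper=10.0, steps=20_000):
    """Trapezoid-rule integral of the standard normal PDF from z to `upper`.

    `upper` stands in for +infinity; the area beyond 10 SD is negligible.
    """
    std_normal = NormalDist()  # mean 0, SD 1
    width = (upper - z) / steps
    total = 0.5 * (std_normal.pdf(z) + std_normal.pdf(upper))
    for i in range(1, steps):
        total += std_normal.pdf(z + i * width)
    return total * width


def upper_tail_by_cdf(z):
    """The same area written as 1 - F(z), with F the standard normal CDF."""
    return 1.0 - NormalDist().cdf(z)


z_observed = 1.2  # hypothetical z statistic
area_integral = upper_tail_by_integration(z_observed)
area_cdf = upper_tail_by_cdf(z_observed)
print(f"integral of PDF from z to the tail: {area_integral:.6f}")
print(f"1 - F(z):                           {area_cdf:.6f}")
```

Both numbers should agree to several decimal places, which is all the printed tables are doing for you.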

The standard Normal curve is symmetric and the area underneath any density curve is 1 (by definition). Therefore, it would be redundant to give areas under the curve going from a calculated z-score to negative infinity and also calculate the integral between your z-score and positive infinity.

I believe you also asked why we look at only the standard Normal curve. Statistically, the Normal distribution is a location-scale family. For any Normally distributed random variable, if you subtract off a constant, like $$\mu$$ (the population mean), and divide by a constant, $$\sigma$$ (the known population standard deviation), you will still have a Normally distributed random variable! Since the integral of the Normal density has no closed form, a table of areas under the curve to the left or right of z are given for the standard Normal distribution (N(0,1)) because you can transform any Normal random variable into a standard Normal random variable.

Lastly, are you sure that $$\sigma$$ is known? This is a very strong assumption. If you are calculating p-values for a hypothesis test of a single mean, or calculating a confidence interval for a population mean, $$\mu$$, and you have to estimate $$\sigma$$ from a small sample, then I recommend using a t-distribution. I hope this helps!
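
The location-scale point can be checked numerically. This is only a sketch with made-up numbers (mean 100, SD 15, cutoff 120 are assumptions for illustration): the tail probability computed on the original Normal scale matches the one computed on the standard Normal scale after subtracting the mean and dividing by the SD.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0  # hypothetical population mean and SD
x = 120.0                # hypothetical cutoff on the original scale

# Tail probability computed directly on N(mu, sigma).
p_original = 1.0 - NormalDist(mu, sigma).cdf(x)

# Same probability after standardizing: z = (x - mu) / sigma.
z = (x - mu) / sigma
p_standard = 1.0 - NormalDist(0.0, 1.0).cdf(z)

print(f"z = {z:.4f}")
print(f"P(X > x) on N(mu, sigma): {p_original:.6f}")
print(f"P(Z > z) on N(0, 1):      {p_standard:.6f}")
```

This is why one table for N(0,1) is enough for every Normal random variable.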

Formally, the Central Limit Theorem states that for any population with finite variance $$\sigma^2$$ and mean $$\mu$$, the mean of a sufficiently large simple random sample of size n will approximately follow a Normal distribution with mean $$\mu$$ and standard error $$\sigma/\sqrt{n}$$. As a common rule of thumb, if your sample size is larger than 30, you can approximate the sampling distribution of your sample mean with a Normal density.
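
A quick simulation can illustrate the theorem. The sketch below is only an illustration under assumed settings: it draws repeated samples from an exponential population (mean 1, SD 1, strongly skewed), and checks that the sample means centre on the population mean with a spread close to $$\sigma/\sqrt{n}$$. The sample size, number of repetitions, and seed are arbitrary choices.

```python
import random
import statistics

random.seed(0)      # fixed seed so the run is repeatable
n = 50              # size of each sample (assumed)
num_samples = 5000  # number of repeated samples (assumed)

# Exponential(1) population: mean 1, SD 1, clearly not Normal.
pop_mean, pop_sd = 1.0, 1.0

sample_means = []
for _ in range(num_samples):
    sample = [random.expovariate(1.0) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

observed_mean = statistics.fmean(sample_means)
observed_se = statistics.stdev(sample_means)
predicted_se = pop_sd / n ** 0.5

print(f"mean of sample means: {observed_mean:.4f} (population mean {pop_mean})")
print(f"SD of sample means:   {observed_se:.4f}")
print(f"sigma / sqrt(n):      {predicted_se:.4f}")
```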

Now, this discussion about using the CLT seems only relevant if the population standard deviation is known, and you are conducting either a hypothesis test or constructing a confidence interval for a population mean. For hypothesis tests, we assume that the true mean of the population is the value claimed under the null hypothesis. This is done so we may calculate a p-value. Small p-values indicate that the observed sample mean is far from this conjectured value (regardless of a two-sided or one-sided alternative hypothesis) and provide evidence that the true value of $$\mu$$ is not what we claim it to be under H0. Frequentist statisticians believe that parameters are fixed, unknown quantities. This is why this is a "Frequentist" interpretation of a p-value. Bayesians treat parameter values not as fixed but as random quantities that follow some distribution. A further discussion of the Bayesian and Frequentist approaches is beyond the scope of a physics forum.

thanks guys - the statistics picture is starting to become clearer. just at a very core conceptual level can someone explain this to me:

I have a sample (let's say a collection of IQ tests in a suburb) and I want to test whether the mean IQ of these kids is higher than average. So
H_0: the mean IQ of these kids is the same as everywhere else
H_1: the mean IQ of these kids is different from everywhere else.

And I set the significance level as 0.05

So I get my sample mean $$\bar{X}$$ and then since I know the population mean and variance I can compute a z statistic using:
$$z = \frac{\bar{X} - \mu}{\sigma}$$

then I find the area under the standard normal function (which I have effectively transformed to) using the tables that exist, from the positive z value to the positive tail of the pdf and then do the same with the negative of the z value to the negative tail (obviously this is a two tailed test). this gives me the p value.

Now I know the pvalue gives you the probability that the mean I found from my sample would occur if H_0 is true. I still don't really see why this is given by the area under the normal pdf described above.

I know this is really elementary but I just can't seem to understand it conceptually for the life of me

OK. I'll try. Let's use the one tailed test for illustration. Alpha is the significance level you choose in advance: the probability, under the null hypothesis and given the model, of a result extreme enough to reject. For a one tailed test, the limit of integration for the SND PDF is the point x where the integral f(x) from the far left tail up to x equals $$1-\alpha$$. If alpha is 0.05 then f(x) at this limit is 0.95.

1) If your x corresponds to a value of f(x) less than 0.95, you fail to reject the null hypothesis.

2) If your x corresponds to a value of f(x) more than 0.95, you reject the null hypothesis.

Another way to think of it is that if the value of 1-f(x) is 0.05 or less, you reject the null hypothesis. That is, a difference between the observed and expected data at least this large will occur "randomly" under the null hypothesis and the normal assumption only 5% or less of the time. Therefore, in a controlled randomized trial, you reject the null hypothesis that an intervention had no effect with alpha=0.05. (Look up alpha error.)
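
The two ways of stating the rule (compare x with the limit of integration, or compare 1-f(x) with alpha) always give the same decision. Here is a small Python sketch of that equivalence for an upper tailed test. The function name and the z values are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def one_tailed_decision(z, alpha=0.05):
    """Upper-tailed z test decision, computed two equivalent ways."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1.0 - alpha)  # x where the integral hits 1 - alpha
    p_value = 1.0 - std_normal.cdf(z)           # area from z to the upper tail
    reject_by_z = z > critical
    reject_by_p = p_value < alpha
    return critical, p_value, reject_by_z, reject_by_p


results = {}
for z in (0.5, 1.5, 1.7, 2.5):  # hypothetical test statistics
    critical, p_value, by_z, by_p = one_tailed_decision(z)
    results[z] = (by_z, by_p)
    print(f"z={z:.1f}  p={p_value:.4f}  reject via z: {by_z}  reject via p: {by_p}")
```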

The variable x is expressed in SD units from the mean. For f(x)=0.95 x=1.96

For a one-sided test at $$\alpha = 0.05$$ with an alternative hypothesis that $$\mu$$ is greater than some hypothesized value, the critical value, x in this case, is 1.645.

Right. I'm used to always doing two sided tests.
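
For anyone who wants to check the two critical values, a short sketch using Python's standard library (alpha = 0.05 is assumed): a one sided test puts all of alpha in one tail, while a two sided test splits it across both.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()

# One sided: all of alpha sits in the upper tail.
one_sided = std_normal.inv_cdf(1.0 - alpha)

# Two sided: alpha / 2 in each tail.
two_sided = std_normal.inv_cdf(1.0 - alpha / 2.0)

print(f"one sided critical value: {one_sided:.3f}")
print(f"two sided critical value: {two_sided:.3f}")
```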

To truly understand why you use a standard Normal distribution to calculate p-values relating to $$\bar{X}$$, you first must understand the concept of a sampling distribution. The sampling distribution of $$\bar{X}$$ is the distribution of values taken by the sample mean in all possible samples of the same size from the same population. For a sample of 100 students from Chicago, the sampling distribution of sample mean IQ scores would be the values of sample mean IQ scores taken in all unique samples of size 100 students from the population of students in Chicago. The number of unique samples of size 100 is really large, and you would need access to all student IQ scores in the Chicago student population. For these reasons, we approximate this distribution with a Normal curve because mathematical theory says we can for a large sample size and known variance (CLT).

Now, p-values relate to ranges of values in a sampling distribution. To calculate a p-value, a sampling distribution must first be defined. For $$\bar{X}$$, this is accomplished by assuming that the true mean of the Normal sampling distribution is the null hypothesized value (note that the standard deviation is known, so a sampling distribution is fully defined). Maybe you would hypothesize that Chicago's mean IQ score is 100. Next, you collect data to test this assumption. In your case, you might observe a sample mean IQ score of 90 for 100 Chicago students. The test statistic, correctly stated below, is then calculated. A z value that is far from 0 corresponds to a sample mean IQ score far from the hypothesized population mean of 100. Ranges of IQ scores are then considered in order to calculate a p-value/integral/area under the curve. With a two-sided alternative hypothesis, if you observed 90, the p-value is the probability of observing a sample mean IQ score less than 90 or a sample mean IQ score more than 110. Statisticians justify looking at this range with the following logic: You observe a sample mean. What is the chance of seeing an even more extreme sample mean if your sampling distribution truly is centered at the hypothesized value?

Here is the formal interpretation of a p-value in your IQ example context:

For a true mean IQ score of 100, the probability of observing a Chicago sample mean score as extreme or more extreme than 90 is 0.035.

All this said, it should now be clear why the area in a standard Normal curve is of interest. The 90 and 110 IQ score sample means are standardized into z-scores because you need them to calculate proportions of "extreme" sample means; this is equivalent to looking at extreme tail areas in your standard Normal distribution.
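
The "proportion of extreme sample means" idea can be simulated directly. The sketch below is only an illustration with assumed numbers (a known SD of 15 and a sample size of 10, which are not values given in this thread): it draws many samples from the null population, counts how often the sample mean lands at least as far from 100 as the observed 90, and compares that proportion with the two-sided tail area of the standard Normal curve.

```python
import random
import statistics
from statistics import NormalDist

random.seed(1)                   # fixed seed so the run is repeatable
mu0, sigma, n = 100.0, 15.0, 10  # null mean, assumed known SD, assumed sample size
observed_mean = 90.0
num_samples = 20000              # number of simulated samples under H0

distance = abs(observed_mean - mu0)
extreme = 0
for _ in range(num_samples):
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    # "As extreme or more extreme": at or below 90, or at or above 110.
    if abs(statistics.fmean(sample) - mu0) >= distance:
        extreme += 1
simulated_p = extreme / num_samples

# Same quantity from the standard Normal curve.
z = distance / (sigma / n ** 0.5)
tail_area_p = 2.0 * (1.0 - NormalDist().cdf(z))

print(f"proportion of extreme sample means: {simulated_p:.4f}")
print(f"two-sided standard Normal area:     {tail_area_p:.4f}")
```

The two numbers should be close, and they get closer as the number of simulated samples grows.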

Note: Thrillhouse86, your test statistic regarding the sample mean from a population with known variance is wrong. It should be

$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
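
As a worked sketch of this formula in Python: the numbers below (known SD of 15, sample size of 10, sample mean of 90 against a hypothesized mean of 100) are assumptions for illustration, and the function names are my own. The last line shows what happens if the $$\sqrt{n}$$ is left out.

```python
from statistics import NormalDist


def z_statistic(sample_mean, mu0, sigma, n):
    """z for a sample mean when the population SD is known."""
    standard_error = sigma / n ** 0.5
    return (sample_mean - mu0) / standard_error


def two_sided_p(z):
    """Area in both tails beyond |z| under the standard Normal curve."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical inputs: sample mean 90, null mean 100, sigma 15, n 10.
z = z_statistic(90.0, 100.0, 15.0, 10)
p = two_sided_p(z)

# Dividing by sigma alone ignores the sample size and understates the evidence.
z_without_sqrt_n = (90.0 - 100.0) / 15.0

print(f"z with standard error: {z:.3f}, two-sided p = {p:.4f}")
print(f"z dividing by sigma:   {z_without_sqrt_n:.3f}, "
      f"two-sided p = {two_sided_p(z_without_sqrt_n):.4f}")
```

With these assumed inputs the correct z is about -2.11, while dividing by $$\sigma$$ alone gives about -0.67 and a much larger p-value.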

Thank you both d3t3rt & SW VandeCarr.
With my incorrect z score: am I incorrect because the quantity you divide the $$\bar{X} - \mu$$ by is not the standard deviation of the population, but the standard error of my sample mean ?

Yes, you need to divide by the standard error.