Understanding the Uniform Distribution of P-Values in Hypothesis Testing

The discussion clarifies that when a p-value is used as a test statistic for a simple null hypothesis, and the underlying test statistic has a continuous distribution, the p-value is uniformly distributed between 0 and 1 if the null hypothesis is true. In other words, under a true null hypothesis the chance of getting a p-value at or below any threshold a between 0 and 1 is exactly a. The p-value is the probability, computed under the null hypothesis, of observing data at least as extreme as the actual data, and it guides the decision on whether to reject the null. The explanation rests on the relationship between the p-value and a generic test statistic with a continuous distribution.
chowpy
I read the following statement on Wikipedia, but I don't know how to derive it.

"when a p-value is used as a test statistic for a simple null hypothesis, and the distribution of the test statistic is continuous, then the test statistic (p-value) is uniformly distributed between 0 and 1 if the null hypothesis is true."

Can anyone explain it in more detail?
Thanks~~
 
Hi chowpy, welcome to PF!

Imagine that you have a data set A of one or more experimental observations. You also have a null hypothesis in mind (a possible distribution of results that data set A may or may not have come from). Say you're comparing the means of these two distributions (but it could be any parameter that you're comparing).

The p-value is the probability, computed assuming the null hypothesis is true, of obtaining a data set at least as extreme* as your actual data set A. (If the p-value is incredibly low, we might decide that A came from another distribution, and therefore reject the null hypothesis; that's what hypothesis testing is all about.)

If the null hypothesis is actually true, then we'd expect to get a p-value anywhere from 0% to 100%, distributed evenly. In other words, a p-value of 0.20 or less means that A is among the 20% most extreme data sets the null hypothesis can produce, and under the null hypothesis that happens 20% of the time. The same holds for any other threshold, which is exactly what "distributed evenly" means. *By more extreme I mean a data set with a mean farther away from the mean of the null hypothesis, in the example we're using.
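If you'd like to see this numerically, here is a minimal simulation sketch. Everything specific in it is my own assumption for illustration, not something from the Wikipedia statement: the null hypothesis is "the data are standard normal", each data set has 10 observations, and the test is a two-sided z-test on the mean with known standard deviation 1. The function names are made up for this example.

```python
import random
from statistics import NormalDist, fmean


def simulate_p_values(n_trials=20000, n_obs=10, seed=0):
    """Draw data sets from the null distribution N(0, 1) and return their p-values."""
    rng = random.Random(seed)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    p_values = []
    for _ in range(n_trials):
        # The null hypothesis is true by construction: the data really are N(0, 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        # z statistic for the sample mean (null mean 0, known sd 1).
        z = fmean(sample) * n_obs ** 0.5
        # Two-sided p-value: chance of a |z| at least this large under the null.
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    return p_values


def fraction_at_most(p_values, a):
    """Fraction of simulated p-values that are <= a."""
    return sum(p <= a for p in p_values) / len(p_values)


p_values = simulate_p_values()
for a in (0.05, 0.20, 0.50, 0.90):
    frac = fraction_at_most(p_values, a)
    print(f"a = {a:.2f}: fraction of p-values <= a is {frac:.3f}")
```

If the p-values are uniform, each printed fraction should come out close to its threshold a, up to simulation noise.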

Does this answer your question?
 
Remember what it means for a random variable X to be uniformly distributed on (0,1):

$$ \Pr(X \le a) = a \quad \text{for any } a \in (0,1) $$

Let P denote the p-value as a random variable, and let T stand for a generic test statistic that has a continuous distribution under the null hypothesis. To fix ideas, suppose small values of T are the ones that count as evidence against the null, so that the p-value for an observed value t is ##\Pr(T \le t)##.

Pick an a in (0,1). Since T has a continuous distribution, there is a number ##t_a## that satisfies

$$ \Pr(T \le t_a) = a $$

Now, the events ##P \le a## and ##T \le t_a## are equivalent, so that

$$ \Pr(P \le a) = \Pr(T \le t_a) = a $$

Comparing this to the meaning of "uniformly distributed on (0,1)" shows the result.
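
The argument above reads most directly when small values of T are the extreme ones. As a sketch of how the same steps go for an upper-tail test, assume (this notation is mine, introduced only for illustration) that F is the distribution function of T under the null hypothesis, that the p-value is ##P = 1 - F(T)##, and that ##t_{1-a}## is a number with ##F(t_{1-a}) = 1 - a##. Then, for any a in (0,1):

$$
\Pr(P \le a) = \Pr\big(1 - F(T) \le a\big) = \Pr\big(F(T) \ge 1 - a\big) = \Pr(T \ge t_{1-a}) = 1 - \Pr(T < t_{1-a}) = 1 - (1 - a) = a
$$

The step ##\Pr(T < t_{1-a}) = 1 - a## uses continuity of the distribution again, since a continuous T puts no probability on the single point ##t_{1-a}##.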
 
Thanks Mapes and statdad~
I understand it now~
 
