Proper understanding of p-value

fog37
Hello,

I am still slightly confused about the meaning of the p-value. Here is my current understanding:
  • There is a population. We don't know its parameters, but we want to estimate them.
    We collect a possibly large sample of size ##n## from it.
    We formulate the hypotheses ##H_0## and ##H_1##, set a significance level ##\alpha##, and perform a hypothesis test to either fail to reject ##H_0## or reject ##H_0## in favor of ##H_1##.
  • The p-value is the probability, ASSUMING H0 is correct, of the calculated sample statistic.
  • A low p-value leads to rejecting ##H_0##: it means that, under the assumption that ##H_0## is correct, the calculated sample statistic would have been far too rare to actually happen. But it happened. Sampling error alone would be unlikely to generate such a low-probability statistic value, so something deeper must be going on. This leads us to doubt ##H_0##.
  • The p-value is also called "the probability of chance" because it is the value we would expect if only chance were at work, as in random sampling. The fact that the sample statistic occurred despite its low probability must be attributed to something other than chance (see the simulation sketch below).
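To make the "probability under ##H_0##" idea concrete to myself, here is a minimal simulation sketch in Python (all the numbers are invented, and it assumes a normal population with known standard deviation):
```python
import numpy as np

rng = np.random.default_rng(0)

# Invented setup: H0 says the population mean is mu0 = 10,
# with known standard deviation sigma = 2; our sample has size n = 50.
mu0, sigma, n = 10.0, 2.0, 50
observed_mean = 10.8  # the mean actually computed from our one sample

# Simulate the sampling distribution of the mean under H0:
# many samples of size n, each reduced to its mean.
sim_means = rng.normal(mu0, sigma, size=(100_000, n)).mean(axis=1)

# Two-sided p-value: how often does chance alone produce a sample mean
# at least as far from mu0 as the one we observed?
p_value = np.mean(np.abs(sim_means - mu0) >= abs(observed_mean - mu0))
print(p_value)  # a small value means such a sample mean is rare under H0
```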
Is this correct?

The procedure above is based on analyzing a single, large sample. What if we repeated it with another simple random sample and this time the p-value was larger than the set threshold ##\alpha##? That would mean we would fail to reject ##H_0##... So how many samples do we need to analyze to convince ourselves that ##H_0## must be rejected or not?
It seems reasonable to explore multiple random samples and determine their p-values before drawing conclusions about ##H_0##.

THANK YOU!
 
fog37 said:
I am still slightly confused about the meaning of the p-value.
Yes, it is one of the most frequently misused and misunderstood statistics. That said, your understanding seems correct:

fog37 said:
The p-value is the probability, ASSUMING H0 is correct, of the calculated sample statistic.
The usual mistake, which you are not making, is to consider the p-value as the probability that ##H_0## is correct. Or, even worse, to consider it as being related to the probability of ##H_1## in any way. In simple terms, it is the probability of the data, given the null hypothesis.

fog37 said:
What if we repeated it with another simple random sample and this time the p-value was larger than the set threshold ##\alpha##?
In this very common case you would need to perform a correction for multiple comparisons. I like the Bonferroni-Holm correction.
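As a minimal sketch of how that correction works (this uses statsmodels, and the p-values are hypothetical):
```python
from statsmodels.stats.multitest import multipletests

# Hypothetical p-values from repeating the experiment three times.
pvals = [0.04, 0.30, 0.01]

# Holm's step-down (Bonferroni-Holm) procedure at family-wise alpha = 0.05.
reject, p_adjusted, _, _ = multipletests(pvals, alpha=0.05, method="holm")
print(reject)      # which nulls are still rejected after correction
print(p_adjusted)  # the Holm-adjusted p-values
```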

However, even with a multiple-comparisons correction, this is one of the big problems with frequentist statistics in science. When you perform that next experiment, you actually need to adjust the p-value of the original experiment. In fact, ideally, when reporting the original experiment you should have anticipated the follow-up experiment and adjusted the original p-value accordingly. Simply by intending to do follow-up experiments, your p-value becomes weaker, and in the limit of a conscientious experimenter who intends to continue studying a topic indefinitely, any data can be rendered statistically non-significant.

fog37 said:
So how many samples do we need to analyze to convince ourselves that H0 must be rejected or not?
That is actually less critical than being explicit about your stopping criterion and using that criterion when calculating your p-values. Once you have defined your stopping criterion, a power analysis for that experiment can guide you on the number of samples needed (see the sketch below).
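For the power-analysis step, something along these lines (a sketch using statsmodels; the effect size, ##\alpha##, and power target are purely illustrative assumptions, and it uses a one-sample t-test for concreteness):
```python
from statsmodels.stats.power import TTestPower

# Illustrative assumptions: a standardized effect size of 0.5,
# a test at alpha = 0.05, and an 80% chance of detecting the effect.
analysis = TTestPower()
n_required = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(round(n_required))  # approximate sample size for a one-sample t-test
```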
 
Thanks Dale!

Glad I am on the right track. So, in general, unless we want to get into more sophisticated analysis and corrections, such as the Bonferroni-Holm correction that you bring up, junior statisticians stick with analyzing a single, possibly large sample from the population...

Also, given the starting assumption that ##H_0## is correct, we place the value claimed by ##H_0## at the center of a probability distribution, and the p-value is the probability obtained from that distribution at the corresponding z value.

Regarding that distribution, is it the theoretical Gaussian sampling distribution of the statistic under study (say, the sample mean)? We are essentially envisioning the sampling distribution of the mean, centered at the value proposed by ##H_0##.
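In code, the computation I am picturing looks something like this (a minimal two-sided z-test sketch, assuming a known population standard deviation; the numbers are invented):
```python
import math
from scipy.stats import norm

# Invented numbers: H0 claims mu0 = 10; a sample of n = 50 has mean 10.8,
# and we assume a known population standard deviation sigma = 2.
mu0, sigma, n, xbar = 10.0, 2.0, 50, 10.8

# Standardize the observed mean against the sampling distribution of the
# mean under H0 (centered at mu0, with standard error sigma / sqrt(n)).
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Two-sided p-value from the standard normal distribution.
p_value = 2 * norm.sf(abs(z))
print(z, p_value)
```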

Is that correct?
 
fog37 said:
unless we want to get into more sophisticated analysis and corrections, such as the Bonferroni-Holm correction that you bring up, junior statisticians stick with analyzing a single, possibly large sample from the population.
Yes

fog37 said:
Also, given the starting assumption that ##H_0## is correct, we place the value claimed by ##H_0## at the center of a probability distribution, and the p-value is the probability obtained from that distribution at the corresponding z value.

Regarding that distribution, is it the theoretical Gaussian sampling distribution of the statistic under study (say, the sample mean)? We are essentially envisioning the sampling distribution of the mean, centered at the value proposed by ##H_0##.

Is that correct?
Not necessarily. Your ##H_0## need not be related to the Gaussian distribution at all. That is a common approach, but not mandatory.
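For example, here is a sketch of an exact binomial test, where the null distribution is a binomial rather than a Gaussian (the counts are hypothetical; binomtest is available in scipy 1.7+):
```python
from scipy.stats import binomtest

# Hypothetical data: 62 successes in 100 trials.
# H0 says the success probability is 0.5, so the null
# distribution of the count is Binomial(100, 0.5).
result = binomtest(k=62, n=100, p=0.5, alternative="two-sided")
print(result.pvalue)  # an exact p-value, no Gaussian approximation involved
```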
 
fog37 said:
  • The p-value is the probability, ASSUMING H0 is correct, of the calculated sample statistic.

Assuming ##H_0##, the probability that the sample statistic takes on exactly the value we observe is typically zero! That is, if we are talking about sample statistics that can take on a continuous range of values.

The p-value is, in general, the probability that the statistic lies in some interval. For example, the interval might be ##[0,\infty)##.

Justifying the use of a particular interval is a sophisticated intellectual exercise. For example, it's easy to explain the customary scenarios for using "one-tailed" vs "two-tailed" tests and the procedures are intuitively pleasing, but how do we prove that the methods are correct in any sense? The key to that is to define "correct" rigorously. This has to do with defining the "power" of statistical tests.

After all, since the p-value is, in general, the probability of an event that includes outcomes where the observed statistic did not have the value we observe, how do we justify including the probability of things that did not happen in making a decision?
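To illustrate the point about intervals, here is a sketch comparing one-tailed and two-tailed p-values for the same observed statistic (a standard normal statistic with an invented observed value):
```python
from scipy.stats import norm

z_obs = 1.8  # invented observed value of a standardized test statistic

# One-tailed: probability that the statistic lies in [z_obs, infinity).
p_one_tailed = norm.sf(z_obs)

# Two-tailed: probability of (-infinity, -|z_obs|] union [|z_obs|, infinity).
p_two_tailed = 2 * norm.sf(abs(z_obs))

print(p_one_tailed, p_two_tailed)  # same data, two different intervals
```
The same observed value yields two different p-values, depending on which interval of outcomes we decide counts as "at least as extreme."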
 
Since we are discussing statistics, my understanding is that statistical inference can be used for three purposes:

a) find a yes/no answer about a parameter of an unknown population (that is hypothesis testing)
b) estimate the parameter(s) of an unknown population with a certain level of confidence (that is estimation)
c) predict the future (that is forecasting)

I am not sure about c)... How is inferential statistics used to predict the future? Are we assuming that the population's parameters can vary in the future and the idea is to predict them? Are we talking about statistics in the context of time series and regression models, as models to predict data that we currently don't have?

Also, I have been reading about probability vs. statistics, and some simplistically define them as inverses of each other... Every intro statistics book has a probability section. Is it because statistics employs the tools of probability to do statistical analysis? I guess...

Thanks
 