Understanding Chi Squared p-values

Discussion Overview

The discussion revolves around the interpretation of chi-squared p-values within the context of hypothesis testing, particularly in relation to a biological example involving handedness. Participants explore the meaning of p-values, the implications of rejecting or failing to reject the null hypothesis, and the differences between frequentist and Bayesian approaches to statistics.

Discussion Character

  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • One participant describes their understanding of p-values as the probability of no favor in handedness, questioning if this interpretation is correct.
  • Another participant clarifies that p-values indicate the probability of observing the data given that the null hypothesis is true, not the probability of the null hypothesis itself being true.
  • There is a discussion about the common misconception that a p-value of 0.05 implies a 95% confidence in rejecting an incorrect null hypothesis, with several participants asserting this interpretation is incorrect.
  • Some participants note that frequentist statistics cannot provide the probability of the null hypothesis being true or false, only the likelihood of the data under the null hypothesis.
  • Concerns are raised about the threshold of 0.05 being perceived as low for accepting hypotheses, with questions about how certainty is quantified in relation to p-values.
  • One participant mentions their experience with statistics and notes they have never used a left-tailed chi-squared test, prompting questions about the purpose of such tests.
  • Another participant emphasizes the need for precise vocabulary in discussing statistical concepts to avoid confusion.

Areas of Agreement / Disagreement

Participants express disagreement on the interpretation of p-values and the implications of hypothesis testing. There is no consensus on the correct understanding of how p-values relate to the null hypothesis and the certainty of conclusions drawn from them.

Contextual Notes

Participants highlight limitations in understanding p-values and hypothesis testing, including the distinction between frequentist and Bayesian interpretations, and the need for precise language in statistical discussions.

Isaac0427
Hi,

I am in AP Biology, and I completely understand how to use chi squared to the level the AP exam and class requires, but I do not completely understand why p-values work the way they do. I have done a lot of research, and I think I may have an idea as to how this works. Can you please tell me if I am correct or not, and if not why? Thank you in advance!

For this, suppose we go to a remote island and look at right- and left-handedness. Our null hypothesis is that there is no natural favor between being right- and left-handed, i.e. the distribution is 50/50. The alternative hypothesis is thus that nature favors either right- or left-handedness. We calculate a chi squared and then find a p-value. Is this how we would interpret our results?:
Chance that there is no favor: p
Chance that there is a favor: 1-p

If we use the .05 threshold and we get a p-value of .06, there is only a 6% chance the null hypothesis is correct, but we still accept it because we would rather say there is no favor if there is one than say there is one if there isn't, right? If we wanted the other way around, i.e. to rather say there is a favor when there is not than say there is no favor when there is, we would use the left-tailed test (p=0.95 threshold), right?
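The island calculation described above can be sketched in a few lines. This is a hedged illustration, not anything from the thread: the counts (36 right-handed, 14 left-handed out of 50) are made up, and only the Python standard library is used.

```python
import math

# Hypothetical island sample (made-up numbers for illustration).
observed = {"right": 36, "left": 14}
total = sum(observed.values())
expected = total / 2  # null hypothesis: a 50/50 split

# Pearson's chi-squared statistic: sum of (O - E)^2 / E over categories.
chi2 = sum((count - expected) ** 2 / expected for count in observed.values())

# Two categories -> 1 degree of freedom, where the chi-squared survival
# function reduces to erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Note what `p_value` is: the probability of counts at least this lopsided *assuming* the 50/50 null hypothesis is true. As the replies below stress, it is not the probability that the null hypothesis is true.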
 
Isaac0427 said:
We calculate a chi squared and then find a p-value. Is this how we would interpret our results?:
Chance that there is no favor: p
Chance that there is a favor: 1-p

A p value gives the probability that some set of outcomes happens on the assumption that the "null hypothesis" is correct. It does not say anything about the probability that the null hypothesis is correct and, likewise, 1-p cannot be interpreted as the probability that the null hypothesis is false. So both interpretations you mention are incorrect.

( I assume you understand that the conditional probabilities Pr(H|D) and Pr(D|H) need not be equal. The p-values concern Pr(the data happened | null hypothesis true). Your interpretations concern Pr(null hypothesis true | the data happened).)

It is human nature to think: if the null hypothesis implies the probability of the data happening is small, then the null hypothesis is "probably" wrong. However, this statement cannot be mathematically proven or quantified without making more assumptions than are made in traditional hypothesis testing.

Hypothesis testing is a procedure that has been found empirically to be useful. It isn't a procedure that can be mathematically proven to get the right answer with some known probability.

A Bayesian approach to hypothesis testing can answer "What is the probability that the null hypothesis is true given the data happened", but it involves making more assumptions than are made in traditional ("frequentist") hypothesis testing.
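To make the frequentist/Bayesian contrast concrete, here is a hedged sketch with entirely assumed numbers: a prior probability of 0.9 that there is no favor (H0: each person is right-handed with probability 0.5), and a single specific alternative (H1: right-handed with probability 0.64). Under those extra assumptions, Bayes' rule does yield Pr(H0 | test rejected), and it is nowhere near the 0.05 significance level:

```python
# Exact binomial sums, no approximations. The prior (0.9) and the
# specific alternative (p = 0.64) are assumptions made up for this
# example, not anything from the discussion.
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 100
# For n = 100 the chi-squared test rejects at the 0.05 level
# iff the right-handed count k satisfies |k - 50| >= 10.
reject = [k for k in range(n + 1) if abs(k - 50) >= 10]

alpha = sum(binom_pmf(k, n, 0.5) for k in reject)   # Pr(reject | H0 true)
power = sum(binom_pmf(k, n, 0.64) for k in reject)  # Pr(reject | H1 true)

prior_h0 = 0.9
# Bayes' rule: Pr(H0 true | test rejected).
post_h0 = prior_h0 * alpha / (prior_h0 * alpha + (1 - prior_h0) * power)
print(f"alpha = {alpha:.3f}, power = {power:.3f}, Pr(H0|reject) = {post_h0:.2f}")
```

The point is not the particular numbers (the prior and the alternative were invented for the example) but that the posterior probability only exists once those extra assumptions are made.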
 
Stephen Tashi said:
( I assume you understand that the conditional probabilities Pr(H|D) and Pr(D|H) need not be equal. The p-values concern Pr(the data happened | null hypothesis true). Your interpretations concern Pr(null hypothesis true | the data happened).)
A video I watched from bozeman science says that we can interpret .05 as a 95% confidence that we are rejecting an incorrect null hypothesis. Is this incorrect?
 
Isaac0427 said:
A video I watched from bozeman science says that we can interpret .05 as a 95% confidence that we are rejecting an incorrect null hypothesis. Is this incorrect?

Yes, it is incorrect.

Any calculation you do in frequentist hypothesis is done by assuming the probability distribution implied by the null hypothesis is correct. So all probabilities you calculate are of the form Pr(...something...| null hypothesis is true). You can compute the probability of (incorrectly) rejecting the null hypothesis given the null hypothesis is true. You can't compute the probability of rejecting the null hypothesis (or doing anything else) when the null hypothesis is false.

In frequentist statistics, analysis of what happens when the null hypothesis is false is done by "power curves" and discussions of the "power" of statistical tests. This involves assuming specific ways in which the null hypothesis can be false. (E.g. you can't compute specific probabilities of seeing a certain number of right handers on the vague assumption "right handers are more probable than left handers". To do that, you must make a specific assumption such as "the probability of a person being right handed is 0.64.")
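Following that idea, a power curve for the handedness test can be sketched by picking specific alternatives and computing exact binomial probabilities. This is an illustration under assumptions: the sample size (n = 100) and the grid of alternative probabilities are made up.

```python
from math import comb

def reject_prob(p, n=100):
    """Pr(the chi-squared test rejects | true right-handed probability is p).

    For n = 100 the test rejects at the 0.05 level iff the
    right-handed count k satisfies |k - 50| >= 10.
    """
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n + 1) if abs(k - 50) >= 10)

# At p = 0.5 this is the false-rejection rate; farther from 0.5 it is
# the power of the test against that specific alternative.
for p in (0.50, 0.55, 0.60, 0.64, 0.70):
    print(f"true p = {p:.2f} -> Pr(reject) = {reject_prob(p):.3f}")
```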

"Confidence" has a technical definition in statistics. In particular "95% confidence about such-and-such a statement" is not the same as saying that there is .95 probability that the statement is true. To evaluate what bozeman science says, we'd have see an exact quote about what is claimed.
 
Isaac0427 said:
A video I watched from bozeman science says that we can interpret .05 as a 95% confidence that we are rejecting an incorrect null hypothesis. Is this incorrect?
As you wrote it, it is indeed incorrect. A p value is the probability of the observed data assuming that the null hypothesis is true: P(data|H0 is true). If we make a decision in advance to reject H0 iff p<0.05, then P(reject H0|H0 is true)=0.05. It cannot and does not tell you anything about P(X|H0 is false) for any event X.
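The statement P(reject H0|H0 is true)=0.05 can be checked by simulation. A minimal sketch (standard library only; the sample size of 100 and the trial count are arbitrary choices for illustration):

```python
import math
import random

random.seed(0)  # fixed seed so the run is reproducible

def chi2_p_value(k, n):
    """p-value of the 1-df chi-squared test against a 50/50 null."""
    expected = n / 2
    chi2 = ((k - expected) ** 2 / expected
            + ((n - k) - expected) ** 2 / expected)
    return math.erfc(math.sqrt(chi2 / 2))

n, trials = 100, 20000
rejections = 0
for _ in range(trials):
    # The null hypothesis is TRUE in every trial: each person is
    # right-handed with probability exactly 0.5.
    k = sum(1 for _ in range(n) if random.random() < 0.5)
    if chi2_p_value(k, n) < 0.05:
        rejections += 1

rate = rejections / trials
print(f"Pr(reject | H0 true) ~ {rate:.3f}")
# Close to 0.05, though slightly off, because counts are discrete and
# the chi-squared distribution is only an approximation for them.
```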
 
In the video, Mr. Anderson says that you can interpret .05 as “you are 95% sure you are [correctly] accepting or rejecting the null hypothesis.”

I guess what is also confusing is that .05 seems like a really low threshold to accept a hypothesis. How can you be certain about a hypothesis if 95% of the data you would get under that hypothesis would be better than the data you have?
 
No, he is being very sloppy with his wording. For one thing, you do not “accept” a null hypothesis, you only “fail to reject” a null hypothesis. He seems like an engaging lecturer, but he has clearly decided to trade precision for engagement.

To be fair, he is describing how many people think about statistics. But he is wrong in a very common way.

Isaac0427 said:
I guess what is also confusing is that .05 seems like a really low threshold to accept a hypothesis.
Again, you never accept the null hypothesis in this way. You only fail to reject it.
 
Dale said:
To be fair, he is describing how many people think about statistics. But he is wrong in a very common way.
Well, I guess that confuses me. And I brought up to my AP Bio teacher the question of how critical values go up when you want more certainty, which seems counter-intuitive. He was not sure. We both assumed that Mr. Anderson was completely correct, as the comments were not filled with corrections. So, is there some digit that gives you how certain you are for a given p-value?

Also, if you cannot accept the null hypothesis, what would you use the left-tailed test for?
 
Isaac0427 said:
So, is there some digit that gives you how certain you are for a given p-value?
The question isn’t how certain you are. It is what are you certain about. A low p value makes you certain that the data is not likely under the null hypothesis. In other words you reject the null hypothesis as a likely explanation for the data because it would be so unlikely to see that data if it were true.

Isaac0427 said:
Also, if you cannot accept the null hypothesis, what would you use the left-tailed test for?
I have been doing statistics since the early 90’s and I have never used a left tailed chi squared test.
 
Dale said:
I have been doing statistics since the early 90’s and I have never used a left tailed chi squared test.
Critical value tables have p values of .9, .95, .99, etc. What are those used for?
 
Isaac0427 said:
So, is there some digit that gives you how certain you are for a given p-value?
What exactly is your question? To understand statistics, you need to develop a precise vocabulary. If you are asking a question about a probability, phrase it using the word "probability" and specify the event associated with that probability. Don't phrase questions about a probability using words like "certain" or "confident".

Isaac0427 said:
Also, if you cannot accept the null hypothesis, what would you use the left-tailed test for?

This is another vocabulary technicality. If a company makes a decision based only on a statistical hypothesis test, then it will either proceed on the assumption that the null hypothesis is true or proceed on the assumption that it is false. So, in common language, the company has either accepted or rejected the null hypothesis. However, some statisticians don't like to use the word "accept" when talking about the null hypothesis. This is because all the calculations in a hypothesis test are done on the assumption that the null hypothesis is true, so it would be redundant to say we "accept" it. (The phrase "reject the null hypothesis" has the interesting logical problem that if we do calculations based on the assumption the null hypothesis is true and then reject it, this implies our calculations are no longer justified. Nevertheless, most statisticians are comfortable using that phrase.)
 
Isaac0427 said:
Critical value tables have p values of .9, .95, .99, etc. What are those used for?
Nothing that I have ever seen
 
Ok, thank you guys. This Bozeman video seems to have some key details incorrect, as the whole video is about accepting the null hypothesis. Thank you for all the help.
 
Stephen Tashi said:
The phrase "reject the null hypothesis" has the interesting logical problem that if we do calculations based on the assumption the null hypothesis is true and then reject it, this implies our calculations are no longer justified.
I don’t think this is a problem. It is the typical approach of assuming X, finding a contradiction, and therefore concluding not-X. It is a standard logical approach for many proofs. The only difference is that it is not a contradiction, but just an incongruity.
 
Notice that the term "confidence" is not formally defined as a probability. That is intentional. Consider your original example. You can say "If there is no favor between right- and left-handedness, then the probability of getting results at least as extreme as the ones we got is p." But you cannot turn that statement around and say anything about the probability of "no favor between right and left-handedness". It is true that one should be skeptical of the null hypothesis if the data obtained is extremely unlikely under that assumption. But we cannot directly assign a probability number to that skepticism. So terms like "skepticism" or "confidence" should not be thought of directly as probabilities.
 
Dale said:
I don’t think this is a problem. It is the typical approach of assuming X, finding a contradiction, and therefore concluding not-X. It is a standard logical approach for many proofs. The only difference is that it is not a contradiction, but just an incongruity.

I agree with the analogy to indirect proof. The convoluted aspect is justifying what we consider an incongruity. Typically, the data gives a single value for a statistic. So how do we explain why we look at probabilities of that single value falling in a larger "acceptance region"? And how do we explain the (intuitively obvious) choice of the rejection region as one or both tails of the distribution for the statistic? After all, if we picked any arbitrary set of intervals with a total probability of p for the acceptance region, then the probability of incorrectly rejecting the null hypothesis when the statistic falls outside of those intervals is 1-p.

I think the answers to those questions (in frequentist statistics) must be explained by arguments about the power of various tests. The concept of the power of a test involves considering specific ways the null hypothesis can be false.

It's no wonder that people trying to explain introductory statistics make oversimplifications.
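That observation about arbitrary rejection regions can be illustrated numerically. A sketch with assumed numbers (n = 100, and a specific alternative of 0.64 for the probability of right-handedness): the usual two-tailed region and an arbitrarily chosen central region have comparable probability under the null, but very different power against the alternative:

```python
from math import comb

def region_prob(region, p, n=100):
    """Pr(right-handed count lands in `region` | true probability p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in region)

# The usual two-tailed rejection region for this test at the 0.05 level...
tails = [k for k in range(101) if abs(k - 50) >= 10]
# ...versus an arbitrary region chosen (for illustration) to have a
# similar probability under the 50/50 null.
weird = [44, 56]

for name, region in (("tails", tails), ("weird", weird)):
    size = region_prob(region, 0.5)    # Pr(reject | H0 true)
    power = region_prob(region, 0.64)  # Pr(reject | specific alternative)
    print(f"{name}: size = {size:.3f}, power = {power:.3f}")
```

Both regions give a test of roughly the same size, so the choice between them has to be justified by power, exactly as the post above argues.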
 
Stephen Tashi said:
After all, if we picked any arbitrary set of intervals with a total probability of p for the acceptance region then the probability of incorrectly rejecting the null hypothesis when the statistic falls outside of those intervals is 1-p.
Woah! I hadn’t considered that. Very nice.

On a related note, a null hypothesis is inherently kind of silly. I mean, it is almost always much more reasonable to assume that it is false, and ask if it is a good approximation instead.
 
