Error Rate and Its Role in Significance


Discussion Overview

The discussion revolves around the relationship between p-values and Type II error rates in statistical hypothesis testing. Participants explore the implications of varying alpha levels on Type II errors, the definition of null and alternative hypotheses, and the role of power curves in understanding these concepts. The scope includes theoretical considerations and practical implications in statistical testing.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants suggest that an increase in p-value correlates with a higher likelihood of retaining the null hypothesis, potentially increasing Type II error rates.
  • Others argue that the p-value does not change simply because results are not significant, and question the implications of increasing alpha on Type II error without a specific alternative hypothesis.
  • A participant mentions that power curves can illustrate the probability of Type II error for various alternative hypotheses, emphasizing the need for specificity in defining these hypotheses.
  • Some participants note that increasing alpha generally leads to a decrease in Type II error when a specific alternative hypothesis is defined, while others contest this by stating that practical situations often lack such specificity.
  • There is a discussion about the acceptance region in hypothesis testing, with questions raised about the necessity of symmetrical intervals around the mean and the implications for Type I and Type II errors.
  • Participants explore the concept of contracts based on sample means and intervals, using hypothetical scenarios to illustrate the likelihood of accepting or rejecting hypotheses based on varying intervals.

Areas of Agreement / Disagreement

Participants express differing views on the relationship between alpha levels and Type II error rates, with no consensus reached. Some agree that increasing alpha can decrease Type II error under specific conditions, while others maintain that in many practical situations, the lack of a specific alternative hypothesis complicates this relationship.

Contextual Notes

Participants highlight the importance of defining specific alternative hypotheses to compute Type II error accurately. There are also mentions of the limitations of using power curves without knowing the exact position on the curve, as well as the challenges of establishing acceptance regions in hypothesis testing.

Soaring Crane
I am trying to understand the relationship between p-values and Type II error rate. Type II error occurs when there is failure to reject the null when it is false. As an example, suppose that I find that my results are not significant. In this case, the p-value is increased, or higher (in relation to the alpha level). If the p-value increases, then we are more likely to retain the null, and doesn't this increase Type II error? If this error increases, then the error rate also increases?

Thanks.
 
Soaring Crane said:
As an example, suppose that I find that my results are not significant. In this case, the p-value is increased, or higher (in relation to the alpha level). If the p-value increases, then we are more likely to retain the null, and doesn't this increase Type II error? If this error increases, then the error rate also increases?

In a statistical test, the null hypothesis must be specific enough to let you compute a p-value. So if you have a given set of results, it doesn't make sense to say that the "p-value is increased" when the results aren't significant. For the given null hypothesis and given results, the p-value is whatever it is. It doesn't change if it fails to be significant. Perhaps you want to ask about the effect of increasing alpha.

I think in most practical situations increasing alpha does increase the probability of type II error, but I don't know any mathematical proof that this must always be true.

Unless you have a specific alternative hypothesis, you cannot compute type II error. For example, in flipping a coin ten times, the null hypothesis "the coin is fair" lets you compute the p-value of a given number of heads. But the hypothesis "the coin is not fair" is not specific enough to let you compute any probability. So unless you are very specific about the way in which the null hypothesis is false, you can't compute type II error.
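To make the coin example concrete, here is a minimal sketch of the p-value computation under the null hypothesis "the coin is fair". The function name `two_sided_p_value` is made up for illustration, and the sketch assumes one common convention for "at least as extreme" (every outcome that is no more probable than the observed one); other conventions exist.

```python
from math import comb

def two_sided_p_value(heads, tosses=10):
    """Exact two-sided p-value for the null hypothesis of a fair coin.

    Assumed convention: an outcome counts as "at least as extreme" as the
    observed one when it is no more probable under the null.
    """
    observed = comb(tosses, heads)
    # Under a fair coin every sequence has probability 2**-tosses, so comparing
    # probabilities is the same as comparing integer sequence counts.
    extreme = sum(comb(tosses, k) for k in range(tosses + 1)
                  if comb(tosses, k) <= observed)
    return extreme / 2**tosses

for heads in (5, 8, 9):
    print(heads, two_sided_p_value(heads))
```

Nothing comparable can be written for "the coin is not fair" until a particular probability of heads is chosen, which is the point made above.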

In frequentist statistics people get a feeling for type II error by looking at graphs that represent, in a manner of speaking, all possible alternatives to the null hypothesis. These are called "power curves" for statistical tests. For example, for a specific probability Q of heads, you can compute the probability of a given number of heads in 10 tosses and (for a given alpha) you can compute the probability that a hypothesis test of "the coin is fair" will accept the null hypothesis that the coin is fair when the true probability of heads is Q. The power curve gives you a probability of type II error for each possible value of Q.
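A rough sketch of such a curve for the ten-toss coin test follows. The helper names and the rule "reject when the exact two-sided p-value is at most alpha" are assumptions made for illustration, not something fixed by the thread. The code tabulates the probability of rejecting the null for several values of Q; one minus that number is the probability of accepting the null at that Q.

```python
from math import comb

def binom_pmf(k, n, q):
    """Probability of exactly k heads in n tosses when P(heads) = q."""
    return comb(n, k) * q**k * (1 - q)**(n - k)

def rejection_region(n, alpha):
    """Head counts whose exact two-sided p-value under a fair coin is <= alpha."""
    region = []
    for k in range(n + 1):
        # Outcomes no more probable than k under the null (integer counts avoid float ties).
        tail = sum(comb(n, i) for i in range(n + 1) if comb(n, i) <= comb(n, k))
        if tail / 2**n <= alpha:
            region.append(k)
    return region

def power(q, n=10, alpha=0.05):
    """Probability that the test rejects 'the coin is fair' when P(heads) = q."""
    return sum(binom_pmf(k, n, q) for k in rejection_region(n, alpha))

print("rejection region:", rejection_region(10, 0.05))
for q in (0.6, 0.7, 0.8, 0.9):
    p_reject = power(q)
    print(f"Q={q:.1f}  P(reject)={p_reject:.4f}  P(accept)={1 - p_reject:.4f}")
```

With only ten tosses the test is discrete, so its actual size (the value of `power(0.5)`) comes out below the nominal 0.05.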
 
Stephen Tashi said:
The power curve gives you a probability of type II error for each possible value of Q.

I should have said "The power curve gives you the probability of rejecting the null hypothesis for each possible value of Q". From one minus that, you can get the probability of type II error.
 
Stephen Tashi said:
I think in most practical situations increasing alpha does increase the probability of type II error, but I don't know any mathematical proof that this must always be true.

On the contrary, when we have a specific alternative hypothesis, P[typeII error] decreases as α (the size of the test) increases.
 
ssd said:
On the contrary, when we have a specific alternative hypothesis, P[typeII error] decreases as α (the size of the test) increases.

I agree with that except for the "on the contrary" because in most practical situations there is no specific alternative hypothesis.
 
Well, let us have a little discussion in the spirit of the game. I hope no one minds.
Consider a test for sample mean, population normal with known σ. The notation is the usual one.
H is the null and K is the alternative hypothesis.
To test H: μ = μ0 against K: μ > μ0, the test statistic Z follows a standard normal distribution under H, the size of the test is α, and the critical region is w: Z > Zα.

Since K does not specify a value of μ, we need to compare the power curves corresponding to different α in context of the present problem.

Suppose we plot the power curves for a number of different α values on the same graph, with the values of μ on the horizontal axis (restricted to μ > μ0, so that we see the type II error).
It will be seen that for higher α, the curve is higher. This means the probability of type II error is lower for higher α, at every μ belonging to K.

Increasing α raises the probability of rejecting H, other factors remaining unchanged. Therefore a higher α implies a higher probability of rejecting H even when μ belongs to K, which is the same as a lower probability of type II error.
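The comparison of power curves can be sketched numerically. For this one-sided Z test the power at a given μ is 1 − Φ(Zα − (μ − μ0)√n/σ). The values μ0 = 0, σ = 1 and n = 25 below are arbitrary choices for illustration and do not come from the thread.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1

def power(mu, alpha, mu0=0.0, sigma=1.0, n=25):
    """P(reject H: mu = mu0) for the one-sided Z test when the true mean is mu.

    mu0, sigma and n are illustrative assumptions.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)   # critical value: reject when Z > z_alpha
    shift = (mu - mu0) * n**0.5 / sigma       # mean of Z when the true mean is mu
    return 1 - STD_NORMAL.cdf(z_alpha - shift)

alphas = (0.01, 0.05, 0.10)
print("mu    " + "  ".join(f"beta(alpha={a:.2f})" for a in alphas))
for mu in (0.1, 0.3, 0.5, 0.7):
    # beta = probability of type II error = 1 - power
    betas = "  ".join(f"{1 - power(mu, a):16.4f}" for a in alphas)
    print(f"{mu:<5} {betas}")
```

Reading across any row, the type II error probability shrinks as α grows, for every μ in K that is tabulated.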

I look forward to discussion and counterarguments, which are most welcome for refining our understanding of the matter.
 
ssd said:
Consider a test for sample mean, population normal with known σ..

I consider a known [itex]\sigma[/itex] to be rare in practical situations.

I agree that your example shows a situation where a power curve argument proves that enlarging the acceptance region increases type II error. It increases it by an unknown amount, since we don't know where on the curve we are.

There are many natural and good "beginners" questions about math that we see posted over and over on the forum, but there is one in statistics that I've yet to see. An example of it is this: Suppose we know the variance of a normal distribution is [itex]\sigma^2 = 1[/itex] and we are doing a test of the hypothesis that its mean [itex]\mu = 0[/itex] at a 5% significance level. Why must we set the acceptance region (for the sample mean) to be a symmetrical interval that contains zero? After all, we could define the acceptance region to be any set of intervals that has a 95% probability of containing the sample mean and get the same probability of type I error. We could even define the acceptance region to be two disjoint intervals that don't contain the value 0 at all!
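As a rough numerical illustration of this question, the sketch below builds three acceptance regions that each hold the sample mean with probability 0.95 when μ = 0: a symmetric interval, a one-sided region, and two disjoint rays that exclude 0. It then evaluates the probability of accepting the null for a few other values of μ. The standard error of 1 (for instance a single observation) and the particular regions are assumptions chosen for illustration.

```python
from math import inf
from statistics import NormalDist

def prob_in(intervals, mu, se=1.0):
    """P(sample mean lies in a union of disjoint intervals) when the true mean is mu."""
    dist = NormalDist(mu, se)
    total = 0.0
    for lo, hi in intervals:
        upper = 1.0 if hi == inf else dist.cdf(hi)
        lower = 0.0 if lo == -inf else dist.cdf(lo)
        total += upper - lower
    return total

z = NormalDist()
c_sym = z.inv_cdf(0.975)    # symmetric: accept when |mean| <= c_sym
c_one = z.inv_cdf(0.95)     # one-sided: accept when mean <= c_one
c_hole = z.inv_cdf(0.525)   # disjoint: accept when |mean| >= c_hole

regions = {
    "symmetric": [(-c_sym, c_sym)],
    "one-sided": [(-inf, c_one)],
    "disjoint, excludes 0": [(-inf, -c_hole), (c_hole, inf)],
}

for name, region in regions.items():
    row = "  ".join(f"mu={mu:+.0f}: {prob_in(region, mu):.4f}" for mu in (0, 1, -1, 2))
    print(f"{name:22s} {row}")
```

All three regions give the same type I error, but their acceptance probabilities away from μ = 0 differ. In this sketch the disjoint region accepts the null more often when it is false than when it is true, and the one-sided region does better than the symmetric one for μ > 0 and worse for μ < 0.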

Perhaps the only answer to the above question is to resort to a power curve argument and show that a test with a symmetrical region about 0 is "uniformly most powerful" among all possible tests using the value of the sample mean. I've never read such a proof. (If it exists, I think it would have to be very technical, since "all possible tests" is a big class of tests!)
 
I just gave the simplest (?) example. If σ is unknown, we simply have to think of a 't' test.
The fact about power curves and varying α remains very similar.
 
ssd said:
I just gave the simplest (?) example. If σ is unknown, we simply have to think of a 't' test.
The fact about power curves and varying α remains very similar.

A t-test of whether two distributions known to have the same standard deviation also have the same mean has a power curve. If we take the frequently encountered situation of testing whether two distributions not known to have the same standard deviation have the same mean, then we get a "power surface": say, alpha on the z axis, difference in means on x, and difference in standard deviations on y.
 
Thinking more about the original question
Soaring Crane said:
If the p-value increases, then we are more likely to retain the null, and doesn't this increase Type II error?
I've tried to defog my mind as follows:

Suppose we define an interval such as [itex]I_A[/itex] = [-1.5, 1.5].

And suppose we sign some contracts such as the following:

If the sample mean falls in [itex]I_A[/itex] then I will dance a jig.

If the sample mean falls in [itex]I_A[/itex] then I will accept the teaching of the philosopher Hegel

If the sample mean falls in [itex]I_A[/itex] then I will reject the theory of Evolution

Now define a larger interval [itex]I_B[/itex] = [-2.0, 2.0].

What can we say about analogous contracts that are based on [itex]I_B[/itex] instead of [itex]I_A[/itex]?

A contract such as

If the sample mean falls in [itex]I_B[/itex] then I will dance a jig

is at least as likely to take effect as the similar contract based on [itex]I_A[/itex] since [itex]I_B[/itex] includes [itex]I_A[/itex] as a subset. The contract based on [itex]I_B[/itex] will be more likely to take effect if increasing the interval to [itex]I_B[/itex] includes an additional event that has positive probability.

If we restrict ourselves to situations not mentioned in the contract, such as particular weather, the same conclusions apply. The contract about dancing a jig didn't mention anything about the weather. So we can say:

If it is raining then the probability that I will dance a jig under the contract using [itex]I_B[/itex] is equal or greater than the probability that I will dance a jig under the contract using [itex]I_A[/itex].

The contracts don't mention any conditions about the absolute truth of the ideas to be accepted or rejected. Thus we can say:

If the ideas of Hegel are false then the probability that I will dance a jig under the contract using [itex]I_B[/itex] is equal or greater than the probability that I will dance a jig under the contract using [itex]I_A[/itex].

and

If the ideas of Hegel are false then the probability that I will accept the teaching of the philosopher Hegel under the contract using [itex]I_B[/itex] is equal or greater than the probability that I will accept his teaching under the contract using [itex]I_A[/itex].

and

If the ideas of Hegel are true then the probability that I will accept the teaching of the philosopher Hegel under the contract using [itex]I_B[/itex] is equal or greater than the probability that I will accept his teaching under the contract using [itex]I_A[/itex].

and

If the theory of evolution is false then the probability that I will accept the teaching of the philosopher Hegel under the contract using [itex]I_B[/itex] is equal or greater than the probability that I will accept his teaching under the contract using [itex]I_A[/itex].

and

If the theory of evolution is false then the probability that I will reject the theory of Evolution under the contract using [itex]I_B[/itex] is equal or greater than the probability that I will reject it under the contract using [itex]I_A[/itex].

If the theory of evolution is true then the probability that I will reject the theory of Evolution under the contract using [itex]I_B[/itex] is equal or greater than the probability that I will reject it under the contract using [itex]I_A[/itex].

The answer to the original question is that the probability of type II error will increase if you enlarge the acceptance interval and the enlargement includes some event with non-zero probability. But, more generally, the probability of your doing anything (dancing a jig, etc.) that is triggered by the result falling in a certain region will increase if you enlarge the region to include additional probability.
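The nesting argument can be checked numerically with a minimal sketch. It assumes the sample mean is normally distributed with standard error 1; that is an illustrative choice, since the argument itself needs no particular distribution.

```python
from statistics import NormalDist

I_A = (-1.5, 1.5)
I_B = (-2.0, 2.0)   # contains I_A as a subset

def prob_in(interval, mu, se=1.0):
    """P(sample mean falls in the interval) when the true mean is mu (se is assumed)."""
    lo, hi = interval
    dist = NormalDist(mu, se)
    return dist.cdf(hi) - dist.cdf(lo)

# Whatever the true mean is, the contract based on I_B is triggered at least
# as often as the one based on I_A.
for mu in (0.0, 0.5, 1.0, 2.0, 3.0):
    p_a, p_b = prob_in(I_A, mu), prob_in(I_B, mu)
    print(f"mu={mu:.1f}  P(in I_A)={p_a:.4f}  P(in I_B)={p_b:.4f}  increase={p_b - p_a:.4f}")
```

When the interval is the acceptance region of a test and μ is a value for which the null is false, the two probabilities printed are the type II error probabilities for the narrower and wider regions.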
 
