
Error Rate and Its Role in Significance

  1. Feb 27, 2013 #1
    I am trying to understand the relationship between p-values and the Type II error rate. A Type II error occurs when we fail to reject the null hypothesis when it is false. As an example, suppose I find that my results are not significant. In this case, the p-value is increased, or higher, in relation to the alpha level. If the p-value increases, then we are more likely to retain the null, and doesn't this increase Type II error? And if this error increases, then doesn't the error rate also increase?

    Thanks.
     
  3. Feb 27, 2013 #2

    Stephen Tashi

    Science Advisor

    In a statistical test, the null hypothesis must be specific enough to let you compute a p-value. So if you have a given set of results, it doesn't make sense to say that the "p-value is increased" when the results aren't significant. For the given null hypothesis and given results, the p-value is whatever it is. It doesn't change if it fails to be significant. Perhaps you want to ask about the effect of increasing alpha.

    I think in most practical situations increasing alpha does increase the probability of type II error, but I don't know any mathematical proof that this must always be true.

    Unless you have a specific alternative hypothesis, you cannot compute the Type II error. For example, in flipping a coin ten times, the null hypothesis "the coin is fair" lets you compute the p-value of a given number of heads. But the hypothesis "the coin is not fair" is not specific enough to let you compute any probability. So unless you are very specific about the way in which the null hypothesis is false, you can't compute the Type II error.
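
    Here is a minimal sketch of that coin-flip computation in Python (assuming SciPy's binomtest; the 9 heads are only an illustrative outcome, not data from this thread):

[code]
# The specific null hypothesis "the coin is fair" (p = 0.5) lets us compute a
# p-value for an observed number of heads in 10 tosses; the vague alternative
# "the coin is not fair" gives us nothing comparable to compute.
from scipy.stats import binomtest

n_tosses = 10
observed_heads = 9   # hypothetical outcome, chosen only for illustration

result = binomtest(observed_heads, n=n_tosses, p=0.5, alternative='two-sided')
print(f"p-value for {observed_heads} heads in {n_tosses} tosses: {result.pvalue:.4f}")
[/code]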

    In frequentist statistics people get a feeling for Type II error by looking at graphs that represent, in a manner of speaking, all possible alternatives to the null hypothesis. These are called "power curves" for statistical tests. For example, for a specific probability Q of heads, you can compute the probability of a given number of heads in 10 tosses, and (for a given alpha) you can compute the probability that a hypothesis test of "the coin is fair" will accept the null hypothesis when the true probability of heads is Q. The power curve gives you a probability of Type II error for each possible value of Q.
     
  4. Feb 28, 2013 #3

    Stephen Tashi

    Science Advisor

    I should have said "The power curve gives you the probability of rejecting the null hypothesis for each possible value of Q". From one minus that, you can get the probability of type II error.
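
    To make that concrete, here is a rough sketch of the power curve for the coin example (my own illustration, assuming α = 0.05 and a two-sided rejection region built from the fair-coin binomial distribution; the actual size comes out slightly below 0.05 because the counts are discrete):

[code]
import numpy as np
from scipy.stats import binom

n, p0, alpha = 10, 0.5, 0.05

# Build a rejection region of size at most alpha: add the least likely
# outcomes under the null ("the coin is fair") until the alpha budget is used up.
counts = np.arange(n + 1)
null_pmf = binom.pmf(counts, n, p0)
reject = []
total = 0.0
for k in counts[np.argsort(null_pmf)]:        # least likely outcomes first
    if total + null_pmf[k] <= alpha:
        reject.append(k)
        total += null_pmf[k]
reject = np.array(sorted(reject))             # here: counts 0, 1, 9, 10

# Probability of rejecting "the coin is fair" when the true P(heads) is Q;
# one minus that is the probability of Type II error at Q.
for Q in [0.6, 0.7, 0.8, 0.9]:
    power = binom.pmf(reject, n, Q).sum()
    print(f"Q = {Q:.1f}: P(reject) = {power:.3f}, P(Type II error) = {1 - power:.3f}")
[/code]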
     
  5. Mar 11, 2013 #4

    ssd


    On the contrary, when we have a specific alternative hypothesis, P[Type II error] decreases as α (the size of the test) increases.
     
  6. Mar 11, 2013 #5

    Stephen Tashi

    Science Advisor

    I agree with that except for the "on the contrary" because in most practical situations there is no specific alternative hypothesis.
     
  7. Mar 12, 2013 #6

    ssd


    Well, let us have a little discussion in the spirit of the game. I hope no one minds.
    Consider a test for a sample mean, with a normal population and known σ. Notation is the usual.
    H is the null and K is the alternative hypothesis.
    We test H: μ = μ0 against K: μ > μ0. The test statistic is Z (which follows a normal distribution), the size of the test is α, and the critical region is w: Z > Zα.

    Since K does not specify a value of μ, we need to compare the power curves corresponding to different α in the context of the present problem.

    Suppose we plot the power curves for a number of different α values on the same graph, with the μ values (where μ > μ0, to see the Type II error) on the horizontal axis.
    It will be seen that for higher α, the curve is higher. This means the probability of Type II error is lower for higher α, for any μ belonging to K.

    Increasing α implies a higher probability of rejecting H, other factors remaining unchanged. Therefore higher α implies a higher probability of rejecting H even when μ belongs to K, and hence a lower probability of Type II error there.
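
    A small numerical check of this in Python (assuming SciPy; the particular n, σ and μ values are only illustrative):

[code]
import numpy as np
from scipy.stats import norm

mu0, sigma, n = 0.0, 1.0, 25     # assumed values, for illustration only
true_mu = 0.4                    # one particular alternative belonging to K

for alpha in [0.01, 0.05, 0.10]:
    z_alpha = norm.ppf(1 - alpha)                  # critical value Z_alpha
    shift = (true_mu - mu0) * np.sqrt(n) / sigma   # mean of Z when mu = true_mu
    power = 1 - norm.cdf(z_alpha - shift)          # P(Z > Z_alpha | mu = true_mu)
    print(f"alpha = {alpha:.2f}: power = {power:.3f}, P(Type II error) = {1 - power:.3f}")
[/code]

    Higher α gives higher power and lower Type II error at this μ, and the same ordering holds at every μ > μ0.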

    Looking forward to discussion and counter-statements, which are most welcome for refining our understanding of the matter.
     
  8. Mar 12, 2013 #7

    Stephen Tashi

    Science Advisor

    I consider a known [itex] \sigma [/itex] to be rare in practical situations.

    I agree that your example shows a situation where a power curve argument proves that enlarging the acceptance region increases the Type II error. It increases it by an unknown amount, since we don't know where on the curve we are.

    There are many natural and good "beginners'" questions about math that we see posted over and over on the forum, but there is one in statistics that I've yet to see. An example of it is this: Suppose we know the variance of a normal distribution is [itex] \sigma^2 = 1 [/itex] and we are doing a test of the hypothesis that its mean is [itex] \mu = 0 [/itex] at a 5% significance level. Why must we set the acceptance region (for the sample mean) to be a symmetrical interval that contains zero? After all, we could define the acceptance region to be any collection of intervals that has a 95% probability of containing the sample mean under the null hypothesis and get the same probability of Type I error. We could even define the acceptance region to be two disjoint intervals that don't contain the value 0 at all!

    Perhaps the only answer to the above question is to resort to a power curve argument and show that a test with a symmetrical region about 0 is "uniformly most powerful" among all possible tests based on the value of the sample mean. I've never read such a proof. (If it exists, I think it would have to be a very technical one, since "all possible" tests is a big class of tests!)
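
    To make the oddity concrete, here is one such "perverse" acceptance region (a sketch of my own, assuming a single observation from a N(μ, 1) population and a test of μ = 0 at the 5% level): it has exactly 95% probability under the null and avoids 0 entirely, yet its Type II error behaviour is dreadful compared with the symmetric interval.

[code]
from scipy.stats import norm

# Symmetric acceptance region around 0: [-1.96, 1.96].
sym_lo, sym_hi = norm.ppf(0.025), norm.ppf(0.975)

# A "perverse" acceptance region: two disjoint pieces that avoid 0 entirely,
# (-inf, -c] union [c, +inf), with c chosen so that the region still has
# probability 0.95 when mu = 0 (so the Type I error rate is still 5%).
c = norm.ppf(0.525)

def p_accept_symmetric(mu):
    return norm.cdf(sym_hi, loc=mu) - norm.cdf(sym_lo, loc=mu)

def p_accept_perverse(mu):
    return norm.cdf(-c, loc=mu) + norm.sf(c, loc=mu)

for mu in [0.0, 1.0, 3.0]:
    print(f"mu = {mu}: P(accept | symmetric) = {p_accept_symmetric(mu):.3f}, "
          f"P(accept | perverse) = {p_accept_perverse(mu):.3f}")
[/code]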
     
  9. Mar 12, 2013 #8

    ssd


    I just gave the simplest (?) example. If σ is unknown, we simply have to think of a t-test instead.
    The facts about power curves and varying α remain very similar.
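
    For instance (a sketch with illustrative numbers): with σ unknown, the power of the one-sided one-sample t-test at a given true μ comes from a noncentral t distribution, and a larger α still means larger power and smaller Type II error.

[code]
import numpy as np
from scipy.stats import t, nct

mu0, sigma, n, true_mu = 0.0, 1.0, 25, 0.4    # illustrative values only
df = n - 1
nc = (true_mu - mu0) * np.sqrt(n) / sigma     # noncentrality parameter

for alpha in [0.01, 0.05, 0.10]:
    t_crit = t.ppf(1 - alpha, df)             # critical value of the t statistic
    power = nct.sf(t_crit, df, nc)            # P(reject H | mu = true_mu)
    print(f"alpha = {alpha:.2f}: power = {power:.3f}, P(Type II error) = {1 - power:.3f}")
[/code]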
     
  10. Mar 12, 2013 #9

    Stephen Tashi

    Science Advisor

    A t-test of whether two distributions that are known to have the same standard deviation also have the same mean has a power curve. If we take the frequently encountered situation of testing whether two distributions not known to have the same standard deviation have the same mean, then we get a "power surface": say, power on the z axis, the difference in means on the x axis, and the difference in standard deviations on the y axis.
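
    A rough simulation sketch of such a surface (my own setup, not from the thread: Welch's t-test via scipy.stats.ttest_ind with equal_var=False, 20 observations per group, α = 0.05, normal data, and an arbitrary grid of differences):

[code]
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n, alpha, reps = 20, 0.05, 2000

for d_mean in [0.0, 0.5, 1.0]:                 # difference in means (x axis)
    for d_sd in [0.0, 0.5, 1.0]:               # difference in standard deviations (y axis)
        rejections = 0
        for _ in range(reps):
            x = rng.normal(0.0, 1.0, n)
            y = rng.normal(d_mean, 1.0 + d_sd, n)
            p = ttest_ind(x, y, equal_var=False).pvalue   # Welch's t-test
            rejections += (p < alpha)
        print(f"mean diff {d_mean}, sd diff {d_sd}: estimated P(reject) = {rejections / reps:.3f}")
[/code]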
     
  11. Mar 13, 2013 #10

    Stephen Tashi

    Science Advisor

    Thinking more about the original question, I've tried to defog my mind as follows:

    Suppose we define an interval such as [itex] I_A [/itex] = [-1.5, 1.5].

    And suppose we sign some contracts such as the following: "If the result falls in [itex] I_A [/itex], I will accept the null hypothesis", or even "If the result falls in [itex] I_A [/itex], I will dance a jig".

    Now define a larger interval [itex] I_B [/itex] = [-2.0, 2.0].

    What can we say about analogous contracts that are based on [itex] I_B [/itex] instead of [itex] I_A [/itex]?

    A contract such as "If the result falls in [itex] I_B [/itex], I will dance a jig" is at least as likely to take effect as the similar contract based on [itex] I_A [/itex], since [itex] I_B [/itex] includes [itex] I_A [/itex] as a subset. The contract based on [itex] I_B [/itex] will be more likely to take effect if enlarging the interval to [itex] I_B [/itex] includes an additional event that has positive probability.

    If we restrict ourselves to situations not mentioned in the contract, such as particular weather, the same conclusions apply. The contract about dancing a jig didn't mention anything about the weather, so the comparison holds whatever the weather turns out to be.

    The contracts also don't mention any conditions about the absolute truth of the ideas to be accepted or rejected. Thus the comparison holds whether the null hypothesis is true or false: the probability of accepting it when it is true, and the probability of accepting it when it is false, can only go up when the acceptance interval is enlarged.

    The answer to the original question is that the probability of Type II error will increase if you enlarge the acceptance interval and the enlargement includes some event with non-zero probability. But, more generally, the probability of your doing anything (dancing a jig, etc.) that is triggered by the result falling in a certain region will increase if you enlarge the region to include additional probability.
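
    To put rough numbers on that (a sketch of my own, assuming for illustration that the result in question is a sample mean distributed N(μ, 1)):

[code]
from scipy.stats import norm

I_A = (-1.5, 1.5)     # original acceptance interval
I_B = (-2.0, 2.0)     # enlarged acceptance interval

def p_accept(interval, mu, sd=1.0):
    lo, hi = interval
    return norm.cdf(hi, loc=mu, scale=sd) - norm.cdf(lo, loc=mu, scale=sd)

for mu, label in [(0.0, "null true (mu = 0)"), (1.0, "null false (mu = 1)")]:
    print(f"{label}: P(accept | I_A) = {p_accept(I_A, mu):.3f}, "
          f"P(accept | I_B) = {p_accept(I_B, mu):.3f}")
[/code]

    Whether the null hypothesis is true or false, the larger interval accepts more often, exactly as the contract argument says.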
     



