Some questions about hypothesis testing

Artusartos
I think I’m a bit confused about what \alpha is. In the definition given, what does \theta really mean when they say \alpha = \max_{\theta \in w_0} P_{\theta} [(X_1, \ldots, X_n) \in C]? It’s not really clear in my head, so could anybody explain it or give an example?

(My question is about the second attachment.)

Thanks in advance
 

Attachments

  • 20121201_075858[1].jpg
  • 20121201_075910[1].jpg
Perhaps the example of a "one tailed" test is the best intuitive explanation for that. Suppose we have a coin and our "null hypothesis" is "the coin is not biased toward heads". This hypothesis is different from the hypothesis "the coin is fair" because "not biased toward heads" would include the cases where the probability of the coin landing heads was 0.0 or 0.01 or 0.499 etc.

Hypothesis testing is a procedure where you define some feature of experimental data as "the test statistic" and define some "acceptance region" for that feature. If the feature of the particular data that you observe is outside the "acceptance region", you "reject" the null hypothesis. (Hypothesis testing isn't a mathematical proof that the null hypothesis is true or that it is false and it doesn't compute the probability that the null hypothesis is true or the probability that the null hypothesis is false. It is simply a procedure.)

To apply hypothesis testing to the above example, you must pick a feature of the data. Let's say the data is 100 independent flips of the coin. Let the feature be the total number of heads that occurred. Let's say the "acceptance region" is "80 or fewer flips of the coin produced heads".

The intuitive idea of "alpha" is to answer the question "What is the probability that I will reject the null hypothesis when it is actually true?". In this example, the question is "What is the probability that the observed number of heads will be more than 80 when the coin is actually 'not biased toward heads'?". However, we can't compute a probability for this event because the statement that the coin is "not biased toward heads" doesn't give us a specific probability to compute with. We don't know whether to assume that the probability of a head is 0.0 or 0.01 or 0.499 etc. Intuitively, if we assume the probability of heads is 0.5 then we are letting the coin be "as prone as possible to produce a head, without actually being biased toward producing a head". Assuming the probability of a head is 0.5 gives us a specific probability to use and it captures the idea of "the situation where the null hypothesis would be most likely to produce a false result".
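To make that concrete, here is a minimal sketch that evaluates the rejection probability for the coin example over a grid of head probabilities allowed by the null hypothesis. The function name `reject_prob` and the grid spacing of 0.001 are my own choices for illustration; the 100 flips and the cutoff of 80 heads come from the example above.

```python
from math import comb

def reject_prob(p, n=100, cutoff=80):
    """P(number of heads > cutoff) in n independent flips with P(head) = p."""
    return sum(comb(n, s) * p**s * (1 - p)**(n - s)
               for s in range(cutoff + 1, n + 1))

# Head probabilities consistent with "not biased toward heads": 0.000, 0.001, ..., 0.500
null_values = [i / 1000 for i in range(0, 501)]
probs = [reject_prob(p) for p in null_values]

# alpha is the largest rejection probability over the null set
alpha = max(probs)
worst_p = null_values[probs.index(alpha)]
print(worst_p, alpha)
```

On this grid the maximum is attained at the boundary value p = 0.5, which matches the intuitive argument: the rejection probability only grows as the coin becomes more prone to heads.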

Your book's definition is more sophisticated than the definition found in many introductory texts. Many texts define alpha as "the probability of rejecting the null hypothesis when it is true". Your text defines it as "the maximum of the probabilities of rejecting the null hypothesis taken over all the possible ways the null hypothesis can be true". That is the more general definition of alpha.
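Written out for the coin example, with S the number of heads in 100 flips and the null set taken to be p in [0, 0.5] (my reading of "not biased toward heads"), the book's definition would look roughly like this:

```latex
\alpha = \max_{p \in [0,\,0.5]} P_p[S > 80]
       = P_{0.5}[S > 80]
       = \sum_{s=81}^{100} \binom{100}{s} \left(\tfrac{1}{2}\right)^{100}
```

Here the critical region C is the set of outcomes with more than 80 heads, and the maximum sits at p = 0.5 because P_p[S > 80] increases with p.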
 
Thank you so much for your help, but I think I'm a bit confused again...

There is an example in my textbook (which I attached)...

It says that \alpha = P_{H_0} [S \leq k] and \alpha = P_{p_0} [S \leq k]

I have two questions:

1) In my previous attachments the textbook defined alpha as the maximum of P_{\theta} [(X_1, ... , X_n) \in C]. So in this example, are they saying that \theta = H_0 when they write \alpha = P_{H_0} [S \leq k]?

2) How did H_0 turn into p_0 when they wrote \alpha = P_{p_0} [S \leq k] instead of \alpha = P_{H_0} [S \leq k]?

Thanks in advance
 

Attachments

  • 20121201_160139.jpg
Artusartos said:
1) So in this example, are they saying that \theta = H_0 when they are saying \alpha = P_{H_0} [S \leq k]?
The null hypothesis H_0 is a statement, not a number. So it wouldn't make sense to say \theta = H_0. As I understand their notation, P_{H_0}[S \leq k] means "the probability that S is less than or equal to k under the assumption that H_0 is true" and P_{p_0}[S \leq k] means "the probability that S is less than or equal to k under the assumption that the probability of success is p_0". So both expressions refer to the same probability. (Your text didn't do a good job of defining those notations.)

In that example it is not necessary to speak of the maximum probability computed over the set of all \theta. The null hypothesis in the example is stated as \theta = p = p_0. So the null hypothesis only deals with a single value of \theta, and the maximum over that one-element set is just P_{p_0}[S \leq k].

If they had chosen to state the null hypothesis as \theta = p \ge p_0 then we would have done the same computation for \alpha as the example did, since p = p_0 is the value of p that maximizes the probability that S \leq k among all the possible values of p that are allowed when the null hypothesis is true.
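A quick numerical check of that claim, as a sketch only: the values n = 20, k = 5 and p_0 = 0.5 below are hypothetical stand-ins (the textbook's actual numbers are in the attachment), and `lower_tail` is a helper name I made up.

```python
from math import comb

def lower_tail(p, n, k):
    """P(S <= k) when S counts successes in n independent trials with P(success) = p."""
    return sum(comb(n, s) * p**s * (1 - p)**(n - s) for s in range(0, k + 1))

# Hypothetical values, chosen only for illustration
n, k, p0 = 20, 5, 0.5

# Grid of success probabilities allowed by the composite null p >= p0
grid = [p0 + i * (1 - p0) / 100 for i in range(101)]
tail = [lower_tail(p, n, k) for p in grid]

# alpha is the largest value of P_p[S <= k] over the null set
alpha = max(tail)
print(grid[tail.index(alpha)], alpha)
```

With these numbers the maximum lands at p = p_0, since P_p[S \leq k] shrinks as p grows. So the composite null p \ge p_0 gives the same \alpha as the simple null p = p_0.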

It's an interesting question whether the null hypothesis in the example should be "The new treatment has the same effectiveness as the old treatment" or whether it makes more sense to make it say "The new treatment is no more effective than the old treatment". Since the example proposes a "one tailed" acceptance region, I think it makes more sense to phrase the null hypothesis the second way.

Trying to prove whether a particular type of acceptance region (one tailed, two tailed, or even a bunch of isolated intervals) is "best" involves defining what "best" means. The only way I know to approach that topic in frequentist statistics is to compare the "power" of tests that use different acceptance regions. The power of a test is defined by a function, not by a single number, so comparing the power of two tests is not straightforward either.
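To illustrate why comparing power functions is not straightforward, here is a rough sketch for 100 coin flips. Both rejection regions are arbitrary choices of mine, picked only so that the two tests reject a fair coin at roughly similar rates; they are not taken from the textbook.

```python
from math import comb

N = 100  # number of flips

def power(p, reject_region):
    """Probability that the head count falls in reject_region when P(head) = p."""
    return sum(comb(N, s) * p**s * (1 - p)**(N - s) for s in reject_region)

# Hypothetical rejection regions for the head count S
one_tailed = [s for s in range(N + 1) if s > 58]
two_tailed = [s for s in range(N + 1) if s < 40 or s > 60]

# Evaluate both power functions at a few head probabilities
for p in (0.4, 0.5, 0.6):
    print(p, power(p, one_tailed), power(p, two_tailed))
```

The one-tailed test has more power when the coin is biased toward heads (p = 0.6), while the two-tailed test has more power when it is biased toward tails (p = 0.4). Neither test beats the other at every p.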
 
Thank you so much for your time.
 