matt grime said:
Getting 100 heads in a row does not mean we *must* reject the hypothesis of an unbiased coin. And you cannot *prove* that you *must*.
If you wish to do a hypothesis test, then do so. But don't make nonsensical alternative hypotheses like 'H_0: the coin is fair. H_1 I've got more chance of throwing this letter through a small hole.'
Firstly, I gave no such alternative hypothesis as you mentioned. I gave the example so that one may visualize the practical impossibility of the occurrence of events with very, very small probabilities.
Well, whether the word 'must' applies depends on the sense in which it is used. One may test the null hypothesis H: p = 0.5 against K: p <> 0.5 (p = probability of getting a head in a single toss) with a critical region of size 0.1 or 0.05 or 0.5 or 1-1/2^100 or 1/2^100, as one pleases... but the last three certainly do not make sense to a statistician. The choice of the size of the critical region is subjective, and different sizes may give different conclusions. While performing a test of hypothesis one 'must' risk committing two kinds of error. So, can we say that the method of hypothesis testing is an erroneous one? In general we cannot even minimize the two errors simultaneously. But whatever the choice of critical region, once it is decided, we infer on the basis of it using the sample at our disposal. What we infer is something like: H is true (accepted) against K, or H is false (not accepted) against K, at the given level. I find little difference between the two statements "H is true" and "H must be true" when making statistical inference.
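To make the point about the size of the critical region concrete, here is a rough sketch in Python. It assumes a symmetric two-sided region of the form |heads - n/2| >= c for an even number of tosses n, which is my own choice of region for illustration; the function names are made up for this post.

```python
from fractions import Fraction
from math import comb


def tail_prob(n, k):
    """Exact P(X >= k) for X ~ Binomial(n, 1/2)."""
    return Fraction(sum(comb(n, j) for j in range(k, n + 1)), 2 ** n)


def critical_half_width(n, alpha):
    """Smallest c such that the region |X - n/2| >= c has size <= alpha
    under H: p = 0.5 (n assumed even). Returns (c, actual_size), or None
    if even the most extreme symmetric region is larger than alpha."""
    half = n // 2
    for c in range(1, half + 1):
        size = 2 * tail_prob(n, half + c)  # both tails, by symmetry
        if size <= alpha:
            return c, size
    return None


if __name__ == "__main__":
    for alpha in (Fraction(1, 10), Fraction(1, 20)):
        c, size = critical_half_width(100, alpha)
        print(f"size <= {float(alpha)}: reject H if |heads - 50| >= {c} "
              f"(actual size {float(size):.4f})")
```

Different choices of size give different cut-offs c, so a borderline sample can be rejected at one level and not at another. A sample as extreme as 99 heads out of 100 falls inside the region for either of the conventional sizes.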
In the said example of coin flipping, the hypothesis H: p = 0.5 against K: p <> 0.5 will be rejected for any sensible choice of the critical region using the given sample of 99 heads out of 100 tosses... that is what I meant by saying that the coin 'must' be biased.
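For anyone who wants to check the numbers, a minimal sketch of the exact two-sided p-value under H: p = 0.5 follows. It counts every outcome at least as far from n/2 as the observed one, which is one common convention for a two-sided binomial test and an assumption on my part; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def two_sided_p_value(heads, n):
    """Exact two-sided p-value for H: p = 0.5, summing the probability of
    every count at least as far from n/2 as the observed number of heads."""
    deviation = abs(2 * heads - n)  # twice the distance from n/2, kept integral
    count = sum(comb(n, j) for j in range(n + 1) if abs(2 * j - n) >= deviation)
    return Fraction(count, 2 ** n)


if __name__ == "__main__":
    for heads in (99, 100):
        p = two_sided_p_value(heads, 100)
        print(f"{heads} heads out of 100: p-value = {p} (about {float(p):.3e})")
```

The exact values are 101/2^99 for 99 heads and 1/2^99 for 100 heads, so H is rejected at size 0.1, 0.05, or any other size that is used in practice.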
P.S.
matt grime said:
Getting 100 heads in a row does not mean we *must* reject the hypothesis of an unbiased coin. And you cannot *prove* that you *must*.
1/ If you raise the question of 'must' in the way you did, Bernoulli's theorem 'must' remain unacceptable ("unproved"?) in the same way, as long-run relative frequencies will not necessarily converge to the corresponding probabilities.
2/ Can you show that the size of critical region needed to accept the hypothesis p = 0.5 against p <> 0.5 when the sample is 100 heads out of 100 tosses has ever been used in any practical application of hypothesis testing? Or, for that matter, that such a size of critical region is used even theoretically, where the purpose of hypothesis testing prevails?