# Two Bernoulli distributions: testing a hypothesis for biased coins

I'm simplifying a research question that I have at work. Suppose I have two biased coins, each with a different probability of heads; let's call heads a success, with probability p. I do not know the probability of success of either coin, but I do have a sample:

Coin 1 - 4 heads, 3 tails
Coin 2 - 3 heads, 1 tail

Now I would like to reject the hypothesis that coin 1 has p = 0.3 and coin 2 has p = 0.5. How can I do that?

Or/and I would also like to not reject a different hypothesis (that is, state that it is not very unlikely that this sample came from it): coin 1 p = 0.7 and coin 2 p = 0.7. Is it the same calculation, just with different values of p?

EDIT - I am actually more interested in saying that I cannot reject a hypothesis...

Also, what if I am in the same situation with more than 2 coins (say 5 coins, with a different sample size for each coin)?

Any help would be great, even pointers on where to search, as I could not work out whether the problem I presented has a name, and if so what it is.

Thank you.

andrewkirk
Homework Helper
Gold Member
Now I would like to reject the hypothesis that coin 1 has p = 0.3 and coin 2 has p = 0.5. How can I do that?
To do a hypothesis test you need to extract a single test statistic to be measured from the observations. There are many different ways you could do this. One way is to make the test statistic the total number of heads ##R=R_1+R_2##.
With the null hypothesis that you gave, call it ##H_0##, we have ##R_1\sim \operatorname{Binom}(7,0.3)## and ##R_2\sim \operatorname{Binom}(4,0.5)##.
The probability of getting seven or more heads is:

$$Pr(R\ge 7 \mid H_0) = \sum_{r_1=0}^7\sum_{r_2=7-r_1}^4 Pr(R_1=r_1 \mid H_0)\,Pr(R_2=r_2 \mid H_0)$$

which is readily calculated using binomial probabilities. If your chosen significance level is ##\alpha##, then you can reject the null hypothesis at significance level ##\alpha## if the above probability is less than ##\alpha##.
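A minimal sketch of this calculation in Python, using only the standard library. The helper names `binom_pmf`, `total_heads_pmf` and `upper_tail` are made up for illustration; the sample sizes (7 and 4 flips) and the 7 observed heads come from the question. Because the total is built up by convolving one coin at a time, the same function would also cover more coins with different sample sizes.

```python
from math import comb

def binom_pmf(n, p):
    """Return [Pr(X = k) for k = 0..n] for X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def total_heads_pmf(coins):
    """PMF of the total number of heads over independent coins.

    coins is a list of (n_flips, p_heads) pairs, one per coin.
    """
    total = [1.0]  # with no coins, the total is 0 with probability 1
    for n, p in coins:
        pmf = binom_pmf(n, p)
        new = [0.0] * (len(total) + n)
        # Convolve the running total with this coin's distribution.
        for i, a in enumerate(total):
            for j, b in enumerate(pmf):
                new[i + j] += a * b
        total = new
    return total

def upper_tail(coins, observed):
    """Pr(R >= observed) under the hypothesised per-coin probabilities."""
    return sum(total_heads_pmf(coins)[observed:])

observed_heads = 4 + 3  # coin 1 gave 4 heads, coin 2 gave 3

p_first = upper_tail([(7, 0.3), (4, 0.5)], observed_heads)
p_second = upper_tail([(7, 0.7), (4, 0.7)], observed_heads)
print(f"H0 (0.3, 0.5): Pr(R >= 7) = {p_first:.4f}")
print(f"H0 (0.7, 0.7): Pr(R >= 7) = {p_second:.4f}")
```

With the sample from the question, the first tail probability comes out at about 0.065 and the second at about 0.79, so with this particular statistic neither hypothesis would be rejected at ##\alpha = 0.05##.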

Dale
Mentor
2020 Award
I would also like to not reject a different hypothesis
Do you want to not reject it, or do you want to accept it?

Thank you @andrewkirk, I will look into it. But your method will probably not help me, as I want to say that I cannot reject the hypothesis.

@Dale I actually want to say that I cannot reject it; I want to say that it is not very unlikely that the sample came from this distribution.

Dale
Mentor
2020 Award
@Dale I actually want to say that I cannot reject it,
Then the method proposed by @andrewkirk is what you want to do. Also, you are best off acquiring as little data as possible. The less data you acquire the easier it is to not reject a null hypothesis, regardless of the data.

Note, not rejecting a null hypothesis does not allow you to accept the alternate hypothesis.

@Dale thank you for the response.
I am not sure @andrewkirk's method will help, as his method is for rejecting the hypothesis (for example, if I got 7 H, his method tests the probability of getting 7 or more H, but if this probability is not lower than alpha, it does not say that I cannot reject).

Maybe I do not understand something?
Thanks.

Dale
Mentor
2020 Award
if this probability is not lower than alpha, it does not say that I cannot reject
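Here ##p## is the p-value, i.e. the tail probability computed above, not the coin's probability of heads. If ##p<\alpha## then you reject the null hypothesis. If ##p>\alpha## then you cannot reject it. Again, the easiest way to fail to reject a null hypothesis is to acquire insufficient data.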

Thank you @Dale, but I am still not sure that it is correct, as he is trying to reject the hypothesis by stating that getting more than 7 H is unlikely, but what about getting exactly 7 H? If I want to not reject it, I need to show that the probability of getting exactly 7 H is greater than alpha.

Dale
Mentor
2020 Award
he is trying to reject the hypothesis by stating that getting more than 7 H is unlikely, but what about getting exactly 7 H?
No, you never reject a null hypothesis that way. If your test statistic were continuous instead of discrete then the probability of getting any given value would be 0. You always test the probability of getting a test statistic equal to or more extreme than the actual one.

I see no relation between p1 and p2 in the proposed testing context, where pi is the probability of heads for coin i. Why not test them separately? Note that you have to specify the significance level (α) beforehand. Because the test statistic is discrete, performing a test of size exactly α generally requires a randomized test.

Stephen Tashi
Now I would like to reject the hypothesis that coin 1 has p = 0.3 and coin 2 has p = 0.5. How can I do that?

If you form an opinion and then try different statistical tests until you find one that confirms your opinion, you are violating the assumptions used in the tests and you are not doing legitimate science.

The p-value computed in a statistical test gives the probability that the test statistic falls in a certain interval (values at least as extreme as the one observed), given that the null hypothesis is true. If you try 5 different statistical tests and pick the lowest of the 5 p-values then, in addition to the null hypothesis, you are also implementing the hypothesis "I will try 5 different tests and pick the lowest of the p-values".
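To put a rough number on this, here is a small sketch under a simplifying assumption that is not in the post: the tests are independent and the null hypothesis is true, so each p-value is uniform on (0, 1). The function name `familywise_rate` is illustrative only.

```python
def familywise_rate(alpha, n_tests):
    """Chance that the smallest of n_tests independent p-values falls
    below alpha when the null hypothesis is true for every test."""
    return 1 - (1 - alpha) ** n_tests

for k in (1, 5, 20):
    print(k, round(familywise_rate(0.05, k), 3))
```

Under these assumptions, picking the best of 5 tests at a nominal 5% level would reject a true null hypothesis roughly 23% of the time. Real tests run on the same data are usually correlated, so the actual figure would differ.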

Dale
Mentor
2020 Award
If you try 5 different statistical tests and pick the lowest of the 5 p-values then, in addition to the null hypothesis, you are also implementing the hypothesis "I will try 5 different tests and pick the lowest of the p-values".
This is a recognized weakness of standard methods. The p value depends on future analyses, and any significant result can be made insignificant by planning enough future tests.

IMO, this is one reason to prefer Bayesian methods. Scientists, by their nature, are curious people, and data is expensive. Analyzing data only once is both contrary to their personality and economically wasteful.

@Dale thank you for your reply.
I am still a bit baffled by the question I asked (and the solution that I got).
You and @andrewkirk are saying that I should test whether the probability of getting more than 6 is not very low, and if it is not very low, I cannot reject the hypothesis. But this probability (>6) is the probability of getting 7 or 8 or 9 or 10, etc. It feels to me that this is not very accurate. The reason is this: suppose I have a different probability distribution in which the probability of getting 7-9 is extremely low but the probability of getting 10-19 is very high. Then the analysis that you suggest (which would give a high probability of getting >6) is wrong. Am I missing something?
Thank you

Dale
Mentor
2020 Award
What is 7, 8, 9, or 10-19 referring to here?

In general when doing standard hypothesis testing you do the following procedure:

1) form a null hypothesis, this is generally the hypothesis that there is no effect.

2) calculate the probability distribution of some test statistic given the hypothesis

3) sample your data and calculate the test statistic on that sample

4) calculate the probability of getting a test statistic that extreme or more given the null hypothesis

5) if that probability is very low then reject the null hypothesis, meaning reject the hypothesis that there is no effect.
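The five steps can be sketched for a single coin as follows. This is only an illustration: it uses coin 1 from the original question (4 heads in 7 flips, null hypothesis p = 0.3), treats "that extreme or more" as "at least as many heads", and picks ##\alpha = 0.05## arbitrarily. The names `null_distribution` and `p_value_upper` are hypothetical.

```python
from math import comb

ALPHA = 0.05  # significance level, fixed before looking at the data (assumed)

def null_distribution(n, p0):
    """Step 2: distribution of the number of heads in n flips if p = p0."""
    return [comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(n + 1)]

def p_value_upper(n, p0, observed):
    """Step 4: probability of a count at least as large as the observed one."""
    return sum(null_distribution(n, p0)[observed:])

n_flips, p0 = 7, 0.3   # step 1: null hypothesis for coin 1
heads = 4              # step 3: test statistic computed from the sample
p_val = p_value_upper(n_flips, p0, heads)
decision = "reject" if p_val < ALPHA else "fail to reject"  # step 5
print(f"p-value = {p_val:.3f}: {decision} the null hypothesis")
```

Note that "fail to reject" is all the last step can say when the p-value is large; it is not evidence that the null hypothesis is true.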

Thank you @Dale. First, 7, 8, 9 or 10-19 are the total number of H from @andrewkirk's response. Also, thank you for your explanation. My main problem is that I do not want to reject a null hypothesis. I want to show that I cannot reject it, that there is no way to reject it.

I will explain in a bit more detail. I have a model of connection probability between nodes (these can be anything in the world, let's say molecules). I'll simplify it and say that in my model p = 0.5, i.e. the probability of connection between nodes is 0.5. This is a specific model that I built.
Now we can sample pairs of nodes and check whether they are connected. I want to say that even though the number of connections (let's say we found that 15 out of 20 pairs are connected) is not exactly what I would predict from my model (10 out of 20), I cannot reject my model. What I want to say is that my model is valid: even if we have a system with a connection probability of 0.5, it is not very unlikely to get 15 connections out of 20 pairs. Is that clearer?
Thank you.
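
As a quick check of this example (a sketch that assumes the 20 pairs are independent and uses a one-sided tail), the probability of 15 or more connections when ##p = 0.5## is:

$$Pr(X\ge 15 \mid p=0.5) = \sum_{k=15}^{20}\binom{20}{k}\left(\frac{1}{2}\right)^{20} = \frac{15504+4845+1140+190+20+1}{2^{20}} = \frac{21700}{1048576} \approx 0.021$$

So with these particular numbers the outcome is fairly unlikely under the model, and a test at ##\alpha = 0.05## would reject it.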

Ray Vickson
Homework Helper
Dearly Missed
If you want to perform statistical tests there is no way you can guarantee NOT rejecting whatever hypothesis you propose, at least if you have large amounts of data. If your data sets are "small" then, of course, what you say may very well be the case for "reasonable" values of rejection criteria, except in "extreme" cases. For example, if you have a coin in which ##0 \le p \le 1## (##p=## H probability), then if you only toss it once you cannot reject anything, except for the extreme cases of (i) ##(p=0,\, {\rm H})##: reject for sure; and (ii) ##(p=1,\, {\rm T})##: reject for sure. If you toss it twice and get HH you would have a 0.01% chance of seeing such an outcome when ##p = 0.01## (since ##0.01^2 = 0.0001##), so if you did allow yourself to reject the hypothesis ##p = 0.01## there is a 99.99% chance you would be doing the right thing by rejecting it, and not rejecting it would seem very puzzling to everybody else.

Dale
Mentor
2020 Award
I want to show that I can not reject it.
So, using standard statistical methods wanting to not reject the null hypothesis is considered to be a “very bad” thing. Basically, you can fail to reject the null hypothesis for two reasons: 1) it is correct, 2) you did not collect enough data. This makes it easy for someone to fail to reject the null hypothesis simply by not collecting much data.

Basically, by wanting to not reject the null hypothesis you are attempting to place the burden of proof on the world. You are saying to the world “prove me wrong otherwise I am going to assume I am right”. The world is lazy and unwilling to assume the burden of proof, so instead it is up to you to provide enough proof that you are right. So you have to use different methods that can provide strong evidence for a hypothesis while requiring the researcher to collect enough data.

The process is as follows:

1) think about the practical application of your hypothesis. What are the practical consequences if, for example, your hypothesis is p=0.5 but in reality p=0.5001? Will airplanes fall out of the sky or will deadly cancer go undetected? Or will an entertainment file have an imperceptible glitch?

2) Based on 1) decide on a “region of practical equivalence” (ROPE). This is a region of hypotheses where, if the truth is anywhere in this region, then for all practical purposes you will consider it equivalent to your hypothesis being correct.

3) then you collect your data

4) from the data you form a confidence interval for the parameter in your hypothesis (here, p)

5) if the confidence interval lies entirely within the ROPE then you accept your hypothesis and consider the data to have proven it. If the confidence interval lies entirely outside the ROPE then you consider your hypothesis to be disproven by the data. If the confidence interval covers a region that is both inside and outside the ROPE then you consider your data inconclusive.
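
The decision rule in step 5 can be sketched as a small function. The name `rope_decision` and the example numbers are hypothetical, and treating an interval that touches the ROPE boundary as "inside" is an assumption the post does not settle.

```python
def rope_decision(ci_low, ci_high, rope_low, rope_high):
    """Classify a confidence interval against a region of practical equivalence."""
    if rope_low <= ci_low and ci_high <= rope_high:
        return "accept"        # interval lies entirely inside the ROPE
    if ci_high < rope_low or ci_low > rope_high:
        return "reject"        # interval lies entirely outside the ROPE
    return "inconclusive"      # interval straddles a ROPE boundary

# Hypothetical numbers, chosen only to exercise the three branches.
print(rope_decision(0.48, 0.53, 0.45, 0.55))  # accept
print(rope_decision(0.60, 0.70, 0.45, 0.55))  # reject
print(rope_decision(0.44, 0.56, 0.45, 0.55))  # inconclusive
```

Unlike a bare "fail to reject", the "accept" branch can only be reached when the interval is narrow, which in practice means enough data has been collected.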

Thank you @Ray Vickson and @Dale for the detailed explanation. I understand the problem.
@Dale can you help me with an example of how to apply the steps that you suggest to the original question?

Dale
Mentor
2020 Award
@Dale can you help me with an example of how to apply the steps that you suggest to the original question?
Sure, but you will have to do part 1 since there is not enough information here for me to infer.

If you assume p=0.5 what are the consequences for your application if in reality p=0.51 or p=0.55, or p=0.60, or p=0.75, or p=0.90, or p=0.99?

Let's say that as long as the error is no more than 0.05 it is OK. For example, if the real value is between 0.45 and 0.55 there will not be any consequences. Thanks.

Dale
Mentor
2020 Award
Let's say that as long as the error is no more than 0.05 it is OK. For example, if the real value is between 0.45 and 0.55 there will not be any consequences. Thanks.
OK, so if ##h## is the number of heads in ##n## trial flips then the 95% confidence interval for ##p## (using the normal approximation) is: $$\frac{h}{n}\pm \frac{1.96}{n}\sqrt{\frac{h(n-h)}{n}}$$
So if you flip 1000 times and get 518 heads then the confidence interval would be (0.487, 0.549), which is entirely within the ROPE, so you would consider it good evidence for p=0.5.

On the other hand, if you flip 300 times even if you get exactly 150 heads the confidence interval is (0.443, 0.557) which includes regions both inside and outside the ROPE so it would be inconclusive evidence. This means that because you are specifying such a tight ROPE you will have to acquire a lot of data to convince people that your probability is within that ROPE. This is the behavior that you want, it assures people that you are not just assuming that your hypothesis is correct, but that you are forcing yourself to provide real strong evidence before making the claim.
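
The two intervals quoted above can be reproduced with a few lines of Python. This is a sketch of the normal-approximation formula from this post only; the function name `wald_interval` is an illustrative choice, and the ROPE of (0.45, 0.55) is the one agreed on earlier in the thread.

```python
from math import sqrt

def wald_interval(heads, n, z=1.96):
    """Normal-approximation confidence interval for p from `heads` successes in n flips."""
    p_hat = heads / n
    half_width = (z / n) * sqrt(heads * (n - heads) / n)
    return p_hat - half_width, p_hat + half_width

ROPE = (0.45, 0.55)  # region of practical equivalence from the thread

for heads, n in [(518, 1000), (150, 300)]:
    low, high = wald_interval(heads, n)
    inside = ROPE[0] <= low and high <= ROPE[1]
    print(f"{heads}/{n}: ({low:.3f}, {high:.3f}) inside ROPE: {inside}")
```

For small samples such as the 7 and 4 flips in the original question, this approximation is rough and the interval would be far wider than the ROPE in any case.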
