Two Bernoulli distribution- test hypothesis for biased coins

no_alone · Jul 2, 2018

I'm simplifying a research question that I have at work. Suppose I have 2 biased coins, each with a different probability of heads; let's call heads a success (p). I do not know the probability of success of either coin, but I do have a sample:

Coin 1 - 4 heads, 3 tails
Coin 2 - 3 heads, 1 tail

Now I would like to reject the hypothesis that coin 1 has p = 0.3 and coin 2 has p = 0.5. How can I do that?

Or/and I would also like to not reject a different hypothesis (that is, state that it is not very unlikely that this sample came from this hypothesis): coin 1 p = 0.7 and coin 2 p = 0.7. (Is it the same calculation, just depending on the p-value?)

EDIT - I am actually more interested in saying that I cannot reject a hypothesis. Also, what if I am in the same situation with more than 2 coins (5 coins, with a different sample size for each coin)?

Any help would be great, even pointers on where to search, as I have not managed to find out whether the problem I presented has a name, and if so what it is.

Thank you.

andrewkirk · Jul 2, 2018

no_alone said:

Now I would like to reject the hypothesis that coin 1 has p = 0.3 and coin 2 has p = 0.5. How can I do that?

To do a hypothesis test you need to extract a single test statistic to be measured from the observations. There are many different ways you could do this. One way is to make the test statistic the total number of heads ##R=R_1+R_2##.
With the null hypothesis that you gave, call it ##H_0##, we have ##R_1\sim binom(7,0.3)## and ##R_2\sim binom(4,0.5)##.
The probability of getting seven or more heads is:

$$Pr(R\ge 7| H_0) = \sum_{r_1=0}^7\sum_{r_2=7-r_1}^4 Pr(R_1=r_1| H_0)\,Pr(R_2=r_2| H_0)$$

which is readily calculated using binomial probabilities. If your chosen significance level is ##\alpha## then you can reject the null hypothesis at significance level ##\alpha## if the above probability is less than ##\alpha##.
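As a rough sketch of how the double sum above could be evaluated, the snippet below builds the distribution of the total number of heads by convolving the binomial distributions of the individual coins. The function names (`binom_pmf`, `total_heads_pmf`, `upper_tail`) are made up for this illustration, and the coins are assumed to be independent. Because it takes a list of coins, the same code would also cover the case of more than 2 coins with different sample sizes.

```python
from math import comb

def binom_pmf(n, p):
    """Return [Pr(X = k) for k = 0..n] for X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def total_heads_pmf(coins):
    """Distribution of R = R_1 + ... + R_m for independent coins.

    `coins` is a list of (number_of_flips, hypothesised_p) pairs.
    """
    pmf = [1.0]  # empty sum: R = 0 with probability 1
    for n, p in coins:
        coin_pmf = binom_pmf(n, p)
        new_pmf = [0.0] * (len(pmf) + n)
        for r, pr in enumerate(pmf):
            for k, pk in enumerate(coin_pmf):
                new_pmf[r + k] += pr * pk  # convolution step
        pmf = new_pmf
    return pmf

def upper_tail(coins, observed_total):
    """Pr(R >= observed_total) under the hypothesised p values."""
    return sum(total_heads_pmf(coins)[observed_total:])

# Sample from the first post: 7 flips of coin 1, 4 flips of coin 2, 7 heads in total.
p_h0 = upper_tail([(7, 0.3), (4, 0.5)], 7)   # hypothesis p1 = 0.3, p2 = 0.5
p_alt = upper_tail([(7, 0.7), (4, 0.7)], 7)  # hypothesis p1 = p2 = 0.7

print(f"Pr(R >= 7 | p1=0.3, p2=0.5) = {p_h0:.4f}")
print(f"Pr(R >= 7 | p1=0.7, p2=0.7) = {p_alt:.4f}")
```

Each printed probability would then be compared with the ##\alpha## chosen in advance. Note that this only looks at the upper tail of the total; a different test statistic could give a different answer.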

Dale · Jul 2, 2018

no_alone said:

I would also like to not reject a different hypothesis

Do you want to not reject it, or do you want to accept it?

no_alone · Jul 2, 2018

Thank you @andrewkirk, I will look into it. But your method will probably not help me, as I want to say that I cannot reject the hypothesis.

@Dale I actually want to say that I cannot reject it; I want to say that it is not very unlikely that the sample came from this distribution.

Dale · Jul 2, 2018

no_alone said:

@Dale I actually want to say that I can not reject it,

Then the method proposed by @andrewkirk is what you want to do. Also, you are best off acquiring as little data as possible. The less data you acquire the easier it is to not reject a null hypothesis, regardless of the data.

Note, not rejecting a null hypothesis does not allow you to accept the alternate hypothesis.

no_alone · Jul 2, 2018

@Dale thank you for the response.
I am not sure @andrewkirk's method will help, as his method is for rejecting it (if, for example, I got 7 H, his method is to test what the probability of getting more than 5 H is, but if this probability is not lower than alpha, that does not say that I cannot reject).

Maybe I do not understand something?
Thanks.

Dale · Jul 2, 2018

no_alone said:

if this probability is not lower than alpha it does not say that I can not reject

If ##p<\alpha## then you reject the null hypothesis. If ##p>\alpha## then you cannot reject it. Again, the easiest way to fail to reject a null hypothesis is to acquire insufficient data.

no_alone · Jul 2, 2018

Thank you @Dale, but I am still not sure that it is correct, as he is trying to reject the hypothesis by stating that getting more than 7 H is not likely. But what about getting exactly 7 H? If I want to not reject it, I need to show that the probability of getting exactly 7 H is greater than alpha.

Dale · Jul 2, 2018

no_alone said:

he is trying to reject the hypothesis by stating that getting larger than 7 H is not likely , but what about getting exactly 7H ?

No, you never reject a null hypothesis that way. If your test statistic were continuous instead of discrete then the probability of getting any given value would be 0. You always test the probability of getting a test statistic equal to or more extreme than the actual one.

ssd · Jul 13, 2018

I see no relation between ##p_1## and ##p_2## in the proposed context of testing, where ##p_i## is the probability of heads for coin ##i##. Why not test them separately? Note that you have to specify the level of significance (α) beforehand. Because the test statistic is discrete, to perform a test of size exactly α we generally have to do a randomized test.

Stephen Tashi · Jul 15, 2018

no_alone said:

Now I would like to reject the hypothesis that coin 1 has p = 0.3 and coin 2 has p = 0.5. How can I do that?

If you form an opinion and then try different statistical tests until you find one that confirms your opinion, you are violating the assumptions used in the tests and you are not doing legitimate science.

The p-value computed in a statistical test gives the probability that the observed statistic is in a certain interval given the null hypothesis is true. If you try 5 different statistical tests and pick the lowest of the 5 p-values then, in addition to the null hypothesis, you are also implementing the hypothesis "I will try 5 different tests and pick the lowest of the p-values".
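To put a rough number on this, here is a small sketch of how much the chance of a false rejection grows when the lowest of several p-values is used. It rests on a simplifying assumption that is not part of the argument above: the tests are treated as independent, each with a p-value that is uniform on (0, 1) when the null hypothesis is true. Tests run on the same data are usually correlated, so the real inflation would typically be smaller, but still above the nominal level. The function name is made up for this illustration.

```python
def chance_of_false_rejection(alpha, n_tests):
    """Pr(smallest of n_tests p-values < alpha) when the null is true.

    Assumes independent tests with uniformly distributed p-values,
    which is an idealisation.
    """
    return 1 - (1 - alpha) ** n_tests

for n_tests in (1, 2, 5, 10):
    rate = chance_of_false_rejection(0.05, n_tests)
    print(f"{n_tests:2d} tests: chance of at least one p < 0.05 is {rate:.3f}")
```

Under these assumptions, picking the best of 5 tests at a nominal 5% level rejects a true null hypothesis far more often than 5% of the time.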

Dale · Jul 15, 2018

Stephen Tashi said:

If you try 5 different statistical tests and pick the lowest of the 5 p-values then, in addition to the null hypothesis, you are also implementing the hypothesis "I will try 5 different tests and pick the lowest of the p-values".

This is a recognized weakness of standard methods. The p value depends on future analyses, and any significant result can be made insignificant by planning enough future tests.

IMO, this is one reason to prefer Bayesian methods. Scientists, by their nature are curious people, and data is expensive. Analyzing data only once is both contrary to their personality and economically wasteful.

Dale · Jul 15, 2018

Dale said:

The p value depends on future analyses,

Here is a paper discussing this

http://www.indiana.edu/~kruschke/articles/Kruschke2010WIRES.pdf

no_alone · Feb 28, 2019

@Dale thank you for your reply.
I am still a bit baffled by the question I asked (and the solution that I got).
You and @andrewkirk are saying that I should test whether the probability of getting more than 6 is very low, and if it is not very low, I cannot reject the hypothesis. But this probability (>6) is the probability of getting 7 or 8 or 9 or 10 etc. It feels to me that this is not very accurate. The reason is this: suppose I have a different probability distribution in which the probability of getting 7-9 is extremely low, but the probability of getting 10-19 is very high. Then the analysis that you suggest (which would give a high probability of getting >6) is wrong. Am I missing something?
Thank you

Dale · Feb 28, 2019

What is 7, 8, 9, or 10-19 referring to here?

In general when doing standard hypothesis testing you do the following procedure:

1) form a null hypothesis, this is generally the hypothesis that there is no effect.

2) calculate the probability distribution of some test statistic given the hypothesis

3) sample your data and calculate the test statistic on that sample

4) calculate the probability of getting a test statistic that extreme or more given the null hypothesis

5) if that probability is very low then reject the null hypothesis, meaning reject the hypothesis that there is no effect.
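The five steps above can be sketched for a single coin as follows. This is only an illustration under stated assumptions: it uses coin 1 from the first post (4 heads in 7 flips) with the null hypothesis p = 0.3, takes the number of heads as the test statistic, treats "more extreme" as "at least as many heads" (an upper-tail test), and picks ##\alpha = 0.05## as an example significance level. The function name is made up.

```python
from math import comb

def upper_tail_p_value(heads, flips, p_null):
    """Steps 2 and 4: Pr(X >= heads) for X ~ Binomial(flips, p_null)."""
    return sum(
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(heads, flips + 1)
    )

# Step 1: null hypothesis for coin 1 (from the first post).
p_null = 0.3
# Step 3: the observed sample, 4 heads in 7 flips.
heads, flips = 4, 7
# Step 4: probability of a result at least this extreme under the null.
p_value = upper_tail_p_value(heads, flips, p_null)
# Step 5: compare with a significance level fixed in advance (assumed here).
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"

print(f"p-value = {p_value:.4f} -> {decision}")
```

With only 7 flips the tail probability stays well above 0.05, which is the small-sample effect discussed earlier in the thread: failing to reject here says little about whether p = 0.3 is actually true.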

no_alone · Feb 28, 2019

Thank you @Dale. First, 7, 8, 9 or 10-19 are the total number of H from @andrewkirk's response. Also, thank you for your explanation. My main problem is that I do not want to reject a null hypothesis. I want to show that I cannot reject it, that there is no way to reject it.

I will explain in a bit more detail. I have a model of connection probability between nodes (this can be anything in the world, let's say molecules). I'll simplify it and say that in my model p = 0.5: the probability of connection between nodes is 0.5. This is a specific model that I built.
Now we can sample node couples and check if they are connected. I want to say that even though the number of connections (let's say we found that 15 out of 20 couples are connected) is not exactly what I would predict from my model (10 out of 20), I cannot reject my model. What I want to say is that my model is valid: that even if we have a system with a connection probability of 0.5, it is not very unlikely to get 15 connections out of 20 couples. Is it more clear?
Thank you.

Ray Vickson · Feb 28, 2019

no_alone said:

Thank you @Dale. First, 7, 8, 9 or 10-19 are the total number of H from @andrewkirk's response. Also, thank you for your explanation. My main problem is that I do not want to reject a null hypothesis. I want to show that I cannot reject it, that there is no way to reject it.

I will explain in a bit more detail. I have a model of connection probability between nodes (this can be anything in the world, let's say molecules). I'll simplify it and say that in my model p = 0.5: the probability of connection between nodes is 0.5. This is a specific model that I built.
Now we can sample node couples and check if they are connected. I want to say that even though the number of connections (let's say we found that 15 out of 20 couples are connected) is not exactly what I would predict from my model (10 out of 20), I cannot reject my model. What I want to say is that my model is valid: that even if we have a system with a connection probability of 0.5, it is not very unlikely to get 15 connections out of 20 couples. Is it more clear?
Thank you.

If you want to perform statistical tests there is no way you can guarantee NOT rejecting whatever hypothesis you propose, at least if you have large amounts of data. If your data sets are "small" then, of course, what you say may very well be the case for "reasonable" values of rejection criteria, except in "extreme" cases. For example, if you have a coin in which ##0 < p < 1## (##p=## H probability), then if you only toss it once you cannot reject anything, except for the extreme cases of (i) ##(p=0,\, {\rm H})##--reject for sure; and (ii) ##(p=1,\, {\rm T})##--reject for sure. If you toss it twice and get HH you would have a 0.01% chance of seeing such an outcome when ##p = 0.01##, so if you did allow yourself to reject the hypothesis ##p = 0.01## there is a 99.99% chance you would be doing the right thing by rejecting it, and not rejecting it would seem very puzzling to everybody else.
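A quick numerical sketch of both points, with some assumptions spelled out: it uses an upper-tail binomial probability as the measure of how surprising the data are, it reuses the 15-out-of-20 example from the quoted post under p = 0.5, and it adds a hypothetical larger sample with the same proportion (150 out of 200) purely for comparison. The function name is made up.

```python
from math import comb

def upper_tail(heads, flips, p_null):
    """Pr(X >= heads) for X ~ Binomial(flips, p_null)."""
    return sum(
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(heads, flips + 1)
    )

# Two tosses, both heads, under the hypothesis p = 0.01 (equals 0.01 ** 2).
p_hh = upper_tail(2, 2, 0.01)

# Same observed proportion (75%) at two sample sizes, under p = 0.5.
p_small = upper_tail(15, 20, 0.5)    # example from the quoted post
p_large = upper_tail(150, 200, 0.5)  # hypothetical larger sample

print(f"HH under p = 0.01:          {p_hh:.6f}")
print(f"15 or more of 20 (p = 0.5):   {p_small:.4f}")
print(f"150 or more of 200 (p = 0.5): {p_large:.3e}")
```

The same observed proportion becomes far harder to reconcile with p = 0.5 as the sample grows, which is why non-rejection cannot be guaranteed once there is a lot of data.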

Dale · Mar 1, 2019

no_alone said:

I want to show that I can not reject it.

So, using standard statistical methods wanting to not reject the null hypothesis is considered to be a “very bad” thing. Basically, you can fail to reject the null hypothesis for two reasons: 1) it is correct, 2) you did not collect enough data. This makes it easy for someone to fail to reject the null hypothesis simply by not collecting much data.

Basically, by wanting to not reject the null hypothesis you are attempting to place the burden of proof on the world. You are saying to the world “prove me wrong otherwise I am going to assume I am right”. The world is lazy and unwilling to assume the burden of proof, so instead it is up to you to provide enough proof that you are right. So you have to use different methods that can provide strong evidence for a hypothesis while requiring the researcher to collect enough data.

The process is as follows:

1) think about the practical application of your hypothesis. What are the practical consequences if, for example, your hypothesis is p=0.5 but in reality p=0.5001? Will airplanes fall out of the sky or will deadly cancer go undetected? Or will an entertainment file have an imperceptible glitch?

2) Based on 1) decide on a “region of practical equivalence” (ROPE). This is a region of hypotheses where, if the truth is anywhere in this region, then for all practical purposes you will consider it equivalent to your hypothesis being correct.

3) then you collect your data

4) from the data you form a confidence interval for the parameter your hypothesis is about (here, p)

5) if the confidence interval lies entirely within the ROPE then you accept your hypothesis and consider the data to have proven it. If the confidence interval lies entirely outside the ROPE then you consider your hypothesis to be disproven by the data. If the confidence interval covers a region that is both inside and outside the ROPE then you consider your data inconclusive.

no_alone · Mar 3, 2019

Thank you @Ray Vickson and @Dale for the detailed explanation. I understand the problem.
@Dale can you help me with an example on how to apply those steps that you suggest to the original question?

Dale · Mar 3, 2019

no_alone said:

@Dale can you help me with an example on how to apply those steps that you suggest to the original question?

Sure, but you will have to do part 1 since there is not enough information here for me to infer.

If you assume p=0.5 what are the consequences for your application if in reality p=0.51 or p=0.55, or p=0.60, or p=0.75, or p=0.90, or p=0.99?

no_alone · Mar 3, 2019

Let's say that as long as there is no more than 0.05 error it is ok. For example, if the real value is between 0.45-0.55 there will not be any consequences, thanks.

Dale · Mar 3, 2019

no_alone said:

Let's say that as long as there is no more than 0.05 error it is ok. For example, if the real value is between 0.45-0.55 there will not be any consequences, thanks.

OK, so if ##h## is the number of heads in ##n## trial flips then the 95% confidence interval for ##p## is: $$\frac{h}{n}\pm \frac{1.96}{n}\sqrt{\frac{h(n-h)}{n}}$$
So if you flip 1000 times and get 518 heads then the confidence interval would be (0.487, 0.549) which is entirely within the ROPE so you would consider it good evidence for p=0.5

On the other hand, if you flip 300 times, even if you get exactly 150 heads the confidence interval is (0.443, 0.557), which includes regions both inside and outside the ROPE, so it would be inconclusive evidence. This means that because you are specifying such a tight ROPE you will have to acquire a lot of data to convince people that your probability is within that ROPE. This is the behavior that you want: it assures people that you are not just assuming that your hypothesis is correct, but that you are forcing yourself to provide real strong evidence before making the claim.
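For anyone who wants to reproduce these numbers, here is a minimal sketch of the interval formula and the three-way ROPE decision described earlier. The function names and the "accept" / "reject" / "inconclusive" labels are made up for this illustration, and the ROPE of (0.45, 0.55) is the one agreed above. The interval is the normal-approximation one from the formula, which is an assumption that gets unreliable for very small samples such as the 7 and 4 flips in the original question.

```python
from math import sqrt

def wald_interval(heads, flips, z=1.96):
    """Normal-approximation 95% interval: h/n +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = heads / flips
    half_width = z * sqrt(p_hat * (1 - p_hat) / flips)
    return p_hat - half_width, p_hat + half_width

def rope_decision(interval, rope):
    """Compare a confidence interval with a region of practical equivalence."""
    low, high = interval
    rope_low, rope_high = rope
    if rope_low <= low and high <= rope_high:
        return "accept"        # interval entirely inside the ROPE
    if high < rope_low or low > rope_high:
        return "reject"        # interval entirely outside the ROPE
    return "inconclusive"      # interval straddles a ROPE boundary

ROPE = (0.45, 0.55)

for heads, flips in [(518, 1000), (150, 300)]:
    low, high = wald_interval(heads, flips)
    verdict = rope_decision((low, high), ROPE)
    print(f"{heads}/{flips}: ({low:.3f}, {high:.3f}) -> {verdict}")
```

The decision depends on the ROPE being fixed before looking at the data, in the same way that ##\alpha## has to be fixed in advance for an ordinary test.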
