Choosing an appropriate hypothesis

  • #52
Agent Smith said:
The issue was different. I guess I should've stuck to the original example question, and so I will. How would I confirm the hypothesis that the proportion of cell phone owners in my small town hasn't changed? It is a valid hypothesis, right?
No. It's too strong and/or vague. Do you mean hasn't changed AT ALL -- was 0.31000000 and still is 0.31000000? If the conservative and cautious assumption is that it hasn't changed, then you shouldn't assume otherwise unless you have solid evidence. Why would you?
PS. There are techniques for optimal decision making if that is what you are looking for. It is an entire subject in itself.
Agent Smith said:
@Dale was kind enough to use a simpler (coin flip) example (or was it you?), but I realize now that it suffers from the same issue viz. I can't accept ##H_0##.
Why? If there are costs or dangers to balance, then you should define an example like that, where retaining an erroneous null hypothesis assumption is bad. Then you can compare the costs of Type I versus Type II errors.
Agent Smith said:
Hence, he suggested ROPE (which is confidence interval based inference). It's 😢 that a high school student won't be able to assist in wildlife conservation research, being unable to test (say) the mocking jay population hasn't changed.
High school students can use statistical hypothesis testing. (Please don't bring up yet another hypothetical scenario like mocking jays. Jumping around is just confusing.)
 
  • #53
@Dale what do you know, I had saved your article for a later read. I don't know if my compliments are of any value, but it's such a well-written article. Thank you.

I will now describe myself conducting a hypothesis test.
##H_0: \text{The coin is fair}##
##H_a: \text{The coin is not fair}##

I now take the coin and do a long series of coin flips (you did 4000) and count the outcomes. Say number of tails = A and number of heads = E. I suppose this is my "evidence". We have something called a posterior distribution and a highest posterior density interval. I didn't quite understand these. What are they? Are they probability distributions of the outcomes (heads/tails)? They look as though they've been done by a computer.
##P(X \in [0.45, 0.55]) = 0.95## would mean the probability that the random variable X = the frequency of tails, is in the range ##[0.45, 0.55]## is ##95\%##. Correct?

Bayes' Theorem: ##P(H|E) = \frac{P(H) \times P(E|H)}{P(E)}##

I don't know how we could use ##P(X \in [0.45, 0.55]) = 0.95## in Bayes' formula. What are the values for ##P(E|H)## and ##P(E)##?

Setting these doubts aside for the moment, we then end up with a posterior probability for ##H_0##. Is there some kind of threshold probability here, like p-value?

This is also excellent: ##\frac{P(H_A|E)}{P(H_B|E)} = \frac{P(E|H_A) \times P(H_A)}{P(E|H_B) \times P(H_B)}##

I guess what I'm asking for is a walkthrough of a Bayes' theorem based hypothesis test (my hypothesis being a coin is fair) which I can do at home. Please :smile:
 
  • #54
FactChecker said:
No. It's too strong and/or vague. Do you mean hasn't changed AT ALL -- was 0.31000000 and still is 0.31000000? If the conservative and cautious assumption is that it hasn't changed, then you shouldn't assume otherwise unless you have solid evidence. Why would you?
PS. There are techniques for optimal decision making if that is what you are looking for. It is an entire subject in itself.

Why? If there are costs or dangers to balance, then you should define an example like that, where retaining an erroneous null hypothesis assumption is bad. Then you can compare the costs of Type I versus Type II errors.

High school students can use statistical hypothesis testing. (Please don't bring up yet another hypothetical scenario like mocking jays. Jumping around is just confusing.)
I really don't know how real-life hypothesis tests are done. Can you suggest some DIY statistical hypothesis-testing experiments? Off the top of my head, I can think of testing a coin for fairness/unfairness.

I would definitely like things to be as convenient as possible for me and everyone else involved, but I'm doing high school statistics (B) and stuff has been simplified for us.

Sorry for causing confusion by introducing a new area of statistical research (wildlife conservation). It's just that children are being taught that environmental conservation is important, and I thought you might have experience in that domain, making it easier for you to respond.

Thank you 🔰
 
  • #55
Suppose you want to show that a parameter, ##p##, (mean or variance, etc.) of a distribution is within ##\epsilon## of a value ##p_0## with a certain confidence, ##c## (=95%, 97.5%, etc.). You need the mean and variance for the sample parameter estimator, ##\hat p##. Then you need to collect a sample large enough that the ##c## confidence interval derived from that sample is within ##(p_0-\epsilon, p_0+\epsilon)##.
Confidence intervals are discussed in most introductory statistics books.
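As a concrete (hypothetical) illustration of this recipe, here is a short Python sketch. It assumes the parameter is a proportion and uses the usual normal approximation for its confidence interval; the function name, ##z^* = 1.96##, and the example numbers are mine, not from the thread.

```python
import math

def practically_unchanged(successes, n, p0, eps, z=1.96):
    """True if the confidence interval for a proportion lies entirely
    inside the acceptable band (p0 - eps, p0 + eps).
    z = 1.96 corresponds to ~95% confidence (normal approximation)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p0 - eps < p_hat - margin and p_hat + margin < p0 + eps

# 490 heads in 1000 flips: the 95% CI is about (0.459, 0.521)
print(practically_unchanged(490, 1000, p0=0.5, eps=0.05))  # True
print(practically_unchanged(490, 1000, p0=0.5, eps=0.01))  # False: CI wider than the band
```

If the check fails, the usual remedy is a larger sample, since the margin ##z^*\sqrt{\hat p(1-\hat p)/n}## shrinks as ##n## grows.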
 
  • #56
FactChecker said:
Suppose you want to show that a parameter, ##p##, (mean or variance, etc.) of a distribution is within ##\epsilon## of a value ##p_0## with a certain confidence, ##c## (=95%, 97.5%, etc.). You need the mean and variance for the sample parameter estimator, ##\hat p##. Then you need to collect a sample large enough that the ##c## confidence interval derived from that sample is within ##(p_0-\epsilon, p_0+\epsilon)##.
Confidence intervals are discussed in most introductory statistics books.
Yes, all I can remember is ##\text{Parameter} = \text{Statistic} \pm z^* \frac{\sigma}{\sqrt n}##. I hope this is correct. I had asked @Dale a question, hope he'll reply soon. As for the confidence interval, if my ##p## is within it, I have reason to say that my ##p## hasn't changed, right? So if my ##p = 0.25## and my confidence interval is ##0.23 \pm 0.04##, I can accept my ##H_0##, yes?
 
  • #57
If you want to PROVE that your parameter ##p## hasn't changed, that means you have to prove that it has not changed from ##p=0.2500000000000000## to ##p=0.2500000000000001##. Good luck.
Instead, you should specify an amount of change, ##\epsilon##, that is acceptable and prove that the entire confidence interval is inside the acceptable range ##(p-\epsilon, p+\epsilon)##. That requirement would determine the required sample size. That is what I tried to indicate in Post #55.
 
  • #58
@FactChecker So there is no way, given what you said, to check if a parameter has not changed? I could take a sample, compute a statistic and then construct a confidence interval (like I did above), no? Dale mentioned ROPE, based on Bayes' theorem. I understand that the larger the sample, the lower our margin of error (this was taught to me). However, can I or can I not accept the null hypothesis (assuming everything's perfect for an inference to be made)? 🤓
 
  • #59
It's not clear to me whether you are expecting too much from a statistical experiment or are just not being careful in wording your question. Being precise in mathematical statements is a learned skill. You can show, at a certain level of confidence, that the parameter has not changed significantly. You cannot prove that it has not changed even the tiniest amount.
 
  • #60
Agent Smith said:
I don't know if my compliments are of any value, but it's such a well-written article. Thank you.
Thank you for that, I appreciate the feedback.

Agent Smith said:
We have something called a posterior distribution and highest posterior distribution. I didn't quite understand these. What are they?
Fair warning, this is well beyond basic. A lot of that is covered in the previous articles in that series. But basically the key idea is as follows:

We are going to treat uncertain things like population means, and coin flip frequencies, directly as random variables. Thus they have their own associated probability density functions and so forth.

So, in this case, our uncertainty about the fairness is represented as a probability distribution on the frequency. If we were very certain that it is unfair then the probability distribution on the frequency might have a mean of 0.6 and a standard deviation of 0.02. If we thought it is fair, but we were not very certain about it, then maybe it would have a mean of 0.5 and a standard deviation of 0.2. In any case, the frequency is not just a number, but a probability distribution.

The posterior distribution is just the probability distribution after (posterior to) collecting the data. And the highest posterior density interval is the smallest interval you can make that contains 95% (or whatever level you choose) of the posterior.
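The "smallest interval" idea can be sketched numerically. This is my own minimal grid approximation, not from the article: scan a fine grid over [0, 1] and take the shortest window that accumulates 95% of the posterior mass. No scipy is needed; the Beta(491, 511) example matches the posterior that appears later in this thread.

```python
import math

def hpd_interval(a, b, mass=0.95, n=20000):
    """Smallest interval holding `mass` of a Beta(a, b) density (grid approximation)."""
    xs = [i / n for i in range(1, n)]  # interior grid points
    # work in log space so large a, b don't underflow x**(a-1) * (1-x)**(b-1)
    logp = [(a - 1) * math.log(x) + (b - 1) * math.log(1 - x) for x in xs]
    peak = max(logp)
    w = [math.exp(v - peak) for v in logp]
    total = sum(w)
    cdf, acc = [], 0.0
    for v in w:
        acc += v / total
        cdf.append(acc)
    best, j = (xs[0], xs[-1]), 0
    for i in range(len(xs)):
        left = cdf[i - 1] if i > 0 else 0.0
        # advance the right edge until at least `mass` sits in [xs[i], xs[j]]
        while j < len(xs) and cdf[j] - left < mass:
            j += 1
        if j == len(xs):
            break
        if xs[j] - xs[i] < best[1] - best[0]:
            best = (xs[i], xs[j])
    return best

lo, hi = hpd_interval(491, 511)  # posterior after 490 heads in 1000 flips, uniform prior
print(round(lo, 3), round(hi, 3))  # roughly 0.459 0.521
```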

Agent Smith said:
P(X∈[0.45, 0.55])=0.95 would mean the probability that the random variable X = the frequency of tails, is in the range [0.45, 0.55] is 95%. Correct?
Yes.

Agent Smith said:
Is there some kind of threshold probability here, like p-value?
Yes, you can do the same kinds of things. The Bayesian techniques are not as old as the usual ones, so they don’t have quite as many traditions, like p values of 0.05, but you can use that one if you like and it often reduces arguments to use “traditional” values like that.

Agent Smith said:
I guess what I'm asking for is a walkthrough of a Bayes' theorem based hypothesis test (my hypothesis being a coin is fair) which I can do at home
So first, you have to choose your ROPE. What would you consider to be close enough to fair that you would call it practically fair?

Second, you need to decide how confident you want to be that it is practically fair.
 
  • #61
@Dale I want to know if a coin A I have is fair/unfair.
My hypothesis: ##H:## Coin A is unfair
I flip it 1000 times and record number of heads (H) and number of tails (T).
I now have the posterior distribution (?) = evidence.
I have Bayes' formula: ##P(H|E) = \frac{P(H) \times P(E|H)}{P(E)}##. How do I get the values for ##P(E|H)## and ##P(E)## from my experimental evidence? What I know is that ##P(H)##, the prior probability, would depend on previous work/initial evidence (maybe I flipped my coin 50 times the previous day and noticed 39 heads).
 
  • #62
Agent Smith said:
I want to know if a coin A I have is fair/unfair.
Yes. The first thing you need to do is make two decisions. These are choices that you make. They do not come from statistics, and ideally they should be done before you collect the data.

The first is to choose what you would consider to be close enough to fair that you would call it practically fair.

Second, you need to decide how confident you want to be that the coin is practically fair.
 
  • #63
Dale said:
Yes. The first thing you need to do is make two decisions. These are choices that you make. They do not come from statistics, and ideally they should be done before you collect the data.

The first is to choose what you would consider to be close enough to fair that you would call it practically fair.

Second, you need to decide how confident you want to be that the coin is practically fair.
Fair would be 50/50 heads and tails. Proportion of heads = ##p_H = 0.5 \pm 0.02## (that's the interval [0.48, 0.52]) and proportion of tails = ##1 - p_H##. That would be my first choice.

Second choice, I want to be ##95\%## confident that the coin is fair.

So I have my data; I flipped the coin ##1000## times. Suppose I get (##2## scenarios)
1. Number of heads 490 i.e. ##p_H = 0.49##
2. Number of heads 358 i.e. ##p_H = 0.358##

I have my formula (Bayes' theorem).

How do I use my data to compute ##P(E|H)## and ##P(E)##?
 
  • #64
Agent Smith said:
Fair would be 50/50 heads and tails. Proportion of heads = pH = 0.5±0.02 (that's the interval [0.48, 0.52]) and proportion of tails = 1−pH. That would be my first choice.

Second choice, I want to be 95% confident that the coin is fair.
Perfect, with these we can proceed.

Agent Smith said:
How do I use my data to compute P(E|H) and P(E)?
P(E) is just a normalization that we use to scale the right hand side of the equation so that the integral is 1 (because a probability density function has to integrate to 1 by definition).

Coin flips can be represented as a binomial distribution B(n,p). In this case you are doing ##n=1000## flips, and the probability of heads on each flip is ##p=H##. So $$P(E|H)=\binom{1000}{E} H^E (1-H)^{1000-E}$$

The next thing is to choose your prior. This is your belief in the fairness of the coin (including your uncertainty). For a binomial likelihood like we have here, there is a very convenient form of the prior called a conjugate prior. For the binomial likelihood the conjugate prior is the beta distribution. If our prior is a beta distribution ##\beta(a,b)## then our posterior will be ##\beta(a+E,b+(1000-E))##.

So let's say that we wanted to say that we had a completely uniform prior. In other words, before running the experiment we did not have any reason to believe that the coin would land heads 50% of the time versus 99% of the time. This is called a uniform or uninformative prior. So that would be a prior of ##\beta(1,1)##.

Scenario 1. After observing ##E=490## heads (and ##1000-E=510## tails) then we would have the posterior distribution ##\beta(491,511)##. This has a probability of ##0.708## of being within the ROPE. So this is evidence that the coin is probably practically fair, but it is not strong enough evidence to meet your confidence requirement. There is a non-negligible ~30% chance that it is not practically fair, given this data and the uninformed prior.

[Plot: the Beta(491, 511) posterior density, with the ROPE [0.48, 0.52] shaded]


Scenario 2. After observing ##E=358## heads (and ##1000-E=642## tails) then we would have the posterior distribution ##\beta(359,643)##. This has a probability of ##3.55 \times 10^{-15}## of being within the ROPE. This is pretty strong evidence that the coin is not practically fair.
[Plot: the Beta(359, 643) posterior density, well below the ROPE]
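For readers who want to reproduce these numbers at home, here is a sketch of my own (the thread doesn't give code): a pure-Python grid integration of the Beta posterior over the ROPE, done in log space so the large exponents don't underflow. The function name is mine.

```python
import math

def beta_mass(a, b, lo, hi, n=100000):
    """P(lo < p < hi) under a Beta(a, b) posterior, by grid summation in log space."""
    xs = [i / n for i in range(1, n)]
    logp = [(a - 1) * math.log(x) + (b - 1) * math.log(1 - x) for x in xs]
    peak = max(logp)  # subtracting the peak guards exp() against underflow
    w = [math.exp(v - peak) for v in logp]
    total = sum(w)
    return sum(wi for xi, wi in zip(xs, w) if lo < xi < hi) / total

# Scenario 1: 490 heads with a uniform Beta(1,1) prior -> posterior Beta(491, 511)
print(round(beta_mass(491, 511, 0.48, 0.52), 3))  # 0.708
# Scenario 2: 358 heads -> posterior Beta(359, 643); essentially no mass in the ROPE
print(beta_mass(359, 643, 0.48, 0.52) < 1e-10)    # True
```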
 
  • #65
@Dale Too advanced a topic for me, but I have a much better grasp of what's going on.
Here's where I trip up:
1. I don't know what's a ##\beta## distribution (you linked me to the Wikipage. :thumbup: )
2. I didn't quite get this 👇
$$P(E|H)=\binom{1000}{E} H^E (1-H)^{1000-E}$$

My brain's telling me ##E## and ##H## aren't numbers and so I don't get the right-hand side of the equality.

So given a confidence level ##95\%##, if ##\beta (a, b)## it means that the interval ##[a, b]## gives you (in my example) the proportions that can be considered equivalent to ##0.5## (the coin is fair). If my experimental proportion falls outside this range, the probability that the coin is fair is ##< 0.05##? Correct?
 
  • #66
Agent Smith said:
Fair would be 50/50 heads and tails. Proportion of heads = ##p_H = 0.5 \pm 0.02## (that's the interval [0.48, 0.52]) and proportion of tails = ##1 - p_H##. That would be my first choice.

Second choice, I want to be ##95\%## confident that the coin is fair.

So I have my data; I flipped the coin ##1000## times. Suppose I get (##2## scenarios)
1. Number of heads 490 i.e. ##p_H = 0.49##
2. Number of heads 358 i.e. ##p_H = 0.358##

I have my formula (Bayes' theorem).

How do I use my data to compute ##P(E|H)## and ##P(E)##?
If you are going to use the Bayesian approach, the third thing you should initially decide is what your initial distribution of coin-bias probabilities should be. If you got the coin from Mother Teresa, you can assume a high probability of a reasonably fair coin. That would give you a limited-normal distribution (limited between 0 and 1) with a mean of 0.5 and a small variance. If you are using a coin at an amusement park game, you might want to assume that the coin is more likely to be biased toward the park winning. That would give you a skewed probability (again limited between 0 and 1) of coin-bias toward the park winning. This decision gives you a lot of freedom to tune your analysis to particular situations, but is susceptible to your initial bias.
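FactChecker describes these priors as truncated normals; staying in the Beta family Dale used, the same intent can be expressed through the two shape parameters. A small sketch (the function name and parameter choices are mine): a large, balanced ##(a, b)## plays the "trusted coin" role, while unbalanced parameters give the skewed "park wins" prior.

```python
def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var ** 0.5

# Coin from a trusted source: prior tightly concentrated around fair
print(beta_mean_sd(300, 300))  # mean 0.5, sd ~ 0.020
# Amusement-park coin suspected of favoring the park: skewed prior
print(beta_mean_sd(8, 4))      # mean ~ 0.667, sd ~ 0.131
```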
 
  • #67
Agent Smith said:
1. I don't know what's a β distribution (you linked me to the Wikipage. :thumbup: )
The beta distribution is used to model frequencies a lot because it has 0 probability outside of the range 0 to 1. So it has some nice mathematical properties for this application.

It is possible to do this without the beta distribution. I can show that later.

Agent Smith said:
My brain's telling me E and H aren't numbers and so I don't get the righthand side of the equality.
##E## is the evidence, so it is a number. Specifically, it is the number of heads out of 1000 coin flips. Of course, you could replace the 1000 with another number if you didn't want to do 1000 flips.

##H## is a variable. It is every possible hypothesis of the frequency of heads for the coin. So ##H## ranges from 0 to 1. ##H=0.5## is a coin that is exactly fair. ##0.48<H<0.52## is a coin that is practically fair.

Agent Smith said:
So given a confidence level 95%, if β(a,b) it means that the interval [a,b] gives you (in my example) the proportions that can be considered equivalent to 0.5 (the coin is fair). If my experimental proportion falls outside this range, the probability that the coin is fair is <0.05? Correct?
With the posterior and the ROPE we can directly compute the probability that the coin is fair. All we do is integrate the posterior over the ROPE. So, zooming in on the first scenario, we integrate over the orange region to get the 0.708 probability that the coin is fair:
[Plot: zoomed view of the Beta(491, 511) posterior, with the 0.708 probability mass over the ROPE shaded in orange]
 
  • #68
Dale said:
It is possible to do this without the beta distribution. I can show that later.
Here is a version in an Excel spreadsheet where you can see the calculation directly. It doesn't use the beta distribution at all; instead, I approximate the possible hypotheses discretely. Specifically, I allow only hypotheses with frequencies that are integer multiples of 1/100. So you could hypothesize a coin with a frequency of 0.43, but not a coin with a frequency of 0.432.

In the first column you put the evidence in terms of a number of heads and a number of tails.

In the second column you put in your prior beliefs. Right now I have it set for the uniform prior.

The remaining columns show all of the calculations needed to use Bayes here.
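The spreadsheet's calculation can be mirrored in a few lines of Python. This is my sketch under the same assumptions (101 hypotheses at multiples of 1/100, uniform prior); the function name is mine.

```python
def grid_posterior(heads, tails, m=100):
    """Bayes update over the discrete hypotheses h = 0/m, 1/m, ..., m/m."""
    hs = [i / m for i in range(m + 1)]
    prior = [1.0 / len(hs)] * len(hs)  # uniform prior, as in the spreadsheet
    # binomial likelihood up to the constant C(n, heads), which cancels on normalizing
    like = [h ** heads * (1 - h) ** tails for h in hs]
    unnorm = [p * l for p, l in zip(prior, like)]
    total = sum(unnorm)                # plays the role of P(E)
    return hs, [u / total for u in unnorm]

hs, post = grid_posterior(490, 510)
rope_mass = sum(p for h, p in zip(hs, post) if 0.48 <= h <= 0.52)
print(round(rope_mass, 2))  # a bit above the exact 0.708, because each
                            # grid point here stands for a whole ±0.005 bin
```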
 


  • #69
$$P(E|H)=\binom{1000}{E} H^E (1-H)^{1000-E}$$


This is the binomial theorem I believe. So this is the probability of the evidence given the hypothesis. So I've flipped the coin ##1000## times and I get ##490## heads, that's a heads proportion = ##0.49##.

You said ##H = 0.5##. I don't quite get that. Shouldn't it be ##H = 0.49##? Is the hypothesis that the coin is fair?

[Attached plot from post #64]


This is the probability distribution of frequencies of heads GIVEN the coin is fair?
 
  • #70
##P(H|E)## is usually evaluated as a probability distribution over all possible values of ##H##.
 
  • #71
@Dale and @FactChecker I think the discussion we've had is adequate for my level. Thank you.

One last question. For Bayes' theorem given as ##P(H|D) = \frac{P(H) \times P(D|H)}{P(D)}##, the Wikipage says that ##P(D) \ne 0## (to avoid division by ##0##). But I noticed that since ##P(\neg A) = 1 - P(A)##, if ##P(D) = 0##, we know that ##P(\neg D) = 1##. Can't we use this knowledge to solve the division by ##0## problem?

So we could do ##P(H|\neg D) = \frac{P(H) \times P(\neg D|H)}{P(\neg D)} = \frac{P(H) \times P(\neg D|H)}{1} = P(H) \times P(\neg D|H)##
 
  • #72
Agent Smith said:
Can't we use this knowledge to solve the division by 0 problem?
How? I don’t see what knowing a different number would do for you here.
 
  • #73
Dale said:
How? I don’t see what knowing a different number would do for you here.
We can still test the probability of the hypothesis, but it looks as though posterior probability < prior probability because(?) ##P(\neg D|H)## is going to be low.
 
