Hypothesis Testing with Bayes

Summary: The question asks for the odds that a measured cable is the branded version. The natural tool is Bayesian hypothesis testing, which we were taught as estimation (one prior) and model comparison (two priors). The difficulty is that there are two varieties of non-branded cable, thin and thick, so "branded versus non-branded" does not fit neatly into a two-model comparison.
  • #1
FaraDazed
My example question is as below.

"You're at a local computer fair looking for some nice sleeved cables for your new system. A stall operator shows you his cables, claiming they are the branded version. But the stall operator is not the usual person running the stall and actually does not know if they're the branded ones or not.

You measure the cable precisely with callipers and find the cable to have a diameter of 13mm. You also know from research that the non-branded cables come in two forms, thin and thick, which have mean diameters of 11mm and 31mm respectively, both with a standard deviation of 2mm. You also know that the branded ones have a mean diameter of 23mm with a standard deviation of 7.5mm. Both the branded and non-branded sizes are normally distributed. In this part of the country you know 30% of these types of cables are the branded version and 70% are non-branded. The thin and thick varieties of non-branded cables are just as popular as one another with the public.

What are the odds that the cables are the branded version? You will need to use the Bayesian approach to hypothesis testing."

The question is wordy and hard to wrap your head around, but I understand what it is asking. I just have no idea how to go about it or where to start. I have spent hours scratching my head on this one!

We have been taught two methods of Bayesian hypothesis testing, estimation (one prior) and model comparison (two priors).

My first thought was to use model comparison, as that results in the odds of one model over the other, where the first model is the hypothesis that they are branded and the second model is the hypothesis that they are non-branded. I don't know if that is the correct approach or, if it is, what to do. The fact that there are two varieties of non-branded ones confuses the hell out of me too.

Any help appreciated.
 
  • #2
FaraDazed said:
I don't know if that is the correct approach or, if it is, what to do. The fact that there are two varieties of non-branded ones confuses the hell out of me too.
Just treat it as three hypotheses. A: the cable is branded, B: the cable is thick unbranded, and C: the cable is thin unbranded. You already have a prior probability for each hypothesis, and you can calculate P(data|hypothesis) for each, so you can calculate the posterior also.
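
For reference, the three-hypothesis version of Bayes' theorem described above can be written out as follows. The labels ##H_A##, ##H_B##, ##H_C## simply follow the A, B, C naming in this post and are notation introduced here for illustration, not taken from the original question.

[tex]
P(H_i \mid \textrm{data}) = \frac{P(\textrm{data} \mid H_i)\, P(H_i)}{\sum_{j \in \{A, B, C\}} P(\textrm{data} \mid H_j)\, P(H_j)}, \qquad i \in \{A, B, C\}
[/tex]

The denominator is the same for all three hypotheses, so it cancels whenever two posteriors are divided to form odds.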
 
  • #3
Dale said:
Just treat it as three hypotheses. A: the cable is branded, B: the cable is thick unbranded, and C: the cable is thin unbranded. You already have a prior probability for each hypothesis, and you can calculate P(data|hypothesis) for each, so you can calculate the posterior also.

Thanks for the quick reply. Ah, I didn't think of splitting it into three. The prior probability for each hypothesis is where I am getting confused: I know the sizes are normally distributed, i.e. ##N(\mu=23mm, \ \sigma=7.5mm)## for the branded, and I also know that 30% in general are branded. I'm not sure where to use which information, or whether I need both bits of information to construct the prior.
 
  • #4
FaraDazed said:
The prior probability for each hypothesis is where I am getting confused
Did you maybe miss this subtle statement: “The thin and thick variety of non-branded cables are just as popular as one another with the public”? Together with the 70% number, this allows you to construct a prior for the thick and the thin hypotheses.
 
  • #5
Dale said:
Did you maybe miss this subtle statement: “The thin and thick variety of non-branded cables are just as popular as one another with the public”? Together with the 70% number, this allows you to construct a prior for the thick and the thin hypotheses.

Ok yeah, so 30% branded, 35% thick non-branded, 35% thin non-branded. I get that, but where do the normal distributions of the sizes for the branded, thick and thin non-branded come into play? In terms of the "data", is the only data the measurement we took ourselves, when we measured the cable and got 13mm?

I'm sorry I'm so confused; most of what I find when researching the topic is also incomprehensible to me. Once I have one problem under my belt I can usually reproduce it for similar problems, but doing it for the first time is the hard part!
 
  • #6
FaraDazed said:
where do the normal distributions of the sizes for the branded, thick and thin non-branded come into play?
Those are used to calculate the likelihoods, ##P( data|hypothesis)##, for each hypothesis.

FaraDazed said:
In terms of the "data" is the only data the data we got when we measured the cable ourselves and got 13mm?
Yes
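
As a rough illustration of the likelihood step described above, here is a minimal Python sketch using only the standard library. It assumes the thin non-branded cable is the 11mm one and the thick is the 31mm one, which is what the later remark about the thick posterior being tiny implies. The variable names are made up for this example.

```python
from statistics import NormalDist

# The only data in the problem: the measured diameter in mm.
measured_mm = 13.0

# Size distribution for each hypothesis, as stated in the question.
# Assumption: thin = 11 mm mean, thick = 31 mm mean.
size_models = {
    "branded": NormalDist(mu=23.0, sigma=7.5),
    "thin": NormalDist(mu=11.0, sigma=2.0),
    "thick": NormalDist(mu=31.0, sigma=2.0),
}

# P(data | hypothesis) is the normal density of each model,
# evaluated at the measured diameter.
likelihoods = {name: dist.pdf(measured_mm) for name, dist in size_models.items()}

for name, value in likelihoods.items():
    print(f"P(data | {name}) = {value:.3e}")
```

Note that the normal distributions enter only here, in the likelihoods. The priors come separately from the 30%/70% split.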
 
  • #7
Dale said:
Those are used to calculate the likelihoods, ##P( data|hypothesis)##, for each hypothesis.

Yes

Right! Ok I think I got it now. Thank you so much for your help, I was getting confused thinking the normal distributions were needed for the prior too.
 
  • #8
You are welcome! After a couple of these problems I am sure you will get it. It’s a new way of thinking, but it does make sense
 
  • #9
Dale said:
You are welcome! After a couple of these problems I am sure you will get it. It’s a new way of thinking, but it does make sense

Thanks again. I have done the problem now and found that the odds that the cables are branded are roughly 1/3, i.e. it is almost three times more likely to be unbranded.

The posterior for the thick non-branded was very small, as expected given the data, so it makes practically no difference to the result, but I did want to double check that it is mathematically correct to say that the odds the cables are branded is equal to
[tex]
\frac{P(\textrm{branded hypothesis}|\textrm{data})}{P(\textrm{thin hypothesis }|\textrm{data}) + P(\textrm{thick hypothesis }|\textrm{data})}
[/tex]

As the non branded is comprised of two components, is the above correct?

Also, I wanted to check: given my research on the topic, when I first started the problem I was expecting to get a distribution for the posterior rather than a number. Is this just because my prior for each case was just a number and not a distribution?
 
  • #10
Dale said:
You are welcome! After a couple of these problems I am sure you will get it. It’s a new way of thinking, but it does make sense

Hi, sorry to bug you again. I just needed to know if my understanding of the result is correct, i.e. if the equation in my post above is mathematically correct for the odds of branded to non-branded.

I know if it were just a case of the odds of branded to thin non-branded then it is just

[tex]
\frac{P(\textrm{branded hypothesis}|\textrm{data})}{P(\textrm{thin hypothesis }|\textrm{data})}
[/tex]

But since the non-branded comes in both thin and thick varieties, is the equation in my previous post correct?

Sorry to bug you!
 
  • #11
FaraDazed said:
But since the non-branded comes in both thin and thick varieties, is the equation in my previous post correct?

Sorry to bug you!
No problem, sorry I missed the above post. Yes, that equation is correct. It looks like you have a correct understanding
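
For anyone wanting to check the arithmetic, the following is a minimal sketch of the full calculation, assuming thin = 11mm and thick = 31mm and using the priors 30% / 35% / 35% worked out earlier in the thread. The names are illustrative only. The equation is valid because the three hypotheses are mutually exclusive and exhaustive, so the non-branded posterior is the sum of the thin and thick posteriors and the odds also equal p / (1 - p) for the branded posterior p.

```python
from statistics import NormalDist

measured_mm = 13.0

# (prior probability, size distribution) for each hypothesis.
# Assumption: thin = 11 mm mean, thick = 31 mm mean.
hypotheses = {
    "branded": (0.30, NormalDist(23.0, 7.5)),
    "thin": (0.35, NormalDist(11.0, 2.0)),
    "thick": (0.35, NormalDist(31.0, 2.0)),
}

# Prior times likelihood for each hypothesis (unnormalised posterior).
joint = {
    name: prior * dist.pdf(measured_mm)
    for name, (prior, dist) in hypotheses.items()
}

# Normalise so the three posterior probabilities sum to 1.
evidence = sum(joint.values())
posterior = {name: value / evidence for name, value in joint.items()}

# Odds of branded against non-branded (thin or thick).
odds_branded = posterior["branded"] / (posterior["thin"] + posterior["thick"])

for name, value in posterior.items():
    print(f"P({name} | data) = {value:.4g}")
print(f"odds branded : non-branded = {odds_branded:.3f}")
```

The three posterior values form a discrete distribution over the hypotheses, which is why the answer is a handful of numbers rather than a curve. With the inputs exactly as assumed here, the odds come out at roughly 0.15 rather than the 1/3 quoted earlier, so the numbers used in that calculation may have differed from the ones in this sketch.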
 
  • #12
Dale said:
No problem, sorry I missed the above post. Yes, that equation is correct. It looks like you have a correct understanding

Ok thank you!
 

1. What is Bayes' theorem and how is it used in hypothesis testing?

Bayes' theorem is a mathematical formula that helps us update our beliefs about the probability of an event occurring based on new evidence. In hypothesis testing, it is used to calculate the probability of a hypothesis being true given the observed data. This allows us to make more accurate and informed decisions about the validity of a hypothesis.
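
When two hypotheses are compared, it is often convenient to write Bayes' theorem in odds form. The symbols ##H_1## and ##H_2## below are generic labels chosen for this illustration.

[tex]
\frac{P(H_1 \mid \textrm{data})}{P(H_2 \mid \textrm{data})} = \frac{P(\textrm{data} \mid H_1)}{P(\textrm{data} \mid H_2)} \times \frac{P(H_1)}{P(H_2)}
[/tex]

In words: posterior odds equal the likelihood ratio times the prior odds.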

2. How does Bayes' theorem differ from traditional frequentist methods of hypothesis testing?

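Bayesian methods combine prior beliefs with the likelihood of the observed data to give a posterior probability for each hypothesis. Frequentist methods do not assign probabilities to hypotheses; they instead ask how probable the observed data (or more extreme data) would be if a null hypothesis were true. The Bayesian approach can therefore state directly how probable a hypothesis is given the data, at the cost of having to specify a prior.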

3. What are the key assumptions of Bayesian hypothesis testing?

The key ingredients of Bayesian hypothesis testing are well-defined hypotheses, a prior probability for each hypothesis, and a likelihood function that describes how probable the data are under each hypothesis. Many models additionally assume that the data are independent and identically distributed (IID), but this is a modeling choice rather than a requirement of the Bayesian approach.

4. How do you choose a prior distribution in Bayesian hypothesis testing?

Choosing a prior distribution can be a subjective process, as it involves incorporating prior beliefs about the hypothesis. However, it is recommended to use a non-informative prior, such as a uniform or Jeffreys prior, if there is no prior information available or if you want to avoid biasing the results.
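
To give a feel for how much the prior matters, the sketch below reruns the cable example from the thread above with two priors: the informed one (30% / 35% / 35%) and a uniform prior over the three hypotheses. The function name and the assignment thin = 11mm, thick = 31mm are assumptions made for this illustration.

```python
from statistics import NormalDist


def posterior_odds_branded(priors, measured_mm=13.0):
    """Posterior odds of branded against non-branded (thin or thick)."""
    # Assumption: thin = 11 mm mean, thick = 31 mm mean.
    models = {
        "branded": NormalDist(23.0, 7.5),
        "thin": NormalDist(11.0, 2.0),
        "thick": NormalDist(31.0, 2.0),
    }
    # Prior times likelihood; the normalising constant cancels in the odds.
    joint = {name: priors[name] * models[name].pdf(measured_mm) for name in models}
    return joint["branded"] / (joint["thin"] + joint["thick"])


informed_prior = {"branded": 0.30, "thin": 0.35, "thick": 0.35}
uniform_prior = {name: 1.0 / 3.0 for name in informed_prior}

odds_informed = posterior_odds_branded(informed_prior)
odds_uniform = posterior_odds_branded(uniform_prior)

print(f"odds with informed prior: {odds_informed:.3f}")
print(f"odds with uniform prior:  {odds_uniform:.3f}")
```

Under the uniform prior the odds reduce to a ratio of likelihoods, so the gap between the two printed values shows how much the 30%/70% market information contributes.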

5. Can Bayesian hypothesis testing be applied to all types of data?

Yes, Bayesian hypothesis testing can be applied to any type of data, including continuous, discrete, and categorical data. However, the choice of prior distribution may vary depending on the type of data and the assumptions made about the data-generating process.
