A Bayesian question of choosing a white or black dog

Thecla · May 11, 2022

An excellent article on Bayes and Bayesian statistics was found on Houston Public Radio: https://uh.edu/engines/epi1876.htm
The problem is in the first two paragraphs of the article. I will summarize:
Your wife and her friend went out and got you a white dog for your birthday, and you wonder which of them selected it. At first blush it would be a 50-50 guess. But you know two things: your wife doesn't like white dogs very much, and your friend likes them a lot. So the friend probably chose the dog.
We can actually do a calculation, but it is not simple. If the likelihood of your wife's picking a white dog is 15% and her friend's doing so is 90%, the odds that her friend chose it turn out to be 85%.
(End of summary)
How do you do the calculation to get 85% probability that her friend chose the white dog?

Dale · May 11, 2022

This is just Bayes’ theorem: $$P(F|W)=\frac{P(W|F) P(F)}{P(W|F)P(F)+P(W|\neg F) P(\neg F)}$$ where ##W## is the event that the dog is white and ##F## is the event that the friend picked it.
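As a minimal sketch, the article's numbers can be plugged into this formula in a few lines of Python. The 50-50 prior is the one the article assumes; the function name `posterior_friend` and its argument names are illustrative and do not come from the article.

```python
def posterior_friend(p_white_given_friend, p_white_given_wife, prior_friend=0.5):
    """Bayes' theorem for P(F|W): probability the friend picked, given a white dog."""
    prior_wife = 1.0 - prior_friend  # P(not F): the wife picked instead
    numerator = p_white_given_friend * prior_friend  # P(W|F) P(F)
    evidence = numerator + p_white_given_wife * prior_wife  # P(W)
    return numerator / evidence


# Article's numbers: friend picks white 90% of the time, wife 15% of the time.
p = posterior_friend(0.90, 0.15)
print(f"P(friend | white dog) = {p:.4f}")
```

With these inputs the fraction is 0.45 / 0.525 = 6/7, which is where the article's figure comes from.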

Orodruin · May 11, 2022

Correct rounding would result in 86% … (result is 85.71%)

Then there is the issue of the prior. If my wife and a friend of hers go out to buy me a present I would probably not expect an equal probability prior on who picked it.
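To see how much the answer depends on the prior, here is a rough sketch that keeps the article's likelihoods (90% and 15%) fixed and varies the prior probability that the friend picked. The function name and the particular prior values tried are assumptions chosen for illustration only.

```python
def posterior_for_prior(prior_friend, p_white_given_friend=0.90, p_white_given_wife=0.15):
    """P(friend picked | white dog) for a given prior P(friend picked)."""
    numerator = p_white_given_friend * prior_friend
    evidence = numerator + p_white_given_wife * (1.0 - prior_friend)
    return numerator / evidence


# Illustrative priors only; none of these values come from the article.
for prior in (0.1, 0.25, 0.5, 0.75):
    print(f"prior P(friend) = {prior:.2f} -> posterior = {posterior_for_prior(prior):.3f}")
```

A prior of 0.5 reproduces the article's result, while a prior of 0.1 (the wife almost always does the picking) gives a posterior of only 0.4.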

PeroK · May 12, 2022

Thecla said:

Summary: Who picked the white dog knowing each shopper's preference

An excellent article on Bayes and Bayesian statistics was found on Houston Public Radio: https://uh.edu/engines/epi1876.htm
The problem is in the first two paragraphs of the article. I will summarize:
Your wife and her friend went out and got you a white dog for your birthday, and you wonder which of them selected it. At first blush it would be a 50-50 guess. But you know two things: your wife doesn't like white dogs very much, and your friend likes them a lot. So the friend probably chose the dog.
We can actually do a calculation, but it is not simple. If the likelihood of your wife's picking a white dog is 15% and her friend's doing so is 90%, the odds that her friend chose it turn out to be 85%.
(End of summary)
How do you do the calculation to get 85% probability that her friend chose the white dog?

How can you possibly know these probabilities? It makes no sense to me to try to apply probability theory in a case where the basic probabilities themselves are unknowable. The answer that comes out is a meaningless number, rather than a probability of anything.

The article also falls into the trap of calculating a probability based on only some of the relevant factors while ignoring other significant ones: see the point made by @Orodruin.

Orodruin · May 12, 2022

PeroK said:

How can you possibly know these probabilities? It makes no sense to me to try to apply probability theory in a case where the basic probabilities themselves are unknowable. The answer that comes out is a meaningless number, rather than a probability of anything.

Oh, but you can, simply by the fact that something like the "degree of belief" in a statement follows the basic rules of probability. Bayesian statistics essentially tell you how to update your degree of belief in something given additional data. The result will depend on whatever prior knowledge I have on the relations between myself, my wife, and her friend, but that is fairly natural. If I think there is no chance in the nine hells that my wife would let someone else pick my present, then that will not be changed by this occurrence of me being given what would admittedly be a rather odd choice on the part of my wife. However, if my wife would be rather negatively disposed towards the entire giving presents thing, then this observation would also reaffirm that. Even if the numbers are not precise, Bayesian statistics work essentially as a person would typically go about in rearranging their beliefs given new evidence.

PeroK · May 12, 2022

Orodruin said:

Oh, but you can, simply by the fact that something like the "degree of belief" in a statement follows the basic rules of probability. Bayesian statistics essentially tell you how to update your degree of belief in something given additional data. The result will depend on whatever prior knowledge I have on the relations between myself, my wife, and her friend, but that is fairly natural. If I think there is no chance in the nine hells that my wife would let someone else pick my present, then that will not be changed by this occurrence of me being given what would admittedly be a rather odd choice on the part of my wife. However, if my wife would be rather negatively disposed towards the entire giving presents thing, then this observation would also reaffirm that. Even if the numbers are not precise, Bayesian statistics work essentially as a person would typically go about in rearranging their beliefs given new evidence.

That may be, but if you tested the results of such calculations you might find there was no relationship between your estimated probability and the reality.

For example, every day you might do some such calculation that results in a nominal 80% probability of something. It would be a different something every day.

If at the end of the year you were right only 50% of the time, then that would show that the 80% was a meaningless calculation. In that sense "degree of belief" by itself is potentially meaningless, unless that belief is testable in some way.
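The check described here is a calibration test: group forecasts by their stated probability and compare each group with how often the events actually happened. A minimal sketch follows; the function name `calibration_table` and the ten-forecast record are made up for illustration.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Map each stated probability to the observed frequency of the event.

    `forecasts` is an iterable of (stated_probability, came_true) pairs.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for stated, came_true in forecasts:
        counts[stated] += 1
        hits[stated] += int(came_true)
    return {stated: hits[stated] / counts[stated] for stated in counts}


# Made-up record: ten forecasts at a nominal 80%, only five of which came true.
record = [(0.8, True)] * 5 + [(0.8, False)] * 5
print(calibration_table(record))
```

A well-calibrated forecaster would show an observed frequency close to 0.8 in the 0.8 group; the made-up record above shows 0.5 instead.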

PeroK · May 12, 2022

PS for example, in a court of law you would have to justify those beliefs in some way. You shouldn't convict someone on the basis of a probability conjured from someone's untested belief in what their wife and her friend might or might not do. Even if you end up with a "probability" of 99.9% that your neighbour is a murderer, then that is meaningless unless the beliefs that went into such a calculation are justified and/or testable.

Orodruin · May 12, 2022

PeroK said:

You shouldn't convict someone on the basis of a probability conjured from someone's untested belief in what their wife and her friend might or might not do.

Yet this happens every day because most things that are tested in a court of law are not down to precisely knowable probabilities but subjective ones. Requiring those probabilities to be testable and established would mean crippling the justice system.

PeroK said:

If at the end of the year you were right only 50% of the time, then that would show that the 80% was a meaningless calculation. In that sense "degree of belief" by itself is potentially meaningless, unless that belief is testable in some way.

The point is that as you continue to add data, your degree of belief is updated and you become more certain (or uncertain if data goes against your preconceptions).
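As a rough illustration of this updating, suppose (hypothetically, this is not in the article) that the same unknown shopper picks several dogs in a row and we observe each colour. Applying Bayes' theorem after every observation, with the article's 90% and 15% likelihoods, moves quite different starting beliefs in the same direction. The function name and the observation sequence are assumptions.

```python
def update(prior_friend, dog_is_white, p_white_friend=0.90, p_white_wife=0.15):
    """One Bayesian update of P(friend is the shopper) after seeing one dog."""
    like_friend = p_white_friend if dog_is_white else 1.0 - p_white_friend
    like_wife = p_white_wife if dog_is_white else 1.0 - p_white_wife
    numerator = like_friend * prior_friend
    return numerator / (numerator + like_wife * (1.0 - prior_friend))


# Hypothetical sequence of observed dogs: white, white, not white, white.
observations = [True, True, False, True]

for start in (0.1, 0.5, 0.9):  # three different initial degrees of belief
    belief = start
    for dog_is_white in observations:
        belief = update(belief, dog_is_white)
    print(f"start = {start:.1f} -> belief after {len(observations)} dogs = {belief:.3f}")
```

The posterior from one step is simply used as the prior for the next, so data that go against the initial belief pull it down just as readily as agreeing data push it up.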

PeroK · May 12, 2022

Orodruin said:

Yet this happens every day because most things that are tested in a court of law are not down to precisely knowable probabilities but subjective ones. Requiring those probabilities to be testable and established would mean crippling the justice system.

That's why the justice system relies mainly on hard evidence and not speculation. It can and does go wrong but generally evidence such as "I'm 100% convinced that my brother could never commit a crime like this" is not taken as particularly serious evidence. That would be an example of a meaningless "probability".

PeroK · May 12, 2022

Orodruin said:

The point is that as you continue to add data, your degree of belief is updated and you become more certain (or uncertain if data goes against your preconceptions).

That requires multiple identical or similar experiments. The question is whether it's meaningful to calculate a probability based on a one-off scenario where the basic probabilities are unknown. How likely your wife is to buy a white dog seems to me an unknowable probability unless she's out buying dogs every day!

IMO, it uses mathematics to give the impression that we know something that in fact we do not, or that we can give a meaningful estimate of something.

Orodruin · May 12, 2022

PeroK said:

That's why the justice system relies mainly on hard evidence and not speculation. It can and does go wrong but generally evidence such as "I'm 100% convinced that my brother could never commit a crime like this" is not taken as particularly serious evidence. That would be an example of a meaningless "probability".

Even "hard" evidence often relies upon subjective probabilities, or on probabilities that are not the ones you actually want to know. For example, DNA evidence and the like give you a statement about the probability of a DNA match occurring "by chance". This is not what you a priori would like to know, and that type of evidence is also diluted because it connects the accused with a place or an object, but not necessarily with the action. You need to start asking questions like "what is the probability that the accused is the perpetrator given that their DNA is on the murder weapon?" The accused may have an identical twin. The DNA of the accused may have been on the murder weapon for some other reason (e.g., two thieves broke into my office and got into an argument, and one killed the other by hitting him with my coffee cup while wearing gloves; my DNA is for sure on that cup, but what is the probability that there were two thieves, etc.). Even if the probability of a match happening by chance is extremely small, there are a lot of people, so how many of those should be considered? Unless you have an actual recording of the crime, evidence gets diluted by subjective probabilities all the time (and even with a recording you still need to establish motive and/or intent in many cases).

The "I am 100% convinced" is also not the actual probability you should consider as a juror or judge. The probability you should consider is the probability of the brother saying that given that the accused is guilty/not guilty.

PeroK said:

That requires multiple identical or similar experiments. The question is whether it's meaningful to calculate a probability based on a one-off scenario where the basic probabilities are unknown. How likely your wife is to buy a white dog seems to me an unknowable probability unless she's out buying dogs every day!

But this is also the basis of frequentist probabilities. You a priori need an infinitely repeatable experiment if you want to test those and you do not have access to that in a court of law. You only have access to a single sample.

Orodruin · May 12, 2022

XKCD as usual has something to say about this:

Neither frequentist nor Bayesian statistics is flawless. Both are endowed with certain problems, but they can both serve their purpose in different situations.

Dale · May 12, 2022

PeroK said:

"degree of belief" by itself is potentially meaningless, unless that belief is testable in some way.

Then here it is meaningful because you can ask who picked it out, so it is testable.

PeroK · May 12, 2022

Dale said:

Then here it is meaningful because you can ask who picked it out, so it is testable.

That gives you a binary answer. It doesn't allow you to test the prior probability.

Dale · May 12, 2022

PeroK said:

That gives you a binary answer. It doesn't allow you to test the prior probability.

I disagree. It tests everything involved in producing the prediction. The prior, the posterior, and the likelihoods. All of these together are compared against the observation.

There doesn’t have to be one completely conclusive test, and there usually isn’t. Each test can give you a small amount of uncertain information and still be a valid test.

Orodruin · May 12, 2022

The frequentist statement* is not what you would want either, so I fail to see the particular aversion to using Bayesian statistics here.

* Roughly, in the simplest version: If I reject that my wife selected the dog, then that means that if she did select the dog, then at most 5% of the dogs over an infinite number of trials would be white. This simplest version also does not take into account the propensity of the friend to pick white dogs, so there is that too. You could easily end up rejecting both "my wife selected the dog" and "my wife's friend selected the dog" as null hypotheses, or just one, or neither of them.

PeroK said:

It doesn't allow you to test the prior probability.

But it does, regardless of whether the outcome is binary or not. The related quantity is called the Bayesian evidence. A model where the wife always picks is going to be disfavoured over one where they both pick 50% of the time.
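A small sketch of that comparison, using only the observation that the dog is white and the article's likelihoods (15% for the wife, 90% for the friend): the evidence of a model is the probability it assigns to the observed data after averaging over who picked. The function name `evidence_white` is illustrative.

```python
def evidence_white(prior_wife, p_white_wife=0.15, p_white_friend=0.90):
    """Marginal probability of a white dog under a model in which the wife
    picks with probability `prior_wife` and the friend picks otherwise."""
    return prior_wife * p_white_wife + (1.0 - prior_wife) * p_white_friend


always_wife = evidence_white(1.0)  # model: the wife always picks
fifty_fifty = evidence_white(0.5)  # model: each picks half of the time

# Ratio of evidences (Bayes factor) in favour of the 50-50 model.
print(f"Bayes factor (50-50 vs always-wife) = {fifty_fifty / always_wife:.2f}")
```

Here the white dog is 0.525 / 0.15 = 3.5 times more probable under the 50-50 model, so a single binary observation already shifts support between the two priors.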

If you go to the doctor and the doctor tests you for an exceedingly rare (only one in ##10^9## gets it) and lethal disease with a test that has a true positive rate of 1 and a false positive rate of ##10^{-6}##, and your test is positive, it certainly does not mean that you have the disease.
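The numbers in this example can be checked with the same formula. The sketch below assumes that the ##10^{-6}## rate is the probability of a positive test in a healthy person (a false positive rate); the function and argument names are illustrative.

```python
def posterior_disease(prevalence, true_positive_rate, false_positive_rate):
    """P(disease | positive test) from Bayes' theorem."""
    numerator = true_positive_rate * prevalence
    evidence = numerator + false_positive_rate * (1.0 - prevalence)
    return numerator / evidence


# Prevalence of one in 10^9, perfect sensitivity, assumed false positive rate 10^-6.
p = posterior_disease(1e-9, 1.0, 1e-6)
print(f"P(disease | positive test) = {p:.6f}")
```

Even after a positive result the probability of having the disease is only about one in a thousand, because false positives among the healthy vastly outnumber true positives.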

Orodruin · May 12, 2022

Dale said:

Then here it is meaningful because you can ask who picked it out, so it is testable.

... and update your degree of belief accordingly using P(they say wife picked it | wife picked it) and P(they say wife picked it | wife didn't pick it) ...
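As a sketch of that second update, suppose the belief that the wife picked is 1/7 after seeing the white dog (the article's numbers with a 50-50 prior), and the two then say the wife picked it. The reliabilities used below, 0.95 and 0.10, are purely hypothetical values chosen for illustration, as is the function name.

```python
def update_on_testimony(prior_wife, says_wife, p_say_wife_if_wife, p_say_wife_if_not):
    """Update P(wife picked) after hearing who they say picked the dog."""
    like_wife = p_say_wife_if_wife if says_wife else 1.0 - p_say_wife_if_wife
    like_not = p_say_wife_if_not if says_wife else 1.0 - p_say_wife_if_not
    numerator = like_wife * prior_wife
    return numerator / (numerator + like_not * (1.0 - prior_wife))


prior_wife = 1.0 / 7.0  # belief after the white dog, from the article's numbers
# Hypothetical reliabilities: P(say wife | wife) = 0.95, P(say wife | not wife) = 0.10.
posterior_wife = update_on_testimony(prior_wife, True, 0.95, 0.10)
print(f"P(wife picked | white dog, they say wife) = {posterior_wife:.3f}")
```

Unless the testimony is perfectly reliable, the answer does not settle the question outright; it shifts the degree of belief by an amount set by the two conditional probabilities.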

PeroK · May 12, 2022

I can't see the sense in pretending you can calculate something when you cannot. Any first estimate from 0 to 1 is equally valid in these cases. So saying your prior is 0.5 or 0.8 or 0.1 is all just guessing, and has nothing to do with mathematical calculations per se.

Orodruin · May 12, 2022

PeroK said:

I can't see the sense in pretending you can calculate something when you cannot. Any first estimate from 0 to 1 is equally valid in these cases. So saying your prior is 0.5 or 0.8 or 0.1 is all just guessing, and has nothing to do with mathematical calculations per se.

Equally, there is then no point in pretending that the frequentist approach tells you anything more relevant. While the frequentist statement does not need an arbitrary place to start, it does not give an answer to the relevant question, i.e., what is P(wife selected | dog is white). What the frequentist statement tells you is that something that would have been unlikely if you knew your wife selected the dog happened.

PeroK · May 12, 2022

I'm not advocating a frequentist approach to this. There is simply the honest answer that we don't know. And the calculations given in the original article are arbitrary and essentially meaningless.

If Bayesian statistics can really conjure knowledge from ignorance, then I would be suspicious of it.

Dale · May 12, 2022

PeroK said:

I can't see the sense in pretending you can calculate something when you cannot. Any first estimate from 0 to 1 is equally valid in these cases. So saying your prior is 0.5 or 0.8 or 0.1 is all just guessing, and has nothing to do with mathematical calculations per se.

You are allowed to just guess on your priors. Each datapoint gives you feedback about whether or not your guess was good. And Bayes tells you how you should change your next guess.

Just because it is a guess doesn’t mean that it cannot be a number.

PeroK said:

There is simply the honest answer that we don't know.

That is not exactly honest either. The honest answer is that we are uncertain.

PeroK said:

If Bayesian statistics can really conjure knowledge from ignorance, then I would be suspicious of it.

Bayesian statistics just lets us reason under uncertainty.

FactChecker · May 12, 2022

PeroK said:

How can you possibly know these probabilities? It makes no sense to me to try to apply probability theory in a case where the basic probabilities themselves are unknowable. The answer that comes out is a meaningless number, rather than a probability of anything.

The article also falls into the trap of calculating a probability based on only some of the relevant factors while ignoring other significant ones: see the point made by @Orodruin.

I assume that it is a simplified exercise to illustrate the calculations and the principles. Nobody really needs to know the probability that the friend bought the white dog.

Thecla · May 12, 2022

Thanks Dale for the formula
