Weird probability problem..."With what probability is it raining?"

In summary, the conversation is about a probability problem: determining whether it's raining in a distant city. Three friends there can be called and asked; each friend independently tells the truth with probability 2/3 and lies with probability 1/3. If all three say it's raining, the naive answer of 26/27 (at least one friend telling the truth) is wrong, and so is 8/27 (all three telling the truth), since neither conditions on the answers actually received. Using Bayes' theorem with a prior probability of rain [itex]p_r[/itex], the answer is [itex]8p_r/(1+7p_r)[/itex], which equals 8/9 for a 50/50 prior; the thread goes on to debate how to proceed when no prior information is available.
  • #1
AxiomOfChoice
Weird probability problem..."With what probability is it raining?"

Came across this the other day looking at interview questions.

Suppose you want to determine whether it's raining in a distant city. You have three friends there who you can call and ask about this. Only thing is:
  • Each friend will tell the truth with probability 2/3.
  • Each friend will lie with probability 1/3.
  • The event that Friend [itex]i[/itex] lies is independent of the event that Friend [itex]j[/itex] lies for [itex]1 \leq i < j \leq 3[/itex].
So, if you call all your friends, and they all tell you it's raining...what's the probability it's actually raining?

Here's a naive answer: The probability in question is just the probability that at least one of them is telling the truth, which by independence is [itex]P(t_1 \cup t_2 \cup t_3) = 1 - P(\ell_1 \cap \ell_2 \cap \ell_3) = 1 - P(\ell_1)P(\ell_2)P(\ell_3) = 1 - (1/3)^3 = 26/27[/itex]. But there is a conceivable objection to this: You don't need at least one of them to be telling the truth; you need them ALL to be telling the truth. Because the probability just computed includes, for instance, the event that Friend 1 is telling the truth (it's raining), but Friends 2 and 3 are lying (it's not raining), which is incoherent. So, in a sense, the sample space used in the calculation above is too big! And what you should compute instead is [itex]P(t_1 \cap t_2 \cap t_3) = P(t_1)P(t_2)P(t_3) = (2/3)^3 = 8/27[/itex]. But...that doesn't quite make sense...where's the remaining [itex]1 - 1/27 - 8/27 = 2/3[/itex] of our probability measure living?
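A quick way to see where that remaining 2/3 lives is to enumerate the eight truth/lie patterns directly. The sketch below (hypothetical Python, not from the thread) conditions on it actually raining: 8/27 of the probability mass has all three friends saying "yes", 1/27 has all three saying "no", and the remaining 18/27 = 2/3 sits on mixed answers, which are exactly the outcomes that get ruled out once you learn that everyone said "yes".

[code]
from itertools import product
from fractions import Fraction

P_TRUTH = Fraction(2, 3)   # each friend tells the truth with probability 2/3
P_LIE = Fraction(1, 3)     # and lies with probability 1/3

# Condition on it actually raining: a truthful friend answers "yes", a liar answers "no".
mass = {"all say yes": Fraction(0), "all say no": Fraction(0), "mixed answers": Fraction(0)}
for pattern in product([True, False], repeat=3):      # True = this friend tells the truth
    prob = Fraction(1)
    for truthful in pattern:
        prob *= P_TRUTH if truthful else P_LIE
    if all(pattern):
        mass["all say yes"] += prob        # all truthful -> three "yes" answers
    elif not any(pattern):
        mass["all say no"] += prob         # all lying -> three "no" answers
    else:
        mass["mixed answers"] += prob      # the incoherent-looking mixed outcomes

for label, p in mass.items():
    print(label, p)    # all say yes 8/27, all say no 1/27, mixed answers 2/3
[/code]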
 
  • #2
AxiomOfChoice said:
Here's a naive answer: The probability in question is just the probability that at least one of them is telling the truth, which by independence is [itex]P(t_1 \cup t_2 \cup t_3) = 1 - P(\ell_1 \cap \ell_2 \cap \ell_3) = 1 - P(\ell_1)P(\ell_2)P(\ell_3) = 1 - (1/3)^3 = 26/27[/itex].
That naive answer is incorrect.

But there is a conceivable objection to this: You don't need at least one of them to be telling the truth; you need them ALL to be telling the truth. Because the probability just computed includes, for instance, the event that Friend 1 is telling the truth (it's raining), but Friends 2 and 3 are lying (it's not raining), which is incoherent. So, in a sense, the sample space used in the calculation above is too big! And what you should compute instead is [itex]P(t_1 \cap t_2 \cap t_3) = P(t_1)P(t_2)P(t_3) = (2/3)^3 = 8/27[/itex]. But...that doesn't quite make sense...where's the remaining [itex]1 - 1/27 - 8/27 = 2/3[/itex] of our probability measure living?
Obviously 8/27 is also incorrect.

The problem is that the 8/27 and 1/27 probabilities are the probability that one will obtain three "yes, it's raining" answers given that it is or is not raining. Well you obtained three yes answers. That's now a given. The prior probability of getting 3 yes answers was 1/3=9/27 (8/27+1/27). You want to know whether it's raining given this 3 yes answers. Using P(A|B)= P(A ∩ B)/P(B), the probability that it is raining given three yes answers is (8/27)/(9/27) = 8/9.
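Spelled out, the computation is:

[tex]P(\mathrm{rain} \mid 3\ \mathrm{yes}) = \frac{P(\mathrm{rain} \cap 3\ \mathrm{yes})}{P(3\ \mathrm{yes})} = \frac{8/27}{8/27 + 1/27} = \frac{8}{9}.[/tex]

(As the later posts discuss, treating 8/27 and 1/27 as the joint probabilities here amounts to weighting rain and no rain equally a priori.)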
 
  • #3
I like Bayes' Theorem for this:[tex]p(3\mathrm{yes}|\mathrm{rain})p(\mathrm{rain})=p(\mathrm{rain}|3\mathrm{yes})p(3\mathrm{yes})[/tex]

The probability of three "yeses" given that it's actually raining is 8/27, as DH calculated.

The probability that it's raining is your estimate of how likely it is to be raining at any given moment. That's likely to be a higher number if your friends live in London than if they live in Karachi. Let's call it [itex]p_r[/itex].

The probability of three "yeses" is the total probability:[tex]p(3\mathrm{yes}) = p(3\mathrm{yes} \mid \mathrm{rain})\, p(\mathrm{rain}) + p(3\mathrm{yes} \mid \mathrm{no\ rain})\, p(\mathrm{no\ rain}) = \frac{8p_r + (1-p_r)}{27} = \frac{1+7p_r}{27}[/tex]

You can substitute all that back into my first equation to get the remaining term, the probability that it's raining given that you got three "yeses". It's just:[tex]p(\mathrm{rain}|3\mathrm{yes})=\frac{8p_r}{1+7p_r}[/tex]

Note that my beliefs about how wet it typically is there affect my confidence in the honesty of the answers. Note also that if I believe that it rains 50% of the time, then I agree with DH.
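As a sanity check on that closed form, here is a small simulation sketch (hypothetical Python, not part of the thread): draw rain with prior [itex]p_r[/itex], have each friend report truthfully with probability 2/3, and estimate P(rain) among the trials where all three said "yes".

[code]
import random

def posterior_given_three_yes(p_rain, trials=200_000, seed=1):
    """Estimate P(rain | all three friends say 'yes') by simulation."""
    rng = random.Random(seed)
    three_yes = 0
    rain_and_three_yes = 0
    for _ in range(trials):
        raining = rng.random() < p_rain
        # A truthful friend (prob 2/3) reports the real weather; a liar reports its opposite.
        answers = [raining if rng.random() < 2 / 3 else not raining for _ in range(3)]
        if all(answers):                     # every friend said "it's raining"
            three_yes += 1
            rain_and_three_yes += raining
    return rain_and_three_yes / three_yes

for p in (0.1, 0.5, 0.9):
    # simulated posterior vs. the closed form 8*p_r / (1 + 7*p_r)
    print(p, round(posterior_given_three_yes(p), 3), round(8 * p / (1 + 7 * p), 3))
[/code]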
 
Last edited by a moderator:
  • #4
Aside:

Ibix, I fixed your LaTeX equations. In the future, please use some spaces in your LaTeX input; it makes the source more readable (as it does when you write "real" LaTeX), and spaces are just human-readable noise when LaTeX/TeX is in math mode, so they don't change the output.

More importantly for this forum, the software that runs this forum doesn't like big honkin' words. You had a stream of 50+ characters, and that looked like a big honkin' word to the underlying software. Since the underlying software doesn't like big honkin' words, it inserts spaces, and it inevitably does so where it doesn't make sense to LaTeX.
 
  • #5
Ibix said:
I like Bayes' Theorem for this:[tex]p(3\mathrm{yes}|\mathrm{rain})p(\mathrm{rain})=p(\mathrm{rain}|3\mathrm{yes})p(3\mathrm{yes})[/tex]
There's one big problem with Bayes' law for this: What if you don't have a clue regarding the prior probability? Bayes' law as is has a bit of a problem if there is no prior.

There is a very nice way to rewrite Bayes' law to account for this "I haven't the foggiest" prior probability. It's called an information filter. In a nutshell, an information filter is like a Kalman filter, except that an information filter uses an information matrix rather than the covariance matrix. "I don't know nuffin' about X" has a nice representation as an information matrix: it's the zero matrix.

If you take this formulation, then the first friend who says "yes" yields a 2/3 probability that it is raining. You take this first friend at face value (where face value includes the fact that this friend might be lying) because the prior information matrix is the zero matrix. The second friend who says "yes" raises the probability to 4/5, and the third friend who says "yes" raises the probability to 8/9.

Alternatively, you can use an information filter formalism to bootstrap the process, and then use Bayes' law proper after the first friend says "yes".
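The per-call arithmetic here is just a one-dimensional Bayes step applied repeatedly. A minimal sketch (hypothetical Python, not from the thread, assuming the process is bootstrapped by taking the first "yes" at face value, as described above):

[code]
from fractions import Fraction

def update_on_yes(p_rain):
    """One Bayes step: revise P(rain) after a friend says 'yes' (truthful w.p. 2/3)."""
    truth, lie = Fraction(2, 3), Fraction(1, 3)
    return truth * p_rain / (truth * p_rain + lie * (1 - p_rain))

# Bootstrap: take the first "yes" at face value, then Bayes-update on the next two calls.
p = Fraction(2, 3)
print("after call 1:", p)             # 2/3
for call in (2, 3):
    p = update_on_yes(p)
    print(f"after call {call}:", p)   # 4/5, then 8/9
[/code]

Starting instead from the indifference prior P0(R) = 1/2 and updating on all three calls gives the same sequence 2/3, 4/5, 8/9, which is the equivalence discussed in the next two posts.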
 
  • #6
D H said:
Since the underlying software doesn't like big honkin' words, it inserts spaces, and it inevitably does so where it doesn't make sense to LaTeX.
Ah! I thought it was just MathJax having a bad day on my phone. Thanks for the fix and the explanation.

The information filter was interesting, too. Is the fact that it implies a 50/50 a priori probability (8/9 being what you get with [itex]p_r = 0.5[/itex]) an artifact of the problem or of the filter?
 
  • #7
The complete lack of prior knowledge (which only makes sense in an information filter formulation) is *sometimes* equivalent to the principle of indifference. In this case it happens to be. I'm not a huge fan of the principle of indifference; it can get you in big trouble. An information filter formalism (to me) gives a much better mechanism for expressing complete lack of prior knowledge.
 
  • #8
D H said:
The problem is that the 8/27 and 1/27 probabilities are the probability that one will obtain three "yes, it's raining" answers given that it is or is not raining. Well you obtained three yes answers. That's now a given. The prior probability of getting 3 yes answers was 1/3=9/27 (8/27+1/27)

There is a problem here: these are still the conditional probabilities P(3y|R) = 8/27 and P(3y|~R) = 1/27, and so P(3y) ≠ P(3y|R) + P(3y|~R); rather, P(3y) = P(3y|R)P(R) + P(3y|~R)P(~R). One cannot avoid the fact that there is incomplete information; Ibix is correct that, in the Bayesian approach, an assumption has to be made for P(R).

Now one uses the maximum entropy principle in choosing P(R). If one knows nothing about P(R), one should choose P(R) = 1/2 to maximize the entropy of the probability model on {R, ~R}, entropy being a measure of uncertainty. The probability distribution chosen should reflect the amount of uncertainty in your knowledge of the situation.
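Spelled out for the two-outcome case, the maximum-entropy choice is the [itex]p[/itex] that maximizes the binary entropy:

[tex]H(p) = -p\ln p - (1-p)\ln(1-p), \qquad \frac{dH}{dp} = \ln\frac{1-p}{p} = 0 \;\Longrightarrow\; p = \tfrac{1}{2}.[/tex]

Since [itex]H''(p) < 0[/itex] on (0, 1), the critical point [itex]p = 1/2[/itex] is indeed the maximum.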
 
Last edited:
  • #9
BTP said:
There is a problem here: these are still the conditional probabilities P(3y|R) = 8/27 and P(3y|~R) = 1/27, and so P(3y) ≠ P(3y|R) + P(3y|~R); rather, P(3y) = P(3y|R)P(R) + P(3y|~R)P(~R). One cannot avoid the fact that there is incomplete information; Ibix is correct that, in the Bayesian approach, an assumption has to be made for P(R).
Not necessarily. You are assuming those are conditional probabilities. Look at them instead as marginal probabilities and there is no need for a prior for P(R).

Now one uses the maximum entropy principle in choosing P(R). If one knows nothing about P(R), one should choose P(R) = 1/2 to maximize the entropy of the probability model on {R, ~R}, entropy being a measure of uncertainty. The probability distribution chosen should reflect the amount of uncertainty in your knowledge of the situation.
You do not need a prior if you use an information filter. It provides an explicit mechanism for saying "I have no prior" (or, if you wish, "my prior is complete garbage"). It doesn't matter what you use for a prior if the information matrix is the zero matrix.
 
  • #10
D H said:
Not necessarily. You are assuming those are conditional probabilities. Look at them instead as marginal probabilities and there is no need for a prior for P(R).


You do not need a prior if you use an information filter. It provides an explicit mechanism for saying "I have no prior" (or, if you wish, "my prior is complete garbage"). It doesn't matter what you use for a prior if the information matrix is the zero matrix.

Some clarifications:

Can it be viewed as a marginal distribution?

Let W = {R, ~R} (the weather), let X = {T, F}^3 (the call answers), and let P(X, W) be the joint distribution.

You cannot compute P(T^3, R) or P(T^3, ~R) because you know nothing about the joint distribution, even under the assumption of independence. You can, under the assumption of independence, compute P(T^3) = P(T^3, W) = 8/9 = P(T^3)P(W) and P(F^3) = P(F^3, W) = 1/9 = P(F^3)P(W), and so treat these as marginal probabilities.

But the key is that P(R|T^3) = P(R and T^3)/P(T^3) cannot be computed, because P(R and T^3) cannot be computed.

------------

Now for your filter approach:

But your approach is just another Bayesian approach, updating the probability that it is raining as information comes in through each call. And it still requires an initial assumption about P(R). Your zero-information filter, by taking the first person at face value (which has a certain moral appeal, to be sure), effectively assumes P0(R) = 1 as a prior for the first answer, and so you get P1(R) = P(R|Y1) = 2/3 at the end of the first call. Meanwhile the maximum-entropy proposal, by assigning equal probability to R and ~R, so P0(R) = 1/2, returns the same first step P1(R) = P(R|Y1) = 2/3.

At the next step P2(R) = P(R|Y2, Y1) = P(Y2|R)P1(R)/P(Y2), with the last update of course being
P3(R) = P(R|Y3, Y2, Y1) = ...

Now if there were more than two options (rain, sun, cloudy, etc.), the two different initial prior assignments would return different answers for a small sample, but they would converge in the limit of asking an infinite number of people with the same truth/lie distribution. Comparing this with treating the data in a single step with an assumed prior distribution, I believe the filter approach is more accurate given more calls, for obvious reasons.

But the point is that you still had to make some assumption about the missing information, in this case at the start of the process. It is this ambiguity in the presence of uncertain information that leads many people to cringe at Bayesian approaches.
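To illustrate the convergence point above: with every call answering "yes", the batch Bayes posterior heads toward 1 whatever prior you start from, so the choice of prior matters mainly for small samples. A rough sketch (hypothetical Python, not from the thread):

[code]
def posterior_after_n_yes(p0, n):
    """Batch Bayes posterior P(rain | n 'yes' answers); each friend truthful w.p. 2/3."""
    like_rain, like_dry = (2 / 3) ** n, (1 / 3) ** n
    return like_rain * p0 / (like_rain * p0 + like_dry * (1 - p0))

for n in (1, 3, 10):
    # indifference prior 0.5 vs. a skeptical prior 0.05
    print(n, round(posterior_after_n_yes(0.5, n), 3), round(posterior_after_n_yes(0.05, n), 3))
# Both columns head toward 1 as n grows; the priors differ noticeably only for small n.
[/code]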
 
Last edited:

