Conditional probabilities of conditioned probabilities

Discussion Overview

The discussion revolves around the interpretation and definition of conditional probabilities, particularly in the context of already-conditioned probabilities. Participants explore the notation and mathematical relationships involving expressions like P(A,B|C) and P(A|B|C), questioning their clarity and correctness.

Discussion Character

  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • One participant questions how to define P(A,B|C) and suggests that writing P(A,B|C) = P(A|B|C) P(B|C) seems nonsensical.
  • Another participant proposes that P(A,B|C) = P(A|B,C) P(B|C) might be a clearer interpretation.
  • Concerns are raised about the precedence of the operations ',' and '|' in the notation, suggesting that added parentheses could clarify the question.
  • A participant discusses the ambiguity of notation and emphasizes the need to translate it into statements about sets for clarity.
  • There is mention of different probability measures and how the notation P(A|C) indicates using a distinct measure compared to P(A ∩ C).
  • Some participants express uncertainty about the interpretation of complex notations like P(A|B|C) and highlight that such notation lacks a universally accepted meaning.

Areas of Agreement / Disagreement

Participants express differing views on the clarity and correctness of various notations for conditional probabilities. There is no consensus on a single interpretation or approach, and the discussion remains unresolved regarding the best way to express these concepts.

Contextual Notes

The discussion highlights the complexities of probability measures and the potential ambiguity in notation, particularly when multiple interpretations exist. Participants note that the meaning of expressions can vary based on the context and definitions used.

WWCY
I know that ##P(A,B) = P(A|B) \ P(B)##. But if I would like to define conditional probabilities for already-conditioned probabilities, i.e.
$$P(A,B|C)$$
how should I do it?

Writing something like ##P(A,B|C) = P(A|B|C) \ P(B|C)## seems nonsensical, and I've seen stuff that suggests ##P(A,B|C) = P(A|B,C)\ P(B|C)##, but I don't understand what is the right answer.

Assistance is greatly appreciated.
 
It's hard for me to read the problem because I don't know what the precedence of the operations ',' and '|' is. Some added parentheses might make the question clearer.
 
WWCY said:
Writing something like ##P(A,B|C) = P(A|B|C) \ P(B|C)## seems nonsensical, and I've seen stuff that suggests ##P(A,B|C) = P(A|B,C)\ P(B|C)##, but I don't understand what is the right answer.

Assistance is greatly appreciated.
basic choices:
use the second form
(i) ##P(A,B|C) = P(A|B,C)\ P(B|C)##

or re-label so that ##A^{'}## stands for A conditioned on B, and then
(ii) ## P(A|B|C) \ P(B|C)\longrightarrow P(A^{'}|C) \ P(B|C) =P(A,B|C)##
which looks less nonsensical

I'd tend to go for (i), though occasionally, if there are a ton of moving parts and you are working through conditioning one 'stage' at a time (say with induction), (ii) may be preferable

there are lots of variations on notation. What about (i) don't you like?

note for (i) the notation and idea (using indicators if needed) is the same as with conditional expectations, e.g.

##E\Big[X\big \vert Y,Z\Big]## or ##E\Big[X\big \vert Y=y,Z=z\Big]##
where you have a function defined on Y and Z. Standard function notation would be something like
##E\Big[X\big \vert Y=y,Z=z\Big]= g_X\big(y,z\big) =g\big(y,z\big)##
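
As a minimal sketch of the "function of y and z" idea, assume a hypothetical setup that is not from the thread: three fair dice with outcomes (w, y, z), all equally likely, and ##X = W + Y + Z##. The names `OMEGA`, `X` and `g` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space: outcomes (w, y, z) of three fair dice, equally likely.
OMEGA = list(product(range(1, 7), repeat=3))

def X(outcome):
    """Random variable X = W + Y + Z."""
    w, y, z = outcome
    return w + y + z

def g(y, z):
    """E[X | Y=y, Z=z]: average of X over the outcomes consistent with Y=y and Z=z."""
    matching = [o for o in OMEGA if o[1] == y and o[2] == z]
    return Fraction(sum(X(o) for o in matching), len(matching))

value = g(2, 5)  # average of w + 2 + 5 over w = 1..6
```

Here `g` is an ordinary function of two arguments, which is all that the notation ##E\Big[X\big \vert Y=y,Z=z\Big]## asks for.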
 
WWCY said:
but I don't understand what is the right answer.

A question posed using notation isn't a specific question until the meaning of the notation is defined.
With certain interpretations of notation, both the equations you ask about are correct.

If you associate the English word "given" with the symbol "|" you can interpret notation like "P(A|B|C)" as a sequence of English words. However, this doesn't guarantee that the sequence of English words has a specific mathematical interpretation. The formal theory of mathematical probability is based on assigning probabilities to sets. The difficulty you face in interpreting notation involving a "|" is how to translate that notation into a statement about sets. Notation such as "A|B" isn't specific until we can translate it into statements involving only the standard operations on sets such as "##\cap##" and "##\cup##".

It's important to understand the notation for conditional probability. To do that, begin by understanding that mathematical probability is defined on a "probability space". The probability space has a set ##\Omega## and there is a function (a "probability measure") ##\mu## that gives the probability of certain subsets of ##\Omega##. With this understanding, the notation "##P(A)##" means ##\mu(A)##.

Suppose we have a problem concerning two probability spaces that have the same ##\Omega## but have different probability measures, ##\mu_1## and ##\mu_2##. In such a case, the notation "##P(A)##" is ambiguous. It might mean ##\mu_1(A)## or it might mean ##\mu_2(A)##.

The important thing to understand about the "##|##" notation is that it is used in the above situation for the purpose of distinguishing two different probability measures. In the convention for using the "##|##" notation, we write "##P(A)##" for the probability of the set ##A## using some probability measure ##\mu_1## that is "understood" or the "default" probability measure. We write "##P(A|##<something>##)##" to indicate the probability of ##A## using a different probability measure ##\mu_2##.

If we have a probability identity that applies to all probability measures, we express it in the "P" notation with the understanding that some default probability measure ##\mu_1## is being used. So when we write ##P(A \cup B) = P(A) + P(B) - P(A \cap B)##, we are saying ##\mu_1(A\cup B) = \mu_1(A) + \mu_1(B) - \mu_1(A\cap B)##.

With the understanding that "##|##<something>" indicates using a different probability measure, any identity valid for all probability measures also applies when written with the same "##|##<something>" used in the "##P(...)##" terms. For example, ##P(A \cup B|##<something>##) = P(A|##<something>##) + P(B|##<something>##) - P(A \cap B | ##<something>##)## is an abbreviation for ##\mu_2(A \cup B) = \mu_2(A) + \mu_2(B) - \mu_2(A \cap B)## where ##\mu_2## is some probability measure distinct from the default probability measure.

The convention for interpreting notation such as ##P(A|C)## where "##C##" is a set is as follows. There is some default probability measure ##\mu_1## with ##\mu_1(C) > 0##. Define a different probability measure ##\mu_2## by ##\mu_2(X) = \mu_1(X \cap C)/ \mu_1(C)##. Then "##P(A|C)##" denotes ##\mu_2(A)##.
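
A minimal sketch of this convention, assuming a finite sample space with a uniform default measure. The fair-die example and the names `mu1` and `condition` are illustrative assumptions, not anything from the thread.

```python
from fractions import Fraction

# Hypothetical finite sample space: one roll of a fair six-sided die.
OMEGA = frozenset(range(1, 7))

def mu1(event):
    """Default probability measure: uniform on OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def condition(mu, c):
    """Return the measure mu2 defined by mu2(X) = mu(X ∩ C) / mu(C)."""
    if mu(c) == 0:
        raise ZeroDivisionError("cannot condition on an event of probability zero")
    return lambda x: mu(x & c) / mu(c)

A = frozenset({2, 4, 6})   # "roll is even"
C = frozenset({4, 5, 6})   # "roll is at least 4"

mu2 = condition(mu1, C)
p_a = mu1(A)           # P(A), using the default measure
p_a_given_c = mu2(A)   # P(A|C), using the conditioned measure
```

In this sketch `mu2` is again a function from events to numbers, so anything written for a generic measure (such as the inclusion-exclusion identity) can be applied to it unchanged.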

Instructors emphasize the distinction between ##P(A \cap C)## and ##P(A | C)##. However, this can lead to the misunderstanding that the two notations are different because they involve different sets. The two notations both involve the set ##A \cap C##. The notations indicate computing the probability of ##A \cap C## using different probability measures. "##P(A \cap C)##" denotes ##\mu_1(A \cap C)## for the default probability measure ##\mu_1##. The evaluation of "##P(A |C)##" also includes computing ##\mu_1(A \cap C)##. The significant difference is that "##P(A|C)##" indicates that a probability measure ##\mu_2## distinct from ##\mu_1## is being used to get the final answer.

To restate the above idea, we can note ##P(A| C) = P(A \cap C | C)##. This is because ##P(A \cap C | C)## is defined to be ##\mu_1( (A \cap C) \cap C)/ \mu_1(C) = \mu_1 (A \cap C)/ \mu_1(C) = P(A | C)##. Both ##P(A|C) = P(A \cap C | C)## and ##P(A \cap C)## can be regarded as computing a probability for the set ##A \cap C##. The distinction is that ##P(A \cap C)## and ##P(A|C)## assign probabilities to the set ##A \cap C## using different probability measures.
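
The point that the same set gets different probabilities under different measures can be checked on a small example. This is only a sketch on an assumed fair-die sample space; the events `A` and `C` and the function names are hypothetical.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # hypothetical fair die
A = frozenset({2, 4, 6})        # "roll is even"
C = frozenset({4, 5, 6})        # "roll is at least 4"

def mu1(event):
    """Default (uniform) measure."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def mu2(event):
    """Measure induced from mu1 by conditioning on C."""
    return mu1(event & C) / mu1(C)

joint = mu1(A & C)             # P(A ∩ C): the set A ∩ C under mu1
given = mu2(A)                 # P(A | C)
given_restricted = mu2(A & C)  # P(A ∩ C | C): the set A ∩ C under mu2
```

With these choices `given` and `given_restricted` agree, while `joint` is a different number for the same set ##A \cap C##.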

When we think of probability problems intuitively, it is tempting to imagine that events have a single probability and that other aspects of the problem (such as probabilities "given additional information") are merely accessories that don't alter "the" unique probability of the event. It's better to realize that the same event can be assigned different probabilities by different probability measures. In one way of thinking, all probabilities are conditional probabilities. "##P(A)##" denotes the probability of the set ##A## under the conditions that establish some default probability measure ##\mu_1##.

When it comes to interpreting notation like "##P(A|B|C)##", we are on our own. I've never seen a probability textbook that develops such notation in detail. (The fact that we can write notation that looks plausible doesn't guarantee it has a unique or sensible interpretation.)

As @FactChecker says, we can worry about a distinction between "##P( (A|B) |C)##" versus "##P(A | (B|C))##". However, to do that, we first have to make sense of notation like "##(A|B)##". The notation for conditional probability ##P(A|B)## doesn't define a way to interpret "##A|B##" as a set. In fact, in set theory, some people might use the notation "##A|B##" to denote the set ##\{x: x \in A \land x \notin B\}##, which is not what we want.

The technicalities of probability measures and probability spaces are complicated. Ignoring such complexities, I will make some suggestions.

WWCY said:
Writing something like ##P(A,B|C) = P(A|B|C) \ P(B|C)## seems nonsensical,

Suppose we interpret ##P(A|B|C)## to mean ##P(A |(B \cap C))##. The latter has a specific interpretation by the conventions given above. This interpretation is ##\mu_2(A)## where ##\mu_2(X) = \mu_1(X \cap B \cap C)/ \mu_1(B \cap C)## using the default probability measure ##\mu_1##.

With that interpretation we can consider whether ##P(A \cap B|C) = P(A | B | C) P(B | C)##.

The left side of the equation is defined as ##\mu_1((A \cap B) \cap C)/ \mu_1(C)##.

With my suggestion, the right side is defined as ##P(A | B \cap C) P(B | C) = ( \mu_1(A \cap B \cap C)/ \mu_1(B \cap C) )\ ( \mu_1( B \cap C) / \mu_1(C))## which reduces to the left hand side. (Of course we must assume none of this involves division by zero.)
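
The reduction above can be spot-checked numerically. The following is a sketch on an assumed sample space of two fair dice; the events `A`, `B`, `C` and the helper `condition` are hypothetical choices for illustration. It also checks that conditioning on ##C## first and then on ##B## gives the same measure as conditioning once on ##B \cap C##, which is one way to motivate reading ##P(A|B|C)## as ##P(A | (B \cap C))##.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space: ordered outcomes of two fair dice.
OMEGA = frozenset(product(range(1, 7), repeat=2))

def mu1(event):
    """Default (uniform) measure."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def condition(mu, c):
    """Measure X -> mu(X ∩ C) / mu(C); assumes mu(C) is nonzero."""
    return lambda x: mu(x & c) / mu(c)

A = frozenset(w for w in OMEGA if w[0] + w[1] >= 9)  # sum is at least 9
B = frozenset(w for w in OMEGA if w[0] >= 4)         # first die is at least 4
C = frozenset(w for w in OMEGA if w[1] % 2 == 0)     # second die is even

mu_c = condition(mu1, C)
lhs = mu_c(A & B)                          # P(A ∩ B | C)
rhs = condition(mu1, B & C)(A) * mu_c(B)   # P(A | B ∩ C) P(B | C)
iterated = condition(mu_c, B)(A)           # condition on C, then on B
```

A single example does not prove the identity, of course; the algebra with ##\mu_1## is what does that.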

WWCY said:
I've seen stuff that suggests ##P(A,B|C) = P(A|B,C)\ P(B|C)##,

Interpreting that as the claim ##P((A \cap B) | C) = P(A | (B \cap C)) P(B | C)##, you should be able to translate it into a claim about expressions that only involve the probability measure ##\mu_1##.
 