# Conditional Probability: Sample Space, Observation Space, Random Variable, etc.

In summary, conditional probability is a way of thinking about probability that considers events with respect to a particular event set.

Rasalhague
I'm wondering how conditional probability relates to concepts of sample space, observation space, random variable, etc. Using the notation introduced in the OP here, how would one define the standard notation for conditional probability "P(B|A)" where A and B are both subsets of some sample space S?

Suppose P(A) is nonzero. I'm thinking there's an implicit random variable, namely the identity function on S. So S is both sample space and observation space. And the distribution of S-as-observation-space is Q : E --> [0,1] such that Q(B&A) = P(B&A)/P(A). Then P(_|_) is a function R : E x E --> [0,1], defined by:

If P(A) = 0, then R(B,A) = 0,
otherwise R(B,A) = Q(B&A).

Does that work, and is it how conditional probability is usually formalised in this system?

EDIT: On second thoughts, the step of talking about a random variable and an observation space seems a bit superfluous, since we could just define conditional probability directly by the formula. Still, it gave me a chance to test my understanding of the concepts...
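The construction above can be sketched on a finite sample space. This is only an illustration of the proposed definition, using the two-coin-toss space as an arbitrary choice of S:

```python
from itertools import product

# Finite sample space: two coin tosses, with the uniform measure on subsets.
S = set(product("HT", repeat=2))

def P(event):
    """Probability of an event (subset of S) under the uniform measure."""
    return len(event & S) / len(S)

def R(B, A):
    """R(B, A) = P(B & A) / P(A), with R(B, A) = 0 when P(A) = 0."""
    if P(A) == 0:
        return 0.0
    return P(B & A) / P(A)

first_is_head = {s for s in S if s[0] == "H"}
both_heads = {("H", "H")}
print(R(both_heads, first_is_head))  # 0.5
```

Note that for a fixed A with P(A) > 0, the map B ↦ R(B, A) is itself a probability measure on S.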


The best way to think about conditional probability is to think about having a two-way table (think of a matrix-like table) with all values of A and B (for the moment assume that the sample space is countable).

For each pair of values of A and B, the table entry is P(A and B).

Now, using this, we see that P(A|B) = P(A and B)/P(B). Seen this way, conditional probability is simply the ratio of an event's probability taken with respect to the subspace B.

So the best way to visualize conditional probability is to think about things in terms of probability with respect to a particular set instead of the universal sample space.

The idea extends to continuous sample spaces just like you would think it should.

You can use different ways to formalize this idea, but the concept behind it is the same.
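The two-way table above can be sketched directly; the joint probabilities below are made-up values, chosen only for illustration:

```python
# Joint distribution over two binary variables, laid out as a two-way table.
# Rows: A in {0, 1}; columns: B in {0, 1}. Entries are P(A = a and B = b).
joint = {
    (0, 0): 0.30, (0, 1): 0.20,
    (1, 0): 0.10, (1, 1): 0.40,
}

def marginal_B(b):
    """P(B = b): sum the column of the table for B = b."""
    return sum(p for (_, b_), p in joint.items() if b_ == b)

def cond_A_given_B(a, b):
    """P(A = a | B = b) = P(A = a and B = b) / P(B = b)."""
    return joint[(a, b)] / marginal_B(b)

print(cond_A_given_B(1, 1))  # 0.4 / 0.6 ≈ 0.667
```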

Thanks chiro. At first sight, conditional probability seems to fit rather awkwardly into the whole probability measure, sample space, observation space formalism, at least in my blundering attempt at reconciling them. Although the notation and concepts are similar, it seems that conditional probability isn't even exactly a probability measure, or is it? Maybe there's some sample space for which it is. I wonder if there's some more natural or more standard way of defining it in that formalism. I was reading the first chapter of Ballentine's book on quantum mechanics, in which he advocates always using the conditional notation, P(A|B), to make explicit what set-up a probability depends on. He offers a set of axioms, alternative to the probability measure axioms, which seem to make conditional probability fundamental:

$$1. \enspace\enspace\enspace 0 \leq P(A|B) \leq 1;$$

$$2. \enspace\enspace\enspace P(A|A) = 1;$$

$$3a. \enspace\enspace\enspace P(\sim A|B) = 1 - P(A|B),$$

$$3b. \enspace\enspace\enspace P(A \text{ or } B|C) = P(A|C) + P(B|C) \quad (A, B \text{ mutually exclusive});$$

$$4. \enspace\enspace\enspace P(A \text{ and } B|C) = P(A|C) \cdot P(B|A \text{ and } C).$$

(3a and 3b are equivalent alternatives.)
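On a finite sample space the standard definition P(A|B) = P(A ∩ B)/P(B) does satisfy these axioms; a quick numerical check, with an arbitrary choice of space and events (Fractions keep the arithmetic exact):

```python
from fractions import Fraction

S = frozenset(range(6))  # outcomes of one die roll

def P(event):
    """Uniform probability measure, kept exact with Fraction."""
    return Fraction(len(event), len(S))

def cond(A, B):
    """P(A|B) = P(A & B) / P(B); assumes P(B) > 0."""
    return P(A & B) / P(B)

A, B, C = frozenset({1, 2}), frozenset({2, 3, 4}), frozenset({0, 1, 2, 3})

assert cond(A, A) == 1                                # axiom 2
assert cond(S - A, B) == 1 - cond(A, B)               # axiom 3a
assert cond(A & B, C) == cond(A, C) * cond(B, A & C)  # axiom 4
print("axioms 2, 3a, 4 hold for this example")
```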

One thing I'm trying to do is to relate the formalisms, to better understand them both, and to work out to what extent they are different systems, and how much of the measure theory formalism is also part of this system.

Rasalhague said:
At first sight, conditional probability seems to fit rather awkwardly into the whole probability measure, sample space, observation space formalism... I wonder if there's some more natural or more standard way of defining it in that formalism.

I think you are making it too complicated.

The best way to get a more grounded approach is to think of conditional probability like I have said above: that is, you are considering events with respect to a particular event set and not in terms of the universal sample space.

From the definition, P(A|B) = P(A and B)/P(B). To recover ordinary probability like P(A), let B be the universal sample space U: then P(B) = 1 and P(A and B) = P(A), which gives P(A|U) = P(A and U)/P(U) = P(A)/1 = P(A).
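The identity P(A|U) = P(A) is easy to check numerically; the space and event below are arbitrary choices:

```python
U = frozenset(range(10))   # universal sample space, uniform measure
A = frozenset({0, 1, 2})

def P(event):
    return len(event) / len(U)

def cond(X, Y):
    # P(X|Y) = P(X & Y)/P(Y); the 1/len(U) factors cancel in the ratio
    return len(X & Y) / len(Y)

print(cond(A, U), P(A))  # both 0.3
```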

Hopefully the above should give you the intuitive idea which you can use to understand the more rigorous formalisms.

Then with the above understanding, you take that and integrate it with the Kolmogorov Axioms.

This will provide you with ways to get the above identities.

If you use the Kolmogorov axioms and look at them in the measure theoretic sense after you get the intuitive understanding based on the set theoretic approach, I'm sure that you will be able to generate any identities you need to with a bit of manipulation.

It does seem a little complicated! Thanks for the tips. Your mention of Kolmogorov led me here:

http://en.wikipedia.org/wiki/Kolmogorov_axioms
http://en.wikipedia.org/wiki/Cox's_theorem

From these articles, it looks like there are two big axiomatic approaches: one attributed to Kolmogorov (related directly to measure theory), the other stemming from Cox (whom Ballentine refers to; is that what you're calling the "set theoretic approach"?).

Rasalhague said:
From these articles, it looks like there are two big axiomatic approaches: one attributed to Kolmogorov (related directly to measure theory), the other stemming from Cox (whom Ballentine refers to; is that what you're calling the "set theoretic approach"?).

The Kolmogorov approach is set theoretical. I'll give you an easy way to translate the "set theoretic" approach to the "measure" approach intuitively.

If you want to think of visualizing the "set theoretic" approach in terms of "measure" based approach, think of projecting a set to an interval of the real number line. The whole universe is the interval [0,1] and the intersection of sets is just like you would expect: the collection of points on the interval that are in common between both sets.

There are all kinds of ways to make this "intuitive". In terms of a simple probability statement, the strong law of large numbers is a good one: the empirical approach is basically that you run many trials, record whether the event occurs each time, and take the long-run average. The idea is that this average converges to the true theoretical average if the number of trials is large enough (under conditions like independence).
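That long-run convergence is easy to simulate; the event (rolling a 6 on a fair die) and the trial counts are arbitrary choices:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def relative_frequency(trials):
    """Fraction of rolls of a fair die that come up 6."""
    hits = sum(1 for _ in range(trials) if random.randint(1, 6) == 6)
    return hits / trials

for n in (100, 10_000, 1_000_000):
    print(n, relative_frequency(n))  # tends to 1/6 ≈ 0.1667 as n grows
```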

With conditional probability, you use the same idea, but it's with respect to a particular subset rather than the universal set.

Also, something I should point out: sometimes probability is not completely intuitive. A good example is the Monty Hall problem. Things like Bayes' theorem are easier if you take a set theoretic approach: if it helps, draw some Venn diagrams to see the correspondence between the symbols and the intuition.
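The Monty Hall problem is easy to simulate; switching wins about 2/3 of the time, as a short Monte Carlo run shows (door numbering and trial count are arbitrary):

```python
import random

random.seed(1)

def switch_win_rate(trials):
    """Fraction of Monty Hall games won by always switching doors."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)    # door hiding the car
        pick = random.randrange(3)   # contestant's initial pick
        # The host opens a goat door other than the pick, so switching
        # wins exactly when the initial pick was wrong.
        if pick != car:
            wins += 1
    return wins / trials

print(switch_win_rate(100_000))  # ≈ 2/3, not 1/2
```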

I don't know if I have helped you, but hopefully I have helped you see the connection between the formal and the intuitive approaches.

chiro said:
The Kolmogorov approach is set theoretical. I'll give you an easy way to translate the "set theoretic" approach to the "measure" approach intuitively...

Kolmogorov's axioms look exactly like the definition of a probability measure, i.e. a measure with range [0,1] and P(S) = 1, where S is the sample space. So perhaps "set theoretic" isn't a very distinctive name for Kolmogorov's approach, at least not now that every branch of mathematics - including other formulations of probability - seems either to make use of set theory or to have the potential to be formalised in terms of it. Wikipedia treats Kolmogorov's formalism and that of measure theory as synonymous here.

## 1. What is a sample space in conditional probability?

A sample space in conditional probability is the set of all possible outcomes of an experiment or event. It is denoted by S and can include any number of elements depending on the specific situation. For example, if a coin is tossed twice, the sample space would be {HH, HT, TH, TT} where H represents heads and T represents tails.

## 2. What is an observation space in conditional probability?

An observation space in conditional probability is the set of values that a random variable can take — that is, the set of possible observed or measured outcomes of an experiment. It is the codomain of the random variable, not a subset of the sample space. For example, if the experiment is tossing a coin twice and observing the number of heads, the observation space is {0, 1, 2}, where 0 represents no heads, 1 represents one head, and 2 represents two heads.

## 3. What is a random variable in conditional probability?

A random variable in conditional probability is a variable that takes on different values based on the outcome of a random experiment or event. It is often denoted by X and can be discrete or continuous. For example, in the coin toss experiment, the number of heads observed can be considered a random variable.
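The coin-toss random variable above — the number of heads in two tosses — can be written out explicitly, along with the distribution it induces on the observation space:

```python
from itertools import product

S = list(product("HT", repeat=2))   # sample space: HH, HT, TH, TT

def X(outcome):
    """Random variable: number of heads in a two-toss outcome."""
    return outcome.count("H")

# Distribution induced by X on the observation space {0, 1, 2}.
dist = {x: sum(1 for s in S if X(s) == x) / len(S) for x in (0, 1, 2)}
print(dist)  # {0: 0.25, 1: 0.5, 2: 0.25}
```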

## 4. How is conditional probability calculated?

Conditional probability is calculated by dividing the probability of the intersection of two events by the probability of the conditioning event. It can be represented mathematically as P(A|B) = P(A∩B) / P(B), where A and B are events and P(B) > 0. In other words, conditional probability is the likelihood of event A occurring given that event B has already occurred.
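Applying the formula to the two-toss example above, with A = "both tosses heads" and B = "first toss heads":

```python
from itertools import product

S = set(product("HT", repeat=2))         # two tosses, uniform measure
A = {("H", "H")}                         # both tosses heads
B = {s for s in S if s[0] == "H"}        # first toss heads

def P(event):
    return len(event) / len(S)

p_A_given_B = P(A & B) / P(B)            # P(A|B) = P(A ∩ B) / P(B)
print(p_A_given_B)  # 0.5
```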

## 5. What is the difference between joint probability and conditional probability?

Joint probability is the probability of two events occurring together, while conditional probability is the probability of one event occurring given that another event has already occurred. Joint probability can be calculated by multiplying the individual probabilities of each event only when the events are independent; in general, P(A∩B) = P(B)·P(A|B), using the formula from the previous answer.
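The multiplication rule P(A∩B) = P(A)·P(B) holds only for independent events; in general one needs P(A∩B) = P(B)·P(A|B). The two-toss space illustrates both cases:

```python
from itertools import product

S = set(product("HT", repeat=2))
A = {s for s in S if s[0] == "H"}    # first toss heads
B = {s for s in S if s[1] == "H"}    # second toss heads (independent of A)
C = {("H", "H")}                     # both heads (not independent of A)

def P(event):
    return len(event) / len(S)

print(P(A & B), P(A) * P(B))   # equal: 0.25 0.25   (independent events)
print(P(A & C), P(A) * P(C))   # unequal: 0.25 0.125 (dependent events)
```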
