# Conditional probability: How it relates to observation spaces, random variables, etc.

1. Jul 12, 2011

### Rasalhague

I'm wondering how conditional probability relates to concepts of sample space, observation space, random variable, etc. Using the notation introduced in the OP here, how would one define the standard notation for conditional probability "P(B|A)" where A and B are both subsets of some sample space S?

Suppose P(A) is nonzero. I'm thinking there's an implicit random variable, namely the identity function on S. So S is both sample space and observation space. And the distribution of S-as-observation-space is Q : E --> [0,1] such that Q(B&A) = P(B&A)/P(A). Then P(_|_) is a function R : E x E --> [0,1], defined by:

If P(A) = 0, then R(B,A) = 0,
otherwise R(B,A) = Q(B&A).

Does that work, and is it how conditional probability is usually formalised in this system?

EDIT: On second thoughts, the step of talking about a random variable and an observation space seems a bit superfluous, since we could just define conditional probability directly by the formula. Still, it gave me a chance to test my understanding of the concepts...
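To test the definition on a concrete case, here's a quick finite-sample-space sketch (the function names and the die example are my own, just for illustration):

```python
# Sample space S for one roll of a fair die, with P the uniform pmf on S.
S = {1, 2, 3, 4, 5, 6}
P = {s: 1/6 for s in S}

def prob(event):
    """P(E) for an event E, a subset of S."""
    return sum(P[s] for s in event)

def R(B, A):
    """The conditional probability P(B|A), with the P(A) = 0 convention above."""
    pA = prob(A)
    if pA == 0:
        return 0.0
    return prob(B & A) / pA

A = {2, 4, 6}            # "the roll is even"
B = {4, 5, 6}            # "the roll is at least 4"
print(R(B, A))           # 2/3: of the three even outcomes, 4 and 6 qualify
```

For a fixed A with P(A) > 0, R(·, A) sums to 1 over the outcomes, so in its first argument it does behave like a probability measure.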

Last edited: Jul 12, 2011
2. Jul 13, 2011

### chiro

Re: Conditional probability: How it relates to observation spaces, random variables,

The best way to think about conditional probability is to think about a two-way table (think of a matrix-like table) with all values of A and B (for the moment, assume that the sample space is countable).

Each cell of the table holds P(A and B) for the corresponding pair of values of A and B.

Now, using this, we see that P(A|B) = P(A and B)/P(B): it is simply the ratio of an event's probability taken with respect to the subspace B.

So the best way to visualize conditional probability is to think about things in terms of probability with respect to a particular set instead of the universal sample space.

The idea extends to continuous sample spaces just like you would think it should.

You can use different ways to formalize this idea, but the concept behind it is the same.
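For instance (with made-up numbers of my own), the two-way table for two binary events might look like this in code:

```python
# Joint ("two-way") table for two binary events A and B.
# joint[a][b] = P(A = a and B = b); the numbers are illustrative only.
joint = {
    True:  {True: 0.10, False: 0.30},
    False: {True: 0.15, False: 0.45},
}

p_B = sum(joint[a][True] for a in joint)   # marginal: P(B) = 0.10 + 0.15 = 0.25
p_A_and_B = joint[True][True]              # P(A and B) = 0.10
p_A_given_B = p_A_and_B / p_B              # P(A|B) = 0.10 / 0.25 = 0.4
print(p_A_given_B)
```

So conditioning on B just means renormalising the B-column of the table so that it sums to 1.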

3. Jul 13, 2011

### Rasalhague

Thanks chiro. At first sight, conditional probability seems to fit rather awkwardly into the probability measure / sample space / observation space formalism, at least in my blundering attempt at reconciling them. Although the notation and concepts are similar, it seems that conditional probability isn't exactly a probability measure, or is it? Maybe there's some sample space for which it is. I wonder if there's a more natural or more standard way of defining it in that formalism. I was reading the first chapter of Ballentine's book on quantum mechanics, in which he advocates always using the conditional notation, P(A|B), to make explicit what set-up a probability depends on. He offers a set of axioms, alternative to the probability measure axioms, which seem to make conditional probability fundamental:

$$1. \enspace\enspace\enspace 0 \leq P(A|B) \leq 1;$$

$$2. \enspace\enspace\enspace P(A|A) = 1;$$

$$3a. \enspace\enspace\enspace P(\sim A|B) = 1 - P(A|B),$$

$$3b. \enspace\enspace\enspace P(A \text{ or } B|C) = P(A|C) + P(B|C);$$

$$4. \enspace\enspace\enspace P(A \text{ and } B|C) = P(A|C) \cdot P(B|A \text{ and } C).$$

(3a and 3b are equivalent alternatives.)
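Incidentally, if I take C to be the whole sample space S in axiom 4 and abbreviate P(X|S) as P(X) (my reading, so correct me if this is off), the axiom reduces to the familiar product rule:

$$P(A \text{ and } B) = P(A) \cdot P(B|A),$$

which, when P(A) is nonzero, rearranges to the usual ratio definition P(B|A) = P(A and B)/P(A). So the two formalisms at least agree there.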

One thing I'm trying to do is to relate the formalisms, to better understand them both, and to work out to what extent they are different systems, and how much of the measure theory formalism is also part of this system.

4. Jul 13, 2011

### chiro

I think you are making it too complicated.

The best way to get a more grounded approach is to think of conditional probability like I have said above: that is you are considering events with respect to a particular event set and not in terms of the universal sample space.

From the definition, P(A|B) = P(A and B)/P(B). To recover an ordinary probability like P(A), let B be the universal sample space U; then P(B) = 1 and P(A and B) = P(A), which gives P(A|U) = P(A and U)/P(U) = P(A)/1 = P(A).

Hopefully the above should give you the intuitive idea which you can use to understand the more rigorous formalisms.

Then with the above understanding, you take that and integrate it with the Kolmogorov Axioms.

This will provide you with ways to get the above identities.

If you use the Kolmogorov axioms and look at them in the measure theoretic sense after you get the intuitive understanding based on the set theoretic approach, I'm sure that you will be able to generate any identities you need to with a bit of manipulation.
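As a quick sanity check — my own finite-sample-space sketch, nothing rigorous — the ratio definition does reproduce identities like the ones above:

```python
import random

random.seed(0)
S = set(range(10))                                  # a small finite sample space
weights = {s: random.random() + 0.01 for s in S}    # strictly positive weights
total = sum(weights.values())
P = {s: w / total for s, w in weights.items()}      # a probability mass function

def prob(E):
    """P(E) for an event E, a subset of S."""
    return sum(P[s] for s in E)

def cond(A, B):
    """P(A|B) by the ratio definition (assumes P(B) > 0)."""
    return prob(A & B) / prob(B)

A = {0, 1, 2, 3}
B = {2, 3, 4, 5, 6}

assert abs(cond(A, S) - prob(A)) < 1e-12                 # P(A|U) = P(A)
assert abs(cond(S - A, B) - (1 - cond(A, B))) < 1e-12    # complement rule
assert abs(prob(A & B) - prob(A) * cond(B, A)) < 1e-12   # product rule
print("identities hold")
```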

5. Jul 15, 2011

### Rasalhague

It does seem a little complicated! Thanks for the tips. Your mention of Kolmogorov led me here:

http://en.wikipedia.org/wiki/Kolmogorov_axioms
http://en.wikipedia.org/wiki/Cox's_theorem

From these articles, it looks like there are two big axiomatic approaches: one attributed to Kolmogorov (related directly to measure theory), the other stemming from Cox (whom Ballentine refers to; is that what you're calling the "set theoretic approach"?).

6. Jul 15, 2011

### chiro

The Kolmogorov approach is set theoretical. I'll give you an easy way to translate the "set theoretic" approach to the "measure" approach intuitively.

If you want to visualize the "set theoretic" approach in terms of the "measure" based approach, think of projecting a set onto an interval of the real number line. The whole universe is the interval [0,1], and the intersection of sets is just what you would expect: the collection of points on the interval that the two sets have in common.

There are all kinds of ways to make this "intuitive". In terms of a simple probability statement, the strong law of large numbers is a good way: empirically, you run a lot of trials, record each time whether the event occurred, and take the long-run average; the idea is that this average converges to the true theoretical probability if the number of trials is large enough (under conditions like independence).

With conditional probability, you use the same idea, but its with respect to a particular subset and not the universal set.
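For example, here's a quick simulation (my own sketch, using a fair die) of that long-run average for a conditional probability:

```python
# Estimate P(roll >= 4 | roll is even) for a fair die by long-run averaging,
# conditioning by keeping only the trials where the "even" event occurred.
import random

random.seed(1)
N = 100_000
n_B = 0            # trials where B ("even") occurred
n_A_and_B = 0      # trials where both A ("roll >= 4") and B occurred

for _ in range(N):
    roll = random.randint(1, 6)
    if roll % 2 == 0:
        n_B += 1
        if roll >= 4:
            n_A_and_B += 1

estimate = n_A_and_B / n_B   # empirical P(A|B); the exact value is 2/3
print(estimate)
```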

Also, something I should point out: sometimes probability is not completely intuitive. A good example is the Monty Hall problem. Things like Bayes' theorem are easier if you take a set theoretic approach: if it helps, draw some Venn diagrams to see the analogy between the symbols and the intuition.

I don't know if I have helped you, but hopefully I have helped you see the connection between the formal and the intuitive approaches.

7. Jul 15, 2011

### Rasalhague

Kolmogorov's axioms look exactly like the definition of a probability measure, i.e. a measure with range [0,1] and P(S) = 1, where S is the sample space. So perhaps "set theoretic" isn't a very distinctive name for Kolmogorov's approach, at least not now that every branch of mathematics - including other formulations of probability - seems either to make use of set theory, or to have the potential to be formalised in terms of set theory. Wikipedia treats Kolmogorov's formalism and that of measure theory as synonymous here.