# Multiplying the probabilities of two disjoint sets.

1. Aug 10, 2013

### cdux

I've been struggling with this basic thing for a few minutes and I want to make sure I've got it right:

given A, B being disjoint,

We know that P(A and B) = 0

However, if they are independent then P(A and B) = P(A) x P(B)

Then if P(A) is non-zero and P(B) is non-zero, how could P(A and B) be zero?

My explanation is that the mistake in that reasoning is that P(A and B) is immediately zero when the events are disjoint, so the product P(A) x P(B) never gets tested.

So in that case P(A) x P(B) is completely meaningless and there is no reason to calculate it at all: "Yeah, the product produces a number, but it's completely useless. If you started with P(A and B) you would immediately conclude it's zero without ever reaching that calculation."

Is that assessment correct?

Then again I wonder if that product has any meaning at all that could be useful..

Last edited: Aug 10, 2013
2. Aug 10, 2013

### cdux

I'm now almost certain the product has no use at all. E.g., what could 1/2 x 1/2 = 1/4 mean in a coin toss? A quarter in a coin toss? Why? What could possibly use that?

Hrm.. Maybe 2 coins giving an Ω of {(0,0), (1,0), (0,1), (1,1)}. Then P(A) being 1/2 of one coin and P(B) being 1/2 on another coin, so...

No it still doesn't make sense.

No wait.

So.. 1/4th being one of the choices of the new Ω.

Yeah but.. how could that be derived by the product of P(A) and P(B)? P(A) being defined in the Ω1 of {0a, 1a} and P(B) in the Ω2 of {0b, 1b}? I still don't see it.

Whatever.
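Actually, the two-coin idea does work out: A ("first coin shows 1") and B ("second coin shows 1") are events in the joint space Ω = Ω1 × Ω2, and there the product rule holds. A minimal check (Python, fair coins and uniform measure assumed):

```python
from fractions import Fraction
from itertools import product

# Joint sample space for two fair coins: omega = {(0,0), (0,1), (1,0), (1,1)}
omega = list(product([0, 1], repeat=2))

def prob(event):
    """Probability of an event (a subset of omega) under the uniform measure."""
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == 1}  # first coin shows 1
B = {w for w in omega if w[1] == 1}  # second coin shows 1

# P(A) = P(B) = 1/2 in the joint space, and the events are independent:
assert prob(A) == Fraction(1, 2)
assert prob(B) == Fraction(1, 2)
assert prob(A & B) == prob(A) * prob(B) == Fraction(1, 4)
```

So 1/4 is exactly the probability of the single outcome (1,1) in the joint space, which is why the product of the two marginal probabilities is meaningful here.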

3. Aug 10, 2013

### mathman

If A and B are disjoint, they cannot be independent and vice versa, as long as both have non-zero probabilities.

4. Aug 10, 2013

### tiny-tim

hi cdux!

if A and B are disjoint, then if A happens, B can't happen …

so they can't be independent!!

(eg if A is a 1 and B is a 6, then if you have A, you know you don't have B)

Last edited: Aug 10, 2013
5. Aug 10, 2013

### cdux

Well then it would be P(A|B) = P(A and B) / P(B) and it directly goes again to A and B being the impossible event and the whole being 0.

6. Aug 11, 2013

"Well then it would be P(A|B) = P(A and B) / P(B) and it directly goes again to A and B being the impossible event and the whole being 0."

Yes, so? Think of it this (intuitive) way: since A and B are disjoint they have no outcomes in common (by definition), so "A and B" is the empty event. This means P(A and B) = 0. Because of this, and the formula for conditional probability, P(A | B) = 0/P(B) = 0 (as long as P(B) is not zero).

A second bit of intuitive explanation. Suppose the two events are disjoint, with neither P(A) nor P(B) equal to zero. Then if B is known to have occurred we know A could not have occurred (again because they do not share outcomes). This means P(A | B) is zero. But now we have that P(A) is not equal to P(A | B) (a non-zero number is not equal to zero), so A and B are not independent. If they aren't independent, the multiplication rule you first tried to apply is the wrong form.
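This can be checked directly with tiny-tim's die example: A = {1} and B = {6} are disjoint, each has probability 1/6, yet P(A|B) = 0 ≠ P(A). A quick sketch (Python, fair die assumed):

```python
from fractions import Fraction

omega = set(range(1, 7))    # a fair six-sided die, uniform measure
A = {1}                     # "rolled a 1"
B = {6}                     # "rolled a 6" -- disjoint from A

def prob(event):
    return Fraction(len(event), len(omega))

p_ab = prob(A & B)             # A and B share no outcomes, so this is 0
p_a_given_b = p_ab / prob(B)   # conditional probability of A given B

assert p_ab == 0
assert p_a_given_b == 0            # knowing B rules A out...
assert p_a_given_b != prob(A)      # ...so A and B are dependent
assert p_ab != prob(A) * prob(B)   # and the product rule fails (0 != 1/36)
```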

7. Aug 11, 2013

### chiro

To understand independence, realize that having one variable gives you no extra information about another.

Mathematically we express this as P(A|B) = P(A) for any event or random variable B. The condition that satisfies this is P(A and B) = P(A)P(B) since P(A|B) = P(A) = P(A and B)/P(B).

This is the condition of independence, and it's precisely the value at which, again, you get no extra information or advantage from knowing B when determining A.

8. Aug 12, 2013

### cdux

I had a good intuitive idea about the whole thing and I now find no contradiction with my book:

The very theoretical inception of P(A|B) starts from the fact that one "zooms in" on B to see the chance of A happening in it. For a very intuitive example, imagine that the chance of A = {1,2} happening in the space of Ω = {1,2,3,4,5,6,7,8} is doubled in the space of B = {1,2,3,4}, simply because now 2 'things' of A exist among the 4 'things' of B instead of among the 8 'things' of Ω. That is additionally seen arithmetically when we divide P(A and B) by P(B) in that calculation: the 0.5 of P(B) simply doubles the chance of A happening. In other words, Ω is temporarily transformed into Ω' = B for the duration of the calculation of the chance of A happening.
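Concretely, that example checks out (a minimal sketch, uniform measure assumed):

```python
from fractions import Fraction

omega = set(range(1, 9))    # omega = {1, ..., 8}, uniform measure
A = {1, 2}
B = {1, 2, 3, 4}

def prob(event):
    return Fraction(len(event), len(omega))

p_a = prob(A)                         # 2/8 = 1/4
p_a_given_b = prob(A & B) / prob(B)   # (2/8) / (4/8) = 1/2

assert p_a == Fraction(1, 4)
assert p_a_given_b == Fraction(1, 2) == 2 * p_a  # chance of A doubles inside B
```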

Now, the theory goes on to say that two events are independent when P(A and B) = P(A)P(B), because P(A|B) = P(A). And that's the tricky bit. Why did you go directly from P(A|B) to P(A) without any mathematically rigorous step? Did you just declare it? Did you just test their 'real life attributes' to derive that? Who are you people? Most texts draw a blank there but I think they shouldn't. There is a distinct mathematical concept that describes it, and I understand that some may wrongly find it abstract, which is the reason no further explanation is often included:

The real answer resides in how we view each element of a sample space.

For example, P(American) may show the probability of being an American as the count of persons who are American over all persons: N(A)/N(Ω).

However, if we have information about the particular persons being a man or a woman, then we may have another ratio, P(M) = N(M)/N(Ω).

But here is the important bit: If we know that specific personal information then we can find P(A|M). (Because we could simply use their intersection.)

Hence it comes to reason that two events are dependent when their sample spaces intersect and reside in the same experiment.

And with simple probabilistic logic that should be obvious to anyone dealing with these things: They are independent when their sample spaces do not intersect or they do not reside in the same experiment.

Something that may scare mathematicians and keep them from going there is that that definition touches on the very fabric of probability theory: that is all an approximation. e.g. Is the probability of being an American really N(A)/N(Ω)? Obviously not. There is surely an impossibly complex distribution that would really describe it. But we simplify it that way when conditional probabilities come into play. It may not really work that way but that doesn't mean it isn't what we effectively do anyway. Why hide it?

One could further conclude that independence is often an approximation (one may start from there but it can be also derived). For a good example: The probability of being an American given that you are a Man is obviously very close to the probability of being an American! Surely, being a man in America may be more or less prevalent than in the Middle East (definitely less). Hence declaring an independence doesn't mean P(A|M) doesn't exist! There is obviously a variation but we may simply ignore it due to lack of information and hence there is an assumption that the sample spaces do not intersect or that the experiment is independent but the samples obviously intersect (they are all the same people!) and it is obviously the same experiment (the world!).

Last edited: Aug 12, 2013
9. Aug 12, 2013

### D H

Staff Emeritus
The answer to your question "Did you just declare it?" is yes. P(A|B) = P(A) is one way to define statistical independence. (Others use A and B are independent iff P(A ∩ B)=P(A)*P(B). The two definitions are equivalent.) Either way, it's a definition. That's all it is. That it turns out to be a rather useful definition is a different question.
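For P(B) > 0 the equivalence is a one-line rearrangement of the conditional-probability formula:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = P(A) \iff P(A \cap B) = P(A)\,P(B).$$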

Neither one of these is true. The former statement doesn't even make sense. The only way to make sense of things like P(A ∩ B) and P(A|B) is if A and B are members of some common universe of possibilities. The latter is incorrect. People test for statistical independence all the time in the same experiment.

10. Aug 12, 2013

### cdux

"doesn't even make sense"

Don't you just try to discourage people? Let's see if you make sense:

"A and B are members of some common universe of possibilities."

Where did I object to that? Nowhere. In fact, that's exactly what I said, but in a concrete way and not that abstractly: that to be dependent they must be from the same Ω, and intersect, and be from the same experiment.

I really don't see what you are trying to say when I see nothing objecting to that from that statement.

"People test for statistical independence all the time in the same experiment."

Do you really think that's a good argument to prove "I don't make sense"? I specifically talked of a situation that people may have a dependency and choose to make an approximation. So "they do it all the time" contradicts nothing. Of course they do it all the time! They approximate an independence all the time or they use a dependency all the time. Why on earth would I imply one negates the other?

In general, please respond only when you have arguments and without insulting people.

Last edited: Aug 12, 2013
11. Aug 12, 2013

### economicsnerd

If $A,B$ are both independent and disjoint, then $$0 = P(A\text{ and }B) = P(A)\cdot P(B),$$ so that at least one of $A,B$ is a null (i.e. probability $0$) event.

So the starting hypothesis (that $A,B$ are both independent and disjoint) is extremely restrictive.

12. Aug 13, 2013

"If A,B are both independent and disjoint, ..."

Do not say this for any reason, good or bad, for misunderstanding can come about: events cannot be both independent and disjoint.

"Hence it comes to reason that two events are dependent when their sample spaces intersect and reside in the same experiment."

Not sure what you mean by "sample spaces intersect". If you mean "the events share outcomes" then there is no single answer: two events that share outcomes may be dependent, or they may be independent.

The only thing we can always say is true is: if two events do not share outcomes they are not independent - better stated, disjoint events are dependent.

Finally, note that your "very theoretical" statement that begins "The very theoretical inception of P(A|B) starts from the fact that one "zooms in" B to see the chance of A happening in it. " is no such thing: no wording as vague as "zooms in" would be used in a formal statement of theory. What (to me) your statement seems to be doing is this:

* If A is an event we can, with sufficient information, calculate its probability $P(A)$ by considering what fraction of the total sample space A's outcomes constitute.
* Now, if we are given another event B has occurred, we should be able to use that information to obtain a new estimate of the probability of A, but how? Well, since B has occurred, we can reduce the effective sample space from all of $\Omega$ to B, and simply calculate the fraction of B which A constitutes. This calculation is denoted by $P(A \mid B )$

Once we have both $P(A)$ and $P(A \mid B)$ calculated, there are two possibilities:

a) They are equal: $P(A) = P(A \mid B)$. In this case we say the two events are independent - the intuitive notion is that knowledge of B does not change our original assessment of A's probability.
b) They are not equal: $P(A) \ne P(A \mid B)$. In this case we say that the two events are dependent: intuitively knowing that B has occurred gives us enough information to change our original assessment of A's probability

Notice that if A and B are disjoint, then A does not constitute any part of B, so in that case $P(A \mid B) = 0$ - this is a special case of 'b' above.

13. Aug 14, 2013

### economicsnerd

... unless at least one of them is null. In particular, the empty event is both independent of and disjoint with any event.

It seems worth noting that if $E$ is a trivial event (i.e. if $P(E)\in\{0,1\}$), then $E$ is independent of every other event.

14. Aug 14, 2013

Null events are special cases that are typically not considered independent of other events, as the traditional definition of independence requires that the two sets have a non-empty intersection. I've never used or reviewed a text that considers null events to be both disjoint and independent of all other events - it would seem that any one that states such reflects a nearly unique author's view.

Your second comment is partially derailed by this - the P(E) = 0 item. If P(E) = 1 then E is the sample space (or a copy of it), and since in this case for any other event $A \cap E = A$ you are sure to have
$$P(A \mid E) = \frac{P(A \cap E)}{P(E)} = \frac{P(A)}{1} = P(A)$$
so the "definition" of independence is trivially satisfied.

15. Aug 14, 2013

### Stephen Tashi

I've never seen a text on measure theory that uses that requirement. Is it provable from the axioms of probability?

Independence in probability texts is usually defined without reference to conditional probability as the condition that A and B are independent events if and only if P( A and B) = P(A)P(B).

The conditional probability P(B|A) isn't defined if P(A) = 0, so trying to define independence with reference to a conditional probability would need to assume the conditional probability exists.

16. Aug 14, 2013

### D H

Staff Emeritus
And with that definition it's a rather trivial result that the null event is both independent of and disjoint from all other events.
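That triviality can be spelled out on a toy space: for every event A, P(A ∩ ∅) = 0 = P(A)·P(∅), and A ∩ ∅ = ∅. A sketch (Python, uniform measure assumed):

```python
from fractions import Fraction
from itertools import combinations

omega = {1, 2, 3, 4}

def prob(event):
    return Fraction(len(event), len(omega))

empty = set()  # the null event

# Enumerate every event (every subset of omega):
events = [set(c) for r in range(len(omega) + 1)
          for c in combinations(sorted(omega), r)]

for A in events:
    # Product definition holds: P(A and empty) = 0 = P(A) * 0 ...
    assert prob(A & empty) == prob(A) * prob(empty)
    # ... and the null event is also disjoint from A.
    assert A & empty == set()
```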

However, I do agree with statdad that we're getting a bit pedantic and off-topic here. The OP appears to have a fundamental misunderstanding of statistical independence. We need to help with that situation, if possible.

17. Aug 24, 2013

### haruspex

cdux, please try to take what D H wrote not as a personal attack but as a simple statement of fact.
The whole discussion of relationships of probabilities of events starts with the notion that we have an agreed space Ω, and an event consists of a subset of that space. If you want to say anything useful about, say, the joint probability of a coin toss coming up heads and a bus arriving in the next minute then you must first construct a space that encompasses that combination.
Not sure what you mean by that. Again, the notion of an agreed space seems to prohibit it.