# Independent events and variables

Hi all, I have a few questions regarding the issue of independence. Many thanks in advance.

##\textbf{1}##

If I find that some events ##A, B, C## obey the following formula
$$P(A \cap B \cap C ) = P(A)P(B)P(C)$$
it does not necessarily mean that a) they are mutually independent and b) ##A## and ##B## (or any two of the 3 events) are independent unless I explicitly find that all possible subcollections are mutually independent as well. Is this right?

##\textbf{2}##
Similarly, if I find that a set of discrete random variables ##X, Y, Z## obey the following (for specific ##i,j,k##),
$$P(X=x_i \wedge Y = y_j \wedge Z=z_k ) = P(X=x_i)P(Y=y_j)P(Z=z_k)$$
I am not allowed to conclude that a) they are mutually independent and b) any two of the 3 variables are independent unless I explicitly show that ##X,Y,Z## obey the above formula for all ##i,j,k##. Is this right?

StoneTemplePython
Gold Member
It's probably best to stick with events first, but this is basically right. I'm assuming the events in question have non-zero probability. The idea is if you have ##n\geq 2## events, you need to iterate through the entire powerset and check your formula in (1.) against it. Now there is nothing to do in the ##\binom{n}{0}## case and the ##\binom{n}{1}## case is trivially true, so you end up with ##\big(1+1\big)^n -\binom{n}{1} - \binom{n}{0} = 2^n -n -1## checks that need to be done.
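To make the counting concrete, here is a minimal sketch of the brute-force check, assuming a finite sample space with equally likely outcomes. The helper names `prob` and `is_mutually_independent` and the three-coin example are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations

def prob(event, omega):
    """Probability of an event (a set of outcomes) under a uniform measure on omega."""
    return Fraction(len(event & omega), len(omega))

def is_mutually_independent(events, omega):
    """Check the product rule on every subcollection of size >= 2."""
    checks = 0
    for k in range(2, len(events) + 1):
        for subset in combinations(events, k):
            checks += 1
            intersection = set(omega)
            product = Fraction(1)
            for e in subset:
                intersection &= e
                product *= prob(e, omega)
            if prob(intersection, omega) != product:
                return False, checks
    return True, checks

# Hypothetical example: three fair coin flips, with events "flip i is heads".
omega = {(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)}
events = [{w for w in omega if w[i] == 1} for i in range(3)]
ok, checks = is_mutually_independent(events, omega)
print(ok, checks)  # True 4
```

With ##n = 3## the loop performs ##2^3 - 3 - 1 = 4## checks: three pairs and one triple.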

- - - -
Sometimes there are modeling reasons why you know the variables are independent. Other times you are only interested in the first two moments, so you only need to do pairwise (covariance) calculations. (In particular, Gaussians have some additional special structure, since they are entirely characterized by their first two moments.) But at a high level, if you know nothing else, you need to check everything.

WWCY

So would it be right for me to say that if I know that a bunch of ##N## discrete random variables are mutually independent, I could pick any number (up to ##N##) of these variables and safely conclude that they are also mutually independent?

FactChecker
Gold Member

This is not the same thing that you were asking about in post #1. In that, you were asking for confirmation that something was not true. Now you are asking if something is true. That is different and does not follow.

PS. A simple (degenerate) counterexample to the statement in the OP is if P(C) = 1 (like if C=U= the universal set.) In that case, any two independent sets A and B contained in C would satisfy the equation, but both are dependent on C.

rubi
That's not a counterexample. If ##P(C)=1##, we have
##P(A\cap B) = P(A\cap B \cap C) = P(A) P(B) P(C) = P(A) P (B)##
##P(A\cap C) = P(A) = P(A) P(C)## (analogously for ##B##)

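A counterexample would be ##C = B## with ##P(A) = P(B) = \frac 1 2## and ##P(A \cap B) = \frac 1 8##. The triple product formula then holds, but ##A## and ##B## are not independent:
##\frac 1 4 = P(A) P(B) \neq P(A\cap B) = P(A \cap B \cap B) = P(A) P(B) P(B) = \frac 1 8##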
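The counterexample above can be checked on a concrete sample space. This is only a sketch: the eight equally likely outcomes and the particular sets `A` and `B` are assumptions picked so that ##P(A) = P(B) = \frac 1 2## and ##P(A \cap B) = \frac 1 8##.

```python
from fractions import Fraction

omega = set(range(8))   # 8 equally likely outcomes (an assumption for illustration)
A = {0, 1, 2, 3}        # P(A) = 1/2
B = {0, 4, 5, 6}        # P(B) = 1/2, overlapping A in exactly one outcome
C = B                   # the choice C = B from the counterexample

def P(event):
    """Probability of an event under the uniform measure on omega."""
    return Fraction(len(event), len(omega))

triple_holds = P(A & B & C) == P(A) * P(B) * P(C)  # 1/8 on both sides
pair_holds = P(A & B) == P(A) * P(B)               # 1/8 versus 1/4
print(triple_holds, pair_holds)  # True False
```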

FactChecker
Gold Member
Sorry that I wasn't clear. There is sort of a double negative in the stated question. He asks if he is correct that the equation does NOT imply pairwise independence. This is a case to support his statement by providing a counterexample to the contention that A, B, and C would be pairwise independent. This case satisfies the equations but A, B, and C are not pairwise independent.

Stephen Tashi

So would it be right for me to say that if I know that a bunch of ##N## discrete random variables are mutually independent, I could pick any number (up to ##N##) of these variables and safely conclude that they are also mutually independent?

Yes.

How to define "mutually independent" is an interesting technical question. One possibility is to say it means "pairwise independent". The concept of "independent" is usually defined for pairs of random variables.

If we want to talk about 3 random variables "being mutually independent", we can say this means each pair of them is independent without formulating any new definitions.

StoneTemplePython
Gold Member

So would it be right for me to say that if I know that a bunch of ##N## discrete random variables are mutually independent, I could pick any number (up to ##N##) of these variables and safely conclude that they are also mutually independent?

Basically yes -- so long as they are bona fide random variables (not defective or whatever). It is implicit in the 'tests' for mutual independence. A more direct way of getting at this is... do it yourself by writing out the joint distribution and marginalizing the random variables you are not interested in selecting.

While there is a clear linkage between binary events and random variables (via indicator random variables) I would strongly suggest making sure you have the events case locked down mentally first. People probably jump into random variables too quickly.

On the suggestion that "mutually independent" could be taken to mean "pairwise independent": sorry, but this is non-standard to the point of being dangerous for the OP. Mutual independence already has a standard definition; see, e.g., Feller vol. 1, 3rd edition, pages 217-218, for a standard definition of mutual independence for discrete random variables. Pairwise comparisons are too weak in general.

Stephen Tashi
Yes. Alternatively, see the Wikipedia article: https://en.wikipedia.org/wiki/Independence_(probability_theory)

FactChecker
Gold Member
An example where pairwise independence does not imply mutual independence (standard definition) is in Feller vol 1 3rd edition, page 127. Another example can be found at https://en.wikipedia.org/wiki/Independence_(probability_theory)#Pairwise_and_mutual_independence
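For a quick numerical illustration, a well-known example of this kind takes two fair independent bits ##X, Y## and sets ##Z = X \oplus Y##. The sketch below is not taken from either reference verbatim; the helper names `marginal` and `factorizes` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Joint pmf of (X, Y, Z): X and Y are fair independent bits, Z = X XOR Y.
joint = {(x, y, x ^ y): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}

def marginal(indices):
    """Marginal pmf of the variables at the given positions."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in indices)
        out[key] = out.get(key, Fraction(0)) + p
    return out

def factorizes(indices):
    """True if the joint pmf of the selected variables equals the product of
    their one-variable marginals for every combination of values."""
    joint_sub = marginal(indices)
    singles = [marginal((i,)) for i in indices]
    for values in product((0, 1), repeat=len(indices)):
        expected = Fraction(1)
        for single, v in zip(singles, values):
            expected *= single.get((v,), Fraction(0))
        if joint_sub.get(values, Fraction(0)) != expected:
            return False
    return True

pairwise = all(factorizes(pair) for pair in [(0, 1), (0, 2), (1, 2)])
mutual = factorizes((0, 1, 2))
print(pairwise, mutual)  # True False
```

Every pair passes the product test, but the triple fails: for instance ##P(X=0, Y=0, Z=0) = \frac 1 4## while the product of the marginals is ##\frac 1 8##.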

WWGD
Gold Member
Why not use ##P(A \cap B\cap C)=P(A\cap C \cap B)=P((A\cap B)\cap C) ##, etc?

StoneTemplePython
Gold Member
By the way, to refine my earlier point about marginalizing out the random variables you are not interested in:

The definition used in Feller is that a collection of ##n## discrete random variables (i.e. a finite set of them) is mutually independent if, for any combination of values ##(x, y, ..., w)## assumed by them,

##Pr\{X = x, Y=y, ..., W=w\} = Pr\{X =x\}Pr\{Y = y\}...Pr\{W=w\}##

Note that there are ##n## random variables above; ##n## is not tied to the number of letters in the alphabet.

What I was suggesting is that it is enough to consider the case where you are interested in the first ##n-1## random variables but not the ##n##th one, while knowing that all ##n## are mutually independent. (Induction/recursion then gives the result for the ##k## random variables you are actually interested in, for ##k \in \{2, 3, ..., n-1\}##.)
- - - -
Since the random variables are discrete, the sample space is discrete and there is a natural correspondence with the integers. Our attention here is really on ##W##: its possible values can be enumerated as ##w^{(1)}, w^{(2)}, w^{(3)}, ...##

So you may define events:

##A_1 := \text{the event that } X = x, Y=y, ..., W=w^{(1)}##
##A_2 := \text{the event that } X = x, Y=y, ..., W=w^{(2)}##
##A_3 := \text{the event that } X = x, Y=y, ..., W=w^{(3)}##
and so on.
- - - -
(Some technical nits: the above may not be totally satisfying with respect to ordering, though we could easily clean this up using conditional probabilities, i.e. by dividing each side by ##Pr\{X = x, Y=y, ...,V=v\}## when that probability is non-zero. The zero-probability cases need a little more care but can ultimately be ignored: for any positive-probability value ##w##, if the right-hand side is zero, it must be because the product of the other ##n-1## terms is zero, which gives the desired relation. Our analysis is only concerned with the positive-probability values of ##W##, since its zero-probability values have no impact on the marginalization we are doing here. I think all of this obscures the argument too much, though.)
- - - -
Now we know the relationship, by assumption:

##Pr\{X = x, Y=y, ..., V=v, W=w\} = Pr\{X =x\}Pr\{Y = y\}...Pr\{V = v\}Pr\{W=w\}##

where I introduce the random variable ##V## as the ##(n-1)##th one; hopefully that doesn't confuse things.

So, summing over ##w##, we have
##\Big(Pr\{A_1\}+Pr\{A_2\} + Pr\{A_3\}+... \Big)= \sum_w Pr\{X = x, Y=y, ..., V=v, W=w\}##
##= \sum_w Pr\{X =x\}Pr\{Y = y\}...Pr\{V = v\}Pr\{W=w\}##
##= Pr\{X =x\}Pr\{Y = y\}...Pr\{V = v\}\sum_w Pr\{W=w\} ##

which, because the sum over all ##w## on the RHS equals one, simplifies to
##\Big(Pr\{A_1\}+Pr\{A_2\} + Pr\{A_3\}+... \Big)= Pr\{X =x\}Pr\{Y = y\}...Pr\{V = v\}##

From here, apply Boole's Inequality (the union bound) and see

##Pr\{X = x, Y=y, ..., V=v\} = Pr\{A_1 \cup A_2 \cup A_3 \cup ... \}\leq \Big(Pr\{A_1\}+Pr\{A_2\} + Pr\{A_3\}+... \Big)= Pr\{X =x\}Pr\{Y = y\}...Pr\{V = v\}##

But this must be an equality. There are two ways to reason. First, the events ##A_1, A_2, A_3, ...## are mutually exclusive (they correspond to distinct values of ##W##), which is exactly the equality condition for Boole's Inequality. Alternatively, sum over all ##x, y, ..., v##: if the inequality were strict for even one combination of values, you would end up with ##1= \sum_x\sum_y ... \sum_v Pr\{X = x, Y=y, ..., V=v\} \lt \sum_x\sum_y ... \sum_v Pr\{X =x\}Pr\{Y = y\}...Pr\{V = v\} = 1##, which is a contradiction.

So the end result is:

##Pr\{X = x, Y=y, ..., V=v\} = Pr\{X =x\}Pr\{Y = y\}...Pr\{V = v\}##

as desired
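The marginalization argument above can also be sanity-checked numerically. This is a minimal sketch with three variables standing in for ##X, V, W##; the marginal pmfs `pX`, `pV`, `pW` are made-up values, and mutual independence is imposed by construction.

```python
from fractions import Fraction
from itertools import product

# Hypothetical marginals for three discrete random variables X, V, W.
pX = {0: Fraction(1, 3), 1: Fraction(2, 3)}
pV = {"a": Fraction(1, 4), "b": Fraction(3, 4)}
pW = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

# Mutual independence by construction: the joint pmf is the product of the marginals.
joint = {(x, v, w): pX[x] * pV[v] * pW[w] for x, v, w in product(pX, pV, pW)}

# Marginalize out W by summing the joint pmf over all of its values.
joint_XV = {}
for (x, v, w), p in joint.items():
    joint_XV[(x, v)] = joint_XV.get((x, v), Fraction(0)) + p

# The remaining pair (X, V) still satisfies the product rule for every value pair.
still_factorizes = all(joint_XV[(x, v)] == pX[x] * pV[v] for x, v in product(pX, pV))
print(still_factorizes)  # True
```

Exact `Fraction` arithmetic is used so that the equality test is not affected by floating-point rounding.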

WWCY
This helped a lot, cheers and thanks for your time!