Probability Theory: Need help understanding a step

WWCY

Homework Statement



Discrete random variables ##X,Y,Z## are mutually independent if for all ##x_i, y_j, z_k##,
$$P(X=x_i \wedge Y=y_j \wedge Z=z_k ) = P(X=x_i)P(Y=y_j)P(Z=z_k )$$

I am trying to show (or trying to understand how someone has shown) that ##X,Y## are also independent as a result of ##X,Y,Z## being mutually independent.

Homework Equations

The Attempt at a Solution



It starts off with
$$P(X=x_i \wedge Y=y_j ) = \sum_k P(X=x_i \wedge Y=y_j \wedge Z=z_k )$$
before using the definition of mutual independence for the three variables to complete the proof. This is the step I don't understand. Why is the probability of getting results ##x_i,y_j## equal to the sum (over ##k##) of probabilities of getting results ##x_i, y_j, z_k##?

Many thanks in advance!
 
You are looking for the probability of some event A. You then need to add up the probabilities of all outcomes for which A is true, in this case those where X and Y take particular values. A is true regardless of the value of Z, as long as X and Y take the required values, so you end up with a sum over the possible outcomes for Z.
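
A quick numerical check of this can be sketched in Python. The marginal distributions below are made-up values chosen only for illustration (they are not from the problem); the joint distribution is built as the product of the marginals, which is the definition of mutual independence, and the probability of each ##(x_i, y_j)## pair is then recovered by summing over every value of ##Z##.

```python
from fractions import Fraction
from itertools import product

# Hypothetical marginal pmfs, chosen only for illustration.
p_x = {0: Fraction(1, 4), 1: Fraction(3, 4)}
p_y = {0: Fraction(1, 3), 1: Fraction(2, 3)}
p_z = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}

# Joint pmf of mutually independent X, Y, Z: the product of the marginals.
joint = {
    (x, y, z): p_x[x] * p_y[y] * p_z[z]
    for x, y, z in product(p_x, p_y, p_z)
}


def marginal_xy(joint_pmf, x, y):
    """P(X=x, Y=y), obtained by summing the joint pmf over every value of Z."""
    return sum(p for (xi, yj, _), p in joint_pmf.items() if xi == x and yj == y)


# Summing over Z recovers P(X=x)P(Y=y) for every pair (x, y).
for x, y in product(p_x, p_y):
    assert marginal_xy(joint, x, y) == p_x[x] * p_y[y]
```

Exact fractions are used instead of floats so that the equality checks are not affected by rounding.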
 
WWCY said:
Why is the probability of getting results ##x_i,y_j## equal to the sum (over ##k##) of probabilities of getting results ##x_i, y_j, z_k##?

It's as @Orodruin said - and the concept is significant enough to have its own name: https://en.wikipedia.org/wiki/Law_of_total_probability.

Rather than being a "law of nature", it is implicit in the definition of a probability space, which depends on the definition of a probability "measure", whose definition says it is an "additive" function when applied to disjoint measurable sets. That's an outline of the mathematical structure, which is not made clear by the Wikipedia article.
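
To spell out the additivity argument for the step in question (a sketch, assuming ##Z## takes values in a countable set ##\{z_k\}##, as a discrete random variable does): the events ##\{Z=z_k\}## are pairwise disjoint and together cover the whole sample space, so
$$\{X=x_i \wedge Y=y_j\} = \bigcup_k \{X=x_i \wedge Y=y_j \wedge Z=z_k\}$$
is a union of disjoint events, and additivity of ##P## gives
$$P(X=x_i \wedge Y=y_j ) = \sum_k P(X=x_i \wedge Y=y_j \wedge Z=z_k )$$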
 
Stephen Tashi said:
It's as @Orodruin said - and the concept is significant enough to have its own name: https://en.wikipedia.org/wiki/Law_of_total_probability.

Rather than being a "law of nature", it is implicit in the definition of a probability space, which depends on the definition of a probability "measure", whose definition says it is an "additive" function when applied to disjoint measurable sets. That's an outline of the mathematical structure, which is not made clear by the Wikipedia article.
Well said (and a bit more direct than I managed on my phone this morning). In general, I think the connection between probability theory and measure theory is typically underemphasised in introductory courses on probability (at least for non-mathematicians). Also, just for OP's reference: https://en.wikipedia.org/wiki/Measure_(mathematics)
 
This seems to be a re-cut of a recent thread that I answered here: https://www.physicsforums.com/threads/independent-events-and-variables.954016/#post-6047400

I tried to emphasize the role of events and (sub) additivity, but if OP did not understand the event partitioning (and union) argument, then introducing measures... is a step in the wrong direction. And it certainly is not needed for discrete random variables.
- - - - -
Another approach is to unpack the joint probability into a product of conditional probabilities. Ignoring any nits about zero probability events, we have the identity

##P\Big(X=x_i, Y=y_j, Z= z_k\Big) = P\Big(X=x_i\Big)P\Big(Y=y_j \big \vert X = x_i\Big)P\Big( Z= z_k\big \vert X = x_i, Y = y_j \Big)##

But we need to recall that conditional probabilities are in fact probabilities, so they sum to 1 over all values of ##Z##. Summing over all ##k## then gives

##\sum_k P\Big(X=x_i, Y=y_j, Z= z_k\Big) ##
##= \sum_k P\Big(X=x_i\Big)P\Big(Y=y_j \big \vert X = x_i\Big)P\Big( Z= z_k\big \vert X= x_i, Y= y_j\Big) ##
##= P\Big(X=x_i\Big)P\Big(Y=y_j \big \vert X = x_i\Big)\cdot \sum_k P\Big( Z= z_k\big \vert X= x_i, Y= y_j\Big) ##
##= P\Big(X=x_i\Big)P\Big(Y=y_j \big \vert X = x_i\Big)\cdot 1 ##
##=P\Big(X=x_i\Big)P\Big(Y=y_j \big \vert X = x_i\Big)##
##=P\Big(X=x_i, Y=y_j\Big)##

as desired.
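
For completeness, here is a sketch of the remaining step of the proof the OP describes. Applying the definition of mutual independence to each term of the sum,
$$P(X=x_i \wedge Y=y_j ) = \sum_k P(X=x_i)P(Y=y_j)P(Z=z_k ) = P(X=x_i)P(Y=y_j)\sum_k P(Z=z_k ) = P(X=x_i)P(Y=y_j)$$
since ##\sum_k P(Z=z_k) = 1##. This is exactly the statement that ##X## and ##Y## are independent.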
 
Thanks for the responses.

I did not think to apply the law of total probability to the case of random variables. Now I see the connection.

@StoneTemplePython I did manage to grasp the idea behind your proof in the other thread, but couldn't do so for the method (just that step) presented above, hence the question. Thanks again for your time!
 