Random variables: Total probability, Transformations & CDFs

AI Thread Summary
The discussion revolves around evaluating the cumulative distribution function (CDF) F_Y(t) for a transformed random variable Y derived from a uniformly distributed variable X. The transformation is piecewise defined, leading to confusion when applying the law of total probability (LTP) to compute probabilities. The participant initially misapplies conditional probabilities, resulting in discrepancies between analytical and graphical methods. After clarification, it is emphasized that the correct approach involves expressing events in terms of X and considering the entire range of Y values. The conversation concludes with encouragement to explore more analytical methods for complex transformations.
danielakkerma
Hello All!
A recent problem has stuck with me, and I was hoping you could help me resolve it.
Consider the following premise: Let us assume that X \sim \mathcal{U}(-3,3)
(\mathcal{U} denotes the continuous uniform distribution).
And let the transformation Y be applied thus:
Y = \begin{cases}
X+1, & -1 \leq X \leq 0 \\
1-X, & 0 \leq X \leq 1 \\
0, & \text{otherwise}
\end{cases}
Then one desires to evaluate F_Y(t), where F_Y is the cumulative distribution function of Y.
Obviously, the simplest approach would be to find the expression using elementary means -- for example, by plotting the new domain of Y as a function of X.
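For example, whatever expression one obtains can be sanity-checked with a quick simulation; here is a minimal sketch of such a check, assuming NumPy is available (the test points are arbitrary):

Code:
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=1_000_000)    # X ~ U(-3, 3)

# Apply the piecewise transformation defining Y
y = np.zeros_like(x)                          # Y = 0 outside [-1, 1]
left = (x >= -1.0) & (x <= 0.0)
right = (x > 0.0) & (x <= 1.0)
y[left] = x[left] + 1.0                       # Y = X + 1 on [-1, 0]
y[right] = 1.0 - x[right]                     # Y = 1 - X on (0, 1]

# Empirical CDF of Y at a few test points
for t in (-0.5, 0.0, 0.25, 0.5, 1.0):
    print(t, np.mean(y <= t))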
However, I attempted to obtain the same result by considering the problem from more general principles, particularly, the law of total probability.
I considered the following statement:
F_Y(t) = \mathbf{P}(Y \leq t)
By LTP:
\mathbf{P}(Y \leq t) = \sum_{i} P(Y \leq t \mid X \in A_i) \cdot P(X \in A_i)
where \{A_i\} is the set of regions on which Y is defined as a function of X; for example, A_1 = [-1,0], A_2 = [0,1], and so forth.
Thus, I would get:
\mathbf{P}(Y \leq t) = P(Y \leq t \mid -1 \leq X \leq 0) \cdot P(-1 \leq X \leq 0) + P(Y \leq t \mid 0 \leq X \leq 1) \cdot P(0 \leq X \leq 1) \\ + P(Y \leq t \mid \{1 \leq X \leq 3\} \cup \{-3 \leq X \leq -1\}) \cdot P(\{1 \leq X \leq 3\} \cup \{-3 \leq X \leq -1\})
Then I observe that: P(Y \leq t \mid -1 \leq X \leq 0) is merely P(Y \leq t \cap Y = X+1) = P(X+1 \leq t) = F_X(t-1).
I then apply this to all the conditional probabilities (i.e., separating the values of Y according to the constituent ranges of X, as shown) and use the CDF of X.
However, I obtain a completely different (and erroneous!) result this way, compared with the direct approach (graphical and otherwise).
What went wrong?
Is my approach correct (or even permissible) at all?
Thank you very much for your attention,
Daniel
 
danielakkerma said:
Then I observe that: P(Y \leq t \mid -1 \leq X \leq 0) is merely P(Y \leq t \cap Y = X+1) = P(X+1 \leq t) = F_X(t-1).

Does this amount to claiming that P(A \mid B) = P(A \cap B)?
 
Not exactly...

Thanks for your reply.
I meant to say that this event corresponds to:
P(Y \leq t \mid -1 \leq X \leq 0) = P(X+1 \leq t)
Since, logically, the two are -- or at least should be -- equivalent (one is only possible together with the other).
Is this reasoning invalid?
What, then, is the probability of that conditional statement?
Thanks again,
Daniel
 
Suppose t = -1/2.

P(X+1 \le -1/2) = P(X \le -3/2) = \frac{(-3/2) - (-3)}{6} = \frac{3}{12} = \frac{1}{4}

P(Y \le -1/2 \mid -1 \le X \le 0) = 0, since on that event Y = X+1 \in [0,1].
 
You are, of course, correct!

You're obviously right. I can't believe I didn't detect such a boneheaded mistake sooner; thank you!
I see I should have written that equality, using the LTP, in this manner:
P(Y \leq t) = \sum_i P(Y \leq t \cap X \in A_i)
where, again, the sets A_i are the regions of X on which Y is defined.
For one of these regions, I therefore get:
P(Y \leq t \cap -1 \leq X \leq 0) = P(X \leq t-1 \cap -1 \leq X \leq 0) =
\begin{cases}
F_X(t-1) - F_X(-1), & 0 \leq t \leq 1 \\
1, & t > 1 \\
0, & \text{otherwise}
\end{cases}
But, for the other term:
P(Y \leq t \cap 0 \leq X \leq 1) = P(X \geq 1-t \cap 0 \leq X \leq 1) =
\begin{cases}
F_X(1) - F_X(1-t), & 0 \leq t \leq 1 \\
1, & t > 1 \\
0, & \text{otherwise}
\end{cases}
Here it is already evident that, when summing these two results (as per the LTP), one obtains \lim_{t \to \infty} F_Y(t) = 2 rather than 1, so a fundamental property of the CDF is lost.
How do I correct this discrepancy?
Thanks,
Daniel
 
danielakkerma said:
P(Y \leq t \cap -1 \leq X \leq 0) = P(X \leq t-1 \cap -1 \leq X \leq 0)

But Y \leq t is not the same event as X \leq t-1, so you cannot equate the events \{Y \leq t\} \cap \{-1 \leq X \leq 0\} and \{X \leq t-1\} \cap \{-1 \leq X \leq 0\}.

Let Y be a function of the random variable X. Let the sets A_i partition the domain of X.

Then P(Y \leq t) = \sum_{i=1}^n P(Y \le t \cap X \in A_i) = \sum_{i=1}^n P(Y \leq t \mid X \in A_i)\, P(X \in A_i).

To compute P(Y \leq t \mid X \in A_i), you can express the statement that defines Y \leq t as an equivalent statement about X. Then compute P(Y \leq t \mid X \in A_i) as P(Y \leq t \cap X \in A_i)/P(X \in A_i).

For example, in your problem:

\{Y \leq 1/2\} = S = \{-3 \leq X \leq -1\} \cup \{ -1 \leq X \leq -1/2 \} \cup \{ 1/2 \leq X \leq 3 \}

\{Y \leq 1/2\} \cap \{-1 \leq X \leq 0 \} = S \cap \{-1 \leq X \leq 0\} = \{-1 \leq X \leq -1/2\}
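
so that, putting these together,

P(Y \leq 1/2 \mid -1 \leq X \leq 0) = \frac{P(-1 \leq X \leq -1/2)}{P(-1 \leq X \leq 0)} = \frac{1/12}{1/6} = \frac{1}{2}.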


The way such examples are usually solved is to express the event Y \le t as an equivalent statement about X being in a union of mutually exclusive sets B_i. The sets B_i depend on t. Then the law of total probability is used to compute P(X \in B_1 \cup B_2 \cup \dots \cup B_n).
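For instance, in this problem (generalizing the t = 1/2 example above), for 0 \leq t < 1 the sets can be taken to be (boundary overlaps carry zero probability):

\{Y \leq t\} = \{-3 \leq X \leq -1\} \cup \{-1 \leq X \leq t-1\} \cup \{1-t \leq X \leq 3\}

so that

F_Y(t) = \frac{2}{6} + \frac{t}{6} + \frac{2+t}{6} = \frac{2}{3} + \frac{t}{3},

with F_Y(t) = 0 for t < 0 and F_Y(t) = 1 for t \geq 1.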

Thinking of Y as a function that maps a set s in the domain of X to a set Y(s) in the range of Y, the way to find P(Y \leq t) is to find the probability of the set Y^{-1}(\{Y \leq t\}) using the distribution of X.
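
If it helps, here is a small Python sketch of that inverse-image recipe; the interval endpoints below are specific to this particular transformation, and the helper for uniform probabilities is just illustrative:

Code:
def uniform_prob(a, b):
    """P(a <= X <= b) for X ~ U(-3, 3), clipped to the support."""
    a, b = max(a, -3.0), min(b, 3.0)
    return max(b - a, 0.0) / 6.0

def F_Y(t):
    """P(Y <= t), computed as the probability of the inverse image {x : Y(x) <= t}."""
    if t < 0:
        return 0.0
    # Outside [-1, 1] we have Y = 0, so those x always satisfy Y <= t once t >= 0
    prob = uniform_prob(-3.0, -1.0) + uniform_prob(1.0, 3.0)
    # On [-1, 0]: Y = x + 1 <= t  iff  x <= t - 1
    prob += uniform_prob(-1.0, min(t - 1.0, 0.0))
    # On [0, 1]: Y = 1 - x <= t  iff  x >= 1 - t
    prob += uniform_prob(max(1.0 - t, 0.0), 1.0)
    return prob

for t in (-0.5, 0.0, 0.25, 0.5, 1.0):
    print(t, F_Y(t))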
 
Now, it's finally clear!

Stephen,
Thanks again for your patient and diligent aid here! It's finally dawned on me (and I'm sorry it has taken so long).
I now see that I should have accounted for the various values Y \leq t could take, irrespective of X; and obviously, as you point out, intersecting with a set of X values would not necessarily confine Y itself to any particular range (as a function of X).
Thinking of Y as a function that maps a set s in the domain of X to a set Y(s) in the range of Y, the way to find P(Y \leq t) is to find the probability of the set Y^{-1}(\{Y \leq t\}) using the distribution of X.
This is what I have been doing hitherto, but I was hoping I could find a more analytical method to compute these sets, especially when the transformations are not quite so trivial (and may involve multi-valued inverse solutions).
Still, I quite see now where I was mistaken.
Thanks again for all your help!
Daniel
 
danielakkerma said:
but I was hoping I could find a more analytical method to compute these sets, especially when the transformations are not quite so trivial (and may involve multi-valued inverse solutions).

Don't completely give up on that goal. I don't know what progress can be made, but if you find something, it would be a great service to mathematical humanity. Maybe the cure for the problems of transformations is yet more transformations.
 