Random variables: Total probability, Transformations & CDFs

In summary: the thread discusses using the law of total probability (LTP) to evaluate the cumulative distribution function of a transformation of a random variable. The original poster attempts a general approach but mistakenly equates two non-equivalent events; the resolution hinges on partitioning the domain of the underlying variable and on the preimage (inverse image) of a set under the transformation.
  • #1
danielakkerma
Hello All!
A recent problem has stuck with me, and I was hoping you could help me resolve it.
Consider the following premise: Let us assume that [tex] X \sim \mathcal{U}(-3,3) [/tex]
(U is the continuous, uniform distribution).
And let the transformation Y be applied thus:
[tex]
Y = \left\{
\begin{align*}
X+1, & & -1 \leq X \leq 0 \\
1-X, & & 0\leq X \leq 1 \\
0~~~, & & \rm{otherwise}
\end{align*}
\right.
[/tex]
Then one desires to evaluate [itex] F_Y(t) [/itex], where [itex] F_Y(t) [/itex] is the cumulative distribution function of Y.
Obviously, the simplest approach would be to find the expression using elementary means -- for example, by plotting the new domain of Y as a function of X.
However, I attempted to obtain the same result by considering the problem from more general principles, particularly, the law of total probability.
I considered the following statement:
[tex]
F_Y(t) = \mathbf{P}(Y \leq t)
[/tex]
By LTP:
[tex]
\mathbf{P}(Y \leq t) = \sum_{i} P(Y \leq t ~ \mathbf{|} X \in A_i) \cdot P(X \in A_i)
[/tex]
where the [itex]A_i[/itex] are the regions on which Y is defined as a function of X; for example, [itex]A_1 = [-1,0][/itex], [itex]A_2 = [0,1][/itex], and so forth.
Thus, I would get:
[tex]
\mathbf{P}(Y \leq t) = P(Y \leq t ~ \mathbf{|} -1 \leq X \leq 0) \cdot P(-1 \leq X \leq 0) + P(Y \leq t ~ \mathbf{|} 0 \leq X \leq 1) \cdot P(0 \leq X \leq 1) \\ + P(Y \leq t ~ \mathbf{|} 1 \leq X \leq 3 \cup -3 \leq X \leq -1) \cdot P(1 \leq X \leq 3 \cup -3 \leq X \leq -1)
[/tex]
Then I observe that: [itex]P(Y \leq t ~ \mathbf{|} -1 \leq X \leq 0)[/itex] is merely [itex]P(Y \leq t \cap Y = X+1) = P(X+1 \leq t) = F_X(t-1)[/itex].
This I then apply to all the conditional probabilities (i.e., separating the values of Y according to the constituent Xs, as shown above), using the CDF of X.
However, I obtain a completely different (and erroneous!) result compared with the direct approach (graphical and otherwise).
What went wrong?
Is my approach at all correct (or even permissible)?
Thank you very much for your attention,
Daniel
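(A quick numerical sketch, not from the original post: the function names `Y` and `F_Y_mc` are illustrative. Estimating F_Y directly from the definition by Monte Carlo gives a reference answer against which any LTP derivation can be checked.)

```python
import random

def Y(x):
    """The piecewise transformation from the post."""
    if -1 <= x <= 0:
        return x + 1
    if 0 <= x <= 1:
        return 1 - x
    return 0.0

def F_Y_mc(t, n=200_000, seed=0):
    """Monte Carlo estimate of P(Y <= t) for X ~ U(-3, 3)."""
    rng = random.Random(seed)
    hits = sum(Y(rng.uniform(-3, 3)) <= t for _ in range(n))
    return hits / n

# F_Y jumps to 2/3 at t = 0 (Y = 0 whenever |X| > 1) and reaches 1 at t = 1.
print(F_Y_mc(0.0))   # ≈ 2/3
print(F_Y_mc(0.5))   # ≈ 5/6
print(F_Y_mc(1.0))   # 1.0
```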
 
  • #2
danielakkerma said:
Then I observe that: [itex]P(Y \leq t ~ \mathbf{|} -1 \leq X \leq 0)[/itex] is merely [itex]P(Y \leq t \cap Y = X+1) = P(X+1 \leq t) = F_X(t-1)[/itex].

Does this amount to claiming that P(A|B) = P(A and B) ?
 
  • #3
Not exactly...

Thanks for your reply.
I meant to say that the event corresponds to:
[tex]
P(Y \leq t ~ | ~ -1 \leq X \leq 0) = P(X+1 \leq t)
[/tex]
Since logically, the two are -- or at least should be -- equivalent (one is only possible in tandem with the other).
Is this reasoning invalid?
What, then, is the correct value of that conditional probability?
Thanks again,
Daniel
 
  • #4
Suppose [itex] t = -1/2 [/itex].

[itex] P(X+1 \le -1/2) = P(X \le -3/2) = \frac{ (-3/2) -(-3)}{6} = 3/12 = 1/4 [/itex]

[itex] P(Y \le -1/2 | -1 \le X \le 0) = 0 [/itex]
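(An illustrative Python simulation, not from the thread, confirms the gap between these two quantities:)

```python
import random

rng = random.Random(1)
xs = [rng.uniform(-3, 3) for _ in range(200_000)]  # X ~ U(-3, 3)

# Unconditional: P(X + 1 <= -1/2) = P(X <= -3/2) = 1/4
p_shift = sum(x + 1 <= -0.5 for x in xs) / len(xs)

# Conditional: on -1 <= X <= 0 we have Y = X + 1 in [0, 1],
# so the event Y <= -1/2 is impossible there.
strip = [x for x in xs if -1 <= x <= 0]
p_cond = sum(x + 1 <= -0.5 for x in strip) / len(strip)

print(round(p_shift, 2))  # ≈ 0.25
print(p_cond)             # 0.0
```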
 
  • #5
You are, of course, correct!

You're obviously right. I can't believe I didn't detect such a boneheaded mistake sooner; thank you!
I see I should have written that equality, using the LTP, in this manner:
[tex]
P(Y \leq t) = \sum_i P(Y \leq t \cap X \in A_i) [/tex]
Where again the [itex] A_i [/itex] partition the domain of X (the regions defining Y).
I can therefore get, for one of the subtended regions:
[tex]
P(Y \leq t \cap -1\leq X\leq 0) = P(X \leq t-1 \cap -1 \leq X \leq 0)=
\left\{
\begin{align*}
F_X(t-1)-F_X(-1), & & 0 \leq t \leq 1 \\
1~~~~, && t>1 \\
0, && else
\end{align*}
\right.
[/tex]
But, for the other term:
[tex]
P(Y \leq t \cap 0\leq X\leq 1) = P(X \geq 1-t \cap 0 \leq X \leq 1)=
\left\{
\begin{align*}
F_X(1)-F_X(1-t), & & 0 \leq t \leq 1 \\
1~~~~, && t>1 \\
0, && else
\end{align*}
\right.
[/tex]
Here, it is already evident that when summing these results (as per the LTP), one would obtain [tex]\lim_{t \to \infty} F_Y(t) = 2 [/tex] rather than 1, violating a fundamental property of the CDF.
How do I correct this discrepancy?
Thanks,
Daniel
 
  • #6
danielakkerma said:
[tex]
P(Y \leq t \cap -1\leq X\leq 0) = P(X \leq t-1 \cap -1 \leq X \leq 0)=
[/tex]

But [itex] Y \leq t [/itex] is not the same event as [itex] X \leq t -1 [/itex]. So you can't equate the events [itex] Y \leq t \ \cap -1 \leq X \leq 0 [/itex] and [itex] X \leq t-1 \ \cap -1 \leq X \leq 0 [/itex].

Let [itex] Y [/itex] be a function of the random variable [itex] X [/itex]. Let the sets [itex] A_i [/itex] partition the domain of [itex] X [/itex].

Then [itex] P(Y \leq t) = \sum_{i=1}^n P(Y \le t \ \cap X \in A_i) [/itex]
[itex] = \sum_{i=1}^n P(Y \leq t| X \in A_i) P(X \in A_i) [/itex].

To compute [itex] P(Y \leq t | X \in A_i) [/itex] you can express the statement that defines [itex] Y \leq t [/itex] as an equivalent statement about [itex] X [/itex]. Then compute [itex] P(Y \leq t | X \in A_i) [/itex] as [itex] P(Y \leq t \ \cap X \in A_i)/ P(X \in A_i) [/itex].

For example, in your problem:

[itex] \{Y \leq 1/2\} = S = \{-3 \leq X \leq -1\} \cup \{ -1 \leq X \leq -1/2 \} \cup \{ 1/2 \leq X \leq 3 \} [/itex]

[itex] \{Y \leq 1/2\} \cap \{-1 \leq X \leq 0 \} = S \cap \{-1 \leq X \leq 0\} = \{-1 \leq X \leq -1/2\} [/itex]


The way such examples are usually solved is to express the event [itex] Y \le t [/itex] as an equivalent statement about [itex] X [/itex] being in one of a union of mutually exclusive sets [itex] B_i [/itex]. The sets [itex] B_i [/itex] depend on [itex] t [/itex]. Then the law of total probability is used to compute [itex] P(X \in (B_1 \cup B_2\cup...\cup B_n)) [/itex].

Thinking of [itex]Y[/itex] as a function that maps a set [itex]s[/itex] in the domain of [itex]X[/itex] to a set [itex]Y(s)[/itex] in the range of [itex]Y[/itex], the way to find [itex]P(Y \leq t)[/itex] is to find the probability of the preimage [itex]Y^{-1}((-\infty, t])[/itex] using the distribution of [itex]X[/itex].
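(A sketch of this preimage recipe in Python, reading the branch intervals off the definition of Y in post #1: F_Y(t) is computed exactly by summing the lengths of the X-intervals on which each branch satisfies Y ≤ t, then multiplying by the uniform density. The function name `F_Y` is my own.)

```python
def F_Y(t):
    """Exact F_Y(t) for X ~ U(-3, 3), via the preimage of (-inf, t].

    Each branch of Y contributes the X-interval on which that branch
    satisfies Y <= t; the uniform density on (-3, 3) is 1/6.
    """
    length = 0.0

    # Branch Y = X + 1 on [-1, 0]:  X + 1 <= t  <=>  X <= t - 1
    length += max(0.0, min(0.0, t - 1) - (-1.0))

    # Branch Y = 1 - X on [0, 1]:  1 - X <= t  <=>  X >= 1 - t
    length += max(0.0, 1.0 - max(0.0, 1 - t))

    # Branch Y = 0 on [-3, -1] U [1, 3]: contributes iff t >= 0
    if t >= 0:
        length += 4.0

    return length / 6.0

print(F_Y(-0.5))  # 0.0
print(F_Y(0.5))   # 5/6
print(F_Y(2.0))   # 1.0
```

Note how the jump of size 2/3 at t = 0 (the atom at Y = 0) falls out automatically from the third branch, which is exactly the mass the erroneous derivation in post #5 double counted.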
 
  • #7
Now, it's finally clear!

Stephen,
Thanks again for your patient and diligent aid here! It's finally dawned on me (and I'm sorry it has taken so long).
I now see that I should have accounted for the various values Y ≤ t could take, irrespective of X; and obviously, as you point out, the intersection between the Ys and Xs would not necessarily limit Y itself to any particular domain (as a function of X).
Thinking of Y as a function that maps a set s in the domain of X to a set Y(s) in the range of Y, the way to find P(Y ≤ t) is to find the probability of the set Y^{-1}({Y ≤ t}) using the distribution of X.
This is what I have been doing hitherto, but I was hoping I could find a more analytical method to compute these sets, especially when the transformations are not quite so trivial (and may involve multi-valued inverses).
Still, I quite see now where I was mistaken.
Thanks again for all your help!
Daniel
 
  • #8
danielakkerma said:
but I was hoping I could find a more analytical method to compute these sets, especially when the transformations are not quite so trivial (and may involve multi-valued inverses).

Don't completely give up on that goal. I don't know what progress can be made, but if you find something, it would be a great service to mathematical humanity. Maybe the cure for the problems of transformations is yet more transformations.
 

FAQ: Random variables: Total probability, Transformations & CDFs

1. What is a random variable?

A random variable is a numerical quantity that takes on different values based on the outcome of a random event. It can be discrete, taking on only specific values, or continuous, taking on any value within a range.

2. What is total probability?

The law of total probability computes the probability of an event by conditioning on a partition of the sample space: if the events B_i are mutually exclusive and exhaustive, then P(A) = Σ_i P(A | B_i) P(B_i). It relies on the fact that the probabilities of the partitioning events sum to 1.

3. What are transformations of random variables?

A transformation of a random variable applies a function to it, producing a new random variable. Simple examples include changing scale or units (multiplying by or adding a constant); more general examples include taking a logarithm or applying a piecewise function, as in the thread above. The distribution of the transformed variable is then derived from that of the original.

4. What is a cumulative distribution function (CDF)?

A cumulative distribution function (CDF) is a function that maps every possible value of a random variable to its cumulative probability. In other words, it gives the probability that the variable takes on a value less than or equal to a specific value.

5. How are CDFs used in statistics?

CDFs are useful in calculating probabilities and making inferences about the distribution of a random variable. They can also be used to generate random samples from a particular distribution. Additionally, CDFs can be used to compare two or more random variables or to assess the fit of a theoretical distribution to a set of observed data.
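(To illustrate the sampling use mentioned above, here is a standard inverse-transform sketch in Python; the exponential distribution is chosen as an example because its CDF inverts in closed form. The function name `sample_exponential` is illustrative.)

```python
import math
import random

def sample_exponential(lam, rng):
    """Inverse-CDF (inverse transform) sampling: for U ~ U(0, 1),
    X = -ln(1 - U) / lam has CDF F(x) = 1 - exp(-lam * x)."""
    u = rng.random()
    return -math.log(1 - u) / lam

rng = random.Random(42)
samples = [sample_exponential(2.0, rng) for _ in range(100_000)]
print(sum(samples) / len(samples))  # ≈ 1/lam = 0.5
```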
