Transformation of random variable

philipp_w · Sep 13, 2012

Hi there,

I am currently reading Rohatgi's book "An introduction to probability and statistics" (http://books.google.de/books?id=IMbVyKoZRh8C&lpg=PP1&hl=de&pg=PA62#v=onepage&q&f=true). My question concerns the "technique" of finding the PDF of a random variable Y obtained by transforming another one with a function, let's call it g. So we have Y=g(X), where X is a random variable of continuous type.

There is a method where you first calculate the DF of Y, let us call it F_y, and then obtain the PDF of Y by differentiating F_y.

There are some special cases where g is differentiable and strictly increasing, but the tricky part is where g does not fulfil such conditions.

My question: In most of the more complicated examples presented in textbooks, one determines F_y and then differentiates it to obtain the PDF, but what tells me in advance that F_y is in fact an (absolutely) continuous DF? In my opinion, this technique is more like a guess to obtain a "possible" PDF.

How can one be sure, depending on the properties of g, that F_y is an (absolutely) continuous DF if Y=g(X)? This differentiation technique does not work in the general case, am I right? To put it in other words: of course one can differentiate F_y, but nothing guarantees in advance that F_y has an integral representation with a continuous function under the integral sign, or that differentiating F_y yields the PDF we are looking for.
What is the correct argument, or where is my misunderstanding, so that there is no longer a contradiction with the argument in the textbook cited above?

Thanks in advance and best regards
Philipp

chiro · Sep 13, 2012

Hey philipp_w and welcome to the forums.

There is the requirement that the derivatives exist across the domain of the random variable. Basically, if the original random variable is continuous, its CDF is continuous and differentiable, and the transformation function (g in Y = g(X)) is differentiable, then all the results will follow.

The transformation theorem is not a guess: it is done systematically. In my own statistics book, the proof is along the following lines:

Let U = h(Y) and we know R.V. Y with pdf f(y) and function h. Then

P(U <= u) = P(h(Y) <= u) = P(h^-1(h(Y)) <= h^-1(u)) = P(Y <= h^-1(u)) which means
P(U <= u) = G(u) = F(h^-1(u)) where F is the CDF for the Y random variable.

So we differentiate both sides to get the PDF and the result follows.

So as long as we have differentiability at every step and the relevant inverse exists (here h is taken to be strictly increasing, so that applying h^-1 preserves the inequality), everything follows mathematically, since the pure mathematicians have already proved all this stuff beforehand.
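Written out, the differentiation step can be sketched as follows. This is only a sketch under assumptions that are not spelled out in the post: h is strictly monotone with a differentiable inverse, F is differentiable with F' = f at the point in question, and the lowercase g for the PDF of U is notation introduced here for illustration.

[tex]\begin{align} h \text{ increasing:} \quad G(u) &= F(h^{-1}(u)) & g(u) = G'(u) &= f(h^{-1}(u)) \, (h^{-1})'(u) \\ h \text{ decreasing:} \quad G(u) &= 1 - F(h^{-1}(u)) & g(u) = G'(u) &= -f(h^{-1}(u)) \, (h^{-1})'(u) \end{align}[/tex]

Since (h^-1)' is positive in the first case and negative in the second, both cases can be combined into g(u) = f(h^-1(u)) |(h^-1)'(u)|.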

philipp_w · Sep 14, 2012

chiro said:

Let U = h(Y) and we know R.V. Y with pdf f(y) and function h. Then

P(U <= u) = P(h(Y) <= u) = P(h^-1(h(Y)) <= h^-1(u)) = P(Y <= h^-1(u)) which means
P(U <= u) = G(u) = F(h^-1(u)) where F is the CDF for the Y random variable.

So we differentiate both sides to get the PDF and the result follows.

I will try to follow your notation:

Your treatments ended with the result

[itex]P(U \leq u ) = G(u) = F(h^{-1}(u))[/itex]

But this only helps if I know in advance that ##U## is of continuous type, meaning that [itex]P_{U} \ll \mathbf{\lambda}[/itex], where ##P_{U}## is the image measure of the probability measure ##P## defined on a suitable probability space and ##\mathbf{\lambda}## is the Lebesgue measure. This means that ##P_{U}(B) = \int_B g \, \mathrm{d} \mathbf{\lambda}## for all ##B \in \mathcal{B}(ℝ)##, where ##\mathcal{B}(ℝ)## is the Borel sigma-algebra over ##ℝ##. One needs an existence argument which guarantees that ## P(U \leq u ) = \int_{(-∞,u]} g \, \mathrm{d} \mathbf{\lambda}## holds or, with the Riemann integral, ## P(U \leq u ) =G(u)= \int_{-∞}^u \tilde{g}(x) \, \mathrm{d} x##, where ##g=\tilde{g}## ##\lambda##-almost everywhere.

So whenever I differentiate your last equation on both sides, I would indeed find ##G^{\prime}(u)=\tilde{g}(u)##, provided that ##\tilde{g}## is continuous. If I did not know that ##G(u)## has an integral representation with a continuous function ##\tilde{g}##, then differentiating both sides of your last equation gives me something, but not necessarily ##\tilde{g}##.
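To make this concern concrete, here is a minimal Python sketch of a case where differentiating the DF does not produce a PDF. The example is an assumption chosen for illustration and is not taken from Rohatgi's book: X is standard normal and the transformed variable is max(X, 0), which has an atom at 0. All function names are hypothetical.

```python
import math


def std_normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def std_normal_pdf(x: float) -> float:
    """PDF of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf_of_max(y: float) -> float:
    """DF of max(X, 0): zero for y < 0, equal to the normal CDF for y >= 0."""
    return 0.0 if y < 0.0 else std_normal_cdf(y)


def formal_derivative(y: float) -> float:
    """Pointwise derivative of cdf_of_max, which exists for every y != 0."""
    return std_normal_pdf(y) if y > 0.0 else 0.0


def midpoint_integral(func, a: float, b: float, n: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


# The DF jumps at 0, so max(X, 0) is not of continuous type.
jump_at_zero = cdf_of_max(0.0) - cdf_of_max(-1e-12)

# The derivative exists almost everywhere but integrates to about 0.5, not 1.
total_mass = midpoint_integral(formal_derivative, -10.0, 10.0)

print(jump_at_zero, total_mass)
```

The derivative is perfectly computable here, yet it misses the probability mass sitting at 0, so it cannot be the PDF of the transformed variable.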

I give you an example from Rohatgi's textbook:

Let ##X## be an RV with PDF

##f(x) = \frac{1}{\sqrt{2 \pi}} e^{-x^2/2}, \qquad -∞ < x < ∞##

Let ##Y=X^2##; for ##y > 0## one gets

[tex]\begin{align}P[{Y \leq y}] &= P[{-\sqrt{y} \leq X \leq \sqrt{y}}] \\ &= F(\sqrt{y}) - F(-\sqrt{y}) \end{align}[/tex]
where ##F## is the CDF of X. Rohatgi now says that differentiating on both sides gives the PDF of Y, but how does one know that ##Y## is of continuous type? Of course one can differentiate in a technical way, but what does it tell me about the PDF of Y, if I do not know in advance that Y is of continuous type?
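For this particular example the doubt can at least be checked numerically. The sketch below is an illustration, not the textbook's argument: it takes the candidate density obtained by differentiating, ##e^{-y/2}/\sqrt{2 \pi y}## for ##y > 0##, and verifies both that it matches the slope of the DF and that integrating it from 0 gives the DF back, which is the integral representation in question. The function names and the substitution used to tame the singularity at 0 are choices made here.

```python
import math


def std_normal_cdf(x: float) -> float:
    """CDF F of the standard normal X."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cdf_y(y: float) -> float:
    """DF of Y = X^2: F(sqrt(y)) - F(-sqrt(y)) for y > 0, else 0."""
    if y <= 0.0:
        return 0.0
    root = math.sqrt(y)
    return std_normal_cdf(root) - std_normal_cdf(-root)


def candidate_pdf(y: float) -> float:
    """Candidate density from differentiating cdf_y (valid for y > 0)."""
    return math.exp(-0.5 * y) / math.sqrt(2.0 * math.pi * y) if y > 0.0 else 0.0


def midpoint_integral(func, a: float, b: float, n: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def integral_of_candidate(y: float) -> float:
    """Integral of candidate_pdf over (0, y), using s = t*t to remove
    the integrable 1/sqrt(s) singularity at 0."""
    return midpoint_integral(
        lambda t: candidate_pdf(t * t) * 2.0 * t, 0.0, math.sqrt(y)
    )


for y in (0.1, 1.0, 4.0):
    step = 1e-5
    slope = (cdf_y(y + step) - cdf_y(y - step)) / (2.0 * step)
    # slope should agree with candidate_pdf(y), and the integral with cdf_y(y)
    print(y, slope, candidate_pdf(y), integral_of_candidate(y), cdf_y(y))
```

Agreement of the last two printed columns is the point: the DF really is the integral of the candidate density, so in this example the derivative is a genuine PDF.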

I hope that example made my problem clear; if not, I will try to explain it better.

chiro · Sep 14, 2012

If a function (i.e. the CDF) is analytic, and hence differentiable, then it's definitely continuous (and so is the PDF if this is the case).

The function g(x) = x^2 has that property as well.

Ultimately, the transformation theorem says that the function must have an inverse and that you need differentiability, so if you have those two then you're done.

If you have an analytic form of the CDF and you can differentiate it and show that the derivative exists and is finite across the right region, then you're done for the CDF. You also need to show that an inverse of the transformation exists for the region you are considering and that this inverse is also differentiable (again on that region).

For your example above, you need to remember that the inverse of g(x) = x^2 has two branches, so both need to be considered.

So remember to check the CDF for differentiability across the domain, and also to check that the transformation function has an inverse (you may need to split things up) and that the inverse is differentiable as well.

philipp_w · Sep 14, 2012

I think I got it now, let us try to formalize this a bit:

We only consider absolutely continuous RVs, but will use the Riemann integral for simplicity.
Let ##I \subseteq ℝ## be an open interval and ##X \colon \Omega \to I## an RV of continuous type. Let ##\Phi \colon I \to J## be continuously differentiable with ##\Phi^{\prime}(x) \neq 0## for all ##x \in I##. Then the distribution of ##\Phi(X)## is absolutely continuous with PDF

##f_{\Phi(X)}(y) = f_X(\Phi^{-1}(y)) \vert (\Phi^{-1})^{\prime}(y)\vert## for ##y \in \Phi(I)## and ##0## elsewhere.

Proof: Since ##\Phi^{\prime}## is continuous and never zero on the interval ##I##, the intermediate value theorem gives ##\Phi^{\prime} > 0## on ##I## or ##\Phi^{\prime} < 0## on ##I##. Consider the first case; then ##\Phi## is strictly increasing and is a bijection from ##I## onto ##\Phi(I)##. We perform the following equivalent manipulations:

## F_{\Phi(X)}(c) = P[{\Phi(X) \leq c}] = P[{X \leq \Phi^{-1}(c)}] = F_X(\Phi^{-1}(c))##

for all ##c \in \Phi(I)##. By the chain rule, ##F_{\Phi(X)}## is differentiable for almost all ##c \in \Phi(I)##, and one gets

##F^{\prime}_{\Phi(X)}(c) = f_X(\Phi^{-1}(c))(\Phi^{-1})^{\prime}(c)##

The statement now follows from the fundamental theorem of calculus and

##P[{\Phi(X) \notin \Phi(I)}] = 0##

In the case where a countable number of inverses for ##\Phi## exist, one has to partition the interval ##I## into subintervals such that the above conditions are satisfied by the restriction of ##\Phi## to each subinterval, and then use the countable additivity of the probability measure ##P## over these subintervals. See Remark 3 in Section 2.5 of Rohatgi's book for further information.
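The partition argument can be sketched in Python as a sum of one change-of-variables term per subinterval. This is a minimal illustration under assumptions made here: there are finitely many branches, each with a known inverse and a known derivative of that inverse. The `Branch` class and the function names are hypothetical, and the test case is ##Y = X^2## with standard normal ##X##.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Branch:
    """Restriction of the transformation to one subinterval of I."""
    inverse: Callable[[float], float]             # inverse of the restriction
    inverse_derivative: Callable[[float], float]  # derivative of that inverse
    image_low: float                              # image of the subinterval
    image_high: float


def transformed_pdf(
    pdf_x: Callable[[float], float], branches: Sequence[Branch], y: float
) -> float:
    """Sum f_X(inv(y)) * |inv'(y)| over all branches whose image contains y."""
    return sum(
        (
            pdf_x(b.inverse(y)) * abs(b.inverse_derivative(y))
            for b in branches
            if b.image_low < y < b.image_high
        ),
        0.0,
    )


def std_normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


# Y = X^2: split the real line into (-inf, 0) and (0, inf); both map onto (0, inf).
square_branches = [
    Branch(lambda y: -math.sqrt(y), lambda y: -0.5 / math.sqrt(y), 0.0, math.inf),
    Branch(lambda y: math.sqrt(y), lambda y: 0.5 / math.sqrt(y), 0.0, math.inf),
]


def closed_form_pdf(y: float) -> float:
    """Density exp(-y/2) / sqrt(2*pi*y) of Y = X^2 for y > 0."""
    return math.exp(-0.5 * y) / math.sqrt(2.0 * math.pi * y)


for y in (0.25, 1.0, 3.0):
    print(y, transformed_pdf(std_normal_pdf, square_branches, y), closed_form_pdf(y))
```

The single point 0 left out of the partition has probability zero for an RV of continuous type, so it does not affect the result.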

I hope everything is correct now.

chiro · Sep 14, 2012

That sounds right (of course you need to consider not only intervals where the function is monotonically increasing but also ones where it is monotonically decreasing).

In the Riemann-integrable setting, the integral and the measure are pretty simple. One thing to keep in mind, though, is the case where the transformation and the PDF are continuous but not differentiable: this is going to complicate things when it comes to differentiation (since you lose the analytic behaviour that you have with Riemann-style functions and measures).

So if you are looking at general spaces and measures as opposed to the nice and specific Riemann type problems, then this is going to require a more detailed and rigorous approach because of the generalities that follow.

But if you are considering the Riemann style, then this pretty much just clarifies the less formal proof given above, while taking into account partitions with several inverses as opposed to having just one inverse.

philipp_w · Sep 15, 2012

chiro said:

So if you are looking at general spaces and measures as opposed to the nice and specific Riemann type problems, then this is going to require a more detailed and rigorous approach because of the generalities that follow.

Do you know any source, e.g. a textbook, which treats this more general problem? I would like to consider the Lebesgue integral, where this differentiation technique finally reaches its limits ;)

chiro · Sep 15, 2012

I'm sorry I don't know much about books that cover these general spaces.
