# Transformation of Random Variables

1. Apr 21, 2012

### jimbobian

Ok, so I have this written in my notes and while going over it I have a few questions:

Suppose cubical boxes are made so that the length, X (in cm) of an edge is distributed as

$$f(x)=\begin{cases}\frac{1}{2} & \text{for } 9 \le x \le 11 \\ 0 & \text{otherwise}\end{cases}$$

What sort of distribution will the volume, Y (in cm^3), of the boxes have?

So in my notes it says to do this:

$$F_Y(y) = P(Y \le y) = P(X^3 \le y) = P(X \le y^{1/3}) = F_X(y^{1/3})$$

But why is it not possible to go straight from the PDF of X to the PDF of Y, using the same technique of substituting $X^3$ for Y like so:

$$f_Y(y) = P(Y = y) = P(X^3 = y) = P(X = y^{1/3}) = f_X(y^{1/3})$$

Have put this here, because it isn't a homework question, more a general question that I've come across while revising but by all means move it if you disagree!

Last edited: Apr 21, 2012
2. Apr 21, 2012

### jimbobian

Pressed submit by accident, haven't finished writing the OP!

Edit: Finished now!

Last edited: Apr 21, 2012
3. Apr 21, 2012

### Stephen Tashi

This is a good question and an important one. (You see that the two methods produce different answers, right?)

The PDF of a discrete random variable X can be interpreted as "the probability that ...", but it is technically incorrect to interpret it this way for a continuous distribution. $f_X(x)$ is not $P(X = x)$. Often you can get away with thinking of continuous PDFs the wrong way and still get the right formulas. It's rather like how people think of $\frac{dy}{dx}$ as the ratio of two finite numbers and this helps them remember formulas in calculus. Thinking the wrong way is often helpful but it has pitfalls.

A better way to think is that $f_X(x)$ is a function that is one factor in an expression that approximates the probability for $X$ being in an interval. For example, $P(x - dx \le X \le x + dx) \approx f_X(x) 2 dx$, thinking of $dx$ as a finite length.

If we approach this problem by reasoning with PDFs, we must use intervals and things don't look simple.

$$P( y - dy \le Y \le y + dy) = P( y - dy \le X^3 \le y+ dy)$$
$$= P( (y-dy)^\frac{1}{3} \le X \le (y+dy)^\frac{1}{3})$$
$$= \int_{(y-dy)^\frac{1}{3}}^{(y+dy)^\frac{1}{3}} f_X(x) dx$$
$$= \left. \frac{1}{2} x \, \right|_{(y-dy)^\frac{1}{3}}^{(y+dy)^\frac{1}{3}}$$

Perhaps we can do more manipulations with the dy's and dx's to get to the right answer. At least this suggests that plugging $x = y^\frac{1}{3}$ into $f_X(x)$ isn't likely to work.
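One way those manipulations might go, as a sketch rather than a rigorous argument: assume $dy$ is small, that the whole interval stays inside $729 \le y \le 1331$ (so $f_X = \frac{1}{2}$ throughout), and use the linear approximation $(y \pm dy)^\frac{1}{3} \approx y^\frac{1}{3} \pm \frac{1}{3} y^{-\frac{2}{3}} dy$.

$$\left. \frac{1}{2} x \, \right|_{(y-dy)^\frac{1}{3}}^{(y+dy)^\frac{1}{3}} = \frac{1}{2}\left( (y+dy)^\frac{1}{3} - (y-dy)^\frac{1}{3} \right) \approx \frac{1}{2} \cdot \frac{2}{3} y^{-\frac{2}{3}} dy = \left( \frac{1}{6} y^{-\frac{2}{3}} \right) 2\, dy$$

Comparing with $P(y - dy \le Y \le y + dy) \approx f_Y(y)\, 2\, dy$ suggests

$$f_Y(y) = \frac{1}{6} y^{-\frac{2}{3}} \quad \text{for } 729 \le y \le 1331,$$

which agrees with differentiating $F_X(y^\frac{1}{3}) = \frac{1}{2}(y^\frac{1}{3} - 9)$ with respect to $y$. The direct substitution would give $f_Y(y) = \frac{1}{2}$ on that range, and $\frac{1}{2}(1331 - 729) = 301$, so it does not even integrate to 1.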

The question is related to a question from calculus: When we make the substitution x = g(y) in an integration, why can't we just change dx to dy? Why does the substitution involve a g'(y) ?
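A quick numerical check can make the difference between the two candidate densities concrete. The sketch below is only illustrative: the function names (`f_Y`, `naive_f_Y`, `empirical_density`, `riemann_integral`), the sample size, and the window width are all assumptions, and the formula $\frac{1}{6} y^{-2/3}$ is what the CDF method from the notes gives when $F_X(y^{1/3})$ is differentiated (the extra factor is exactly the $g'(y)$ from the substitution $x = g(y) = y^{1/3}$). It simulates box volumes and compares the fraction landing in a small interval against both formulas.

```python
import random

def f_Y(y):
    """Density of Y = X**3 from the CDF method, with X uniform on [9, 11]."""
    if 729.0 <= y <= 1331.0:
        return y ** (-2.0 / 3.0) / 6.0
    return 0.0

def naive_f_Y(y):
    """The tempting but incorrect direct substitution f_X(y**(1/3))."""
    if 729.0 <= y <= 1331.0:
        return 0.5
    return 0.0

def empirical_density(samples, y, half_width):
    """Fraction of samples within half_width of y, per unit length."""
    hits = sum(1 for s in samples if y - half_width <= s <= y + half_width)
    return hits / (len(samples) * 2.0 * half_width)

def riemann_integral(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Simulate edge lengths X ~ Uniform(9, 11) and cube them to get volumes.
rng = random.Random(0)  # fixed seed so the run is repeatable
volumes = [rng.uniform(9.0, 11.0) ** 3 for _ in range(200_000)]

for y in (800.0, 1000.0, 1200.0):
    print(y, empirical_density(volumes, y, 10.0), f_Y(y), naive_f_Y(y))

# A valid PDF must integrate to 1 over its support.
print("integral of f_Y:      ", riemann_integral(f_Y, 729.0, 1331.0))
print("integral of naive f_Y:", riemann_integral(naive_f_Y, 729.0, 1331.0))
```

The empirical column should track `f_Y` closely and sit far below the constant 0.5, and only `f_Y` integrates to about 1 over the range of possible volumes.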