# Probability density of a new variable

Gold Member

## Homework Statement

Let X be a continuous random variable with parameters $\langle x \rangle$ and $\sigma$.
Calculate the probability density of the variable Y=exp(X). Calculate the mean and the variance of Y.

## Homework Equations

Reichl's 2nd edition book page 180:
$P_Y(y)=\int _{-\infty}^{\infty} \delta (y-H(x))P_X(x)dx$ where $Y=H(X)$, so I guess $H(x)=\exp(x)$.

## The Attempt at a Solution

I used the given formula and reached $P_Y(y)=P_X (\ln y)$. Is this the correct answer?
I don't really know how to calculate the mean, $\langle y \rangle$. I guess they want it as a function of the mean of X, namely $\langle x \rangle$.
I know that $\langle x \rangle=\int _{-\infty}^{\infty} xP_X(x)dx$ and that $\langle y \rangle=\int _{0}^{\infty} yP_Y(y)dy$. But this does not help me much. Any idea on how to continue further?

Ray Vickson
Homework Helper
Dearly Missed

I hate using canned formulas without thoroughly understanding them, so I prefer to proceed from first principles. In this case, Y > 0 because exp(X) is never negative or zero. So, given a value y > 0 we have P{Y <= y} = P{exp(X) <= y} = P{X <= ln(y)} = F(ln(y)), where F is the cumulative distribution of X. That is, FY(y) = FX(ln(y)). Now get the density of Y as fY(y) = dFY(y)/dy, which you can express in terms of fX. Now, even if we know m = EX and σ^2 = Var(X), that does not help: different fX, all having the same mean and variance, can give different values for EY and Var(Y). However, we can express EY and Var(Y) in terms of integrations involving y and fX(ln y).

RGV
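
The claim that the mean and variance of X alone do not fix EY can be checked numerically. The sketch below is only an illustration under assumed choices that do not come from the problem statement: a standard normal X and a uniform X on $[-\sqrt{3}, \sqrt{3}]$, both with mean 0 and variance 1. The helper names (`midpoint_integral`, `mean_of_exp`) are made up for this example.

```python
import math


def midpoint_integral(g, a, b, n=200_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def normal_pdf(x):
    """Density of a standard normal variable (assumed example, mean 0, variance 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


# Uniform on [-sqrt(3), sqrt(3)] also has mean 0 and variance 1 (assumed example).
half_width = math.sqrt(3.0)


def uniform_pdf(x):
    """Density of the uniform variable on [-half_width, half_width]."""
    return 1.0 / (2.0 * half_width) if -half_width <= x <= half_width else 0.0


def mean_of_exp(pdf, a, b):
    """E[exp(X)] computed as the integral of exp(x) * pdf(x) over [a, b]."""
    return midpoint_integral(lambda x: math.exp(x) * pdf(x), a, b)


# The normal tails beyond |x| = 12 are negligible for this integral.
mean_exp_normal = mean_of_exp(normal_pdf, -12.0, 12.0)
mean_exp_uniform = mean_of_exp(uniform_pdf, -half_width, half_width)

print("E[exp(X)], normal :", mean_exp_normal)
print("E[exp(X)], uniform:", mean_exp_uniform)
```

The two printed values differ even though both densities share the same mean and variance, which is why EY and Var(Y) can only be written as integrals over fX unless the distribution of X is specified.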

Gold Member
First let me thank you very much Ray for all the help so far.

> I hate using canned formulas without thoroughly understanding them, so I prefer to proceed from first principles. In this case, Y > 0 because exp(X) is never negative or zero. So, given a value y > 0 we have P{Y <= y} = P{exp(X) <= y} = P{X <= ln(y)} = F(ln(y)), where F is the cumulative distribution of X. That is, FY(y) = FX(ln(y)).

Ok, this is what I've got, namely that $P_Y(y)=P_X (\ln y)$, using Dirac's delta.

> Now get the density of Y as fY(y) = dFY(y)/dy, which you can express in terms of fX.

I am a bit lost here. Why do you take the derivative?

> Now, even if we know m = EX and σ^2 = Var(X), that does not help: different fX, all having the same mean and variance, can give different values for EY and Var(Y). However, we can express EY and Var(Y) in terms of integrations involving y and fX(ln y).

Okay, I'll try once I understand the previous point!

Ray Vickson
Homework Helper
Dearly Missed
Why the derivative? Well, F is the cumulative distribution, so to get the density you need to differentiate it! Surely you must know this.

RGV
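
For reference, the differentiation step can be written out with the chain rule. This is a short worked version in the $F_X$, $f_X$ notation used above, assuming $y>0$ and that $F_X$ is differentiable:

$$f_Y(y)=\frac{d}{dy}F_Y(y)=\frac{d}{dy}F_X(\ln y)=f_X(\ln y)\,\frac{d\ln y}{dy}=\frac{1}{y}\,f_X(\ln y),\qquad y>0,$$

with $f_Y(y)=0$ for $y\le 0$. The factor $1/y$ comes entirely from differentiating $\ln y$.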

Gold Member
I had never read the term "cumulative distribution" before, nor does it appear in my textbook. But apparently it's "just" the distribution function and I had never seen the relationship between it and the probability density!
So it means that what I've done using the definition in the book is wrong. I'll work on this tomorrow as I must wake up in a few hours.
Thanks for the lesson.

Ray Vickson
Homework Helper
Dearly Missed
It is true that the modifier 'cumulative' is often dropped nowadays, and we just say 'distribution function', but I was not sure what convention you or your book were following. However, I am shocked that you have not been exposed to the relation between the distribution function F and the density function f; that is absolutely one of the basic probability relations.

What you did might not have been wrong; maybe your book uses a different notation or terminology from everyone else. Rather than relying on words and names, you need to look instead at ideas and concepts.

RGV

Gold Member
Ok, I have more time over the next few days for these problems.
I've checked my textbook; now I see the difference between the distribution function and the probability density, and I also realize that the latter is the derivative of the former.
However, even though the book uses that same terminology, it still clearly says

> Often we wish to find the probability density, not for the stochastic variable, X, but for some new stochastic variable, Y=H(x), where H(x) is a known function of X. The probability density, $P_Y(y)$, for the stochastic variable, Y, is defined as $P_Y(y)=\int _{-\infty}^{\infty} \delta (y-H(x))P_X(x)dx$

And a few pages before,

> Note that the probability density function is just the derivative of the distribution function, $P_X(x)=dF_X(x)/dx$

So the book uses the same notation as yours. Yet I found (using the cookbook formula) $P_Y(y)=P_X (\ln y)$, while you found $F_Y(y) = F_X(\ln y)$, which are two different things.
I guess I haven't applied the Dirac delta correctly, then? Or is the textbook wrong on that formula? Since my professor gave it in the course, that seems rather strange, though possible of course.

Ray Vickson
Homework Helper
Dearly Missed
You need to be very, very careful about using things like $\delta(y - H(x))$ in an x-integral; that is why you obtained the wrong answer. Essentially, you need to change variables to z = H(x), with
$$dz = H'(x) \, dx \longrightarrow dx = \frac{dz}{H'(H^{-1}(z))}.$$
For H(x) = exp(x), this implies
$$dz = z \, dx \longrightarrow dx = \frac{dz}{z}.$$
We thus get
$$\int \delta(y - H(x)) f_X(x) \, dx = \int \delta(y - z) f_X(\ln z) \frac{dz}{z} = \frac{1}{y} f_X(\ln y).$$
This is exactly what you get if you apply the method I suggested before. It disagrees with your computation.

Because use of the δ is so tricky, I prefer to avoid it like the plague. However, use it if you want, but be careful.

RGV
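
A quick numerical sanity check shows why the factor $1/y$ matters: only the version with the Jacobian integrates to 1. The sketch below assumes a standard normal X purely for illustration (the problem does not specify the distribution), truncates the y-integral at an arbitrary upper limit of 200, and uses made-up helper names.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian density, used here only as an assumed example for f_X."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def midpoint_integral(g, a, b, n):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def with_jacobian(y):
    """Candidate density f_X(ln y) / y, as obtained from the change of variables."""
    return normal_pdf(math.log(y)) / y


def without_jacobian(y):
    """Candidate f_X(ln y), i.e. the delta function applied without the 1/y factor."""
    return normal_pdf(math.log(y))


# Midpoints are strictly positive, so log(y) is always defined.
upper = 200.0  # truncation point; the neglected tail is tiny for a standard normal X
total_with = midpoint_integral(with_jacobian, 0.0, upper, 400_000)
total_without = midpoint_integral(without_jacobian, 0.0, upper, 400_000)

print("integral of f_X(ln y)/y:", total_with)
print("integral of f_X(ln y)  :", total_without)
```

Under these assumptions the first integral comes out close to 1, as a probability density must, while the second does not, so $f_X(\ln y)$ on its own cannot be the density of Y.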

Gold Member
Ok, thank you infinitely for this. I've just played around with Dirac's delta and found an "intuitive way" (for me at least) not to fall into the trap I fell into.
If you're interested, here is how I think about it: $\int _{-\infty} ^{\infty} \delta (y-e^{x})P_X(x)dx$. Now I need to get the argument of the Dirac delta equal to 0. This implies $x=\ln y$. If I replace x by ln y in the expression, the dx becomes $d \ln y = \frac{dy}{y}$, so that the integral is simply worth $\frac{1}{y}P_X(\ln y)=P_Y(y)$.
Edit: Ok this would answer the first question.
For the mean value, I get that $\langle Y \rangle =\int _{-\infty} ^{\infty} y P_Y(y)dy= \int _0 ^{\infty} P_X (\ln y ) dy$. If I'm not wrong this is worth 1??!! Because the distribution is supposed to be normalized. (details: $\int _0 ^{\infty} P_X (\ln y ) dy=\int _{-\infty}^{\infty}P_X(y)dy$ but since y is a dummy variable, this is equal to $\int _{-\infty}^{\infty}P_X(x)dx$ and I'm sure this is worth 1.)

Ray Vickson
Homework Helper
Dearly Missed
That is exactly what I did above.

RGV

Gold Member
Yes I know :) I kind of reformulated what you wrote into my own "words".
What do you think about the mean value I found?
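
On the mean value, a worked check of the last substitution may help. It uses only the result $P_Y(y)=\frac{1}{y}P_X(\ln y)$ from this thread; the Gaussian case at the end is an added assumption, since the problem statement does not say which distribution X follows.

$$\langle Y \rangle=\int_0^{\infty} y\,P_Y(y)\,dy=\int_0^{\infty} P_X(\ln y)\,dy .$$

Substituting $y=e^{x}$ gives $dy=e^{x}\,dx$, so

$$\int_0^{\infty} P_X(\ln y)\,dy=\int_{-\infty}^{\infty} e^{x}\,P_X(x)\,dx=\langle e^{X}\rangle .$$

The factor $e^{x}$ from $dy$ survives, so this is not the normalization integral and is not equal to 1 in general. It is the same Jacobian trap as with the delta function. In the same way,

$$\langle Y^{2}\rangle=\int_{-\infty}^{\infty} e^{2x}\,P_X(x)\,dx,\qquad \operatorname{Var}(Y)=\langle Y^{2}\rangle-\langle Y\rangle^{2}.$$

If X is assumed Gaussian with mean $\langle x\rangle$ and standard deviation $\sigma$, these integrals evaluate to

$$\langle Y\rangle=\exp\!\left(\langle x\rangle+\tfrac{\sigma^{2}}{2}\right),\qquad \operatorname{Var}(Y)=\left(e^{\sigma^{2}}-1\right)\exp\!\left(2\langle x\rangle+\sigma^{2}\right).$$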