Probability density function of a function of a random variable

1. Sep 25, 2011

Sameer_1987

Hello everyone!
I am stuck in my research with a probability density function problem..
I have 'Alpha' which is a random variable from 0-180. Alpha has a uniform pdf equal to 1/180.
Now, 'Phi' is a function of 'Alpha' and the relation is given by,
Phi = (-0.000001274370471*Alpha^4) + (0.000393888213304*Alpha^3) - (0.037767210716686*Alpha^2) + (1.66089231837634*Alpha) - 5.83257204679921
I am trying to find the PDF of 'Alpha' but having a hard time in doing that..

Any help will be greatly appreciated!

Thanks.

2. Sep 25, 2011

Bacle

Do you mean the PDF of Phi? I understood you to have said alpha was uniform.

3. Sep 26, 2011

Sameer_1987

Oh yes, I meant PDF of Phi...sorry for that typing mistake..I am trying to find the PDF of Phi.

4. Sep 26, 2011

chiro

Hey Sameer_1987 and welcome to the forums.

Before attempting to find an analytic or even piece-wise analytic representation of your PDF, you should try using simulation to get an idea of what the PDF looks like.

In terms of getting an analytic version of the PDF, your best bet is to use two tools: the first is the transformation theorem that allows you to get the PDF of a transformed variable, and the second tool is the convolution theorem.

The convolution theorem will be used if you have say two random variables X and Y, then the PDF of P(X + Y < s) is given by taking the convolution of PDF(X) with PDF(Y).

What you are probably better off doing is to break up your transformation into parts, like changing Phi = (-0.000001274370471*Alpha^4) + (0.000393888213304*Alpha^3) - (0.037767210716686*Alpha^2) + (1.66089231837634*Alpha) - 5.83257204679921 into
Phi = A + B + C + D + E and using the convolution theorem to get the CDF of the sum of all the variables and hence the PDF using differentiation.

But before doing anything analytic, do a decent simulation and get a visual idea of what the distribution should look like. I would try 10,000 or 50,000 samples at the minimum which should give at least a rough idea of the distribution. You can do this quite easily in R using the command runif (type "runif" in R to get help for supplying arguments).

5. Sep 26, 2011

Sameer_1987

Hi Chiro, Thanks for the reply

I have only one random variable(i.e Alpha) ? Can I still use the convolution theorem?
Also, my Alpha is a random variable between 0-180. So I am not sure if I can get a sample of about 10,000.

Thanks!

6. Sep 26, 2011

Sameer_1987

Hi Bacle,
Oh yes, I meant PDF of Phi...sorry for that typing mistake..I am trying to find the PDF of Phi.

7. Sep 26, 2011

Stephen Tashi

No, it doesn't apply.

Are you saying that Alpha takes only discrete values? 0,1,2...,180? Even if it does, why would that prevent you from taking a sample of 10,000? There's nothing wrong with repeated values in a sample of 10,000.

8. Sep 26, 2011

Sameer_1987

Right, I can have 10,000 values or more between 0-180....Thanks!

9. Sep 26, 2011

Sameer_1987

Can you tell me how I can find out, what kind of distribution I have?..I mean with the given expression of phi in terms of Alpha(which is a random variable between 0-180), can I find out what kind of distribution will phi have?

Thank you!

10. Sep 26, 2011

Stephen Tashi

Do you understand the basic idea of how to do it?

Graph the function

$$\Phi(\alpha) = (-0.000001274370471*\alpha^4) + (0.000393888213304*\alpha^3) - (0.037767210716686*\alpha^2) + (1.66089231837634*\alpha) - 5.83257204679921$$

on the interval $[0, 180.0]$.

(I'm sure you're not the first person who encountered this type of problem and people may developed slick ways of handling it. What I describe is what you do if you just put one foot in front of the other. )

Let $F(y)$ be the cumulative distribution of $\Phi$. To find a the value of $F(y)$ at particular value of $y$ draw a horizontal line through (0,y) on your graph. Look at the parts of the graph that fall below that line. Look at the points on the horizontal axis that are the $\alpha$ values corresponding to those parts of the graph. In your case, these points on the horizontal axis will form one or more intervals. The value of F(y) is the probabilty that $\alpha$ falls in one of these intervals. Since $\alpha$ is uniformly distributed, you can find the probability by adding up the total length of the intervals and multiplying it by (1/180).

To find the density of $\Phi$ you would compute (or approximate) the derivative of $F(y)$.

Now suppose we try to do the above process analytically. To find $F(y)$ we must solve the inequality $\Phi(\alpha) \leq y$. Since we know the answer will be intervals, it is sufficient to solve the equation $\Phi(\alpha) = y$ in order to find their endpionts. I think this type of equation is called a "quartic" and you can look up a formula for solving it.

Since you are going to have various cases ( on interval, two intervals), I suspect the function $F(y)$ is not going to be given by a simple algebraic formula. It will probably be defined on several intervals with different formula applied on each of them.

11. Sep 26, 2011

Sameer_1987

Thank you Stephen! This should be really helpful.

12. Sep 26, 2011

chiro

The reason I suggested convolution is basically so that you break up your polynomial into parts that are easier to deal with and then use convolution on those distributions to get your final CDF and hence your PDF.

For example you have a polynomial ax^4 + bx^3 and so on. What I suggest is you look at each term separately and then use convolution to get the addition of terms.

13. Sep 27, 2011

Mute

I don't think that will work. That convolution theorem requires that the random variables are independent, which is not the case here. It is the same instance of the random variable in all terms of the polynomial.

The easiest way to do it analytically is what Stephen Tashi suggested, I think. Given $\phi = \Phi(\alpha)$, then if you know the CDF for alpha is $F(\alpha)$, the CDF of phi is just $F(\alpha = \Phi^{-1}(\phi))$. The trick here, as has been suggested, is that the function $\Phi(\alpha)$ is not 1 to 1, so the inverse function will have different branches, and you will have to stitch them together somehow to get the true distribution function. The numerical simulation will come in handy for trying to stitch the roots together.

And... it might not be pretty. Solving $\phi = \Phi(\alpha)$ for alpha using wolframalpha gives some ugly-looking roots. Of course, wolframalpha it hasn't simplified the expression much - there are factors with large positive powers and large negative powers that could probably be simplified somewhat. Maybe if I had used mathematica it could be massaged somewhat.

14. Sep 27, 2011

Bacle

Hi, Sameer:

Just a technical point, Re Tashi's comment about finding a solution for a quartic:

there is no general formula for sorving a quartic; it is

a result by Galois that there is no general method for an n-th degree equation

for n>3 , tho I think if you find the Galois group of the polynomial, you may be able

to find the roots.

15. Sep 27, 2011

chiro

I see what you are saying and I agree, but even then inverting a polynomial with the right cuts in the domain is gonna be "interesting" to say the least.

I still think though that the OP should do some simulation to get a feel of what the distribution looks like. They could use that to verify that the PDF that they calculate is correct. They could even use some kind of interpolation scheme to get an PDF that is a good enough approximation.

If you are out there OP, let us know how you go.

16. Sep 27, 2011

Sameer_1987

Thank you Gentlemen for your inputs! I will work on this and keep you updated.

Thanks again.

17. Oct 2, 2011

Sameer_1987

Hi Stephen,

By 'y' do you mean 'phi'? I have attached a jpeg file which shows what I have done. I am not sure how to use the 'Alpha' intervals that I have got. Also, I think I can solve the quartic equation using the 'depressing the quartic' method that I read somewhere recently.

Thanks!

Attached Files:

• Untitled.jpg
File size:
46.8 KB
Views:
120
18. Oct 2, 2011

Stephen Tashi

Phi is a particular function of alpha. Y is an arbitrarily selected value on the vertical axis.

To find the cumulative density F(y) of Phi, you must find formulae that give an answer for all possible values of y.

The particular y = 45 that you selected shows that

$$F(45) = \frac{ (95 - 0) + (180 - 175) } { 180 }$$

19. Oct 2, 2011

Sameer_1987

Hi Chiro,

I calculated a histogram of phi in 'R' language using the runif function as,

Alpha=runif(50000,0,180)

Phi=(−0.000001274370471*Alpha^4)+(0.000393888213304*Alpha^3)−(0.037767210716686*Alpha^2)+(1.66089231837634*Alpha)−5.83257204679921

Hist(Phi, breaks=50)

and got a histogram. I have attached the image of the histogram. I am not sure what kind of distribution that is or which kind of PDF distribution will give the closest answer. But maybe by analytically evaluating the PDF, I will get a more clear picture.

Thanks

20. Oct 2, 2011

Sameer_1987

I think I forgot to attach the histogram image file in the last message.

Attached Files:

• Untitled.jpg
File size:
12.4 KB
Views:
118
21. Oct 3, 2011

Sameer_1987

Hi Mute,

So, are you saying to find the CDF of 'Phi' I need to calculate the inverse of ,
Φ(α)=(−0.000001274370471∗α^4)+(0.000393888213304∗α^3)−(0.037767210716686∗α^2)+(1.66089231837634∗α)−5.83257204679921 ?

Thanks.

22. Oct 20, 2011

Sameer_1987

Hi guys,
I found a simpler way to get the pdf. I generated some 100,000 random values of Alpha between 0-180 and used them to calcuate 100,000 values of phi. Then I got a histogram with abt 100-120 bins. Then i calculated the x coordinates of the pdf by adding the bin width and the y coordinate by taking the frequency values from histogram, dividing each frequency by the sample size(100,000), and then dividing by the bin width. Then I plot these x and y coordinates to get my pdf!

23. Oct 20, 2011

Stephen Tashi

Why did you generate random values? It would have been more accurate to generate 100,000 equally spaced values since Alpha has a uniform distribution.

24. Oct 20, 2011

Sameer_1987

hmmm....since alpha is a random variable between 0-180 i generated 100,000 random values. But I get what you are saying. That's a good point! Do you know if there is a way to generate for eg- 100,000 equally spaced values between 0-180 in R language?
is it, seq(0,180,(180/100,000)) ?

Thanks!

25. Oct 20, 2011

Sameer_1987

Oh btw, to check the uniformity of those 100,000 random alpha values that I generated, I had generated its histogram and it was quite flat.