Change of variables in conditional pdf

ivalmian · Jul 27, 2013

Hello,

I have a simple question regarding changing variables in a conditional distribution.

I have two independent variables

[itex]r \in \mathbb{R}, r>0 \\
t \in \mathbb{Z}, t>0[/itex]

where r is "rate" (can be any positive real number, although most likely to be around 1) and t is "time" (positive integers, i.e. 1, 2, 3, 4, ...).

I have a conditional probability function (really a probability density function) of the form

[itex]P(r;t) \mathrm{d}r[/itex]

which is "probability that the rate is r (within an interval [itex]\mathrm{d}r[/itex]) at time t"

This has a normalization condition

[itex]\int_0^\infty P(r;t) \mathrm{d}r = 1[/itex]

which means that there is some rate at any given time

I am actually interested in finding a different conditional probability function

[itex]P(R;t)\mathrm{d}R[/itex]

where R is the cumulative rate up to time t

So if I have outcomes for [itex]t = 1,2,3,4,\dots,t_{f}[/itex] that are [itex]r_{1},r_{2},r_{3},r_{4},\dots,r_{t_{f}}[/itex], then the cumulative rate is [itex]R_{t_{f}} = \prod_{i=1}^{t_{f}} r_{i}[/itex], which is to say the product of [itex]r_{i}[/itex] for i up to [itex]t_f[/itex].

Again, this would have to have a normalization condition

[itex]\int_0^\infty P(R;t) \mathrm{d}R = 1[/itex]

since for any given time there has to be some cumulative rate.

If you can help me find [itex]P(R;t)\mathrm{d}R[/itex] from [itex]P(r;t)\mathrm{d}r[/itex] I would greatly appreciate it.

Thank you very much for your help.

Ilya

Stephen Tashi · Jul 27, 2013

ivalmian said:

I have a conditional probability function (really a probability density function) of the form

[itex]P(r;t) \mathrm{d}r[/itex]

which is "probability that the rate is r (within an interval [itex]\mathrm{d}r[/itex]) at time t"

That's clear, but you didn't define what you meant by "the cumulative rate" at time [itex] t [/itex]. A cumulative thing is usually the integral of something, but how do you intend to integrate a random process? (This can be done, but it's not an elementary topic.)

When you ask for the cumulative rate at time [itex] t [/itex], are you trying to think of every instant in time before time [itex] t [/itex] as having an independent random selection of a rate? i.e. if you were to approximate this process by a Monte-Carlo program would you divide the interval [itex] [0,t] [/itex] up into small subintervals of length [itex] \triangle t [/itex] and make an independent random selection of a rate from the distribution [itex] P(r;t) [/itex] that applied to each interval?

If you have something like this in mind, I think, in the limit as [itex] \triangle t \rightarrow 0 [/itex], the cumulative rate would be infinite with probability 1. If you want a process that produces a finite result with a non-zero probability then you could look at something like the mean cumulative rate. To Monte-Carlo that, you would make an independent random selection of a rate for each subinterval and instead of just adding these rates up, you would add them up and then divide by the total number of rates. However, if you do this then I think that (in the limit) the cumulative rate will be equal to the mean of [itex] P(r;t) [/itex] with probability 1.

The only way I know to make an idea like yours interesting is to define the process (your "cumulative") as a limit of processes where the independent random selection in the intervals of length [itex] \triangle t [/itex] is drawn from a distribution that changes as [itex] \triangle t [/itex] changes. This is the idea behind Brownian motion and the Wiener process.

ivalmian · Jul 27, 2013

Hi Stephen,

Thank you for your response.

Perhaps I was not clear enough. The process is well defined and I am interested in the answer for my exact question, not a different problem. However, I think my use of the word "cumulative" is possibly not correct.

If you notice from my original post, t is discrete and starts at 1, i.e. t = 1, 2, 3, 4, and so on. This is not a continuous-time process and there is no possibility of non-integer differences in t.

At each t you have an independent drawing of r (which experimentally may not be the best assumption, but let's keep it). The probability of a particular r being drawn is defined by P(r;t)dr. Which is to say, P(r;t) is a probability density function which is defined at a particular t. In fact, these probability density functions are generated by binning of some real data so P(r;t) is usually skewed and multimodal.

Thus at t=1 you draw some r1, at t=2 you draw some r2, etc.

However, I can also define a new variable. I call it R. This variable is such that at t=1 R1=r1; at t=2 R2=r1*r2; at t=3 R3=r1*r2*r3.

Since I know the probability density functions which govern the drawings of all the r's, can I generate a probability density function which governs the values of the R's (which are essentially the products of the r's)?

Hope this clarifies my question,
With appreciation,

Ilya
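The process described in this post can be sketched as a small Monte Carlo simulation. This is only an illustration under stated assumptions: the function name `simulate_cumulative_product` and the lognormal samplers are hypothetical stand-ins for the binned, skewed, multimodal P(r;t) mentioned above, which are not given in the thread.

```python
import numpy as np

def simulate_cumulative_product(samplers, n_paths, rng):
    """Draw one r_t per time step and path; return R_t = r_1 * ... * r_t."""
    # draws[i, j] is the rate drawn at time step i + 1 on path j.
    draws = np.stack([draw(rng, n_paths) for draw in samplers])
    # The cumulative product along the time axis gives R_1, R_2, ..., R_t.
    return np.cumprod(draws, axis=0)

# Hypothetical stand-ins for P(r; t): lognormal with a t-dependent spread.
# The real densities would come from the binned data instead.
samplers = [
    (lambda rng, n, s=0.1 * t: rng.lognormal(mean=0.0, sigma=s, size=n))
    for t in (1, 2, 3)
]

rng = np.random.default_rng(0)
R = simulate_cumulative_product(samplers, n_paths=100_000, rng=rng)

# A normalized histogram of the last row approximates the density P(R; t=3).
density, edges = np.histogram(R[-1], bins=200, density=True)
```

Each row of `R` holds samples of the cumulative product at one time step, so a histogram of a row is an empirical estimate of the density being asked for at that t.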

NegativeDept · Jul 28, 2013

ivalmian said:

I have two independent variables
[itex]r \in \mathbb{R}, r>0 \\
t \in \mathbb{Z}, t>0[/itex]
where r is "rate" (can be any positive real number, although most likely to be around 1) and t is "time" (positive integers, i.e. 1, 2, 3, 4, ...).

Is ##t## random? From the description of the problem, I get the impression that it's a deterministic parameter.

ivalmian said:

I have a conditional probability function (really a probability density function) of the form
[itex]P(r;t) \mathrm{d}r[/itex]
which is "probability that the rate is r (within an interval [itex]\mathrm{d}r[/itex]) at time t"

...

I am actually interested in finding a different conditional probability function
[itex]P(R;t)\mathrm{d}R[/itex]
where R is the cumulative rate up to time t

So if I have outcomes for [itex]t = 1,2,3,4,\dots,t_{f}[/itex] that are [itex]r_{1},r_{2},r_{3},r_{4},\dots,r_{t_{f}}[/itex], then the cumulative rate is [itex]R_{t_{f}} = \prod_{i=1}^{t_{f}} r_{i}[/itex], which is to say the product of [itex]r_{i}[/itex] for i up to [itex]t_f[/itex].

If I understand correctly, the key part of the problem is this:

You know the pdf (probability density function) of each ##r_t##
You want to know the pdf of the product of many ##r_t##'s

My favorite trick is to use logarithms to convert the problem into a sum of random variables. Define random variables ##\{ q_t \}## and ##Q_t## like this:

##
q_t \equiv \log(r_t)
##
##
Q_t \equiv \log(R_t)
= \log(r_1 * r_2 * \cdots)
= \log(r_1) + \log(r_2) + \cdots
= q_1 + q_2 + \cdots
##

An easy special case is when the ##\{ q_t \}## are all independent and Gaussian, which is the same as saying the ##\{ r_t \}## are independent and lognormal. Then you can cheat and look up the answer: ##Q_t## is a sum of independent Gaussians and therefore Gaussian, so ##R_t## has a lognormal distribution.

If all the ##\{ r_t \}## are independent of each other, then the ##\{ q_t \}## are independent of each other. That means you can convolve all of their pdfs together to get a pdf for ##Q_t##. This works even if the ##\{ q_t \}## are not Gaussian and not identically distributed.

If two or more of the ##\{ r_t \}## are dependent, then the convolution method won't work. That's a harder problem which I'm not sure how to solve easily.
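To spell out the convolution step in symbols (the notation ##f_X## for the pdf of a random variable ##X## is introduced here for illustration and is not used elsewhere in the thread): for two independent terms, the pdf of ##Q_2 = q_1 + q_2## is

##
f_{Q_2}(Q) = \int_{-\infty}^{\infty} f_{q_1}(x) \, f_{q_2}(Q - x) \, \mathrm{d}x
##

and each further time step adds one more convolution:

##
f_{Q_t} = f_{Q_{t-1}} * f_{q_t} = f_{q_1} * f_{q_2} * \cdots * f_{q_t}
##

Here ##*## denotes convolution, not multiplication, and the integral runs over the whole real line because ##q_t = \log(r_t)## can be negative even though ##r_t > 0##.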

Stephen Tashi · Jul 28, 2013

ivalmian said:

However, I can also define a new variable. I call it R. This variable is such that at t=1 R1=r1; at t=2 R2=r1*r2; at t=3 R3=r1*r2*r3.

Since I know the probability density functions which governs the drawings of all r's, can I generate a probability density function which governs the value of R's (which are essentially the products of r's).

The probability density of ln R at t = k will be the convolution of the k probability densities of ln r_1, ln r_2, ..., ln r_k (not of the densities of the r's themselves). Are you asking how to convert this density so it reads as a density of R instead of ln R?

ivalmian · Jul 28, 2013

OK, this sounds like the right direction. Thank you NegativeDept and Stephen for helping me. I have not done statistics related things for quite a few years so please be patient with me!

It's true, t is not a random variable, but that's why I only called it an independent variable. Furthermore, while the r's are independent of each other (fully described by ##P(r;t)##), they are neither identically distributed nor Gaussian.

Please let me know if the following is the correct procedure for what I want to do.

1. I define ##q_t=\ln(r_t)## and ##Q_t=\sum_{i=1}^t q_i = \ln(R_t)##
2. For my initial probability density function, which is ##P(r;t)\textrm{d}r## I make a change of variables to ##q## so ##P(r;t)\textrm{d}r = P(\exp(q);t)\exp(q)\textrm{d}q##
3. I then find Fourier transforms for each probability density function ##\hat{P}(k;t)=\textrm{fft}(P(e^q;t)e^q)##
4. I then multiply all of the transforms (a convolution in ##q## corresponds to a product in ##k##) to obtain ## \hat{P_f}(k;t)=\prod_{i=1}^t \hat{P}(k;i)##
5. Finally, I obtain ##P_f(Q;t)=\textrm{ifft}( \hat{P_f}(k;t))##
6. Then ##P_f(Q;t) \textrm{d}Q =P_f(\ln(R);t)/R \textrm{d}R## which means that ##P(R;t) = P_f(\ln(R);t)/R##

Is this correct?

With appreciation,

Ilya
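The six steps above can be sketched numerically as follows. This is a minimal sketch, not a definitive implementation: the names `density_of_product` and `lognormal_pdf`, the grid limits, and the lognormal test densities are all assumptions made for illustration. Note that the sketch multiplies the Fourier transforms, as the convolution theorem requires for the density of a sum of independent variables, and zero-pads them so the FFT's circular convolution matches the ordinary one.

```python
import numpy as np

def density_of_product(r_pdfs, q_min, q_max, n):
    """Density of R = r_1 * ... * r_k for independent r_i with pdfs r_pdfs."""
    q = np.linspace(q_min, q_max, n)      # common uniform grid for q = ln r
    dq = q[1] - q[0]
    # Step 2: the density of q_t = ln r_t is P(e^q; t) * e^q.
    q_pdfs = [pdf(np.exp(q)) * np.exp(q) for pdf in r_pdfs]
    k = len(q_pdfs)
    n_out = k * (n - 1) + 1               # length of the linear convolution
    # Steps 3-5: zero-padded FFTs, multiplied together, then inverted.
    spectrum = np.ones(n_out // 2 + 1, dtype=complex)
    for f in q_pdfs:
        spectrum *= np.fft.rfft(f, n=n_out)
    # Each of the k - 1 convolution integrals contributes one factor of dq.
    Q_pdf = np.fft.irfft(spectrum, n=n_out) * dq ** (k - 1)
    Q = k * q_min + dq * np.arange(n_out)  # grid on which Q = ln R lives
    # Step 6: change variables back, P(R) = P_f(ln R) / R.
    R = np.exp(Q)
    return R, Q_pdf / R

def lognormal_pdf(sigma):
    """Hypothetical test density: lognormal with log-mean 0 and log-std sigma."""
    return lambda r: (np.exp(-np.log(r) ** 2 / (2 * sigma ** 2))
                      / (r * sigma * np.sqrt(2 * np.pi)))

# Two non-identical densities standing in for P(r; 1) and P(r; 2).
R, R_pdf = density_of_product([lognormal_pdf(0.3), lognormal_pdf(0.4)],
                              q_min=-4.0, q_max=4.0, n=2001)
# For lognormal inputs the product is lognormal with sigma = sqrt(0.3^2 + 0.4^2).
expected = lognormal_pdf(0.5)(R)
```

The lognormal inputs are chosen only because the exact answer is known, which makes the result easy to check against `expected`. For densities built from binned data, the grid limits `q_min` and `q_max` would need to cover the support of every ##\ln(r_t)##.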
