
Change of variables in conditional pdf

  1. Jul 27, 2013 #1

    I have a simple question regarding changing variables in a conditional distribution.

    I have two independent variables

    [itex]r \in \mathbb{R}, r>0 \\
    t \in \mathbb{I}, t>0[/itex]

    where r is the "rate" (it can be any positive real number, although it is most likely to be around 1) and t is the "time" (positive integers, i.e. 1, 2, 3, 4, ...).

    I have a conditional probability function (really a probability density function) of the form

    [itex]P(r;t) \mathrm{d}r[/itex]

    which is the "probability that the rate is r (within an interval [itex]\mathrm{d}r[/itex]) at time t".

    This has a normalization condition

    [itex]\int_0^\infty P(r;t) \mathrm{d}r = 1[/itex]

    which means that there is some rate at any given time.

    I am actually interested in finding a different conditional probability function

    [itex]P(R;t) \mathrm{d}R[/itex]

    where R is the cumulative rate up to time t.

    So if I have outcomes for [itex]t = 1,2,3,4,\dots,t_{f}[/itex] that are [itex]r_{1},r_{2},r_{3},r_{4},\dots,r_{t_{f}}[/itex], then the cumulative rate is [itex]R_{t_{f}} = \prod_{i=1}^{t_{f}} r_{i}[/itex], which is to say the product of the [itex]r_{i}[/itex] for i up to [itex]t_f[/itex].

    Again, this would have to have a normalization condition

    [itex]\int_0^\infty P(R;t) \mathrm{d}R = 1[/itex]

    since for any given time there has to be some cumulative rate.

    If you can help me find [itex]P(R;t)\mathrm{d}R[/itex] from [itex]P(r;t)\mathrm{d}r[/itex] I would greatly appreciate it.

    Thank you very much for your help.

    Last edited: Jul 27, 2013
  3. Jul 27, 2013 #2

    Stephen Tashi

    Science Advisor

    That's clear, but you didn't define what you meant by "the cumulative rate" at time [itex] t [/itex]. A cumulative thing is usually the integral of something, but how do you intend to integrate a random process? (This can be done, but it's not an elementary topic.)

    When you ask for the cumulative rate at time [itex] t [/itex], are you trying to think of every instant in time before time [itex] t [/itex] as having an independent random selection of a rate? i.e. if you were to approximate this process by a Monte-Carlo program would you divide the interval [itex] [0,t] [/itex] up into small subintervals of length [itex] \triangle t [/itex] and make an independent random selection of a rate from the distribution [itex] P(r;t) [/itex] that applied to each interval?

    If you have something like this in mind, I think, in the limit as [itex] \triangle t \rightarrow 0 [/itex], the cumulative rate would be infinite with probability 1. If you want a process that produces a finite result with a non-zero probability then you could look at something like the mean cumulative rate. To Monte-Carlo that, you would make an independent random selection of a rate for each subinterval and instead of just adding these rates up, you would add them up and then divide by the total number of rates. However, if you do this then I think that (in the limit) the cumulative rate will be equal to the mean of [itex] P(r;t) [/itex] with probability 1.

    The only way I know to make an idea like yours interesting is to define the process (your "cumulative") as a limit of processes where the independent random selection in the intervals of length [itex] \triangle t [/itex] is drawn from a distribution that changes as [itex] \triangle t [/itex] changes. This is the idea behind Brownian motion and the Wiener process.
  4. Jul 27, 2013 #3
    Hi Stephen,

    Thank you for your response.

    Perhaps I was not clear enough. The process is well defined and I am interested in the answer for my exact question, not a different problem. However, I think my use of the word "cumulative" is possibly not correct.

    As you can see from my original post, t is discrete and starts at 1, i.e. t = 1, 2, 3, 4, and so on. This is not a continuous-time process and there is no possibility of non-integer differences in t.

    At each t you have an independent drawing of r (which experimentally may not be the best assumption, but let's keep it). The probability of a particular r being drawn is defined by P(r;t)dr. Which is to say, P(r;t) is a probability density function which is defined at a particular t. In fact, these probability density functions are generated by binning of some real data so P(r;t) is usually skewed and multimodal.

    Thus at t=1 you draw some r1, at t=2 you draw some r2, etc.

    However, I can also define a new variable. I call it R. This variable is such that at t=1 R1=r1; at t=2 R2=r1*r2; at t=3 R3=r1*r2*r3.

    Since I know the probability density functions which govern the drawings of all the r's, can I generate a probability density function which governs the values of the R's (which are essentially products of the r's)?
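One way to sanity-check any analytic answer is to simulate the process directly. The sketch below is only an illustration: the binned densities in `histograms` are made-up placeholders standing in for the real data-derived P(r;t), and the helper names `draw_rate` and `simulate_R` are hypothetical. It assumes each P(r;t) is a histogram that is flat within each bin.

```python
import random
from itertools import accumulate
from operator import mul

def draw_rate(bin_edges, bin_weights, rng):
    """Draw one r from a histogram-style density: pick a bin by weight, then a uniform point in it."""
    i = rng.choices(range(len(bin_weights)), weights=bin_weights, k=1)[0]
    return rng.uniform(bin_edges[i], bin_edges[i + 1])

def simulate_R(histograms, rng):
    """Return [R_1, ..., R_t] for one realization, where R_k = r_1 * ... * r_k."""
    rates = [draw_rate(edges, weights, rng) for edges, weights in histograms]
    return list(accumulate(rates, mul))

# Hypothetical binned densities for t = 1, 2, 3 (placeholders, not real data).
histograms = [
    ([0.5, 0.9, 1.1, 1.5], [0.2, 0.6, 0.2]),
    ([0.6, 1.0, 1.4], [0.5, 0.5]),
    ([0.8, 1.0, 1.2, 2.0], [0.3, 0.4, 0.3]),
]

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [simulate_R(histograms, rng) for _ in range(10_000)]
final_R = [path[-1] for path in samples]  # empirical sample of R_3
```

Binning `final_R` gives an empirical estimate of P(R;3) that any closed-form or numerical answer can be compared against.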

    Hope this clarifies my question,
    With appreciation,

    Last edited: Jul 27, 2013
  5. Jul 28, 2013 #4
    Is ##t## random? From the description of the problem, I get the impression that it's a deterministic parameter.

    If I understand correctly, the key part of the problem is this:
    • You know the pdf (probability density function) of each ##r_t##
    • You want to know the pdf of the product of many ##r_t##'s
    My favorite trick is to use logarithms to convert the problem into a sum of random variables. Define random variables ##\{ q_t \}## and ##Q_t## like this:

    $$q_t \equiv \log(r_t)$$
    $$\begin{aligned}
    Q_t &\equiv \log(R_t) \\
    &= \log(r_1 r_2 \cdots r_t) \\
    &= \log(r_1) + \log(r_2) + \cdots + \log(r_t) \\
    &= q_1 + q_2 + \cdots + q_t
    \end{aligned}$$

    An easy special case is when the ##\{ q_t \}## are all independent and Gaussian, i.e. when each ##r_t## is lognormal (the ##r_t## themselves cannot be Gaussian, since they must be positive). Then you can cheat and look up the answer: ##R_t## has a lognormal distribution.
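As a worked illustration of that special case, suppose each ##q_i = \log(r_i)## is Gaussian with mean ##\mu_i## and variance ##\sigma_i^2## (these symbols, and ##M_t## and ##S_t^2## below, are introduced here only for the example). A sum of independent Gaussians is Gaussian, so

$$Q_t = \sum_{i=1}^{t} q_i \sim \mathcal{N}\left(M_t, S_t^2\right), \qquad M_t = \sum_{i=1}^{t} \mu_i, \qquad S_t^2 = \sum_{i=1}^{t} \sigma_i^2,$$

and ##R_t = e^{Q_t}## then has the lognormal density

$$P(R;t) = \frac{1}{R \sqrt{2\pi S_t^2}} \exp\left(-\frac{(\ln R - M_t)^2}{2 S_t^2}\right), \qquad R > 0.$$

Note that the ##q_i## do not need to be identically distributed for this to hold, only independent and Gaussian.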

    If all the ##\{ r_t \}## are independent of each other, then the ##\{ q_t \}## are independent of each other. That means you can convolve all of their pdfs together to get a pdf for ##Q_t##. This works even if the ##\{ r_t \}## are not Gaussian.
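A minimal numerical sketch of one such convolution, assuming both densities have been sampled on uniform grids in ##q## with the same spacing `dq`. The function name `convolve_densities` and the Gaussian test densities are placeholders chosen for illustration; real binned data would replace them.

```python
import numpy as np

def convolve_densities(p_a, p_b, q_a0, q_b0, dq):
    """Density of A + B for independent A, B sampled on uniform grids with spacing dq.

    p_a[i] is the density of A at q_a0 + i*dq (likewise for B).
    Returns (grid, density) for the sum, whose grid starts at q_a0 + q_b0.
    """
    p_sum = np.convolve(p_a, p_b) * dq  # Riemann-sum approximation of the convolution integral
    grid = (q_a0 + q_b0) + dq * np.arange(p_sum.size)
    return grid, p_sum

def gauss(x, mu, sigma):
    """Gaussian pdf, used here only as an easy-to-check stand-in density."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

# Illustrative check with two Gaussians in q (placeholder parameters).
dq = 0.01
q = np.arange(-5.0, 5.0, dq)
grid, p_Q = convolve_densities(gauss(q, 0.1, 0.3), gauss(q, -0.2, 0.4), q[0], q[0], dq)
```

For these two test densities the result should match a Gaussian with mean -0.1 and standard deviation 0.5, which makes the sketch easy to verify before feeding it non-Gaussian data. Repeating the call with each further ##q_t## density builds up the pdf of ##Q_t##.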

    If two or more of the ##\{ r_t \}## are dependent, then the convolution method won't work. That's a harder problem which I'm not sure how to solve easily.
  6. Jul 28, 2013 #5

    Stephen Tashi

    Science Advisor

    The probability density of ln R at t = k will be the convolution of the k probability densities of ln r_1, ..., ln r_k (a k-fold self-convolution if the r's are identically distributed). Are you asking how to convert this density so it reads as a density of R instead of ln R?
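For reference, that conversion is the standard change-of-variables step. Writing ##p_Q## for the density of ##Q = \ln R## and ##p_R## for the density of ##R## (symbols introduced here only for illustration), and using the fact that ##\ln## is increasing:

$$\Pr(R \le x) = \Pr(Q \le \ln x) \quad\Longrightarrow\quad p_R(x) = \frac{\mathrm{d}}{\mathrm{d}x} \int_{-\infty}^{\ln x} p_Q(u)\,\mathrm{d}u = \frac{p_Q(\ln x)}{x}, \qquad x > 0.$$

The factor ##1/x## is the Jacobian ##\mathrm{d}(\ln x)/\mathrm{d}x##, and it is what keeps ##p_R## normalized to 1.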
  7. Jul 28, 2013 #6
    OK, this sounds like the right direction. Thank you NegativeDept and Stephen for helping me. I have not done statistics related things for quite a few years so please be patient with me!

    It's true, t is not a random variable, but that's why I only called it an independent variable. Furthermore, while the r's are independent of each other (fully described by ##P(r;t)##), they are neither identically distributed nor Gaussian.

    Please let me know if the following is the correct procedure for what I want to do.

    1. I define ##q_t=\ln(r_t)## and ##Q_t=\sum_{i=1}^t q_i = \ln(R_t)##
    2. For my initial probability density function, which is ##P(r;t)\textrm{d}r## I make a change of variables to ##q## so ##P(r;t)\textrm{d}r = P(\exp(q);t)\exp(q)\textrm{d}q##
    3. I then find Fourier transforms for each probability density function ##\hat{P}(k;t)=\textrm{fft}(P(e^q;t)e^q)##
    4. I then multiply all of the ##\hat{P}(k;i)## together (a convolution in ##q## becomes a product in Fourier space) to obtain ## \hat{P_f}(k;t)=\prod_{i=1}^t \hat{P}(k;i)##
    5. Finally, I obtain ##P_f(Q;t)=\textrm{ifft}( \hat{P_f}(k;t))##
    6. Then ##P_f(Q;t) \textrm{d}Q =P_f(\ln(R);t)/R \textrm{d}R## which means that ##P(R;t) = P_f(\ln(R);t)/R##
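The steps above can be sketched numerically as follows. This is a rough illustration under stated assumptions, not a definitive implementation: each P(r;t) is assumed to be available as a vectorized callable, all log-densities are sampled on one shared uniform grid in ##q##, and the names `product_density` and `lognormal_pdf` are hypothetical. The transforms are combined by multiplication, since the density of a sum of independent variables is the convolution of the individual densities and convolution becomes a product in Fourier space. The FFTs are zero-padded so that the circular convolution they imply does not wrap around.

```python
import numpy as np

def product_density(densities, q_min, q_max, n):
    """Density of R_t = r_1 * ... * r_t for independent r_i, via FFTs in q = ln r.

    densities: list of callables; densities[i](r) is the pdf of r_{i+1} (assumed vectorized).
    Each log-density is sampled on the same uniform grid of n points covering [q_min, q_max).
    Returns (R, P_R) on the grid R = exp(Q).
    """
    t = len(densities)
    dq = (q_max - q_min) / n
    q = q_min + dq * np.arange(n)

    # Step 2: change variables r -> q, picking up the Jacobian dr/dq = exp(q).
    p_q = [p(np.exp(q)) * np.exp(q) for p in densities]

    # Step 3: FFT each density, zero-padded to the full length of the linear convolution.
    m = t * (n - 1) + 1
    p_hat = [np.fft.rfft(p, m) for p in p_q]

    # Step 4: convolution in q is a product in Fourier space.
    prod_hat = np.prod(p_hat, axis=0)

    # Step 5: back to q-space; each of the t-1 convolutions contributes one factor of dq.
    p_Q = np.fft.irfft(prod_hat, m) * dq ** (t - 1)
    Q = t * q_min + dq * np.arange(m)

    # Step 6: change variables Q -> R = exp(Q), dividing by the Jacobian dR/dQ = R.
    R = np.exp(Q)
    return R, p_Q / R

def lognormal_pdf(mu, sigma):
    """Return a vectorized lognormal pdf; only a stand-in for the binned P(r;t)."""
    def pdf(r):
        return np.exp(-(np.log(r) - mu) ** 2 / (2 * sigma**2)) / (r * sigma * np.sqrt(2 * np.pi))
    return pdf

# Placeholder densities for t = 1, 2, 3; real data-derived densities would go here.
densities = [lognormal_pdf(0.0, 0.2), lognormal_pdf(0.1, 0.3), lognormal_pdf(-0.1, 0.25)]
R, P_R = product_density(densities, q_min=-4.0, q_max=4.0, n=2048)
```

Lognormal stand-ins are used only because their product has a known lognormal density to compare against. The grid limits must be wide enough that every ##q##-density is negligible at both ends, otherwise the truncation shows up as lost normalization.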

    Is this correct?

    With appreciation,

    Last edited: Jul 28, 2013