Distribution of sum of discrete random variable

  1. Mar 30, 2012 #1
    Edit: I have to think more about this, I'll post later.
     
    Last edited: Mar 30, 2012
  3. Mar 30, 2012 #2

    chiro

    Science Advisor

    If you need a hint, assuming independence, think about convolution theorem for discrete random variables.
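    Spelled out (my addition for reference): for independent discrete random variables X and Y, the theorem says the pmf of Z = X + Y is

    [tex]P(Z=z)=\sum_{x}P(X=x)\,P(Y=z-x),[/tex]

    where the sum runs over the possible values x of X.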
     
  4. Mar 31, 2012 #3
    Ok, my problem is about poker tourneys.

    Consider the random variable Y, which is the sum of N random variables X:

    [tex]Y=X+X+...+X=NX[/tex]

    X is the random variable that assigns a prize value to each in-the-money position, and assigns -1 to out-of-money positions. So:

    [tex]X(\text{position})=\begin{cases}-1, & \text{position} = \text{OTM}\\ w_{1}, & \text{position} = \text{1st}\\ w_{2}, & \text{position} = \text{2nd}\\ \;\;\vdots \\ w_{n}, & \text{position} = n\text{th}\end{cases}[/tex]

    OTM means out-of-money, and here n is the number of in-the-money positions. w1, w2,..., wn are constants of course.

    The probability mass function of X is:

    [tex]f(x)=\begin{cases}\beta_{1}, & x = -1\\ \beta_{2}, & x = w_{1}\\ \beta_{3}, & x = w_{2}\\ \;\;\vdots \\ \beta_{n+1}, & x = w_{n}\end{cases}[/tex]

    What I want to know is the probability mass function of Y (which represents the profit/loss of N tourneys). I could find this by using the convolution theorem, but that's where my problem arises. As I understood it, Y needs to depend on a variable y, so:

    [tex]Y(y)=X(x)+X(x)+...+X(x)=NX(x)[/tex]

    But I don't know how to define y... This is why I can't use the convolution theorem, because I didn't really understand this part.
     
  5. Mar 31, 2012 #4

    mathman

    Science Advisor
    Gold Member

    Assuming all the X's are independent, it would be better to use the characteristic function (Fourier transform of distribution). Then the characteristic function for Y is the Nth power of the char. function for X. To get the distribution function for Y, take the inverse transform.
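    In symbols: if [itex]\varphi_X(t)=E[e^{itX}][/itex] is the characteristic function of one X, then for N independent, identically distributed terms

    [tex]\varphi_Y(t)=\left[\varphi_X(t)\right]^{N},[/tex]

    and the pmf of Y is recovered by inverting [itex]\varphi_Y[/itex].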
     
  6. Mar 31, 2012 #5
    Isn't that only for continuous random variables (and not discrete)?

    Anyway, shouldn't it be easy by convolution? I think that I'm only missing something very basic...
     
  7. Mar 31, 2012 #6

    chiro

    Science Advisor

    The y variable is just a dummy variable and you can call it whatever you want. As long as you are doing a discrete convolution for PDF's with univariate distributions (i.e. with PDF in form P(X = x) = blah) then your dummy variable will always correspond to this x or whatever other dummy variable you have chosen.
     
  8. Apr 1, 2012 #7

    mathman

    Science Advisor
    Gold Member

    You can do it for discrete random variables. The density function is a linear combination of Dirac delta functions, so the characteristic function is a linear combination of exponential functions.
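    With the pmf from post #3, this works out to (a sketch in the thread's β notation):

    [tex]\varphi_X(t)=\beta_{1}e^{-it}+\sum_{k=1}^{n}\beta_{k+1}e^{itw_{k}},\qquad \varphi_Y(t)=\left[\varphi_X(t)\right]^{N}.[/tex]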
     
  9. Apr 2, 2012 #8
    I'm not comfortable with that; I never even studied the Dirac delta function.

    About the convolution, I still don't understand some things...

    1st problem:

    Definition of convolution for discrete r.v.:
    [tex]f_{Z}(z)=\sum_{x=-\infty}^{+\infty}f_{Y}(z-x)f_{X}(x)[/tex]
    where x is an integer.

    The problem here is that [tex]f_{X}(x)[/tex] may be 0 for every integer x (except x = -1) in my example, since there may be no integer prizes... And if I defined the r.v. X slightly differently (not assigning that -1 value), that function would always be 0 and there would be no probability mass function for Z? Something's not right here...
     
  10. Apr 2, 2012 #9

    chiro

    Science Advisor

    It would help if you stated the discrete PDF's for f(x) and for g(y) corresponding to the distributions of X and Y, just for clarification.
     
  11. Apr 3, 2012 #10
    Ok I'll write everything again with the pmf included, so everything will be on the same post.

    Definition of convolution of pmf's for discrete random variables X and Y:

    [tex]f_{Z}(z)=\sum_{x=-\infty}^{+\infty}f_{Y}(z-x)f_{X}(x)[/tex]

    where x is an integer.

    PMF of X and Y (they have the same distribution):

    [tex]f_{X}(x)=\begin{cases}a_{1}, & x=-1\\ a_{2}, & x=w_{1}\\ a_{3}, & x=w_{2}\\ \;\;\vdots \\ a_{n}, & x=w_{n-1}\end{cases}[/tex]

    where w1, w2, ..., wn-1 are real constants.

    Problem:

    In the series, [tex]f_{X}(x)[/tex] may be 0 for every integer x except x = -1, since w1, w2, ..., wn-1 in general won't be integers.
     
  12. Apr 3, 2012 #11

    chiro

    Science Advisor

    If X and Y have the same distribution, then you won't have any problem. It doesn't matter whether w1, w2, etc. are integers, rationals, or reals: you're defining the probabilities for Z, and since X and Y take the same values with the same probabilities (remember, you said X and Y have the same distribution, which implies the same a1, w1, w2, etc.), you can find the probability distribution using those probabilities and then create the mapping to values afterwards.
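    One way to state this: the convolution sum only needs to run over the support of X, not over all integers, so non-integer prizes cause no trouble:

    [tex]f_{Z}(z)=\sum_{x\,\in\,\{-1,\,w_{1},\,\ldots,\,w_{n-1}\}}f_{Y}(z-x)\,f_{X}(x).[/tex]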

    Are you wondering about the mapping procedure if your values that are mapped to probabilities are not integers?
     
  13. Apr 4, 2012 #12
    Sorry, I didn't understand this part. What do you mean by "find the probability distribution using the probabilities"? Are you referring to the convolution?

    Yes, that's my problem.
     
    Last edited: Apr 4, 2012
  14. Apr 4, 2012 #13

    Stephen Tashi

    Science Advisor

    The notation [itex]Y=X+X+...+X=NX[/itex] is faulty if the various X's can have different outcomes. You should say Y "is the sum of N random variables [itex] X_1,X_2,...,X_N [/itex]". If you want to indicate that the [itex] X_i [/itex] have the same distribution, just say that in words. Don't do it by naming them as if they all realize the same value.


    Likewise, when you write [itex]Y(y)=X(x)+X(x)+...+X(x)=NX(x)[/itex], you mean:
    [tex] Y(y) = X_1(x_1) + ...+ X_N(x_N) [/tex]

    and you can't combine these terms.


    The variable y represents a particular value of the random variable Y. It's just like saying "Let W be the random variable that represents the value of the face on a roll of a fair die". W(w) would be the particular event that the face value was w (e.g., W(2) says the face that came up was 2).

    To find the probability of a particular value of Y, such as y = $40, you must sum the probabilities of all combinations of values [itex] x_i [/itex] for the [itex] X_i [/itex] which add up to $40. So you consider all possible values of the [itex] x_i [/itex] that meet that requirement. That is essentially what "convolving" random variables means.
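    A minimal sketch of this idea in Python (the function name and dict representation are my own, not from the thread): store each pmf as a dictionary mapping possible values, integer or not, to probabilities, so "sum over all combinations" is just a double loop over the two supports.

    [code]
    def convolve(pmf_a, pmf_b):
        """Pmf of A + B for independent A, B, each given as {value: probability}."""
        result = {}
        for va, pa in pmf_a.items():
            for vb, pb in pmf_b.items():
                s = va + vb
                # several (va, vb) pairs may produce the same sum; add their probabilities
                result[s] = result.get(s, 0.0) + pa * pb
        return result
    [/code]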
     
  15. Apr 4, 2012 #14

    chiro

    Science Advisor

    Stephen Tashi answered your question: consider all possibilities of values given your distributions and then use the convolution algorithm (or theorem) to get the actual probability for each mapping.

    The best way to do this is to do it for the first two random variables, then simplify and repeat to incorporate the rest of them. Basically you should have n x m outputs for your convolved distribution, where n is the number of outcomes of your starting distribution and m is the number of outcomes of the one you are convolving with. When you do this repeatedly, m becomes the number of outcomes of the last distribution you calculated through convolution.
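    Continuing the sketch from the previous post, repeating the pairwise convolution handles all N tourneys (the prize values and probabilities below are hypothetical placeholders, not from the thread):

    [code]
    # pmf of a single tourney: lose the buy-in (-1) or finish in the money
    pmf_x = {-1: 0.90, 2.5: 0.07, 6.0: 0.03}   # hypothetical prizes/probabilities

    pmf_y = pmf_x
    for _ in range(9):                  # N = 10 tourneys in total
        pmf_y = convolve(pmf_y, pmf_x)

    assert abs(sum(pmf_y.values()) - 1.0) < 1e-9   # probabilities still sum to 1
    [/code]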
     
  16. Apr 5, 2012 #15
    Thanks for the replies :smile:

    I was taught that the variable w describes the event, not the value of the random variable. In your example, and in my understanding, what would say that the value of the face on a roll of a fair die was 2 is the value of W, not w. I'd represent that as W(2) = 2 or W("a 2 came up") = 2, depending on which set I define w to be in.

    Sum the probabilities? Don't you mean multiply? If I sum the probabilities, and if I understood right, I'll get probabilities higher than 1 in some cases.
    For example, Y = w1 + w1 + w1 + ... + w1 = N*w1 is a possibility for Y. If I sum the probabilities of X=w1 N times I can easily get a number higher than 1.

    This is what I understood, please tell me if I'm wrong:

    For simplicity let's say Y = X1 + X2, and each of these X's can only take the values w1 or w2. Then Y's possible values are as follows (eta is a dummy variable; I didn't bother to write it next to each value of Y):
    [tex]Y(\eta)=\begin{cases}w_{1}+w_{1}\\ w_{1}+w_{2}\\ w_{2}+w_{1}\\ w_{2}+w_{2}\end{cases}[/tex]

    In general, for the sum of 2 random variables, it will have n x m values, where n is the number of possible values X1 can have, and m is the number of possible values X2 can have (like chiro said).

    Now, if I assign a probability of 0.7 to w2 and 0.3 to w1 for example, the probability of Y = w2 + w2 would be 1.4, in my understanding.
     
    Last edited: Apr 5, 2012
  17. Apr 5, 2012 #16

    chiro

    Science Advisor

    You won't end up doing this: you have to use the calculations from the convolution algorithm which will involve multiplying probabilities and also adding (using the convolution theorem for discrete random variables).

    Remember that it won't be N*w1: instead you will have to do a convolution of the first two random variables, then convolve the resulting PDF with the next variable, and keep doing this until you have convolved the PDF of the first n-1 random variables with the nth random variable. If you expand this out, you'll see that it is a lot more involved than the N*w1 behaviour you are implying: it doesn't work like that.
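    To make this concrete with the numbers from the previous post (P(X = w1) = 0.3, P(X = w2) = 0.7, N = 2, and assuming w1 + w2 is distinct from 2w1 and 2w2):

    [tex]P(Y=2w_{1})=0.3^{2}=0.09,\quad P(Y=w_{1}+w_{2})=2(0.3)(0.7)=0.42,\quad P(Y=2w_{2})=0.7^{2}=0.49,[/tex]

    which sum to 1, as they must.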

    Convolution is simply a way of combining the frequency content of two functions; the idea is not unique to probability but is used in signal analysis and many other areas of applied mathematics.
     