Distribution of the sum of discrete random variables

Discussion Overview

The discussion revolves around the distribution of the sum of discrete random variables, particularly in the context of poker tournaments. Participants explore the probability mass function (pmf) of a random variable representing the total profit or loss across multiple tournaments, considering various approaches such as convolution and characteristic functions.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant describes a random variable Y as the sum of N random variables X, where X assigns values based on tournament positions.
  • Another participant suggests using the convolution theorem for independent random variables to find the pmf of Y.
  • Some participants express uncertainty about defining the variable y in the context of convolution.
  • There is a discussion about the applicability of characteristic functions for discrete random variables, with some arguing it is only for continuous variables.
  • Concerns are raised about the pmf being zero for certain values if the prize values are not integers, questioning the validity of the convolution approach.
  • A participant emphasizes that the variable y represents a specific outcome of the random variable Y, suggesting clarity in notation for different random variables.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the best approach to determine the pmf of Y, with multiple competing views on the use of convolution versus characteristic functions and ongoing uncertainty about the definitions involved.

Contextual Notes

Participants mention limitations regarding the assumptions of independence, the nature of the prize values, and the definitions of the random variables involved, which may affect the application of convolution and the resulting probability distributions.

Tosh5457
Edit: I have to think more about this, I'll post later.
 
Tosh5457 said:
Edit: I have to think more about this, I'll post later.

If you need a hint, assuming independence, think about convolution theorem for discrete random variables.
 
Ok, my problem is about poker tourneys.

Consider the random variable Y, which is the sum of N random variables X:

[tex]Y=X+X+...+X=NX[/tex]

X is the random variable that assigns a prize value to each in-the-money position, and assigns -1 to out-of-money positions. So:

[tex]X(\text{position})=\left\{\begin{matrix}-1, & \text{position} = \text{OTM} \\ w_{1}, & \text{position} = \text{1st} \\ w_{2}, & \text{position} = \text{2nd} \\ \vdots \\ w_{n}, & \text{position} = n\text{th} \end{matrix}\right.[/tex]

OTM means out-of-money, and here n is the number of in-the-money positions. w1, w2,..., wn are constants of course.

The probability mass function of X is:

[tex]f(x)=\left\{\begin{matrix} \beta_{1}, & x = -1 \\ \beta_{2}, & x = w_{1} \\ \beta_{3}, & x = w_{2} \\ \vdots \\ \beta_{n+1}, & x = w_{n} \end{matrix}\right.[/tex]

What I want to know is the probability mass function of Y (which represents the profit/loss of N tourneys). I could find this by using the convolution theorem, but that's where my problem arises. As I understood it, Y needs to depend on a variable y, so:

[tex]Y(y)=X(x)+X(x)+...+X(x)=NX(x)[/tex]

But I don't know how to define y... This is why I can't use the convolution theorem, because I didn't really understand this part.
 
Assuming all the X's are independent, it would be better to use the characteristic function (Fourier transform of distribution). Then the characteristic function for Y is the Nth power of the char. function for X. To get the distribution function for Y, take the inverse transform.
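
As a rough numerical illustration of this claim (not a full inversion), the sketch below checks that the characteristic function of the sum equals the Nth power of the single-variable one. The prize values and probabilities are made up for the example, and `char_fn` and `convolve` are hypothetical helper names.

```python
import cmath
from collections import defaultdict


def char_fn(pmf, t):
    """Characteristic function E[exp(i*t*X)] of a discrete pmf {value: prob}."""
    return sum(p * cmath.exp(1j * t * x) for x, p in pmf.items())


def convolve(pmf_a, pmf_b):
    """pmf of A + B for independent A and B, each given as {value: prob}."""
    out = defaultdict(float)
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            # Round the key so that equal totals reached by different
            # float additions are merged into one entry.
            out[round(a + b, 9)] += pa * pb
    return dict(out)


# Assumed single-tournament pmf: lose the buy-in (-1) or win 2.5 or 6.2.
pmf_x = {-1.0: 0.8, 2.5: 0.15, 6.2: 0.05}
n = 3

# Build the pmf of Y = X_1 + ... + X_n by repeated convolution.
pmf_y = {0.0: 1.0}
for _ in range(n):
    pmf_y = convolve(pmf_y, pmf_x)

t = 0.7  # any real t works; this one is arbitrary
lhs = char_fn(pmf_y, t)        # characteristic function of the sum
rhs = char_fn(pmf_x, t) ** n   # Nth power of the single-variable one
print(abs(lhs - rhs))
```

The printed difference should be at the level of floating-point rounding, which is what the Nth-power statement predicts for independent X's.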
 
mathman said:
Assuming all the X's are independent, it would be better to use the characteristic function (Fourier transform of distribution). Then the characteristic function for Y is the Nth power of the char. function for X. To get the distribution function for Y, take the inverse transform.

Isn't that only for continuous random variables (and not discrete)?

Anyway, shouldn't it be easy by convolution? I think that I'm only missing something very basic...
 
Tosh5457 said:
Isn't that only for continuous random variables (and not discrete)?

Anyway, shouldn't it be easy by convolution? I think that I'm only missing something very basic...

The y variable is just a dummy variable and you can call it whatever you want. As long as you are doing a discrete convolution for PDF's with univariate distributions (i.e. with PDF in form P(X = x) = blah) then your dummy variable will always correspond to this x or whatever other dummy variable you have chosen.
 
Tosh5457 said:
Isn't that only for continuous random variables (and not discrete)?

Anyway, shouldn't it be easy by convolution? I think that I'm only missing something very basic...

You can do it for discrete random variables. The density function is a linear combination of Dirac delta functions, so the characteristic function is a linear combination of exponential functions.
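
Written out without delta functions, and using the pmf notation from the earlier post (β1 for x = -1, βk+1 for x = wk), the characteristic function is just a finite sum. This is a sketch that assumes the N tournaments are independent and identically distributed:

[tex]\varphi_X(t)=E\left[e^{itX}\right]=\beta_{1}e^{-it}+\sum_{k=1}^{n}\beta_{k+1}e^{itw_{k}}[/tex]

[tex]\varphi_Y(t)=\left[\varphi_X(t)\right]^{N}=\sum_{k_0+k_1+...+k_n=N}\frac{N!}{k_0!\,k_1!\cdots k_n!}\,\beta_{1}^{k_0}\beta_{2}^{k_1}\cdots\beta_{n+1}^{k_n}\,e^{it(-k_0+k_1w_1+...+k_nw_n)}[/tex]

Reading off the coefficient of each exponential gives the pmf of Y: P(Y = y) is the sum of the multinomial terms whose exponent -k0 + k1 w1 + ... + kn wn equals y.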
 
You can do it for discrete random variables. The density function is a linear combination of Dirac delta functions, so the characteristic function is a linear combination of exponential functions.

I'm not comfortable with that, I never even studied dirac delta function.

About the convolution, I still don't understand some things...

1st problem:

Definition of convolution for discrete r.v:
[tex]f_{Z}(z)=\sum_{x=-\infty}^{+\infty}f_{Y}(z-x)f_{X}(x)[/tex]
where x is an integer.

The problem here is that [tex]f_{X}(x)[/tex] may be always 0 in my example (except for x = -1), since there may be no integer prizes... And if I defined the r.v. X slightly differently (not assigning that -1 value) that function would always be 0 and there would be no probability mass function for Z? Something's not right here...
 
Tosh5457 said:
I'm not comfortable with that, I never even studied dirac delta function.

About the convolution, I still don't understand some things...

1st problem:

Definition of convolution for discrete r.v:
[tex]f_{Z}(z)=\sum_{x=-\infty}^{+\infty}f_{Y}(z-x)f_{X}(x)[/tex]
where x is an integer.

The problem here is that [tex]f_{X}(x)[/tex] may be always 0 in my example (except for x = -1), since there may be no integer prizes... And if I defined the r.v. X slightly differently (not assigning that -1 value) that function would always be 0 and there would be no probability mass function for Z? Something's not right here...

It would help if you stated the discrete PDF's for f(x) and for g(y) corresponding to the distributions of X and Y, just for clarification.
 
  • #10
Ok I'll write everything again with the pmf included, so everything will be on the same post.

Definition of convolution of pmf's for discrete random variables X and Y:

[tex]f_{Z}(z)=\sum_{x=-\infty}^{+\infty}f_{Y}(z-x)f_{X}(x)[/tex]

where x is an integer.

PMF of X and Y (they have the same distribution):

[tex]f(X=x)=\left\{\begin{matrix}a_{1}, & x=-1 \\ a_{2}, & x=w_{1} \\ a_{3}, & x=w_{2} \\ \vdots \\ a_{n}, & x=w_{n-1} \end{matrix}\right.[/tex]

where w1, w2, ..., wn-1 are real constants.

Problem:

In the series, [tex]f_{X}(x)[/tex] may be always 0 except for -1, since w1, w2, ..., wn-1 in general won't be integers.
 
  • #11
Tosh5457 said:
Ok I'll write everything again with the pmf included, so everything will be on the same post.

Definition of convolution of pmf's for discrete random variables X and Y:

[tex]f_{Z}(z)=\sum_{x=-\infty}^{+\infty}f_{Y}(z-x)f_{X}(x)[/tex]

where x is an integer.

PMF of X and Y (they have the same distribution):

[tex]f(X=x)=\left\{\begin{matrix}a_{1}, & x=-1 \\ a_{2}, & x=w_{1} \\ a_{3}, & x=w_{2} \\ \vdots \\ a_{n}, & x=w_{n-1} \end{matrix}\right.[/tex]

where w1, w2, ..., wn-1 are real constants.

Problem:

In the series, [tex]f_{X}(x)[/tex] may be always 0 except for -1, since w1, w2, ..., wn-1 in general won't be integers.

If X and Y have the same distribution, then you won't have any problem. It doesn't matter whether w1, w2, etc. are integers, rationals, or reals: you're defining the probabilities for Z, and X and Y take the same values with the same probabilities (remember you said that X and Y have the same distribution, which implies the same a1, w1, w2, etc.). So you find the probability distribution using the probabilities and then create the mapping to values afterwards.
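
To make this concrete, here is a minimal sketch of a discrete convolution in which the sum runs over the values each variable actually takes rather than over the integers. The prize values 2.5 and 6.2 and their probabilities are invented for illustration, `convolve_pmf` is a hypothetical helper name, and `Fraction` is used only so that equal sums land on exactly the same key.

```python
from collections import defaultdict
from fractions import Fraction


def convolve_pmf(pmf_x, pmf_y):
    """Return the pmf of X + Y for independent X and Y.

    Each pmf is a dict mapping a value to its probability. The loops run
    over the values that actually occur, so they need not be integers.
    """
    pmf_z = defaultdict(Fraction)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            # Multiply within a pair of outcomes, add across pairs
            # that give the same total.
            pmf_z[x + y] += px * py
    return dict(pmf_z)


# Assumed single-tournament pmf: lose the buy-in (-1) or win 2.5 or 6.2.
pmf_x = {
    Fraction(-1): Fraction(8, 10),
    Fraction(5, 2): Fraction(15, 100),
    Fraction(31, 5): Fraction(5, 100),
}

pmf_z = convolve_pmf(pmf_x, pmf_x)
for value in sorted(pmf_z):
    print(float(value), float(pmf_z[value]))
```

The result has a nonzero probability at non-integer totals such as 1.5 (one loss plus one 2.5 prize), and the probabilities still sum to 1.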

Are you wondering about the mapping procedure if your values that are mapped to probabilities are not integers?
 
  • #12
chiro said:
you're dealing with defining the probabilities for Z and if X and Y have the same values that correspond to some probability (remember you said that X and Y have the same distribution which implies the same a(1),w1,w2 etc, then you will find the probability distribution using the probabilities and then create the mapping to values afterwards.

Sorry, I didn't understand this part. What do you mean by "find the probability distribution using the probabilities"? Are you referring to the convolution?

Are you wondering about the mapping procedure if your values that are mapped to probabilities are not integers?

Yes, that's my problem.
 
  • #13
Tosh5457 said:
Consider the random variable Y, which is the sum of N random variables X:

[tex]Y=X+X+...+X=NX[/tex]

This is faulty notation if the various X's can have different outcomes. You should say "is the sum of N random variables [itex]X_1,X_2,...,X_N[/itex]". If you want to indicate that the [itex]X_i[/itex] have the same distribution, just say that in words. Don't do it by naming them as if they all realize the same value.


[tex]Y(y)=X(x)+X(x)+...+X(x)=NX(x)[/tex]

You mean:
[tex]Y(y) = X_1(x_1) + ...+ X_N(x_N)[/tex]

and you can't combine these terms.

But I don't know how to define y... This is why I can't use the convolution theorem, because I didn't really understand this part.


The variable y represents a particular value of the random variable Y. It is just like saying "Let W be the random variable that represents the value of the face on a roll of a fair die". W(w) would be the particular event that the face value was w (e.g. W(2) says the face that came up was 2).

To find the probability of a particular value of Y, such as y = $40, you must sum the probabilities of all combinations of values [itex]x_i[/itex] for the [itex]X_i[/itex] which add up to $40. So you consider all possible values of the [itex]x_i[/itex] that meet that requirement. That is essentially what "convolving" random variables means.
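
A brute-force sketch of that enumeration is below. It assumes the [itex]X_i[/itex] are independent, so the probability of one combination is the product of its individual probabilities, and those products are then summed over every combination that hits the target. The values, probabilities, and the name `prob_of_total` are made up for illustration.

```python
from itertools import product
from math import isclose, prod


def prob_of_total(values, probs, n, target):
    """P(X_1 + ... + X_n == target) by enumerating every outcome combination.

    values[i] occurs with probability probs[i] for each independent X_k.
    """
    total = 0.0
    for combo in product(range(len(values)), repeat=n):
        if isclose(sum(values[i] for i in combo), target, abs_tol=1e-9):
            # Probability of this particular combination (independence).
            total += prod(probs[i] for i in combo)
    return total


# Assumed example: lose 1 with prob 0.8, win 4 with prob 0.15, win 9 with prob 0.05.
p = prob_of_total([-1, 4, 9], [0.8, 0.15, 0.05], n=3, target=7)
print(p)
```

Here a total of 7 over three tournaments arises from two losses and one 9 prize (3 orderings) or one loss and two 4 prizes (3 orderings). The enumeration visits all 27 combinations, so it is only practical for small N.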
 
  • #14
Tosh5457 said:
Yes, that's my problem.

Stephen Tashi answered your question: consider all possibilities of values given your distributions and then use the convolution algorithm (or theorem) to get the actual probability for each mapping.

The best way to do this is to do it for the first two random variables and then simplify and repeat to incorporate the rest of them. Basically you should have up to n x m outputs for your convolved distribution, where n is the number of outputs for your starting distribution and m is the number of outputs for the one you are convolving with. When you do this repeatedly, m becomes the number of mappings in the last distribution you calculated through convolution.
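
The repeated procedure can be sketched as follows, assuming independent and identically distributed tournaments. The pmf values are invented, `pmf_of_sum` is a hypothetical helper name, and `Fraction` keeps the arithmetic exact so that equal totals merge into one entry.

```python
from collections import defaultdict
from fractions import Fraction


def pmf_of_sum(pmf_x, n):
    """pmf of X_1 + ... + X_n for i.i.d. X_i, built one convolution at a time."""
    pmf_y = {Fraction(0): Fraction(1)}  # distribution of an empty sum
    for _ in range(n):
        step = defaultdict(Fraction)
        for y, py in pmf_y.items():
            for x, px in pmf_x.items():
                # Extend every running total y by one more tournament result x.
                step[y + x] += py * px
        pmf_y = dict(step)
    return pmf_y


# Assumed single-tournament pmf: -1 w.p. 4/5, 4 w.p. 3/20, 9 w.p. 1/20.
pmf_x = {
    Fraction(-1): Fraction(4, 5),
    Fraction(4): Fraction(3, 20),
    Fraction(9): Fraction(1, 20),
}

pmf_y = pmf_of_sum(pmf_x, 3)
for value in sorted(pmf_y):
    print(value, pmf_y[value])
```

With these particular values only 7 distinct totals appear after three tournaments, although there are 27 outcome combinations, because many combinations give the same sum.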
 
  • #15
Thanks for the replies :smile:

The variable y represents a particular value of the random variable Y. It is just like saying "Let W be the random variable that represents the value of the face on a roll of a fair die". W(w) would be the particular event that the face value was w (e.g. W(2) says the face that came up was 2).

I was taught that that variable w described the event, not the value of the random variable. In your example, and in my understanding, what would say that the value of the face on a roll of a fair die was 2 is the value of W, not w. I'd represent that as W(2) = 2 or W("2 came off") = 2, depending on what's the set I define w to be in.

To find the probability of a particular value of Y, such as y = $40, you must sum the probabilities of all combinations of values xi for the Xi which add up to $40. So you consider all possible values of the xi that meet that requirement. That is essentially what "convolving" random variables means.

Sum the probabilities? Don't you mean multiply? If I sum the probabilities, and if I understood right, I'll get probabilities higher than 1 in some cases.
For example, Y = w1 + w1 + w1 + ... + w1 = N*w1 is a possibility for Y. If I sum the probabilities of X=w1 N times I can easily get a number higher than 1.

This is what I understood, please tell me if I'm wrong:

For simplicity let's say Y = X1 + X2, and each of these X's can only be w1 and w2. Then Y will be like this (eta is a dummy variable, didn't even bother to write it in each value of Y):
[tex]Y(\eta )=\left\{\begin{matrix} w_{1}+w_{1} \\ w_{1}+w_{2} \\ w_{2}+w_{1} \\ w_{2}+w_{2} \end{matrix}\right.[/tex]

In general, for the sum of 2 random variables, it will have n x m values, where n is the number of possible values X1 can have, and m is the number of possible values X2 can have (like chiro said).

Now, if I attribute a probability of 0.7 to w2 and 0.3 to w1 for example, the probability of Y=w2+w2 would be 1.4, in my understanding.
 
  • #16
Tosh5457 said:
Thanks for the replies :smile:

Sum the probabilities? Don't you mean multiply? If I sum the probabilities, and if I understood right, I'll get probabilities higher than 1 in some cases.
For example, Y = w1 + w1 + w1 + ... + w1 = N*w1 is a possibility for Y. If I sum the probabilities of X=w1 N times I can easily get a number higher than 1.

You won't end up doing this: you have to use the calculations from the convolution algorithm which will involve multiplying probabilities and also adding (using the convolution theorem for discrete random variables).

Remember that it won't be N*w1: instead you will have to do a convolution for the first two random variables, then do a convolution with this calculated PDF with the next variable and then you keep doing this until you have a PDF for n-1 random variables with the nth random variable. If you expand this out, you'll see that it is a lot more involved than the N*w1 behaviour that you are implying: it doesn't work like that.
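
A small worked case may help. Take Y = X1 + X2 with X1 and X2 independent, each equal to w1 with probability 0.3 and to w2 with probability 0.7 (the numbers from the question, with w1 and w2 assumed to be different):

[tex]P(Y=2w_{1})=0.3\times 0.3=0.09[/tex]

[tex]P(Y=w_{1}+w_{2})=0.3\times 0.7+0.7\times 0.3=0.42[/tex]

[tex]P(Y=2w_{2})=0.7\times 0.7=0.49[/tex]

Probabilities are multiplied within each combination and added across the combinations that give the same total, so the three values sum to 1 and none of them exceeds 1.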

Convolving two functions corresponds to multiplying their frequency-domain transforms together. This idea is not specific to probability: it is used in signal analysis and many other areas of applied mathematics.
 
