Distribution of a sum of discrete random variables

In summary, the conversation revolves around using the convolution theorem for discrete random variables, specifically in the context of poker tournaments. The discussion also touches on using the characteristic function and Dirac delta functions. The main issue is defining the probability mass function of a sum of random variables, given certain constraints and parameters.
  • #1
Tosh5457
Edit: I have to think more about this, I'll post later.
 
  • #2
Tosh5457 said:
Edit: I have to think more about this, I'll post later.

If you need a hint, assuming independence, think about the convolution theorem for discrete random variables.
 
  • #3
Ok, my problem is about poker tourneys.

Consider the random variable Y, which is the sum of N random variables X:

[tex]Y=X+X+...+X=NX[/tex]

X is the random variable that assigns a prize value to each in-the-money position, and assigns -1 to out-of-money positions. So:

[tex]X(position)=\left\{\begin{matrix}-1, position = "OTM"
\\w_{1}, position = "1st"
\\w_{2}, position = "2nd"
\\...
\\w_{n}, position = "nth"

\end{matrix}\right.[/tex]

OTM means out-of-money, and here n is the number of in-the-money positions. w1, w2,..., wn are constants of course.

The probability mass function of X is:

[tex]f(x)=\left\{\begin{matrix} \beta_{1}, x = -1
\\ \beta_{2}, x = w_{1}
\\ \beta_{3}, x = w_{2}
\\ ...
\\ \beta_{n+1}, x = w_{n}

\end{matrix}\right.[/tex]

What I want to know is the probability mass function of Y (which represents the profit/loss of N tourneys). I could find this by using the convolution theorem, but that's where my problem arises. As I understood it, Y needs to depend on a variable y, so:

[tex]Y(y)=X(x)+X(x)+...+X(x)=NX(x)[/tex]

But I don't know how to define y... This is why I can't use the convolution theorem, because I didn't really understand this part.
 
  • #4
Assuming all the X's are independent, it would be better to use the characteristic function (Fourier transform of distribution). Then the characteristic function for Y is the Nth power of the char. function for X. To get the distribution function for Y, take the inverse transform.
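For a discrete X the characteristic function is just a finite sum, [itex]\varphi_X(t)=\sum_k p_k e^{itx_k}[/itex], so this suggestion is easy to set up numerically. A minimal sketch (the prize values and probabilities below are made up for illustration):

```python
import numpy as np

# Hypothetical prize values and probabilities (made up for illustration)
values = np.array([-1.0, 5.0, 12.5])   # x_k: out of the money, 2nd, 1st
probs  = np.array([0.8, 0.15, 0.05])   # p_k = P(X = x_k)

def phi_X(t):
    """Characteristic function of X: sum_k p_k * exp(i t x_k)."""
    return np.sum(probs * np.exp(1j * t * values))

N = 10
def phi_Y(t):
    """Characteristic function of Y = X_1 + ... + X_N (i.i.d. copies of X)."""
    return phi_X(t) ** N

# Sanity checks: phi(0) = 1 and |phi(t)| <= 1 for any characteristic function
print(abs(phi_X(0.0)))  # ≈ 1.0
print(abs(phi_Y(0.0)))  # ≈ 1.0
```

Recovering the distribution of Y is then an inverse-transform step; for a discrete X it reduces to reading off the coefficients of the expanded Nth power.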
 
  • #5
mathman said:
Assuming all the X's are independent, it would be better to use the characteristic function (Fourier transform of distribution). Then the characteristic function for Y is the Nth power of the char. function for X. To get the distribution function for Y, take the inverse transform.

Isn't that only for continuous random variables (and not discrete)?

Anyway, shouldn't it be easy by convolution? I think that I'm only missing something very basic...
 
  • #6
Tosh5457 said:
Isn't that only for continuous random variables (and not discrete)?

Anyway, shouldn't it be easy by convolution? I think that I'm only missing something very basic...

The y variable is just a dummy variable and you can call it whatever you want. As long as you are doing a discrete convolution for PDF's with univariate distributions (i.e. with PDF in form P(X = x) = blah) then your dummy variable will always correspond to this x or whatever other dummy variable you have chosen.
 
  • #7
Tosh5457 said:
Isn't that only for continuous random variables (and not discrete)?

Anyway, shouldn't it be easy by convolution? I think that I'm only missing something very basic...
You can do it for discrete random variables. The density function is a linear combination of Dirac delta functions, so the characteristic function is a linear combination of exponential functions.
 
  • #8
You can do it for discrete random variables. The density function is a linear combination of Dirac delta functions, so the characteristic function is a linear combination of exponential functions.

I'm not comfortable with that; I never even studied the Dirac delta function.

About the convolution, I still don't understand some things...

1st problem:

Definition of convolution for discrete r.v:
[tex]f_{Z}(z)=\sum_{x=-\infty}^{x=+\infty}f_{Y}(z-x)f_{X}(x)[/tex]
where x is an integer.

The problem here is that [tex]f_{X}(x)[/tex] may be always 0 in my example (except for x = -1), since there may be no integer prizes... And if I defined the r.v. X slightly differently (not assigning that -1 value) that function would always be 0 and there would be no probability mass function for Z? Something's not right here...
 
  • #9
Tosh5457 said:
I'm not comfortable with that, I never even studied dirac delta function.

About the convolution, I still don't understand some things...

1st problem:

Definition of convolution for discrete r.v:
[tex]f_{Z}(z)=\sum_{x=-\infty}^{x=+\infty}f_{Y}(z-x)f_{X}(x)[/tex]
where x is an integer.

The problem here is that [tex]f_{X}(x)[/tex] may be always 0 in my example (except for x = -1), since there may be no integer prizes... And if I defined the r.v. X slightly differently (not assigning that -1 value) that function would always be 0 and there would be no probability mass function for Z? Something's not right here...

It would help if you stated the discrete PDF's for f(x) and for g(y) corresponding to the distributions of X and Y, just for clarification.
 
  • #10
Ok I'll write everything again with the pmf included, so everything will be on the same post.

Definition of convolution of pmf's for discrete random variables X and Y:

[tex]f_{Z}(z)=\sum_{x=-\infty}^{x=+\infty}f_{Y}(z-x)f_{X}(x)[/tex]

where x is an integer.

PMF of X and Y (they have the same distribution):

[tex]f(X=x)=\left\{\begin{matrix}a_{1}, x=-1
\\ a_{2}, x=w_{1}
\\ a_{3}, x=w_{2}
\\ ...
\\ a_{n}, x=w_{n-1}

\end{matrix}\right.[/tex]

where w1, w2, ..., wn-1 are real constants.

Problem:

In the series, [tex]f_{X}(x)[/tex] may always be 0 except at x = -1, since w1, w2, ..., wn-1 in general won't be integers.
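The worry about non-integer prizes disappears if the convolution sum runs over the support of X rather than over all integers: the values w1, w2, ... can be any real numbers. A sketch with made-up numbers, storing a PMF as a dictionary from value to probability:

```python
def convolve_pmf(f, g):
    """PMF of X + Y for independent discrete X ~ f and Y ~ g.
    f, g: dicts mapping a value (any real number) to its probability."""
    h = {}
    for x, px in f.items():
        for y, py in g.items():
            # every (x, y) pair contributes px * py to the sum x + y
            h[x + y] = h.get(x + y, 0.0) + px * py
    return h

# Hypothetical one-tourney PMF: -1 for the lost buy-in, non-integer prizes
f = {-1: 0.8, 2.5: 0.15, 7.75: 0.05}

h = convolve_pmf(f, f)   # profit/loss distribution over two tourneys
print(sum(h.values()))   # ≈ 1.0, as a PMF must be
print(h[-2])             # P(both out of the money) = 0.8 * 0.8 ≈ 0.64
```

Note that nothing here requires the support to be integers; the dictionary keys take care of which sums are actually attainable.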
 
  • #11
Tosh5457 said:
Ok I'll write everything again with the pmf included, so everything will be on the same post.

Definition of convolution of pmf's for discrete random variables X and Y:

[tex]f_{Z}(z)=\sum_{x=-\infty}^{x=+\infty}f_{Y}(z-x)f_{X}(x)[/tex]

where x is an integer.

PMF of X and Y (they have the same distribution):

[tex]f(X=x)=\left\{\begin{matrix}a_{1}, x=-1
\\ a_{2}, x=w_{1}
\\ a_{3}, x=w_{2}
\\ ...
\\ a_{n}, x=w_{n-1}

\end{matrix}\right.[/tex]

where w1, w2, ..., wn-1 are real constants.

Problem:

In the series, [tex]f_{X}(x)[/tex] may always be 0 except at x = -1, since w1, w2, ..., wn-1 in general won't be integers.

If X and Y have the same distribution, then you won't have any problem. It doesn't matter whether w1, w2, etc. are integers, rationals, or reals: you're dealing with defining the probabilities for Z. If X and Y take the same values with the same probabilities (remember you said that X and Y have the same distribution, which implies the same a1, w1, w2, etc.), then you will find the probability distribution using the probabilities and then create the mapping to values afterwards.

Are you wondering about the mapping procedure if your values that are mapped to probabilities are not integers?
 
  • #12
chiro said:
you're dealing with defining the probabilities for Z. If X and Y take the same values with the same probabilities (remember you said that X and Y have the same distribution, which implies the same a1, w1, w2, etc.), then you will find the probability distribution using the probabilities and then create the mapping to values afterwards.

Sorry, I didn't understand this part. What do you mean by "find the probability distribution using the probabilities"? Are you referring to the convolution?

Are you wondering about the mapping procedure if your values that are mapped to probabilities are not integers?

Yes, that's my problem.
 
  • #13
Tosh5457 said:
Consider the random variable Y, which is the sum of N random variables X:

[tex]Y=X+X+...+X=NX[/tex]

This is faulty notation if the various 'X's' can have different outcomes. You should say "is the sum of N random variables [itex] X_1,X_2,...X_N [/itex]". If you want to indicate that the [itex] X_i [/itex] have the same distribution, just say that in words. Don't do it by naming them as if they all realize the same value.


[tex]Y(y)=X(x)+X(x)+...+X(x)=NX(x)[/tex]

You mean:
[tex] Y(y) = X_1(x_1) + ...+ X_N(x_N) [/tex]

and you can't combine these terms.

But I don't know how to define y... This is why I can't use the convolution theorem, because I didn't really understand this part.


The variable y represents a particular value of the random variable Y. It's just like saying "Let W be the random variable that represents the value of the face on a roll of a fair die". W(w) would be the particular event that the face value was w (e.g. W(2) says the face that came up was 2).

To find the probability of a particular value of Y, such as y = $40, you must sum the probabilities of all combinations of values [itex] x_i [/itex] for the [itex] X_i [/itex] which add up to $40. So you consider all possible values of the [itex] x_i [/itex] that meet that requirement. That is essentially what "convolving" random variables means.
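This recipe can be sketched by brute-force enumeration of all N-tuples; the per-tourney values and probabilities below are hypothetical:

```python
from itertools import product

# Hypothetical per-tourney PMF (values and probabilities are made up)
pmf = {-1: 0.7, 3.0: 0.3}
N = 3

def prob_Y(y, tol=1e-9):
    """P(Y = y): sum over all N-tuples (x_1, ..., x_N) with
    x_1 + ... + x_N == y of the product P(X=x_1) * ... * P(X=x_N)."""
    total = 0.0
    for combo in product(pmf, repeat=N):
        if abs(sum(combo) - y) < tol:
            p = 1.0
            for x in combo:
                p *= pmf[x]
            total += p
    return total

# y = 1.0 requires two -1's and one 3.0, in any of 3 orders:
# 3 * 0.7 * 0.7 * 0.3 = 0.441
print(prob_Y(1.0))
```

This is exponential in N and is only meant to make the definition concrete; the iterative convolution described below in the thread does the same job efficiently.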
 
  • #14
Tosh5457 said:
Yes, that's my problem.

Stephen Tashi answered your question: consider all possibilities of values given your distributions and then use the convolution algorithm (or theorem) to get the actual probability for each mapping.

The best way to do this is to do it for the first two random variables and then simplify and repeat to incorporate the rest of them. Basically you should have n x m outputs for your convolved distribution, if n is the number of outputs for your starting distribution and m is the number of outputs for the one you are convolving with. When you do this repeatedly, m becomes the number of mappings in the last distribution you calculated through convolution.
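The repeated-convolution procedure might look like the following sketch (the PMF values are made up); each pass folds one more X_i into the running distribution:

```python
def convolve(f, g):
    """PMF of the sum of two independent discrete r.v.'s given as dicts."""
    h = {}
    for x, px in f.items():
        for s, ps in g.items():
            h[x + s] = h.get(x + s, 0.0) + px * ps
    return h

def pmf_of_sum(f, N):
    """PMF of X_1 + ... + X_N, the X_i i.i.d. with PMF f: N-1 convolutions."""
    g = dict(f)
    for _ in range(N - 1):
        g = convolve(g, f)   # fold one more X_i into the running PMF
    return g

# Hypothetical example: 4 tourneys, and the prize need not be an integer
f = {-1: 0.8, 4.5: 0.2}
g = pmf_of_sum(f, 4)
print(sorted(g))   # the 5 distinct possible profits, from -4 up to 18.0
print(g[-4])       # all four out of the money: 0.8**4 ≈ 0.41
```

Note that duplicate sums are merged automatically (their probabilities add), which is exactly the "same value reached by several combinations" bookkeeping the convolution does.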
 
  • #15
Thanks for the replies :smile:

The variable y represents a particular value of the random variable Y. It's just like saying "Let W be the random variable that represents the value of the face on a roll of a fair die". W(w) would be the particular event that the face value was w (e.g. W(2) says the face that came up was 2).

I was taught that the variable w describes the event, not the value of the random variable. In your example, and in my understanding, what would say that the value of the face on a roll of a fair die was 2 is the value of W, not w. I'd represent that as W(2) = 2 or W("2 came up") = 2, depending on what set I define w to be in.

To find the probability of particular value of Y, such as y = $40, you must sum the probabilities of all combinations of values xi for the Xi which add up $40. So you consider all possible values of the xi that meet that requirement. That is essentially what "convolving" random variables means.

Sum the probabilities? Don't you mean multiply? If I sum the probabilities, and if I understood right, I'll get probabilities higher than 1 in some cases.
For example, Y = w1 + w1 + w1 + ... + w1 = N*w1 is a possibility for Y. If I sum the probabilities of X=w1 N times I can easily get a number higher than 1.

This is what I understood, please tell me if I'm wrong:

For simplicity let's say Y = X1 + X2, and each of these X's can only take the values w1 or w2. Then Y will be like this (eta is a dummy variable; I didn't even bother to write it in each value of Y):
[tex]Y(\eta )=\left\{\begin{matrix}w_{1}+w_{1}
\\ w_{1}+w_{2}
\\ w_{2}+w_{1}
\\ w_{2}+w_{2}
\end{matrix}\right.[/tex]

In general, for the sum of 2 random variables, it will have n x m values, where n is the number of possible values X1 can have, and m is the number of possible values X2 can have (like chiro said).

Now, if I attribute a probability of 0.7 to w2 and 0.3 to w1 for example, the probability of Y=w2+w2 would be 1.4, in my understanding.
 
  • #16
Tosh5457 said:
Thanks for the replies :smile:

Sum the probabilities? Don't you mean multiply? If I sum the probabilities, and if I understood right, I'll get probabilities higher than 1 in some cases.
For example, Y = w1 + w1 + w1 + ... + w1 = N*w1 is a possibility for Y. If I sum the probabilities of X=w1 N times I can easily get a number higher than 1.

You won't end up doing this: you have to use the calculations from the convolution algorithm which will involve multiplying probabilities and also adding (using the convolution theorem for discrete random variables).

Remember that it won't be N*w1: instead you will have to do a convolution of the first two random variables, then do a convolution of this calculated PDF with the next variable, and you keep doing this until you have convolved the PDF of the first n-1 random variables with the nth random variable. If you expand this out, you'll see that it is a lot more involved than the N*w1 behaviour that you are implying: it doesn't work like that.

Convolution is simply a way of combining two functions (it corresponds to multiplying their frequency contents together), and this idea is not used only in probability: it appears in many areas of signal analysis and other parts of applied mathematics.
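When the values happen to lie on an integer lattice (or can be shifted onto one), the convolution of the probability arrays can be done directly, e.g. with numpy.convolve; the probabilities below are made up:

```python
import numpy as np

# Integer-lattice example: X takes the values 0, 1, 2 with these probabilities
p = np.array([0.5, 0.3, 0.2])   # P(X=0), P(X=1), P(X=2)

# PMF of X1 + X2 via discrete convolution: supports add, so sums run 0..4
q = np.convolve(p, p)
print(q)   # probabilities of the sums 0, 1, 2, 3, 4
```

For example the middle entry is P(X1+X2 = 2) = 0.5*0.2 + 0.3*0.3 + 0.2*0.5 = 0.29, the same "sum over combinations" as before, organized as an array operation.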
 

1. What is a discrete random variable?

A discrete random variable is a type of random variable that can only take on a finite or countably infinite set of values. These values are often, though not necessarily, whole numbers, and are typically the result of counting some event or phenomenon.

2. How do you calculate the distribution of the sum of two discrete random variables?

To calculate the distribution of the sum of two discrete random variables, you first need to determine the possible values of the sum by adding all possible combinations of values from the two variables. Then, you can calculate the probability of each sum by multiplying the probabilities of the corresponding values from each variable. Finally, you can create a table or graph to display the distribution of the sum.
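As a concrete instance of this recipe, the classic two-dice example, using exact fractions:

```python
from fractions import Fraction

# Fair six-sided die: each face 1..6 has probability 1/6
die = {k: Fraction(1, 6) for k in range(1, 7)}

# Distribution of the sum of two independent dice:
# every ordered pair (a, b) contributes P(a) * P(b) to the sum a + b
total = {}
for a, pa in die.items():
    for b, pb in die.items():
        total[a + b] = total.get(a + b, 0) + pa * pb

print(total[7])   # 1/6: six of the 36 ordered pairs sum to 7
print(total[2])   # 1/36: only (1, 1) sums to 2
```

Using Fraction keeps the arithmetic exact, which makes it easy to check that the resulting probabilities sum to exactly 1.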

3. What is the difference between the distribution of a single discrete random variable and the distribution of the sum of two discrete random variables?

The distribution of a single discrete random variable shows the probability of each possible value that the variable can take on. On the other hand, the distribution of the sum of two discrete random variables shows the probability of each possible sum that can result from adding values from the two variables.

4. Can the distribution of the sum of discrete random variables be approximated by a continuous distribution?

Yes, the distribution of the sum of discrete random variables can often be approximated by a continuous distribution, such as the normal distribution, when the number of variables is large enough and the probabilities are not too close to 0 or 1. This is known as the Central Limit Theorem.
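A quick numerical check of this approximation for a sum of 100 fair coin flips (a binomial variable), comparing the exact tail probability with the continuity-corrected normal approximation:

```python
import math

# Sum of N i.i.d. Bernoulli(1/2) variables: exact binomial vs. normal approx.
N, p = 100, 0.5
mu, sigma = N * p, math.sqrt(N * p * (1 - p))   # mean 50, std dev 5

def norm_cdf(x):
    """CDF of the approximating normal distribution, via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Exact P(S <= 45) from the binomial distribution
exact = sum(math.comb(N, k) for k in range(0, 46)) / 2**N
# Normal approximation with continuity correction
approx = norm_cdf(45.5)

print(exact, approx)   # the two agree to about two decimal places
```

The continuity correction (using 45.5 rather than 45) noticeably improves the agreement, since a discrete mass at 45 is being matched to a continuous density.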

5. How is the distribution of the sum of discrete random variables used in real-world applications?

The distribution of the sum of discrete random variables is used in many real-world applications, such as in finance, engineering, and statistics. For example, it can be used to model the total sales of a company, the amount of rainfall in a certain area, or the duration of a phone call. By understanding the distribution of the sum, scientists and researchers can make predictions and informed decisions in various fields.
