Probability: What are E(x) and Var(x) of this specific X

In summary, the probability density function (p.d.f.) of a random variable X is P(X=k)= p^{k} (1-p), where 0<p<1 and k=0,1,2,.... The moment generating function (m.g.f.) of X is \psi (t) = \frac{1-p}{1-e^tp}, which gives E(X)= \frac{p}{1-p} and Var(X) = \frac{p}{(1-p)^2}. If X_{1},\; X_{2},\; ...,\; X_{n} are sampled from X, the sum \sum_{i=1}^{n} X_i has a negative-binomial-type distribution with P(\sum_{i=1}^{n} X_i = k) = {n+k-1 \choose k}(1-p)^n p^k for k = 0, 1, 2, ....
  • #1
sanctifier

Homework Statement


The probability density function (p.d.f.) of a random variable X is

[itex] P(X=k)= p^{k} (1-p), \;\;\; \text{where} \;\;\; 0<p<1 \;\;\; \text{and} \;\;\; k=0,1,2,... [/itex]

Question 1: what is the moment generating function (m.g.f.) of X?

Question 2: What are expectation E(x) and variance Var(x) of X?

Question 3: if [itex] X_{1},\; X_{2},\;\;...,\;\; X_{n} [/itex] are sampled from X, what is the distribution of [itex] \sum_{i=1}^{n} X_i [/itex]

Homework Equations


Nothing special.


The Attempt at a Solution


Answer 1:

[itex] E[e^{tX} ] = \sum_{x=0}^{ \infty } e^{tx} p^x (1-p)=(1-p) \sum_{x=0}^{ \infty } (e^t p)^x=(1-p)\lim_{n \to \infty} \frac{1-(e^t p)^n}{1-e^tp} = \frac{1-p}{1-e^tp} [/itex]
[itex] \text{when} \;\;0<e^tp<1,\;\;\text{i.e.,}\;\;t<\ln \frac{1}{p} [/itex]
Hence, m.g.f. is [itex] \psi (t) = \frac{1-p}{1-e^tp} [/itex]
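
A quick numerical sanity check of this closed form takes only a few lines of Python; in the minimal sketch below the particular values of p and t are illustrative assumptions, chosen so that t < ln(1/p).

Code:
import math

# Illustrative values (assumptions): any 0 < p < 1 and t < ln(1/p) will do.
p, t = 0.4, 0.5                      # ln(1/0.4) is about 0.916, so t = 0.5 is in range
series = sum(math.exp(t * k) * p**k * (1 - p) for k in range(2000))  # truncated series
closed_form = (1 - p) / (1 - math.exp(t) * p)
print(series, closed_form)           # the two values agree to many decimal places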

Answer 2:


[itex] \begin{cases} \psi' (t) = \frac{d\psi (t)}{dt} = - \frac{1-p}{(1-e^tp)^2} (-e^tp) = \frac{(1-p)e^tp}{(1-e^tp)^2} \\ \psi'' (t) = \frac{d\psi' (t)}{dt} =(1-p)p \frac{e^t(1-e^tp)^2+2e^t(1-e^tp)e^tp}{(1-e^tp)^4}=(1-p)pe^t \frac{1+e^tp}{(1-e^tp)^3} \end{cases} [/itex]

[itex] \begin{cases} \psi' (0) = \frac{(1-p)p}{(1-p)^2}= \frac{p}{1-p} \\ \psi'' (0) = \frac{(1-p)p(1+p)}{(1-p)^3} = \frac{p(1+p)}{(1-p)^2} \end{cases} [/itex]

[itex] E(X)= \psi' (0) \;\; and\;\; Var(X) = E[X^2]-E[X]^2 = \psi'' (0) - ( \psi' (0))^2 = \frac{p}{(1-p)^2} [/itex]
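
These closed forms are easy to check numerically. The minimal Python sketch below (the value of p and the truncation point are illustrative assumptions) sums the pmf directly and compares the results to p/(1-p) and p/(1-p)^2.

Code:
p = 0.3                                   # illustrative value (assumption)
ks = range(5000)                          # truncation; the tail decays geometrically
pmf = [p**k * (1 - p) for k in ks]
mean = sum(k * w for k, w in zip(ks, pmf))
second = sum(k**2 * w for k, w in zip(ks, pmf))
var = second - mean**2
print(mean, p / (1 - p))                  # both approximately 0.428571
print(var, p / (1 - p)**2)                # both approximately 0.612245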

Answer 3:

[itex] {\overline{X}} _n = \frac{\sum_{i=1}^{n} X_i }{n} [/itex] can be approximated by a normal distribution with [itex] \mu =E(X) [/itex] and [itex] \sigma ^2 =\frac{Var(X)}{n} [/itex]; because [itex] \sum_{i=1}^{n} X_i = n {\overline{X}} _n [/itex], [itex] \sum_{i=1}^{n} X_i [/itex] is approximately normal with [itex] \mu =nE(X) [/itex] and [itex] \sigma ^2 =nVar(X) [/itex].

Are these answers correct? Thank you in advance!
 
  • #2
sanctifier said:

Question 3: if [itex] X_{1},\; X_{2},\;\;...,\;\; X_{n} [/itex] are sampled from X, what is the distribution of [itex] \sum_{i=1}^{n} X_i [/itex]

Answer 3:

[itex] {\overline{X}} _n = \frac{\sum_{i=1}^{n} X_i }{n} [/itex] can be approximated by a normal distribution with [itex] \mu =E(X) [/itex] and [itex] \sigma ^2 =\frac{Var(X)}{n} [/itex]; because [itex] \sum_{i=1}^{n} X_i = n {\overline{X}} _n [/itex], [itex] \sum_{i=1}^{n} X_i [/itex] is approximately normal with [itex] \mu =nE(X) [/itex] and [itex] \sigma ^2 =nVar(X) [/itex].

Are these answers correct? Thank you in advance!

Question 3 did not specify an approximation; it asked for the distribution of the sum. Hint: your question is closely related to the so-called negative binomial distribution. See, e.g., http://en.wikipedia.org/wiki/Negative_binomial_distribution and http://www.math.ntu.edu.tw/~hchen/teaching/StatInference/notes/lecture16.pdf .
 
  • #3
Because a sample [itex] (X_{1},\; X_{2},\;\;...,\;\; X_{n}) [/itex] is taken from X, it can be envisaged as a vector in the n-dimensional sample space, in which each vector is assigned the probability [itex] P(X_1)P(X_2)...P(X_n) [/itex].

Since what we are concerned with is [itex] \sum_{i=1}^{n} X_i [/itex], the order of the components in a vector does not matter.

The number of all vectors whose component sum is [itex] \sum_{i=1}^{n} X_i [/itex] should be [itex]{\sum_{i=1}^{n} X_i +n \choose n} [/itex].

Hence, the p.d.f. of [itex] \sum_{i=1}^{n} X_i [/itex] is

[itex]{\sum_{i=1}^{n} X_i +n \choose n} P(X_1)P(X_2)...P(X_n) ={\sum_{i=1}^{n} X_i +n \choose n} p^{\sum_{i=1}^{n} X_i } q^n[/itex]

I don’t know whether this is the correct solution; please proofread it, Ray. Thank you in advance.
 
  • #4
sanctifier said:
Because a sample [itex] (X_{1},\; X_{2},\;\;...,\;\; X_{n}) [/itex] is taken from X, it can be envisaged as a vector in the n-dimensional sample space, in which each vector is assigned the probability [itex] P(X_1)P(X_2)...P(X_n) [/itex].

Since what we are concerned with is [itex] \sum_{i=1}^{n} X_i [/itex], the order of the components in a vector does not matter.

The number of all vectors whose component sum is [itex] \sum_{i=1}^{n} X_i [/itex] should be [itex]{\sum_{i=1}^{n} X_i +n \choose n} [/itex].

Hence, the p.d.f. of [itex] \sum_{i=1}^{n} X_i [/itex] is

[itex]{\sum_{i=1}^{n} X_i +n \choose n} P(X_1)P(X_2)...P(X_n) ={\sum_{i=1}^{n} X_i +n \choose n} p^{\sum_{i=1}^{n} X_i } q^n[/itex]

I don’t know whether this is the correct solution; please proofread it, Ray. Thank you in advance.

Did you take the time to read the citations I supplied? My guess is that you did not, at least judging from your answer. I can't figure out what you are doing, but I don't think it is meaningful.
 
  • #5
Ray Vickson said:
Did you take the time to read the citations I supplied? My guess is that you did not, at least judging from your answer. I can't figure out what you are doing, but I don't think it is meaningful.

It's wrong? Then what is the correct one?

I read the citations and then tried to follow the pattern of the negative binomial distribution; that's why the resulting p.d.f. looks like that of the negative binomial distribution.
 
  • #6
sanctifier said:
It's wrong? Then what is the correct one?

I read the citations and then tried to follow the pattern of the negative binomial distribution; that's why the resulting p.d.f. looks like that of the negative binomial distribution.

Well, I cannot figure out what you are doing. The quantities ##X_i## are random variables, but you are using them in things like
[tex] {\sum_{i=1}^{n} X_i +n \choose n}[/tex]
etc. That makes no sense, and is not how people write things out in probability presentations. The random variable ##\sum_{i=1}^n X_i## can take values of ##k = 0, 1, 2, \ldots ##, and for each such value of ##k## you are supposed to find ##P(\sum_{i=1}^n X_i = k)##.
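For concreteness, here is a minimal brute-force Python sketch (the value of p and the truncation limit are illustrative assumptions) that computes these probabilities for n = 2 by convolving the pmf with itself; for general n you would convolve n - 1 times.

Code:
p = 0.35                                   # illustrative value (assumption)
K = 40                                     # truncation; the neglected tail mass is tiny
pmf_X = [p**k * (1 - p) for k in range(K)]
pmf_sum = [sum(pmf_X[i] * pmf_X[k - i] for i in range(k + 1)) for k in range(K)]
print([round(q, 4) for q in pmf_sum[:5]])  # P(X1 + X2 = k) for k = 0, 1, 2, 3, 4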
 
  • #7
Ray Vickson said:
Question 3 did not specify an approximation; it asked for the distribution of the sum.
I concur. Almost. I concur that the result should be exact. Question 3 did not ask for the distribution of the sum. It only asked for the mean and variance of the sum.
Hint: your question is closely related to the so-called negative binomial distribution. see, eg., http://en.wikipedia.org/wiki/Negative_binomial_distribution and http://www.math.ntu.edu.tw/~hchen/teaching/StatInference/notes/lecture16.pdf .
That's good, but personally I think this question is even more closely related to the geometric distribution. Substitute p with 1-p and it is the geometric distribution.

That said, this isn't the best of hints. An even better hint: there's a nice, simple relationship between the moment generating function of a sum of N i.i.d. random variables and the moment generating function of each of those random variables.
 
  • #8
Finally, I know what’s going on. Let [itex] Y=\sum_{i=1}^n X_i [/itex].

Then [itex] \psi_Y(t) =\prod_{i=1}^{n}\psi_i(t)=( \frac{1-p}{1-e^tp} )^n [/itex].

Compare [itex] \psi_Y(t) [/itex] with the m.g.f. of a negative binomial distribution: it is actually the m.g.f. of a negative binomial distribution with parameters n and 1-p.

Thank you both!
 
  • #9
sanctifier said:
Finally, I know what’s going on. Let [itex] Y=\sum_{i=1}^n X_i [/itex].

Then [itex] \psi_Y(t) =\prod_{i=1}^{n}\psi_i(t)=( \frac{1-p}{1-e^tp} )^n [/itex].

Compare [itex] \psi_Y(t) [/itex] with the m.g.f. of a negative binomial distribution: it is actually the m.g.f. of a negative binomial distribution with parameters n and 1-p.

Thank you both!

It is not quite a negative binomial, because your "geometric" random variable starts at 0 instead of 1 (and the normal definition of negative binomial uses the geometric that starts at 1---so the negative binomial starts at n). So, you need to modify things a bit.
 
  • #10
sanctifier said:
Finally, I know what’s going on. Let [itex] Y=\sum_{i=1}^n X_i [/itex].

Then [itex] \psi_Y(t) =\prod_{i=1}^{n}\psi_i(t)=( \frac{1-p}{1-e^tp} )^n [/itex].

Compare [itex] \psi_Y(t) [/itex] with the m.g.f. of a negative binomial distribution: it is actually the m.g.f. of a negative binomial distribution with parameters n and 1-p.

Thank you both!

It is not quite a negative binomial, because your "geometric" random variable starts at 0 instead of 1 (and the normal definition of negative binomial uses the geometric that starts at 1---so the negative binomial starts at n). It was exactly for that reason that I included two citations, and it is for that reason that you need to be careful.
 
  • #11
Starting from zero rather than one is one of the two definitions of a geometric random variable.

sanctifier, I suspect the instructor wants you to determine the mean and variance from the mgf you found rather than looking it up. In other words, compute the moments from ##\left.\psi_n'(t;p)\right|_{t=0}## and ##\left.\psi_n''(t;p)\right|_{t=0}## and then compute the mean and variance from those moments.
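
If you want to check the end result of that computation symbolically, here is a minimal sympy sketch (assuming sympy is available; the symbol names are arbitrary) that differentiates the m.g.f. of the sum from post #8 and evaluates at t = 0.

Code:
import sympy as sp

t = sp.symbols('t')
p = sp.symbols('p', positive=True)
n = sp.symbols('n', positive=True, integer=True)

psi_sum = ((1 - p) / (1 - p * sp.exp(t)))**n    # m.g.f. of the sum (see post #8)

first = sp.diff(psi_sum, t).subs(t, 0)          # first moment of the sum
second = sp.diff(psi_sum, t, 2).subs(t, 0)      # second moment of the sum
print(sp.simplify(first))                       # equivalent to n*p/(1 - p)
print(sp.simplify(second - first**2))           # equivalent to n*p/(1 - p)**2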
 
  • #12
Actually, I have no instructor to consult.

I learned probability and statistics by reading textbooks and doing the exercises at the end of each chapter.

This may be the only place I can acquire the correct answers with the help of kind people like you and Ray.

The current answer is good enough for me.

Thank you very much.
 
  • #13
sanctifier said:
The current answer is good enough for me.
It shouldn't be. By looking for a match in a table of commonly used distributions you are missing a key point of moment generating functions. They are extremely useful for calculating moments, including the moments of sums of independent random variables (not necessarily i.i.d.). When looking at things from the perspective of the pdf you get convolutions, and these can get incredibly convoluted (pardon the pun) when the number of random variables is large. Those convolutions become a product when you use the moment generating functions.

That product of moment generating functions becomes even easier if the random variables to be summed are i.i.d. In particular, the product ##m_{X_1}(t) m_{X_2}(t) \cdots m_{X_n}(t)## becomes ##m_X(t)^n##. I'm not going to derive the mean and variance for you using this fact. You should. It's not hard.

When you are studying on your own, you are only hurting yourself when you take shortcuts such as looking up the answer rather than deriving it.
 
  • #14
sanctifier said:
Actually, I have no instructor to consult.

I learned probability and statistics by reading textbooks and doing the exercises at the end of each chapter.

This may be the only place I can acquire the correct answers with the help of kind people like you and Ray.

The current answer is good enough for me.

Thank you very much.

I would concur with some of what DH has told you about the importance of moment-generating functions, but I would not agree that moment-generating functions are the best way to go in this problem. Probability generating functions are typically more useful when dealing with discrete random variables, so using ##\tilde{p}(z) = \sum_{k=0}^{\infty} p_k z^k## would be more usual in this type of problem. In fact, that is precisely how the Negative Binomial distribution gets its name: its probability generating function
[tex] \tilde{P}(z) = \left( \frac{1-p}{1-pz} \right)^n = (1-p)^n (1-pz)^{-n} [/tex]
involves the negative binomial expansion
[tex] (1-pz)^{-n}= \sum_{k=0}^{\infty}(-1)^k {-n \choose k} p^k z^k
= \sum_{k=0}^{\infty} \frac{n (n+1) \cdots (n+k-1)}{k!} p^k z^k
= \sum_{k=0}^{\infty} {n+k-1 \choose k} p^k z^k [/tex]
This gives
[tex] \text{P} \left(\sum_{i=1}^n X_i = k \right) = {n+k-1 \choose k} (1-p)^n p^k, \: k = 0,1,2 \ldots . [/tex]
You could, of course, get the same type of thing from the mgf, but then you would be looking for the coefficients of ##e^{kt}## instead of ##z^k##. It is largely a matter of preference, but you seem to be married to the mgf, and that is not necessarily the best way to learn the subject. What you are doing is not wrong; it is just not what the majority of people working in this area have found by experience to be the most useful.
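
A quick Monte Carlo check of this pmf takes a few lines of Python; in the sketch below the values of p, n, and the number of trials are illustrative assumptions. Note that numpy's geometric sampler counts trials up to the first success (values 1, 2, ...), so each draw is shifted down by 1 to match k = 0, 1, 2, ....

Code:
import math
from collections import Counter
import numpy as np

p, n, trials = 0.35, 4, 200_000             # illustrative values (assumptions)
rng = np.random.default_rng(0)

# shift each geometric draw by 1 so that a single variable takes values 0, 1, 2, ...
samples = rng.geometric(1 - p, size=(trials, n)).sum(axis=1) - n
counts = Counter(samples)

for k in range(6):
    empirical = counts[k] / trials
    exact = math.comb(n + k - 1, k) * (1 - p)**n * p**k
    print(k, round(empirical, 4), round(exact, 4))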

Also, when seeking means and variances, the fastest and easiest way is to use the results
[tex] E \sum X_i = \sum E X_i \\
\text{Var} \sum X_i = \sum \text{Var} X_i[/tex]
The first holds whether or not the ##X_i## are independent, while the second holds if the ##X_i## are independent or, at least, uncorrelated.

So, in a problem like this one you could (if you wished) use the mgf to get the mean and variance of a single ##X_i##, then just multiply by ##n## to get the mean and variance of ##\sum X_i##.
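
A short simulation sketch (again with illustrative, assumed parameter values) confirms numerically that the mean and variance of the sum are n times those of a single ##X_i##:

Code:
import numpy as np

p, n, trials = 0.3, 5, 300_000                   # illustrative values (assumptions)
rng = np.random.default_rng(1)
X = rng.geometric(1 - p, size=(trials, n)) - 1   # i.i.d. copies of X, starting at 0
S = X.sum(axis=1)
print(S.mean(), n * p / (1 - p))                 # both approximately 2.1429
print(S.var(), n * p / (1 - p)**2)               # both approximately 3.0612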
 

FAQ: Probability: What are E(x) and Var(x) of this specific X

1. What does E(x) represent in probability?

E(x) or expected value is a measure of central tendency that represents the average value of a random variable over an infinite number of trials. It takes into account the probability of each possible outcome and multiplies it by the corresponding value of the random variable.

2. How is E(x) calculated?

The formula for E(x) is: E(x) = Σ(x * P(x)), where x represents the value of the random variable and P(x) represents the probability of that value occurring. This calculation is performed for each possible outcome and the results are summed together to get the expected value.

3. What is Var(x) in probability?

Var(x) or variance is a measure of the spread or variability of a random variable's values. It tells us how much the random variable deviates from its expected value, E(x). A smaller variance indicates that the values are closer to the expected value, while a larger variance indicates more variability.

4. How is Var(x) calculated?

The formula for Var(x) is: Var(x) = Σ((x-E(x))^2 * P(x)), where x represents the value of the random variable, E(x) represents the expected value, and P(x) represents the probability of that value occurring. This calculation is performed for each possible outcome and the results are summed together to get the variance.
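
As a small worked example of the two formulas above (a fair six-sided die, used purely for illustration), in Python:

Code:
values = [1, 2, 3, 4, 5, 6]
prob = 1 / 6                                   # fair die: each outcome equally likely
E = sum(x * prob for x in values)              # E(x) = 3.5
Var = sum((x - E)**2 * prob for x in values)   # Var(x) = 35/12, about 2.9167
print(E, Var)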

5. What is the relationship between E(x) and Var(x)?

The expected value, E(x), is a measure of central tendency, while the variance, Var(x), is a measure of spread; they describe different aspects of a random variable. E(x) tells us the long-run average value of the variable, while Var(x) tells us how much the values deviate from this average. A higher variance means the values are more spread out, while a lower variance means the values are closer to the expected value.
