Var of Sum of Random Variables when n is a Random Variable?

jimmy1
If I have a set of independent and identically distributed random variables X_1, ..., X_n, then Var(\sum_{i=1}^{n}X_i) = \sum_{i=1}^{n}Var(X_i).

Now I want to know what happens to this identity when n is itself a random variable.
I'm guessing the above statement still holds when n is a random variable, but when I work out both sides of it, I get two different answers.

For example, Var(\sum_{i=1}^{n}X_i) is now the variance of a random sum of random variables, which can be worked out using the law of total variance and comes out as E(n)Var(X_1) + Var(n)E(X_1)^2.
But evaluating the other side, \sum_{i=1}^{n}Var(X_i), when n is a random variable gives E(n)Var(X_1).

So I don't understand why I'm getting two different answers here. Which one is correct? I think they should be the same.
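
For reference, a sketch of the law-of-total-variance step behind the first expression (assuming the X_i are i.i.d. and independent of n):

Var(\sum_{i=1}^{n}X_i) = E[Var(\sum_{i=1}^{n}X_i|n)] + Var(E[\sum_{i=1}^{n}X_i|n]) = E[n Var(X_1)] + Var(n E(X_1)) = E(n)Var(X_1) + Var(n)E(X_1)^2,

so the two expressions differ by the Var(n)E(X_1)^2 term.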
 
The first equation (var(sum)=sum(var)) does not hold if n is a random variable.
 
By Law of Total Variance, and letting E[X] = \mu and Var[X] = \sigma^2,

Var[\sum_1^n X] = E[Var[\sum_1^n X|n]] + Var[E[\sum_1^n X|n]] = E[Var[\sum_1^n X]] + Var[n\mu|n] = E[n\sigma^2] + 0 = \sigma^2E[n].
 
EnumaElish said:
Var[\sum_1^n X] = E[Var[\sum_1^n X|n]] + Var[E[\sum_1^n X|n]] = E[Var[\sum_1^n X]] + Var[n\mu|n] = E[n\sigma^2] + 0 = \sigma^2E[n].

Are you sure the terms Var[n\mu|n] and the resulting 0 are correct?

Shouldn't it read

Var[E[\sum_1^n X|n]] = Var[n\mu] = Var[n]\mu^2 ?

And therefore, as mathman has stated, var(sum) = sum(var) does not hold when n is a random variable?
 
I agree with Jimmy.
 
I agree with the general proposition; in my previous post I made an error: I wrote Var[E[\sum_1^n X|n]] = Var[n\mu|n], when it should have been Var[n\mu].

As I thought about the problem, I noticed the following two special cases.

First, if \mu = 0 then the "complicated" formula (with the Var[n\mu] term) reduces to the simple formula. For example, if the X's are distributed normally with mean 0, then there is no difference between the two formulas.

Second, the linear relationship E[\sum_1^n X|n] = a + bn, where a = 0 and b = \mu, implies

Explained variance/Total variance = Var[E[\sum_1^n X|n]]/Var[\sum_1^n X] = Corr[\sum_1^n X, n]^2, or equivalently Var[E[\sum_1^n X|n]] = Corr[\sum_1^n X, n]^2 Var[\sum_1^n X]   [Eq. 1],

which means that the degree to which the simple formula differs from the complicated formula is an empirical question. If it so happens that the correlation between the sum of the X's and n is not significantly different from zero, then the two formulas will produce practically identical results.

Here is a neat point, though: one can look at the equation \sum_1^n X = \alpha + \beta n + \epsilon as a least squares regression, where E\alpha = 0, E\beta = \mu and E\epsilon = 0. Remember that the least squares estimator \beta of b in Y = a + bZ is \beta = Cov(Y,Z)/Var[Z]. By letting Y = \sum_1^n X and Z = n, one has \beta = Cov(\sum_1^n X, n)/Var[n]. But E\beta = \mu, so when Var[n] is given, there is a direct relationship between the Cov term and \mu, the mean of each X. And since Corr(Y,Z) = Cov(Y,Z)/(\sigma_Y\sigma_Z), there is a direct relationship between Corr[\sum_1^n X, n] and \mu.

With a little computer programming, one can verify that when the X's are i.i.d. Normal[0,1], Corr[\sum_1^n X, n] \to 0 as expected (because \mu = 0, the linear relationship implies zero correlation: b = \mu = 0 and the [unbiased] least squares estimator of b = \beta = 0; therefore the correlation has to be zero).
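
A minimal simulation sketch of that check, assuming n is Poisson and the X's are Normal (both arbitrary choices, as is the helper name "check"); it also estimates the regression slope \beta = Cov(\sum_1^n X, n)/Var[n] discussed above:

import numpy as np

def check(mu, sigma=1.0, lam=10.0, trials=50_000, seed=0):
    # Compare the empirical variance of a random sum with the two formulas.
    rng = np.random.default_rng(seed)
    n = rng.poisson(lam, size=trials)                      # random number of terms per trial
    sums = np.array([rng.normal(mu, sigma, size=k).sum() for k in n])

    simple = n.mean() * sigma**2                           # E[n] Var(X)
    full = n.mean() * sigma**2 + n.var() * mu**2           # E[n] Var(X) + Var(n) E[X]^2
    corr = np.corrcoef(sums, n)[0, 1]                      # Corr[sum, n]
    beta = np.cov(sums, n, ddof=0)[0, 1] / n.var()         # least squares slope, should be ~ mu
    print(f"mu={mu}: empirical Var={sums.var():.2f}, simple={simple:.2f}, "
          f"full={full:.2f}, Corr={corr:.3f}, beta={beta:.3f}")

check(mu=0.0)   # Corr near 0; the simple and full formulas agree
check(mu=2.0)   # Corr well away from 0; only the full formula matches the empirical variance

With \mu = 0 the two formulas coincide and the correlation is essentially zero, as argued above; with \mu = 2 only the full formula tracks the empirical variance.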

The intuition is that if the X's are sampled equally on both sides of the origin, then the number of X's being summed up does not change the expected value of the sum (= zero). Therefore the correlation between the sum and n is zero. Even neater, if \mu\approx 0, then Cov \approx 0, i.e. the simple relationship can be a good approximation even when \mu isn't identically zero. The approximation worsens as \mu gets farther away from zero. Which is a roundabout way of saying that Var[n\mu] \approx 0 if \mu\approx 0.
 
hello people,

this is sofie.
i was wondering how jimmy1 got the result:
var(\sum_{i=1}^{N}X_i) = E(N)var(X) + var(N)(E(X))^2 using the law of total variance.

hope someone can tell me,
thanks!
 
I'm confused because you all use the notation n both for the random variable N and for a value of N.
I get E(n Var(X_1)) + (E(X_1))^2 Var(n), and I don't see how that is the same as the result when n is replaced by N.
 
