# What is Var(X) for my defined X?

1. Jan 11, 2013

### Pzi

This applies to natural numbers $n$ and $N$ where $n<N$.

We have $N$ balls representing numbers $1,2,...,N$.
We randomly choose $n$ of those balls which happen to represent numbers ${k_1},{k_2},...,{k_n}$.
We then define a random variable $X = {k_1} + {k_2} + ... + {k_n}$.
What are the mean and variance of $X$?

Well, there are $\binom{N}{n}$ equally likely combinations and every one of them brings $n$ summands, so we get $\binom{N}{n} \cdot n$ summands overall. Since there is no bias towards any particular number, every number is added $\binom{N}{n} \cdot \frac{n}{N}$ times, hence our mean:
$E(X) = \frac{\binom{N}{n} \cdot \frac{n}{N} \cdot (1 + 2 + \dots + N)}{\binom{N}{n}} = \frac{n}{N} \cdot \frac{N(N+1)}{2} = \frac{n(N+1)}{2}$.
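
The counting argument can be sanity-checked by brute force for small cases. The sketch below is only an illustration (the helper name `mean_of_sum` is made up here): it enumerates every $n$-subset of $\{1,\dots,N\}$ and compares the exact average of the sums with $n(N+1)/2$.

```python
from fractions import Fraction
from itertools import combinations


def mean_of_sum(N, n):
    """Exact E(X), found by enumerating all n-subsets of {1, ..., N}."""
    sums = [sum(c) for c in combinations(range(1, N + 1), n)]
    return Fraction(sum(sums), len(sums))


# Compare against the closed form n(N+1)/2 for a few small cases.
for N, n in [(5, 2), (6, 3), (7, 4)]:
    assert mean_of_sum(N, n) == Fraction(n * (N + 1), 2)
```

Exact fractions are used instead of floats so that the comparison is an equality, not an approximation.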

Unfortunately variance seems to be an entirely different animal since I cannot just rip sums apart and add numbers in a different order. Any ideas?

Last edited: Jan 11, 2013
2. Jan 11, 2013

### mathman

It is a straightforward, but messy, calculation.

Let $X = \sum_i k_i$. Then mean $= E(X)$ and $\mathrm{Var}(X) = E(X^2) - (E(X))^2$.
To compute $E(X)$ all you need is $E(k_i) = (N+1)/2$.

To compute $E(X^2)$, some work is required.
You need (a) $E(k_i k_j)$ for $i \ne j$, and (b) $E(k_i^2)$.
For (a) I got $N(N+1)/4 - (2N+1)/6$, for (b) I got $(N+1)(2N+1)/6$.

To get the final answer for the mean, multiply $E(k_i)$ by $n$.
For the second moment $E(X^2)$, there are $n(n-1)$ (a) terms and $n$ (b) terms.

Good luck!

3. Jan 14, 2013

### mathman

In case you weren't able to work it out, I got the following:

$E(X) = n(N+1)/2$
$\mathrm{Var}(X) = n(N+1)(N-n)/12$

See if you get the same.
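
For small $N$ both formulas can be checked exactly by enumeration. This is a minimal sketch, not part of the original calculation, and the helper name `moments_of_sum` is hypothetical:

```python
from fractions import Fraction
from itertools import combinations


def moments_of_sum(N, n):
    """Return exact (mean, variance) of X over all n-subsets of {1, ..., N}."""
    sums = [sum(c) for c in combinations(range(1, N + 1), n)]
    count = len(sums)
    mean = Fraction(sum(sums), count)
    second_moment = Fraction(sum(s * s for s in sums), count)
    return mean, second_moment - mean ** 2


# Check E(X) = n(N+1)/2 and Var(X) = n(N+1)(N-n)/12 for all n < N <= 8.
for N in range(2, 9):
    for n in range(1, N):
        mean, var = moments_of_sum(N, n)
        assert mean == Fraction(n * (N + 1), 2)
        assert var == Fraction(n * (N + 1) * (N - n), 12)
```

The enumeration grows like $\binom{N}{n}$, so it is only practical as a check for small values.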

4. Jan 17, 2013

### Pzi

I appreciate your dedication very much. Sorry for not getting back to this topic in a reasonable amount of time.

So for $i \ne j$ you apparently got $E(k_i k_j) = N(N+1)/4 - (2N+1)/6$.
Meanwhile I got
$$E(k_i k_j) = \frac{1}{\binom{N}{2}} \sum_{i=1}^{N-1} i \sum_{j=i+1}^{N} j = \frac{2}{N(N-1)} \sum_{i=1}^{N-1} i \cdot \frac{(N+i+1)(N-i)}{2} = \frac{N(N+1)}{2} + \frac{1-2N}{6} + \frac{N(1-N)}{4}$$

Is my logic flawed here? Also, I checked the literature for answers and can confirm that your conclusion about the mean and variance is correct.
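
One way to test the double sum independently is to evaluate it numerically. The sketch below (with a made-up helper name, `pair_product_mean`) averages $ij$ over all unordered pairs $i<j$ and compares it with the three-term expression above, which factors as $(N+1)(3N+2)/12$.

```python
from fractions import Fraction


def pair_product_mean(N):
    """Exact average of i*j over all unordered pairs 1 <= i < j <= N."""
    total = sum(i * j for i in range(1, N) for j in range(i + 1, N + 1))
    pairs = N * (N - 1) // 2  # number of unordered pairs, C(N, 2)
    return Fraction(total, pairs)


for N in range(2, 30):
    three_term = (Fraction(N * (N + 1), 2)
                  + Fraction(1 - 2 * N, 6)
                  + Fraction(N * (1 - N), 4))
    factored = Fraction((N + 1) * (3 * N + 2), 12)
    # The double sum, the three-term form, and the factored form all agree.
    assert pair_product_mean(N) == three_term == factored
```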

5. Jan 17, 2013

### mathman

You made at least one arithmetic error. The summand in the second summation should be $i(N-i+1)(N-i)/2$. I'll let you finish from there.

In any case I got $E(X^2) = n(n-1)(N+1)(3N+2)/12 + n(N+1)(2N+1)/6$,
where the first term is the contribution of all products of different numbers and the second term is the contribution of all the squares.
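
As a worked check (a sketch of the algebra, not necessarily the route taken in the careful calculation), subtracting $(E(X))^2 = n^2(N+1)^2/4$ from this second moment gives the variance:

$$\begin{aligned}
\mathrm{Var}(X) &= \frac{n(n-1)(N+1)(3N+2)}{12} + \frac{n(N+1)(2N+1)}{6} - \frac{n^2(N+1)^2}{4} \\
&= \frac{n(N+1)}{12}\Big[(n-1)(3N+2) + 2(2N+1) - 3n(N+1)\Big] \\
&= \frac{n(N+1)}{12}\Big[(3nN + 2n + N) - (3nN + 3n)\Big] \\
&= \frac{n(N+1)(N-n)}{12}
\end{aligned}$$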

Last edited: Jan 17, 2013
6. Jan 17, 2013

### Pzi

Why do you say so?
$(i+1)+(i+2)+(i+3)+\dots+(N-2)+(N-1)+N = (N+i+1)(N-i)/2$

Well, the first term is $E(k_i k_j)$ added $n(n-1)$ times, isn't it? Hence $E(k_i k_j) = (N+1)(3N+2)/12$ (for $i \ne j$), which is not the same as what you stated previously, $E(k_i k_j) = N(N+1)/4 - (2N+1)/6$.
Sorry if I missed something.

P.S. http://www.wolframalpha.com/input/?i=2/(N*(N-1))*sum(i*sum(j,+j=i+1,N),+i=1,N-1)
It gives the same result as mine, so I'm confused now. Is this the wrong way to calculate $E(k_i k_j)$?

Last edited: Jan 17, 2013
7. Jan 18, 2013

### mathman

Sorry, I sometimes get sloppy in my arithmetic. On Jan. 11 my error was dividing by $N(N+1)$ when I should have divided by $(N-1)N$, and my comment saying you had an error was also wrong. However, the calculation on Jan. 14 was done very carefully (there I divided by $N(N-1)$). This included the result I mentioned yesterday.

I can't figure out why yours comes out differently.
My calculation: $E(X_i X_j) = \dfrac{\left(\sum i\right)^2 - \sum i^2}{N(N-1)}$, where the sums run over $i = 1, \dots, N$.
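
For reference, here is a sketch of how that expression simplifies, writing $k_i$ for the individual draws and using the standard sums $\sum i = N(N+1)/2$ and $\sum i^2 = N(N+1)(2N+1)/6$:

$$\begin{aligned}
E(k_i k_j) &= \frac{1}{N(N-1)}\left[\frac{N^2(N+1)^2}{4} - \frac{N(N+1)(2N+1)}{6}\right] \\
&= \frac{(N+1)\big[3N(N+1) - 2(2N+1)\big]}{12(N-1)} \\
&= \frac{(N+1)(3N^2 - N - 2)}{12(N-1)} \\
&= \frac{(N+1)(3N+2)(N-1)}{12(N-1)} = \frac{(N+1)(3N+2)}{12}
\end{aligned}$$

Expanding $(N+1)(3N+2)/12 = (3N^2+5N+2)/12$ shows it equals $\frac{N(N+1)}{2} + \frac{1-2N}{6} + \frac{N(1-N)}{4}$, so the two approaches agree.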

Last edited: Jan 18, 2013
8. Jan 18, 2013

### mathman

One quick check: for $N = 2$, the product of different x's is always $1 \cdot 2 = 2$, so the expectation is obviously 2.

In any case, your result is the same as mine.

Last edited: Jan 18, 2013
9. Jan 20, 2013

### Pzi

Thanks for all the help.

To sum things up, I'd just like to say that I almost got it right after your first post. Unfortunately I made a very silly mistake at the very end, which in turn led to a whole phase of contemplation and doubts.

I attach that almost-correct handwritten solution just for fun.
Differences in notation: E(X)=M(X), Var(X)=D(X), X=W.
