What is Var(X) for my defined X?

In summary, the conversation discusses the mean and variance of a random variable X, defined as the sum of the numbers on n balls drawn at random, without replacement, from N balls numbered 1 to N. The mean is found to be n(N+1)/2 and the variance n(N+1)(N-n)/12. An arithmetic slip in an intermediate expectation caused some confusion along the way, but both participants ultimately reached the same result.
Pzi
This applies to natural numbers $n$ and $N$ where $n<N$.

We have $N$ balls representing numbers $1,2,...,N$.
We randomly choose $n$ of those balls which happen to represent numbers ${k_1},{k_2},...,{k_n}$.
We then define a random variable $X = {k_1} + {k_2} + ... + {k_n}$.
What are the mean and variance of $X$?

Well, there are $\binom{N}{n}$ equally likely combinations, and every one of them contributes $n$ summands, so we get $\binom{N}{n} \cdot n$ summands overall. Since there is no bias towards any particular number, every number is added $\binom{N}{n} \cdot \frac{n}{N}$ times, hence our mean:
$E(X) = \frac{\binom{N}{n} \cdot \frac{n}{N} \cdot (1 + 2 + \dots + N)}{\binom{N}{n}} = \frac{n}{N} \cdot \frac{N(N + 1)}{2} = \frac{n(N + 1)}{2}$.
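
The counting argument above can be checked by brute force for small cases. The following is a minimal sketch, not part of the original post; the helper name `exact_mean` is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

def exact_mean(N, n):
    """Average of the sum over all n-element subsets of {1, ..., N}."""
    sums = [sum(c) for c in combinations(range(1, N + 1), n)]
    return Fraction(sum(sums), len(sums))

# Compare the enumeration with the closed form n(N+1)/2 for a few small cases.
for N, n in [(5, 2), (6, 3), (7, 4)]:
    assert exact_mean(N, n) == Fraction(n * (N + 1), 2)
```

The enumeration grows like $\binom{N}{n}$, so this is only practical for small $N$.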

Unfortunately variance seems to be an entirely different animal since I cannot just rip sums apart and add numbers in a different order. Any ideas?

It is a straightforward, but messy, calculation.

Let $X = \sum k_i$. Then mean $= E(X)$ and var $= E(X^2) - (E(X))^2$.
To compute $E(X)$ all you need is $E(k_i) = (N+1)/2$.

To compute $E(X^2)$, some work is required.
You need (a) $E(k_i k_j)$ for $i \ne j$, and (b) $E(k_i^2)$.
For (a) I got $N(N+1)/4 - (2N+1)/6$, for (b) I got $(N+1)(2N+1)/6$.

To get the mean $E(X)$, multiply $E(k_i)$ by $n$.
For the second moment $E(X^2)$, there are $n(n-1)$ terms of type (a) and $n$ terms of type (b).

Good luck!
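
The term count in the post above follows from expanding the square of the sum. As a sketch, using only the symmetry between the draws (every $k_i$ has the same distribution, and every pair $(k_i, k_j)$ with $i \ne j$ has the same joint distribution):

$$X^2 = \sum_{i=1}^{n} k_i^2 + \sum_{i \ne j} k_i k_j \quad\Longrightarrow\quad E(X^2) = n\,E(k_1^2) + n(n-1)\,E(k_1 k_2)$$

Here the second sum runs over ordered pairs, which is why it has $n(n-1)$ terms. Term (b) is the average of the squares $1^2, \dots, N^2$, that is $E(k_1^2) = \frac{1}{N}\sum_{m=1}^{N} m^2 = \frac{(N+1)(2N+1)}{6}$.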

In case you weren't able to work it out, I got the following:

E(X) = n(N+1)/2
Var(X) = n(N+1)(N-n)/12

See if you get same.
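
Both formulas can be confirmed by exhaustive enumeration for small $N$. This is a minimal sketch rather than the derivation used in the thread; the function name `exact_moments` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def exact_moments(N, n):
    """Return (mean, variance) of the sum of an n-subset of {1, ..., N}."""
    sums = [sum(c) for c in combinations(range(1, N + 1), n)]
    count = len(sums)
    mean = Fraction(sum(sums), count)
    second_moment = Fraction(sum(s * s for s in sums), count)
    return mean, second_moment - mean ** 2

# Check every pair with 1 <= n < N <= 8 against the closed forms.
for N in range(2, 9):
    for n in range(1, N):
        mean, var = exact_moments(N, n)
        assert mean == Fraction(n * (N + 1), 2)
        assert var == Fraction(n * (N + 1) * (N - n), 12)
```

Because all subsets are equally likely, averaging over the enumerated sums gives the exact expectation, so the assertions test the formulas themselves rather than a simulation estimate.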

I appreciate your dedication very much. Sorry for not getting back to this topic in a reasonable amount of time.

So for $i \ne j$ you apparently got $E(k_i k_j) = N(N+1)/4 - (2N+1)/6$.
Meanwhile I got
$$E(k_i k_j) = \frac{1}{\binom{N}{2}}\sum_{i = 1}^{N - 1} \left( i \cdot \sum_{j = i + 1}^{N} j \right) = \frac{2}{N(N - 1)}\sum_{i = 1}^{N - 1} i \cdot \frac{(N + i + 1)(N - i)}{2} = \frac{N(N + 1)}{2} + \frac{1 - 2N}{6} + \frac{N(1 - N)}{4}$$

Is my logic flawed here? Also, I checked the literature for answers and can confirm that your conclusion about the mean and variance is correct.

You made at least one arithmetic error. The summand in the second summation should be $i(N-i+1)(N-i)/2$. I'll let you finish from there.

In any case I got $E(X^2) = n(n-1)(N+1)(3N+2)/12 + n(N+1)(2N+1)/6$,
where the first term is the contribution of all products of different numbers and the second term is the contribution of all the squares.
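
As a check on this expression, subtracting $(E(X))^2 = n^2(N+1)^2/4$ should reproduce the variance quoted earlier in the thread. A sketch of the algebra, using only formulas already stated:

$$\begin{aligned} \mathrm{Var}(X) &= \frac{n(n-1)(N+1)(3N+2)}{12} + \frac{n(N+1)(2N+1)}{6} - \frac{n^2(N+1)^2}{4} \\ &= \frac{n(N+1)}{12}\left[(n-1)(3N+2) + 2(2N+1) - 3n(N+1)\right] \\ &= \frac{n(N+1)(N-n)}{12} \end{aligned}$$

The bracket expands to $3nN + 2n - 3N - 2 + 4N + 2 - 3nN - 3n = N - n$.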

mathman said:
You made at least one arithmetic error. The summand in the second summation should be $i(N-i+1)(N-i)/2$. I'll let you finish from there.

Why do you say so?
$(i+1)+(i+2)+(i+3)+\dots+(N-2)+(N-1)+N = (N+i+1)(N-i)/2$

mathman said:
In any case I got $E(X^2) = n(n-1)(N+1)(3N+2)/12 + n(N+1)(2N+1)/6$,
where the first term is the contribution of all products of different numbers and the second term is the contribution of all the squares.

Well, the first term is $E(k_i k_j)$ added $n(n-1)$ times, isn't it? Hence $E(k_i k_j) = (N+1)(3N+2)/12$ (for $i \ne j$), which is not the same as the $E(k_i k_j) = N(N+1)/4 - (2N+1)/6$ you stated previously.
Sorry if I missed something.

P.S. http://www.wolframalpha.com/input/?i=2/(N*(N-1))*sum(i*sum(j,+j=i+1,N),+i=1,N-1)
It gives the same result as mine, so I'm confused now. Is this the wrong way to calculate $E(k_i k_j)$?
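
The competing expressions for $E(k_i k_j)$ can also be compared numerically. The sketch below is an illustration only (all function names are hypothetical): it evaluates the double sum directly and tests it against the three-term closed form, the factored form $(N+1)(3N+2)/12$, and the value $N(N+1)/4 - (2N+1)/6$ from the earlier post.

```python
from fractions import Fraction

def pair_expectation(N):
    """E(k_i k_j), i != j: average of a*b over all unordered pairs a < b."""
    total = sum(a * b for a in range(1, N) for b in range(a + 1, N + 1))
    return Fraction(2 * total, N * (N - 1))

def three_term_form(N):
    # N(N+1)/2 + (1-2N)/6 + N(1-N)/4
    return Fraction(N * (N + 1), 2) + Fraction(1 - 2 * N, 6) + Fraction(N * (1 - N), 4)

def factored_form(N):
    # (N+1)(3N+2)/12
    return Fraction((N + 1) * (3 * N + 2), 12)

def earlier_value(N):
    # N(N+1)/4 - (2N+1)/6, the value quoted in the first reply
    return Fraction(N * (N + 1), 4) - Fraction(2 * N + 1, 6)

for N in range(2, 30):
    assert pair_expectation(N) == three_term_form(N) == factored_form(N)
    assert earlier_value(N) != pair_expectation(N)
```

For $N = 2$ the only pair is $\{1, 2\}$, so the expectation must be 2. The double sum and both closed forms give 2, while the earlier value gives $2/3$.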

Sorry, I sometimes get sloppy in my arithmetic. On Jan. 11 my error was dividing by $N(N+1)$ when I should have divided by $(N-1)N$, and my comment saying you had an error was also a slip. However, the calculation on Jan. 14 was done very carefully (there I divided by $N(N-1)$). This included the result I mentioned yesterday.

I can't figure out why yours comes out differently.
My calculation: $E(X_i X_j) = \frac{\left(\sum i\right)^2 - \sum i^2}{N(N-1)}$, where the sums run over $i = 1, \dots, N$.
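
Evaluating this expression with the standard sums $\sum_{i=1}^{N} i = \frac{N(N+1)}{2}$ and $\sum_{i=1}^{N} i^2 = \frac{N(N+1)(2N+1)}{6}$ gives, as a sketch:

$$\frac{\frac{N^2(N+1)^2}{4} - \frac{N(N+1)(2N+1)}{6}}{N(N-1)} = \frac{(N+1)\left[3N(N+1) - 2(2N+1)\right]}{12(N-1)} = \frac{(N+1)(3N+2)(N-1)}{12(N-1)} = \frac{(N+1)(3N+2)}{12}$$

The three-term expression $\frac{N(N+1)}{2} + \frac{1-2N}{6} + \frac{N(1-N)}{4}$ has numerator $6N^2 + 6N + 2 - 4N + 3N - 3N^2 = 3N^2 + 5N + 2 = (N+1)(3N+2)$ over 12, so the two calculations agree.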

One quick check, for N = 2, the product of different x's ≡ 2, so the expectation is obviously 2.

In any case, your result is the same as mine.

Thanks for all the help.

To sum things up, I'd just like to say that I almost got it right after your first post. Unfortunately I made a very silly mistake at the very end, which in turn led to a whole phase of contemplation and doubts.

I attach that almost-correct handwritten solution just for fun.
Differences in notation: E(X)=M(X), Var(X)=D(X), X=W.

Attachment: failure.jpg

What is Var(X) for my defined X?

The variance of a random variable X measures how far its values spread around the mean. It is the expected squared deviation from the mean: Var(X) = E[(X - E(X))^2] = E(X^2) - (E(X))^2. For a data set of m equally weighted values this reduces to (sum of (x - mean)^2)/m. For the X defined in this thread, the sum of n numbers drawn without replacement from 1, ..., N, the result is Var(X) = n(N+1)(N-n)/12.
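
For a concrete instance of the definition, take the smallest non-trivial case of this thread's X, with N = 3 and n = 2. The sketch below writes out the distribution by hand and applies the definition directly; the function name `variance` and the dictionary layout are illustrative choices only.

```python
from fractions import Fraction

# X for N = 3, n = 2: the subsets {1,2}, {1,3}, {2,3} give the sums 3, 4, 5,
# each with probability 1/3.
pmf = {3: Fraction(1, 3), 4: Fraction(1, 3), 5: Fraction(1, 3)}

def variance(pmf):
    """Expected squared deviation from the mean for a finite distribution."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

N, n = 3, 2
assert variance(pmf) == Fraction(n * (N + 1) * (N - n), 12)
```

Here the mean is 4 and the squared deviations are 1, 0, 1, so the variance is 2/3, matching n(N+1)(N-n)/12.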

Why is it important to know the variance of my defined variable X?

Knowing the variance of a variable can provide valuable insights into the data and its distribution. It can help identify any outliers or extreme values that may affect the overall analysis. It is also used in many statistical calculations, such as calculating standard deviation and confidence intervals.

How does the variance of my defined variable X affect my analysis?

The variance of a variable can affect the accuracy and reliability of your analysis. If the variance is high, it indicates that the data is spread out and the average value may not be representative of the entire dataset. This can lead to misleading conclusions. On the other hand, a low variance indicates that the data points are close to the mean and the average value is a good representation of the data.

Can the variance of my defined variable X be negative?

No, the variance of a variable can never be negative. It is an average of squared differences between each value and the mean, and squares are never negative. The variance can be zero, which happens only when the variable takes a single value with probability 1.

How can I interpret the variance of my defined variable X?

The interpretation of variance depends on the context of the data and its distribution. In general, a lower variance indicates that the data points are closer to the mean, while a higher variance indicates that the data points are more spread out. It is also important to consider the mean and standard deviation of the data in order to fully understand the variance.
