# Standard deviation of m sample means of n observations each

1. Dec 15, 2014

### FredericChopin

1. The problem statement, all variables and given/known data
"Consider a standard uniform density. The mean for this density is .5 and the variance is 1 / 12. You sample 1,000 sample means where each sample mean is comprised of 100 observations. You take the standard deviation of the 1,000 sample means. About what number would you expect it to be?

A: $\frac{1}{12}$

B: $\frac{1}{ \sqrt{12*100} }$

C: $\frac{1}{(12*100)}$

D: $\frac{1}{ \sqrt{12*1000} }$"

2. Relevant equations
$$Var( \bar{X} ) = \frac{ \sigma ^{2} }{n}$$

3. The attempt at a solution
If the variance of a sample mean with n observations is $Var( \bar{X} ) = \frac{ \sigma ^{2} }{n}$, then that would mean that the standard deviation of a sample mean with n observations is $\sqrt{ \frac{ \sigma ^{2} }{n} }$, which is $\frac{ \sigma }{ \sqrt{n} }$.

But in this problem, there are 1000 sample means each with 100 observations. In general, this is a question about finding the standard deviation of m sample means of n observations each.

I'm puzzled that there is no option reflecting the total of 1000*100 observations, which would give a standard deviation of $\sqrt{ \frac{ \frac{1}{12} }{1000*100} }$. So I tried using the fact that there are 1000 sample means and answered $\sqrt{ \frac{ \frac{1}{12} }{1000} }$, but it got marked incorrect. Is it possible that, because there are 100 observations in each sample mean, regardless of how many sample means there are, the answer is $\sqrt{ \frac{ \frac{1}{12} }{100} }$? And in that case, why are the two answers above incorrect?

Thank you.

2. Dec 15, 2014

### Ray Vickson

My answer would be "none of the above".

3. Dec 15, 2014

### FredericChopin

4. Dec 15, 2014

### Ray Vickson

Sorry: I misinterpreted the question. One of the answers listed is correct.

Last edited: Dec 15, 2014
5. Dec 15, 2014

### FredericChopin

Well... yes, of course one of the answers listed is correct.

I'm thinking that $\frac{1}{ \sqrt{12*100} }$ is correct based on my reasoning above. If it is the correct answer, why is it correct? And why are $\frac{1}{ \sqrt{12*(1000*100)} }$ and $\frac{1}{ \sqrt{12*1000} }$ incorrect?

Last edited: Dec 15, 2014
6. Dec 15, 2014

### haruspex

Not always :(
Let's back up to a simpler scenario. There is some random variable X with unknown variance. You take 1000 samples of it. How would you estimate the variance of X?
In this problem, what is X?
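
As a rough sketch of that simpler scenario, the usual estimator is the sample variance with the $n-1$ divisor. Here X is taken to be a single standard uniform draw purely for illustration, and the helper name `estimate_variance` and the seed are made up for this example:

```python
import random
import statistics

def estimate_variance(draws):
    """Unbiased sample variance: squared deviations from the mean, divided by (count - 1)."""
    mean = sum(draws) / len(draws)
    return sum((x - mean) ** 2 for x in draws) / (len(draws) - 1)

# Assumption for illustration only: X is one standard uniform draw.
rng = random.Random(0)
draws = [rng.random() for _ in range(1000)]

est = estimate_variance(draws)
print(est, 1 / 12)  # the estimate should land near the true variance 1/12
```

The same estimator applies whatever X is; only the list of draws changes.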

7. Dec 15, 2014

### Stephen Tashi

You also need the theorems:

If $\{X_1, X_2, \ldots, X_n\}$ are uncorrelated random variables and $Y$ is the random variable $Y = \sum_{i=1}^n X_i$ then $Var(Y) = \sum_{i=1}^n Var(X_i)$.

and

If $Y$ is a random variable and $k$ is a constant and $W$ is the random variable $W = kY$ then $Var(W) = k^2 Var(Y)$.

Your problem asks about the variance of the sample mean. The sample mean can be viewed as $k = (1/100)$ times the sum of 100 uncorrelated random variables.
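
Putting the two theorems together for this problem gives the following chain. This is a sketch that writes $\sigma^2 = 1/12$ for the variance of a single standard uniform observation and assumes the 100 observations are uncorrelated:

$$Var( \bar{X} ) = Var\left( \frac{1}{100} \sum_{i=1}^{100} X_i \right) = \frac{1}{100^2} \sum_{i=1}^{100} Var(X_i) = \frac{100 \sigma^2}{100^2} = \frac{\sigma^2}{100} = \frac{1}{12 * 100}$$

Taking the square root gives a standard deviation of $\frac{1}{\sqrt{12 * 100}}$ for one sample mean of 100 observations.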

Last edited: Dec 16, 2014
8. Dec 16, 2014

### BvU

Dear Frederic,
I think your reasoning is just fine. If I read the exercise correctly, the 1000 just stands for 'a big enough number' so that your $\sigma_m = {\sigma\over \sqrt n}$ shows up well enough ('about') as the standard deviation of these means (of 100 observations each) .

9. Dec 16, 2014

### Ray Vickson

I thought that at first, but that led to an incorrect answer that did not fit with what was actually asked (which, IMHO is a somewhat silly question). I think the simplest translation of the problem into plain English is: "what is the standard deviation of the mean of an independent sample of size 100 from the distribution U(0,1)". The 1000 samples really have nothing much to do with the problem.

10. Dec 16, 2014

### Stephen Tashi

That's a good interpretation. "You take the standard deviation of the 1,000 sample means." implies we compute a single number. What does that number estimate? The sample standard deviation estimates the standard deviation of the random variable being sampled. The random variable being sampled is the mean of samples of size 100. If we assume the estimate is approximately correct, it should be near the actual standard deviation of the mean of samples of size 100.

It's tempting to extrapolate "You take the standard deviation of the 1,000 sample means" to imply further calculations such as "We estimate the standard deviation of a sample of size 1000 of those means by using ....[whatever information one might use] ".

Given how exercises are written, let's hope the skill of interpreting language is correlated with the accomplishment of acquiring new mathematical knowledge.

11. Dec 16, 2014

### haruspex

Yes, and that's what I was hoping to lead to with my post #6. Of course, it's not quite that simple since in principle there is the 1000/999 correction, but it serves to illustrate that with a large number of samples you can ignore that.
I disagree. It strikes me as an entirely valid exercise in getting the student to understand that, in the 1/√n rule, n is the batch size of the aggregated data, not the number of samples of the aggregates.
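
A quick simulation can make this concrete. The sketch below uses only the Python standard library; the function name `sd_of_sample_means`, the seed, and the particular values of m are arbitrary choices for illustration. It holds the batch size at n = 100 and varies the number of sample means m:

```python
import math
import random
import statistics

def sd_of_sample_means(m, n, rng):
    """Draw m sample means, each averaging n standard uniform observations,
    and return the sample standard deviation of those m means."""
    means = [sum(rng.random() for _ in range(n)) / n for _ in range(m)]
    return statistics.stdev(means)  # uses the m - 1 divisor

rng = random.Random(2014)          # fixed seed so the run is repeatable
theory = 1 / math.sqrt(12 * 100)   # sigma / sqrt(n) with sigma^2 = 1/12, n = 100

# Vary the number of sample means m while keeping the batch size n = 100.
results = {m: sd_of_sample_means(m, 100, rng) for m in (500, 1000, 2000)}
for m, sd in results.items():
    print(m, round(sd, 4), round(theory, 4))
```

Under these assumptions each printed standard deviation should sit close to $\frac{1}{\sqrt{12 * 100}}$ whatever m is; a larger m only makes the estimate less noisy.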

12. Dec 16, 2014

### Ray Vickson

Upon further thought, I agree with you. The problem's wording almost seems to be designed to confuse the student, and undoing the confusion may have some value.

13. Dec 16, 2014

### FredericChopin

Right. I seem to be getting the answer I was expecting. Or maybe Ray Vickson is right and my math is faulty. Have a look:

The variance of a sample mean is:

$$Var( \bar{X} ) = Var( \frac{1}{n} \sum_{i=1}^{n} X_{i})$$

Since there are 1000 sample means of 100 observations each:

$$Var(1000 \bar{X} ) = Var( (\frac{1}{100} \sum_{i=1}^{100} X_{i})*1000)$$

, which becomes:

$$Var( 1000 \bar{X} ) = Var( 10 \sum_{i=1}^{100} X_{i})$$

Using the theorem for the variance of a constant multiple, $Var(kY) = k^2 Var(Y)$:

$$1000000 Var( \bar{X} ) = 100 Var( \sum_{i=1}^{100} X_{i})$$

$$1000000 Var( \bar{X} ) = 100 (Var(X_{1}) + Var(X_{2}) + ... + Var(X_{100}))$$

Assuming the random variables are independent and identically distributed:

$$1000000 Var( \bar{X} ) = 100 ( \sigma^2 + \sigma^2 + ... + \sigma^2)$$

$$1000000 Var( \bar{X} ) = 100 * 100\sigma^2$$

$$1000000 Var( \bar{X} ) = 10000 \sigma^2$$

$$Var( \bar{X} ) = \frac{\sigma^2}{100 }$$

And so the standard deviation of the sample means is:

$$\sqrt{ \frac{\sigma^2}{100}} = \sqrt{ \frac{\frac{1}{12}}{100 }} = \sqrt{ \frac{1}{12 * 100} } = \frac{1}{\sqrt{12 * 100}}$$

, which is the seemingly correct answer. But regardless of how many sample means there are, the variance of a sample mean will always be $\frac{ \sigma ^2}{n}$, because the constant for the number of sample means gets squared on both sides of the equation and cancels out!

These are great explanations on their own without the math, and they are made all the clearer with the math (which I hope is right).

Thank you. Tell me what you think.