Standard deviation of m sample means of n observations each

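In summary: The expected standard deviation of the 1,000 sample means is σ/√n with n = 100, the number of observations in each mean, which gives 1/√(12*100). The number of sample means (1,000) does not enter the formula; it only needs to be large enough for the computed standard deviation to sit close to that value.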
  • #1
FredericChopin

Homework Statement


"Consider a standard uniform density. The mean for this density is .5 and the variance is 1 / 12. You sample 1,000 sample means where each sample mean is comprised of 100 observations. You take the standard deviation of the 1,000 sample means. About what number would you expect it to be?

A: [itex] \frac{1}{12} [/itex]

B: [itex]\frac{1}{ \sqrt{12*100} } [/itex]

C: [itex] \frac{1}{(12*100)} [/itex]

D: [itex]\frac{1}{ \sqrt{12*1000} } [/itex]"

Homework Equations


[tex]Var( \bar{X} ) = \frac{ \sigma ^{2} }{n} [/tex]

The Attempt at a Solution


If the variance of a sample mean with n observations is [itex]Var( \bar{X} ) = \frac{ \sigma ^{2} }{n} [/itex], then that would mean that the standard deviation of a sample mean with n observations is [itex] \sqrt{ \frac{ \sigma ^{2} }{n} } [/itex], which is [itex] \frac{ \sigma }{ \sqrt{n} } [/itex].

But in this problem, there are 1000 sample means each with 100 observations. In general, this is a question about finding the standard deviation of m sample means of n observations each.

I'm puzzled that there is no option for the answer that since there are a total of 1000*100 observations, the standard deviation is [itex] \sqrt{ \frac{ \frac{1}{12} }{1000*100} } [/itex]. So I tried the fact that there are 1000 sample means, so I answered [itex] \sqrt{ \frac{ \frac{1}{12} }{1000} } [/itex], but it got marked incorrect. Is it possible because there are 100 observations for each sample mean, regardless of how many there are, that the answer is [itex] \sqrt{ \frac{ \frac{1}{12} }{100} } [/itex]? And in that case, why are the previous two answers I gave incorrect?

Thank you.
 
  • #2
My answer would be "none of the above".
 
  • #3
Ray Vickson said:
My answer would be "none of the above".

And can you justify (preferably mathematically) your answer?
 
  • #4
FredericChopin said:
And can you justify (preferably mathematically) your answer?

Sorry: I misinterpreted the question. One of the answers listed is correct.
 
  • #5
Ray Vickson said:
Sorry: I misinterpreted the question. One of the answers listed is correct.

Well... yes, of course one of the answers listed is correct. :rolleyes:

I'm thinking that [itex] \frac{1}{ \sqrt{12*100} } [/itex] is correct based on my reasoning above. If it is the correct answer, why is it correct? And why are [itex] \frac{1}{ \sqrt{12*(1000*100)} } [/itex] and [itex]\frac{1}{ \sqrt{12*1000} }[/itex] incorrect?
 
  • #6
FredericChopin said:
Well... yes, of course one of the answers listed is correct.
Not always :(
FredericChopin said:
I'm thinking that [itex] \frac{1}{ \sqrt{12*100} } [/itex] is correct based on my reasoning above. If it is the correct answer, why is it correct? And why are [itex] \frac{1}{ \sqrt{12*(1000*100)} } [/itex] and [itex]\frac{1}{ \sqrt{12*1000} }[/itex] incorrect?
Let's back up to a simpler scenario. There is some random variable X with unknown variance. You take 1000 samples of it. How would you estimate the variance of X?
In this problem, what is X?
 
  • #7
FredericChopin said:

Homework Equations


[tex]Var( \bar{X} ) = \frac{ \sigma ^{2} }{n} [/tex]

You also need the theorems:

If [itex] X_1, X_2, \dots, X_n [/itex] are uncorrelated random variables and [itex] Y [/itex] is the random variable [itex] Y = \sum_{i=1}^n X_i [/itex] then [itex] Var(Y) = \sum_{i=1}^n Var(X_i) [/itex].

and

If [itex] Y [/itex] is a random variable and [itex] k [/itex] is a constant and [itex] W [/itex] is the random variable [itex] W = kY [/itex] then [itex] Var(W) = k^2 Var(Y) [/itex].

Your problem asks about the variance of the sample mean. The sample mean can be viewed as [itex] k = (1/100) [/itex] times the sum of 100 uncorrelated random variables.
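
As a sketch of how the two theorems combine here, assuming the 100 observations are independent draws from U(0,1), so that each has variance [itex] \sigma^2 = \frac{1}{12} [/itex]:

[tex]Var( \bar{X} ) = Var\left( \frac{1}{100} \sum_{i=1}^{100} X_i \right) = \frac{1}{100^2} \sum_{i=1}^{100} Var(X_i) = \frac{100 \sigma^2}{100^2} = \frac{ \sigma^2 }{100} = \frac{1}{12*100} [/tex]

The second theorem supplies the factor [itex] k^2 = \frac{1}{100^2} [/itex] and the first turns the variance of the sum into [itex] 100 \sigma^2 [/itex]. Note that the 1000 does not appear anywhere in this calculation.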
 
  • #8
Dear Frederic,
I think your reasoning is just fine. If I read the exercise correctly, the 1000 just stands for 'a big enough number' so that your ##\sigma_m = {\sigma\over \sqrt n}## shows up well enough ('about') as the standard deviation of these means (of 100 observations each).
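
This reading can be checked with a quick simulation. The following is only a minimal sketch using Python's standard library; the seed and the names `N_OBS`, `N_MEANS` and `sample_mean` are illustrative choices, not part of the exercise.

```python
import random
import statistics

random.seed(0)  # arbitrary fixed seed, only so the run is reproducible

N_OBS = 100     # observations per sample mean (n in the thread)
N_MEANS = 1000  # number of sample means (m in the thread)

def sample_mean(n):
    """Mean of n independent draws from the standard uniform U(0, 1)."""
    return sum(random.random() for _ in range(n)) / n

means = [sample_mean(N_OBS) for _ in range(N_MEANS)]

observed_sd = statistics.stdev(means)          # sample SD of the 1,000 means
predicted_sd = (1 / 12) ** 0.5 / N_OBS ** 0.5  # sigma / sqrt(n)

print(f"observed SD of the means: {observed_sd:.4f}")
print(f"predicted sigma/sqrt(n):  {predicted_sd:.4f}")
```

The observed value should land close to the predicted one, with a small random deviation that depends on the seed.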
 
  • #9
Stephen Tashi said:
You also need the theorems:

If [itex] X_1, X_2, \dots, X_n [/itex] are uncorrelated random variables and [itex] Y [/itex] is the random variable [itex] Y = \sum_{i=1}^n X_i [/itex] then [itex] Var(Y) = \sum_{i=1}^n Var(X_i) [/itex].

and

If [itex] Y [/itex] is a random variable and [itex] k [/itex] is a constant and [itex] W [/itex] is the random variable [itex] W = kY [/itex] then [itex] Var(W) = k^2 Var(Y) [/itex].

Your problem asks about the variance of the sample mean. The sample mean can be viewed as [itex] k = (1/100) [/itex] times the sum of 100 uncorrelated random variables.

I thought that at first, but that led to an incorrect answer that did not fit with what was actually asked (which, IMHO is a somewhat silly question). I think the simplest translation of the problem into plain English is: "what is the standard deviation of the mean of an independent sample of size 100 from the distribution U(0,1)". The 1000 samples really have nothing much to do with the problem.
 
  • #10
Ray Vickson said:
I think the simplest translation of the problem into plain English is: "what is the standard deviation of the mean of an independent sample of size 100 from the distribution U(0,1)".

That's a good interpretation. "You take the standard deviation of the 1,000 sample means." implies we compute a single number. What does that number estimate? The sample standard deviation estimates the standard deviation of the random variable being sampled. The random variable being sampled is the mean of samples of size 100. If we assume the estimate is approximately correct, it should be near the actual standard deviation of the mean of samples of size 100.

It's tempting to extrapolate "You take the standard deviation of the 1,000 sample means" to imply further calculations such as "We estimate the standard deviation of a sample of size 1000 of those means by using ...[whatever information one might use] ".

Given how exercises are written, let's hope the skill of interpreting language is correlated with the accomplishment of acquiring new mathematical knowledge.
 
  • #11
Ray Vickson said:
"what is the standard deviation of the mean of an independent sample of size 100 from the distribution U(0,1)".
Yes, and that's what I was hoping to lead to with my post #6. Of course, it's not quite that simple since in principle there is the 1000/999 correction, but it serves to illustrate that with a large number of samples you can ignore that.
Ray Vickson said:
a somewhat silly question
I disagree. It strikes me as an entirely valid exercise in getting the student to understand that, in the 1/√n rule, n is the batch size of the aggregated data, not the number of samples of the aggregates.
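
A rough way to see the difference between the two counts is to vary them separately. This is a sketch under the assumption of independent U(0,1) draws; the helper `sd_of_means` and the particular values of m and n are made up for illustration.

```python
import random
import statistics

def sd_of_means(m, n, rng):
    """Sample SD of m sample means, each the mean of n U(0, 1) draws."""
    means = [sum(rng.random() for _ in range(n)) / n for _ in range(m)]
    return statistics.stdev(means)

rng = random.Random(1)  # arbitrary fixed seed

sigma = (1 / 12) ** 0.5  # SD of a single U(0, 1) observation

# Vary m (number of means) with n fixed at 100:
# the result should stay near sigma / sqrt(100).
for m in (200, 1000, 5000):
    print(f"m={m:5d}, n=100: sd={sd_of_means(m, 100, rng):.4f}")

# Vary n (batch size) with m fixed at 1000:
# the result should track sigma / sqrt(n).
for n in (25, 100, 400):
    sd = sd_of_means(1000, n, rng)
    print(f"m= 1000, n={n:3d}: sd={sd:.4f}, sigma/sqrt(n)={sigma / n ** 0.5:.4f}")
```

Changing m only changes how noisy the estimate is, while quadrupling n should roughly halve the standard deviation of the means.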
 
  • #12
haruspex said:
Yes, and that's what I was hoping to lead to with my post #6. Of course, it's not quite that simple since in principle there is the 1000/999 correction, but it serves to illustrate that with a large number of samples you can ignore that.

I disagree. It strikes me as an entirely valid exercise in getting the student to understand that, in the 1/√n rule, n is the batch size of the aggregated data, not the number of samples of the aggregates.

Upon further thought, I agree with you. The problem's wording almost seems to be designed to confuse the student, and undoing the confusion may have some value.
 
  • #13
Stephen Tashi said:
You also need the theorems:

If [itex] X_1, X_2, \dots, X_n [/itex] are uncorrelated random variables and [itex] Y [/itex] is the random variable [itex] Y = \sum_{i=1}^n X_i [/itex] then [itex] Var(Y) = \sum_{i=1}^n Var(X_i) [/itex].

and

If [itex] Y [/itex] is a random variable and [itex] k [/itex] is a constant and [itex] W [/itex] is the random variable [itex] W = kY [/itex] then [itex] Var(W) = k^2 Var(Y) [/itex].

Your problem asks about the variance of the sample mean. The sample mean can be viewed as [itex] k = (1/100) [/itex] times the sum of 100 uncorrelated random variables.

Right. I seem to be getting the answer I was expecting. Or maybe Ray Vickson is right and my math is faulty. Have a look:

The variance of a sample mean is:

[tex]Var( \bar{X} ) = Var( \frac{1}{n} \sum_{i=1}^{n} X_{i}) [/tex]

Since there are 1000 sample means of 100 observations each:

[tex]Var(1000 \bar{X} ) = Var( (\frac{1}{100} \sum_{i=1}^{100} X_{i})*1000)[/tex]

, which becomes:

[tex]Var( 1000 \bar{X} ) = Var( 10 \sum_{i=1}^{100} X_{i})[/tex]

Using the theorem for the variance of a constant multiple, [itex]Var(kY) = k^2 Var(Y)[/itex], on both sides:

[tex]1000000 Var( \bar{X} ) = 100 Var( \sum_{i=1}^{100} X_{i}) [/tex]

[tex]1000000 Var( \bar{X} ) = 100 (Var(X_{1}) + Var(X_{2}) + ... + Var(X_{100}))[/tex]

Assuming the random variables are independent and identically distributed:

[tex]1000000 Var( \bar{X} ) = 100 ( \sigma^2 + \sigma^2 + ... + \sigma^2)[/tex]

[tex]1000000 Var( \bar{X} ) = 100 * 100\sigma^2[/tex]

[tex]1000000 Var( \bar{X} ) = 10000 \sigma^2[/tex]

[tex]Var( \bar{X} ) = \frac{\sigma^2}{100 }[/tex]

And so the standard deviation of the sample means is:

[tex] \sqrt{ \frac{\sigma^2}{100}} = \sqrt{ \frac{\frac{1}{12}}{100 }} = \sqrt{ \frac{1}{12 * 100} } = \frac{1}{\sqrt{12 * 100}} [/tex]

, which is the seemingly correct answer. But regardless of how many sample means there are, the variance of a sample mean will always be [itex]\frac{ \sigma ^2}{n}[/itex]: the constant 1000 gets squared on both sides by the rule [itex]Var(kY) = k^2 Var(Y)[/itex], so it always cancels out! :))
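
For a sense of scale, here are the four options evaluated numerically (rounded; this is just arithmetic on the listed expressions):

[tex] \frac{1}{12} \approx 0.0833, \quad \frac{1}{ \sqrt{12*100} } \approx 0.0289, \quad \frac{1}{12*100} \approx 0.00083, \quad \frac{1}{ \sqrt{12*1000} } \approx 0.0091 [/tex]

My first guess, which used all 1000*100 observations, would have given [itex] \frac{1}{ \sqrt{12*(1000*100)} } \approx 0.00091 [/itex].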

BvU said:
Dear Frederic,
I think your reasoning is just fine. If I read the exercise correctly, the 1000 just stands for 'a big enough number' so that your ##\sigma_m = {\sigma\over \sqrt n}## shows up well enough ('about') as the standard deviation of these means (of 100 observations each).

Ray Vickson said:
I thought that at first, but that led to an incorrect answer that did not fit with what was actually asked (which, IMHO is a somewhat silly question). I think the simplest translation of the problem into plain English is: "what is the standard deviation of the mean of an independent sample of size 100 from the distribution U(0,1)". The 1000 samples really have nothing much to do with the problem.

Stephen Tashi said:
That's a good interpretation. "You take the standard deviation of the 1,000 sample means." implies we compute a single number. What does that number estimate? The sample standard deviation estimates the standard deviation of the random variable being sampled. The random variable being sampled is the mean of samples of size 100. If we assume the estimate is approximately correct, it should be near the actual standard deviation of the mean of samples of size 100.

It's tempting to extrapolate "You take the standard deviation of the 1,000 sample means" to imply further calculations such as "We estimate the standard deviation of a sample of size 1000 of those means by using ...[whatever information one might use] ".

Given how exercises are written, let's hope the skill of interpreting language is correlated with the accomplishment of acquiring new mathematical knowledge.

These are great explanations on their own without the math, and they are made all the clearer by the math (which I hope is right).

Thank you. Tell me what you think.
 

FAQ: Standard deviation of m sample means of n observations each

1. What is the purpose of calculating the standard deviation of m sample means of n observations each?

It measures how much the m sample means vary around their overall mean. When the observations are independent, it estimates the standard error of the mean, σ/√n, where σ is the standard deviation of a single observation and n is the number of observations in each sample mean.

2. How is the standard deviation of m sample means of n observations each calculated?

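The standard deviation of m sample means of n observations each is calculated by taking the square root of the average of the squared differences between each sample mean and the overall mean. This value is also known as the root mean square deviation. The sample standard deviation divides the sum of squared differences by m - 1 instead of m; for a large m such as 1,000 the two values are nearly identical.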
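
The calculation can be sketched with Python's standard library, which offers both conventions: `statistics.pstdev` divides the sum of squared differences by m, and `statistics.stdev` divides by m - 1. The simulated data and the seed below are illustrative assumptions only.

```python
import math
import random
import statistics

rng = random.Random(2)  # arbitrary fixed seed

# 1,000 sample means, each the mean of 100 draws from U(0, 1)
means = [sum(rng.random() for _ in range(100)) / 100 for _ in range(1000)]

sd_population = statistics.pstdev(means)  # divides by m
sd_sample = statistics.stdev(means)       # divides by m - 1

ratio = sd_sample / sd_population
print(f"pstdev={sd_population:.5f}, stdev={sd_sample:.5f}, ratio={ratio:.6f}")
print(f"sqrt(1000/999)={math.sqrt(1000 / 999):.6f}")
```

The two results differ only by the factor √(1000/999), which is why the correction can be ignored when m is large.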

3. What does a high standard deviation of m sample means of n observations each indicate?

A high standard deviation of m sample means of n observations each indicates that the individual sample means are widely spread around the overall mean. This happens when the underlying observations are highly variable (large σ) or when each sample mean is based on few observations (small n). In either case, a single sample mean is an imprecise estimate of the population mean.

4. How does the sample size affect the standard deviation of m sample means of n observations each?

The standard deviation of the sample means is inversely proportional to the square root of n, the number of observations in each sample mean: it is approximately σ/√n, so quadrupling n halves it. The number of sample means, m, does not change this expected value; a larger m only makes the computed standard deviation a more precise estimate of σ/√n.

5. Can the standard deviation of m sample means of n observations each be negative?

No, the standard deviation of m sample means of n observations each cannot be negative. It is the square root of a sum of squared differences divided by a positive number, so it is always non-negative, and it equals zero only when all the sample means are identical.
