Standard deviation of m sample means of n observations each

FredericChopin · Dec 15, 2014

Homework Statement

"Consider a standard uniform density. The mean for this density is .5 and the variance is 1 / 12. You sample 1,000 sample means where each sample mean is comprised of 100 observations. You take the standard deviation of the 1,000 sample means. About what number would you expect it to be?

A: [itex]\frac{1}{12}[/itex]

B: [itex]\frac{1}{ \sqrt{12*100} }[/itex]

C: [itex]\frac{1}{(12*100)}[/itex]

D: [itex]\frac{1}{ \sqrt{12*1000} }[/itex]"

Homework Equations

[tex]Var( \bar{X} ) = \frac{ \sigma ^{2} }{n}[/tex]

The Attempt at a Solution

If the variance of a sample mean with n observations is [itex]Var( \bar{X} ) = \frac{ \sigma ^{2} }{n}[/itex], then that would mean that the standard deviation of a sample mean with n observations is [itex]\sqrt{ \frac{ \sigma ^{2} }{n} }[/itex], which is [itex]\frac{ \sigma }{ \sqrt{n} }[/itex].

But in this problem, there are 1000 sample means each with 100 observations. In general, this is a question about finding the standard deviation of m sample means of n observations each.

I'm puzzled that there is no option corresponding to the total of 1000*100 observations, which would give a standard deviation of [itex]\sqrt{ \frac{ \frac{1}{12} }{1000*100} }[/itex]. So I tried using the fact that there are 1000 sample means and answered [itex]\sqrt{ \frac{ \frac{1}{12} }{1000} }[/itex], but it was marked incorrect. Is it possible that, because there are 100 observations in each sample mean, regardless of how many sample means there are, the answer is [itex]\sqrt{ \frac{ \frac{1}{12} }{100} }[/itex]? And in that case, why are the previous two answers I gave incorrect?

Thank you.

Ray Vickson · Dec 15, 2014

My answer would be "none of the above".

FredericChopin · Dec 15, 2014

Ray Vickson said:

My answer would be "none of the above".

And can you justify (preferably mathematically) your answer?

Ray Vickson · Dec 15, 2014

FredericChopin said:

And can you justify (preferably mathematically) your answer?

Sorry: I misinterpreted the question. One of the answers listed is correct.

FredericChopin · Dec 15, 2014

Ray Vickson said:

Sorry: I misinterpreted the question. One of the answers listed is correct.

Well... yes, of course one of the answers listed is correct.

I'm thinking that [itex]\frac{1}{ \sqrt{12*100} }[/itex] is correct based on my reasoning above. If it is the correct answer, why is it correct? And why are [itex]\frac{1}{ \sqrt{12*(1000*100)} }[/itex] and [itex]\frac{1}{ \sqrt{12*1000} }[/itex] incorrect?

haruspex · Dec 15, 2014

FredericChopin said:

Well... yes, of course one of the answers listed is correct.

Not always :(

FredericChopin said:

I'm thinking that [itex]\frac{1}{ \sqrt{12*100} }[/itex] is correct based on my reasoning above. If it is the correct answer, why is it correct? And why are [itex]\frac{1}{ \sqrt{12*(1000*100)} }[/itex] and [itex]\frac{1}{ \sqrt{12*1000} }[/itex] incorrect?

Let's back up to a simpler scenario. There is some random variable X with unknown variance. You take 1000 samples of it. How would you estimate the variance of X?
In this problem, what is X?

Stephen Tashi · Dec 15, 2014

FredericChopin said:

Homework Equations

[tex]Var( \bar{X} ) = \frac{ \sigma ^{2} }{n}[/tex]

You also need the theorems:

If [itex]\{X_1, X_2, \ldots, X_n\}[/itex] are uncorrelated random variables and [itex]Y[/itex] is the random variable [itex]Y = \sum_{i=1}^n X_i[/itex] then [itex]Var(Y) = \sum_{i=1}^n Var(X_i)[/itex].

and

If [itex]Y[/itex] is a random variable and [itex]k[/itex] is a constant and [itex]W[/itex] is the random variable [itex]W = kY[/itex] then [itex]Var(W) = k^2 Var(Y)[/itex].

Your problem asks about the variance of the sample mean. The sample mean can be viewed as [itex]k = (1/100)[/itex] times the sum of 100 uncorrelated random variables.
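The two theorems can be checked numerically. The following is only a rough sketch using Python's standard library; the seed, the number of repetitions `reps`, and all variable names are arbitrary choices made for illustration, not part of the exercise.

```python
import random
import statistics

rng = random.Random(3)   # fixed seed, arbitrary
n, reps = 100, 5000      # n matches the exercise; reps is an illustrative choice

# Each entry is the sum Y of n independent U(0,1) observations.
sums = [sum(rng.random() for _ in range(n)) for _ in range(reps)]
var_sum = statistics.variance(sums)    # theorem 1 predicts about n * (1/12)

# The sample mean is W = k * Y with k = 1/n.
k = 1 / n
means = [k * s for s in sums]
var_mean = statistics.variance(means)  # theorem 2 predicts k**2 * Var(Y)

print("Var(sum): ", var_sum, " predicted:", n / 12)
print("Var(mean):", var_mean, " k^2 * Var(sum):", k**2 * var_sum)
print("sigma^2 / n:", 1 / (12 * n))
```

The scaling identity holds exactly for the simulated data (up to floating-point rounding), while the additivity result only holds approximately because it is estimated from a finite number of repetitions.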

BvU · Dec 16, 2014

Dear Frederic,
I think your reasoning is just fine. If I read the exercise correctly, the 1000 just stands for 'a big enough number' so that your ##\sigma_m = {\sigma\over \sqrt n}## shows up well enough ('about') as the standard deviation of these means (of 100 observations each).
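A quick simulation of the exercise as worded illustrates this. It is a minimal sketch using only the standard library; the helper name `sd_of_sample_means` and the seed are assumptions made here for illustration.

```python
import math
import random
import statistics

def sd_of_sample_means(m, n, rng):
    """Draw m sample means, each from n U(0,1) observations,
    and return the standard deviation of those m means."""
    means = [statistics.fmean(rng.random() for _ in range(n)) for _ in range(m)]
    return statistics.stdev(means)

rng = random.Random(0)  # fixed seed, arbitrary choice
observed = sd_of_sample_means(m=1000, n=100, rng=rng)
predicted = 1 / math.sqrt(12 * 100)  # sigma / sqrt(n), about 0.0289

print(f"observed  {observed:.4f}")
print(f"predicted {predicted:.4f}")
```

The observed value should land close to 1/√(12·100) and nowhere near 1/√(12·1000) ≈ 0.0091, although the exact figure depends on the random draws.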

Ray Vickson · Dec 16, 2014

Stephen Tashi said:

You also need the theorems:

If [itex]\{X_1, X_2, \ldots, X_n\}[/itex] are uncorrelated random variables and [itex]Y[/itex] is the random variable [itex]Y = \sum_{i=1}^n X_i[/itex] then [itex]Var(Y) = \sum_{i=1}^n Var(X_i)[/itex].

and

If [itex]Y[/itex] is a random variable and [itex]k[/itex] is a constant and [itex]W[/itex] is the random variable [itex]W = kY[/itex] then [itex]Var(W) = k^2 Var(Y)[/itex].

Your problem asks about the variance of the sample mean. The sample mean can be viewed as [itex]k = (1/100)[/itex] times the sum of 100 uncorrelated random variables.

I thought that at first, but that led to an incorrect answer that did not fit with what was actually asked (which, IMHO is a somewhat silly question). I think the simplest translation of the problem into plain English is: "what is the standard deviation of the mean of an independent sample of size 100 from the distribution U(0,1)". The 1000 samples really have nothing much to do with the problem.

Stephen Tashi · Dec 16, 2014

Ray Vickson said:

I think the simplest translation of the problem into plain English is: "what is the standard deviation of the mean of an independent sample of size 100 from the distribution U(0,1)".

That's a good interpretation. "You take the standard deviation of the 1,000 sample means." implies we compute a single number. What does that number estimate? The sample standard deviation estimates the standard deviation of the random variable being sampled. The random variable being sampled is the mean of samples of size 100. If we assume the estimate is approximately correct, it should be near the actual standard deviation of the mean of samples of size 100.

It's tempting to extrapolate "You take the standard deviation of the 1,000 sample means" to imply further calculations such as "We estimate the standard deviation of a sample of size 1000 of those means by using ...[whatever information one might use] ".

Given how exercises are written, let's hope the skill of interpreting language is correlated with the accomplishment of acquiring new mathematical knowledge.

haruspex · Dec 16, 2014

Ray Vickson said:

"what is the standard deviation of the mean of an independent sample of size 100 from the distribution U(0,1)".

Yes, and that's what I was hoping to lead to with my post #6. Of course, it's not quite that simple since in principle there is the 1000/999 correction, but it serves to illustrate that with a large number of samples you can ignore that.
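The size of that correction can be seen by computing the standard deviation of the same 1000 means both ways. This is only an illustrative sketch with the standard library; the seed and variable names are arbitrary.

```python
import math
import random
import statistics

rng = random.Random(2)  # fixed seed, arbitrary
m, n = 1000, 100
means = [statistics.fmean(rng.random() for _ in range(n)) for _ in range(m)]

sd_divide_by_m = statistics.pstdev(means)         # sum of squares divided by m
sd_divide_by_m_minus_1 = statistics.stdev(means)  # sum of squares divided by m - 1

# The variances differ by the factor m/(m-1) = 1000/999,
# so the standard deviations differ by its square root.
ratio = sd_divide_by_m_minus_1 / sd_divide_by_m
print(ratio, math.sqrt(m / (m - 1)))
```

With m = 1000 the ratio is about 1.0005, which is far smaller than the sampling noise in the estimate itself.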

Ray Vickson said:

a somewhat silly question

I disagree. It strikes me as an entirely valid exercise in getting the student to understand that, in the 1/√n rule, n is the batch size of the aggregated data, not the number of samples of the aggregates.
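One way to see the distinction is to vary the number of means m and the batch size n separately. The sketch below does so with the standard library only; the helper name `sd_of_batch_means`, the seed, and the particular (m, n) pairs are illustrative assumptions.

```python
import math
import random
import statistics

def sd_of_batch_means(m, n, rng):
    """Standard deviation of m means, each over a batch of n U(0,1) draws."""
    means = [statistics.fmean(rng.random() for _ in range(n)) for _ in range(m)]
    return statistics.stdev(means)

rng = random.Random(1)     # fixed seed, arbitrary
sigma = math.sqrt(1 / 12)  # standard deviation of U(0,1)

results = {}
for m, n in [(200, 100), (1000, 100), (5000, 100), (1000, 25), (1000, 400)]:
    results[(m, n)] = sd_of_batch_means(m, n, rng)
    print(f"m={m:5d} n={n:4d} observed={results[(m, n)]:.4f} "
          f"sigma/sqrt(n)={sigma / math.sqrt(n):.4f}")
```

Changing m with n fixed at 100 should leave the observed value near the same number, only estimated more or less precisely, whereas changing n moves it in line with σ/√n.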

Ray Vickson · Dec 16, 2014

haruspex said:

Yes, and that's what I was hoping to lead to with my post #6. Of course, it's not quite that simple since in principle there is the 1000/999 correction, but it serves to illustrate that with a large number of samples you can ignore that.

I disagree. It strikes me as an entirely valid exercise in getting the student to understand that, in the 1/√n rule, n is the batch size of the aggregated data, not the number of samples of the aggregates.

Upon further thought, I agree with you. The problem's wording almost seems to be designed to confuse the student, and undoing the confusion may have some value.

FredericChopin · Dec 16, 2014

Stephen Tashi said:

You also need the theorems:

If [itex]\{X_1, X_2, \ldots, X_n\}[/itex] are uncorrelated random variables and [itex]Y[/itex] is the random variable [itex]Y = \sum_{i=1}^n X_i[/itex] then [itex]Var(Y) = \sum_{i=1}^n Var(X_i)[/itex].

and

If [itex]Y[/itex] is a random variable and [itex]k[/itex] is a constant and [itex]W[/itex] is the random variable [itex]W = kY[/itex] then [itex]Var(W) = k^2 Var(Y)[/itex].

Your problem asks about the variance of the sample mean. The sample mean can be viewed as [itex]k = (1/100)[/itex] times the sum of 100 uncorrelated random variables.

Right. I seem to be getting the answer I was expecting. Or maybe Ray Vickson is right and my math is faulty. Have a look:

The variance of a sample mean is:

[tex]Var( \bar{X} ) = Var( \frac{1}{n} \sum_{i=1}^{n} X_{i})[/tex]

Since there are 1000 sample means of 100 observations each:

[tex]Var(1000 \bar{X} ) = Var( (\frac{1}{100} \sum_{i=1}^{100} X_{i})*1000)[/tex]

, which becomes:

[tex]Var( 1000 \bar{X} ) = Var( 10 \sum_{i=1}^{100} X_{i})[/tex]

Using the theorem for the variance of a constant multiple, [itex]Var(kY) = k^2 Var(Y)[/itex], on both sides:

[tex]1000000 Var( \bar{X} ) = 100 Var( \sum_{i=1}^{100} X_{i})[/tex]

[tex]1000000 Var( \bar{X} ) = 100 (Var(X_{1}) + Var(X_{2}) + ... + Var(X_{100}))[/tex]

Assuming the random variables are independent and identically distributed:

[tex]1000000 Var( \bar{X} ) = 100 ( \sigma^2 + \sigma^2 + ... + \sigma^2)[/tex]

[tex]1000000 Var( \bar{X} ) = 100 * 100\sigma^2[/tex]

[tex]1000000 Var( \bar{X} ) = 10000 \sigma^2[/tex]

[tex]Var( \bar{X} ) = \frac{\sigma^2}{100 }[/tex]

And so the standard deviation of the sample means is:

[tex]\sqrt{ \frac{\sigma^2}{100}} = \sqrt{ \frac{\frac{1}{12}}{100 }} = \sqrt{ \frac{1}{12 * 100} } = \frac{1}{\sqrt{12 * 100}}[/tex]

, which is the seemingly correct answer. But regardless of how many sample means there are, the variance of a sample mean will always be [itex]\frac{ \sigma ^2}{n}[/itex], because the number of sample means enters both sides as the same squared constant factor and cancels out! :))
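As a sketch of that cancellation for a general number m of sample means (m and n as in the thread title, with the observations assumed independent and identically distributed with variance [itex]\sigma^2[/itex]):

[tex]m^2 Var( \bar{X} ) = Var( m \bar{X} ) = Var( \frac{m}{n} \sum_{i=1}^{n} X_{i} ) = \frac{m^2}{n^2} * n \sigma^2 = \frac{m^2 \sigma^2}{n}[/tex]

Dividing both sides by [itex]m^2[/itex] gives [itex]Var( \bar{X} ) = \frac{ \sigma^2 }{n}[/itex], with no m left in the result.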

BvU said:

Dear Frederic,
I think your reasoning is just fine. If I read the exercise correctly, the 1000 just stands for 'a big enough number' so that your ##\sigma_m = {\sigma\over \sqrt n}## shows up well enough ('about') as the standard deviation of these means (of 100 observations each).

Ray Vickson said:

I thought that at first, but that led to an incorrect answer that did not fit with what was actually asked (which, IMHO is a somewhat silly question). I think the simplest translation of the problem into plain English is: "what is the standard deviation of the mean of an independent sample of size 100 from the distribution U(0,1)". The 1000 samples really have nothing much to do with the problem.

Stephen Tashi said:

That's a good interpretation. "You take the standard deviation of the 1,000 sample means." implies we compute a single number. What does that number estimate? The sample standard deviation estimates the standard deviation of the random variable being sampled. The random variable being sampled is the mean of samples of size 100. If we assume the estimate is approximately correct, it should be near the actual standard deviation of the mean of samples of size 100.

It's tempting to extrapolate "You take the standard deviation of the 1,000 sample means" to imply further calculations such as "We estimate the standard deviation of a sample of size 1000 of those means by using ...[whatever information one might use] ".

Given how exercises are written, let's hope the skill of interpreting language is correlated with the accomplishment of acquiring new mathematical knowledge.

These are great explanations on their own without the math, and they are made all the clearer by the math (which I hope is right).

Thank you. Tell me what you think.
