
Variances of samples

by georg gill
Tags: samples, variances
georg gill
#1
Mar5-12, 01:15 PM
P: 100
https://onlinecourses.science.psu.edu/stat414/node/167

In the link they derive the variance of the sample mean. It is a bit long to write it all out here, so I hope you can read the link. I will state my main problem here. It starts with:

[tex]Var\left(\frac{X_1+X_2+\cdots+X_n}{n}\right)[/tex]

They write n here because n samples are averaged to get the mean of the samples. But later in the text they say:

If the instructor had taken larger samples of students, she would have seen less variability in the samples she was obtaining. That is a good thing, but of course, in general, the costs of research studies no doubt increase as the sample size n increases

It seems they are referring to n as the number of elements in each sample, but from the proof it seems to be the number of samples. I don't get this. They also assume that the variance is the same for all samples (in the second-to-last step), but the instructor example referred to on the previous page of the link has different variances.

From the examples in my textbook it also seems that n is the number of tests in a sample, not the number of samples, which is what I thought would make sense from this proof.
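To check what n means, I tried a numerical sketch (assuming Python with NumPy; the population parameters are made up): draw many samples of size n, take each sample's mean, and compare the variance of those means with [tex]\sigma^2/n[/tex].

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 10.0      # hypothetical population standard deviation
n = 25            # sample size: number of elements in EACH sample
trials = 100_000  # number of samples drawn

# Draw many samples of size n and record each sample's mean.
samples = rng.normal(loc=70.0, scale=sigma, size=(trials, n))
sample_means = samples.mean(axis=1)

# The derivation says Var(sample mean) = sigma^2 / n.
print(sample_means.var())   # should come out close to 100 / 25 = 4
```

Here the variance of the means does shrink like sigma^2 / n, with n playing the role of elements per sample.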
mathman
#2
Mar5-12, 03:29 PM
Sci Advisor
P: 6,065
Xk is a sample from a population. n is the number of samples. All samples are from the same population, so the variance will be the same for each sample.
georg gill
#3
Mar5-12, 03:44 PM
P: 100
Quote Quote by mathman View Post
Xk is a sample from a population. n is the number of samples. All samples are from the same population, so the variance will be the same for each sample.

Here is an example from my book where they use n as the number of elements in a sample to calculate the standard deviation:

http://bildr.no/view/1124676

I just can't see how that is possible from the derivation given in the link in my first post.

Stephen Tashi
#4
Mar5-12, 08:06 PM
Sci Advisor
P: 3,295

Quote Quote by georg gill View Post
I just can't see how that is possible from the derivation given in the link in my first post.
How what is possible?

The term sample can refer to a single realization of a random variable or it can refer to a set of realizations of a random variable, so, as data, a "sample" may be one number or a set of numbers. When the sample consists of more than one number, the number of numbers in the sample is the "sample size".

When the sample size is n, the computation of the sample mean involves dividing by n. The link you gave in this post is an example about the distribution of the sample mean. Note the bar over the X in that example. This is the notation for the sample mean.
georg gill
#5
Mar6-12, 01:21 AM
P: 100
Quote Quote by Stephen Tashi View Post
How what is possible?

The term sample can refer to a single realization of a random variable or it can refer to a set of realizations of a random variable, so, as data, a "sample" may be one number or a set of numbers. When the sample consists of more than one number, the number of numbers in the sample is the "sample size".

When the sample size is n, the computation of the sample mean involves dividing by n. The link you gave in this post is an example about the distribution of the sample mean. Note the bar over the X in that example. This is the notation for the sample mean.
Say you have n samples with one element in each.
You could say that they all have the same variance (that does not quite make sense to me, but suppose so).
If you assume that, you could calculate the variance of the sample mean from the formula:


[tex]\frac{\sigma^2}{n}[/tex] (a)

where n is the number of samples, each consisting of one element.

But since these are n different samples with only one element in each, one could also have found the variation between samples (since each sample would be the same as one trial, I would have thought) as:

[tex]\frac{1}{n-1}\sum_{i=1}^{n}(X_i-\mu)^2[/tex] (b)

(I used n−1 because the variance estimate could otherwise be biased; I am not sure if one choice is more correct than the other here.)

For example, here they use (b):

http://bildr.no/view/1124823

But these two are not the same. How does this work? One will get different values of Z from them when approximating the normal distribution.

If I were to solve this as an assignment, I would assume only one variance would be correct. This confuses me.
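A simulated sketch (assuming Python with NumPy; the numbers are made up) may make the difference concrete: formula (a) is the variance of the mean of the n values, while formula (b) estimates the variance of the population the values came from, so the two should differ by roughly a factor of n.

```python
import numpy as np

# Hypothetical setup: sigma is the population standard deviation,
# and we draw n "samples of one element each" from that population.
rng = np.random.default_rng(1)
sigma = 5.0
n = 100
x = rng.normal(loc=0.0, scale=sigma, size=n)

# Formula (a): the variance of the sample mean, sigma^2 / n.
var_of_mean = sigma**2 / n

# Formula (b)-style estimate of the population variance itself
# (using the sample mean in place of the unknown mu).
est_population_var = ((x - x.mean())**2).sum() / (n - 1)

# (a) and (b) answer different questions: (b) should come out near
# sigma^2 = 25, while (a) equals sigma^2 / n = 0.25.
print(var_of_mean, est_population_var)
```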
Stephen Tashi
#6
Mar6-12, 02:40 AM
Sci Advisor
P: 3,295
My analysis of your difficulty is that you don't use language precisely. In the first place, one must distinguish among three different terms involving the word "variance".

1) There is the "variance of a random variable". (If we speak of the sample mean as a random variable, its variance is computed by an expression involving the integral of a probability density, or, for discrete distribution, by a sum involving the probabilities in the distribution).

2) There is the "variance of a sample" when we speak of the sample as a particular set of numbers. Let [itex] \bar{X} [/itex] be the numerical value of the sample mean. Textbooks vary in how they define the variance of a sample. Some define the variance of [itex] n [/itex] numbers [itex] \{ X_1, X_2,...X_n \}[/itex]
to be [itex] \frac { \sum_{i=1}^n (X_i - \bar{X})^2}{n} [/itex] and some define it to be [itex] \frac { \sum_{i=1}^n (X_i- \bar{X})^2}{n-1} [/itex]

3) There is the "unbiased estimator of the population variance". This is a function of the sample values. It is the function [itex] \frac { \sum_{i=1}^n (X_i - \bar{X})^2}{n-1} [/itex] (Note that an "estimator" is technically a function, not a single number. When we have a particular sample, we can substitute particular numbers into the formula for the estimator and get a particular estimate.)

It should be clear that the mean of a sample of a population might not be the mean of the population. Likewise, the variance of a sample may not be the variance of the population. Likewise, an estimate produced by the unbiased estimator of the population variance may fail to equal the population variance.

All three of the above things are dealt with in the various links you gave. Your confusion comes from the fact that you think all the various links are referring to only one single concept of "variance".
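These three senses also show up in numerical libraries. A small sketch (assuming Python with NumPy, which exposes the n versus n−1 choice through its ddof parameter):

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
xbar = x.mean()   # sample mean, 5.0 for this data

# Sense (2) with n in the denominator (NumPy's default, ddof=0):
var_n = ((x - xbar)**2).sum() / len(x)
assert np.isclose(var_n, x.var())

# Sense (3), the unbiased estimator of the population variance,
# with n-1 in the denominator (ddof=1):
var_n1 = ((x - xbar)**2).sum() / (len(x) - 1)
assert np.isclose(var_n1, x.var(ddof=1))

print(var_n, var_n1)   # 4.0 and 32/7, about 4.57
```

Which convention a textbook uses for the "variance of a sample" has to be read off from its definition, exactly as described above.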
georg gill
#7
Mar6-12, 10:53 AM
P: 100
Quote Quote by Stephen Tashi View Post
My analysis of your difficulty is that you don't use language precisely. In the first place, one must distinguish among three different terms involving the word "variance".

1) There is the "variance of a random variable". (If we speak of the sample mean as a random variable, its variance is computed by an expression involving the integral of a probability density, or, for discrete distribution, by a sum involving the probabilities in the distribution).

2) There is the "variance of a sample" when we speak of the sample as a particular set of numbers. Let [itex] \bar{X} [/itex] be the numerical value of the sample mean. Textbooks vary in how they define the variance of a sample. Some define the variance of [itex] n [/itex] numbers [itex] \{ X_1, X_2,...X_n \}[/itex]
to be [itex] \frac { \sum_{i=1}^n (X_i - \bar{X})^2}{n} [/itex] and some define it to be [itex] \frac { \sum_{i=1}^n (X_i- \bar{X})^2}{n-1} [/itex]

3) There is the "unbiased estimator of the population variance". This is a function of the sample values. It is the function [itex] \frac { \sum_{i=1}^n (X_i - \bar{X})^2}{n-1} [/itex] (Note that an "estimator" is technically a function, not a single number. When we have a particular sample, we can substitute particular numbers into the formula for the estimator and get a particular estimate.)

It should be clear that the mean of a sample of a population might not be the mean of the population. Likewise, the variance of a sample may not be the variance of the population. Likewise, an estimate produced by the unbiased estimator of the population variance may fail to equal the population variance.

All three of the above things are dealt with in the various links you gave. Your confusion comes from the fact that you think all the various links are referring to only one single concept of "variance".

I guess my understanding is vague.


http://bildr.no/view/1124676 (b)


But I think I get it now. Here is an example that makes the difference a bit clearer. This is the normal distribution for one element compared to many elements, as in (b):

http://bildr.no/view/1125322 (c)

But what I do not get is this statement of the central limit theorem:

http://bildr.no/view/1125334 (d)

The part I don't get is: as [tex]n \rightarrow \infty[/tex], the distribution approaches the standard normal distribution n(z; 0, 1).

Is that possible to prove?
Stephen Tashi
#8
Mar6-12, 07:25 PM
Sci Advisor
P: 3,295
Quote Quote by georg gill View Post

Is that possible to prove?
Yes, but I'm not saying I can do the proof!

It's an interesting problem in itself just to define what it means for a sequence of distributions to approach another distribution.
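The classical proof goes through characteristic functions, which I won't reproduce here, but the convergence is easy to watch numerically. A Python/NumPy sketch (the uniform population is just an arbitrary non-normal choice), standardizing the mean of n uniform draws as in the statement (d):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200          # elements per sample
trials = 20_000  # number of samples drawn

# Uniform(0, 1) has mu = 1/2 and sigma^2 = 1/12: decidedly non-normal.
mu, sigma = 0.5, (1 / 12) ** 0.5

# Standardize each sample mean: Z = (Xbar - mu) / (sigma / sqrt(n)).
xbar = rng.uniform(size=(trials, n)).mean(axis=1)
z = (xbar - mu) / (sigma / np.sqrt(n))

# If the CLT holds, Z should look like N(0, 1) for large n:
print(z.mean(), z.std())   # both close to 0 and 1
```

A histogram of z against the standard normal density makes the convergence visible even for modest n.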

