Estimator for variance when sampling without replacement

logarithmic · Feb 22, 2011

Does anyone know the formula for an unbiased estimator of the population variance [tex]\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2[/tex] when taking r samples without replacement from a finite population [tex]\{x_1, \dots, x_n\}[/tex] whose mean is [tex]\bar{x}[/tex]?

A Google search doesn't turn up anything useful other than the special case r = n, where the estimator is of course [tex]\frac{r-1}{r}s^2[/tex], with [tex]s^2 = \frac{1}{r-1}\sum_{i=1}^{r}(x_i - \bar{x})^2[/tex], which is itself the unbiased estimator when taking r samples with replacement.

I know that a (relatively) simple formula exists. I've seen it somewhere before; I just don't remember where.

MaxManus · Feb 22, 2011

Is it the hypergeometric distribution?
http://en.wikipedia.org/wiki/Hypergeometric_distribution

logarithmic · Feb 23, 2011

MaxManus said:

Is it the hypergeometric distribution?
http://en.wikipedia.org/wiki/Hypergeometric_distribution

Not that one. That's the distribution for the number of black balls drawn without replacement from a box with black and white balls. It isn't an estimator for the population variance.

MaxManus · Feb 23, 2011

logarithmic said:

Not that one. That's the distribution for the number of black balls drawn without replacement from a box with black and white balls. It isn't an estimator for the population variance.

But isn't that the distribution you described? And isn't the variance formula in the table on the right an estimator?

logarithmic · Feb 23, 2011

MaxManus said:

But isn't that the distribution you described? And isn't the variance formula in the table on the right an estimator?

Not really. I'm looking for an estimator, that is, a function of the samples f(X_1, ..., X_r), which is itself a random variable, such that E(f(X_1, ..., X_r)) = true population variance.

That variance formula isn't a random variable (i.e. it can't be an estimator); it's the variance of a certain counting random variable. And a hypergeometric random variable isn't appropriate for counting the samples, since the number of samples is assumed to be fixed at r.

uart · Feb 25, 2011

logarithmic said:

Does anyone know the formula for an unbiased estimator of the population variance [tex]\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2[/tex] when taking r samples without replacement from a finite population [tex]\{x_1, \dots, x_n\}[/tex] whose mean is [tex]\bar{x}[/tex]?

Hi logarithmic. With that definition of [itex]\bar{x}[/itex] (which is actually the definition of the population mean), an unbiased estimator of the population variance is simply

[tex]\frac{1}{r}\sum_{i=1}^{r}(x_i - \bar{x})^2[/tex]
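A short check of why this holds, writing X_i for the i-th random draw (as in the notation f(X_1, ..., X_r) earlier in the thread) and assuming simple random sampling without replacement, so that each draw is marginally uniform over the population:

[tex]E\left[(X_i - \bar{x})^2\right] = \frac{1}{n}\sum_{j=1}^{n}(x_j - \bar{x})^2 \quad\Longrightarrow\quad E\left[\frac{1}{r}\sum_{i=1}^{r}(X_i - \bar{x})^2\right] = \frac{1}{n}\sum_{j=1}^{n}(x_j - \bar{x})^2[/tex]

The draws are dependent, but linearity of expectation does not require independence.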

However, I think you really meant [itex]\bar{x}[/itex] to denote the sample mean of the r chosen samples rather than the population mean. In that case the unbiased estimator is

[tex]\left(\frac{n-1}{n}\right) \, \frac {1}{r-1}\sum_{i=1}^{r}(x_i - \bar{x})^2[/tex]
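The formula above can be sanity-checked by brute force on a small population. The sketch below is only an illustration, not part of the original answer: it assumes simple random sampling without replacement (every r-subset equally likely), and the function names and the example population are made up for the check. It averages the corrected estimator over all r-subsets using exact rational arithmetic and compares the result with the population variance (the 1/n version from the question).

```python
from fractions import Fraction
from itertools import combinations


def population_variance(pop):
    """Population variance with divisor n, as defined in the question."""
    n = len(pop)
    mean = Fraction(sum(pop), n)
    return sum((x - mean) ** 2 for x in pop) / n


def corrected_estimator(sample, n):
    """(n-1)/n times the usual sample variance s^2 (divisor r-1)."""
    r = len(sample)
    sample_mean = Fraction(sum(sample), r)
    s2 = sum((x - sample_mean) ** 2 for x in sample) / (r - 1)
    return Fraction(n - 1, n) * s2


def expected_value(pop, r):
    """Exact expectation: average the estimator over every r-subset."""
    n = len(pop)
    samples = list(combinations(pop, r))
    return sum(corrected_estimator(s, n) for s in samples) / len(samples)


# Hypothetical example population, chosen only for illustration.
population = [2, 3, 5, 7, 11, 13]
for r in range(2, len(population) + 1):
    # The two printed values should agree for every sample size r.
    print(r, expected_value(population, r), population_variance(population))
```

Because the averages are computed with `Fraction`, agreement here is exact rather than approximate. Setting r = n reproduces the special case [tex]\frac{r-1}{r}s^2[/tex] mentioned in the first post.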
