Estimator for variance when sampling without replacement

  • Thread starter: logarithmic
  • Tags: Sampling, Variance
logarithmic
Does anyone know the formula for an unbiased estimator of the population variance \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 when taking r samples without replacement from a finite population \{x_1, \dots, x_n\} whose mean is \bar{x}?

A Google search doesn't turn up anything useful other than the special case r = n, where the estimator is of course \frac{r-1}{r}s^2, with s^2 = \frac{1}{r-1}\sum_{i=1}^{r}(x_i - \bar{x})^2, which is the unbiased estimator when taking r samples with replacement.

I know that a (relatively) simple formula exists; I've seen it somewhere before, I just don't remember where.
 
MaxManus said:

Not that one. That's the distribution for the number of black balls drawn without replacement from a box with black and white balls. It isn't an estimator for the population variance.
 
logarithmic said:
Not that one. That's the distribution for the number of black balls drawn without replacement from a box with black and white balls. It isn't an estimator for the population variance.

But isn't that the distribution you described? And isn't the variance formula on the right table an estimator?
 
MaxManus said:
But isn't that the distribution you described? And isn't the variance formula on the right table an estimator?

Not really. I'm looking for an estimator, that is, a function of the samples f(X_1, ..., X_r), which is itself a random variable, such that E(f(X_1, ..., X_r)) equals the true population variance.

That variance formula isn't a random variable (i.e. it can't be an estimator); it's the variance of a certain random variable that counts (the number of black balls drawn). But a hypergeometric random variable isn't appropriate for counting the samples, since the number of samples is assumed to be fixed at r.
 
logarithmic said:
Does anyone know the formula for an unbiased estimator of the population variance \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 when taking r samples without replacement from a finite population \{x_1, \dots, x_n\} whose mean is \bar{x}?

Hi logarithmic. With that definition of \bar{x} (which is actually the definition of the population mean), the unbiased estimator of the population variance is simply

\frac{1}{r}\sum_{i=1}^{r}(x_i - \bar{x})^2

However, I think you really meant for \bar{x} to denote the sample mean of the r chosen samples rather than the population mean. In that case the unbiased estimator is

\left(\frac{n-1}{n}\right) \, \frac {1}{r-1}\sum_{i=1}^{r}(x_i - \bar{x})^2
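
Both formulas can be checked exactly on a small example. The following is a minimal Python sketch, assuming a made-up population of six integers; the function names (`population_variance`, `known_mean_estimator`, `corrected_estimator`, `exact_expectation`) are illustrative only. It averages each estimator over every possible size-r subset, which is the exact expectation under sampling without replacement, and compares the result with the population variance using exact fractions.

```python
from fractions import Fraction
from itertools import combinations


def population_variance(pop):
    """Population variance with the 1/n convention used in the question."""
    n = len(pop)
    mean = Fraction(sum(pop), n)
    return sum((Fraction(x) - mean) ** 2 for x in pop) / n


def known_mean_estimator(sample, pop_mean):
    """(1/r) * sum (x_i - mu)^2, where mu is the known population mean."""
    r = len(sample)
    return sum((Fraction(x) - pop_mean) ** 2 for x in sample) / r


def corrected_estimator(sample, n):
    """((n-1)/n) * s^2, where s^2 uses the sample mean and divisor r-1."""
    r = len(sample)
    mean = Fraction(sum(sample), r)
    s2 = sum((Fraction(x) - mean) ** 2 for x in sample) / (r - 1)
    return Fraction(n - 1, n) * s2


def exact_expectation(pop, r, estimator):
    """Average the estimator over all equally likely size-r subsets."""
    subsets = list(combinations(pop, r))
    return sum(estimator(s) for s in subsets) / len(subsets)


# Hypothetical population, chosen only for illustration.
population = [2, 3, 5, 7, 11, 13]
n = len(population)
pop_mean = Fraction(sum(population), n)
sigma2 = population_variance(population)

for r in range(2, n + 1):
    e_known = exact_expectation(
        population, r, lambda s: known_mean_estimator(s, pop_mean))
    e_corr = exact_expectation(
        population, r, lambda s: corrected_estimator(s, n))
    # Each comparison is exact because everything is a Fraction.
    print(r, e_known == sigma2, e_corr == sigma2)
```

For r = n there is only one subset, and the second estimator reduces to \frac{r-1}{r}s^2, the special case mentioned in the opening post.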
 