Is Unbiasedness in Estimation Truly Reflective of Population Parameters?

fog37 · Mar 29, 2023

Hello (again).

I have a basic question about standard error and unbiased estimators.

Let's say we have a population with a certain mean height and a corresponding variance. We can never know these two parameters, the mean and the variance, and we can only estimate them. Certainly, the more accurate the estimates, the better.

To achieve that, we want our estimators to be unbiased so the expectation value of the mean "tends" to the actual population mean. This unbiasedness is rooted in the theoretical idea that if we took many, many samples, calculated their means, and built the sampling distribution of the means, the mean of the means would be the actual population mean. However, the sample means all differ from each other: some are close to the true mean, some are very off... The standard error of the sample mean is essentially the standard deviation of that (approximately normal) sampling distribution, telling us how much the various sample means differ from each other.
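
To make the repeated-sampling idea concrete, here is a minimal simulation sketch using only the Python standard library. The population (normal heights with mean 170 and standard deviation 10), the sample size, and the function name `simulate_sample_means` are all assumptions chosen for illustration, not anything given in the question.

```python
import random
import statistics

def simulate_sample_means(mu=170.0, sigma=10.0, n=25, num_samples=20000, seed=0):
    """Draw many samples of size n from Normal(mu, sigma) and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]

means = simulate_sample_means()

# Unbiasedness: the average of the sample means should sit close to mu.
mean_of_means = statistics.fmean(means)

# Standard error: the spread (standard deviation) of the sample means
# should be close to sigma / sqrt(n) = 10 / 5 = 2.
sd_of_means = statistics.pstdev(means)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"SD of sample means:   {sd_of_means:.3f} (theory: {10.0 / 25 ** 0.5:.3f})")
```

Any individual entry of `means` can still be a few units away from 170, which is exactly the single-sample worry; the standard error only quantifies how large such misses typically are.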

That said, assuming that is correct, we only work with a single sample of size ##n## and have a single sample mean whose value could still be very far from the actual mean, i.e. we could be very off! Isn't that a problem? The idea that ##E[sample mean]=true mean## seems very abstract. I know that, in statistics, we always have no choice but to deal with uncertainty... I guess knowing that ##E[sample mean]=true mean## gives us a little more confidence that our sample statistic is a decent result? It is a better type of uncertainty than other uncertainties...

The same idea applies to 95% confidence intervals: if we took a million samples and calculated a ##CI## from each, the true population mean would be captured inside about 95% of those intervals. That is an interesting result but, working with a single sample as we always do, it may well be possible that our constructed ##CI## does not contain the true population parameter!
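
The coverage claim can be checked the same way. The sketch below is only an illustration under assumed values (normal population, ##n = 30##, 10,000 repetitions) and uses the normal-approximation multiplier 1.96 with the sample standard deviation; with a sample this small a t quantile would be more exact, so the observed coverage is expected to land slightly under 95%. The function name `ci_coverage` is made up for this example.

```python
import random
import statistics

def ci_coverage(mu=170.0, sigma=10.0, n=30, num_samples=10000, z=1.96, seed=1):
    """Fraction of normal-approximation 95% intervals that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        # Estimated standard error: sample SD (n - 1 divisor) over sqrt(n).
        se = statistics.stdev(sample) / n ** 0.5
        if xbar - z * se <= mu <= xbar + z * se:
            hits += 1
    return hits / num_samples

coverage = ci_coverage()
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

The remaining fraction of intervals misses the true mean, and from inside a single sample there is no way to tell which kind of interval you happened to get.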

Thank you!

Dale · Mar 29, 2023

Yes, what you say is correct.

However, it is not the only way to look at these issues. Rather than assuming that there is a fixed "true mean" at all, you can consider the population mean itself to be a random variable. That is Bayesian probability.

fog37 · Mar 29, 2023

I see. I am not there yet to get into Bayesian probability :)

Another thing I am wondering:
In statistics, lots of sophisticated tests are run to check whether key assumptions hold. For example, when we fit a model using OLS, the estimates may not be trustworthy if the assumptions about the residuals are not met...

In machine learning, working with lots of data, the main concern is the performance metrics that assess how good the model is... It is all about prediction and less about inference, correct? Does that mean that all those statistical tests are less of a concern when the amount of data is large and the goal is prediction?

thanks!

FactChecker · Mar 29, 2023

fog37 said:

The idea that ##E[sample mean]=true mean## seems very abstract.

To me, it doesn't seem abstract to say that the formula we use has the same expected value as the correct answer.

fog37 said:

working with a single sample as we always do, it may well be possible that our constructed ##CI## does not contain the true population parameter!

True, but you have 95% confidence that the ##CI_{95}## contains the true value. And if you want more confidence, you can work with other common confidence levels like 97.5% or 99%, which just expands the intervals to cover more.

In estimating the variance, ##\sigma^2##, of a random variable, you might consider ##\sum {(X_i-\mu)^2}/n##, ##\sum {(X_i-\bar X)^2}/n##, or ##\sum {(X_i-\bar X)^2}/(n-1)##.

1) ##\sum {(X_i-\mu)^2}/n## is unbiased, but it requires knowledge of the true population mean, ##\mu##.

2) ##\sum {(X_i-\bar X)^2}/n## uses the sample mean to estimate ##\mu##, but it is biased. It gives an estimate of the variance that tends to be low. The way to see this is to realize that each ##(X_i - \bar X)^2## is a little small because the value of ##X_i## has drawn ##\bar X## toward it, which wouldn't happen with ##(X_i-\mu)^2##.

3) ##\sum {(X_i-\bar X)^2}/(n-1)## does not need the knowledge of ##\mu## and it exactly corrects for the problem of ##\sum {(X_i-\bar X)^2}/n## tending to be too small. It is unbiased.
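
A quick way to see the three behaviors side by side is to average each estimator over many simulated samples. This is a rough sketch under assumed settings (normal data with ##\mu = 0##, ##\sigma^2 = 4##, and a deliberately small ##n = 5## so the bias is visible); the function name `average_variance_estimates` is hypothetical.

```python
import random
import statistics

def average_variance_estimates(mu=0.0, sigma=2.0, n=5, num_samples=50000, seed=2):
    """Average the three variance estimators over many samples of size n."""
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(num_samples):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(x)
        ss_mu = sum((xi - mu) ** 2 for xi in x)      # deviations from the true mean
        ss_xbar = sum((xi - xbar) ** 2 for xi in x)  # deviations from the sample mean
        totals[0] += ss_mu / n            # 1) known mu, divide by n
        totals[1] += ss_xbar / n          # 2) sample mean, divide by n
        totals[2] += ss_xbar / (n - 1)    # 3) sample mean, divide by n - 1
    return [t / num_samples for t in totals]

known_mean, divide_by_n, divide_by_n_minus_1 = average_variance_estimates()
print(f"1) known mu, /n:     {known_mean:.3f}")
print(f"2) sample mean, /n:  {divide_by_n:.3f}")
print(f"3) sample mean, /(n-1): {divide_by_n_minus_1:.3f}")
```

With a true variance of 4, estimators 1) and 3) should average close to 4, while 2) should average close to ##(n-1)/n \cdot 4 = 3.2##.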

Dale · Mar 29, 2023

fog37 said:

lots of sophisticated statistical tests are run to check for key assumptions being verified or not.

I would say that lots of sophisticated statistical tests should be run to check. Unfortunately, a lot of the time people just plug in their data and look for a p-value.

Hornbein · Mar 30, 2023

fog37 said:

The same idea applies to 95% confidence intervals: if we took a million samples and calculated a ##CI## from each, the true population mean would be captured inside about 95% of those intervals. That is an interesting result but, working with a single sample as we always do, it may well be possible that our constructed ##CI## does not contain the true population parameter!

Thank you!

You may choose any confidence level you like. 99.9999% is quite possible.

Dale · Mar 30, 2023

Or you can just construct the confidence interval ##(-\infty,\infty)## and get 100% confidence.

Stephen Tashi · Mar 30, 2023

fog37 said:

That said, assuming that is correct, we only work with a single sample of size ##n## and have a single sample mean whose value could still be very far from the actual mean, i.e. we could be very off! Isn't that a problem? The idea that ##E[sample mean]=true mean## seems very abstract. I know that, in statistics, we always have no choice but to deal with uncertainty... I guess knowing that ##E[sample mean]=true mean## gives us a little more confidence that our sample statistic is a decent result?

Yes, there is a fundamental problem in conceptualizing statistics and probability theory. We seek to develop mathematics in a careful way so that its theorems are true, i.e. certainly true. So what happens when the subject matter of the mathematics is supposed to represent something that is uncertain?

The bottom line is that probability theory (when interpreted as uncertainty about whether something happens) only tells you things about probabilities. It has no theorems that tell you that something is certain to happen. (The closest you get to that type of result are theorems about the limits of probabilities approaching 1.) The theorems of probability theory have the form: if the probability of something is such-and-such, then the probability of this other thing is so-and-so. You can see this pattern in sampling statistics. A sample statistic of a random variable is itself a random variable, so the sample statistic has a distribution. The distribution of a sample statistic has its own sample statistics, and they have their own distributions, etc.

So applying probability theory depends on the science of whatever subject you are applying it to. There are no theorems in probability theory that guarantee the correctness of using it in a particular way in all possible practical situations. Things like the familiar numbers used for "statistical significance" are not consequences of mathematical theorems. They've come into use because they've been empirically useful in many situations.

Hornbein · Mar 31, 2023

Stephen Tashi said:

So applying probability theory depends on the science of whatever subject you are applying it to. There are no theorems in probability theory that guarantee the correctness of using it in a particular way in all possible practical situations. Things like the familiar numbers used for "statistical significance" are not consequences of mathematical theorems. They've come into use because they've been empirically useful in many situations.

I would like to point out that this is true of all applied mathematics. Physics was all applied mathematics until string theory emerged.

The challenge of General Relativity wasn't the math; it lay in convincing people that the mathematics corresponded to the real world.

statdad · Apr 14, 2023

A slightly different comment here. From your post, this

"we want our estimators to be unbiased so the expectation value of the mean "tends" to the actual population mean."
confuses two things. When we say an estimator is unbiased, we mean that its expected value equals the parameter being estimated, and this holds regardless of the sample size: it is a property of the "structure" of the estimator, not of any particular sample size.
Saying "the expectation value of the mean "tends" to the actual population mean" isn't the same thing. This is "asymptotically unbiased": here the estimator may not be unbiased, but the sequence of expectations of the estimator converges (as ##n## tends to infinity) to the target parameter. This is an asymptotic property, a limiting property.
I'll also comment that, occasionally, we're willing to accept an estimator with a small amount of bias if its standard error is significantly lower than that of an unbiased estimator.
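
As one illustration of both points, consider estimating ##\sigma^2## by ##\sum (X_i-\bar X)^2 / d## with ##d = n-1## (unbiased) or ##d = n## (biased, but asymptotically unbiased). The sketch below assumes normally distributed data, for which ##\sum (X_i-\bar X)^2## has mean ##(n-1)\sigma^2## and variance ##2(n-1)\sigma^4##; the variance and MSE figures rely on that normality assumption, and the function name `variance_estimator_summary` is made up for this example.

```python
def variance_estimator_summary(n, divisor, sigma2=1.0):
    """Exact bias, variance, and MSE of SS / divisor for normal data,
    where SS = sum((X_i - Xbar)^2) over a sample of size n."""
    expected = (n - 1) * sigma2 / divisor              # E[SS / divisor]
    bias = expected - sigma2
    variance = 2 * (n - 1) * sigma2 ** 2 / divisor ** 2
    mse = variance + bias ** 2                         # MSE = variance + bias^2
    return bias, variance, mse

for n in (5, 50, 500):
    for label, divisor in (("n-1", n - 1), ("n", n)):
        bias, variance, mse = variance_estimator_summary(n, divisor)
        print(f"n={n:3d} divisor={label:>3}: bias={bias:+.4f} "
              f"variance={variance:.4f} mse={mse:.4f}")
```

Under these assumptions the ##d = n## estimator has a bias of ##-\sigma^2/n##, which shrinks to zero as ##n## grows, while its variance and mean squared error are smaller than those of the unbiased ##d = n-1## version at every ##n##.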
