Why is/was consistency of estimators desired?

  • Context: Graduate
  • Thread starter: Stephen Tashi
  • Tags: Estimators

Discussion Overview

The discussion revolves around the concept of "consistency" in statistical estimators, exploring its historical context and modern interpretations. Participants examine the definitions and implications of consistency, particularly in relation to how estimators relate to population parameters, and the conditions under which they are deemed consistent.

Discussion Character

  • Exploratory
  • Technical explanation
  • Historical

Main Points Raised

  • One participant references Fisher's criteria for judging statistics, noting that consistency involves calculating the statistic in the same way for both sample and population.
  • Another participant questions whether the idea of consistency implies that the probability of the estimate being near the true value approaches 1.0 as sample size increases.
  • A different participant adds that consistency can be defined as a sequence of estimators converging in probability to the true value of the parameter, suggesting there may be equivalent definitions.
  • One participant discusses the importance of variance converging to 0 as sample size increases, and the relationship between the distributions of the population parameter and the estimated parameter.
  • Another participant seeks to clarify the historical definition of consistency, suggesting it requires the estimator to be computed in the same way as the parameter it estimates, and questions when the modern definition became dominant.
  • There is a mention that the unbiased estimator for variance in a Gaussian distribution may not be consistent under the old definition, as it is not computed in the same way as the population parameter.

Areas of Agreement / Disagreement

Participants express varying interpretations of the concept of consistency, with some focusing on historical definitions and others on modern interpretations. There is no consensus on the precise implications of these definitions or their historical evolution.

Contextual Notes

Participants highlight the potential ambiguity in the definitions of consistency, particularly regarding the conditions under which estimators are considered consistent. The discussion reflects a mix of modern and historical perspectives, with unresolved questions about the transition between these definitions.

Stephen Tashi:

In an article I found while researching another thread ("Revisiting a 90-year-old debate: the advantages of the mean deviation", http://www.leeds.ac.uk/educol/documents/00003759.htm ), the author states this bit of statistics history:

Fisher had proposed that the quality of any statistic could be judged in terms of three characteristics. The statistic, and the population parameter that it represents, should be consistent (i.e. calculated in the same way for both sample and population). The statistic should be sufficient in the sense of summarising all of the relevant information to be gleaned from the sample about the population parameter. In addition, the statistic should be efficient in the sense of having the smallest probable error as an estimate of the population parameter. Both SD and MD meet the first two criteria (to the same extent). According to Fisher, it was in meeting the last criteria that SD proves superior.

I recognize the descriptions of "sufficient" and "efficient" as modern criteria. But the description of "consistent" seems rather simple-minded. Was the idea of "consistent" that, if the estimator and the population parameter were calculated "in the same way", the probability of the estimate being near the true value of the parameter would approach 1.0 as the sample size approached infinity?
 


Number Nine replied:

Stephen Tashi said:
"Was the idea of 'consistent' that, if the estimator and the population parameter were calculated 'in the same way', the probability of the estimate being near the true value of the parameter would approach 1.0 as the sample size approached infinity?"

I glanced at my old mathematical statistics textbook, and it defines a sequence of estimators (in most cases, taken as sample size increases) as consistent if it converges in probability to the true value of the parameter, which is the only way I remember ever seeing it defined anywhere. I assume there are some other equivalent definitions.
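In symbols, the standard modern formulation (stated here for reference, not quoted from that textbook): a sequence of estimators $\hat{\theta}_n$ is consistent for the parameter $\theta$ if, for every $\varepsilon > 0$,

$$\lim_{n \to \infty} \Pr\!\left( \lvert \hat{\theta}_n - \theta \rvert > \varepsilon \right) = 0.$$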
 


My understanding is pretty much the same as Number Nine's, with the exception that it is quantified in terms of the variance of the estimator converging to 0 as the number of samples approaches the size of the population: in something like a census this is finite, but for a theoretical distribution it is infinite.

In terms of things being calculated "the same way", it would seem that there would be some similarity between the population parameter and the distribution of the estimated parameter, since both are based on the same underlying PDF, but I'd be interested to hear any further comments on this.

The only other thing I see as important is the actual nature of the convergence, as opposed to the mere fact that convergence exists.

Typically this is examined in terms of how the variance changes with increasing sample size, but I would think it is equally important to see how P(X = x) changes as n -> infinity, rather than just how the variance changes.
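A minimal simulation sketch of this point (my own illustration, not from the thread; the distribution, sample sizes, and tolerance eps are all assumptions): for the sample mean of a standard normal, both the variance of the estimator and the empirical P(|estimate - true mean| > eps) shrink as n grows.

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean, eps, reps = 0.0, 0.1, 5_000  # assumed values, chosen for illustration

for n in (10, 100, 1_000):
    # Draw `reps` independent samples of size n; each row yields one sample mean.
    estimates = rng.standard_normal((reps, n)).mean(axis=1)
    var_hat = estimates.var()  # empirical variance of the estimator, roughly 1/n
    miss = np.mean(np.abs(estimates - true_mean) > eps)  # empirical P(|error| > eps)
    print(f"n={n:5d}  Var(estimator)={var_hat:.5f}  P(|error| > {eps})={miss:.3f}")
```

Running it shows Var(estimator) falling roughly as 1/n while the miss probability drops toward 0, which is the convergence-in-probability picture discussed in the posts above.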
 


Stephen Tashi:

I understand (or can understand, if I read carefully) the modern definition of "consistency" for an estimator. My original post is mainly about the old-fashioned definition of consistency, which says the estimator must be computed "in the same way" as the parameter it estimates.

(An interesting historical question: when did the modern definition of consistency supersede the old one?)

I think the condition "in the same way" can be made precise by saying that we compute an (old-fashioned) consistent estimator for the parameter P by treating the sample as a population (i.e., as defining a distribution) and defining the estimate by the same formula that defines the parameter P.

If that's what was meant in olden times, then technically the unbiased estimator for the variance of a Gaussian distribution was not consistent, since it is not computed "in the same way" as the population parameter.
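To make the "same formula" point concrete, here is a small sketch (my own example, assuming the "treat the sample as a population" reading above): the plug-in estimator applies the population-variance formula directly to the sample (dividing by n), while the usual unbiased estimator divides by n - 1 and so is not computed "in the same way".

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=50)  # assumed Gaussian sample; true variance is 4

n = len(x)
plug_in = np.mean((x - x.mean()) ** 2)   # population formula applied to the sample: divide by n
unbiased = plug_in * n / (n - 1)         # the usual unbiased estimator: divide by n - 1

print(f"plug-in (divide by n):      {plug_in:.4f}")
print(f"unbiased (divide by n - 1): {unbiased:.4f}")
# Both converge to the true variance as n grows (both are consistent in the
# modern sense), but only the plug-in version uses the same formula as the
# population parameter.
```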
 
