To find the mean we divide the sum of the numbers by 'n'; for variance, why 'n-1'?

dexterdev · Aug 16, 2013

Hi all,
To find the average, we take the sum of the numbers in a sequence and divide by the number of elements. Why does this change to the number of elements minus 1 for the variance?

-Devanand T

Stephen Tashi · Aug 16, 2013

The definition of the formula for the "sample variance" varies from book to book. Many books prefer the formula that uses division by n-1 because this is also the formula for the "unbiased estimator" of the population variance. Do you know about the definition of an "unbiased" estimator? - or the definition of an "estimator" for that matter?
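To make the two conventions concrete, here is a small sketch using Python's standard `statistics` module, where `pvariance` divides by n and `variance` divides by n-1. The data values are made up purely for illustration.

```python
import statistics

# Illustrative data only; any list of numbers would do.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(data)
mean = sum(data) / n
sum_sq_dev = sum((x - mean) ** 2 for x in data)  # sum of squared deviations

var_n = sum_sq_dev / n                # "divide by n" convention
var_n_minus_1 = sum_sq_dev / (n - 1)  # "divide by n-1" convention

# The standard library exposes both conventions under different names.
print(var_n, statistics.pvariance(data))
print(var_n_minus_1, statistics.variance(data))
```

The two results differ only by the factor n/(n-1), so the gap shrinks as the sample grows.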

chiro · Aug 16, 2013

To add to what Stephen Tashi said, there are general formulas for an arbitrary number of degrees of freedom.

If you do enough statistics you'll see n-1, n-2, n-4 and so on in place of n. The reason is that you are estimating a quantity after already estimating p parameters from the same data, which is why you divide by n-p to get an unbiased estimator.

You'll learn this when you look at degrees of freedom in depth.

dexterdev · Aug 16, 2013

Thank you guys, I will look into what an estimator and degrees of freedom are. I have no idea about these yet.

haruspex · Aug 24, 2013

Let me try to show why n-1 is right in certain situations.
Suppose there is a total population N with unknown mean μ and variance σ². From this, you draw a sample S = {x_i} of size n. You can calculate the mean μ_S = Σx_i/n and variance σ²_S = Σ(x_i-μ_S)²/n of the sample (just treated as a set of numbers).
Consider the function V_S(y) = Σ(x_i-y)²/n.
If somehow you knew the real value of μ you could write down an unbiased estimate for σ² as V_S(μ).
The mean of the sample, μ_S, is the number y that minimises V_S(y). Since the true mean of the population is likely to be a little different, V_S(μ) is never smaller than V_S(μ_S) and is usually larger.
One way to think of this is that measuring all the sample values relative to μ_S removes one degree of freedom, leaving only n-1 degrees for how they are scattered around it.
Anyway, it's not hard to show that σ²_S*n/(n-1) is an unbiased estimator for σ².
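For anyone who wants the steps, here is a sketch of that calculation in the notation above. It assumes the draws are independent (sampling with replacement, or N much larger than n), so that the variance of μ_S is σ²/n; without that assumption a finite-population correction appears.

```latex
% Split the deviations from the true mean \mu around the sample mean \mu_S:
\sum_{i=1}^{n} (x_i - \mu)^2
  = \sum_{i=1}^{n} (x_i - \mu_S)^2 + n(\mu_S - \mu)^2
\quad\Longrightarrow\quad
V_S(\mu) = V_S(\mu_S) + (\mu_S - \mu)^2 .

% Take expectations, using E[V_S(\mu)] = \sigma^2 and
% E[(\mu_S - \mu)^2] = \operatorname{Var}(\mu_S) = \sigma^2 / n:
E\!\left[\sigma_S^2\right] = E\!\left[V_S(\mu_S)\right]
  = \sigma^2 - \frac{\sigma^2}{n}
  = \frac{n-1}{n}\,\sigma^2
\quad\Longrightarrow\quad
E\!\left[\frac{n}{n-1}\,\sigma_S^2\right] = \sigma^2 .
```

The cross term vanishes in the first line because Σ(x_i-μ_S) = 0.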
Sadly, the terminology is very misleading. "Sample variance" generally refers to the estimated variance of the population given the sample (i.e. the n-1 form), when it sounds like it should be the variance of the sample taken as an abstract set of numbers (the n form).
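The argument can also be checked numerically. The sketch below is only an illustration: the function name `simulate`, the normal population, and the values μ = 10, σ = 2, n = 5 are all assumptions chosen for the example, not part of the argument. It averages V_S(μ), V_S(μ_S), and V_S(μ_S)*n/(n-1) over many samples.

```python
import random

def simulate(n=5, trials=20000, mu=10.0, sigma=2.0, seed=0):
    """Average three variance estimates over many samples of size n.

    Samples are independent draws from a normal distribution
    (an assumed, effectively infinite population).
    """
    rng = random.Random(seed)
    total_known = total_n = total_n_minus_1 = 0.0
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        mean_s = sum(xs) / n
        v_true = sum((x - mu) ** 2 for x in xs) / n      # V_S(mu): true mean known
        v_samp = sum((x - mean_s) ** 2 for x in xs) / n  # V_S(mu_S): the n form
        total_known += v_true
        total_n += v_samp
        total_n_minus_1 += v_samp * n / (n - 1)          # the n-1 form
    return total_known / trials, total_n / trials, total_n_minus_1 / trials

avg_known, avg_n, avg_n_minus_1 = simulate()
print("true variance:       ", 2.0 ** 2)
print("mean of V_S(mu):     ", avg_known)
print("mean of n form:      ", avg_n)
print("mean of n-1 form:    ", avg_n_minus_1)
```

With these settings the n form should average close to σ²(n-1)/n = 3.2, while the other two should average close to σ² = 4, up to simulation noise.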
