Least-Squares Error Variance

  1. Mar 7, 2009 #1
    In the textbook it says this:

    http://img6.imageshack.us/img6/1896/imgcxv.jpg [Broken]

    Where does this hocus-pocus, 'it turns out that dividing by n-2 rather than n appropriately compensates for this', come from?
    Last edited by a moderator: May 4, 2017
  2. jcsd
  3. Mar 8, 2009 #2
    You divide by n-2 because you only have n-2 degrees of freedom. Are you by chance doing a least-squares fit of a line, where you need two points to determine the line?

    Anyway, the result is only valid if your errors are statistically independent. If there is low-frequency noise, then you have even fewer than n-2 degrees of freedom.
    Last edited by a moderator: May 4, 2017
  4. Mar 8, 2009 #3
    Why do I have n-2 degrees of freedom?
  5. Mar 8, 2009 #4
    The book says that the formula you are questioning is derived in (7.21) of section (7.2); maybe post that part if you are confused.

    You are trying to fit a line to the data. There are two points required to define a line. You have n data points. The number of degrees of freedom is equal to the number of data points minus the number of points you need to fit the curve. It is therefore n-2.
  6. Mar 11, 2009 #5


    In simple linear regression two things go on.
    First, you are expressing the mean value of [tex] Y [/tex] as a linear function; this essentially says you are splitting [tex] Y [/tex] itself into two pieces, a deterministic piece (the linear term) and a probabilistic piece (the random error):

    [tex] Y = \underbrace{\beta_0 + \beta_1 x}_{\text{Deterministic}} + \overbrace{\varepsilon}^{\text{Random}} [/tex]

    When it comes to the ANOVA table, this also means that the total variability in [tex] Y [/tex] can be attributed to two sources: the deterministic portion and the random portion. It is customary to do this breakdown with the sums of squares first. The basic notation used is

    [tex]
    \begin{align*}
    SSE &= \sum (y-\widehat y)^2 \\
    SST &= \sum (y-\overline y)^2 \\
    SSR &= SST - SSE = \sum (\overline y - \widehat y)^2
    \end{align*}
    [/tex]

    SST is the numerator of the usual sample variance of [tex] Y [/tex]: think of it as measuring the variability around the sample mean.
    SSE is the sum of the squared residuals: think of this as measuring the variability around the regression line (which is another way of modeling the mean value of [tex] Y [/tex], when you think of it).
    SSR measures the variability of the regression-predicted values around the sample mean.
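
    As a quick numerical check of the three sums of squares above, here is a minimal sketch in plain Python. The data values and the helper name `fit_line` are made up for illustration; they do not come from the textbook. The slope and intercept use the usual closed-form least-squares formulas.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)
y_hat = [b0 + b1 * xi for xi in x]   # fitted values on the line
y_bar = sum(y) / len(y)              # sample mean of y

sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # around the line
sst = sum((yi - y_bar) ** 2 for yi in y)               # around the mean
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # line vs. mean

# The two printed numbers should agree up to rounding error.
print(sst, ssr + sse)
```

    The identity SST = SSR + SSE holds here because the line was fitted by least squares with an intercept; for an arbitrary line it generally would not.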

    Every time you measure variability with a sum of squares like these, you have to worry about the appropriate degrees of freedom. Mathematically, the degrees of freedom also add, just like the sums of squares do.

    The ordinary sample variance has [tex] n - 1 [/tex] degrees of freedom. One way to think of this: in order to calculate it, you must first have done [tex] 1 [/tex] calculation, the sample mean. Thinking this way, [tex] SSE [/tex] must have [tex] n - 2 [/tex] degrees of freedom, since its calculation requires two pieces of work: the slope and the intercept.
    This leaves [tex] (n - 1) - (n - 2) = 1 [/tex] degree of freedom for [tex] SSR [/tex].
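
    To see the effect of dividing by n-2 rather than n, here is a small simulation sketch in plain Python. Everything in it is an assumption made for illustration: the true line (intercept 1.0, slope 0.5), the noise level `sigma`, the sample size, and the helper name `sse_of_fit`. The errors are drawn as independent normal values, which is the case the n-2 rule covers.

```python
import random

def sse_of_fit(x, y):
    """Fit y = b0 + b1*x by least squares and return the residual sum of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

random.seed(0)        # fixed seed so the run is repeatable
n = 5                 # small n makes the difference easy to see
sigma = 2.0           # assumed true error standard deviation
x = [float(i) for i in range(n)]
trials = 20000

total_sse = 0.0
for _ in range(trials):
    # Independent normal errors around an assumed true line.
    y = [1.0 + 0.5 * xi + random.gauss(0.0, sigma) for xi in x]
    total_sse += sse_of_fit(x, y)

mean_sse = total_sse / trials
est_n = mean_sse / n          # average of SSE/n over all trials
est_n2 = mean_sse / (n - 2)   # average of SSE/(n-2) over all trials

print("true variance:", sigma ** 2)
print("average SSE/n:", est_n)
print("average SSE/(n-2):", est_n2)
```

    With independent errors, the average of SSE/(n-2) should come out close to the true variance, while SSE/n is systematically too small, because fitting the slope and intercept pulls the line toward the data.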