
Homework Help: Simple regression: not including the intercept term

  1. Sep 29, 2014 #1



    1. The problem statement, all variables and given/known data

    The simple regression model is y = α + βx + u, where u is the error term. If α is not included in the fitted model, when is the estimate of β unbiased?

    2. Relevant equations
    y = α + βx + u

    3. The attempt at a solution

    Not including α doesn't affect whether β is unbiased because α is a constant.
  3. Sep 29, 2014 #2

    Ray Vickson

    Science Advisor
    Homework Helper

    If the true model is ##y = \beta x + \epsilon##, you get an unbiased estimate of ##\beta## by using the least-squares method on the model ##\hat{y} = a + b x## (including the intercept!). The point is that ##a, b## are both unbiased for the true model ##y = \alpha+\beta x + \epsilon##, and this is true even if it happens that ##\alpha = 0##. Therefore, my guess would be that the estimate obtained from the no-intercept fit ##\hat{y} = bx## would be biased. After all, the two estimates of ##\beta## are given by different formulas in the ##(x_i, y_i)## data points, and one of the formulas gives an unbiased result.
  4. Sep 30, 2014 #3


    Homework Helper

    If [itex] \alpha \ne 0 [/itex] but you fit the "no intercept" model, then the estimate of the slope will be biased. To see this, begin with
    [tex] E(b) = E\left[\left( X'X \right)^{-1}X' y\right] = E\left[\left( X'X \right)^{-1}X' \left(\alpha + X \beta + \epsilon \right)\right] [/tex]

    and work through the right side. You'll be able to see the only two conditions under which the estimate of the slope won't be biased. Essentially, it's biased because you're fitting an incorrect model: fitting no intercept when one exists.
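
For the single-regressor case in this thread, the right side can be worked through term by term. This is only a sketch, under the assumptions that the ##x_i## are treated as fixed (or conditioned on), that ##E(\epsilon_i) = 0##, and that ##b## denotes the no-intercept least-squares slope:

```latex
\begin{align*}
b &= \frac{\sum_i x_i y_i}{\sum_i x_i^2}
   = \frac{\sum_i x_i\,(\alpha + \beta x_i + \epsilon_i)}{\sum_i x_i^2}
   = \beta + \alpha\,\frac{\sum_i x_i}{\sum_i x_i^2}
     + \frac{\sum_i x_i \epsilon_i}{\sum_i x_i^2}, \\[4pt]
E(b) &= \beta + \alpha\,\frac{\sum_i x_i}{\sum_i x_i^2}.
\end{align*}
```

Under those assumptions the extra term vanishes only when ##\alpha = 0## or ##\sum_i x_i = 0## (that is, the ##x## values average to zero).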

    Regression without the intercept is rarely a good idea, both for this reason and because the traditional [itex]R^2[/itex] statistic is no longer meaningful for a no-intercept fit (there are other issues as well).
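
Both points can be checked numerically. The sketch below is not from the thread: the values of α, β, the x grid, the noise level, and all function names are illustrative assumptions. It averages the two slope estimates over many simulated samples, then computes the usual centered R² for a no-intercept fit on a single sample with a weak slope and a large intercept.

```python
import random

def slope_no_intercept(xs, ys):
    """Least-squares slope for y = b*x (line forced through the origin)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def fit_with_intercept(xs, ys):
    """Least-squares (a, b) for y = a + b*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

def centered_r2(xs, ys, a, b):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken about the mean of y."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def mean_slopes(alpha, beta, xs, reps=2000, sigma=1.0, seed=0):
    """Average both slope estimates over `reps` simulated samples (xs held fixed)."""
    rng = random.Random(seed)
    total_no = total_with = 0.0
    for _ in range(reps):
        ys = [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]
        total_no += slope_no_intercept(xs, ys)
        total_with += fit_with_intercept(xs, ys)[1]
    return total_no / reps, total_with / reps

# Illustrative values only (assumptions, not taken from the thread).
ALPHA, BETA = 5.0, 2.0
xs_positive = [float(i) for i in range(1, 11)]   # sum(x) = 55, so sum(x) != 0
xs_centered = [x - 5.5 for x in xs_positive]     # same spacing, sum(x) == 0

# Bias of the no-intercept slope for fixed x: alpha * sum(x) / sum(x^2)
predicted_bias = ALPHA * sum(xs_positive) / sum(x * x for x in xs_positive)

mean_no_pos, mean_with_pos = mean_slopes(ALPHA, BETA, xs_positive)
mean_no_cen, _ = mean_slopes(ALPHA, BETA, xs_centered)
mean_no_zero, _ = mean_slopes(0.0, BETA, xs_positive)

print("alpha != 0, sum(x) != 0:", mean_no_pos, "vs", BETA + predicted_bias)
print("with intercept fitted  :", mean_with_pos, "vs", BETA)
print("alpha != 0, sum(x) == 0:", mean_no_cen, "vs", BETA)
print("alpha == 0             :", mean_no_zero, "vs", BETA)

# One sample with a weak slope and a large intercept, to look at R^2.
rng = random.Random(1)
ys_flat = [ALPHA + 0.1 * x + rng.gauss(0.0, 0.5) for x in xs_positive]
b0 = slope_no_intercept(xs_positive, ys_flat)
a1, b1 = fit_with_intercept(xs_positive, ys_flat)
r2_no = centered_r2(xs_positive, ys_flat, 0.0, b0)
r2_with = centered_r2(xs_positive, ys_flat, a1, b1)
print("centered R^2, no intercept  :", r2_no)
print("centered R^2, with intercept:", r2_with)
```

With these settings the averaged no-intercept slope should sit near β plus the predicted bias (5·55/385 ≈ 0.714) in the first case and near β in the other two. The centered R² for the no-intercept fit comes out negative, which is one concrete sense in which the traditional statistic stops being interpretable. Some software reports an uncentered R² for no-intercept fits instead; that value is not comparable with the usual one.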