Coefficient of Determination in case of repeat points, in linear regression

  1. Feb 4, 2010 #1
    Hello,

    In simple linear regression (or even in multiple linear regression), how does one prove that the coefficient of determination, given by

    [tex]R^2 = \frac{SS_{Reg}}{SS_{Total}} = 1-\frac{SS_{Res}}{SS_{Total}}= 1-\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{\sum_{i=1}^{n}(y_i-\overline{y})^2}[/tex]

    is strictly less than 1 if there are repeat points, that is, if there are multiple distinct values of the response [itex]y_i[/itex] at a single value of the regressor [itex]x_i[/itex]?
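    As a numerical illustration of the formula above, here is a minimal sketch in plain Python. The helper name `r_squared` and both data sets are made up for this example; they are not from any particular library or from the thread. It fits a simple least-squares line and compares a data set with a repeat point against one that lies exactly on a line.

```python
def r_squared(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return (a, b, R^2).

    Hypothetical helper for illustration only. Assumes the x values are
    not all equal and the y values are not all equal.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept: the line passes through the means
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_total


# Made-up data with a repeat point: two different responses at x = 1.
xs_repeat = [1.0, 1.0, 2.0]
ys_repeat = [1.0, 2.0, 3.0]
_, _, r2_repeat = r_squared(xs_repeat, ys_repeat)

# Made-up data with no repeats that lies exactly on y = 2x.
xs_line = [1.0, 2.0, 3.0]
ys_line = [2.0, 4.0, 6.0]
_, _, r2_line = r_squared(xs_line, ys_line)

print("repeat point:", r2_repeat)
print("exact line:  ", r2_line)
```

    With the repeat point, the fitted line can only take one value at x = 1, so it misses at least one of the two responses there and R^2 comes out below 1. This is only a check on one example, not a proof.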

    Thanks in advance.
     
  3. Feb 4, 2010 #2

    EnumaElish

    Science Advisor
    Homework Helper

    Wouldn't a general proof be sufficient?
     
  4. Feb 4, 2010 #3
    Well, it is easy to see that [itex]R^2 \leq 1[/itex]. For the repeat-point case, I want to show that [itex]R^2 < 1[/itex].
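    For reference, the non-strict bound can be read straight off the definition, assuming the responses are not all equal (so that [itex]SS_{Total} > 0[/itex]):

    [tex]R^2 = 1-\frac{SS_{Res}}{SS_{Total}} \leq 1 \quad \text{because} \quad SS_{Res} = \sum_{i=1}^{n}(y_i-\hat{y}_i)^2 \geq 0,[/tex]

    with equality exactly when [itex]SS_{Res} = 0[/itex], i.e. when every residual [itex]y_i-\hat{y}_i[/itex] is zero. So the strict inequality amounts to showing that at least one residual is nonzero.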
     
  5. Feb 5, 2010 #4

    EnumaElish

    Science Advisor
    Homework Helper

    Ah; thanks for pointing that out.

    I have the outline of a heuristic proof. For R^2 = 1 the regression line has to pass through every data point. Also, as a general matter, the least-squares line y = a + b x has to go through the sample averages of (X, Y); that is, mean(Y) = a + b mean(X). Suppose your data are {(x1, y1), (x1, y2), (x2, y3)}, y1 is not equal to y2, and your slope coefficient b is finite (-infty < b < +infty). Substituting a = mean(Y) - b mean(X), the fitted value at x1 is Y(x1) = a + b x1 = mean(Y) - b(mean(X) - x1). There are three cases:

    If b(mean(X) - x1) equals mean(Y) - y1 then Y(x1) = y1, and the regression line does not go through (x1, y2).

    If b(mean(X) - x1) equals mean(Y) - y2 then Y(x1) = y2, and the regression line does not go through (x1, y1).

    If b(mean(X) - x1) equals neither mean(Y) - y1 nor mean(Y) - y2 then the regression line goes through neither (x1, y1) nor (x1, y2).

    In every case at least one residual is nonzero, so SS_Res > 0 and therefore R^2 < 1.
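    The same idea can be sketched for any number of repeat points. The grouped notation below is introduced only for this sketch and is not used elsewhere in the thread: let the distinct regressor values be indexed by [itex]j = 1, \dots, m[/itex], with responses [itex]y_{jk}[/itex], [itex]k = 1, \dots, n_j[/itex], group mean [itex]\overline{y}_j[/itex], and a single fitted value [itex]\hat{y}_j[/itex] at each distinct regressor value. Since the deviations from the group mean sum to zero within each group,

    [tex]\sum_{k=1}^{n_j}(y_{jk}-\hat{y}_j)^2 = \sum_{k=1}^{n_j}(y_{jk}-\overline{y}_j)^2 + n_j(\overline{y}_j-\hat{y}_j)^2 \geq \sum_{k=1}^{n_j}(y_{jk}-\overline{y}_j)^2 .[/tex]

    Summing over the groups gives

    [tex]SS_{Res} \geq \sum_{j=1}^{m}\sum_{k=1}^{n_j}(y_{jk}-\overline{y}_j)^2 .[/tex]

    The right-hand side (often called the pure-error sum of squares) is strictly positive as soon as some group contains two different responses, which gives [itex]SS_{Res} > 0[/itex] and hence [itex]R^2 < 1[/itex]. The argument only uses the fact that the fitted value depends on the regressor alone, so it should carry over to multiple linear regression as well.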
     
    Last edited: Feb 5, 2010