Undergrad R-squared statistic for goodness-of-fit

  • Thread starter: EngWiPy
  • Tags: Statistic
The discussion centers on the R-squared statistic, which measures goodness-of-fit in linear regression and is defined as R^2 = 1 - (sum of squared residuals / total sum of squares about the mean). Participants explore why R-squared ranges from 0 to 1 for a least-squares line, with 1 indicating a perfect fit and 0 indicating that the model explains no variability in the data beyond what the mean already captures. The original poster asks how R-squared can equal 0, which leads to covariance and the correlation coefficient as foundational concepts. The thread emphasizes that, for least-squares regression with an intercept, R-squared is the square of the Pearson correlation coefficient between the observed and predicted values.
EngWiPy
Hello all,

While I was reading about linear regression, I stumbled on the R-squared statistic, which measures the goodness-of-fit of the line to the data points. It is defined as:

R^2 = 1 - \frac{\sum_i (y_i - f(x_i))^2}{\sum_i (y_i - E[y])^2}

where f(x_i) is the fitted/predicted response value for x_i, y_i is the actual observed response, and E[y] is the mean of the observed y_i values.
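As a rough illustration of the definition above, here is a minimal Python sketch. The helper name `r_squared` and the data are made up for this example, and E[y] is taken to be the sample mean of the observed values.

```python
import numpy as np

def r_squared(y, y_pred):
    """R^2 from the definition: 1 - SS_res / SS_tot (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y - y_pred) ** 2)    # numerator: squared residuals
    ss_tot = np.sum((y - y.mean()) ** 2)  # denominator: squared deviations from the mean
    return 1.0 - ss_res / ss_tot

# Made-up data, for illustration only
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Least-squares line with an intercept
slope, intercept = np.polyfit(x, y, deg=1)
r2_line = r_squared(y, slope * x + intercept)

# Predicting the sample mean everywhere makes numerator and denominator equal
r2_mean = r_squared(y, np.full_like(y, y.mean()))

print(r2_line, r2_mean)
```

The second call shows the baseline case: a model that only ever predicts the mean has R^2 = 0, while the nearly linear data gives a value close to 1.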

It is said that this statistic falls between 0 and 1. I can understand why R-squared could be 1 (it means that y_i = f(x_i), i.e., the line fits the data points exactly), but how could R-squared be 0? I think this implies that the largest the numerator can be is the variation of y around its mean, so the numerator cannot exceed that value. Is this true?

Thanks
 
I'd start with a simpler question:

Do you know what a correlation coefficient is? Or even more simply: do you know what covariance is? How can you have zero covariance? Note that these build on each other:

##cov(X,Y) \to \rho(X,Y) \to R^2_{X,Y,...}##
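One way to see how a zero covariance can arise is a small numerical sketch. The data below are made up: y depends on x exactly, but not linearly, and x is symmetric about zero. The variable names are assumptions for this example only.

```python
import numpy as np

# Made-up data: x is symmetric about 0 and y = x^2 (a perfect but nonlinear relation)
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2

# Covariance: average product of deviations from the means
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))

# Correlation coefficient: covariance rescaled by the standard deviations
rho = cov_xy / (x.std() * y.std())

# For a straight-line fit with an intercept, R^2 is the square of rho
r2 = rho ** 2

print(cov_xy, rho, r2)
```

The positive and negative products of deviations cancel, so the covariance, the correlation, and the resulting R^2 of a straight-line fit are all zero, even though y is completely determined by x.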
 
StoneTemplePython said:
I'd start with a simpler question:

Do you know what a correlation coefficient is? Or even more simply: do you know what covariance is? How can you have zero covariance? Note that these build on each other:

##cov(X,Y) \to \rho(X,Y) \to R^2_{X,Y,...}##

Thanks. I know the denominator is proportional to the variance of y, but what about the numerator? I still don't see the connection. How are these related?
 
##R^2## generalizes the correlation coefficient.

https://en.wikipedia.org/wiki/Coefficient_of_determination

wikipedia said:
As squared correlation coefficient
In linear least squares regression with an estimated intercept term, ##R^2## equals the square of the Pearson correlation coefficient between the observed y and modeled (predicted) f data values of the dependent variable.
If you understand covariance, and when you can get a zero there, then this leads directly to getting the answer for your question. You can further generalize from here as needed.
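The quoted identity can be checked numerically. The sketch below uses synthetic data with arbitrary parameters chosen for illustration, and fits the line with `np.polyfit`, which includes an intercept term.

```python
import numpy as np

# Synthetic data: a noisy straight line (parameters are arbitrary)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=200)

# Least-squares line with an estimated intercept
slope, intercept = np.polyfit(x, y, deg=1)
f = slope * x + intercept

# R^2 from the definition in the first post
r2 = 1.0 - np.sum((y - f) ** 2) / np.sum((y - y.mean()) ** 2)

# Pearson correlation between observed and predicted values, and between x and y
rho_yf = np.corrcoef(y, f)[0, 1]
rho_xy = np.corrcoef(x, y)[0, 1]

print(r2, rho_yf ** 2, rho_xy ** 2)
```

All three printed values agree up to floating-point error. With a single predictor the prediction f is a linear function of x, so the correlation of y with f has the same magnitude as the correlation of y with x.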
 
In the Wikipedia page it says:

The better the linear regression (on the right) fits the data in comparison to the simple average (on the left graph), the closer the value of ##R^2## is to 1.

I think I understand now why ##R^2## could be 0. In calculating the parameters of the line to fit the data, we minimized the sum of squared errors. This means that:

\sum_i(y_i - f(x_i))^2 \leq \sum_i (y_i - E[y])^2

with the worst case being when equality holds. I think the correlation coefficient is another way of understanding the coefficient of determination.
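The inequality above can be spelled out in one more step. This is a sketch that assumes the fitted line has the form f(x) = a + b x with an estimated intercept; the symbols a and b are introduced here and do not appear earlier in the thread.

```latex
\begin{align*}
\sum_i (y_i - f(x_i))^2
  &= \min_{a,\,b} \sum_i (y_i - a - b\,x_i)^2 \\
  &\leq \sum_i (y_i - E[y] - 0 \cdot x_i)^2
   = \sum_i (y_i - E[y])^2
\end{align*}
```

The horizontal line a = E[y], b = 0 is one of the candidates in the minimization, so the least-squares line can never do worse than it. Dividing by the right-hand side gives 0 \leq R^2 \leq 1, and R^2 = 0 occurs when the best slope is b = 0, which is the zero-covariance case raised earlier in the thread.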
 