When Is a Linear Model Not Good Despite r^2 Close to 1?

  • Context: Undergrad 
  • Thread starter: Bacle
  • Tags: Linear Model

Discussion Overview

The discussion revolves around the limitations of linear models in least-squares regression, particularly in cases where the coefficient of determination (r²) is close to 1. Participants explore various factors that may lead to ineffective linear modeling despite high r² values, including the distribution of residuals and model complexity.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants suggest that non-linear distributions of residuals can indicate that a linear model is inappropriate, even with high r² values.
  • Others propose that overfitting or insufficient data points relative to the number of dimensions can undermine the validity of a model, regardless of correlation strength.
  • A participant mentions the importance of adjusted r² as a more reliable statistic than r², as it accounts for the number of parameters in the model.
  • There is a question about the next steps if adjusted r² is low, considering whether to explore linear models with multiple variables or to investigate polynomial models.
  • Suggestions include adding square or interaction terms to the model, but a low r² may indicate that the chosen regressors do not adequately explain the dependent variable.
  • A resource is shared for model evaluation that uses adjusted r² and Akaike Information Criterion values.

Areas of Agreement / Disagreement

Participants express multiple competing views on the effectiveness of linear models and the interpretation of r² and adjusted r², indicating that the discussion remains unresolved.

Contextual Notes

Limitations include potential assumptions about the data distribution, the impact of model complexity, and the adequacy of the chosen regressors, which are not fully explored or resolved in the discussion.

Bacle
Hi, All:
I was reading about cases in which linear models fitted by least-squares regression were found to be ineffective despite values of r and r^2 being close to 1 (obviously, the two go together).
I think the issue has to do with the residuals showing a distinctly non-linear pattern (and definitely not being normally distributed), e.g., a residual plot that looks like a parabola, a cubic, etc.
I'm curious whether anyone knows of examples and/or results in this respect, and what other checks can be made to see whether a linear model makes sense for a data set. The checks I know of are the lack-of-fit sum-of-squares F-test and inference for regression (with H0: slope is zero).

Thanks.
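
A small numeric sketch can make the situation in the question concrete. The data here are hypothetical (an exactly quadratic relationship sampled over a narrow range of x), and the helper names `fit_line` and `r_squared` are made up for this illustration. The straight-line fit gets r^2 above 0.99, yet the residuals are positive at both ends and negative in the middle, which is the kind of curved pattern that suggests the wrong functional form.

```python
# Hypothetical data: y = x^2 sampled at 50 evenly spaced points on [10, 20].
xs = [10 + 10 * i / 49 for i in range(50)]
ys = [x ** 2 for x in xs]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x, using the closed-form solution."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def r_squared(ys, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


a, b = fit_line(xs, ys)
fitted = [a + b * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]
r2 = r_squared(ys, fitted)

print(f"r^2 = {r2:.4f}")
# Residuals at the left end, the middle, and the right end of the x range.
print(residuals[0], residuals[25], residuals[-1])
```

Plotting `residuals` against `xs` would show a parabola, so a high r^2 alone does not rule out systematic lack of fit.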
 
Another way: suppose there is overfitting, or there are not enough data points for the number of dimensions. If you have 100 data points but are using a model with 100 different dimensions, it doesn't matter how good your correlation is, because a model with that many free parameters can fit almost any data exactly.
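
As a rough illustration of that point, the sketch below fits a model with as many parameters as data points. It substitutes polynomial terms in a single variable for the "dimensions" (an assumption made only to keep the example short), and the response is pure random noise. With n parameters and n points, the least-squares polynomial is the interpolating polynomial, so the in-sample r^2 is 1 even though there is nothing to explain. The function name `interpolating_poly` is hypothetical.

```python
import random

random.seed(0)  # arbitrary seed, used only for reproducibility

n = 8
xs = list(range(n))                      # hypothetical predictor values
ys = [random.gauss(0, 1) for _ in xs]    # pure noise, unrelated to x


def interpolating_poly(xs, ys):
    """Return the degree n-1 polynomial through all n points (Lagrange form)."""
    def predict(x):
        total = 0.0
        for j, (xj, yj) in enumerate(zip(xs, ys)):
            weight = 1.0
            for m, xm in enumerate(xs):
                if m != j:
                    weight *= (x - xm) / (xj - xm)
            total += yj * weight
        return total
    return predict


model = interpolating_poly(xs, ys)
fitted = [model(x) for x in xs]

y_bar = sum(ys) / n
ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
ss_tot = sum((y - y_bar) ** 2 for y in ys)
r2 = 1 - ss_res / ss_tot

print(f"in-sample r^2 = {r2:.6f}")
```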
 
A high [itex]R^{2}[/itex] is not the only important statistic to check. I prefer adjusted [itex]R^{2}[/itex], because plain [itex]R^{2}[/itex] tends to inflate as you add more parameters to the model.
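
For reference, the usual textbook definition of adjusted R^2 is 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the number of observations and p the number of regressors (not counting the intercept). The sketch below just evaluates that formula; the function name `adjusted_r2` and the example numbers are made up for illustration.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p regressors (intercept not counted)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 observations")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# The same raw R^2 of 0.90 looks very different once model size is accounted for.
print(adjusted_r2(0.90, n=100, p=5))   # many observations, few regressors
print(adjusted_r2(0.90, n=12, p=10))   # almost as many regressors as observations
```

In the first case the adjustment is small (about 0.895); in the second the adjusted value drops below zero, which is the penalty for spending nearly all the degrees of freedom on parameters.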
 
Thanks, Pyrrhus:

What do I then do if the adjusted R^2 is low? Do I start considering linear models in two or more variables, or do I consider quadratic, cubic, etc. models?
 
You could try adding square terms and interaction terms, but if the r-squared is still low, it might just be that the regressors don't do a good job of explaining the dependent variable.
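
Here is a minimal sketch of that suggestion, using only the standard library. The data are hypothetical (a noise-free response with a built-in interaction, on a 5 x 5 grid), and the helpers `solve` and `ols_r2` are made-up names that fit the model through the normal equations. It compares a main-effects model with one that also includes square and interaction terms.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_r2(rows, ys):
    """Fit y on the design rows via the normal equations and return R^2."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bi * ri for bi, ri in zip(beta, r)) for r in rows]
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: the true relationship contains an x1*x2 interaction.
points = [(x1, x2) for x1 in range(5) for x2 in range(5)]
ys = [1 + 2 * x1 + 3 * x2 + 4 * x1 * x2 for x1, x2 in points]

# Design matrices: intercept plus main effects, then with squares and interaction.
linear_rows = [[1, x1, x2] for x1, x2 in points]
full_rows = [[1, x1, x2, x1 ** 2, x2 ** 2, x1 * x2] for x1, x2 in points]

r2_linear = ols_r2(linear_rows, ys)
r2_full = ols_r2(full_rows, ys)

print(f"main effects only:            r^2 = {r2_linear:.4f}")
print(f"with squares and interaction: r^2 = {r2_full:.4f}")
```

With real data the extra terms will rarely help this cleanly, and if r^2 stays low after adding them, the regressors themselves are the more likely problem.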
 
Try this incredible free tool, http://creativemachines.cornell.edu/eureqa, developed at Cornell. I've used it in my own research, rating fits by adjusted r^2 and Akaike Information Criterion values.
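
For readers unfamiliar with the Akaike Information Criterion, the sketch below shows one common form for least-squares fits, assuming Gaussian errors and dropping additive constants: AIC = n ln(RSS/n) + 2k. This is not taken from the tool linked above; the function name `aic_least_squares` and the numbers are hypothetical, and conventions for counting k differ between references. Lower values are preferred.

```python
import math


def aic_least_squares(rss, n, k):
    """AIC for a least-squares fit under Gaussian errors, up to an additive constant.

    rss: residual sum of squares
    n:   number of observations
    k:   number of estimated parameters (counting conventions vary)
    """
    return n * math.log(rss / n) + 2 * k


# Hypothetical fits to the same 50 observations: the larger model lowers the
# residual sum of squares only slightly, so its extra parameters are penalized.
simple = aic_least_squares(rss=12.0, n=50, k=3)
bigger = aic_least_squares(rss=11.8, n=50, k=6)

print(f"simple model AIC = {simple:.2f}")
print(f"bigger model AIC = {bigger:.2f}")
```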
 
Excellent, thanks!
 
