Regression: Regularization parameter 0

Discussion Overview

The discussion centers around the implications of obtaining a regularization parameter of zero when using ScikitLearn's Elastic Net for regression on a dataset where the number of data points exceeds the number of features. Participants explore the meaning of this result in the context of overfitting and the effectiveness of different regression methods.

Discussion Character

  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • One participant notes that obtaining a hyperparameter of ##\alpha=0## suggests a preference for no regularization, raising questions about the implications for overfitting.
  • Another participant argues that if the number of features is much smaller than the number of data points and there is little correlation among features, ordinary least squares (OLS) may be the best method.
  • A different viewpoint suggests that overfitting may not be a concern when there are significantly more data points than features, as fitting a model with many parameters to a large dataset may inherently limit overfitting.
  • One participant explains that the Elastic Net combines lasso and ridge regression, so obtaining results similar to OLS indicates that the predictors are appropriate.
  • Another participant adds that none of these methods effectively address autocorrelation, suggesting the use of generalized least squares (GLS) or generalized method of moments (GMM) instead.

Areas of Agreement / Disagreement

Participants express differing views on the implications of a zero regularization parameter and the relevance of overfitting in the context of the dataset's characteristics. The discussion remains unresolved regarding the interpretation of the results and the best approach to take.

Contextual Notes

Participants do not fully address the assumptions underlying their claims, such as the nature of the data, the correlation between features, and the specific conditions under which overfitting may or may not occur.

SchroedingersLion
TL;DR
ScikitLearn's Elastic Net regression gives a regularization hyperparameter of ##\alpha = 0##, implying that ordinary least squares is the best method for this data set.
Hi guys,

I am using ScikitLearn's Elastic Net implementation to perform regression on a data set where the number of data points is larger than the number of features. The routine uses cross-validation to find the two hyperparameters: ElasticNetCV
The elastic net minimizes ##\frac {1}{2N} ||y-Xw||^2 + \alpha c ||w||_1 + \frac 1 2 \alpha (1-c) ||w||_2^2 ##, where ##\alpha## and ##c## are the hyperparameters.

However, I obtain a hyperparameter of ##\alpha=0##, which means the routine prefers no regularization at all. I was wondering what this means. The regularization is done in order to decrease overfitting on test data. What does a parameter of 0 imply? Does it mean I cannot have overfitting in this case?

SL
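For context, a minimal self-contained sketch of the setup being described (the synthetic data and all variable names here are my own illustration, not from the original run). With clean, well-conditioned data and many more points than features, the cross-validation tends to settle on the smallest ##\alpha## in the grid, which is essentially the behavior being asked about:

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(0)
N, n = 500, 5                         # many more data points than features
X = rng.normal(size=(N, n))
w_true = np.array([1.5, -2.0, 0.7, 0.0, 3.0])
y = X @ w_true + 0.1 * rng.normal(size=N)

# cross-validate alpha over a grid reaching down to very small values;
# l1_ratio is the mixing parameter c from the objective above
model = ElasticNetCV(l1_ratio=0.5, alphas=np.logspace(-6, 1, 50), cv=5)
model.fit(X, y)
print(model.alpha_)                   # tends toward the bottom of the grid
```

Note that `ElasticNetCV` only searches the grid it is given; getting the smallest available value (or exactly 0, if 0 is in the grid) is the signal that CV found no penalty worth paying.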
 
Generally, if the number of features <<< the number of data points and there is little correlation between the features, then OLS is the best method
 
So overfitting is not an issue when I have more data points than features?
That makes some sense: if I want to fit a model with n parameters, and I have N >> n data points, I cannot hit every data point as closely as I want, not even on the training data. So overfitting is suppressed.
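This intuition can be checked numerically (a toy sketch with made-up data): with as many parameters as points, least squares interpolates the noise and the training error collapses to zero, while with N >> n it cannot do better than the noise level.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_mse(N, n):
    # fit an n-parameter linear model to N noisy points by least squares
    X = rng.normal(size=(N, n))
    y = X @ rng.normal(size=n) + rng.normal(size=N)  # noise variance 1
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((X @ w - y) ** 2)

# square system: the fit hits every training point exactly
print(train_mse(10, 10))      # ~0
# overdetermined system: training error stays near the noise variance
print(train_mse(1000, 10))    # ~1
```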
 
The elastic net is a combination of lasso and ridge and will penalize collinear and low t-stat variables - so if you get the same results as an OLS your predictors are fine
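To see the collinearity point concretely (my own toy example, not from the thread): with two nearly identical predictors, OLS splits the weight between them erratically, while even a modest elastic-net penalty shares and shrinks it.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, LinearRegression

rng = np.random.default_rng(3)
N = 300
x1 = rng.normal(size=N)
x2 = x1 + 0.01 * rng.normal(size=N)   # nearly collinear copy of x1
X = np.column_stack([x1, x2])
y = 2.0 * x1 + 0.1 * rng.normal(size=N)

# OLS splits the weight arbitrarily between the collinear pair
ols_coef = LinearRegression().fit(X, y).coef_
print(ols_coef)
# the ridge part of the elastic net pulls the two coefficients together
en_coef = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y).coef_
print(en_coef)
```

If the penalized fit and the OLS fit agree anyway, as in the thread's case, the predictors were not collinear or weak to begin with.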
 
Thank you!
 
Should add that none of these methods deal well with autocorrelation - need GLS or GMM for that
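As a sketch of what GLS does when the error covariance is known (a synthetic AR(1) example of my own, using the closed-form estimator ##\hat w = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} y## rather than any particular library routine):

```python
import numpy as np

rng = np.random.default_rng(2)
N, rho = 200, 0.8

# AR(1) errors violate the independence assumption behind OLS
e = np.empty(N)
e[0] = rng.normal() / np.sqrt(1 - rho**2)   # stationary start
for t in range(1, N):
    e[t] = rho * e[t - 1] + rng.normal()

X = np.column_stack([np.ones(N), np.arange(N) / N])  # intercept + trend
y = X @ np.array([1.0, 2.0]) + e

# stationary AR(1) covariance: Sigma_ij = rho^|i-j| / (1 - rho^2)
idx = np.arange(N)
Sigma = rho ** np.abs(idx[:, None] - idx[None, :]) / (1 - rho**2)

# GLS estimator: w = (X' Sigma^{-1} X)^{-1} X' Sigma^{-1} y
SiX = np.linalg.solve(Sigma, X)                      # Sigma^{-1} X
w_gls = np.linalg.solve(X.T @ SiX, SiX.T @ y)
print(w_gls)
```

Unlike OLS, this weights observations by the inverse error covariance, so strongly correlated stretches of data are not over-counted.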
 
