- TL;DR Summary
- Linear regression, feature scaling, and regression coefficients

Hello,

In studying linear regression more deeply, I learned that scaling plays an important role in multiple ways:

a) the range of the independent variables ##X## affects the values of the regression coefficients. For example, the magnitude of the coefficient assigned to a predictor variable ##X## depends on the range (units) of that variable, so comparing the relative importance of predictors solely on the basis of coefficient magnitude is misleading. The more appropriate way to compare coefficients to determine relative importance is to standardize the independent variables (standardization is a form of scaling) before building the model.

Another benefit of scaling the predictor variables (standardization, normalization or any other scaling technique) is to extract more meaning from the interpretation of the coefficients: sometimes a regression coefficient may be extremely small and that may just be due to the particular scaling of the data. It is possible to get a larger coefficient and extract more understanding about the relationship between ##Y## and ##X## by properly scaling the predictor variable.
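To make the unit dependence concrete, here is a short sketch for the single-predictor case. The constant ##c## and the rescaled variable ##X'## are my own notation, introduced only for illustration:

$$
Y = \beta_0 + \beta_1 X + \varepsilon, \qquad X' = \frac{X}{c} \quad\Longrightarrow\quad Y = \beta_0 + (c\,\beta_1)\,X' + \varepsilon
$$

So the coefficient on the rescaled variable is ##\beta_1' = c\,\beta_1##, while the intercept and the fitted values of ##Y## stay the same. For instance, if ##X## is measured in meters and ##X'## in kilometers, then ##c = 1000## and the coefficient becomes 1000 times larger without the model itself changing.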

I also read that certain statistical and ML algorithms really require scaling while others (rule-based ones) don't.

So, in essence, scaling is useful but not always required. However, in some cases, it is required as a pre-processing step...

Finally, my question: without any scaling of the independent variables, does linear regression (multiple or single) perform properly, i.e. are the regression coefficients computed correctly? Aside from interpretability issues, does linear regression (OLS) generate larger coefficients for variables with a larger range?
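One way to probe this question numerically is the minimal sketch below, which is not a definitive answer. It uses synthetic data and plain NumPy least squares. The data-generating coefficients, the factor `c`, and the helper `ols` are all assumptions made up for the illustration.

```python
import numpy as np

# Synthetic data: two predictors on a similar scale (assumed values, for illustration only)
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 3.0 * x1 - 1.5 * x2 + rng.normal(scale=0.1, size=n)


def ols(X, y):
    """Return OLS coefficients (intercept first) via least squares."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


c = 1000.0
X_raw = np.column_stack([x1, x2])
X_big = np.column_stack([x1 * c, x2])  # same x1, but its range is now c times larger

beta_raw = ols(X_raw, y)
beta_big = ols(X_big, y)

# Fitted values from each parameterization
pred_raw = np.column_stack([np.ones(n), X_raw]) @ beta_raw
pred_big = np.column_stack([np.ones(n), X_big]) @ beta_big

print("raw coefficients:     ", beta_raw)
print("rescaled coefficients:", beta_big)
print("same predictions:     ", np.allclose(pred_raw, pred_big))
```

In this example the coefficient on the stretched predictor shrinks by the factor `c`, the other coefficients are unchanged, and the fitted values agree. That suggests unscaled OLS still computes the fit correctly, and only the units of the coefficient change.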

Thank you for any input on this!
