Scaling and Standardization in Statistical Analysis

  • Thread starter fog37
  • #1
Summary:
scaling and standardization in statistical analysis
Hello everyone,

When working with the variables in a data set to find an appropriate statistical model (linear regression, nonlinear regression, etc.), the variables can have different ranges, standard deviations, means, etc.

Should all the input variables always be standardized and scaled before the analysis is applied, so that they have the same mean and range?

For example, when determining the price of a house (the target output variable) using a multivariate linear regression model, the input variables (square footage, year it was built, number of rooms, etc.) have very different ranges. It could happen that a certain variable gets a larger weight just because of the range of its values...

What to do?
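For reference, the two operations the question asks about can be sketched as follows. This is a minimal illustration, not a recommendation: the feature values are made up, and the helper names `standardize` and `min_max_scale` are hypothetical, not taken from any library.

```python
import numpy as np

# Hypothetical house features (made-up values):
# columns are square footage, year built, number of rooms.
X = np.array([
    [1400.0, 1995.0, 3.0],
    [2100.0, 2008.0, 4.0],
    [ 950.0, 1978.0, 2.0],
    [3200.0, 2015.0, 6.0],
])

def standardize(X):
    """Z-score each column: subtract its mean, divide by its standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def min_max_scale(X):
    """Rescale each column linearly onto the interval [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo)

Z = standardize(X)     # every column now has mean 0 and standard deviation 1
M = min_max_scale(X)   # every column now spans exactly [0, 1]
```

Standardizing gives the columns the same mean and spread; min-max scaling gives them the same range. The two are not interchangeable.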
 

Answers and Replies

  • #2
Dale
Mentor
Insights Author
2020 Award
I wouldn’t say “always”, but certainly “often”.
 
  • #3
FactChecker
Science Advisor
Gold Member
Most algorithms or equations will include the appropriate scaling and normalization. Usually, you do not need to do it yourself.
 
  • #4
BWV
And often the variable of interest is a difference or a percentage change, in which case scale does not matter. This is how finance and economics mostly work. SDs generally do not get whitened for OLS.
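A small sketch of the percentage-change point, assuming a made-up price series and a hypothetical helper `pct_change`: expressing the same prices in different units leaves the percentage changes untouched.

```python
import numpy as np

def pct_change(x):
    """Fractional change between consecutive observations."""
    x = np.asarray(x, dtype=float)
    return (x[1:] - x[:-1]) / x[:-1]

# Made-up prices, once in dollars and once in thousands of dollars.
prices_usd = np.array([250_000.0, 260_000.0, 255_000.0, 270_000.0])
prices_kusd = prices_usd / 1000.0

changes_usd = pct_change(prices_usd)
changes_kusd = pct_change(prices_kusd)

# The unit conversion cancels in the ratio, so both series agree.
same = np.allclose(changes_usd, changes_kusd)
```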
 
  • #5
FactChecker
fog37 said: "... using a multivariate linear regression model, the input variables (square footage, year it was built, number of rooms, etc.) have very different ranges... It could happen that a certain variable gets a larger weight just because of the range of its values..."
The statistical significance of the independent variables in a multivariate linear regression does not depend on the scale of their values; the fit compensates for it. Multiplying an input by a constant divides its coefficient, and that coefficient's standard error, by the same constant. So the magnitude and variance of the coefficients change with the scale of the variables, but the t-statistics and the statistical significance do not.
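This can be checked numerically. The sketch below uses simulated data (the coefficients and noise level are arbitrary assumptions) and a hypothetical helper `ols_t_stats` built on plain NumPy. It fits the same ordinary least squares model twice, once with square footage in square feet and once in thousands of square feet.

```python
import numpy as np

def ols_t_stats(X, y):
    """Fit OLS with an intercept; return (coefficients, t-statistics)."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])          # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    resid = y - A @ beta
    dof = n - A.shape[1]                          # residual degrees of freedom
    sigma2 = resid @ resid / dof                  # estimated noise variance
    cov = sigma2 * np.linalg.inv(A.T @ A)         # coefficient covariance matrix
    se = np.sqrt(np.diag(cov))                    # standard errors
    return beta, beta / se

# Simulated housing data; all numbers here are illustrative assumptions.
rng = np.random.default_rng(0)
n = 200
sqft = rng.uniform(800, 3500, n)
rooms = rng.integers(2, 8, n).astype(float)
price = 50_000 + 120 * sqft + 8_000 * rooms + rng.normal(0, 20_000, n)

X_raw = np.column_stack([sqft, rooms])
X_scaled = np.column_stack([sqft / 1000.0, rooms])  # sqft in thousands

beta_raw, t_raw = ols_t_stats(X_raw, price)
beta_scaled, t_scaled = ols_t_stats(X_scaled, price)

# The sqft coefficient grows by the factor of 1000, the t-statistics do not move.
coef_ratio = beta_scaled[1] / beta_raw[1]
t_unchanged = np.allclose(t_raw, t_scaled, rtol=1e-6)
```

Note that this only covers multiplying a variable by a constant. Shifting a variable (for example, centering it on its mean) changes the intercept and its t-statistic, though not those of the slopes.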
 
