Hello,

I have a question regarding multiple regression.

I am reading a paper in which the author performed a multiple regression to predict the energy consumption of an electric car from 27 variables measured during journeys, such as speed and acceleration.

The author categorised the variables into 4 groups, as shown in this table. If two variables within a group were correlated, he dropped one of them. In the end he has 16 nominated variables for the regression.

https://dl.dropbox.com/u/54057365/All/regtable.JPG [Broken]

My questions are:

1. What is the advantage of using the categories?

2. What if two variables in separate groups are correlated?

3. Could he have put all the variables in one group and done a stepwise or best subsets regression?

The reason I am asking these questions is that, as far as I understand, multicollinearity matters much less when a regression is used for prediction rather than for interpreting individual coefficients. He is also removing correlated variables within the categories but not between the categories.

I would have thought that leaving them all in one category, dropping one of each pair of highly correlated variables, and then doing a best subsets regression would be a better approach.
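To make that alternative concrete, here is a rough sketch of what I have in mind, not the paper's method. Everything in it is my own assumption: the data are synthetic, the variable names (`speed`, `accel`, `speed_copy`, `noise`) are made up, the 0.9 correlation cut-off is arbitrary, and I score subsets by adjusted R² only because it is simple.

```python
import itertools
import numpy as np

def drop_correlated(X, names, threshold=0.9):
    """Keep a column only if its |correlation| with every kept column is <= threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return X[:, keep], [names[j] for j in keep]

def best_subset(X, y, names, max_size=3):
    """Exhaustive search over all subsets up to max_size, scored by adjusted R^2."""
    n = len(y)
    tss = np.sum((y - y.mean()) ** 2)
    best_score, best_names = -np.inf, []
    for size in range(1, max_size + 1):
        for cols in itertools.combinations(range(X.shape[1]), size):
            # Design matrix: intercept plus the chosen columns
            A = np.column_stack([np.ones(n), X[:, list(cols)]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = np.sum((y - A @ beta) ** 2)
            adj_r2 = 1 - (rss / (n - size - 1)) / (tss / (n - 1))
            if adj_r2 > best_score:
                best_score, best_names = adj_r2, [names[c] for c in cols]
    return best_names, best_score

# Synthetic stand-in for the journey data (hypothetical variables)
rng = np.random.default_rng(0)
n = 200
speed = rng.normal(size=n)
accel = rng.normal(size=n)
speed_copy = speed + 0.01 * rng.normal(size=n)  # near-duplicate of speed
noise = rng.normal(size=n)                      # unrelated to the response
X = np.column_stack([speed, accel, speed_copy, noise])
names = ["speed", "accel", "speed_copy", "noise"]
y = 2 * speed + 3 * accel + 0.1 * rng.normal(size=n)

# All variables in one pool: filter correlated pairs, then search subsets
X_red, names_red = drop_correlated(X, names)
chosen, score = best_subset(X_red, y, names_red)
print(names_red, chosen, round(score, 4))
```

Because the correlation filter runs over all variables at once, it would also catch a correlated pair that the author's scheme places in two different categories. With 16 to 27 variables an exhaustive search is still feasible but grows quickly, and for a prediction model a cross-validated error would probably be a better criterion than adjusted R².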

My main question is: what, if any, is the advantage of using the 4 categories?

Thank you

John

**Physics Forums | Science Articles, Homework Help, Discussion**


# Multiple regression, why use categories
