# Multilinear Regression. Interpreting "Insignificant" p-values

Science Advisor
Gold Member
2019 Award

## Main Question or Discussion Point

Hi all,

I hope this is not too simple; please feel free to give me a reference if so. I want to know how to handle "insignificant" coefficients in my regression:

I just did a multilinear regression returning $y=a_1x_1+\dots+a_kx_k$.

In the resulting analysis, two of the coefficients, say $a_1, a_2$, came out insignificant (at the 95% level). Still, when I remove them and run a new regression $y=a_3x_3+a_4x_4+\dots+a_kx_k$, the adjusted $R^2$ drops very slightly. Is it standard practice to drop "insignificant" variables?

## Answers and Replies

Dale
Mentor
Plain $R^2$ will essentially always drop with a simpler (nested) model; it can never increase. The more complicated model lets you fit to the noise.

When comparing two models, you will usually use an ANOVA (F-test) or a criterion such as the Bayesian information criterion (BIC) or the Akaike information criterion (AIC), preferring the model with the smaller value.
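As a rough illustration of the comparison described above, here is a minimal sketch that computes the partial F statistic and Gaussian AIC/BIC for a full model and a reduced one. The data are synthetic, the helper names `fit_rss` and `compare_nested` are made up for this example, and the AIC/BIC values are only defined up to an additive constant that is the same for both models. Like the regression in the question, the models here have no intercept.

```python
import numpy as np

def fit_rss(X, y):
    """Least-squares fit; returns the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def compare_nested(X_full, X_reduced, y):
    """Partial F statistic plus Gaussian AIC/BIC (up to an additive constant)."""
    n = len(y)
    p_full, p_red = X_full.shape[1], X_reduced.shape[1]
    rss_full, rss_red = fit_rss(X_full, y), fit_rss(X_reduced, y)
    q = p_full - p_red  # number of coefficients dropped

    # F compares the RSS increase per dropped term to the full-model noise level.
    f_stat = ((rss_red - rss_full) / q) / (rss_full / (n - p_full))

    def ic(rss, p, penalty):
        return n * np.log(rss / n) + penalty * p

    return {
        "F": f_stat,
        "df": (q, n - p_full),
        "AIC": (ic(rss_full, p_full, 2.0), ic(rss_red, p_red, 2.0)),
        "BIC": (ic(rss_full, p_full, np.log(n)), ic(rss_red, p_red, np.log(n))),
    }

# Synthetic data (made up for illustration): the first two columns have no real effect.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))
y = 1.5 * X[:, 2] - 2.0 * X[:, 3] + rng.normal(size=n)

result = compare_nested(X, X[:, 2:], y)
print(result)
```

Each `AIC` and `BIC` entry is a `(full, reduced)` pair; the smaller value in a pair is the preferred model under that criterion. Turning `F` into a p-value needs the F distribution with the degrees of freedom in `df`, which is left out here to keep the sketch dependent on NumPy only.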

micromass
This is called "model selection". It is indeed good practice to drop insignificant variables; this is what stepwise, forward, and backward model selection do. You should only take care not to keep an interaction term while dropping one of its main-effect terms; this is called the heredity principle.

When dropping an insignificant variable, you should usually see that the other variables become more significant.

There are many ways to compare models, such as adjusted $R^2$, Mallows' $C_p$, AIC, AICc, and BIC; even plain $R^2$ can be used for this with some care.

(Also: do check for multicollinearity using the variance inflation factors, to see whether the insignificance is caused by that.)
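The variance inflation factor check can be sketched as follows. This is only an illustration on synthetic data: the function name `vif` is made up here, and it uses the textbook definition $\mathrm{VIF}_j = 1/(1-R_j^2)$, where $R_j^2$ comes from regressing predictor $j$ on all the other predictors plus an intercept.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        # Regress column j on an intercept plus every other column.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        tss = np.sum((target - target.mean()) ** 2)
        r2 = 1.0 - (resid @ resid) / tss
        out[j] = 1.0 / (1.0 - r2)
    return out

# Synthetic predictors (made up for illustration).
rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)  # nearly a copy of x1
x3 = rng.normal(size=n)             # unrelated to the others

vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

A common rule of thumb treats values above roughly 5 to 10 as a sign of problematic collinearity. In this toy example the two nearly identical predictors get large values, and either one could look insignificant in a regression simply because the other carries the same information.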

chiro
Science Advisor
Hey WWGD.

In addition to micromass' comment, I'd suggest understanding the nature of things like adjusted R^2 (or other adjusted values) in addition to a normal R^2 (or other similar test-statistic).

The adjustment for $R^2$ is done because plain $R^2$ can only go up as you add more variables, whether or not they carry real information. It's a good thing to understand this when you actually interpret the test statistics for model selection.
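For reference, the adjustment can be written out as below. This assumes a model with an intercept, $n$ observations, and $p$ predictors; the symbols $n$, $p$, $\mathrm{RSS}$ (residual sum of squares), and $\mathrm{TSS}$ (total sum of squares) are introduced here and do not appear in the thread.

$$
R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}, \qquad
\bar{R}^2 = 1 - \frac{\mathrm{RSS}/(n-p-1)}{\mathrm{TSS}/(n-1)} = 1 - (1 - R^2)\,\frac{n-1}{n-p-1}
$$

Adding a predictor can never raise $\mathrm{RSS}$, so $R^2$ never falls, while $\bar{R}^2$ rises only if the drop in $\mathrm{RSS}$ outweighs the lost degree of freedom. Equivalently, dropping a group of predictors lowers $\bar{R}^2$ exactly when their partial $F$ statistic exceeds 1, which is a much weaker bar than significance at the 5% level. That is consistent with seeing a slight drop in adjusted $R^2$ after removing terms that were individually insignificant.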

Also, in regression modeling there are step-up and step-down procedures that are used with a variety of test statistics to find models that optimize some information criterion while keeping the number of variables small.

PCA (Principal Component Analysis) is another thing that helps with model fitting for multiple random variables.

Dale
Mentor
If you do use any form of data-driven model selection then it is important to partition your data into two random groups. Do your model selection on one group and your model testing on the other group. You need to avoid "double dipping" or using the same data for selecting the model and testing the model.
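A minimal sketch of such a random partition, on synthetic data. The helper names `split_indices` and `mse` and the 50/50 split are assumptions made for this example, not something prescribed in the thread.

```python
import numpy as np

def split_indices(n, test_fraction=0.5, seed=0):
    """Randomly partition row indices 0..n-1 into (selection, test) groups."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_test = int(round(n * test_fraction))
    return perm[n_test:], perm[:n_test]

def mse(X, y, beta):
    """Mean squared prediction error of coefficients beta on (X, y)."""
    return float(np.mean((y - X @ beta) ** 2))

# Synthetic data (made up for illustration): only the first column matters.
rng = np.random.default_rng(3)
n = 120
X = rng.normal(size=(n, 6))
y = 2.0 * X[:, 0] + rng.normal(size=n)

sel, test = split_indices(n)

# Fit (and, in a real analysis, select variables) using the selection rows only.
beta, *_ = np.linalg.lstsq(X[sel], y[sel], rcond=None)

print("MSE on rows used for fitting:", mse(X[sel], y[sel], beta))
print("MSE on held-out rows:        ", mse(X[test], y[test], beta))
```

The point of the split is that every data-driven choice (which variables to keep, which criterion to use) is made while looking only at the `sel` rows, so the error measured on the `test` rows is not flattered by those choices.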

It is recommended to actually use three groups if the size of your data permits it: one group for model selection, one group for coefficient estimation, and one group for model testing. But if your data set is very small, there are still things you can do, such as leave-one-out model testing.
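Leave-one-out testing for a small data set can be sketched as below. The function names and the synthetic data are made up for this example. The second function relies on the standard identity for ordinary least squares that the leave-one-out residual equals $e_i/(1-h_{ii})$, where $h_{ii}$ is the $i$-th diagonal entry of the hat matrix, so no refitting is needed.

```python
import numpy as np

def loo_mse_explicit(X, y):
    """Refit the model n times, each time predicting the single held-out row."""
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        errs[i] = y[i] - X[i] @ beta
    return float(np.mean(errs ** 2))

def loo_mse_shortcut(X, y):
    """Same quantity from a single fit, using e_i / (1 - h_ii)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    Q, _ = np.linalg.qr(X)      # thin QR: h_ii is the squared norm of row i of Q
    h = np.sum(Q ** 2, axis=1)
    return float(np.mean((resid / (1.0 - h)) ** 2))

# Small synthetic data set (made up for illustration), with an intercept column.
rng = np.random.default_rng(2)
n = 25
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 2.0, 0.0, -1.0]) + rng.normal(size=n)

explicit = loo_mse_explicit(X, y)
shortcut = loo_mse_shortcut(X, y)
beta_all, *_ = np.linalg.lstsq(X, y, rcond=None)
in_sample = float(np.mean((y - X @ beta_all) ** 2))

print("in-sample MSE:     ", in_sample)
print("leave-one-out MSE: ", explicit, shortcut)
```

The in-sample error is always the smaller of the two here, because each leave-one-out residual is the ordinary residual divided by a number between 0 and 1. Note that if variables are selected using all the data first, leave-one-out on the final model still understates the error; the selection step would need to be repeated inside each fold.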

Science Advisor
Gold Member
2019 Award
Thank you, I am aware of the general area and methods; I was looking for theorems and specific criteria to keep or get rid of terms.

Science Advisor
Gold Member
2019 Award
Thanks all. I ended up using the best subsets method.
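For readers unfamiliar with it, best subsets selection can be sketched as an exhaustive search over every non-empty set of predictors, scored by some criterion. The thread does not say which criterion or software was used; BIC, the synthetic data, and the names `bic` and `best_subset` are assumptions made for this illustration.

```python
import itertools
import numpy as np

def bic(X, y):
    """Gaussian BIC (up to an additive constant) of a least-squares fit."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + p * np.log(n)

def best_subset(X, y):
    """Return the tuple of column indices with the smallest BIC."""
    k = X.shape[1]
    candidates = (
        cols
        for size in range(1, k + 1)
        for cols in itertools.combinations(range(k), size)
    )
    return min(candidates, key=lambda cols: bic(X[:, list(cols)], y))

# Synthetic data (made up for illustration): only columns 2 and 4 matter.
rng = np.random.default_rng(4)
n = 200
X = rng.normal(size=(n, 5))
y = 3.0 * X[:, 2] - 2.0 * X[:, 4] + rng.normal(size=n)

chosen = best_subset(X, y)
print("selected columns:", chosen)
```

The search fits $2^k - 1$ models for $k$ predictors, so it is only practical for a modest number of variables. The earlier caution about double dipping applies here too: p-values computed on the winning subset with the same data tend to look better than they should.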

scottdave
Science Advisor
Homework Helper
This is an old thread; I came across it from a search. I am currently taking an online Analytics Modeling class from edX and Georgia Tech. With regard to separating training and test data, the TA used an example this week that I thought summed it up.

Paraphrasing what the TA said:
The class has a midterm exam coming up soon. Before the exam, we hand out a sample test so students get an idea of what to expect. If they show up for the actual exam and it has exactly the same questions as the sample test, the students will perform much better than what they have actually learned would justify.
Apparently models work the same way. If you test them on the exact same data they were trained on, they will appear to be much better than they will actually be on future real-world data.