I Multilinear Regression: Interpreting "Insignificant" p-values

  1. Jul 4, 2016 #1

    WWGD

    Science Advisor
    Gold Member

    Hi all, hope this is not too simple; please feel free to point me to a reference if it is. I want to know how to handle having "insignificant" coefficients in my regression:

    I just did a multilinear regression, returning ## y = a_1 x_1 + \dots + a_k x_k ##.

    In the resulting analysis, two of the coefficients, say ## a_1, a_2 ##, came out insignificant (at the 95% level). Still, when I remove them and do a new regression ## y = a_3 x_3 + a_4 x_4 + \dots + a_k x_k ##, the adjusted R^2 drops only very slightly. Is it standard practice to drop "insignificant" variables?
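
    For concreteness, here is a minimal sketch of the setup in Python (assuming statsmodels; the data are simulated and the variable layout is made up):

        import numpy as np
        import statsmodels.api as sm

        # Simulated data: columns 0, 1, 4 are truly irrelevant; columns 2, 3 drive y
        rng = np.random.default_rng(0)
        n = 100
        X = rng.normal(size=(n, 5))
        y = X[:, 2] + 0.5 * X[:, 3] + rng.normal(size=n)

        fit = sm.OLS(y, sm.add_constant(X)).fit()
        print(fit.pvalues)        # per-coefficient p-values
        print(fit.rsquared_adj)   # adjusted R^2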
     
  2. Jul 4, 2016 #2

    Dale

    Staff: Mentor

    R^2 will always drop with a simpler model: the more complicated model lets you fit to the noise.

    When comparing two models you will usually use an ANOVA (an F-test for nested models), or pick the model that minimizes something like the Bayesian information criterion (BIC) or the Akaike information criterion (AIC).
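
    A minimal sketch of such a comparison (assuming statsmodels and pandas; the data frame here is simulated):

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf
        from statsmodels.stats.anova import anova_lm

        # Simulated data frame; in practice use your own
        rng = np.random.default_rng(1)
        df = pd.DataFrame(rng.normal(size=(100, 4)), columns=["x1", "x2", "x3", "x4"])
        df["y"] = df["x3"] + 0.5 * df["x4"] + rng.normal(size=100)

        full = smf.ols("y ~ x1 + x2 + x3 + x4", data=df).fit()
        reduced = smf.ols("y ~ x3 + x4", data=df).fit()

        print(anova_lm(reduced, full))  # F-test for dropping x1 and x2
        print(reduced.aic, full.aic)    # smaller AIC is better
        print(reduced.bic, full.bic)    # smaller BIC is better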
     
  3. Jul 4, 2016 #3
    This is called "model selection". It is indeed good practice to drop insignificant variables; this is done in stepwise, forward, or backward model selection. You should just take care not to keep an interaction term while dropping one of its main-effect terms; this is called the heredity principle.

    When dropping an insignificant variable, you should usually see that the other variables become more significant.

    There are many ways to compare models: adjusted ##R^2##, Mallows's ##C_p##, AIC, AICc, BIC; even plain ##R^2## can be used for this with some care.

    (Also: do check for multicollinearity using the Variance Inflation Factors to see whether you have insignificance because of that).
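
    A quick sketch of that check (assuming statsmodels; the data are simulated so that one predictor is nearly collinear with another):

        import numpy as np
        import statsmodels.api as sm
        from statsmodels.stats.outliers_influence import variance_inflation_factor

        # Simulated design matrix with column 1 nearly collinear with column 0
        rng = np.random.default_rng(2)
        X = rng.normal(size=(100, 4))
        X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=100)

        Xc = sm.add_constant(X)
        vifs = [variance_inflation_factor(Xc, i) for i in range(1, Xc.shape[1])]
        print(vifs)  # values above roughly 5-10 are usually taken to flag multicollinearity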
     
  4. Jul 4, 2016 #4

    chiro

    Science Advisor

    Hey WWGD.

    In addition to micromass's comment, I'd suggest understanding the nature of things like adjusted R^2 (or other adjusted values) in addition to the plain R^2 (or other similar test statistics).

    The adjustment to R^2 is made because plain R^2 can only increase as you add variables/information. It's good to understand this when you interpret the test statistics used for model selection.
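
    For reference, with ##n## observations and ##p## predictors the usual adjustment is ## \bar{R}^2 = 1 - (1 - R^2)\,\frac{n-1}{n-p-1} ##, so adding a variable only raises ##\bar{R}^2## if it improves the fit by more than chance alone would.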

    Also - in regression modeling there are procedural step-up and step-down selection methods, used with a variety of test statistics to find models that optimize some information criterion while keeping the number of variables small.
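
    A bare-bones sketch of a step-up search driven by AIC (assuming statsmodels; the data are simulated):

        import numpy as np
        import statsmodels.api as sm

        # Simulated data: only columns 0 and 4 matter
        rng = np.random.default_rng(3)
        X = rng.normal(size=(100, 5))
        y = X[:, 0] - X[:, 4] + rng.normal(size=100)

        selected, remaining = [], list(range(X.shape[1]))
        best_aic = sm.OLS(y, np.ones((len(y), 1))).fit().aic  # intercept-only model
        while remaining:
            # AIC of each one-variable extension of the current model
            scores = [(sm.OLS(y, sm.add_constant(X[:, selected + [j]])).fit().aic, j)
                      for j in remaining]
            aic, j = min(scores)
            if aic >= best_aic:
                break  # no candidate lowers the AIC; stop
            best_aic = aic
            selected.append(j)
            remaining.remove(j)
        print(selected, best_aic)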

    PCA (Principal Component Analysis) is another tool that can help when fitting models with many correlated predictors.
     
  5. Jul 4, 2016 #5

    Dale

    Staff: Mentor

    If you do use any form of data-driven model selection then it is important to partition your data into two random groups. Do your model selection on one group and your model testing on the other group. You need to avoid "double dipping" or using the same data for selecting the model and testing the model.
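
    A minimal sketch of such a split (plain NumPy; the data and split sizes here are made up):

        import numpy as np

        # Random partition into a selection set and a test set
        rng = np.random.default_rng(4)
        X = rng.normal(size=(200, 5))
        y = X[:, 0] + rng.normal(size=200)

        idx = rng.permutation(len(y))
        sel, test = idx[:100], idx[100:]
        X_sel, y_sel = X[sel], y[sel]      # do all model selection here
        X_test, y_test = X[test], y[test]  # evaluate the chosen model here, once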
     
  6. Jul 4, 2016 #6
    It is recommended to actually use three groups if the size of your data set permits it: one group for model selection, one group for coefficient estimation, and one group for model testing. If your data set is very small, there are still things you can do, like leave-one-out cross-validation.
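
    A sketch of leave-one-out testing for a given model (assuming statsmodels; the data are simulated):

        import numpy as np
        import statsmodels.api as sm

        # Small simulated data set
        rng = np.random.default_rng(5)
        X = rng.normal(size=(30, 2))
        y = X[:, 0] + rng.normal(size=30)

        Xc = sm.add_constant(X)
        errors = []
        for i in range(len(y)):
            keep = np.arange(len(y)) != i
            fit = sm.OLS(y[keep], Xc[keep]).fit()  # fit without observation i
            pred = fit.predict(Xc[i:i + 1])        # predict the held-out point
            errors.append((y[i] - pred[0]) ** 2)
        print(np.mean(errors))  # leave-one-out mean squared prediction error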
     
  7. Jul 8, 2016 #7

    WWGD

    Science Advisor
    Gold Member

    Thank you, I am aware of the general area and methods; I was looking for theorems and specific criteria to keep or get rid of terms.
     
  8. Jul 8, 2016 #8

    WWGD

    Science Advisor
    Gold Member

    Thanks all. I ended up using the best subsets method.
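
    For anyone landing here later, a brute-force sketch of best subsets by adjusted R^2 (assuming statsmodels; the data are simulated; exhaustive search is feasible only for a modest number of predictors):

        import numpy as np
        import statsmodels.api as sm
        from itertools import combinations

        # Simulated data: columns 1 and 3 matter
        rng = np.random.default_rng(6)
        X = rng.normal(size=(100, 4))
        y = X[:, 1] + 0.5 * X[:, 3] + rng.normal(size=100)

        best = None
        for r in range(1, X.shape[1] + 1):
            for subset in combinations(range(X.shape[1]), r):
                fit = sm.OLS(y, sm.add_constant(X[:, list(subset)])).fit()
                if best is None or fit.rsquared_adj > best[0]:
                    best = (fit.rsquared_adj, subset)
        print(best)  # (best adjusted R^2, chosen columns)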
     
  9. Feb 20, 2018 #9

    scottdave

    Science Advisor
    Homework Helper
    Gold Member
    2017 Award

    This is an old thread; I came across it in a search. I am currently taking an online class on analytics modeling from edX and Georgia Tech. Regarding separating training and test data, the TA used an example this week which I thought sums it up: a student quizzed on the exact questions they practiced will look much better prepared than they really are.

    Apparently models work the same way. If you test them on the same exact data that they were trained on, they will appear to be much better than how they will behave on future real-world data.
     