Multicollinearity and Interactions

  • Context: Undergrad
  • Thread starter: fog37
  • Tags: Interactions

Discussion Overview

The discussion revolves around the concepts of multicollinearity and interaction terms in multiple regression models. Participants explore the implications of these concepts on regression coefficients and interpretations, as well as the relationship between independence and correlation in this context.

Discussion Character

  • Conceptual clarification
  • Debate/contested
  • Technical explanation

Main Points Raised

  • One participant explains multicollinearity as the correlation between independent variables affecting regression coefficients and interpretations, while noting that it does not impact predictive results.
  • Another participant points out that independence implies no correlation, suggesting confusion around terminology.
  • It is mentioned that statistical analysis might treat multicollinearity and interactions similarly in practice.
  • A participant reiterates that in multiple regression, independent variables are termed as such regardless of correlation, emphasizing the distinction between independent and dependent variables.
  • There is a discussion on the definition of interaction terms, with one participant stating that they are constructed by multiplying two variables, and this decision does not necessarily depend on correlation.
  • A question is raised about how correlations between variables should influence the decision to introduce interaction terms in regression models.

Areas of Agreement / Disagreement

Participants express differing views on the relationship between independence, correlation, and the implications for regression analysis. There is no consensus on how these concepts interrelate or on the terminology used.

Contextual Notes

Participants highlight potential confusion regarding the definitions of independence and correlation, as well as the criteria for introducing interaction terms in regression models. The discussion reflects varying interpretations and assumptions about these concepts.

fog37
Hello,

I understand the concept of multicollinearity: in a multiple regression model with two or more independent variables, some of the independent variables may be correlated with one another. This does not affect the model's predictive results, but it makes the estimated regression coefficients less stable and complicates how we interpret the individual independent variables (IVs).
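
A minimal simulation sketch of that claim, assuming NumPy is available. The true coefficients (1, 2, 3), the sample size, the number of trials, and the two correlation values are illustrative choices, not values from this thread. It refits the same two-variable model on many simulated samples and compares how much the estimated coefficient of ##X_1## varies, and how well the fit predicts fresh data, when ##X_1## and ##X_2## are uncorrelated versus strongly correlated.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit of y on the columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # [intercept, slope_1, slope_2]

def draw_sample(rng, cov, n):
    """Draw (X, y) from the assumed model y = 1 + 2*X1 + 3*X2 + noise."""
    X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(size=n)
    return X, y

def simulate(rho, n=200, trials=500, seed=0):
    """Return (std of the estimated X1 slope across trials, mean test RMSE).

    rho is the correlation between X1 and X2 (illustrative parameter).
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    slopes, rmses = [], []
    for _ in range(trials):
        X, y = draw_sample(rng, cov, n)
        coef = fit_ols(X, y)
        slopes.append(coef[1])
        # Evaluate predictions on fresh data from the same distribution.
        X_new, y_new = draw_sample(rng, cov, n)
        pred = coef[0] + X_new @ coef[1:]
        rmses.append(np.sqrt(np.mean((y_new - pred) ** 2)))
    return float(np.std(slopes)), float(np.mean(rmses))

sd_low, rmse_low = simulate(rho=0.0)
sd_high, rmse_high = simulate(rho=0.95)
print(f"rho=0.00: sd(a1_hat)={sd_low:.3f}, test RMSE={rmse_low:.3f}")
print(f"rho=0.95: sd(a1_hat)={sd_high:.3f}, test RMSE={rmse_high:.3f}")
```

Under these assumptions the spread of the estimated slope should be noticeably larger in the correlated case, while the prediction error stays about the same. Note that the test data here come from the same distribution as the training data; predictions for combinations of ##X_1## and ##X_2## unlike those in the sample are a separate matter.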

Multiplicative interaction terms can also be included in a linear regression model. Multicollinearity and interactions are disjoint in the sense that a model with interaction terms does not need to have multicollinearity and vice versa (interesting things probably happen when the interaction terms are multicollinear).
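
One way the two ideas can meet, sketched here under stated assumptions: a product term ##X_1 X_2## is itself a column in the regression, and it can be correlated with ##X_1## even when ##X_1## and ##X_2## are independent of each other, simply because their means are non-zero. The means (10 and 5), the sample size, and the helper name `corr` below are illustrative and not from the thread. The sketch also shows the common remedy of centering the variables before multiplying.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Two independent predictors with non-zero means (illustrative values).
x1 = rng.normal(loc=10.0, scale=1.0, size=n)
x2 = rng.normal(loc=5.0, scale=1.0, size=n)

def corr(a, b):
    """Sample Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

# Correlation between a predictor and the raw product term.
raw = corr(x1, x1 * x2)

# Center both predictors, then form the product term.
x1c = x1 - x1.mean()
x2c = x2 - x2.mean()
centered = corr(x1c, x1c * x2c)

print(f"corr(X1, X2)             = {corr(x1, x2):.3f}")
print(f"corr(X1, X1*X2) raw      = {raw:.3f}")
print(f"corr(X1, X1*X2) centered = {centered:.3f}")
```

Centering changes what the coefficients of ##X_1## and ##X_2## mean (they become slopes at the average of the other variable), but it leaves the coefficient of the product term and the fitted values unchanged.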

That said, in the case of multicollinearity, one independent variable ##X_1## affects the dependent variable ##Y##, but another independent variable ##X_2## affects (is correlated with) the first independent variable ##X_1##. Isn't that similar to what interaction does? As I understand it, interaction means that one IV changes the dependent variable, but there is another IV that changes the first IV...

Thank you!
 
Terminology confusing: Independence implies no correlation.
 
Statistical analysis would usually treat the two situations the same way.
 
mathman said:
Terminology confusing: Independence implies no correlation.
In a multiple regression model, ##Y = a_0 +a_1 X_1 + a_2 X_2 + ... + a_n X_n + \epsilon##, the ##X_i##s are called the "independent variables" regardless of whether they are correlated. ##Y## is the dependent variable.
 
Independence (in probability theory) means no connection and implies no correlation. Your question seems to be about terminology - I am not familiar with the definitions of these terms as you are using them.
 
According to sources on the web (including https://aarongullickson.github.io/stat_book/interaction-terms.html):
An interaction term is a variable that is constructed from two other variables by multiplying those two variables together.
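
As a worked illustration of that definition (the two-variable form and the coefficient name ##a_3## are assumptions introduced here, not taken from the linked source), a regression model with one such term can be written as

$$Y = a_0 + a_1 X_1 + a_2 X_2 + a_3 X_1 X_2 + \epsilon$$

Grouping the terms that contain ##X_1## gives

$$Y = a_0 + (a_1 + a_3 X_2)\, X_1 + a_2 X_2 + \epsilon$$

so the slope of ##Y## with respect to ##X_1## is ##a_1 + a_3 X_2##: the effect of ##X_1## on ##Y## depends on the value of ##X_2##. When ##a_3 = 0## the slope is the constant ##a_1## and there is no interaction.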

With that definition, an interaction term results from a decision about the form of the function that is being fitted to data. This decision need not be based on a correlation between variables. By contrast, the correlation (or lack of it) between variables in a model is usually inferred from the data, but does not necessarily cause us to introduce interaction terms (and thereby make the fitted function nonlinear in the variables, although it remains linear in the coefficients).
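
A small sketch of how the two properties can occur separately, assuming NumPy and simulated data. All coefficients, the correlation of 0.9, and the helper name `fit_with_interaction` are illustrative assumptions. Case A has uncorrelated predictors whose true model contains a product term; Case B has strongly correlated predictors whose true model contains none. The same four-column model is fitted to both.

```python
import numpy as np

def fit_with_interaction(x1, x2, y):
    """OLS fit of y = a0 + a1*x1 + a2*x2 + a3*x1*x2; returns [a0, a1, a2, a3]."""
    design = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 5_000

# Case A: uncorrelated predictors, true model HAS an interaction (a3 = 1.5).
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y_a = 1.0 + 2.0 * x1 + 3.0 * x2 + 1.5 * x1 * x2 + rng.normal(size=n)
coef_a = fit_with_interaction(x1, x2, y_a)
corr_a = float(np.corrcoef(x1, x2)[0, 1])

# Case B: strongly correlated predictors, true model has NO interaction.
z1 = rng.normal(size=n)
z2 = 0.9 * z1 + np.sqrt(1 - 0.9**2) * rng.normal(size=n)
y_b = 1.0 + 2.0 * z1 + 3.0 * z2 + rng.normal(size=n)
coef_b = fit_with_interaction(z1, z2, y_b)
corr_b = float(np.corrcoef(z1, z2)[0, 1])

print(f"Case A: corr(X1, X2)={corr_a:.3f}, estimated a3={coef_a[3]:.3f}")
print(f"Case B: corr(X1, X2)={corr_b:.3f}, estimated a3={coef_b[3]:.3f}")
```

In this setup the estimated product coefficient should land near 1.5 in Case A despite near-zero correlation, and near zero in Case B despite high correlation. Correlation describes how the predictors relate to each other; interaction describes how their joint values relate to ##Y##.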

An interesting (and presumably well studied) question: How should correlations between variables influence our decision about whether to introduce interaction terms in the function we are fitting to the data?
 
