Regression analysis - case of multicollinearity

flying_young
What are some elementary remedies for multicollinearity (VIF >= 10) in linear regression? We were told to simply drop the offending independent variable, but someone else suggested we could center the predictor variables (i.e., x_i = X_i - Xbar). Can somebody explain why centering may also be appropriate in this case?

Thank you very much in advance!
 
Centering won't do much to alleviate the condition of collinearity, but does make some calculations simpler to represent. As one example, if the data matrix has been centered (and, as is often done, scaled so that each column has unit length), then

$$ R = X'X, \quad R^{-1} = \left(X'X\right)^{-1} $$

are the correlation matrix of the predictors and its inverse.
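As a quick check of that statement, here is a minimal NumPy sketch. The data are synthetic and the helper name `center_and_scale` is made up for illustration; the only claim is that centering each column and scaling it to unit length turns X'X into the correlation matrix.

```python
import numpy as np

def center_and_scale(X):
    """Center each column, then scale it to unit length (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # x_i = X_i - Xbar, column by column
    return Xc / np.linalg.norm(Xc, axis=0)   # each column now has length 1

# Synthetic predictors (an assumption for the example):
# column 2 is built to be nearly collinear with column 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=50)

Z = center_and_scale(X)
R = Z.T @ Z                                  # X'X for the centered, scaled data
R_direct = np.corrcoef(X, rowvar=False)      # correlation matrix computed directly
print(np.allclose(R, R_direct))
```

Note that the strong correlation between columns 0 and 2 is still present in R after centering, which is the point made above: centering tidies the algebra but does not remove the collinearity.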

One problem with the variance inflation factor comes from its calculation:

$$ VIF_i = \frac{1}{1 - R^2_i} $$

where R^2_i is the coefficient of multiple determination obtained when X_i is regressed on the other predictors. A large VIF means R^2_i is near one, so there is collinearity somewhere. It does not say whether you have a single case of collinearity (one variable depending on others) or whether several variables are closely related to one another. In short, you know you have a problem, but you don't know what type of problem you have.
It's also worth noting that the cutoff of 10 you mention for VIF is arbitrary: there is no easily determined threshold for what constitutes a large value.
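To make the VIF formula concrete, here is a rough NumPy sketch that computes each VIF_i by actually regressing X_i on the other predictors. The function name `vif` and the synthetic data are assumptions for illustration only. As a cross-check it uses the standard fact that the VIFs are the diagonal entries of the inverse correlation matrix.

```python
import numpy as np

def vif(X):
    """VIF_i = 1 / (1 - R^2_i), with R^2_i from regressing column i on the rest."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for i in range(p):
        y = X[:, i]
        others = np.delete(X, i, axis=1)
        A = np.column_stack([np.ones(n), others])     # intercept + other predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # ordinary least squares fit
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out[i] = 1.0 / (1.0 - r2)
    return out

# Synthetic predictors (an assumption for the example):
# column 2 is nearly a copy of column 0, column 1 is unrelated.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=50)

vifs = vif(X)
vifs_from_R = np.diag(np.linalg.inv(np.corrcoef(X, rowvar=False)))
print(vifs)
print(np.allclose(vifs, vifs_from_R))
```

In this toy data both column 0 and column 2 come out with a large VIF, even though only one near-dependency was built in. The VIFs flag the variables involved but do not by themselves describe the structure of the dependency.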

That said, if you are looking for a simple approach (I'm assuming this is an introductory-level course, or possibly a non-statistics course using multiple regression as an aside?), you can try removing the predictor with the high VIF and re-running the analysis. There are other diagnostics that allow a more detailed investigation of the problem, but that doesn't seem to be what you're after.
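The drop-and-re-run approach could be sketched as follows. This is only an illustration: the function names, the variable names, the synthetic data, and the default threshold of 10 (taken from the question, and arbitrary as noted above) are all assumptions. In practice you would also refit the regression and look at the results after each removal.

```python
import numpy as np

def vif_from_corr(X):
    """VIFs as the diagonal of the inverse correlation matrix of the predictors."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

def drop_highest_vif(X, names, threshold=10.0):
    """Repeatedly drop the predictor with the largest VIF while it is >= threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:            # a single remaining predictor has no VIF issue
        v = vif_from_corr(X)
        worst = int(np.argmax(v))
        if v[worst] < threshold:
            break                    # every remaining VIF is below the cutoff
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names

# Synthetic predictors (an assumption for the example):
# "x2" is nearly a copy of "x0", "x1" is unrelated.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=50)

X_kept, kept = drop_highest_vif(X, ["x0", "x1", "x2"])
print(kept)
print(vif_from_corr(X_kept))
```

Which of the two nearly collinear predictors gets dropped depends on the sample, so the choice is better guided by subject knowledge than by the VIF values alone.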

Good luck - hope something here helped.
 