
Using correlation coefficients as x in a regression?

  1. Apr 2, 2014 #1

    I was reading an article in the Wall Street Journal in which the author used a rolling correlation coefficient, computed on a pair of variables, as the predictor variable in a linear regression.

    Basically it was a univariate linear regression, y = mx + b, where x was the Pearson correlation coefficient calculated over a 30-day window on two random variables.

    This seems "wrong", but I am not sure that it is. I don't know of any regression assumption this violates, but at the same time it just doesn't seem like something you can do.

    What do you think?
  2. jcsd
  3. Apr 3, 2014 #2

    Stephen Tashi

    Science Advisor

    I think this isn't a mathematical question until we know what is assumed and what is asserted. For example, what are the assumptions of "regression"? What consequences follow from those assumptions? There are several types of regression.
  4. Apr 3, 2014 #3


    Science Advisor
    Gold Member
    2017 Award

    There is nothing mathematically wrong with this model, per se. It's easy to construct an example. Suppose we have two random variables, C and D, and X is the rolling correlation between C and D. We can define a random variable Y as Y = aX + b for real constants a, b. Then the linear regression would be a perfect model for Y.
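
    A quick sketch of that construction, in case it helps. Everything specific here is my own assumption for illustration: the simulated series c and d, the constants a = 2 and b = -1, and the helper names pearson, rolling_corr and ols. Only the 30-point window comes from the original question. It uses the Python standard library only.

```python
import random
from math import sqrt
from statistics import fmean


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = fmean(u), fmean(v)
    cov = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    su = sqrt(sum((ui - mu) ** 2 for ui in u))
    sv = sqrt(sum((vi - mv) ** 2 for vi in v))
    return cov / (su * sv)


def rolling_corr(c, d, window=30):
    """Correlation of c and d over each trailing window of the given length."""
    return [
        pearson(c[t - window + 1:t + 1], d[t - window + 1:t + 1])
        for t in range(window - 1, len(c))
    ]


def ols(x, y):
    """Least-squares slope and intercept for the model y = slope * x + intercept."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical data: two noisy, related series standing in for C and D.
rng = random.Random(0)
n = 500
c = [rng.gauss(0, 1) for _ in range(n)]
d = [0.5 * ci + rng.gauss(0, 1) for ci in c]

# X is the rolling correlation; Y is defined as an exact linear function of X.
x = rolling_corr(c, d, window=30)
a, b = 2.0, -1.0  # arbitrary constants chosen for the illustration
y = [a * xi + b for xi in x]

# The regression recovers a and b, with residuals at floating-point noise level.
slope, intercept = ols(x, y)
max_resid = max(abs(yi - (slope * xi + intercept)) for xi, yi in zip(x, y))
print(slope, intercept, max_resid)
```

    Of course this only shows that such a model can be well defined. Whether a rolling correlation is a sensible predictor for a real Y is a separate, empirical question.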

    I thought of a type of situation that would naturally lead to a model like yours. When the quantity of interest, Y, depends on how well two other variables, C and D, stay in balance, this type of model can easily arise. Example: suppose you are studying predators and prey (C and D) and trying to predict the rate, Y, at which predators kill prey. When the populations are out of balance (too many or too few predators), the kill rate, Y, will be smaller than when they are in balance. The rolling correlation, X, between predator and prey numbers indicates how well they stay in balance. So a natural model would be a linear regression of Y on X.

    Other examples are where the rate of something, Y, increases when two other things, C and D, get out of balance.
    Last edited: Apr 4, 2014