Using Correlation to Predict Values

  1. Dec 21, 2011 #1
    I've searched the forums but am unable to find an answer to this:

    Given two correlated variables, you can predict one from the other using the familiar formula
    E(Y|X) = EY + r * s_y * (X - EX) / s_x
    where r is the correlation between X and Y, and s_x and s_y are their standard deviations.
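
As a minimal sketch, the two-variable formula above can be written as a small Python function. The function name, argument names, and the numbers in the usage lines are made up for illustration; nothing here comes from a particular library.

```python
def predict_y_given_x(x, mean_x, mean_y, sd_x, sd_y, r):
    """Return E(Y | X = x) using EY + r * s_y * (x - EX) / s_x.

    r is the correlation between X and Y; sd_x and sd_y are their
    standard deviations. All names are illustrative.
    """
    if sd_x <= 0:
        raise ValueError("sd_x must be positive")
    return mean_y + r * sd_y * (x - mean_x) / sd_x


# Hypothetical numbers: X has mean 10 and sd 2, Y has mean 50 and sd 5,
# and the correlation between them is 0.6.
prediction = predict_y_given_x(12.0, 10.0, 50.0, 2.0, 5.0, 0.6)
print(prediction)
```

With r = 0 the function simply returns the mean of Y, which is a quick sanity check on the formula.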

    What I want to know is how to predict values from multiple variables, especially when these variables themselves are correlated.

    E(Y | A B C) = ??
     
  3. Dec 21, 2011 #2

    Stephen Tashi

    Science Advisor

    Your example shows the computation of the expected value of a random variable, but you use the word "predict" to phrase your question. Are you trying to "predict" the value of a random variable Y given the values of other random variables? Or is your goal to compute the expected value of Y given the distribution functions of the other random variables?
     
  4. Dec 21, 2011 #3
    You are correct: I am looking to calculate the expected value of Y given A, B, C and the known correlations YA, YB, YC, AB, AC, BC (along with the necessary means, variances, etc.).
     
  5. Dec 22, 2011 #4

    Stephen Tashi

    Science Advisor

    I've only seen that formula applied to random variables that have a joint bivariate normal distribution. Are you assuming all the random variables in your question have a joint multinormal distribution?
     
  6. Dec 22, 2011 #5
    If I understand the definition correctly, then I think so. Y, A, B, C are normally distributed about a mean, but not necessarily independent (i.e. covariance != 0).

    A thought I had was to perform principal component analysis on A, B, C so that I would have new, uncorrelated variables A', B', C' (the projections onto the eigenvectors) to work with. Perhaps I could then do multiple regression on A', B', C', fitting an (n-1)-dimensional "plane" through my n-dimensional space, and so work out E(Y | A', B', C')?

    But I assume this is a solved problem and I'm just not looking in the right places.
     
  7. Dec 23, 2011 #6

    Stephen Tashi

    Science Advisor

    I looked too. I think this page (in the section called "The Multivariate Normal Distribution") gives the answer, but I haven't deciphered all the matrix notation.
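
For reference, the standard result for a jointly multivariate normal vector is E(Y | X = x) = mu_Y + Sigma_YX * Sigma_XX^(-1) * (x - mu_X), where X is the vector of predictors (A, B, C), Sigma_XX is their covariance matrix, and Sigma_YX holds the covariances of Y with each predictor. The sketch below is one way to compute it from means, standard deviations, and correlations, assuming joint normality. The function and argument names and the numbers are hypothetical, and the small Gaussian elimination routine is only there to keep the example free of external libraries.

```python
def solve_linear_system(a, b):
    """Solve a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictor covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    w = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * w[k] for k in range(row + 1, n))
        w[row] = (m[row][n] - tail) / m[row][row]
    return w


def conditional_mean(mean_y, sd_y, means, sds, corr_with_y, corr_between, x):
    """E(Y | predictors = x), assuming joint multivariate normality.

    corr_with_y[i] is the correlation of Y with predictor i;
    corr_between[i][j] is the correlation between predictors i and j.
    """
    n = len(means)
    # Covariances from correlations: cov(i, j) = r_ij * s_i * s_j.
    cov_xx = [[corr_between[i][j] * sds[i] * sds[j] for j in range(n)]
              for i in range(n)]
    cov_xy = [corr_with_y[i] * sds[i] * sd_y for i in range(n)]
    # weights = Sigma_XX^(-1) Sigma_XY, found without forming the inverse.
    weights = solve_linear_system(cov_xx, cov_xy)
    return mean_y + sum(w * (xi - mu) for w, xi, mu in zip(weights, x, means))


# Hypothetical example: two predictors, each correlated 0.5 with Y
# and 0.5 with each other, all with mean 0 and standard deviation 1.
estimate = conditional_mean(0.0, 1.0, [0.0, 0.0], [1.0, 1.0],
                            [0.5, 0.5], [[1.0, 0.5], [0.5, 1.0]], [3.0, 3.0])
print(estimate)
```

With a single predictor this reduces to EY + r * s_y * (X - EX) / s_x. When the predictors are correlated with each other, each weight is smaller than its two-variable counterpart, because part of the information in one predictor is already carried by the others.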

    As I recall, the fact that the marginal distributions are normal does not guarantee that the joint distribution is a multivariate normal. So you need to examine this assumption.
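
A well-known counterexample illustrates the point: take X standard normal and set Y = S * X, where S is an independent random sign. Y is then also standard normal and uncorrelated with X, yet X + Y is exactly zero about half the time, which cannot happen for a non-degenerate bivariate normal pair. The simulation below is only an illustrative sketch; the seed and sample size are arbitrary choices.

```python
import random


def sample_pair(rng):
    """Draw (X, Y) with X ~ N(0, 1) and Y = S * X for a random sign S."""
    x = rng.gauss(0.0, 1.0)
    sign = rng.choice((-1.0, 1.0))
    return x, sign * x


rng = random.Random(0)  # arbitrary seed so the run is repeatable
pairs = [sample_pair(rng) for _ in range(10_000)]

# Both marginals are standard normal, but the sum has a point mass at zero.
frac_zero_sum = sum(1 for x, y in pairs if x + y == 0.0) / len(pairs)

# The sample covariance is close to zero even though |Y| always equals |X|.
mean_x = sum(x for x, _ in pairs) / len(pairs)
mean_y = sum(y for _, y in pairs) / len(pairs)
sample_cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / (len(pairs) - 1)

print(frac_zero_sum, sample_cov)
```

In a case like this the linear formula still gives the best linear predictor of Y, but it is no longer the true conditional expectation.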
     