Using Correlation to Predict Values

Soveraign · Dec 21, 2011

I've searched the forums but am unable to find an answer to this:

Given two variables with a correlation, you can predict one from the other using the familiar
E(Y|X) = EY + r * s_y * (X - EX) / s_x

What I want to know is how to predict values from multiple variables, especially when these variables themselves are correlated.

E(Y | A, B, C) = ??

Stephen Tashi · Dec 21, 2011

Your example shows the computation of the expected value of a random variable, but you use the word "predict" to phrase your question. Are you trying to "predict" the value of a random variable Y given the values of other random variables? Or is your goal to compute the expected value of Y given the distribution functions of other random variables?

Soveraign · Dec 21, 2011

Stephen Tashi said:

Your example shows the computation of the expected value of a random variable, but you use the word "predict" to phrase your question. Are you trying to "predict" the value of a random variable Y given the values of other random variables? Or is your goal to compute the expected value of Y given the distribution functions of other random variables?

You are correct: I am looking to calculate the expected value of Y given A, B, C and the known correlations YA, YB, YC, AB, AC, BC (plus the necessary variances, etc.).

Stephen Tashi · Dec 22, 2011

Soveraign said:

E(Y|X) = EY + r * s_y * (X - EX) / s_x

I've only seen that formula applied to random variables that have a joint bivariate normal distribution. Are you assuming all the random variables in your question have a joint multinormal distribution?

Soveraign · Dec 22, 2011

Stephen Tashi said:

I've only seen that formula applied to random variables that have a joint bivariate normal distribution. Are you assuming all the random variables in your question have a joint multinormal distribution?

If I understand the definition correctly, then I think so. Y, A, B, C are normally distributed about a mean, but not necessarily independent (i.e. covariance != 0).

A thought I had was to perform principal component analysis on A, B, C so that I would have some new (uncorrelated) components to work with. Perhaps I could then do multiple regression on my new A', B', C', fitting an n-1 dimensional "plane" through my n-dimensional space and so working out E(Y|A', B', C')?

But I assume this is a solved problem and I'm just not looking in the right places.
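The PCA idea above can be checked numerically. The sketch below uses simulated data (every number in it is made up for illustration) and plain NumPy least squares. It suggests that rotating A, B, C into principal components does give uncorrelated A', B', C', but that when all components are kept, the fitted prediction is the same as regressing Y directly on the correlated A, B, C, because the rotation is an invertible linear map. In other words, ordinary multiple regression already handles correlated predictors, as long as they are not perfectly collinear.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated, purely illustrative data: three correlated predictors A, B, C.
n = 500
mixing = np.array([[1.0, 0.0, 0.0],
                   [0.6, 0.8, 0.0],
                   [0.3, 0.4, 0.5]])
X = rng.standard_normal((n, 3)) @ mixing.T
y = 2.0 + X @ np.array([1.0, -0.5, 0.25]) + 0.1 * rng.standard_normal(n)


def fit_predict(features, target, new_features):
    """Least-squares fit with an intercept, then predict at new_features."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    new_design = np.column_stack([np.ones(len(new_features)), new_features])
    return new_design @ coef


# PCA by hand: eigenvectors of the sample covariance matrix of (A, B, C).
mean = X.mean(axis=0)
_, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))


def to_components(data):
    """Project centered data onto the principal axes, giving A', B', C'."""
    return (data - mean) @ eigvecs


x_new = np.array([[0.5, -1.0, 2.0]])  # hypothetical observed (A, B, C)
direct = fit_predict(X, y, x_new)
via_pca = fit_predict(to_components(X), y, to_components(x_new))

print("regression on A, B, C:   ", direct)
print("regression on A', B', C':", via_pca)
```

The two printed predictions should agree up to floating-point error. PCA only changes the answer if some components are dropped before the regression.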

Stephen Tashi · Dec 23, 2011

Soveraign said:

But I assume this is a solved problem and I'm just not looking in the right places.

I looked too. I think this page (in the section called "The Multivariate Normal Distribution") gives the answer, but I haven't deciphered all the matrix notation.

As I recall, the fact that the marginal distributions are normal does not guarantee that the joint distribution is a multivariate normal. So you need to examine this assumption.
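For reference, the standard result for a joint multivariate normal is E(Y | X = x) = mu_Y + S_YX * inverse(S_XX) * (x - mu_X), where X = (A, B, C), S_XX is the covariance matrix of the predictors and S_YX is the row of covariances between Y and each predictor. With a single predictor this reduces to the formula from the first post. Below is a minimal sketch in Python with NumPy, assuming the joint distribution really is multivariate normal. The function names and all numbers are hypothetical, chosen only for illustration.

```python
import numpy as np


def covariance_from_correlations(corr, sds):
    """Build a covariance matrix from a correlation matrix and standard deviations."""
    sds = np.asarray(sds, dtype=float)
    return np.asarray(corr, dtype=float) * np.outer(sds, sds)


def conditional_mean(mu, cov, x, target=0):
    """E(Y | X = x) under a joint multivariate normal.

    mu     : mean vector of all variables
    cov    : covariance matrix of all variables
    x      : observed values of the other variables, in their original order
    target : index of Y within mu and cov
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    rest = [i for i in range(len(mu)) if i != target]
    s_yx = cov[target, rest]                 # covariances of Y with each predictor
    s_xx = cov[np.ix_(rest, rest)]           # covariance matrix of the predictors
    weights = np.linalg.solve(s_xx, s_yx)    # inverse(S_XX) applied to S_XY
    return mu[target] + weights @ (np.asarray(x, dtype=float) - mu[rest])


# Made-up example, variables ordered as Y, A, B, C.
corr = [[1.0, 0.5, 0.4, 0.3],
        [0.5, 1.0, 0.3, 0.2],
        [0.4, 0.3, 1.0, 0.4],
        [0.3, 0.2, 0.4, 1.0]]
sds = [2.0, 1.0, 3.0, 0.5]
mu = [10.0, 0.0, 5.0, 1.0]

cov = covariance_from_correlations(corr, sds)
print(conditional_mean(mu, cov, x=[1.0, 8.0, 1.5]))
```

Using `np.linalg.solve` rather than an explicit matrix inverse is a numerical-stability choice. The predictor covariance matrix must be invertible, so no predictor may be an exact linear combination of the others.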
