Correlations with combined sums and products

mcorazao · Oct 15, 2007

I was trying to build a probability-related software package and needed to have a theoretical framework to deal with some less common issues (i.e. stuff that you don't find in the average textbook). I was hoping that somebody could give me pointers as to where to find the proper formulae.

Basically I am trying to figure out how to do arbitrary (symbolic) calculations on random variables that involve both sum and product operations, fully taking into account arbitrary correlations between the variables. If I have the correlation coefficients of the variables, then obviously summing them is easy. Also, if I have the correlation coefficients of their logs, multiplying them is easy (since a product involves summing the logs, and the log/exponent formulae for normals are well defined). And, of course, determining the resulting correlations between the sums or products and the original variables is straightforward.
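For reference, the "easy" sum case mentioned above can be sketched as follows. This is only an illustration of the standard second-moment identities Var(a + b) = Var(a) + Var(b) + 2 Cov(a, b) and Cov(a + b, a) = Var(a) + Cov(a, b); the function name `sum_stats` and its arguments are made up for this example.

```python
import math

def sum_stats(sd_a, sd_b, rho_ab):
    """Return (sd of a+b, Corr(a+b, a), Corr(a+b, b)).

    sd_a, sd_b: standard deviations of a and b (hypothetical inputs).
    rho_ab:     correlation coefficient between a and b.
    Assumes a + b is not degenerate (its variance must be > 0).
    """
    cov_ab = rho_ab * sd_a * sd_b
    var_sum = sd_a**2 + sd_b**2 + 2.0 * cov_ab
    sd_sum = math.sqrt(var_sum)
    # Cov(a + b, a) = Var(a) + Cov(a, b), and likewise for b.
    corr_sum_a = (sd_a**2 + cov_ab) / (sd_sum * sd_a)
    corr_sum_b = (sd_b**2 + cov_ab) / (sd_sum * sd_b)
    return sd_sum, corr_sum_a, corr_sum_b

# Two uncorrelated unit-variance variables.
sd_sum, corr_sum_a, corr_sum_b = sum_stats(1.0, 1.0, 0.0)
```

These identities hold for any distributions with finite variance; no normality assumption is needed for the sum.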

Where things get murkier (for me at least) is when I go beyond these simple cases. If, say, I want to multiply two sums, how do I handle the correlations? E.g. let's say z = (a + b) * (c + d). If I know the correlation of a and b, I can easily determine the correlation of their sum to a or b. Similarly, c + d is easy. But now if I multiply those two sums, how do I determine the correlation of z to the original variables a, b, c, and d? And if all the variables originally had some non-zero correlation, how do I take that into account in the product? The original summations gave me the correlation coefficients with respect to the original variables, but not the correlations of the logarithms, which is what I would need for the multiplication.
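One possible way to make the z = (a + b) * (c + d) case concrete, without going through logs at all, is to work with moments directly. The sketch below assumes that a, b, c, d are jointly normal with a known mean vector and covariance matrix; that assumption is not stated in the thread. Under it, third central moments vanish, so Cov(xy, v) = E[x] Cov(y, v) + E[y] Cov(x, v), and Var(xy) has a closed form. The function name `product_of_sums_stats` and the weight vectors `u`, `w` are hypothetical.

```python
import numpy as np

def product_of_sums_stats(mean, cov, u, w):
    """Moments of z = x * y with x = u.v and y = w.v, v jointly normal.

    mean, cov: mean vector and covariance matrix of v = (a, b, c, d).
    u, w:      weight vectors selecting the two sums (assumed inputs).
    Returns (E[z], Var(z), Corr(z, v_i) for each component v_i).
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    u = np.asarray(u, dtype=float)
    w = np.asarray(w, dtype=float)

    mu_x, mu_y = u @ mean, w @ mean
    var_x, var_y = u @ cov @ u, w @ cov @ w
    cov_xy = u @ cov @ w
    cov_x_v = cov @ u  # Cov(x, v_i) for every component of v
    cov_y_v = cov @ w  # Cov(y, v_i) for every component of v

    mean_z = mu_x * mu_y + cov_xy
    # Variance of a product of two jointly normal variables.
    var_z = (mu_x**2 * var_y + mu_y**2 * var_x
             + 2.0 * mu_x * mu_y * cov_xy
             + var_x * var_y + cov_xy**2)
    # Third central moments are zero under joint normality.
    cov_z_v = mu_x * cov_y_v + mu_y * cov_x_v
    corr_z_v = cov_z_v / np.sqrt(var_z * np.diag(cov))
    return mean_z, var_z, corr_z_v

# Independent a, b, c, d, each with mean 1 and variance 1.
mean_z, var_z, corr_z_v = product_of_sums_stats(
    mean=[1.0, 1.0, 1.0, 1.0],
    cov=np.eye(4),
    u=[1.0, 1.0, 0.0, 0.0],   # x = a + b
    w=[0.0, 0.0, 1.0, 1.0],   # y = c + d
)
```

Note that z itself is no longer normal, so these correlations summarize second moments only; chaining further operations on z would need either a normal approximation or higher moments.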

Can anybody give me a clue how to start figuring this out?

Thanks.

EnumaElish · Oct 16, 2007

Can't you apply the log rule to x*y where x = a+b and y = c+d?

mcorazao · Oct 16, 2007

Again, how? If I do

x*y = e^(log(x) + log(y))

Then I have to know the correlation of log(x) and log(y). How do I figure that out?

EnumaElish · Oct 16, 2007

I was going with your statement "if I have the correlation coefficients of their logs multiplying them is easy." Now I realize that you don't have Corr(Log x, Log y) and the problem looks really difficult, because Corr is a linear operator and Log is a nonlinear function. Which tells me you need to somehow linearize the log function. E.g. if x is near 1, then Log(x) is approximately equal to x - 1. Maybe you can devise some kind of a scaling that would result in [itex]\overline\xi[/itex] = 1 where [itex]\xi[/itex] is the scaled version of x.
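The scaling idea can be tried numerically. The following is a minimal sketch with made-up parameters (means 5 and 20, 5% relative spread, correlation 0.6, fixed random seed): each variable is divided by its sample mean so that the scaled version has mean 1, and the correlation of the linearized logs (scaled value minus 1) is compared with the correlation of the actual logs.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, purely illustrative

# Hypothetical inputs: two correlated normals with small relative spread.
means = np.array([5.0, 20.0])
sds = np.array([0.25, 1.0])     # 5% of each mean
rho = 0.6
cov = np.array([[sds[0]**2, rho * sds[0] * sds[1]],
                [rho * sds[0] * sds[1], sds[1]**2]])
samples = rng.multivariate_normal(means, cov, size=10_000)

# Scale each column so its sample mean is 1, then log(xi) ~ xi - 1.
xi = samples / samples.mean(axis=0)
corr_linear = np.corrcoef(xi[:, 0] - 1.0, xi[:, 1] - 1.0)[0, 1]
corr_log = np.corrcoef(np.log(xi[:, 0]), np.log(xi[:, 1]))[0, 1]

# Difference between the linearized and the actual log correlation.
gap = abs(corr_log - corr_linear)
```

With a small relative spread the two correlations should nearly coincide; the approximation is expected to degrade as the spread grows relative to the mean, which is the imprecision objected to in the reply below.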

mcorazao · Oct 16, 2007

Well, there are all sorts of approximations I can devise but they would not be precise. Since log(x) and log(y) are both normal (i.e. log of normal is still normal) there should be a simple correlation coefficient that can be calculated and used. I presume, then, there should be a closed form solution for calculating it although I don't know what that would be.

One observation: If the correlation of x and y was zero, then the correlation of log(x) and log(y) is zero. Also if x and y have the same mean/stdev and their correlation is unity, then the correlation of log(x) and log(y) is also unity. That would seem to point toward the correlations always being equal although intuitively that doesn't sound right.
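For comparison, there is one setting where a closed form is known: when x and y are jointly lognormal, i.e. (log x, log y) is bivariate normal with standard deviations s1, s2 and correlation r. That is a different assumption from the one made in this post, so treat the sketch below as a side note. In that case Corr(x, y) = (exp(r*s1*s2) - 1) / sqrt((exp(s1^2) - 1) * (exp(s2^2) - 1)). The function names are made up for this example.

```python
import math

def corr_from_log_corr(r_log, s1, s2):
    """Corr(x, y) for jointly lognormal x, y.

    r_log:  correlation of log(x) and log(y).
    s1, s2: standard deviations of log(x) and log(y), both > 0.
    """
    num = math.expm1(r_log * s1 * s2)
    den = math.sqrt(math.expm1(s1**2) * math.expm1(s2**2))
    return num / den

def log_corr_from_corr(rho, s1, s2):
    """Inverse map: correlation of the logs from Corr(x, y).

    Raises ValueError (from math.log1p) if rho is too negative to be
    attainable for the given s1 and s2.
    """
    scale = math.sqrt(math.expm1(s1**2) * math.expm1(s2**2))
    return math.log1p(rho * scale) / (s1 * s2)

zero_case = corr_from_log_corr(0.0, 0.5, 0.8)    # uncorrelated logs
equal_case = corr_from_log_corr(1.0, 0.7, 0.7)   # same log-sd, r = 1
unequal_case = corr_from_log_corr(1.0, 0.5, 1.0) # different log-sd, r = 1
```

Under this assumption the two observations above do hold (zero maps to zero, and unity maps to unity when s1 = s2), but the two correlations are not equal in general: with unequal log standard deviations, perfectly correlated logs give Corr(x, y) below 1.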

EnumaElish · Oct 16, 2007

mcorazao said:

log of normal is still normal

Wrong. Log of Lognormal is normal.

If the correlation of x and y was zero, then the correlation of log(x) and log(y) is zero

I am not sure that's necessarily the case. Similarly for the unit correlation case.

mcorazao · Oct 16, 2007

EnumaElish said:

Wrong. Log of Lognormal is normal.

Actually you're right. I'm not thinking clearly.

EnumaElish said:

I am not sure that's necessarily the case. Similarly for the unit correlation case.

This one you're not thinking clearly about. If x and y are uncorrelated then it is not possible that their logs could have any correlation. The log transformation does not introduce any component that they could have in common.

If they are perfectly correlated and they have the same mean and standard deviation then they are, by definition, exactly the same number. Therefore their logs are exactly the same number. Therefore their logs are perfectly correlated.

These are the only obvious cases that I can think of at the moment. Any other cases seem to require a more elaborate proof.

EnumaElish · Oct 16, 2007

mcorazao said:

If x and y are uncorrelated then it is not possible that their logs could have any correlation. The log transformation does not introduce any component that they could have in common.

The point is, Log is a nonlinear transformation; but corr is a linear operation. In general properties of linear operators are not invariant under a nonlinear transformation.

A trivial example is Corr(a, b) = 0 where some elements of a and/or b are zero (or negative). Then Corr(Log(a), Log(b)) is undefined.

For similar examples, see: http://en.wikipedia.org/wiki/Correlation#Correlation_and_linearity
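A small illustration of the general point, using a square instead of a log so that every value stays defined (the data are invented for this example): a and b = a^2 have zero correlation, yet a nonlinear function of a is perfectly correlated with b.

```python
import numpy as np

# Symmetric values, so a and a**2 have zero sample covariance.
a = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
b = a**2

corr_ab = np.corrcoef(a, b)[0, 1]      # correlation of the raw vectors
corr_sq = np.corrcoef(a**2, b)[0, 1]   # correlation after transforming a
```

Here `corr_ab` is zero while `corr_sq` is one, so zero correlation between two vectors says nothing definite about the correlation of their nonlinear transforms.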

mcorazao · Oct 16, 2007

EnumaElish said:

The point is, Log is a nonlinear transformation; but corr is a linear operation. In general properties of linear operators are not invariant under a nonlinear transformation.

Of course, that's what I said. But that doesn't prove that there isn't a direct, even linear, relationship between the variable correlations and the log correlations. Seems unlikely but as yet I haven't found a proof one way or the other.

EnumaElish said:

A trivial example is Corr(a, b) = 0 where some elements of a and/or b are zero (or negative). Then Corr(Log(a), Log(b)) is undefined.

Well, by that argument log of normal is undefined period since any normal distribution has negative values.

Anyway, point is, I need a theoretical framework to do the calculation. I know there are other software packages that do this sort of thing without doing Monte Carlo analysis but I don't know what the math behind their calculations is.

Thanks, BTW, for the interest.

EnumaElish · Oct 16, 2007

mcorazao said:

by that argument log of normal is undefined period

Precisely.

mcorazao · Oct 16, 2007

EnumaElish said:

Precisely.

Well, not "precisely". The reality is that it is not undefined. It is just not real. E.g.

ln(-1) = i*3.1415927

Similarly the correlation is not undefined although it could perhaps have imaginary components.

In other words, the question is not moot; it is just "complex" (pun intended).
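The value quoted above is the principal branch of the complex logarithm, which can be checked with Python's standard `cmath` module (a quick sketch, nothing more):

```python
import cmath
import math

# Principal value of ln(-1); its imaginary part is pi.
principal = cmath.log(-1)

# Other branches differ by integer multiples of 2*pi*i.
next_branch = principal + 2j * math.pi
```

Because the complex log is multi-valued, any "complex correlation" built on it would depend on the branch chosen.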

EnumaElish · Oct 16, 2007

You are right; that was a non sequitur. Still, my point about non-linearity applies. Just because vectors a and b are uncorrelated does not mean their nonlinear functions cannot be correlated.

Here is a numerical example:
x = {
0.147281700000000,
0.230993647671506,
0.427692041391026,
0.079822616900000,
0.291048299000000,
0.185000000000000,
0.088631713672936,
0.266815063276460,
0.182298600000000,
0.850679700000000
};
y = {
0.872438309154607,
0.186970421455947,
0.738597327308731,
0.598236593500000,
0.462298740000000,
0.330000000000000,
0.115598225897281,
0.107657376896975,
0.207345000000000,
0.325996428800000
};
Corr(x, y) [itex]\approx[/itex] 0 (= 5.1841 * 10^-12)
But Corr(Log(x), Log(y)) = 0.0873581.
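These figures can be re-checked with a few lines of NumPy. The sketch below just copies the two vectors from this post and computes both Pearson correlations; the variable names are chosen for the example.

```python
import numpy as np

x = np.array([
    0.147281700000000, 0.230993647671506, 0.427692041391026,
    0.079822616900000, 0.291048299000000, 0.185000000000000,
    0.088631713672936, 0.266815063276460, 0.182298600000000,
    0.850679700000000,
])
y = np.array([
    0.872438309154607, 0.186970421455947, 0.738597327308731,
    0.598236593500000, 0.462298740000000, 0.330000000000000,
    0.115598225897281, 0.107657376896975, 0.207345000000000,
    0.325996428800000,
])

# Pearson correlation of the raw vectors and of their natural logs.
corr_xy = np.corrcoef(x, y)[0, 1]
corr_log = np.corrcoef(np.log(x), np.log(y))[0, 1]

print(f"Corr(x, y)           = {corr_xy:.3e}")
print(f"Corr(log x, log y)   = {corr_log:.7f}")
```

All entries are strictly positive, so the logs are well defined here.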

mcorazao · Oct 17, 2007

You are absolutely right. The non-linearity can create these oddball situations. When I work these problems out I normally think of the correlation as distinguishing between a single correlated component and a single uncorrelated component. For linear operations this approach is valid. But when you have non-linear operations then this simplification breaks down (usually accurate but not necessarily).

Curiouser and curiouser ...

Doesn't get me any closer to an answer, though. :-)

EnumaElish · Oct 17, 2007

Somehow, you need to linearize.
