How to Calculate the Covariance of Two Random Sums?

jimmy1
Suppose X_1,...,X_n are independent and identically distributed random variables.
Now suppose I picked m_1 random variables from the set X_1,...,X_n and defined Y_1 as the sum of the m_1 variables, where m_1 is also a random variable.
Now suppose I did this again and I picked m_2 random variables from the set X_1,...,X_n and defined Y_2 as the sum of the m_2 variables, where m_2 again is a random variable.
I also know the expected number of random variables from the set X_1,...,X_n that are contained in both sums Y_1 and Y_2. Call this number a.


So I basically have two random sums, Y_1 and Y_2, and I want to find the covariance between them, Cov(Y_1, Y_2). I came up with the following solution, but it doesn't seem to work, so any pointers on what's wrong or how to go about it would be great.


I simply used the formula for the covariance of sums, i.e. for sequences of random variables A_1,...,A_m and B_1,...,B_n, we have Cov(\sum_{i=1}^{m}A_i,\sum_{j=1}^{n}B_j) = \sum_{i=1}^{m}\sum_{j=1}^{n}Cov(A_i,B_j).
Applying this formula to Cov(Y_1, Y_2): because X_1,...,X_n are independent, most of the terms in the double sum are zero. A term is non-zero only when the same variable appears in both sums (i = j), in which case Cov(X_i,X_i) is just Var(X_1).
Hence Cov(Y_1, Y_2) should be just a*Var(X_1)?

There is something wrong in the logic above, as the formula a*Var(X_1) doesn't seem to work, but I can't figure out where I am going wrong. Any help?
 
If all the X's are mutually independent then doesn't that allow you to make a statement about Cov(X1+X2, X3+X4+X5), for example?
 
All the X's are mutually independent, so if I apply the definition Cov(\sum_{i=1}^{m}A_i,\sum_{j=1}^{n}B_j) = \sum_{i=1}^{m}\sum_{j=1}^{n}Cov(A_i,B_j) to your example Cov(X1+X2, X3+X4+X5), then the answer is 0.

But in my situation there is a certain amount of overlap. For example, suppose I have the set X_1,...,X_{20}, with m_1=5 and m_2=7. Then I might have a situation where Y_1=X_1+X_3+X_6+X_8+X_9 and Y_2=X_1+X_2+X_3+X_6+X_{10}+X_{12}+X_{15}.

So in this case, if I apply the covariance-of-sums formula above, Cov(Y_1,Y_2) will not be 0, as there will be 3 non-zero terms in the sum (i.e. Cov(X_1,X_1), Cov(X_3,X_3), Cov(X_6,X_6)).
So, as all X's are identically distributed, we get Cov(Y_1,Y_2) = a*Var(X) = 3*Var(X).
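
As a sanity check on the fixed-subset case, here is a minimal sketch that computes the covariance exactly by enumeration. It assumes, purely for illustration, that each X_i is a fair 0/1 coin (so Var(X) = 1/4); the function name `exact_cov_fixed` is made up for this example.

```python
from fractions import Fraction
from itertools import product

def exact_cov_fixed(s1, s2, values=(0, 1)):
    """Exact Cov(Y1, Y2) for FIXED index sets s1, s2.

    Assumption for illustration: the X_i are i.i.d. and uniform on `values`.
    """
    idx = sorted(set(s1) | set(s2))           # only these X_i matter
    p = Fraction(1, len(values) ** len(idx))  # probability of each outcome
    e1 = e2 = e12 = Fraction(0)
    for outcome in product(values, repeat=len(idx)):
        x = dict(zip(idx, outcome))
        y1 = sum(x[i] for i in s1)
        y2 = sum(x[i] for i in s2)
        e1 += p * y1
        e2 += p * y2
        e12 += p * y1 * y2
    return e12 - e1 * e2

# Index sets from the example above: three shared variables (1, 3, 6).
s1 = {1, 3, 6, 8, 9}
s2 = {1, 2, 3, 6, 10, 12, 15}
var_x = Fraction(1, 4)  # variance of a fair 0/1 coin

cov = exact_cov_fixed(s1, s2)
print(cov, len(s1 & s2) * var_x)
```

Under these assumptions both printed values agree, which matches the overlap-count argument for fixed m_1 and m_2.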

Now this formula, a*Var(X), works when m_1 and m_2 are not random variables, but when they are random variables it doesn't work anymore. When they are random variables, I know what the expected values of m_1 and m_2 are going to be, and also know what the expected number of overlapping elements will be, call this a.

So, from this information, does anyone know how to get the expression for Cov(Y_1,Y_2) when m_1 and m_2 are random variables?
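
One possible way to see where a*Var(X) can fall short is the law of total covariance, conditioning on which variables were picked. The sketch below introduces notation that is not in the post: S_1 and S_2 are the (random) index sets behind Y_1 and Y_2, and \mu = E[X_1]. It also assumes the selection is made independently of the values of the X's.

```latex
\begin{aligned}
\mathrm{Cov}(Y_1,Y_2)
 &= E\big[\mathrm{Cov}(Y_1,Y_2 \mid S_1,S_2)\big]
  + \mathrm{Cov}\big(E[Y_1 \mid S_1,S_2],\,E[Y_2 \mid S_1,S_2]\big) \\
 &= E\big[|S_1 \cap S_2|\big]\,\mathrm{Var}(X_1)
  + \mathrm{Cov}(\mu m_1,\,\mu m_2) \\
 &= a\,\mathrm{Var}(X_1) + \mu^2\,\mathrm{Cov}(m_1,m_2)
\end{aligned}
```

If those assumptions hold, a*Var(X_1) alone is exact only when \mu = 0 or when m_1 and m_2 are uncorrelated, which includes the case of fixed m_1 and m_2.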
 
To simplify, suppose you have X1, X2.

Then m = 1 or 2, and n = 1 or 2 (writing m and n for your m_1 and m_2).

If m = 1 then Y1 is X1 or X2. If m = 2 then Y1 is X1+X2.

Similarly if n = 1 then Y2 is X1 or X2. If n = 2 then Y2 is X1+X2.

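If you can make a table of these possible outcomes and assign a probability to each, you can calculate a probability-weighted average of the covariance formulas for each possible case. Since the case itself is random, the covariance between the per-case means of Y1 and Y2 also has to be added to that average (law of total covariance).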
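
The table idea can be sketched in a few lines. Everything specific here is an assumption made up for illustration: X1 and X2 are fair 0/1 coins, and the selection rule is that with probability 1/2 both sums take both variables, otherwise each sum takes one variable chosen uniformly and independently. The code computes Cov(Y1, Y2) exactly and compares it with a*Var(X) and with a*Var(X) + mu^2*Cov(m, n), where mu = E[X].

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

# Hypothetical selection table: (indices in Y1, indices in Y2, probability).
table = [((1, 2), (1, 2), HALF)]
table += [((i,), (j,), HALF / 4) for i in (1, 2) for j in (1, 2)]

def expect(f):
    """Exact expectation of f(s1, s2, x) over the table and fair 0/1 X's."""
    total = Fraction(0)
    for s1, s2, p in table:
        for x1, x2 in product((0, 1), repeat=2):
            total += p * Fraction(1, 4) * f(s1, s2, {1: x1, 2: x2})
    return total

def y1(s1, s2, x):
    return sum(x[i] for i in s1)

def y2(s1, s2, x):
    return sum(x[i] for i in s2)

def m(s1, s2, x):
    return len(s1)

def n(s1, s2, x):
    return len(s2)

# Exact covariance of the two random sums.
cov_y = expect(lambda *args: y1(*args) * y2(*args)) - expect(y1) * expect(y2)

mu, var_x = HALF, Fraction(1, 4)                      # fair 0/1 coin
a = expect(lambda s1, s2, x: len(set(s1) & set(s2)))  # expected overlap
cov_mn = expect(lambda *args: m(*args) * n(*args)) - expect(m) * expect(n)

print("exact Cov(Y1, Y2):          ", cov_y)
print("a * Var(X):                 ", a * var_x)
print("a * Var(X) + mu^2 Cov(m, n):", a * var_x + mu**2 * cov_mn)
```

In this toy setup the sizes m and n are perfectly correlated, so the weighted average of per-case covariances (a*Var(X)) comes out smaller than the exact covariance, and the extra mu^2*Cov(m, n) term accounts for the gap.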
 