# Given two lists of different size, find E(XY)

1. Jun 1, 2014

### johnqwertyful

1. The problem statement, all variables and given/known data
I'm given a list x with 8 elements and a list y with 10 elements. Find the expectation and variance of x, y, x+y, and xy.

2. Relevant equations

3. The attempt at a solution
E(x), E(y), E(x+y), var(x),var(y) are all straightforward. But how do I find E(xy)? Do I just have to multiply out everything? Then I would need to find mean and variance of a sample set of 80. Is there an easier way?

I am given literally no information about the numbers. Not what they represent, if they're related, nothing.

2. Jun 1, 2014

### LCKurtz

Use var(x+y) = var(x) + var(y) + 2cov(x,y) and
cov(x,y) = E(xy) - E(x)E(y)

3. Jun 1, 2014

### johnqwertyful

How do I find cov(x,y)?

4. Jun 1, 2014

### LCKurtz

Substitute the second formula for cov(x,y) into the first equation. You can then calculate everything and solve for E(xy).

5. Jun 1, 2014

### johnqwertyful

Then how do I find var(x+y)? You've given me two equations and three unknowns (var(x+y), cov(x,y), E(xy)).

6. Jun 1, 2014

### LCKurtz

I guess I was responding too quickly here, remembering formulas from long ago. You could calculate $$\text{cov}(X,Y) = E((X-\mu_x)(Y-\mu_y))$$ but I guess that is as much work as calculating E(XY) directly. Maybe Ray will drop by and tell you if there is a shortcut.

7. Jun 2, 2014

### Ray Vickson

Are you given the numbers themselves, even if you are not given information about what they represent?

In order to compute something like E(XY) you need to know if there is a relationship between the two lists, and if so, what that relationship is. Are the two lists completely independent of each other? If so, they (seemingly) represent two independent random variables X and Y. You can easily look up in your textbook how to compute E(XY) for independent X and Y.

One way would be to sum up all 80 individual products x*y, but you would also need to know the probability of each combination. If all numbers in each list are regarded as "equally likely" and the two lists are independent, then all 80 pairs are equally likely. Of course, the distinct numerical values of x*y may not be equally likely: for example, an outcome of 4 that occurs as 1*4, 4*1, and 2*2 would have probability 3/80, because each of the three ways of getting it has probability 1/80.
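As a minimal sketch of the brute-force route, assuming independence and equal probabilities as described above: the lists `xs` and `ys` below are made-up placeholders (the thread never shows the actual numbers), and `mean` is a small helper introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical data: the thread never shows the actual lists.
xs = [1, 2, 2, 3, 5, 7, 8, 9]           # 8 entries
ys = [0, 1, 1, 2, 3, 4, 4, 6, 8, 10]    # 10 entries

def mean(values):
    """Average of a list, treating every entry as equally likely."""
    return Fraction(sum(values), len(values))

# All 8 * 10 = 80 pairs, each ASSUMED to have probability 1/80.
products = [x * y for x, y in product(xs, ys)]

e_xy_brute = mean(products)              # average of the 80 products
e_xy_shortcut = mean(xs) * mean(ys)      # E(X) * E(Y)

# Var(XY) = E((XY)^2) - (E(XY))^2 under the same assumption.
var_xy = mean([p * p for p in products]) - e_xy_brute ** 2

print(e_xy_brute, e_xy_shortcut, var_xy)
```

Under these assumptions the two values of E(XY) coincide, so only Var(XY) actually requires working with the 80 products (or with E(X²) and E(Y²)).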

You may suspect there are much easier ways of doing it, and you would be right. Look in your book or your notes. There you may find the standard result that the covariance of X and Y is given by
$$\text{Cov}(X,Y) = E(XY) - (EX)(EY)$$
which is much easier to work with than the original definition
$$\text{Cov}(X,Y) \equiv E[(X - EX)(Y - EY)]$$
So, what do you know about the covariance between independent random variables? What does that tell you?
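For reference, the two forms of the covariance are linked by a short expansion that uses only linearity of expectation (here $EX$ and $EY$ are constants):
$$\begin{aligned} E[(X - EX)(Y - EY)] &= E[XY - X\,(EY) - Y\,(EX) + (EX)(EY)] \\ &= E(XY) - (EX)(EY) - (EX)(EY) + (EX)(EY) \\ &= E(XY) - (EX)(EY). \end{aligned}$$
If X and Y are independent, the covariance is zero, so the identity reduces to $E(XY) = (EX)(EY)$. This holds only under the independence assumption, which the problem statement may or may not justify.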

Last edited: Jun 2, 2014

8. Jun 2, 2014

### johnqwertyful

We're not given that they're independent is the thing. So I have no idea what cov(X,Y) is. I have no idea what E(XY) is either. I am not given any relationship between the two numbers, not what they represent, nothing. Just two lists and what to find.

9. Jun 2, 2014

### Ray Vickson

In my previous post I asked "Are you given the numbers themselves, even if you are not given information about what they represent?", but you have still not given an answer. It is important to know.

Assuming you are given two lists of actual numbers but no other information, you cannot really say what $EX, EY, \text{Var}(X), \text{Var}(Y)$ and $\text{Cov}(X,Y)$ are. If you say there is an unspecified bivariate probability mass function $h(i,j), \: i=1,\ldots, 8, \: j = 1, \ldots, 10$ on the Cartesian product of the two lists, then depending on the nature of the function $h$ you can get many different numerical values for the quantities listed above.

You could try to tackle some best/worst-case questions, such as: determine $h(i,j)$ to:
(1) Minimize/maximize $EX$; ditto for $EY$
(2) Minimize/maximize $E(X+Y)$
(3) Minimize/maximize $\text{Var}(X)$; ditto for $Y$. I hope you can see that the minimization problems are easy. However, the max problems can present a challenge.
(4) Minimize/maximize $\text{Cov}(X,Y)$
(5) Minimize/maximize $\text{Var}(X+Y)$

10. Jun 2, 2014

### johnqwertyful

I am only given numbers, no other information.

11. Jun 2, 2014

### Ray Vickson

Well, then, all you can do is either make an assumption (such as independence of lists and equal probabilities on all list members) or else do what I suggested above. For example, to minimize EX, just put probability = 1 on the smallest x-value and zero probability elsewhere; to maximize EX, put all the probability on the largest x-value. To minimize Var(X) just put all the probability on any single x-value; that will make Var(X) = 0.
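The easy cases can be checked with a small sketch. The list `xs` is a made-up placeholder, and `expectation`, `variance`, and `point_mass` are hypothetical helper names used only for this illustration.

```python
def expectation(xs, ps):
    """E(X) for values xs with probabilities ps."""
    return sum(x * p for x, p in zip(xs, ps))

def variance(xs, ps):
    """Var(X) = sum of p_i * (x_i - E(X))^2."""
    m = expectation(xs, ps)
    return sum(p * (x - m) ** 2 for x, p in zip(xs, ps))

def point_mass(n, k):
    """Probability vector of length n with all mass on index k."""
    return [1.0 if i == k else 0.0 for i in range(n)]

xs = [1, 2, 2, 3, 5, 7, 8, 9]   # hypothetical 8-entry list
n = len(xs)

p_on_min = point_mass(n, xs.index(min(xs)))   # minimizes E(X)
p_on_max = point_mass(n, xs.index(max(xs)))   # maximizes E(X)

print(expectation(xs, p_on_min))   # smallest possible E(X): min(xs)
print(expectation(xs, p_on_max))   # largest possible E(X): max(xs)
print(variance(xs, p_on_min))      # any point mass gives Var(X) = 0
```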

The maximization of Var(X) is more challenging, and I will leave it to you. It is a quadratic optimization problem in the probabilities $p_1, p_2, \ldots, p_8$ on the eight listed x-values (assuming the X list has 8 entries). You want to choose $\{ p_i \}$ to maximize
$$V(p) = E(X^2) - (EX)^2 = \sum_{i=1}^8 x_i^2 p_i - \sum_{i=1}^8 \sum_{j=1}^8 x_i x_j p_i p_j,$$
subject to the constraints $p_i \geq 0, i=1,\ldots, 8$ and $\sum_{i=1}^8 p_i = 1$. This is a quadratic optimization problem in non-negative variables $p_i$, subject to a single linear equality constraint. (Here, I assume the $x_i$ are just some given, known numbers.)
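One way to explore this problem numerically is sketched below. It evaluates $V(p)$ exactly as written above and compares three choices of $p$: the uniform distribution, a crude random search over the simplex, and a two-point distribution with probability 1/2 on each of the smallest and largest x-values. The two-point candidate is not part of the original post; it is suggested by Popoviciu's inequality, $\text{Var}(X) \le (x_{\max} - x_{\min})^2/4$. The list `xs` is again a made-up placeholder.

```python
import random

xs = [1, 2, 2, 3, 5, 7, 8, 9]   # hypothetical 8-entry list
n = len(xs)

def V(p, xs):
    """V(p) = E(X^2) - (E X)^2 for probabilities p on the values xs."""
    ex2 = sum(x * x * pi for x, pi in zip(xs, p))
    ex = sum(x * pi for x, pi in zip(xs, p))
    return ex2 - ex * ex

uniform = [1.0 / n] * n

# Candidate maximizer: half the mass on the minimum, half on the maximum.
two_point = [0.0] * n
two_point[xs.index(min(xs))] = 0.5
two_point[xs.index(max(xs))] = 0.5

# Crude random search over feasible p (p_i >= 0, sum p_i = 1).
random.seed(0)
best_random = 0.0
for _ in range(20000):
    w = [random.random() for _ in range(n)]
    total = sum(w)
    p = [wi / total for wi in w]
    best_random = max(best_random, V(p, xs))

bound = (max(xs) - min(xs)) ** 2 / 4   # Popoviciu upper bound

print(V(uniform, xs), best_random, V(two_point, xs), bound)
```

The random search is only a sanity check, not a proof; a proper treatment would use a quadratic programming solver or the analytical bound.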

Similarly, you can look at the problems of max or min $\text{Cov}(X,Y)$ or $\text{Var}(X+Y)$ as a quadratic optimization problem involving the 80 variables $h(i,j), \: i = 1,\ldots,8, \: j = 1, \ldots,10.$ You need all $h(i,j) \geq 0$ and $\sum_i \sum_j h(i,j) = 1.$ Such problems can be handled numerically using, for example, the Excel Solver. Alternatively, you can think harder about the problem and try to get some properties of a solution. Be warned, though, that in such cases there can be a significant difference between the minimization and maximization problems; one of these may be relatively easy and the other very hard (for example, needing global optimization methods in case there are multiple local optima, etc.).
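Before reaching for a solver, it can help to evaluate the objective for a few hand-picked joint distributions. The sketch below stores $h(i,j)$ as a nested list `h[i][j]` and computes $\text{Cov}(X,Y)$ and $\text{Var}(X+Y)$ for two candidates: the independent uniform case, and a distribution that puts probability 1/2 on each of the pairs (smallest x, smallest y) and (largest x, largest y). The data and the helper names `expect`, `cov`, and `var_sum` are assumptions made for illustration only.

```python
xs = [1, 2, 2, 3, 5, 7, 8, 9]           # hypothetical 8-entry list
ys = [0, 1, 1, 2, 3, 4, 4, 6, 8, 10]    # hypothetical 10-entry list

def expect(f, xs, ys, h):
    """E[f(X, Y)] where h[i][j] = P(X = xs[i], Y = ys[j])."""
    return sum(h[i][j] * f(x, y)
               for i, x in enumerate(xs)
               for j, y in enumerate(ys))

def cov(xs, ys, h):
    """Cov(X, Y) = E(XY) - E(X) E(Y) under the joint pmf h."""
    ex = expect(lambda x, y: x, xs, ys, h)
    ey = expect(lambda x, y: y, xs, ys, h)
    return expect(lambda x, y: x * y, xs, ys, h) - ex * ey

def var_sum(xs, ys, h):
    """Var(X + Y) under the joint pmf h."""
    m = expect(lambda x, y: x + y, xs, ys, h)
    return expect(lambda x, y: (x + y - m) ** 2, xs, ys, h)

nx, ny = len(xs), len(ys)

# Independent, equally likely entries: every pair has probability 1/80.
h_indep = [[1.0 / (nx * ny)] * ny for _ in range(nx)]

# Half the mass on (min x, min y), half on (max x, max y).
h_corr = [[0.0] * ny for _ in range(nx)]
h_corr[xs.index(min(xs))][ys.index(min(ys))] = 0.5
h_corr[xs.index(max(xs))][ys.index(max(ys))] = 0.5

print(cov(xs, ys, h_indep), var_sum(xs, ys, h_indep))
print(cov(xs, ys, h_corr), var_sum(xs, ys, h_corr))
```

For this made-up data the second candidate gives a covariance equal to $(x_{\max}-x_{\min})(y_{\max}-y_{\min})/4$, which by the Cauchy-Schwarz inequality combined with Popoviciu's bound on each variance cannot be exceeded. The same evaluation functions could serve as the objective for a numerical solver.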