Linear combination of random variables

In summary: for the test case ##U = 2X_1 + 7X_2## and ##V = 5X_1 + 22X_2## built from independent zero-mean normals, ##\text{Cov}(U,V) = 10E\big[X_1^2\big] + 154E\big[X_2^2\big] = 10\sigma_1^2 + 154\sigma_2^2 \neq 0##, so ##U \perp V## does not hold.
  • #1
Silviu

Homework Statement


Let ##X_1 \sim N(3,2^2)## and ##X_2 \sim N(-8,5^2)## be independent. Let ##U=aX_1+bX_2##. What is the distribution of ##U##?

Homework Equations

The Attempt at a Solution


As they are independent, we can write the density of ##U## as the convolution of the two. So I get ##f_U(u)=\int_{-\infty}^{\infty} f_{X_1}(x_1)f_{X_2}\left(\frac{u-ax_1}{b}\right)dx_1##. I am not sure how to solve the integral. The convolution of two Gaussians is also a Gaussian (with the new ##\mu=\mu_1+\mu_2## and ##\sigma^2=\sigma_1^2+\sigma_2^2##), but I am not sure how to adjust the new ##\mu## and ##\sigma## so that they take ##a## and ##b## into account. Thank you!
 
  • #2
Use the explicit form for a Gaussian pdf and apply exponent rules.
 
  • #3
Silviu said:

Homework Statement


Let ##X_1 \sim N(3,2^2)## and ##X_2 \sim N(-8,5^2)## be independent. Let ##U=aX_1+bX_2##. What is the distribution of ##U##?

Homework Equations

The Attempt at a Solution


As they are independent, we can write the density of ##U## as the convolution of the two. So I get ##f_U(u)=\int_{-\infty}^{\infty} f_{X_1}(x_1)f_{X_2}\left(\frac{u-ax_1}{b}\right)dx_1##. I am not sure how to solve the integral. The convolution of two Gaussians is also a Gaussian (with the new ##\mu=\mu_1+\mu_2## and ##\sigma^2=\sigma_1^2+\sigma_2^2##), but I am not sure how to adjust the new ##\mu## and ##\sigma## so that they take ##a## and ##b## into account. Thank you!
Define new random variables ##Y_1 = aX_1## and ##Y_2=bX_2##. What are the means and standard deviations of ##Y_1## and ##Y_2##?
 
  • #4
Silviu said:

Homework Statement


Let ##X_1 \sim N(3,2^2)## and ##X_2 \sim N(-8,5^2)## be independent. Let ##U=aX_1+bX_2##. What is the distribution of ##U##?

Homework Equations

The Attempt at a Solution


As they are independent, we can write the density of ##U## as the convolution of the two. So I get ##f_U(u)=\int_{-\infty}^{\infty} f_{X_1}(x_1)f_{X_2}\left(\frac{u-ax_1}{b}\right)dx_1##. I am not sure how to solve the integral. The convolution of two Gaussians is also a Gaussian (with the new ##\mu=\mu_1+\mu_2## and ##\sigma^2=\sigma_1^2+\sigma_2^2##), but I am not sure how to adjust the new ##\mu## and ##\sigma## so that they take ##a## and ##b## into account. Thank you!

If you know moment-generating functions (or characteristic functions) the problem becomes very easy.

Otherwise, you need to carry out two steps:
(1) Determine the density functions ##f_1(y)## of ##Y_1 = a X_1## and ##f_2(y)## of ##Y_2 = b X_2## (as suggested in #3).
(2) As suggested in #2, perform the integrations to get ##f_U(u)##, where ##U = Y_1 + Y_2##:
$$f_U(u) = \int_{-\infty}^{\infty} f_1(y) f_2 (u-y) \, dy.$$
Yes, you really do need to perform the integrations!
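
As a quick sanity check of these two steps (not a substitute for the analytic integration the post asks for), here is a minimal Python sketch, assuming SciPy is available and borrowing ##a = 2##, ##b = 7## from the example later in the thread: it builds the scaled densities, convolves them numerically on a grid, and compares with the closed-form normal density.

```python
import numpy as np
from scipy.stats import norm

# Coefficients borrowed from the thread's later example; any nonzero a, b work.
a, b = 2.0, 7.0
mu1, s1 = 3.0, 2.0    # X1 ~ N(3, 2^2)
mu2, s2 = -8.0, 5.0   # X2 ~ N(-8, 5^2)

# Step (1): densities of Y1 = a*X1 and Y2 = b*X2 (scaling rescales mean and sd).
f1 = norm(a * mu1, abs(a) * s1).pdf
f2 = norm(b * mu2, abs(b) * s2).pdf

# Step (2): numerical convolution f_U(u) = integral of f1(y) * f2(u - y) dy.
y = np.linspace(-300.0, 300.0, 6001)
dy = y[1] - y[0]
u0 = -40.0  # an arbitrary evaluation point
fU_numeric = np.sum(f1(y) * f2(u0 - y)) * dy

# Closed-form comparison: U ~ N(a*mu1 + b*mu2, a^2 s1^2 + b^2 s2^2).
fU_exact = norm(a * mu1 + b * mu2, np.hypot(a * s1, b * s2)).pdf(u0)
print(fU_numeric, fU_exact)  # the two values should agree closely
```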
 
  • #5
Ray Vickson said:
If you know moment-generating functions (or characteristic functions) the problem becomes very easy.

Otherwise, you need to carry out two steps:
(1) Determine the density functions ##f_1(y)## of ##Y_1 = a X_1## and ##f_2(y)## of ##Y_2 = b X_2## (as suggested in #3).
(2) As suggested in #2, perform the integrations to get ##f_U(u)##, where ##U = Y_1 + Y_2##:
$$f_U(u) = \int_{-\infty}^{\infty} f_1(y) f_2 (u-y) \, dy.$$
Yes, you really do need to perform the integrations!
So in general we have ##X \sim N(\mu,\sigma^2) \rightarrow cX \sim N(c\mu,c^2\sigma^2)##. So based on the convolution of two Gaussians and using the fact that the two variables are independent, we have ##U \sim N(a\mu_1+b\mu_2,\,a^2\sigma_1^2+b^2\sigma_2^2)##. Is this correct?
 
  • #6
Silviu said:
So in general we have ##X \sim N(\mu,\sigma^2) \rightarrow cX \sim N(c\mu,c^2\sigma^2)##. So based on the convolution of two Gaussians and using the fact that the two variables are independent, we have ##U \sim N(a\mu_1+b\mu_2,\,a^2\sigma_1^2+b^2\sigma_2^2)##. Is this correct?

Yes, it is a well-known classical result.

You will have to judge whether just quoting the literature is enough, or whether the person setting the problem wants you to actually prove the result, by performing integration or some other means, such as the moment-generating function approach that I already mentioned.
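
A quick Monte Carlo sketch of the classical result, using the thread's parameters and the (hypothetical, illustration-only) coefficients ##a = 2##, ##b = 7##:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 7.0  # hypothetical coefficients, chosen only for illustration
x1 = rng.normal(3.0, 2.0, size=1_000_000)   # X1 ~ N(3, 2^2)
x2 = rng.normal(-8.0, 5.0, size=1_000_000)  # X2 ~ N(-8, 5^2)
u = a * x1 + b * x2

# Sample moments should match a*mu1 + b*mu2 and a^2 s1^2 + b^2 s2^2.
print(u.mean(), a * 3.0 + b * (-8.0))          # both near -50
print(u.var(), a**2 * 2.0**2 + b**2 * 5.0**2)  # both near 1241
```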
 
  • #7
Ray Vickson said:
Yes, it is a well-known classical result.

You will have to judge whether just quoting the literature is enough, or whether the person setting the problem wants you to actually prove the result, by performing integration or some other means, such as the moment-generating function approach that I already mentioned.
Thank you! One more thing: can we say anything about the independence of variables of the form ##U(a,b) = aX_1 + bX_2##, if the initial ones are independent (for example, are ##2X_1+7X_2## and ##5X_1+22X_2## independent)?
 
  • #8
Silviu said:
Thank you! One more thing: can we say anything about the independence of variables of the form ##U(a,b) = aX_1 + bX_2##, if the initial ones are independent (for example, are ##2X_1+7X_2## and ##5X_1+22X_2## independent)?

Yes. Let ##X_1## and ##X_2## have a bivariate normal distribution with mean vector ##\mathbf{\mu}## and variance-covariance matrix ##\mathbf{\Sigma}##, so that
$$
\mathbf{\mu} = \pmatrix{\mu_1\\ \mu_2}, \; \; \mathbf{\Sigma} = \pmatrix{\sigma_{11}^2 & \sigma_{12} \\ \sigma_{21} & \sigma_{22}^2},$$
with ##\sigma_{12} = \sigma_{21} = \text{Cov}(X_1, X_2)##. Then ##Y = a X_1 + b X_2## has a normal distribution as well. You will get more benefit if you derive for yourself the mean and variance of ##Y##; the formulas are not complicated, but if you don't have time or are desperate you can always look them up. Again, the formulas are standard and classical and widely used in applications.
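
For readers who want to check their derivation afterwards, the standard formulas, in the notation above, come out as
$$E[Y] = a\mu_1 + b\mu_2, \qquad \text{Var}(Y) = a^2\sigma_{11}^2 + 2ab\,\sigma_{12} + b^2\sigma_{22}^2,$$
which reduces to ##a^2\sigma_{11}^2 + b^2\sigma_{22}^2## in the independent case ##\sigma_{12} = 0##.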
 
  • #9
Silviu said:
Thank you! One more thing: can we say anything about the independence of variables of the form ##U(a,b) = aX_1 + bX_2##, if the initial ones are independent (for example, are ##2X_1+7X_2## and ##5X_1+22X_2## independent)?

You should consider a simple test case for your example. Suppose ##X_1## and ##X_2## are zero mean. For your example, define ##U:=2X_1+7X_2## and ##V:= 5X_1+22X_2##. If ##U \perp V## then they must have zero covariance. But

##\text{Cov}(U,V) = E\big[UV\big] - E\big[U\big]E\big[V\big]##

##\text{Cov}(U,V) = E\big[UV\big] - 0 = E\big[10X_1^2 + 44 X_1 X_2 + 35 X_1 X_2 + 154 X_2^2\big]##

##\text{Cov}(U,V) = 10E\big[X_1^2\big] + 44E\big[X_1 X_2\big] + 35E\big[X_1 X_2\big] + 154E\big[X_2^2\big] = 10 E\big[X_1^2\big] + 154 E\big[X_2^2\big] \neq 0,##

where the cross terms vanish because ##E\big[X_1 X_2\big] = E\big[X_1\big]E\big[X_2\big] = 0## by independence and the zero means.
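
With the thread's parameters (##\sigma_1 = 2##, ##\sigma_2 = 5##, shifted to zero mean for this test case), that works out to ##10 \cdot 4 + 154 \cdot 25 = 3890##. A minimal simulation sketch of the check:

```python
import numpy as np

rng = np.random.default_rng(0)
# Zero-mean versions of the thread's variables, as in the test case above.
x1 = rng.normal(0.0, 2.0, size=1_000_000)  # sigma_1 = 2
x2 = rng.normal(0.0, 5.0, size=1_000_000)  # sigma_2 = 5
u = 2 * x1 + 7 * x2
v = 5 * x1 + 22 * x2

# Sample covariance should sit near 10*4 + 154*25 = 3890, i.e. far from 0.
print(np.cov(u, v)[0, 1])
```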
 
  • #10
StoneTemplePython said:
You should consider a simple test case for your example. Suppose ##X_1## and ##X_2## are zero mean. For your example, define ##U:=2X_1+7X_2## and ##V:= 5X_1+22X_2##. If ##U \perp V## then they must have zero covariance. But

##\text{Cov}(U,V) = E\big[UV\big] - E\big[U\big]E\big[V\big]##

##\text{Cov}(U,V) = E\big[UV\big] - 0 = E\big[10X_1^2 + 44 X_1 X_2 + 35 X_1 X_2 + 154 X_2^2\big]##

##\text{Cov}(U,V) = 10E\big[X_1^2\big] + 44E\big[X_1 X_2\big] + 35E\big[X_1 X_2\big] + 154E\big[X_2^2\big] = 10 E\big[X_1^2\big] + 154 E\big[X_2^2\big] \neq 0,##

where the cross terms vanish because ##E\big[X_1 X_2\big] = E\big[X_1\big]E\big[X_2\big] = 0## by independence and the zero means.
Thank you for this. However, this doesn't really test independence, I think. You can get zero covariance while ##U## and ##V## are still dependent. Isn't that right? (I am new to this, so I am not sure if the normal distribution has anything particular that ensures independence based on correlation.)
 
  • #11
Ray Vickson said:
Yes. Let ##X_1## and ##X_2## have a bivariate normal distribution with mean vector ##\mathbf{\mu}## and variance-covariance matrix ##\mathbf{\Sigma}##, so that
$$
\mathbf{\mu} = \pmatrix{\mu_1\\ \mu_2}, \; \; \mathbf{\Sigma} = \pmatrix{\sigma_{11}^2 & \sigma_{12} \\ \sigma_{21} & \sigma_{22}^2},$$
with ##\sigma_{12} = \sigma_{21} = \text{Cov}(X_1, X_2)##. Then ##Y = a X_1 + b X_2## has a normal distribution as well. You will get more benefit if you derive for yourself the mean and variance of ##Y##; the formulas are not complicated, but if you don't have time or are desperate you can always look them up. Again, the formulas are standard and classical and widely used in applications.
Thank you for your reply. I am not sure I understand what you mean. The covariance doesn't ensure independence, right? Don't you need another way to test for that (i.e. checking ##f_{U,V}(u,v)=f_U(u)f_V(v)##)? But I don't know what ##f_{U,V}(u,v)## is, so I am not sure how else to do it.
 
  • #12
Silviu said:
Thank you for your reply. I am not sure I understand what you mean. The covariance doesn't ensure independence, right? Don't you need another way to test for that (i.e. checking ##f_{U,V}(u,v)=f_U(u)f_V(v)##)? But I don't know what ##f_{U,V}(u,v)## is, so I am not sure how else to do it.

For the special case of a multivariate normal random vector, independence is mathematically equivalent to zero correlation. That is not true for many types of random variables, but it is true for the normal case. See, e.g., http://www.maths.manchester.ac.uk/~mkt/MT3732%20(MVA)/Notes/MVA_Section3.pdf

Note, however: we need more than just normally-distributed marginals; we need a full-fledged multivariate normal density, because there are examples of multivariate random vectors with normal marginals and zero correlation which are nonetheless dependent. Their joint distributions are not normal, even though their marginals are normal.
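
A classic instance of that caveat: take ##X \sim N(0,1)## and ##Y = SX##, where ##S = \pm 1## is an independent fair sign. Then ##Y## is standard normal and ##\text{Cov}(X,Y) = 0##, yet ##|X| = |Y|##, so ##X## and ##Y## are dependent and ##(X,Y)## is not bivariate normal. A quick sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
s = rng.choice([-1.0, 1.0], size=1_000_000)  # independent random sign
y = s * x  # y has a standard normal marginal, but (x, y) is not jointly normal

print(np.cov(x, y)[0, 1])              # near 0: uncorrelated
print(np.all(np.abs(x) == np.abs(y)))  # True: clearly dependent
```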
 
  • #13
Silviu said:
Thank you for this. However, this doesn't really test independence, I think. You can get zero covariance while ##U## and ##V## are still dependent. Isn't that right? (I am new to this, so I am not sure if the normal distribution has anything particular that ensures independence based on correlation.)
What you are saying is true - independent random variables have zero covariance, but the converse is not true - random variables with zero covariance are not necessarily independent. However, non-zero covariance does prove that the variables are not independent. Do you see why?
 
  • #14
tnich said:
What you are saying is true - independent random variables have zero covariance, but the converse is not true - random variables with zero covariance are not necessarily independent. However, non-zero covariance does prove that the variables are not independent. Do you see why?
I see: using that, you can prove for sure that two variables are not independent, but you can't prove for sure that they are independent.
 
  • #15
Silviu said:
I see: using that, you can prove for sure that two variables are not independent, but you can't prove for sure that they are independent.
Right.
"If ##cov(X_1X_2) \neq 0##, then ##X_1## and ##X_2## are not independent"
is the contrapositive of
"if ##X_1## and ##X_2## are independent, then ##cov(X_1X_2)=0##".
 
  • #16
tnich said:
Right.
"If ##cov(X_1X_2) \neq 0##, then ##X_1## and ##X_2## are not independent"
is the contrapositive of
"if ##X_1## and ##X_2## are independent, then ##cov(X_1X_2)=0##".
Thanks! And still, is there a way to test if U and V are independent in this case?
 
  • #17
Silviu said:
Thanks! And still, is there a way to test if U and V are independent in this case?
You could use @StoneTemplePython's method in post #10 to show that they are not independent.
You could also show that ##E(UV)\neq E(U)E(V)##. If you can show that ##E(UV)## and ##E(U)E(V)## are not identically equal under the conditions of your problem, then you have a proof that U and V are not independent.
 
  • #18
Silviu said:
Thanks! And still, is there a way to test if U and V are independent in this case?

Your random variables ##U## and ##V## have a bivariate normal distribution ##f_{UV}(u,v)##, but with ##\text{Cov}(U,V) \neq 0##. Therefore, they are automatically dependent, because for any bivariate normal vector, the components are independent if and only if they have zero correlation.

Again: this is true for multivariate NORMAL random variables, but not necessarily for others.
 
  • #19
Silviu said:
Thank you for this. However, this doesn't really test independence, I think. You can get zero covariance while ##U## and ##V## are still dependent. Isn't that right? (I am new to this, so I am not sure if the normal distribution has anything particular that ensures independence based on correlation.)

I'm a bit concerned that you're unaware of the fact that zero covariance is a prerequisite for independence for any random variables (with finite second moments). I was quite clear when I said

StoneTemplePython said:
If ##U \perp V## then they must have zero covariance.

Then I showed directly that the covariance is non-zero. The result is simple, general, and true for any linear combination example like the one you gave, for any type of random variables (except in the very special cases where (a) the random variables are deterministic or (b) they don't have finite second moments).

Ray mentioned some additional structural subtleties for Gaussians... they are important but subtle -- and I like easy stuff.

I also like working with zero-mean random variables wherever possible. You should be aware that for two random variables ##X## and ##Y##, ##\text{cov}(X, Y)## is the same whether or not you shift (the mean of) ##X## and/or ##Y## - hence you can always choose to work with zero-mean random variables to make the point. The result -- i.e. that shifting (and in particular centering) doesn't change the covariance -- is something you should know.

Here's the math
- - - -
##\text{cov}(X,Y) = E\big[XY\big] - E\big[X\big]E\big[Y\big]##

Now consider shifting ##X## by some fixed value ##b##:
$$\text{cov}(X + b,Y) = E\big[(X+b)Y\big] - E\big[X+b\big]E\big[Y\big]\\
\text{cov}(X + b,Y) = E\big[XY+ bY\big] - \big(E\big[X\big] + E\big[b\big]\big)E\big[Y\big]\\
\text{cov}(X + b,Y) = E\big[XY\big] + E\big[bY\big] - \big(E\big[X\big] + b\big)E\big[Y\big]\\
\text{cov}(X + b,Y) = E\big[XY\big] + bE\big[Y\big] - \big(E\big[X\big]E\big[Y\big] + bE\big[Y\big]\big)\\
\text{cov}(X + b,Y) = E\big[XY\big] - E\big[X\big]E\big[Y\big] \\
\text{cov}(X + b,Y) = \text{cov}(X, Y)$$
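
A quick numerical illustration of the identity (the distributions and shift below are arbitrary choices; the identity holds for any random variables with finite second moments):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(3.0, 2.0, size=500_000)
y = rng.normal(-8.0, 5.0, size=500_000)
b = 17.0  # an arbitrary fixed shift

# Shifting x by a constant leaves the (sample) covariance unchanged.
print(np.cov(x, y)[0, 1], np.cov(x + b, y)[0, 1])
```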
 

1. What is a linear combination of random variables?

A linear combination of random variables is a new random variable formed by multiplying each variable by a constant and adding the results, e.g. ##Z = aX + bY##. It is a common way to combine multiple random variables into a single new one.

2. Why are linear combinations of random variables important in statistics?

Linear combinations of random variables are important in statistics because they let us build new quantities (sums, averages, contrasts) out of existing variables, and their means and variances follow simple rules. This makes it possible to find relationships between different variables and make predictions about the data.

3. How do you calculate the mean of a linear combination of random variables?

By linearity of expectation, the mean of a linear combination is the same combination of the individual means: ##E[aX + bY] = a\,E[X] + b\,E[Y]##. Multiply each variable's mean by its constant and add the results.

4. What is the central limit theorem and how does it relate to linear combinations of random variables?

The central limit theorem states that the sum of a large number of independent and identically distributed random variables (with finite variance) will be approximately normally distributed, regardless of the underlying distribution of the individual variables. This theorem is important in understanding the behavior of linear combinations of random variables, as it allows us to make predictions about the distribution of the resulting random variable.

5. Can you provide an example of a linear combination of random variables?

Sure, let's say we have two random variables, X and Y, with means of 10 and 5 respectively. We can create a new random variable Z by taking the linear combination Z = 3X + 2Y. This means that Z is equal to 3 times the value of X plus 2 times the value of Y. The resulting random variable Z will have a mean of 40 (3*10 + 2*5) and will itself be normally distributed if X and Y are independent (or jointly) normal.
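
A short simulation of this example, assuming (hypothetically, since the example only fixes the means) that X and Y are independent with standard deviations 2 and 3:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, size=1_000_000)  # mean 10; sd 2 is an assumption
y = rng.normal(5.0, 3.0, size=1_000_000)   # mean 5; sd 3 is an assumption
z = 3 * x + 2 * y

print(z.mean())  # near 40 = 3*10 + 2*5
print(z.var())   # near 72 = 3**2 * 2**2 + 2**2 * 3**2, given independence
```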
