Linear combination of random variables

In summary: for the test case ##U = 2X_1 + 7X_2## and ##V = 5X_1 + 22X_2## built from independent zero-mean normals, ##\text{Cov}(U,V) = 10E\big[X_1^2\big] + 154E\big[X_2^2\big] = 10\sigma_1^2 + 154\sigma_2^2 \neq 0##, so ##U \perp V## does not hold.
  • #1
Silviu

Homework Statement


Let ##X_1 \sim N(3,2^2)## and ##X_2 \sim N(-8,5^2)## be independent. Let ##U=aX_1+bX_2##. What is the distribution of ##U##?

Homework Equations

The Attempt at a Solution


As they are independent, we can write the density of ##U## as the convolution of the two. So I get ##f_U(u)=\int_{-\infty}^{\infty} f_{X_1}(x_1)f_{X_2}\left(\frac{u-ax_1}{b}\right)dx_1##. I am not sure how to solve the integral. The convolution of two Gaussians is also a Gaussian (with the new ##\mu=\mu_1+\mu_2## and ##\sigma^2=\sigma_1^2+\sigma_2^2##), but I am not sure how to adjust the new ##\mu## and ##\sigma## so that they take ##a## and ##b## into account. Thank you!
 
  • #2
Use the explicit form for a Gaussian pdf and apply exponent rules.
 
  • #3
Silviu said:

Homework Statement


Let ##X_1 \sim N(3,2^2)## and ##X_2 \sim N(-8,5^2)## be independent. Let ##U=aX_1+bX_2##. What is the distribution of ##U##?

Homework Equations

The Attempt at a Solution


As they are independent, we can write the density of ##U## as the convolution of the two. So I get ##f_U(u)=\int_{-\infty}^{\infty} f_{X_1}(x_1)f_{X_2}\left(\frac{u-ax_1}{b}\right)dx_1##. I am not sure how to solve the integral. The convolution of two Gaussians is also a Gaussian (with the new ##\mu=\mu_1+\mu_2## and ##\sigma^2=\sigma_1^2+\sigma_2^2##), but I am not sure how to adjust the new ##\mu## and ##\sigma## so that they take ##a## and ##b## into account. Thank you!
Define new random variables ##Y_1 = aX_1## and ##Y_2=bX_2##. What are the means and standard deviations of ##Y_1## and ##Y_2##?
 
  • #4
Silviu said:

Homework Statement


Let ##X_1 \sim N(3,2^2)## and ##X_2 \sim N(-8,5^2)## be independent. Let ##U=aX_1+bX_2##. What is the distribution of ##U##?

Homework Equations

The Attempt at a Solution


As they are independent, we can write the density of ##U## as the convolution of the two. So I get ##f_U(u)=\int_{-\infty}^{\infty} f_{X_1}(x_1)f_{X_2}\left(\frac{u-ax_1}{b}\right)dx_1##. I am not sure how to solve the integral. The convolution of two Gaussians is also a Gaussian (with the new ##\mu=\mu_1+\mu_2## and ##\sigma^2=\sigma_1^2+\sigma_2^2##), but I am not sure how to adjust the new ##\mu## and ##\sigma## so that they take ##a## and ##b## into account. Thank you!

If you know moment-generating functions (or characteristic functions) the problem becomes very easy.

Otherwise, you need to carry out two steps:
(1) Determine the density functions ##f_1(y)## of ##Y_1 = a X_1## and ##f_2(y)## of ##Y_2 = b X_2## (as suggested in #3).
(2) As suggested in #2, perform the integrations to get ##f_U(u)##, where ##U = Y_1 + Y_2##:
$$f_U(u) = \int_{-\infty}^{\infty} f_1(y) f_2 (u-y) \, dy.$$
Yes, you really do need to perform the integrations!
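
As a quick sanity check of these two steps (not a substitute for the analytic integration the post asks for), here is a minimal Python sketch, assuming SciPy is available and borrowing ##a = 2##, ##b = 7## from the example later in the thread: it builds the scaled densities, convolves them numerically on a grid, and compares with the closed-form normal density.

```python
import numpy as np
from scipy.stats import norm

# Coefficients borrowed from the thread's later example; any nonzero a, b work.
a, b = 2.0, 7.0
mu1, s1 = 3.0, 2.0    # X1 ~ N(3, 2^2)
mu2, s2 = -8.0, 5.0   # X2 ~ N(-8, 5^2)

# Step (1): densities of Y1 = a*X1 and Y2 = b*X2 (scaling rescales mean and sd).
f1 = norm(a * mu1, abs(a) * s1).pdf
f2 = norm(b * mu2, abs(b) * s2).pdf

# Step (2): numerical convolution f_U(u) = integral of f1(y) * f2(u - y) dy.
y = np.linspace(-300.0, 300.0, 6001)
dy = y[1] - y[0]
u0 = -40.0  # an arbitrary evaluation point
fU_numeric = np.sum(f1(y) * f2(u0 - y)) * dy

# Closed-form comparison: U ~ N(a*mu1 + b*mu2, a^2 s1^2 + b^2 s2^2).
fU_exact = norm(a * mu1 + b * mu2, np.hypot(a * s1, b * s2)).pdf(u0)
print(fU_numeric, fU_exact)  # the two values should agree closely
```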
 
  • #5
Ray Vickson said:
If you know moment-generating functions (or characteristic functions) the problem becomes very easy.

Otherwise, you need to carry out two steps:
(1) Determine the density functions ##f_1(y)## of ##Y_1 = a X_1## and ##f_2(y)## of ##Y_2 = b X_2## (as suggested in #3).
(2) As suggested in #2, perform the integrations to get ##f_U(u)##, where ##U = Y_1 + Y_2##:
$$f_U(u) = \int_{-\infty}^{\infty} f_1(y) f_2 (u-y) \, dy.$$
Yes, you really do need to perform the integrations!
So in general we have ##X \sim N(\mu,\sigma^2) \rightarrow cX \sim N(c\mu,c^2\sigma^2)##. So based on the convolution of two Gaussians and using the fact that the two variables are independent, we have ##U \sim N(a\mu_1+b\mu_2,\,a^2\sigma_1^2+b^2\sigma_2^2)##. Is this correct?
 
  • #6
Silviu said:
So in general we have ##X \sim N(\mu,\sigma^2) \rightarrow cX \sim N(c\mu,c^2\sigma^2)##. So based on the convolution of two Gaussians and using the fact that the two variables are independent, we have ##U \sim N(a\mu_1+b\mu_2,\,a^2\sigma_1^2+b^2\sigma_2^2)##. Is this correct?

Yes, it is a well-known classical result.

You will have to judge whether just quoting the literature is enough, or whether the person setting the problem wants you to actually prove the result, by performing integration or some other means, such as the moment-generating function approach that I already mentioned.
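
A quick Monte Carlo sketch of the classical result, using the thread's parameters and the (hypothetical, illustration-only) coefficients ##a = 2##, ##b = 7##:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 7.0  # hypothetical coefficients, chosen only for illustration
x1 = rng.normal(3.0, 2.0, size=1_000_000)   # X1 ~ N(3, 2^2)
x2 = rng.normal(-8.0, 5.0, size=1_000_000)  # X2 ~ N(-8, 5^2)
u = a * x1 + b * x2

# Sample moments should match a*mu1 + b*mu2 and a^2 s1^2 + b^2 s2^2.
print(u.mean(), a * 3.0 + b * (-8.0))          # both near -50
print(u.var(), a**2 * 2.0**2 + b**2 * 5.0**2)  # both near 1241
```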
 
  • #7
Ray Vickson said:
Yes, it is a well-known classical result.

You will have to judge whether just quoting the literature is enough, or whether the person setting the problem wants you to actually prove the result, by performing integration or some other means, such as the moment-generating function approach that I already mentioned.
Thank you! One more thing: can we say anything about the independence of variables of the form ##U(a,b) = aX_1 + bX_2##, if the initial ones are independent (for example, are ##2X_1+7X_2## and ##5X_1+22X_2## independent)?
 
  • #8
Silviu said:
Thank you! One more thing: can we say anything about the independence of variables of the form ##U(a,b) = aX_1 + bX_2##, if the initial ones are independent (for example, are ##2X_1+7X_2## and ##5X_1+22X_2## independent)?

Yes. Let ##X_1## and ##X_2## have a bivariate normal distribution with mean vector ##\mathbf{\mu}## and variance-covariance matrix ##\mathbf{\Sigma}##, so that
$$
\mathbf{\mu} = \pmatrix{\mu_1\\ \mu_2}, \; \; \mathbf{\Sigma} = \pmatrix{\sigma_{11}^2 & \sigma_{12} \\ \sigma_{21} & \sigma_{22}^2},$$
with ##\sigma_{12} = \sigma_{21} = \text{Cov}(X_1, X_2)##. Then ##Y = a X_1 + b X_2## has a normal distribution as well. You will get more benefit if you derive for yourself the mean and variance of ##Y##; the formulas are not complicated, but if you don't have time or are desperate you can always look them up. Again, the formulas are standard and classical and widely used in applications.
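
For readers who want to check their derivation afterwards, the standard formulas, in the notation above, come out as
$$E[Y] = a\mu_1 + b\mu_2, \qquad \text{Var}(Y) = a^2\sigma_{11}^2 + 2ab\,\sigma_{12} + b^2\sigma_{22}^2,$$
which reduces to ##a^2\sigma_{11}^2 + b^2\sigma_{22}^2## in the independent case ##\sigma_{12} = 0##.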
 
  • #9
Silviu said:
Thank you! One more thing: can we say anything about the independence of variables of the form ##U(a,b) = aX_1 + bX_2##, if the initial ones are independent (for example, are ##2X_1+7X_2## and ##5X_1+22X_2## independent)?

You should consider a simple test case for your example. Suppose ##X_1## and ##X_2## are zero mean. For your example, define ##U:=2X_1+7X_2## and ##V:= 5X_1+22X_2##. If ##U \perp V## then they must have zero covariance. But

##\text{Cov}(U,V) = E\big[UV\big] - E\big[U\big]E\big[V\big]##

##\text{Cov}(U,V) = E\big[UV\big] - 0 = E\big[10X_1^2 + 44 X_1 X_2 + 35 X_1 X_2 + 154 X_2^2\big]##

##\text{Cov}(U,V) = 10E\big[X_1^2\big] + 44E\big[X_1 X_2\big] + 35E\big[X_1 X_2\big] + 154E\big[X_2^2\big] = 10 E\big[X_1^2\big] + 154 E\big[X_2^2\big] \neq 0,##

where the cross terms vanish because ##E\big[X_1 X_2\big] = E\big[X_1\big]E\big[X_2\big] = 0## by independence and the zero means.
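
With the thread's parameters (##\sigma_1 = 2##, ##\sigma_2 = 5##, shifted to zero mean for this test case), that works out to ##10 \cdot 4 + 154 \cdot 25 = 3890##. A minimal simulation sketch of the check:

```python
import numpy as np

rng = np.random.default_rng(0)
# Zero-mean versions of the thread's variables, as in the test case above.
x1 = rng.normal(0.0, 2.0, size=1_000_000)  # sigma_1 = 2
x2 = rng.normal(0.0, 5.0, size=1_000_000)  # sigma_2 = 5
u = 2 * x1 + 7 * x2
v = 5 * x1 + 22 * x2

# Sample covariance should sit near 10*4 + 154*25 = 3890, i.e. far from 0.
print(np.cov(u, v)[0, 1])
```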
 
  • #10
StoneTemplePython said:
You should consider a simple test case for your example. Suppose ##X_1## and ##X_2## are zero mean. For your example, define ##U:=2X_1+7X_2## and ##V:= 5X_1+22X_2##. If ##U \perp V## then they must have zero covariance. But

##\text{Cov}(U,V) = E\big[UV\big] - E\big[U\big]E\big[V\big]##

##\text{Cov}(U,V) = E\big[UV\big] - 0 = E\big[10X_1^2 + 44 X_1 X_2 + 35 X_1 X_2 + 154 X_2^2\big]##

##\text{Cov}(U,V) = 10E\big[X_1^2\big] + 44E\big[X_1 X_2\big] + 35E\big[X_1 X_2\big] + 154E\big[X_2^2\big] = 10 E\big[X_1^2\big] + 154 E\big[X_2^2\big] \neq 0,##

where the cross terms vanish because ##E\big[X_1 X_2\big] = E\big[X_1\big]E\big[X_2\big] = 0## by independence and the zero means.
Thank you for this. However, this doesn't really test independence, I think. You can get zero covariance while ##U## and ##V## are still dependent. Isn't that right? (I am new to this, so I am not sure if the normal distribution has anything particular that ensures independence based on correlation.)
 
  • #11
Ray Vickson said:
Yes. Let ##X_1## and ##X_2## have a bivariate normal distribution with mean vector ##\mathbf{\mu}## and variance-covariance matrix ##\mathbf{\Sigma}##, so that
$$
\mathbf{\mu} = \pmatrix{\mu_1\\ \mu_2}, \; \; \mathbf{\Sigma} = \pmatrix{\sigma_{11}^2 & \sigma_{12} \\ \sigma_{21} & \sigma_{22}^2},$$
with ##\sigma_{12} = \sigma_{21} = \text{Cov}(X_1, X_2)##. Then ##Y = a X_1 + b X_2## has a normal distribution as well. You will get more benefit if you derive for yourself the mean and variance of ##Y##; the formulas are not complicated, but if you don't have time or are desperate you can always look them up. Again, the formulas are standard and classical and widely used in applications.
Thank you for your reply. I am not sure I understand what you mean. The covariance doesn't ensure independence, right? Don't you need another way to test for that (i.e. checking ##f_{U,V}(u,v)=f_U(u)f_V(v)##)? But I don't know what ##f_{U,V}(u,v)## is, so I am not sure how else to do it.
 
  • #12
Silviu said:
Thank you for your reply. I am not sure I understand what you mean. The covariance doesn't ensure independence, right? Don't you need another way to test for that (i.e. checking ##f_{U,V}(u,v)=f_U(u)f_V(v)##)? But I don't know what ##f_{U,V}(u,v)## is, so I am not sure how else to do it.

For the special case of a multivariate normal random vector, independence is mathematically equivalent to zero correlation. That is not true for many types of random variables, but it is true for the normal case. See, e.g., http://www.maths.manchester.ac.uk/~mkt/MT3732%20(MVA)/Notes/MVA_Section3.pdf

Note, however: we need more than just normally-distributed marginals; we need a full-fledged multivariate normal density, because there are examples of multivariate random vectors with normal marginals and zero correlation which are nonetheless dependent. Their joint distributions are not normal, even though their marginals are normal.
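
A classic instance of that caveat: take ##X \sim N(0,1)## and ##Y = SX##, where ##S = \pm 1## is an independent fair sign. Then ##Y## is standard normal and ##\text{Cov}(X,Y) = 0##, yet ##|X| = |Y|##, so ##X## and ##Y## are dependent and ##(X,Y)## is not bivariate normal. A quick sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
s = rng.choice([-1.0, 1.0], size=1_000_000)  # independent random sign
y = s * x  # y has a standard normal marginal, but (x, y) is not jointly normal

print(np.cov(x, y)[0, 1])              # near 0: uncorrelated
print(np.all(np.abs(x) == np.abs(y)))  # True: clearly dependent
```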
 
  • #13
Silviu said:
Thank you for this. However, this doesn't really test independence, I think. You can get zero covariance while ##U## and ##V## are still dependent. Isn't that right? (I am new to this, so I am not sure if the normal distribution has anything particular that ensures independence based on correlation.)
What you are saying is true - independent random variables have zero covariance, but the converse is not true - random variables with zero covariance are not necessarily independent. However, non-zero covariance does prove that the variables are not independent. Do you see why?
 
  • #14
tnich said:
What you are saying is true - independent random variables have zero covariance, but the converse is not true - random variables with zero covariance are not necessarily independent. However, non-zero covariance does prove that the variables are not independent. Do you see why?
I see: using that, you can prove for sure that two variables are not independent, but you can't prove for sure that they are independent.
 
  • #15
Silviu said:
I see: using that, you can prove for sure that two variables are not independent, but you can't prove for sure that they are independent.
Right.
"If ##cov(X_1X_2) \neq 0##, then ##X_1## and ##X_2## are not independent"
is the contrapositive of
"if ##X_1## and ##X_2## are independent, then ##cov(X_1X_2)=0##".
 
  • #16
tnich said:
Right.
"If ##cov(X_1X_2) \neq 0##, then ##X_1## and ##X_2## are not independent"
is the contrapositive of
"if ##X_1## and ##X_2## are independent, then ##cov(X_1X_2)=0##".
Thanks! And still, is there a way to test if U and V are independent in this case?
 
  • #17
Silviu said:
Thanks! And still, is there a way to test if U and V are independent in this case?
You could use @StoneTemplePython's method in post #10 to show that they are not independent.
You could also show that ##E(UV)\neq E(U)E(V)##. If you can show that ##E(UV)## and ##E(U)E(V)## are not identically equal under the conditions of your problem, then you have a proof that U and V are not independent.
 
  • #18
Silviu said:
Thanks! And still, is there a way to test if U and V are independent in this case?

Your random variables ##U## and ##V## have a bivariate normal distribution ##f_{UV}(u,v)##, but with ##\text{Cov}(U,V) \neq 0##. Therefore, they are automatically dependent, because for any bivariate normal vector, the components are independent if and only if they have zero correlation.

Again: this is true for multivariate NORMAL random variables, but not necessarily for others.
 
  • #19
Silviu said:
Thank you for this. However, this doesn't really test independence, I think. You can get zero covariance while ##U## and ##V## are still dependent. Isn't that right? (I am new to this, so I am not sure if the normal distribution has anything particular that ensures independence based on correlation.)

I'm a bit concerned that you're unaware of the fact that zero covariance is a prerequisite for independence for any random variables (with finite second moments). I was quite clear when I said

StoneTemplePython said:
If ##U \perp V## then they must have zero covariance.

Then I showed directly that the covariance is non-zero. The result is simple, general, and true for any linear combination example like the one you gave, for any type of random variables (except in the very special cases where (a) the random variables are deterministic or (b) they don't have finite second moments).

Ray mentioned some additional structural subtleties for Gaussians... they are important but subtle -- and I like easy stuff.

I also like working with zero-mean random variables wherever possible. You should be aware that for two random variables ##X## and ##Y##, ##\text{cov}(X, Y)## is the same whether or not you shift (the mean of) ##X## and/or ##Y## - hence you can always choose to work with zero-mean random variables to make the point. The result -- i.e. that shifting (and in particular centering) doesn't change the covariance -- is something you should know.

Here's the math
- - - -
##\text{cov}(X,Y) = E\big[XY\big] - E\big[X\big]E\big[Y\big]##

Now consider shifting ##X## by some fixed value ##b##:
$$\text{cov}(X + b,Y) = E\big[(X+b)Y\big] - E\big[X+b\big]E\big[Y\big]\\
\text{cov}(X + b,Y) = E\big[XY+ bY\big] - \big(E\big[X\big] + E\big[b\big]\big)E\big[Y\big]\\
\text{cov}(X + b,Y) = E\big[XY\big] + E\big[bY\big] - \big(E\big[X\big] + b\big)E\big[Y\big]\\
\text{cov}(X + b,Y) = E\big[XY\big] + bE\big[Y\big] - \big(E\big[X\big]E\big[Y\big] + bE\big[Y\big]\big)\\
\text{cov}(X + b,Y) = E\big[XY\big] - E\big[X\big]E\big[Y\big] \\
\text{cov}(X + b,Y) = \text{cov}(X, Y)$$
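
A quick numerical illustration of the identity (the distributions and shift below are arbitrary choices; the identity holds for any random variables with finite second moments):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(3.0, 2.0, size=500_000)
y = rng.normal(-8.0, 5.0, size=500_000)
b = 17.0  # an arbitrary fixed shift

# Shifting x by a constant leaves the (sample) covariance unchanged.
print(np.cov(x, y)[0, 1], np.cov(x + b, y)[0, 1])
```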
 

1. What is a linear combination of random variables?

A linear combination of random variables is a new random variable formed by multiplying each variable by a constant and adding the results, e.g. ##Z = aX + bY##. It is a common way to combine multiple random variables into a single new one.

2. Why are linear combinations of random variables important in statistics?

Linear combinations of random variables are important in statistics because they let us build new quantities (sums, averages, contrasts) out of existing variables, and their means and variances follow simple rules. This makes it possible to find relationships between different variables and make predictions about the data.

3. How do you calculate the mean of a linear combination of random variables?

By linearity of expectation, the mean of a linear combination is the same combination of the individual means: ##E[aX + bY] = a\,E[X] + b\,E[Y]##. Multiply each variable's mean by its constant and add the results.

4. What is the central limit theorem and how does it relate to linear combinations of random variables?

The central limit theorem states that the sum of a large number of independent and identically distributed random variables (with finite variance) will be approximately normally distributed, regardless of the underlying distribution of the individual variables. This theorem is important in understanding the behavior of linear combinations of random variables, as it allows us to make predictions about the distribution of the resulting random variable.

5. Can you provide an example of a linear combination of random variables?

Sure, let's say we have two random variables, X and Y, with means of 10 and 5 respectively. We can create a new random variable Z by taking the linear combination Z = 3X + 2Y. This means that Z is equal to 3 times the value of X plus 2 times the value of Y. The resulting random variable Z will have a mean of 40 (3*10 + 2*5) and will itself be normally distributed if X and Y are independent (or jointly) normal.
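
A short simulation of this example, assuming (hypothetically, since the example only fixes the means) that X and Y are independent with standard deviations 2 and 3:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, size=1_000_000)  # mean 10; sd 2 is an assumption
y = rng.normal(5.0, 3.0, size=1_000_000)   # mean 5; sd 3 is an assumption
z = 3 * x + 2 * y

print(z.mean())  # near 40 = 3*10 + 2*5
print(z.var())   # near 72 = 3**2 * 2**2 + 2**2 * 3**2, given independence
```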
