Variance of Linear combination of random variable

giddy · Jul 4, 2010

This is a problem from my A levels Stats2 book. I understood the problem but one of my answers doesn't seem to be correct according to the book so I thought I better be sure!

Homework Statement

A piece of laminated plywood consists of 3 pieces of wood of type A and 2 pieces of type B. The thickness of type A has mean 2 mm and variance 0.04 mm², and the thickness of type B has mean 1 mm and variance 0.01 mm². Find the mean and variance of the thickness of the laminated plywood.

Homework Equations

E(aX + bY) = aE(X) + bE(Y)
Var(aX + bY) = a²Var(X) + b²Var(Y)

The Attempt at a Solution

so T = 3A + 2B
E(T) = 3E(A) + 2E(B) = 8mm (This is correct)
Now : Var(T) = 3²Var(A) + 2²Var(B)
=9*0.04 + 4*0.01 = 0.36+0.04 = 0.4 (Which is wrong... according to the book the answer should be 0.14mm²)
How? What am I doing wrong?

Redbelly98 · Jul 5, 2010

Moderator's note: a post giving more help than allowed has been deleted.

giddy said:

Homework Equations

E(aX + bY) = aE(X) + bE(Y)
Var(aX + bY) = a²Var(X) + b²Var(Y)

The Attempt at a Solution

so T = 3A + 2B
E(T) = 3E(A) + 2E(B) = 8mm (This is correct)
Now : Var(T) = 3²Var(A) + 2²Var(B)
=9*0.04 + 4*0.01 = 0.36+0.04 = 0.4 (Which is wrong... according to the book the answer should be 0.14mm²)
How? What am I doing wrong?

Can you double check the variance formula? My recollection is that the a and b should not be squared.

LCKurtz · Jul 5, 2010

Redbelly98 said:

Moderator's note: a post giving more help than allowed has been deleted.

Can you double check the variance formula? My recollection is that the a and b should not be squared.

Var(cX) = c²Var(X) is correct and if X and Y are independent, so their covariance is 0, then Var(X+Y) = Var(X) + Var(Y). So, assuming A and B are independent, it looks like the a and b are correctly squared. Perhaps the answer key is wrong.

ehild · Jul 5, 2010

3 pieces of wood are three different objects drawn from a set of this kind of wood, with thickness as the random variable. So you have the sum of 5 random variables, 3 from a set with mean 2 mm and variance 0.04 mm² and two from another set with mean 1 mm and variance 0.01 mm². Assuming the pieces are independent, the variances of all 5 variables add up.

You have to use the square of a constant factor when you have a variable which is a constant times another variable. For example, the price you have to pay (P) for a melon is the price of 1 kg (c) multiplied by the mass (m): P=c*m. The variance of the money you pay is

var(P) = c^2 var (m).

ehild
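As a rough check of the point above, the two readings of the plywood problem can be computed side by side. This is only a sketch: the function names are made up for illustration, and it assumes all five thicknesses are independent, which the problem statement does not say explicitly.

```python
# Plywood example from the thread: 3 pieces of type A, 2 pieces of type B.
# Assumption (not stated in the problem): all five thicknesses are independent.
MEAN_A, VAR_A = 2.0, 0.04  # mm, mm^2
MEAN_B, VAR_B = 1.0, 0.01  # mm, mm^2


def sum_of_independent_pieces(n_a, n_b):
    """Mean and variance of A1 + ... + A_na + B1 + ... + B_nb."""
    mean = n_a * MEAN_A + n_b * MEAN_B
    var = n_a * VAR_A + n_b * VAR_B  # variances add, one term per piece
    return mean, var


def scaled_single_pieces(a, b):
    """Mean and variance of a*A + b*B (one A and one B, each rescaled)."""
    mean = a * MEAN_A + b * MEAN_B
    var = a**2 * VAR_A + b**2 * VAR_B  # Var(cX) = c^2 Var(X)
    return mean, var


mean_sum, var_sum = sum_of_independent_pieces(3, 2)
mean_scaled, var_scaled = scaled_single_pieces(3, 2)

print(f"five separate pieces: mean = {mean_sum:.2f} mm, var = {var_sum:.2f} mm^2")
print(f"3A + 2B rescaled:     mean = {mean_scaled:.2f} mm, var = {var_scaled:.2f} mm^2")
```

Both readings give the same mean of 8 mm, which is why the mean in the original attempt came out right. Only the first reading reproduces the book's variance of 0.14 mm²; the second gives the 0.40 mm² from the attempt.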

Redbelly98 · Jul 5, 2010

LCKurtz said:

Var(cX) = c²Var(X) is correct and if X and Y are independent, so their covariance is 0, then Var(X+Y) = Var(X) + Var(Y). So, assuming A and B are independent, it looks like the a and b are correctly squared. Perhaps the answer key is wrong.

[STRIKE]But your example uses a=b=1, so I don't see how that distinguishes between whether squared or not squared is correct.[/STRIKE]

How about another example: b=0. You just sum the same random variable, X, a times. I do remember that the standard deviation of the sum is (√a)σ in this case, where σ is the standard deviation of X. Square that result to get the variance; this is consistent with var(X+X+...+X) = a·var(X), i.e. the coefficients a and b are not squared.

EDIT: ehild beat me.

LCKurtz · Jul 5, 2010

LCKurtz said:

Var(cX) = c²Var(X) is correct and if X and Y are independent, so their covariance is 0, then Var(X+Y) = Var(X) + Var(Y). So, assuming A and B are independent, it looks like the a and b are correctly squared. Perhaps the answer key is wrong.

Redbelly98 said:

But your example uses a=b=1, so I don't see how that distinguishes between whether squared or not squared is correct.

It does, because if cov(X,Y) = 0 then cov(aX,bY) = 0 as well. That is what gives

var(aX + bY) = var(aX) + var(bY) for constants a and b not necessarily 1.

ehild · Jul 5, 2010

Redbelly98 said:

EDIT: ehild beat me.

We did the same at the same time, it was a big fight :)

ehild

Redbelly98 · Jul 5, 2010

Redbelly98 said:

But your example uses a=b=1, so I don't see how that distinguishes between whether squared or not squared is correct.

LCKurtz said:

It does, because if cov(X,Y) = 0 then cov(aX,bY) = 0 as well. That is what gives

var(aX + bY) = var(aX) + var(bY) for constants a and b not necessarily 1.

Now I understand better what you were saying. The problem is that we need to take

(X₁+X₂+...+X_a) + (Y₁+Y₂+...+Y_b)

and not

aX + bY

giddy · Jul 6, 2010

LCKurtz said:

Var(cX) = c²Var(X) is correct and if X and Y are independent, so their covariance is 0, then Var(X+Y) = Var(X) + Var(Y). So, assuming A and B are independent, it looks like the a and b are correctly squared. Perhaps the answer key is wrong.

Possible, I solved more problems with the same equation and they are all correct.

ehild · Jul 6, 2010

Assume that the thickness of the other type of wood (B) has the same mean and variance as type A: E(x)=E(y), Var(x) = Var(y) = σ². Assume that both stacks contain a very large number of wooden pieces. You pick one piece from stack A and one from stack B and put them together. What is the variance of the overall thickness? As everybody agrees, it is Var(x)+Var(y), which now equals 2σ². Later on, it is found that stacks A and B both contain the same type of wood. As the two stacks are identical, it does not matter whether you choose the pieces from different stacks or from a single stack, does it? Now you pick two pieces from stack A. What will the variance be in this case, 2²σ²?

Picking one piece of wood is an "experiment" and the thickness of the first piece is the result of the experiment. "The thickness of the first piece picked" is, in general, a random variable.
Choosing a piece a second time is another experiment; its random variable is "the thickness of the second piece picked". It does not matter whether you choose the pieces from two different stacks of wood or from the same one. The mean values and the variances add up.

ehild

LCKurtz · Jul 6, 2010

Ahhh. This is what happens when trying to recall stuff from 40 years ago.

So the real issue for the OP is not that his "relevant formulas" are incorrect as much as they are not relevant. That is, taking the variance of the sum of three independent identically distributed random variables X₁ + X₂ + X₃ is not the same as taking the variance of 3X where X has the same distribution. That is why the formula Var(cX) = c²var(X), while being correct itself, isn't relevant to this problem.
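The distinction can be written out with the general variance-of-a-sum formula. The following is a sketch, using σ² for the common variance (a symbol chosen here for illustration) and assuming X₁, X₂, X₃ are independent, so all their covariances vanish:

```latex
\begin{align*}
\operatorname{Var}(X_1 + X_2 + X_3)
  &= \sum_{i=1}^{3} \operatorname{Var}(X_i)
     + 2 \sum_{i<j} \operatorname{Cov}(X_i, X_j)
   = 3\sigma^2 + 0 = 3\sigma^2, \\
\operatorname{Var}(3X)
  &= \operatorname{Var}(X + X + X)
   = 3\sigma^2 + 2 \cdot 3 \cdot \operatorname{Cov}(X, X)
   = 3\sigma^2 + 6\sigma^2 = 9\sigma^2 .
\end{align*}
```

Seen this way, 3X is the extreme case where the three summands are the same variable, so every covariance term equals σ² instead of 0.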

giddy · Jul 6, 2010

Yes... lots of confusion, I'm so sorry and frustrated. OK, so I missed the line in the textbook that says:

"It is important to distinguish between situations in which a single observation is multiplied by a constant and those in which several different observations of the same random variable are added."

So I get what it's trying to say... but in the next exercise I just ended up trying both equations, and whenever I knew I had a bogus result I used the other. I really tried to understand this...

These are two problems... the variance for each is calculated differently (Var(X) + Var(X) versus a²Var(X))

Problem 1 The times of four athletes for the 400 m are each distributed normally with a mean of 47 seconds and a standard deviation of 2 seconds. The four athletes are to compete in a 4 × 400 m relay race. Find the probability that their total time is less than 3 minutes.

Problem 2 The capacities of small bottles of perfume are distributed normally with a mean of 50 ml and SD of 3 ml. The capacities of large bottles of the same perfume are distributed normally with a mean of 80 ml and SD of 5 ml. Find the probability that the total capacity of 3 small bottles is greater than the total capacity of 2 large bottles.

So I already got the answers, BUT by trial and error. I don't understand why I should apply a²Var(X) for problem 2 and Var(X) + Var(X) for problem 1. Is it because in problem 2 there is a comparison? I don't see how it's different, since in one case there are multiple athletes with the same variance... and in the other there are multiple bottles with the same variance.

*I'm actually considering a degree course in Math next year! =S

ehild · Jul 6, 2010

I do not see any c*x-type variable in these problems. But these problems are of a different type than your previous problem. I will have to look into them.
By the way, probability theory is very difficult.

ehild

vela · Jul 6, 2010

Sounds like you did problem 2 incorrectly.

Perhaps an example would help illustrate the difference between the two cases:

Say you had two bottles and the volume of each one is represented by its own random variable, described by the same distribution f(X). Say X₁ is the volume of liquid in the first bottle and X₂ is the volume of liquid in the second bottle. If the two volumes are independent, the combined volume Y=X₁+X₂ would have a variance var(Y)=var(X₁)+var(X₂)=2var(X).

In contrast, you use a²Var(X) when you're simply rescaling one random variable. So if you were talking about Z=2X₁, which would represent twice the volume of bottle 1, you'd have var(Z)=var(2X₁)=2²var(X).

Both mean(Y)=mean(X₁)+mean(X₂) and mean(Z)=mean(2X₁) are equal to 2⋅mean(X), but they stand for different quantities. You can see Z can not be the total volume of the two bottles because it doesn't include any information about the second bottle. Your mistake in solving the original problem was using a quantity like Z instead of one like Y for the total thickness of the plywood.

The other problems, as ehild noted, all involve the first situation, not the second. They take several measurements and add them together to find a total.
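A quick simulation can make the Y versus Z distinction concrete. This is a minimal sketch, not part of the original discussion: the bottle parameters (mean 50, SD 3) are borrowed from Problem 2 purely for illustration, the two bottles are assumed independent, and the sample size and seed are arbitrary.

```python
import random
import statistics

random.seed(0)          # arbitrary seed, only for reproducibility
N = 100_000             # number of simulated pairs of bottles
MU, SIGMA = 50.0, 3.0   # illustrative values; var(X) = 9

# Two independent bottles drawn from the same distribution.
x1 = [random.gauss(MU, SIGMA) for _ in range(N)]
x2 = [random.gauss(MU, SIGMA) for _ in range(N)]

y = [a + b for a, b in zip(x1, x2)]  # Y = X1 + X2: total of two bottles
z = [2 * a for a in x1]              # Z = 2*X1: one bottle, doubled

var_y = statistics.pvariance(y)      # should be close to 2 * 9 = 18
var_z = statistics.pvariance(z)      # should be close to 4 * 9 = 36

print(f"mean(Y) = {statistics.fmean(y):.2f}, var(Y) = {var_y:.2f}")
print(f"mean(Z) = {statistics.fmean(z):.2f}, var(Z) = {var_z:.2f}")
```

The two sample means should both land near 100, while the sample variances should land near 18 and 36 respectively, matching 2var(X) and 2²var(X).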

ehild · Jul 7, 2010

giddy said:

Problem 2 The capacities of small bottles of perfume are distributed normally with a mean of 50 ml and SD of 3 ml. The capacities of large bottles of the same perfume are distributed normally with a mean of 80 ml and SD of 5 ml. Find the probability that the total capacity of 3 small bottles is greater than the total capacity of 2 large bottles.

I will show how to treat such problems.
x stands for the volume of small bottles, y for the big ones. The volumes of the bottles are independent of each other. Pick three small bottles and two big ones. The total volume of the small bottles is Vs=x1+x2+x3, that of the two big bottles is Vb = y1+y2.

As the volumes are Gaussian and independent, so are their sums. The mean of Vs is μ_s=50+50+50=150 ml, the variance is 9+9+9=27 ml², σ_s=3√3 ml.

The volume of two big bottles has the mean of μ_b=160 ml, variance of 50 ml² and standard deviation of σ_b=5√2 ml.

The volume difference ΔV=Vs-Vb is a new variable, which is the sum of Vs and -1*Vb. The mean is μ=150-160= -10 ml, the variance is var(Vs)+(-1)^2 var(Vb)= 27+50 = 77 ml², so σ=√77 ml.

You have a Gaussian distribution for ΔV with known parameters, and have to find the probability that ΔV≥0.

What was your solution?

ehild
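The calculation above can be checked numerically. This is a hedged sketch using `statistics.NormalDist` from the Python standard library (available from Python 3.8); the variable names are illustrative, and independence of all five bottles is assumed, as in the post.

```python
from math import sqrt
from statistics import NormalDist

small = NormalDist(mu=50, sigma=3)   # capacity of one small bottle, ml
large = NormalDist(mu=80, sigma=5)   # capacity of one large bottle, ml

# Delta V = (x1 + x2 + x3) - (y1 + y2), all five bottles independent.
mean_diff = 3 * small.mean - 2 * large.mean          # 150 - 160
var_diff = 3 * small.variance + 2 * large.variance   # 27 + 50; variances add even for a difference

diff = NormalDist(mu=mean_diff, sigma=sqrt(var_diff))
p = 1 - diff.cdf(0)                  # P(Delta V > 0)

print(f"mean = {mean_diff}, variance = {var_diff}, P(dV > 0) = {p:.3f}")
```

The probability should come out close to the book's 0.127. Using 3²·9 + 2²·25 for the variance instead would correspond to three times one small bottle minus twice one large bottle, which is a different quantity.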

giddy · Jul 7, 2010

Hey, you're right!

So I used a²Var(X) and got 0.129 by making other basic math errors too, and thought it was close enough to the answer --> 0.127

V ~ N(-10, 77), and we need P(V > 0):
P(V > 0) = P(Z > (0 − (−10))/√77) = P(Z > 1.1396)
= 1 − Φ(1.1396) = 1 − 0.8726 = 0.1274 ≅ 0.127

Thanks so much, I understand pretty well the difference now.

On another note, although I like math (and loved it at school) I make tons of basic math errors, I'm slow and I can't remember things like formulas easily. (I was diagnosed with ADD but the meds were useless.) A levels thankfully allow us to use calculators and a formula reference sheet.

Since I study entirely on my own, the hopes of a good grade are not promising. (I got 50% in my AS.) I really want to do a degree in computers, but to get into this really good college I need 80% in math; however, to enroll for a degree in Math I only need 45%! So I'm considering it. Would it be easy for me to do a masters in Comp. Sc. somewhere in the US or UK after a math degree? Any thoughts on this? (I do happen to be a whiz at programming.)

Sorry for the off topic! Thanks

ehild · Jul 7, 2010

To improve your maths, you should solve tons of problems. And it would be better to have a teacher. To me, it seems impossible to learn maths without discussing the problems with somebody else.
You do not need to remember formulas, except the basic ones. You can find the others, or derive them. Logical thinking is very important, and so is understanding the definitions and axioms. If you really know and understand them, you can in principle derive every formula, or at least you will choose the appropriate one.

ehild
