Subtraction of normally distributed stochastic variables

Lobotomy
Hello,
Suppose we have a set of stochastic variables X, Y, Z and W, each representing the random time it takes to do something, and C is the sum of X, Y, Z and W, i.e. the time it takes to do these things in sequence. If:
X: N(30,5)
Y: N(30,3)
Z: N(20,2)
W: N(40,7)

Then C is obtained by adding these together, right? Mean plus mean and std dev plus std dev:
C: N(30+30+20+40, 5+3+2+7) = N(120, 17). Is this correct?

Is subtraction the same? Assuming we want to subtract W from C, naming the result S, we get:
S: N(120-40, 17-7) = N(80, 10)

Is this also correct?

If yes, please link to a reliable source. I have googled but have not found a proof other than the one for addition.
Would these operations also work for lognormal distributions?

Edit: It seems that, in the case when the variables are independent, the std deviation is ADDED also when subtracting. I don't understand this. Take this example:
You have a washing machine and a drying machine. The time each takes to do a batch can be represented by a stochastic variable; the two are normally distributed and independent of each other.
Time it takes to wash your clothes: normal dist., mean 45 min, std dev 5 min
Time it takes to dry the clothes: normal dist., mean 50 min, std dev 10 min

Time it takes to wash and dry the clothes: N(45+50, 5+10)

Then if we want to subtract the time it takes to dry the clothes again, we get N(45+50-50, 5+10+5) = N(45, 20). This does not make sense: why would there suddenly be more variation when washing the clothes? We use the same washing machine as before. The thing here is that washing and drying are separate processes in a sequence, and they are not mixed together.
 
When the variables are independent, the means add or subtract, just as the variables. However, when it comes to variability, the variances always add.

If $\sigma_X = 50, \quad \sigma_Y = 100$

then

\begin{align*}
\operatorname{Var}(X + Y) &= 50^2 + 100^2 \\
\operatorname{Var}(X - Y) &= 50^2 + 100^2
\end{align*}

Think about the contradiction you'd have if variances subtracted (as, admittedly, first exposure makes it seem they should). If they subtracted, what would you get for the variance of X - Y in my example?
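
As a quick check of the rule, here is a minimal sketch using Python's standard-library `statistics.NormalDist`, whose `+` and `-` operators treat the two operands as independent. The zero means for X and Y are an assumption made for illustration (the post only gives the standard deviations); the four step times are the ones from the opening question.

```python
from math import isclose, sqrt
from statistics import NormalDist

# Standard deviations from the post; the means (0) are assumed for illustration.
X = NormalDist(mu=0, sigma=50)
Y = NormalDist(mu=0, sigma=100)

var_sum = (X + Y).variance    # variance of X + Y
var_diff = (X - Y).variance   # variance of X - Y
print(var_sum, var_diff)      # both are 50**2 + 100**2, up to rounding

# The four sequential step times from the opening question.
steps = [NormalDist(30, 5), NormalDist(30, 3), NormalDist(20, 2), NormalDist(40, 7)]
C = steps[0] + steps[1] + steps[2] + steps[3]
print(C.mean, C.stdev)        # mean 120, std dev sqrt(5**2 + 3**2 + 2**2 + 7**2)
```

Under the independence assumption, the total standard deviation comes out near 9.3 rather than 17.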
 
statdad said:
When the variables are independent, the means add or subtract, just as the variables. However, when it comes to variability, the variances always add.



Thanks, I know this, but how do you explain my washing machine example then? How come the washing machine suddenly has an increased variance?
 
"thanks I know this."

Apparently not.
"it seems like, in the case when the variables are independent, that the std deviation is ADDED also when subtracting."
It is not the standard deviation that adds, it is the variance. They are not the same.

You repeat the error here.
"i don't understand this, take this example
you have a washing machine and a drying machine, the time they take for doing a batch can be represented by a stochastic variable that are normally distributed and independent of each other.
time it takes to wash your clothes Normaldist 45min std dev 5min
time it takes to dry the clothes Normaldist 50min std dev 10min

time it takes to wash and dry the clothes N(45+50,5+10)"

where you again add the standard deviations, which is incorrect. The time to wash and dry clothes would be

\[ N(45 + 50, \sqrt{5^2 + 10^2}) \]

Finally, here
"then if we want to subtract the time it takes to dry the clothes again we get N(45+50-50,5+10+5)=
N(45,20) this does not make sense, why would it suddenly be more variation when washing the clothes? we use the same washing machine as before. The thing here is that washing and drying are separate processes in a sequence and they are not mixed together."

you do it again. Subtracting an independent drying time (std dev 10), the time would be

\[ N(45 + 50 - 50, \sqrt{5^2 + 10^2 + 10^2}) \]

Why does the variability increase? Your algebraic subtraction can't be viewed as purely mathematical: it would represent some other operation done to the clothes after the initial washing and drying. You aren't subtracting the physical act of drying. With another activity, the variability in elapsed time will increase.
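
The distinction can be checked numerically. The sketch below is a Monte Carlo illustration using only Python's standard library; the sample size, seed and variable names are arbitrary choices, not part of the thread. It compares subtracting the very same drying time that went into the total (physically removing the step) with subtracting an independent drying time of the same distribution (the algebraic rule, std dev 10).

```python
import random
from statistics import stdev

random.seed(0)  # fixed seed so the run is repeatable
N = 50_000      # number of simulated loads (arbitrary)

wash = [random.gauss(45, 5) for _ in range(N)]        # washing times
dry = [random.gauss(50, 10) for _ in range(N)]        # drying times for the same loads
other_dry = [random.gauss(50, 10) for _ in range(N)]  # unrelated, independent dryer runs

total = [w + d for w, d in zip(wash, dry)]

# Removing the drying step itself: subtract the very same drying time.
removed = [t - d for t, d in zip(total, dry)]
# Algebraic subtraction of an independent variable with the same distribution.
independent = [t - d for t, d in zip(total, other_dry)]

sd_total = stdev(total)              # about sqrt(5**2 + 10**2)
sd_removed = stdev(removed)          # about 5, the washer alone
sd_independent = stdev(independent)  # about sqrt(5**2 + 10**2 + 10**2) = 15
print(round(sd_total, 1), round(sd_removed, 1), round(sd_independent, 1))
```

Only the second subtraction inflates the spread, because the independent dryer run adds its own randomness instead of cancelling the drying time already contained in the total.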
 
OK thanks, I understand a bit more. So isn't there a way to calculate the "subtraction of the physical act"?

The reason I am asking is related to the washing machine example.

I'm working in a factory and we have separate processes which each require a certain amount of time (similar to the washing and drying machines). Between the processes there is random waiting time.

Therefore the total production time is equal to the sum of each of the production times plus the waiting times in between. I see all of these times as stochastic variables.

Now what I want to do is remove the waiting times and physically remove some unnecessary processes through process integration to shorten lead time, but I also intuitively thought that the total time variation would decrease.

So what should I tell my boss? That removing waiting time and unnecessary processes will reduce the total lead time, but will increase the variation in total lead time? Is there a difference between removing the physical processes and removing the waiting times, which are not as "physical"?
 
Lobotomy,

I think what you are calculating then is A + B + C - B = A + C so then you can use the regular formulas for A + C, if you know those.
 
flat man said:
Lobotomy,

I think what you are calculating then is A + B + C - B = A + C so then you can use the regular formulas for A + C, if you know those.

OK, in other words variance does decrease when removing processes physically? Removing the process with 5^2 variance from the two processes which together have 5^2+10^2 variance then gives us 10^2 variance, right? I mean, this sounds perfectly logical to me anyway.

Just for fun: can anyone give an example from the real world, for instance using stochastic variables for time, where the "normal" subtraction rule (where the variances add) is used? I don't understand how this could relate to anything concrete.
 
Suppose the total time of the process as it currently exists is $N(\mu, \sigma_T)$ (I use $\sigma_T$ to indicate the standard deviation of the total time).

And suppose you can break the current process down into independent component parts (illustrated with 3 just for brevity) so that the total process can be thought of as Step 1, Step 2, Step 3, and

\[ \mu = \mu_1 + \mu_2 + \mu_3 \]

\[ \sigma_T^2 = \sigma_1^2 + \sigma_2^2 + \sigma_3^2 \]

If you can eliminate step 3, then the mean time for the reduced process is

\[ \mu_1 + \mu_2 \]

and the variance is only

\[ \sigma_1^2 + \sigma_2^2 \]

But you have to have the original process broken into components as shown here in order to determine the amount by which mean time and variability decrease.

Part of the confusion in the earlier give-and-take was a lack of precision in exactly what you wanted and a lack of understanding on my part.
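
To make the decomposition concrete, here is a minimal sketch in Python applied to the four step times from the opening question. The helper `total_time` and the dictionary layout are hypothetical names chosen for illustration, and independence of the steps is assumed throughout.

```python
from math import sqrt

def total_time(steps):
    """Mean and std dev of a sum of independent step times.

    `steps` maps a step name to (mean, std_dev); independence is assumed.
    """
    mean = sum(m for m, _ in steps.values())
    variance = sum(s ** 2 for _, s in steps.values())
    return mean, sqrt(variance)

# Step times from the opening question.
steps = {"X": (30, 5), "Y": (30, 3), "Z": (20, 2), "W": (40, 7)}
full = total_time(steps)

# Eliminate step W from the process and recompute from the remaining parts.
reduced_steps = {name: v for name, v in steps.items() if name != "W"}
reduced = total_time(reduced_steps)

print(full)     # mean 120, std dev sqrt(87)
print(reduced)  # mean 80, std dev sqrt(38)
```

With these numbers the standard deviation drops from about 9.3 to about 6.2 when W is eliminated, which is the variance-level subtraction 87 - 49 = 38 rather than a subtraction of standard deviations.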
 
Lobotomy,

Imagine my car rattles for A minutes before it dies, where A ~ N(15,3). Further imagine that it usually dies B minutes after start-up, where B ~ N(45,4). How long after start-up will it start to rattle? C = B - A, so, treating A and B as independent, C ~ N(30,5).
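
The car example can be reproduced with a short sketch using Python's standard-library `statistics.NormalDist`. Its subtraction operator assumes the two variables are independent, which is also the assumption behind the N(30,5) result.

```python
from statistics import NormalDist

A = NormalDist(mu=15, sigma=3)  # minutes of rattling before the car dies
B = NormalDist(mu=45, sigma=4)  # minutes from start-up until the car dies

# NormalDist subtraction treats A and B as independent:
# the means subtract, the variances add (3**2 + 4**2 = 5**2).
C = B - A
print(C.mean, C.stdev)  # mean 30, std dev 5
```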
 
OK yeah, this makes sense. I just got confused by that formula I found.
But even if I don't know the exact variation, I know that individual physical processes have SOME KIND of variation, and removing that process physically will reduce the total variation by THAT AMOUNT of variation. And I guess the same holds true for random waiting times...

How about dependency? Adding and subtracting variation then also includes a covariance term. But I guess that won't make any difference for the general conclusion, other than that the amount of variance reduction will be different.
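
For reference, the standard identity for dependent variables is sketched below, written with the correlation coefficient $\rho$ (a symbol not used elsewhere in this thread). The second and third lines apply it to the washing example, using the hypothetical symbols $U$ for the wash time, $V$ for the drying time and $T = U + V$ for the total, with $U$ and $V$ assumed independent. Subtracting the actual drying time from the total then gives back the washer's variance, because the total is correlated with the step being removed.

```latex
\begin{align*}
\operatorname{Var}(X \pm Y) &= \sigma_X^2 + \sigma_Y^2 \pm 2\rho\,\sigma_X \sigma_Y \\
\operatorname{Cov}(T, V) &= \operatorname{Cov}(U + V, V) = \sigma_V^2 \\
\operatorname{Var}(T - V) &= (\sigma_U^2 + \sigma_V^2) + \sigma_V^2 - 2\sigma_V^2 = \sigma_U^2
\end{align*}
```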
 