Subtraction of normally distributed stochastic variables


Discussion Overview

The discussion revolves around the properties of normally distributed stochastic variables, particularly focusing on the addition and subtraction of these variables in the context of time measurements for sequential processes. Participants explore the implications of independence on mean and variance calculations, and the effects of removing processes on total lead time and variability.

Discussion Character

  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant proposes that when summing independent normally distributed variables, the means are added and the standard deviations are also added, leading to a new normal distribution.
  • Another participant clarifies that while means can be added or subtracted, the variances (the squares of the standard deviations) of independent variables always add, regardless of whether the variables are summed or subtracted.
  • A participant questions the increase in variability when subtracting the time taken for drying from the total time, suggesting that the operations do not seem to align with intuitive expectations.
  • There is a repeated emphasis on the distinction between standard deviation and variance, with one participant correcting another's misunderstanding regarding their addition in calculations.
  • One participant expresses confusion about how to account for the physical act of removing processes in a production context and its effect on total lead time and variability.
  • Another participant suggests that removing unnecessary processes may reduce total lead time but could potentially increase variation in total lead time due to the nature of stochastic variables.
  • There is a request for real-world examples where the normal subtraction rule applies in practical scenarios involving stochastic variables.

Areas of Agreement / Disagreement

Participants generally agree on the mathematical principles regarding the addition and subtraction of means and variances of independent normal distributions. However, there is disagreement and confusion regarding the implications of these principles in practical scenarios, particularly in the context of physical processes and their impact on variability.

Contextual Notes

Participants note limitations in understanding how the removal of processes affects variability, particularly in distinguishing between physical processes and waiting times. There is also an acknowledgment of the complexity involved in applying theoretical principles to real-world situations.

Who May Find This Useful

This discussion may be useful for individuals interested in statistical analysis, operations management, and those dealing with stochastic processes in practical applications such as manufacturing or service industries.

Lobotomy
Hello,
Suppose we have a set of stochastic variables representing the random time it takes to do something: X, Y, Z, W and C, where C is the sum of X, Y, Z and W, i.e. the time it takes to do these things in sequence. If:
X: N(30,5)
Y: N(30,3)
Z: N(20,2)
W: N(40,7)

Is C then obtained by adding these together, i.e. mean plus mean and std dev plus std dev?
C: N(30+30+20+40, 5+3+2+7) = N(120, 17). Is this correct?

Is subtraction the same? Assuming we want to subtract W from C, and naming the result S, we get:
S: N(120-40, 17-7) = N(80, 10)

Is this also correct?

If yes, please link to a reliable source. I have googled but have not found a proof other than the one for addition.
Would these operations also work for lognormal distributions?

Edit: it seems like, in the case when the variables are independent, that the std deviation is ADDED also when subtracting. I don't understand this. Take this example:
You have a washing machine and a drying machine. The time each takes to do a batch can be represented by a stochastic variable; these are normally distributed and independent of each other.
Time it takes to wash your clothes: normal dist., mean 45 min, std dev 5 min
Time it takes to dry the clothes: normal dist., mean 50 min, std dev 10 min

Time it takes to wash and dry the clothes: N(45+50, 5+10)

Then, if we want to subtract the time it takes to dry the clothes again, we get N(45+50-50, 5+10+5) = N(45, 20). This does not make sense: why would there suddenly be more variation when washing the clothes? We use the same washing machine as before. The point is that washing and drying are separate processes in a sequence and they are not mixed together.
 
When the variables are independent, the means add or subtract, just as the variables do. However, when it comes to variability, the variances always add.

If [itex]\sigma_X = 50, \quad \sigma_Y = 100[/itex]

then

[tex]
\begin{align*}
\operatorname{Var}(X + Y) & = 50^2 + 100^2 \\
\operatorname{Var}(X - Y) & = 50^2 + 100^2
\end{align*}
[/tex]

Think about the contradiction you'd have if variances subtracted (as, admittedly, first exposure makes it seem they should). If they subtracted, what would you get for the variance of [itex]X - Y[/itex] in my example?
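The rule above can be checked numerically. The following is only an illustrative sketch, not part of the original reply: it assumes X and Y are independent normals with the standard deviations from the example, picks means of 0 arbitrarily, and uses a made-up helper name (`sample_variance_of`). It also evaluates what a "subtracting" rule would give, 50^2 - 100^2 = -7500, which is negative and therefore cannot be a variance.

```python
import random

SIGMA_X, SIGMA_Y = 50.0, 100.0  # standard deviations from the example above


def sample_variance_of(combine, n=100_000, seed=1):
    """Estimate Var(combine(X, Y)) for independent X ~ N(0, 50), Y ~ N(0, 100)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    values = [combine(rng.gauss(0.0, SIGMA_X), rng.gauss(0.0, SIGMA_Y)) for _ in range(n)]
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


theoretical = SIGMA_X**2 + SIGMA_Y**2    # variances add: 2500 + 10000
if_subtracted = SIGMA_X**2 - SIGMA_Y**2  # negative, so impossible for a variance

var_sum = sample_variance_of(lambda x, y: x + y)
var_diff = sample_variance_of(lambda x, y: x - y)

print(f"theory: {theoretical:.0f}")
print(f"simulated Var(X+Y): {var_sum:.0f}, simulated Var(X-Y): {var_diff:.0f}")
print(f"a subtracting rule would give {if_subtracted:.0f}")
```

Both simulated variances should land close to the same theoretical value, because X - Y is just X + (-Y) and -Y has the same spread as Y.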
 
statdad said:
When the variables are independent, the means add or subtract, just as the variables do. However, when it comes to variability, the variances always add.


Thanks, I know this, but how do you explain my washing machine example then? How come the washing machine suddenly has an increased variance?
 
"thanks I know this."

Apparently not.
"it seems like, in the case when the variables are independent, that the std deviation is ADDED also when subtracting."
It is not the standard deviation that adds, it is the variance. They are not the same.

You repeat the error here.
"I don't understand this. Take this example:
You have a washing machine and a drying machine. The time each takes to do a batch can be represented by a stochastic variable; these are normally distributed and independent of each other.
Time it takes to wash your clothes: normal dist., mean 45 min, std dev 5 min
Time it takes to dry the clothes: normal dist., mean 50 min, std dev 10 min

Time it takes to wash and dry the clothes: N(45+50, 5+10)"

where you again add the standard deviations, which is incorrect. The time to wash and dry clothes would be

[tex] N(45 + 50, \sqrt{5^2 + 10^2})[/tex]

Finally, here
"Then, if we want to subtract the time it takes to dry the clothes again, we get N(45+50-50, 5+10+5) = N(45, 20). This does not make sense: why would there suddenly be more variation when washing the clothes? We use the same washing machine as before. The point is that washing and drying are separate processes in a sequence and they are not mixed together."

you do it again. The time would be

[tex] N(45 + 50 - 50, \sqrt{5^2 + 10^2 + 10^2})[/tex]

Why does the variability increase? Your algebraic step can't be viewed as purely mathematical: it would represent some other operation done to the clothes after the initial washing and drying. You aren't subtracting the physical act of drying. With another activity, the variability in elapsed time will increase.
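The difference between removing the drying that actually happened and subtracting a separate, independent drying time can be made concrete with a small simulation. This is a sketch under stated assumptions only: wash time N(45, 5) and dry time N(50, 10) as in the question, independence between all draws, and illustrative names (`simulate`, `same_run`, `other_run`) that do not come from the thread. With those numbers, subtracting an independent drying time has standard deviation sqrt(5^2 + 10^2 + 10^2) = 15, while removing the same drying time leaves just the washing time with standard deviation 5.

```python
import math
import random


def std(values):
    """Population standard deviation, written out to keep the sketch dependency-free."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def simulate(n=100_000, seed=7):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total, same_run, other_run = [], [], []
    for _ in range(n):
        wash = rng.gauss(45, 5)        # wash time of this load, minutes
        dry = rng.gauss(50, 10)        # dry time of this load
        other_dry = rng.gauss(50, 10)  # dry time of an unrelated, independent load
        total.append(wash + dry)
        same_run.append(wash + dry - dry)         # remove the drying that actually happened
        other_run.append(wash + dry - other_dry)  # subtract an independent variable instead
    return std(total), std(same_run), std(other_run)


sd_total, sd_same, sd_other = simulate()
print(f"wash + dry:                   sd ~ {sd_total:.2f}")
print(f"wash + dry - same dry:        sd ~ {sd_same:.2f}")
print(f"wash + dry - independent dry: sd ~ {sd_other:.2f}")
```

The variance-adding formula describes the last line, not the middle one: the total time and the drying time of the same load are not independent, so the rule for independent variables does not apply to their difference.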
 
statdad said:
Why does the variability increase? Your algebraic step can't be viewed as purely mathematical: it would represent some other operation done to the clothes after the initial washing and drying. You aren't subtracting the physical act of drying. With another activity, the variability in elapsed time will increase.

OK thanks, I understand a bit more. So isn't there a way to calculate the "subtraction of the physical act"?

The reason I am asking is related to the washing machine example.

I'm working in a factory and we have separate processes which each require a certain amount of time (similar to the washing and drying machines). Between the processes there is random waiting time.

Therefore the total production time is equal to the sum of the production times plus the waiting times in between. I see all of these times as stochastic variables.

Now what I want to do is remove the waiting times and physically remove some unnecessary processes through process integration to shorten the lead time, and I intuitively thought that the total time variation would decrease as well.

So what should I tell my boss? That removing waiting time and unnecessary processes will reduce the total lead time but increase the variation in total lead time? Is there a difference between removing the physical processes and removing the waiting times, which are not as "physical"?
 
Lobotomy,

I think what you are calculating then is A + B + C - B = A + C so then you can use the regular formulas for A + C, if you know those.
 
flat man said:
Lobotomy,

I think what you are calculating then is A + B + C - B = A + C so then you can use the regular formulas for A + C, if you know those.

OK, in other words the variance does decrease when removing processes physically? Removing the process with 5^2 variance from the two processes, which together have 5^2+10^2, then gives us 10^2 variance, right? I mean, this sounds perfectly logical to me anyway.

Just for fun: can anyone give an example from the real world, for instance using stochastic variables for time, where the "normal" subtraction rule (with the variances adding) is used? I don't understand how this could relate to anything concrete.
 
Suppose the total time of the process as it currently exists is [itex]N(\mu, \sigma_T)[/itex] (I use [itex]\sigma_T[/itex] to indicate the standard deviation of the Total time).

And suppose you can break the current process down into its component parts (illustrated with 3 just for brevity), each independent of the others, so that the total process can be thought of as Step 1, Step 2, Step 3, and

[tex] \mu = \mu_1 + \mu_2 + \mu_3[/tex]

[tex] \sigma_T^2 = \sigma_1^2 + \sigma_2^2 + \sigma_3^2[/tex]

If you can eliminate step 3, then the mean time for the reduced process is

[tex] \mu_1 + \mu_2[/tex]

and the variance is only

[tex] \sigma_1^2 + \sigma_2^2[/tex]

But you have to have the original process broken into components as shown here in order to determine the amount by which mean time and variability decrease.

Part of the confusion in the earlier give-and-take was a lack of precision in exactly what you wanted and a lack of understanding on my part.
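As a worked illustration of the decomposition above, the sketch below applies it to the four step times from the first post. It assumes the steps are independent and that the second parameter of each N(·,·) is a standard deviation; the function name `combine_independent` is made up for this example.

```python
import math


def combine_independent(steps):
    """Return (mean, std dev) of the sum of independent normal step times.

    `steps` is a list of (mean, std_dev) pairs; means add and variances add.
    """
    mean = sum(m for m, _ in steps)
    variance = sum(s ** 2 for _, s in steps)
    return mean, math.sqrt(variance)


# Step times from the first post, given as (mean, std dev)
steps = {"X": (30, 5), "Y": (30, 3), "Z": (20, 2), "W": (40, 7)}

full_mean, full_sd = combine_independent(list(steps.values()))
reduced_mean, reduced_sd = combine_independent([v for k, v in steps.items() if k != "W"])

print(f"all four steps: N({full_mean}, {full_sd:.2f})")
print(f"without W:      N({reduced_mean}, {reduced_sd:.2f})")
```

Under these assumptions the full sequence has variance 25 + 9 + 4 + 49 = 87, so a standard deviation of about 9.33 rather than 17, and physically removing W leaves variance 38, a standard deviation of about 6.16.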
 
Lobotomy,

Imagine my car rattles for A minutes before it dies, where A ~ N(15,3). Further imagine that it usually dies B minutes after start-up, where B ~ N(45,4). How long after start-up will it start to rattle? C = B - A, so, taking A and B as independent, C ~ N(30,5), since 45 - 15 = 30 and sqrt(3^2 + 4^2) = 5.
 
statdad said:
Suppose the total time of the process as it currently exists is [itex]N(\mu, \sigma_T)[/itex] (I use [itex]\sigma_T[/itex] to indicate the standard deviation of the Total time).



OK yeah, this makes sense. I just got confused by that formula I found.
But even if I don't know the exact variation, I know that each individual physical process has SOME KIND of variation, and removing that process physically will reduce the total variation by THAT AMOUNT of variation. And I guess the same holds true for random waiting times...

How about dependency? Adding and subtracting variation then also involves a covariance term. But I guess that won't change the general conclusion, other than that the amount of variance reduction will be different.
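For dependent variables the standard identity is Var(X ± Y) = Var(X) + Var(Y) ± 2 Cov(X, Y). The sketch below evaluates it; the function names are illustrative and not from the thread. As a check it is applied to the washing example: with T = wash + dry and independent wash and dry times, Cov(T, dry) = Var(dry) = 100, so Var(T - dry) = 125 + 100 - 200 = 25, which is the washing variance again.

```python
def variance_of_sum(var_x, var_y, cov_xy):
    """Var(X + Y) for possibly dependent X and Y."""
    return var_x + var_y + 2 * cov_xy


def variance_of_difference(var_x, var_y, cov_xy):
    """Var(X - Y) for possibly dependent X and Y."""
    return var_x + var_y - 2 * cov_xy


var_wash, var_dry = 5 ** 2, 10 ** 2
var_total = variance_of_sum(var_wash, var_dry, 0)  # independent steps: covariance is 0

# Total time T = wash + dry is correlated with dry: Cov(T, dry) = Var(dry).
var_back_to_wash = variance_of_difference(var_total, var_dry, var_dry)

# Treating T and dry as if they were independent overstates the variance.
var_if_independent = variance_of_difference(var_total, var_dry, 0)

print(var_total, var_back_to_wash, var_if_independent)  # 125 25 225
```

The covariance term is what separates physically removing a step from subtracting an unrelated variable with the same distribution.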
 
