Time Distribution Histogram for a process with two peaks

1. Dec 12, 2015

SSGD

I have been studying the processing time of an industrial process. The present analysis just consists of finding the mean value as if the time was distributed normally. I took a sample of data, made a histogram, and realized it is not normally distributed, although the normal distribution isn't a bad fit. I also realized that the distribution has two peaks. I spent some time watching the process and found that an error can occur which can be corrected, but the process takes almost twice as long to complete when it does. So I broke the data into two sets, one without the error and one with the error. The error-free and error histograms each fit a gamma distribution well (I don't know if this is the best choice), but the error-free process has a mode of about 4 minutes and the process with an error has a mode of about 9 minutes. I also looked at how probable an error is, and it was around 30%.

My question is: Is there a way to recombine these two distributions and the probability of error into a single distribution so I can define a mode for the whole process? I want to change the process to reduce the likelihood of error, but as with everything it is all about \$, so I have to be able to justify what I want to do.

2. Dec 12, 2015

Staff: Mentor

You would just do a weighted average of the two distributions with the weighting determined by the probability of an error. This is called a mixture distribution.

You probably will not be able to find a nice closed form solution for the distribution, so you will need to go to a numerical approximation.
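As a minimal sketch of that numerical approach: the thread gives only the two modes (about 4 and 9 minutes) and the 30% error rate, so the gamma shape and scale values below are hypothetical, chosen only so that each component's mode, (shape - 1) * scale, matches those figures. The helper names are also made up for illustration.

```python
import math

def gamma_pdf(t, shape, scale):
    """Gamma density with the given shape and scale (zero for t <= 0)."""
    if t <= 0:
        return 0.0
    log_pdf = ((shape - 1) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

# ASSUMED parameters (not from the data): mode = (shape - 1) * scale
NO_ERROR = (17.0, 0.25)  # mode 4 minutes
ERROR = (37.0, 0.25)     # mode 9 minutes
P_ERROR = 0.3            # error probability quoted in the thread

def mixture_pdf(t, p_error=P_ERROR):
    """Weighted sum of the two component densities."""
    return ((1 - p_error) * gamma_pdf(t, *NO_ERROR)
            + p_error * gamma_pdf(t, *ERROR))

def local_modes(pdf, t_max=30.0, step=0.01):
    """Grid search: return every grid point that is a local maximum of pdf."""
    ts = [i * step for i in range(1, int(t_max / step))]
    ys = [pdf(t) for t in ts]
    return [ts[i] for i in range(1, len(ys) - 1)
            if ys[i - 1] < ys[i] >= ys[i + 1]]

modes = local_modes(mixture_pdf)
print("local modes (minutes):", modes)
```

With these assumed parameters the mixture keeps two separate local modes, close to those of the components. Wider components can merge into a single peak with a shoulder, so the result depends on the fitted shapes, not only on the modes.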

3. Dec 12, 2015

SSGD

Thanks for the help. This points me in the right direction.

4. Dec 12, 2015

SSGD

Would this be a weighted inner product of the two distributions?

5. Dec 12, 2015

Staff: Mentor

I was just thinking of a weighted sum, not a weighted inner product.

6. Dec 14, 2015

Stephen Tashi

Did you mean "mode" (of a distribution) or "model" (of a process)?

7. Dec 16, 2015

SSGD

Sorry, I meant a model for the process. It has two modes. I don't have much experience with mixture distributions. I looked into them, but what I found doesn't define a weighted average. I'm assuming I would want to use the following:

(1-.3)*P(no error,t)+.3*P(error,t). I hope this is the weighted sum.
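Written out in general form, the same weighted sum looks like this. The symbols are notation introduced here, not from the thread: p is the error probability (0.3 above), f_0 and f_1 are the densities of the error-free and error cases, and mu_i, sigma_i are their means and standard deviations. The mean and variance lines are the standard results for a two-component mixture.

```latex
\begin{align}
f(t) &= (1 - p)\, f_0(t) + p\, f_1(t) \\
\mathrm{E}[T] &= (1 - p)\, \mu_0 + p\, \mu_1 \\
\mathrm{Var}[T] &= (1 - p)\left(\sigma_0^2 + \mu_0^2\right)
                 + p\left(\sigma_1^2 + \mu_1^2\right) - \mathrm{E}[T]^2
\end{align}
```

The mean of the whole process is therefore linear in p, which is one way to put a number on what a lower error rate would be worth.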

8. Dec 17, 2015

Stephen Tashi

Yes, that is the weighted sum.

In your original post, you said:
"The present analysis just consists of finding the mean value as if the time was distributed normally."

Why did your procedure for finding the mean value depend on making an assumption about how the data was distributed? Do you do something other than simply compute the sample mean of the data?

9. Dec 18, 2015

SSGD

I should say the present method is to take the average. I'm not saying it isn't good enough; I just think there is more information to be found in the histogram for the process. I just want to find a more precise method. And if it doesn't stick, so be it, but I want to push my understanding of this process... I have learned that a bimodal histogram might point to things like correctable errors, and what a weighted average is.

10. Dec 20, 2015

Stephen Tashi

The place to begin is to understand the many different mathematical interpretations of "precise method". One way to look at Statistics is that it has two main branches: 1) Estimation 2) Hypothesis Testing. In addition to Statistics, there is the mathematical discipline of constructing Probability Models.

As a general rule, my preferred approach to real life problems is to construct Probability Models and implement them as computer simulations. If you have only a few narrow goals that you are trying to accomplish then Statistics can do that. If you can't anticipate all the questions that you'll want to investigate then it's best to construct a simulation.
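A rough sketch of what such a simulation might look like for the process in this thread. Everything numeric except the 30% error rate is an assumption: the gamma shape/scale pairs are hypothetical values whose modes are 4 and 9 minutes, and the "improved" error rate of 10% is made up purely to show the comparison.

```python
import random
import statistics

def simulate_times(n, p_error, rng,
                   no_error=(17.0, 0.25), error=(37.0, 0.25)):
    """Draw n process times: each run hits the error with probability p_error.

    The (shape, scale) defaults are ASSUMED values, not fitted to real data.
    """
    times = []
    for _ in range(n):
        shape, scale = error if rng.random() < p_error else no_error
        times.append(rng.gammavariate(shape, scale))
    return times

rng = random.Random(0)  # fixed seed so the run is repeatable

current = simulate_times(100_000, 0.3, rng)   # error rate from the thread
improved = simulate_times(100_000, 0.1, rng)  # hypothetical reduced rate

mean_current = statistics.fmean(current)
mean_improved = statistics.fmean(improved)
saving = mean_current - mean_improved

print(f"mean time at 30% errors: {mean_current:.2f} min")
print(f"mean time at 10% errors: {mean_improved:.2f} min")
print(f"estimated saving per run: {saving:.2f} min")
```

Once the simulated times exist, other questions (percentiles, the fraction of runs over some limit, the effect of a different error rate) can be answered from the same samples without deriving anything new.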