Time Distribution Histogram for a process with two peaks

SSGD · Dec 12, 2015

I have been studying the processing time of an industrial process. The present analysis just consists of finding the mean value as if the time was distributed normally. I took a sample of data, made a histogram, and realized it is not really normally distributed, although the normal distribution isn't a bad fit. I also realized that there are two peaks in the distribution. I spent some time watching the process and found that an error can occur which can be corrected, but it takes almost twice the time to complete the process if the error occurs. So I broke the data into two sets, one with the error and one without the error. The error-free and error histograms each fit a gamma distribution well (I don't know if this is the best choice), but the error-free process has a mode of about 4 minutes and the process with an error has a mode of about 9 minutes. I also looked at how probable an error is, and it was around 30%.

My question is: Is there a way to recombine these two distributions and the probability of error into a single distribution so I can define a mode for the whole process? I want to change the process to reduce the likelihood of error, but as with everything it is all about $$$. So I have to be able to justify what I want to do.

Dale · Dec 12, 2015

You would just do a weighted average of the two distributions with the weighting determined by the probability of an error. This is called a mixture distribution.

You probably will not be able to find a nice closed form solution for the distribution, so you will need to go to a numerical approximation.
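A minimal numerical sketch of the weighted sum, in Python using only the standard library. The gamma shape and scale values are hypothetical placeholders chosen so that the component modes sit at 4 and 9 minutes; they are not fitted to the data in this thread. Only the 30% error probability comes from the original post.

```python
import math

def gamma_pdf(t, shape, scale):
    """Gamma density with the given shape and scale (zero for t <= 0)."""
    if t <= 0:
        return 0.0
    log_pdf = ((shape - 1) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def mixture_pdf(t, p_error, no_error, error):
    """Weighted sum of the two component densities."""
    return ((1 - p_error) * gamma_pdf(t, *no_error)
            + p_error * gamma_pdf(t, *error))

P_ERROR = 0.3            # from the thread: about 30% of runs hit the error
NO_ERROR = (17.0, 0.25)  # hypothetical (shape, scale); mode = (17 - 1) * 0.25 = 4
ERROR = (37.0, 0.25)     # hypothetical (shape, scale); mode = (37 - 1) * 0.25 = 9

STEP = 0.01
grid = [i * STEP for i in range(1, 3001)]  # 0.01 to 30 minutes
density = [mixture_pdf(t, P_ERROR, NO_ERROR, ERROR) for t in grid]

# Trapezoid rule: the mixture should still integrate to roughly 1.
area = sum(0.5 * (density[i] + density[i + 1]) * STEP
           for i in range(len(grid) - 1))

# Local maxima on the grid approximate the modes of the mixture.
modes = [grid[i] for i in range(1, len(grid) - 1)
         if density[i - 1] < density[i] > density[i + 1]]

print("area:", round(area, 4))
print("modes:", [round(m, 2) for m in modes])
```

With components this well separated, the grid search should find two local maxima close to the component modes. If the fitted components overlap more heavily, the mixture can end up with a single mode and a shoulder instead of two distinct peaks.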

SSGD · Dec 12, 2015

Thanks for the help. This points me in the right direction.

SSGD · Dec 12, 2015

Would this be a weighted inner product of the two distributions?

Dale · Dec 12, 2015

I was just thinking of a weighted sum, not a weighted inner product.

Stephen Tashi · Dec 14, 2015

SSGD said:

My question is: Is there a way to recombine these two distributions and the probability of error into a single distribution so I can define a mode for the whole process?

Did you mean "mode" (of a distribution) or "model" (of a process)?

SSGD · Dec 16, 2015

Sorry, I meant a model for the process. It has two modes. I don't have much experience with mixture distributions. I looked into them, but what I found doesn't define a weighted average. I'm assuming I would want to use the following:

(1-.3)*P(no error,t)+.3*P(error,t). I hope this is the weighted sum

Stephen Tashi · Dec 17, 2015

SSGD said:

(1-.3)*P(no error,t)+.3*P(error,t). I hope this is the weighted sum

Yes, that is the weighted sum.

In your original post, you said:

The present analysis just consists of finding the mean value as if the time was distributed normally.

Why did your procedure for finding the mean value depend on making an assumption about how the data was distributed? Do you do something other than simply compute the sample mean of the data?
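For reference, the weighted sum and the mean it implies can be written out as below. The symbols are notation introduced here rather than taken from the thread: f is a density, mu is a component mean, and p is the error probability (about 0.3 in the original post). The relation for the mean holds whatever shape the two components have, so it does not rely on a normality assumption.

```latex
f(t) = (1 - p)\, f_{\mathrm{no\ error}}(t) + p\, f_{\mathrm{error}}(t)

\mathrm{E}[T] = (1 - p)\, \mu_{\mathrm{no\ error}} + p\, \mu_{\mathrm{error}}
```

This is also why the plain sample mean of the pooled data estimates the mean of the whole process directly: it averages over error and error-free runs in the proportions in which they occurred.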

SSGD · Dec 18, 2015

I should say the present method is to take the average. I'm not saying it isn't good enough; I just think there is more information to be found in the histogram for the process. I just want to find a more precise method. And if it doesn't stick, so be it, but I just want to push my understanding of this process... I have learned that a bimodal histogram might have things like correctable errors, and what a weighted average is.

Stephen Tashi · Dec 20, 2015

SSGD said:

I just think there is more information to be found in the histogram for the process. I just want to find a more precise method. And if it doesn't stick, so be it, but I just want to push my understanding of this process... I have learned that a bimodal histogram might have things like correctable errors, and what a weighted average is.

The place to begin is to understand the many different mathematical interpretations of "precise method". One way to look at Statistics is that it has two main branches: 1) Estimation 2) Hypothesis Testing. In addition to Statistics, there is the mathematical discipline of constructing Probability Models.

As a general rule, my preferred approach to real life problems is to construct Probability Models and implement them as computer simulations. If you have only a few narrow goals that you are trying to accomplish then Statistics can do that. If you can't anticipate all the questions that you'll want to investigate then it's best to construct a simulation.
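A simulation along these lines might look like the sketch below, written in Python with only the standard library. It assumes each run independently hits the error with a fixed probability and that completion times are gamma distributed in both cases. The shape and scale values and the 10% "improved" error rate are hypothetical, chosen only to illustrate comparing the current process against a proposed change.

```python
import random
import statistics

def simulate_times(n, p_error, no_error, error, rng):
    """Draw n completion times: each run hits the error with probability p_error."""
    times = []
    for _ in range(n):
        shape, scale = error if rng.random() < p_error else no_error
        times.append(rng.gammavariate(shape, scale))  # gammavariate(shape, scale)
    return times

NO_ERROR = (17.0, 0.25)  # hypothetical (shape, scale); mode at 4 minutes
ERROR = (37.0, 0.25)     # hypothetical (shape, scale); mode at 9 minutes

rng = random.Random(0)   # fixed seed so the run is repeatable
baseline = simulate_times(100_000, 0.30, NO_ERROR, ERROR, rng)
improved = simulate_times(100_000, 0.10, NO_ERROR, ERROR, rng)  # hypothetical target

for label, times in (("baseline", baseline), ("improved", improved)):
    ordered = sorted(times)
    p95 = ordered[int(0.95 * len(ordered))]  # simple empirical 95th percentile
    print(label,
          "mean:", round(statistics.mean(times), 2),
          "median:", round(statistics.median(times), 2),
          "95th pct:", round(p95, 2))
```

Once the model is in code, questions that were not anticipated up front (time lost per shift, the chance a run exceeds some limit, the payoff of a lower error rate) can be answered by changing a parameter and rerunning, which is the advantage of the simulation route.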
