Time Distribution Histogram for a process with two peaks


Discussion Overview

The discussion revolves around analyzing processing times for an industrial process that exhibits a bimodal distribution, with the aim of combining two distinct distributions—one for error-free processing and one for error-influenced processing—into a single model. Participants explore statistical methods for accurately representing this data, including mixture distributions and weighted averages, while considering the implications for process improvement and cost justification.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant notes that the processing time data is not normally distributed and identifies two distinct peaks corresponding to error-free and error-influenced processes.
  • Another participant suggests using a weighted average of the two distributions based on the probability of an error, referring to this as a mixture distribution.
  • A different participant questions whether the combination of distributions would be a weighted inner product or a weighted sum, leading to clarification that a weighted sum is intended.
  • Concerns are raised about the adequacy of simply computing the sample mean without considering the underlying distribution of the data.
  • One participant expresses a desire to push their understanding of the process and the statistical methods involved, indicating a preference for more precise methods beyond basic averaging.
  • Another participant introduces the idea of constructing probability models and implementing simulations as a way to address real-life problems, contrasting this with traditional statistical methods.

Areas of Agreement / Disagreement

Participants generally agree on the need to combine the two distributions but have differing views on the methods to achieve this. There is no consensus on the best approach or the implications of the statistical methods discussed.

Contextual Notes

Participants express uncertainty regarding the best statistical methods to apply, the definitions of terms like "mode" versus "model," and the adequacy of current averaging techniques. There is also a recognition that the analysis may depend on the specific context of the industrial process being studied.

Who May Find This Useful

Individuals interested in statistical analysis of bimodal distributions, industrial process optimization, and the application of mixture distributions may find this discussion relevant.

SSGD
I have been studying the processing time for an industrial process. The present analysis just consists of finding the mean value as if the time were normally distributed. I took a sample of data, made a histogram, and realized it is not normally distributed at all (although the normal distribution isn't a bad fit). I also noticed the distribution has two peaks. After watching the process for a while, I found that an error can occur which is correctable, but correcting it takes almost twice as long to complete the process. So I broke the data into two sets: one with the error and one without. Both the error-free and the error histograms fit a gamma distribution well (I don't know if this is the best choice), but the error-free process has a mode of about 4 minutes and the process with an error has a mode of about 9 minutes. I also looked at how probable an error is, and it was around 30%.

My question is: Is there a way to recombine these two distributions and the probability of error into a single distribution so I can define a mode for the whole process? I want to change the process to reduce the likelihood of error, but as with everything it is all about $$$, so I have to be able to justify what I want to do.
 
You would just do a weighted average of the two distributions with the weighting determined by the probability of an error. This is called a mixture distribution.

You probably will not be able to find a nice closed form solution for the distribution, so you will need to go to a numerical approximation.
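As a sketch of that numerical approach: the gamma shape/scale values below are hypothetical, chosen only so that each component's mode, (k − 1)·θ, matches the 4- and 9-minute peaks reported above; the mixture's mode is then found by a simple grid scan.

```python
import math

def gamma_pdf(t, k, theta):
    """Gamma density with shape k and scale theta."""
    if t <= 0:
        return 0.0
    return t ** (k - 1) * math.exp(-t / theta) / (math.gamma(k) * theta ** k)

# Hypothetical parameters: shapes are arbitrary, scales chosen so each
# component's mode, (k - 1) * theta, matches the 4 and 9 minutes above.
K_OK, TH_OK = 4.0, 4.0 / 3.0    # error-free component, mode = 4 min
K_ERR, TH_ERR = 4.0, 3.0        # error component, mode = 9 min
P_ERR = 0.3                     # observed error probability

def mixture_pdf(t):
    """Weighted sum of the two densities: (1 - p) * f_ok + p * f_err."""
    return ((1 - P_ERR) * gamma_pdf(t, K_OK, TH_OK)
            + P_ERR * gamma_pdf(t, K_ERR, TH_ERR))

# No convenient closed form for the mixture's mode, so scan a fine grid.
ts = [i * 0.01 for i in range(1, 3000)]
mode = max(ts, key=mixture_pdf)
print(f"mixture mode is roughly {mode:.2f} min")
```

With these made-up parameters the mixture's global peak sits just above 4 minutes: a 30% error weight pulls the peak only slightly toward the 9-minute component, which illustrates why a single "mode for the whole process" can understate a bimodal process.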
 
Thanks for the help. This points me in the right direction.
 
Would this be a weighted inner product of the two distributions?
 
I was just thinking of a weighted sum, not a weighted inner product.
 
SSGD said:
My question is: Is there a way to recombine these two distributions and the probability of error into a single distribution so I can define a mode for the whole process.

Did you mean "mode" (of a distribution) or "model" (of a process) ?
 
Sorry, I meant a model for the process. It has two modes. I don't have much experience with mixture distributions. I looked into them, but what I found doesn't spell out the weighted average. I'm assuming I would want to use the following:

(1-.3)*P(no error,t)+.3*P(error,t). I hope this is the weighted sum
 
SSGD said:
(1-.3)*P(no error,t)+.3*P(error,t). I hope this is the weighted sum

Yes, that is the weighted sum.

In your original post, you said:
The present analysis just consists of finding the mean value as if the time was distributed normally.

Why did your procedure for finding the mean value depend on making an assumption about how the data was distributed? Do you do something other than simply compute the sample mean of the data?
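One reassuring point behind this question: the mean of a mixture is just the weighted average of the component means, so the plain sample mean estimates the overall mean correctly with no normality assumption at all. A quick Monte Carlo sketch (again with hypothetical gamma parameters matching the 4- and 9-minute modes) illustrates this:

```python
import random

random.seed(0)

# Hypothetical gamma components (shape, scale) matching the thread's
# 4- and 9-minute modes; their means are k * theta = 16/3 and 12 minutes.
OK, ERR = (4.0, 4.0 / 3.0), (4.0, 3.0)
P_ERR = 0.3

def draw_time():
    """One processing time: take the error branch with probability 0.3."""
    k, theta = ERR if random.random() < P_ERR else OK
    return random.gammavariate(k, theta)

sample = [draw_time() for _ in range(200_000)]
sample_mean = sum(sample) / len(sample)

# The mixture mean is the weighted average of the component means,
# so the plain sample mean estimates it without any normality assumption.
expected = (1 - P_ERR) * (4.0 * 4.0 / 3.0) + P_ERR * (4.0 * 3.0)
print(f"sample mean {sample_mean:.2f} vs mixture mean {expected:.2f}")
```

So the existing averaging method is not wrong; it simply throws away the bimodal structure that the histogram reveals.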
 
I should say the present method is to take the average. I'm not saying it isn't good enough; I just think there is more information to be found in the histogram for the process. I just want to find a more precise method, and if it doesn't stick, so be it. I want to push my understanding of this process... I have learned that a bimodal histogram might reflect things like correctable errors, and I have learned what a weighted average is.
 
SSGD said:
I just think there is more information to be found in the histogram for the process. I just want to find a more precise method, and if it doesn't stick, so be it. I want to push my understanding of this process... I have learned that a bimodal histogram might reflect things like correctable errors, and I have learned what a weighted average is.

The place to begin is to understand the many different mathematical interpretations of "precise method". One way to look at Statistics is that it has two main branches: 1) Estimation and 2) Hypothesis Testing. In addition to Statistics, there is the mathematical discipline of constructing Probability Models.

As a general rule, my preferred approach to real-life problems is to construct Probability Models and implement them as computer simulations. If you have only a few narrow goals you are trying to accomplish, then Statistics can do that. If you can't anticipate all the questions you'll want to investigate, then it's best to construct a simulation.
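As a sketch of that simulation idea, using hypothetical gamma components matching the thread's 4- and 9-minute modes, one could simulate the mean processing time under different error probabilities to put a number on a proposed improvement:

```python
import random

random.seed(1)

# Hypothetical gamma components (shape, scale) matching the thread's
# 4- and 9-minute modes.
OK, ERR = (4.0, 4.0 / 3.0), (4.0, 3.0)

def simulate_mean(p_err, n=100_000):
    """Simulated mean processing time if the error probability were p_err."""
    total = 0.0
    for _ in range(n):
        k, theta = ERR if random.random() < p_err else OK
        total += random.gammavariate(k, theta)
    return total / n

# What-if analysis: how much time per part does cutting the error rate save?
for p in (0.30, 0.15, 0.0):
    print(f"P(error) = {p:.2f}: mean time about {simulate_mean(p):.2f} min")
```

Multiplying the simulated per-part saving by the annual part volume turns the result into a dollar figure, which is exactly the kind of justification the original poster is after.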
 
