Fitting Bimodal/Unimodal Distributions in MATLAB

  • Context: MATLAB
  • Thread starter: Lobotomy
  • Tags: Distributions

Discussion Overview

The discussion revolves around fitting bimodal and unimodal distributions to a dataset using MATLAB, particularly focusing on the challenges and methods for modeling distributions that may represent worker operation times in a task such as sewing. Participants explore the use of Gaussian mixtures and lognormal distributions, as well as the implications of overlapping distributions.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants inquire about MATLAB functions for fitting bimodal/unimodal distributions, specifically mentioning the gmdistribution function for Gaussian distributions.
  • There is a suggestion that strongly bimodal distributions may arise from overlapping Gaussian populations, which should be disaggregated for accurate modeling.
  • One participant proposes that the sewing operation is lognormally distributed, while the batching process may also follow a lognormal distribution, indicating a desire to describe the entire process with a single distribution for simulation purposes.
  • Another participant suggests that if the mean time for each operation is considered, there may be two distinct unmixed distributions, but acknowledges potential overlap in the total throughput timeline.
  • Participants discuss the complexity of programming a simulation that accounts for both operation and batching times, weighing the benefits of using a single bimodal distribution versus separate distributions for each operation.
  • There is a question about fitting a bimodal lognormal distribution to the measurements, with some participants expressing uncertainty about the appropriate MATLAB commands to achieve this.

Areas of Agreement / Disagreement

Participants express varying opinions on the best approach to fitting distributions to the data, with some favoring the use of gmdistribution for Gaussian mixtures and others advocating for a lognormal approach. The discussion remains unresolved regarding the optimal method for fitting the bimodal distribution.

Contextual Notes

Participants note the importance of understanding the underlying distributions and their parameters, as well as the potential complications arising from overlapping distributions. There is acknowledgment of the need for careful consideration in modeling to avoid oversensitivity to specific data points.

Who May Find This Useful

This discussion may be useful for researchers or practitioners in operations management, statistics, or data analysis who are interested in fitting complex distributions to empirical data, particularly in the context of worker performance and task completion times.

Lobotomy
I have some values that seem to have two modes, and I don't know how to fit a distribution to them in MATLAB. Does MATLAB have any function for fitting bimodal/unimodal distributions?

Edit: it seems the function gmdistribution has something to do with it, but it only covers Gaussian distributions with two modes. Is there a generalized version or, more specifically, a lognormal version of this?
 
SW VandeCarr
As I understand it, gmdistribution is for bivariate Gaussian distributions. A strongly bimodal distribution may result from data drawn from two overlapping Gaussian populations, in which case the populations should be disaggregated and described separately. For this, it's best to go back to the original data.

If the data is naturally strongly bimodal, then it can't (and shouldn't) be normalized. What are you trying to do? You can fit any odd distribution with polynomial regression. However, you have to be careful not to include too many terms in the equation, or the fit will be too sensitive to the specific data to be meaningful.

EDIT: If you want to use gmdistribution, it seems you need the parameters of the individual components of the Gaussian mixture to begin with, but I don't know for sure. I haven't used it.
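
On the question of whether the component parameters are needed up front: a minimal sketch, assuming a release with fitgmdist (Statistics and Machine Learning Toolbox; older releases exposed the same fit as gmdistribution.fit). The data below is synthetic and the numbers are made up purely for illustration. The fit estimates the means, variances, and weights itself, and it accepts a single column of data, so a univariate two-mode sample works.

```matlab
% Minimal sketch: fit a two-component Gaussian mixture to 1-D data.
% Requires the Statistics and Machine Learning Toolbox (fitgmdist).
rng(1);                                   % reproducible synthetic data (illustrative only)
x = [10 + 1.5*randn(400,1);               % hypothetical "fast" population
     30 + 4.0*randn(100,1)];              % hypothetical "slow" population

% One observation per row; an n-by-1 column gives a univariate fit.
% No starting parameters are supplied: the EM routine picks its own.
gm = fitgmdist(x, 2, 'Replicates', 5);

mu    = gm.mu;                            % 2-by-1 component means
sigma = sqrt(squeeze(gm.Sigma));          % Sigma stores variances, so take the square root
w     = gm.ComponentProportion;           % 1-by-2 mixing weights, summing to 1
```

The 'Replicates' option reruns the fit from several starting points and keeps the best one, which helps when the two modes are close together.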
 
Lobotomy
I'm trying to find a distribution that fits a worker operation. A worker does a task, for instance sewing a pocket on a jacket, and after every n-th jacket he batches the finished jackets and moves them. n is not a fixed number, i.e. batch sizes vary a bit.

So there will be one mode at the mean time for sewing one pocket and a smaller mode at the mean time for batching.

The pocket-sewing operation is lognormally distributed, and I think the batching process is as well. I want to describe the entire process with a single distribution, which will be used in a computer simulation.
 
SW VandeCarr
If your parameter is the mean time for each operation, then it seems you have two distinct, unmixed distributions. However, if your timeline is for total throughput, you would probably have some overlap at some point after time zero, in a region where some pockets are being sewn while others are being batched. This seems to be the kind of bivariate mixed Gaussian distribution for which gmdistribution is useful, since you have the parameters for both the sewing and batching distributions.

As for additional modes, you could probably iterate the process forward between the second mode and a third mode, and so on.

Again, I remind you that I haven't actually used this application.
 
Lobotomy
What I measure is the time between outputs. So the times between outputs could be

10 11 12 9 11 11 8 13 34 10 9 etc.

where the 34 is when a batching occurs.

So do you think a better idea is to use one distribution for the operation to simulate its time, and for every n-th element add another time drawn from the batch distribution?

This will be more complicated to program, I guess. The best solution for me would be to simulate the time for the operation and batching as drawn from one bimodal distribution.
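
For comparison, the "separate distributions" variant described above is not much code. The following is only a sketch: every parameter value (the log-scale means and standard deviations, and the batch-size range) is an assumption invented for illustration, not something taken from the measurements.

```matlab
% Sketch of the "one distribution per operation" approach. All numbers are made up.
rng(2);
nItems  = 1000;
muSew   = log(10);  sigmaSew   = 0.15;    % assumed log-scale parameters for sewing
muBatch = log(22);  sigmaBatch = 0.20;    % assumed log-scale parameters for batching
batchLo = 7;  batchHi = 10;               % assumed range for the varying batch size n

t = lognrnd(muSew, sigmaSew, nItems, 1);  % sewing time for every item

% Mark the last item of each batch, drawing a new batch size every time.
isBatch = false(nItems, 1);
i = 0;
while true
    i = i + randi([batchLo batchHi]);     % size of the next batch
    if i > nItems, break; end
    isBatch(i) = true;
end

% Add a batching time to the intervals that end a batch.
t(isBatch) = t(isBatch) + lognrnd(muBatch, sigmaBatch, nnz(isBatch), 1);
```

One design point to keep in mind: in this variant the long intervals are a sum of a sewing time and a batching time, and a sum of two lognormals is not itself exactly lognormal, so the two approaches are close but not identical models.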
 
SW VandeCarr
Well, the latter seemed to be what you were aiming for. Both would provide useful information for an operations manager.
 
Lobotomy
Yes, so how can I fit a bimodal distribution to the above measurements?

I can fit a mixed Gaussian distribution to my measurements with gmdistribution, but this is not really correct since the distributions are not Gaussian. I would assume that a bimodal lognormal distribution would fit better, so the question is: is there a way in MATLAB to fit measurements to a bimodal lognormal distribution?
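
One way to make "bimodal lognormal" precise, assuming it means a two-component lognormal mixture (the weight $w$ and the symbols $\mu_k$, $\sigma_k$ are notation introduced here, not from the thread):

$$
f(x) = w\,\frac{1}{x\,\sigma_1\sqrt{2\pi}}\exp\!\left(-\frac{(\ln x-\mu_1)^2}{2\sigma_1^2}\right)
+ (1-w)\,\frac{1}{x\,\sigma_2\sqrt{2\pi}}\exp\!\left(-\frac{(\ln x-\mu_2)^2}{2\sigma_2^2}\right),
\qquad x>0 .
$$

Substituting $y=\ln x$ and multiplying by the Jacobian $e^{y}$ gives the density of the logged data:

$$
g(y) = f(e^{y})\,e^{y}
= w\,\frac{1}{\sigma_1\sqrt{2\pi}}\exp\!\left(-\frac{(y-\mu_1)^2}{2\sigma_1^2}\right)
+ (1-w)\,\frac{1}{\sigma_2\sqrt{2\pi}}\exp\!\left(-\frac{(y-\mu_2)^2}{2\sigma_2^2}\right).
$$

So under this assumption the logarithms of the measurements follow an ordinary Gaussian mixture with the same $w$, $\mu_k$, and $\sigma_k$, which is what a Gaussian-mixture fit estimates.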
 
SW VandeCarr
It shouldn't be a problem if they're lognormal. You'll be fitting the curve to a log scale on the x axis. All stat packages can do that. You've already got the parameters.
 
Lobotomy
Well, it sounds simple when you say it, but I don't know how to do it (I'm no master at MATLAB either, by the way).

To be more specific: if x is your vector of measurements, which MATLAB commands would you use to fit a distribution and get its parameters?
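
A possible set of commands, offered as a sketch rather than a definitive answer. It assumes the Statistics and Machine Learning Toolbox, treats "bimodal lognormal" as a two-component lognormal mixture, and uses synthetic stand-in data for x (replace it with the real measurements). The idea is the log-scale fit suggested earlier in the thread: fit a Gaussian mixture to log(x) and read the parameters off on the log scale.

```matlab
% Sketch only; x stands in for the measured times between outputs (synthetic here).
rng(3);
x = [lognrnd(log(10), 0.15, 450, 1);      % made-up "sewing only" intervals
     lognrnd(log(34), 0.10,  50, 1)];     % made-up intervals that include batching

% Unimodal case: lognfit returns [mu sigma] of log(x).
p1 = lognfit(x);

% Bimodal case: a lognormal mixture in x is a Gaussian mixture in log(x).
gm    = fitgmdist(log(x), 2, 'Replicates', 5);
mu    = gm.mu;                            % log-scale means of the two components
sigma = sqrt(squeeze(gm.Sigma));          % log-scale standard deviations
w     = gm.ComponentProportion(:);        % mixing weights

% Fitted mixture density on the original scale, e.g. to overlay on a histogram.
mixpdf = @(t) w(1)*lognpdf(t, mu(1), sigma(1)) + w(2)*lognpdf(t, mu(2), sigma(2));

% Draw simulated times from the fitted model by sampling in log space.
tSim = exp(random(gm, 1000));
```

The last line is what a simulation would call: each draw is either a short or a long interval, chosen with the fitted weights, which is the single-distribution behaviour asked for above.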
 
