How to get the CDF from a histogram

catalin.drago · Oct 23, 2012

Hello,

I have a histogram in which I count the number of times a function takes particular values in the range 0.8 to 2.2.

I would like to get the cumulative distribution function for the set of values. Is it correct to just count the total number of occurrences up to each particular value?

For example, the cdf at 0.9 will be the sum of all the occurrences from 0.8 to 0.9?

Is it correct?

Thank you

Number Nine · Oct 23, 2012

That would be a crude way of doing it, yes. There are a variety of techniques (e.g. maximum likelihood) for fitting a distribution to empirical data. Most statistical software (e.g. R, Matlab with the stats toolbox) should support a few different methods.
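As a rough illustration of the maximum likelihood idea, here is a minimal Python sketch. It assumes the raw samples are still available (not just the bin counts) and that a normal distribution is a reasonable model, which the thread does not state. The sample values and the helper name `fit_normal_mle` are made up for the example. For a normal model the maximum likelihood estimates are the sample mean and the population (divide-by-n) standard deviation.

```python
from statistics import NormalDist, fmean, pstdev

def fit_normal_mle(samples):
    """Fit a normal distribution by maximum likelihood (hypothetical helper).

    For a normal model the MLE of the mean is the sample mean, and the MLE
    of sigma is the population standard deviation (divide by n, not n - 1).
    """
    mu = fmean(samples)
    sigma = pstdev(samples, mu)
    return NormalDist(mu, sigma)

# Made-up samples in the 0.8 to 2.2 range, for illustration only.
samples = [0.95, 1.10, 1.25, 1.30, 1.42, 1.50, 1.55, 1.63, 1.78, 2.05]

fitted = fit_normal_mle(samples)

# The fitted c.d.f. can be evaluated at any point, not only at bin edges.
for x in (0.9, 1.5, 2.0):
    print(f"fitted cdf at {x:.1f}: {fitted.cdf(x):.3f}")
```

If the data are clearly not bell-shaped, a different family would have to be chosen and fitted instead; the normal model here is only an assumption.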

Stephen Tashi · Oct 23, 2012

catalin.drago said:

I would like to get the cumulative distribution function for the set of values.

To mathematicians, the usual scenario is that your data are random samples from some probability distribution (i.e. a c.d.f.). The data are not the same as the c.d.f. (unless your sample happened to come out "perfectly"). When you make the cumulative histogram of the data, it isn't the same thing as the c.d.f., so the preferred term for it would be "the empirical c.d.f." or just "the cumulative histogram".

If you are trying to make the cumulative histogram, your method is correct. If you are trying to estimate the underlying c.d.f. of the random variable then, as Number Nine mentions, there may be more sophisticated ways.
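The cumulative histogram described above can be sketched in a few lines of Python. The bin edges and counts below are made up for illustration, and the helper name `cumulative_histogram` is hypothetical. The running totals are divided by the total count so that the result ends at 1, as a c.d.f. should.

```python
from itertools import accumulate

def cumulative_histogram(counts):
    """Return the running totals of the bin counts, normalized to end at 1.0."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    return [running / total for running in accumulate(counts)]

# Hypothetical histogram: upper edges of bins covering 0.8 to 2.2,
# with made-up occurrence counts for each bin.
upper_edges = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2]
counts = [2, 5, 9, 12, 7, 4, 1]

ecdf = cumulative_histogram(counts)

# Each value estimates the fraction of observations at or below that bin edge.
for edge, value in zip(upper_edges, ecdf):
    print(f"cumulative fraction up to {edge:.1f}: {value:.3f}")
```

Note that the values are only defined at the bin edges; anything in between depends on how the counts are assumed to be spread within each bin.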
