How to get the CDF from a histogram

1. Oct 23, 2012

catalin.drago

Hello,

I have a histogram in which I count the number of times a function takes particular values in the range 0.8 to 2.2.

I would like to get the cumulative distribution function for this set of values. Is it correct to just count the total number of occurrences up to each particular value?

For example, the cdf at 0.9 will be the sum of all the occurrences from 0.8 to 0.9?

Is it correct?

Thank you

2. Oct 23, 2012

Number Nine

That would be a crude way of doing it, yes. There are a variety of techniques (e.g. maximum likelihood) for fitting a distribution to empirical data. Most statistical software (e.g. R, Matlab with the stats toolbox) should support a few different methods.
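As a rough illustration of the maximum likelihood route, here is a minimal Python sketch. It assumes a normal model, which is purely an assumption for the example (nothing in this thread says the data are normal), and the sample values are made up. For a normal distribution, the maximum likelihood estimates are the sample mean and the divide-by-n standard deviation.

```python
from statistics import NormalDist, fmean, pstdev

# Hypothetical raw observations (made up for illustration).
samples = [0.9, 1.1, 1.2, 1.4, 1.5, 1.5, 1.6, 1.8, 2.0, 2.1]

# Assumption: the data come from a normal distribution.
# Under that model the maximum likelihood estimates are the sample mean
# and the population (divide-by-n) standard deviation.
mu_hat = fmean(samples)
sigma_hat = pstdev(samples, mu=mu_hat)

fitted = NormalDist(mu_hat, sigma_hat)


def fitted_cdf(x):
    """Estimated c.d.f. of the fitted normal model at x."""
    return fitted.cdf(x)


for x in (0.9, 1.5, 2.1):
    print(f"estimated F({x}) = {fitted_cdf(x):.3f}")
```

The result is a smooth curve rather than a step function, but it is only as good as the choice of distribution family.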

3. Oct 23, 2012

Stephen Tashi

To mathematicians, the usual scenario is that your data are random samples from some probability distribution (i.e. a c.d.f.). The data are not the same as the c.d.f. (unless your sample happened to come out "perfectly"). When you make the cumulative histogram of the data, it isn't the same thing as the c.d.f., so the preferred term for it would be "the empirical c.d.f." or just "the cumulative histogram".
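To make the term concrete: if the raw samples are available (not just the bin counts), the empirical c.d.f. at x is the fraction of samples less than or equal to x. A minimal Python sketch, with made-up sample values and a hypothetical helper name `ecdf`:

```python
from bisect import bisect_right

# Hypothetical raw observations (made up for illustration).
samples = [1.3, 0.9, 2.1, 1.5, 1.1]
sorted_samples = sorted(samples)


def ecdf(x):
    """Empirical c.d.f.: fraction of samples less than or equal to x."""
    # bisect_right returns how many sorted values are <= x.
    return bisect_right(sorted_samples, x) / len(sorted_samples)


for x in (0.8, 0.9, 1.3, 2.2):
    print(f"ecdf({x}) = {ecdf(x)}")
```

This is a step function that jumps by 1/n at each sample, so a different sample from the same distribution would give a different curve.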

If you are trying to make the cumulative histogram, your method is correct. If you are trying to estimate the underlying c.d.f. of the random variable then, as Number Nine mentions, there may be more sophisticated ways.
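For the cumulative histogram itself, the method from the original question amounts to a running sum of the bin counts divided by the total count. A minimal Python sketch, where the bin edges (width 0.2 over 0.8 to 2.2) and the counts are made up for illustration:

```python
from itertools import accumulate

# Hypothetical bins of width 0.2 covering 0.8 to 2.2, with made-up counts.
bin_upper_edges = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2]
counts = [2, 5, 9, 12, 8, 3, 1]

total = sum(counts)
running = list(accumulate(counts))          # occurrences up to each upper edge
cumulative = [c / total for c in running]   # normalised so the last value is 1

for edge, value in zip(bin_upper_edges, cumulative):
    print(f"cumulative fraction up to {edge:.1f}: {value:.3f}")
```

Note that the values are only pinned down at the bin edges. Anything between two edges has to be interpolated, which is one reason the result depends on the bin width.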