How to get the CDF from a histogram

  1. Oct 23, 2012 #1
    Hello,

    I have a histogram in which I count how many times a function takes particular values in the range 0.8 to 2.2.

    I would like to get the cumulative distribution function for this set of values. Is it correct to just count the total number of occurrences up to each particular value?

    For example, the cdf at 0.9 will be the sum of all the occurrences from 0.8 to 0.9?

    Is it correct?

    Thank you
     
  3. Oct 23, 2012 #2
    That would be a crude way of doing it, yes. There are a variety of techniques (e.g. maximum likelihood) for fitting a distribution to empirical data. Most statistical software (e.g. R, Matlab with the stats toolbox) should support a few different methods.
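
    As a rough illustration of the maximum likelihood route, here is a minimal sketch in Python using only the standard library. It assumes the raw samples are available (not just the bin counts) and that a normal distribution is a reasonable model; neither assumption comes from the thread, and the sample values are made up. For a normal distribution the maximum likelihood estimates are the sample mean and the standard deviation computed with an n denominator.

```python
from statistics import NormalDist, fmean, pstdev

# Hypothetical raw samples in the range 0.8 to 2.2 (illustrative only).
samples = [1.12, 1.35, 1.48, 1.50, 1.55, 1.61, 1.70, 1.84, 1.93, 2.05]

# Maximum likelihood estimates for a normal model (an assumed model):
# the mean, and the population standard deviation (n denominator).
mu_hat = fmean(samples)
sigma_hat = pstdev(samples, mu_hat)

# The fitted distribution gives a smooth estimate of the c.d.f.
fitted = NormalDist(mu_hat, sigma_hat)
for x in (0.9, 1.5, 2.2):
    print(f"estimated F({x}) = {fitted.cdf(x):.3f}")
```

    If only the histogram is available, a fit would have to work from the binned counts instead, and a different model may suit the data better.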
     
  4. Oct 23, 2012 #3

    Stephen Tashi

    Science Advisor

    To mathematicians, the usual scenario is that your data are random samples from some probability distribution (i.e. one described by a c.d.f.). The data are not the same as the c.d.f. (unless your sample happened to come out "perfectly"). When you make the cumulative histogram of the data, it isn't the same thing as the c.d.f., so the preferred term for it would be "the empirical c.d.f." or just "the cumulative histogram".

    If you are trying to make the cumulative histogram, your method is correct. If you are trying to estimate the underlying c.d.f. of the random variable then, as Number Nine mentions, there may be more sophisticated ways.
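
    The cumulative histogram described in the question can be sketched in a few lines of Python. The bin edges and counts below are hypothetical (14 bins of width 0.1 covering 0.8 to 2.2); dividing the running total by the overall count turns it into an empirical c.d.f. that ends at 1.

```python
from itertools import accumulate

def cumulative_histogram(counts):
    """Return the fraction of observations up to the end of each bin."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    return [running / total for running in accumulate(counts)]

# Hypothetical histogram: 14 bins of width 0.1 from 0.8 to 2.2.
bin_edges = [round(0.8 + 0.1 * i, 1) for i in range(15)]
counts = [2, 5, 9, 14, 20, 18, 12, 8, 5, 3, 2, 1, 1, 0]

ecdf = cumulative_histogram(counts)

# Each value is the empirical c.d.f. at the right edge of its bin,
# so the first entry is the share of occurrences between 0.8 and 0.9.
for right_edge, value in zip(bin_edges[1:], ecdf):
    print(f"F({right_edge:.1f}) = {value:.2f}")
```

    Leaving out the division by the total gives the raw cumulative counts instead, which is the sum the original question describes.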
     