
Distribution of data - alternative presentation

  1. Apr 29, 2013 #1
    When we draw the distribution of data in the usual way, we have to be careful to choose appropriate classes, for instance,
    1-2, 4
    2-3, 6
    3-4, 11
    etc.
    But if we draw a cumulative distribution, classes are not necessary. For instance,
    1-2, 4
    2-3, 10
    3-4, 21
    and still better:
    1, 1
    1.3, 2
    1.4, 3
    1.9, 4
    2.1, 5
    etc
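
    As a minimal sketch of the two tables above: the cumulative column is a running sum of the class counts, and the class-free version pairs each sorted value with its rank. The variable names are illustrative only, and the ranking assumes the values are distinct.

```python
from itertools import accumulate

# Class counts from the first table (classes 1-2, 2-3, 3-4).
class_counts = [4, 6, 11]
# A running sum turns the class counts into cumulative counts.
cumulative_counts = list(accumulate(class_counts))

# Raw values from the last table; no classes are needed here.
values = [1.0, 1.3, 1.4, 1.9, 2.1]
# Pair each sorted value with its rank (1-based position in sorted order).
ranked = [(x, rank) for rank, x in enumerate(sorted(values), start=1)]

print(cumulative_counts)
print(ranked)
```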

    Is there a good way to smooth this cumulative curve and then recover (derive) the non-cumulative distribution from it?
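
    One widely used possibility, shown here only as a sketch: replace each step of the cumulative curve with a Gaussian CDF centred on the data point. The derivative of that smooth curve is then a Gaussian kernel density estimate. The bandwidth `h = 0.3` is an arbitrary assumption, and the function names are made up for illustration.

```python
import math

def smooth_cdf(x, data, h):
    """Smoothed cumulative curve: the average of normal CDFs centred on each data point."""
    return sum(0.5 * (1.0 + math.erf((x - xi) / (h * math.sqrt(2.0))))
               for xi in data) / len(data)

def smooth_density(x, data, h):
    """Derivative of smooth_cdf with respect to x (a Gaussian kernel density estimate)."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm

data = [1.0, 1.3, 1.4, 1.9, 2.1]  # values from the post
h = 0.3                            # bandwidth: an assumed value, not from the post

for x in (1.0, 1.5, 2.0):
    print(x, round(smooth_cdf(x, data, h), 3), round(smooth_density(x, data, h), 3))
```

    A larger `h` gives a smoother but flatter density; choosing it is exactly the kind of "goodness" criterion that has to be defined separately.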
     
    Last edited: Apr 29, 2013
  3. Apr 30, 2013 #2

    Stephen Tashi

    Science Advisor

    There are many methods of smoothing, but whether they are "good" is not a precise mathematical question until you define how "goodness" will be measured. To have a precise mathematical question, you also need to provide some probability model for how the data is generated.

    A simple way is to pick a family of distributions (such as a Poisson), estimate the value of the parameters of that distribution and use the mathematical formula for it to estimate both the cumulative distribution and the density.
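
    A minimal sketch of this parametric route, assuming count data and the Poisson family. The sample `counts` is hypothetical. For a Poisson model the maximum-likelihood estimate of the rate is the sample mean, and the "density" is strictly a probability mass function.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing exactly k events under a Poisson(lam) model."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_cdf(k, lam):
    """Probability of observing k or fewer events."""
    return sum(poisson_pmf(j, lam) for j in range(k + 1))

counts = [2, 3, 1, 4, 2, 3, 5, 2]      # hypothetical observations
lam_hat = sum(counts) / len(counts)    # maximum-likelihood estimate of the rate

for k in range(6):
    print(k, round(poisson_pmf(k, lam_hat), 4), round(poisson_cdf(k, lam_hat), 4))
```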

    What you have in mind may be "interpolation" - i.e. perhaps you want a function that exactly matches the cumulative at every point of the data, but gives a smooth representation of the density. If that's your goal, you should express this thought.
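
    As a rough illustration of the interpolation idea, the sketch below joins the cumulative points (x_i, i/n) with straight lines, so the curve matches the cumulative value at every data point. This is the simplest choice, not the smoothest: the resulting density is piecewise constant, and a genuinely smooth density would need something like a monotone spline. The function name is hypothetical and the values are assumed to be distinct.

```python
from bisect import bisect_right

def make_interpolant(data):
    """Return (cdf, density) for a piecewise-linear interpolation of the cumulative points."""
    xs = sorted(data)                       # assumes distinct values
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]    # cumulative fraction at each data point

    def cdf(x):
        if x < xs[0]:
            return 0.0
        if x >= xs[-1]:
            return 1.0
        i = bisect_right(xs, x) - 1         # xs[i] <= x < xs[i + 1]
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    def density(x):
        # Slope of the interpolated cumulative curve; zero outside the data range.
        if x < xs[0] or x >= xs[-1]:
            return 0.0
        i = bisect_right(xs, x) - 1
        return (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])

    return cdf, density

cdf, density = make_interpolant([1.0, 1.3, 1.4, 1.9, 2.1])
print(cdf(1.3), cdf(1.65), density(1.65))
```

    Note that the curve still jumps from 0 to 1/n at the smallest value, so that first point keeps its own probability mass.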
     
  4. May 1, 2013 #3
    Interpolation is not necessary. I know there are different distributions, for instance Gaussian or Poisson. I was only asking whether the method I mentioned, or something similar, is in general use.
     
    Last edited: May 1, 2013
  5. May 1, 2013 #4

    Stephen Tashi

    Science Advisor

    I don't see that you mentioned a particular method. You only described the goal of getting a smooth representation of the density.
     
  6. May 2, 2013 #5
    Yes, I do not know how to present my question clearly.
    Maybe this:
    The cumulative distribution is universal: it does not depend on classes or intervals, whereas ordinary distributions do. On the other hand, the cumulative distribution is too abstract.
    So I suspect it is possible to present an ordinary distribution without intervals, as an aid to visualizing the cumulative distribution. Probably something like this already exists?
     
  7. May 2, 2013 #6

    Stephen Tashi

    Science Advisor

    If we are talking about the cumulative histogram of independent samples of a random variable, this does depend on the interval that gives the precision of the measurement. For example, if we measure to the nearest kg, we get a different representation than if we measure to the nearest gram. All real measurements of continuous random variables have limited precision.

    The only thing that prevents you from making an "exact" representation of what you call the "common distribution" is that you want a histogram. A histogram, by definition, uses intervals to classify the data. You could plot the data without using intervals. At each data point you could draw a vertical line of height k/n, where n is the number of data points and k is the number of times the value occurs, which will usually be 1.
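
    The interval-free plot described here could be computed roughly as follows. The sample is hypothetical (one value is repeated to show k = 2) and the function name is illustrative.

```python
from collections import Counter

def spike_heights(data):
    """Map each distinct value to k/n, where k is how often it occurs and n is the sample size."""
    n = len(data)
    return {x: k / n for x, k in sorted(Counter(data).items())}

sample = [1.0, 1.3, 1.3, 1.9, 2.1]   # hypothetical sample with one repeated value
heights = spike_heights(sample)

for x, height in heights.items():
    print(x, height)
```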

    If we are talking about the cumulative distribution of a continuous random variable, both the cumulative distribution and the probability density are smooth curves and neither depends on interval sizes.
     
  8. May 2, 2013 #7
    Yes, histogram is the correct word.

    "You could plot the data without using intervals. At each data point you could draw a vertical line of height k/n, where n is the number of data points and k is the number of times the value occurs, which will usually be 1."

    Yes, you could draw it that way, but if the values are continuous, for instance 1.234455, 1.345555, you get only vertical lines of height 1/N, or zeros. That does not look good. The cumulative histogram, however, has a nice shape even if you do not have intervals.
     
    Last edited: May 2, 2013