Distribution of data - alternative presentation

In summary, the thread discusses how to get a smooth representation of the density from a cumulative distribution, without first sorting the data into histogram classes.
  • #1
exponent137
When we draw a distribution of data in the usual way, we must be careful to choose appropriate classes, for instance:
1-2, 4
2-3, 6
3-4, 11
etc.
But if we draw a cumulative distribution, classes are not necessary. For instance:
1-2, 4
2-3, 10
3-4, 21
and still better:
1, 1
1.3, 2
1.4, 3
1.9, 4
2.1, 5
etc

Is there any good way to smooth this cumulative curve and then calculate (by differentiation) the non-cumulative distribution from it again?
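The two cumulative presentations in this post can be sketched in a few lines of Python. This is only an illustration: the function names `cumulative_counts` and `ecdf_points` are made up for this example, and the second list of values is the one quoted in the post, shuffled to show that sorting is the only preparation needed.

```python
from itertools import accumulate


def cumulative_counts(class_counts):
    """Running totals of per-class counts (this still depends on the classes)."""
    return list(accumulate(class_counts))


def ecdf_points(samples):
    """Class-free cumulative curve: each sorted value paired with its running count."""
    return [(x, rank) for rank, x in enumerate(sorted(samples), start=1)]


# Per-class counts from the post, for the classes 1-2, 2-3, 3-4.
print(cumulative_counts([4, 6, 11]))

# Raw values from the post, deliberately unsorted here.
print(ecdf_points([1.9, 1, 1.4, 2.1, 1.3]))
```

Dividing each running count by the number of samples turns the second list into a cumulative fraction between 0 and 1.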
 
  • #2
exponent137 said:
Is there any good way to smooth this cumulative curve and then calculate (by differentiation) the non-cumulative distribution from it again?

There are many methods of smoothing, but whether they are "good" is not a precise mathematical question until you define how "goodness" will be measured. To have a precise mathematical question, you also need to provide some probability model for how the data is generated.

A simple way is to pick a family of distributions (such as a Poisson), estimate the value of the parameters of that distribution and use the mathematical formula for it to estimate both the cumulative distribution and the density.
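A minimal sketch of this parametric approach, assuming the Poisson family named above and count-valued data. The sample `counts` is hypothetical (it is not from the thread), and the helper names are invented for the example. Because the Poisson distribution is discrete, the "density" here is a probability mass function.

```python
import math


def fit_poisson(counts):
    """Maximum-likelihood estimate of the Poisson rate: the sample mean."""
    return sum(counts) / len(counts)


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_cdf(k, lam):
    """P(X <= k), obtained by summing the mass function."""
    return sum(poisson_pmf(j, lam) for j in range(k + 1))


# Hypothetical count data, not taken from the thread.
counts = [2, 3, 1, 4, 2, 0, 3, 2, 5, 2]
lam = fit_poisson(counts)
for k in range(6):
    print(k, round(poisson_pmf(k, lam), 4), round(poisson_cdf(k, lam), 4))
```

Both curves come from the single fitted parameter, so they are automatically consistent with each other; how well they describe the data depends entirely on whether the chosen family is appropriate.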

What you have in mind may be "interpolation" - i.e. perhaps you want a function that exactly matches the cumulative at every point of the data, but gives a smooth representation of the density. If that's your goal, you should express this thought.
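One way to realize such an interpolation is a monotone cubic Hermite curve through the cumulative points, differentiated analytically to give a density. The sketch below uses harmonic-mean tangents, which is just one of several monotone schemes and is chosen here for brevity. It assumes strictly increasing data values (no ties) and is only meant to be evaluated between the smallest and largest value; all function names are hypothetical.

```python
import bisect


def harmonic_tangents(xs, ys):
    """Tangents at the knots: end secants at the ends, harmonic means inside."""
    d = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]
    m = [d[0]]
    for a, b in zip(d, d[1:]):
        # A flat or sign-changing neighbourhood gets a zero tangent.
        m.append(2 * a * b / (a + b) if a * b > 0 else 0.0)
    m.append(d[-1])
    return m


def make_interpolant(xs, ys):
    """Return (cdf, density) for a monotone cubic Hermite curve through the points."""
    m = harmonic_tangents(xs, ys)

    def locate(x):
        # Index of the interval containing x, clamped to the data range.
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
        h = xs[i + 1] - xs[i]
        return i, h, (x - xs[i]) / h

    def cdf(x):
        i, h, t = locate(x)
        return ((2 * t**3 - 3 * t**2 + 1) * ys[i]
                + (t**3 - 2 * t**2 + t) * h * m[i]
                + (-2 * t**3 + 3 * t**2) * ys[i + 1]
                + (t**3 - t**2) * h * m[i + 1])

    def density(x):
        # Derivative of the cubic above with respect to x.
        i, h, t = locate(x)
        return (6 * t * (1 - t) * (ys[i + 1] - ys[i]) / h
                + (3 * t**2 - 4 * t + 1) * m[i]
                + (3 * t**2 - 2 * t) * m[i + 1])

    return cdf, density


# Values from the first post, with cumulative fractions k/n as heights.
xs = [1, 1.3, 1.4, 1.9, 2.1]
ys = [k / len(xs) for k in range(1, len(xs) + 1)]
cdf, density = make_interpolant(xs, ys)
print(cdf(1.4), density(1.4))
```

The curve passes through every cumulative point and its derivative is continuous, but the resulting density follows every irregularity in the spacing of the data, which is why exact interpolation and smoothing are different goals.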
 
  • #3
Interpolation is not necessary. I know that there are different distributions, for instance Gaussian or Poisson. I was only asking whether the method I mentioned, or something similar, is in general use.
 
  • #4
exponent137 said:
I was only asking whether the method I mentioned, or something similar, is in general use.

I don't see that you mentioned a particular method. You only described the goal of getting a smooth representation of the density.
 
  • #5
Stephen Tashi said:
I don't see that you mentioned a particular method. You only described the goal of getting a smooth representation of the density.
Yes, I do not know how to clearly present my question.
Maybe this:
The cumulative distribution is universal: it does not depend on classes or intervals, whereas the common (non-cumulative) distribution does. On the other hand, the cumulative distribution is too abstract.
So I suspect that it is possible to present the common distribution without intervals, as an aid to visualizing the cumulative distribution. Something like this probably exists already?
 
  • #6
exponent137 said:
The cumulative distribution is universal: it does not depend on classes or intervals

If we are talking about the cumulative histogram of independent samples of a random variable, this does depend on the interval that gives the precision of the measurement. For example, if we measure to the nearest kg, we get a different representation than if we measure to the nearest gram. All real measurements of continuous random variables have limited precision.

The only thing that prevents you from making an "exact" representation of what you call the "common distribution" is that you want a histogram. A histogram, by definition, uses intervals to classify the data. You could plot the data without using intervals. At each data point you could draw a vertical line of height k/n, where n is the number of data points and k is the number of times the value occurs, which will usually be 1.
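The interval-free plot described here amounts to computing k/n for each distinct value. A minimal sketch, with hypothetical measurements and a crude text rendering in place of a real plot (the name `spike_heights` is invented for the example):

```python
from collections import Counter


def spike_heights(samples):
    """Map each distinct value to k/n, its relative frequency in the sample."""
    n = len(samples)
    return {x: k / n for x, k in sorted(Counter(samples).items())}


# Hypothetical measurements; 1.3 occurs twice, so its spike is 2/4.
for x, height in spike_heights([1.0, 1.3, 1.3, 2.0]).items():
    print(f"{x:>5}: " + "|" * round(height * 20))
```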

If we are talking about the cumulative distribution of a continuous random variable, both the cumulative distribution and the probability density are smooth curves and neither depends on interval sizes.
 
  • #7
Yes, histogram is the correct word.

You could plot the data without using intervals. At each data point you could draw a vertical line of height k/n, where n is the number of data points and k is the number of times the value occurs, which will usually be 1.

Yes, you could draw it that way, but if the values are continuous, for instance 1.234455, 1.345555, you get only vertical lines of height 1/N, or zeros. That is not visually informative. But the cumulative histogram has a fine shape even if you do not have intervals.
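One commonly used technique that replaces the 1/N spikes with a smooth curve, and that the thread does not name, is kernel density estimation: each data point contributes a small bell curve instead of a vertical line, and the sum of their integrals is a smoothed cumulative curve. The sketch below assumes a Gaussian kernel and an arbitrary bandwidth of 0.3; both are illustrative choices, and `make_kde` is a hypothetical helper.

```python
from statistics import NormalDist


def make_kde(samples, bandwidth):
    """Gaussian kernel estimate: returns (density, cdf) built from the raw samples."""
    kernels = [NormalDist(mu=x, sigma=bandwidth) for x in samples]
    n = len(kernels)

    def density(x):
        # Average of one bell curve per data point; no classes involved.
        return sum(k.pdf(x) for k in kernels) / n

    def cdf(x):
        # Integral of the density above, i.e. a smoothed cumulative curve.
        return sum(k.cdf(x) for k in kernels) / n

    return density, cdf


# Values from the first post; the bandwidth 0.3 is an arbitrary illustrative choice.
density, cdf = make_kde([1, 1.3, 1.4, 1.9, 2.1], bandwidth=0.3)
for i in range(0, 31, 5):
    x = i / 10
    print(f"x={x:.1f}  density={density(x):.3f}  cdf={cdf(x):.3f}")
```

The bandwidth plays a role similar to the class width of a histogram: no intervals are needed, but a smoothing scale still has to be chosen.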
 

1. What is the purpose of alternative presentation of data distribution?

The purpose of alternative presentation of data distribution is to provide a different way of visualizing and analyzing data, aside from the traditional methods such as histograms and box plots. This allows for a more comprehensive understanding of the data and can reveal patterns and insights that may not be apparent in traditional presentations.

2. What are some examples of alternative presentation methods for data distribution?

Some examples of alternative presentation methods for data distribution include dot plots, stem-and-leaf plots, and violin plots. These methods use different visual representations such as points, lines, or shapes to display the distribution of data.

3. How do alternative presentation methods affect the interpretation of data distribution?

Alternative presentation methods can provide a more detailed and nuanced understanding of data distribution. For example, a violin plot can show the shape of the distribution more clearly than a histogram, and a dot plot can highlight individual data points that may be outliers or have a significant impact on the overall distribution.

4. What are the advantages of using alternative presentation methods for data distribution?

One advantage of using alternative presentation methods is that they can cater to different types of data and distributions. For example, a box plot hides features such as multiple modes, whereas a violin plot displays the full shape of the distribution. Additionally, alternative methods can be more visually appealing and engaging for data analysis.

5. What should be considered when choosing an alternative presentation method for data distribution?

When choosing an alternative presentation method, it is essential to consider the type of data, the distribution, and the audience. Some methods may be more suitable for certain types of data or distributions, while others may be more intuitive for the audience to interpret. It is also essential to ensure that the chosen method accurately represents the data without distorting or misrepresenting it.
