Choosing a Probability Distribution for Visualizing Discrete Data Sets

In summary, the original poster asks how to visualize a discrete data set probabilistically. They describe themselves as weak in probability and ask for a good starting point. The thread discusses using a histogram and a CDF plot, and how to choose a suitable function when plotting a discrete pdf. The poster then asks whether the data can be normalized before making the histogram, and whether that puts probabilities on the y-axis. The reply explains that it can: dividing each frequency by the total number of measurements gives an estimated probability for each value, and summing those within a bin gives the estimated probability of the bin.
  • #1
Somefantastik
I have a discrete set of data. I'd like to visualize it probabilistically. Unfortunately, I focused in Num Methods in grad school and am very weak in Probability. Where is a good place to start to visualize this data set using a discrete pdf?

I know a histogram is good for showing the number of occurrences of each outcome. I also know a cdf plot shows the probability of the outcome being less than or equal to some number. But when I start looking at plotting pdf's, there are many functions to choose from and I'm not sure how to go about choosing one, or how to translate that to a discrete data set rather than a continuous one.
 
  • #2
Okay, so I know that I should make a histogram, normalize it, and then fit a curve to the resulting distribution.

Now my question is, can I normalize my data before making a histogram, and will that process give me the probabilities on the y-axis?
 
  • #3
Yes, you can normalize before making a histogram. Suppose, for instance, that you have N measurements, which come as n distinct values x_1, x_2, ..., x_n with frequencies f_1, f_2, ..., f_n. The frequencies are positive integers that add up to N. If you divide each frequency by N, you now have (estimated) probabilities p_i for each x_i that add up to 1. When you make your histogram you'll be binning the x_i, and you get the probability of that bin by adding up all the p_i that go in it. That's the estimated probability of falling into that bin.
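The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values, the bin edges, and the function name `bin_probabilities` are made up for the example, and bins are taken as half-open intervals with the last one closed on the right.

```python
from collections import Counter

def bin_probabilities(data, edges):
    """Return (p, bins): per-value probabilities p_i and per-bin probabilities."""
    n_total = len(data)                       # N, the number of measurements
    # f_i / N: estimated probability p_i of each distinct value x_i
    p = {x: f / n_total for x, f in Counter(data).items()}
    bins = [0.0] * (len(edges) - 1)
    for x, p_i in p.items():
        for k in range(len(bins)):
            # bins are [edges[k], edges[k+1]); the last bin also includes its right edge
            last = k == len(bins) - 1
            if edges[k] <= x < edges[k + 1] or (last and x == edges[-1]):
                bins[k] += p_i                # add up the p_i that land in this bin
                break
    return p, bins

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 7]         # hypothetical sample, N = 10
p, bins = bin_probabilities(data, edges=[1, 3, 5, 7])
print(p)     # estimated probability of each distinct value
print(bins)  # estimated probability of each bin; these sum to 1 up to rounding
```

Normalizing first and binning afterwards gives the same bin heights as binning the raw counts and dividing by N afterwards, since both are just sums of f_i / N.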
 

What is a PDF?

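Strictly speaking, a PDF (probability density function) describes the distribution of a continuous random variable. For a discrete data set the corresponding object is the probability mass function (PMF), often loosely called a "discrete pdf", which assigns a probability to each possible value. In both cases the function shows the relative likelihood of the possible outcomes.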

How do you calculate a PDF?

To estimate the distribution of a discrete data set, divide the number of occurrences of each distinct value by the total number of data points. This gives the estimated probability of each value (the empirical PMF); the probabilities sum to 1 and can be plotted against the values to show the distribution.
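As a rough sketch of that calculation, the snippet below counts occurrences, divides by the total, and also accumulates the probabilities into an empirical CDF. The data and the helper name `empirical_pmf` are invented for illustration.

```python
from collections import Counter
from itertools import accumulate

def empirical_pmf(data):
    """Return the sorted distinct values and their estimated probabilities."""
    n = len(data)
    counts = Counter(data)
    values = sorted(counts)
    probs = [counts[v] / n for v in values]   # occurrences / total data points
    return values, probs

values, probs = empirical_pmf([0, 1, 1, 2, 2, 2, 2, 5])  # hypothetical data
cdf = list(accumulate(probs))                 # running sum gives P(X <= value)

print(values)  # [0, 1, 2, 5]
print(probs)   # [0.125, 0.25, 0.5, 0.125]
print(cdf)     # [0.125, 0.375, 0.875, 1.0]
```

The `values` and `probs` pairs are what one would pass to a bar or stem plot, and `values` against `cdf` gives a step plot of the cumulative distribution.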

What is the difference between a PDF and a histogram?

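Both describe the distribution of a data set, but they differ in what the vertical axis means. A raw histogram shows the frequency (count) of the values falling in each bin. A PMF shows the probability of each value, and a PDF shows probability per unit of the variable. A histogram whose counts are divided by the total number of data points estimates the bin probabilities; dividing further by the bin width gives an estimate of the density.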
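To make the three possible y-axes concrete, here is a small sketch that computes counts, probabilities, and density heights for the same bins. The sample, the unequal bin edges, and the function name `histogram_heights` are assumptions chosen for the example.

```python
def histogram_heights(data, edges):
    """Return (counts, probabilities, densities) for bins [edges[k], edges[k+1])."""
    n = len(data)
    counts = [sum(edges[k] <= x < edges[k + 1] for x in data)
              for k in range(len(edges) - 1)]
    probs = [c / n for c in counts]                      # count / N
    dens = [p / (edges[k + 1] - edges[k])                # probability / bin width
            for k, p in enumerate(probs)]
    return counts, probs, dens

data = [0, 1, 1, 2, 3, 3, 3, 4]                          # hypothetical sample, N = 8
counts, probs, dens = histogram_heights(data, edges=[0, 2, 6])

print(counts)  # [3, 5]
print(probs)   # [0.375, 0.625]
print(dens)    # [0.1875, 0.15625]
```

With unequal bin widths the probability and density heights differ; when every bin has width 1, as is common for integer-valued data, they coincide.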

Why is a PDF useful in data analysis?

A plotted distribution shows at a glance where the data concentrate, how spread out they are, whether the shape is skewed or has several peaks, and which values are rare enough to count as outliers. The estimated probabilities can also be used to compute the chance that a future observation takes a given value or falls in a given range.

Can a PDF be used for continuous data sets?

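Yes. In fact the probability density function is, strictly speaking, the tool for continuous data, where there are infinitely many possible values and any single value has probability zero. A density value is not itself a probability; probabilities come from integrating the density over an interval. For discrete data sets the corresponding object is the probability mass function (PMF), which assigns a probability directly to each value.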
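The distinction can be written out side by side. In the standard notation assumed here, p is a probability mass function over values x_i and f is a probability density function:

```latex
\begin{align*}
\text{Discrete (PMF):} \quad
  & \sum_i p(x_i) = 1,
  & P(a \le X \le b) &= \sum_{a \le x_i \le b} p(x_i), \\
\text{Continuous (PDF):} \quad
  & \int_{-\infty}^{\infty} f(x)\,dx = 1,
  & P(a \le X \le b) &= \int_a^b f(x)\,dx .
\end{align*}
```

The sum in the discrete case plays the role of the integral in the continuous case, which is why p(x_i) is a probability while f(x) is only a probability per unit of x.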
