Probability Density Function: Converting Experimental Observations to PDF

In summary: To approximate a probability density function from observations of a continuous random variable, build a histogram, divide each cell count by the total number of observations and by the cell width, and plot the result. The area under the resulting curve over an interval approximates the probability that an observation falls in that interval.
  • #1
naveendeveloper
Hi all,
I am currently doing a Master's in data science. I came across the probability density function (PDF), which is used to find the probability that a continuous random variable falls within a range.
The PDF is plotted with probability density on the y-axis and the random variable on the x-axis.
I am not able to understand how to convert the observations from an experiment on a continuous random variable into a probability density function.
Could you kindly help me understand with a small example?
Thank you
 
  • #2
Do you know about the normal distribution, for example?
 
  • #3
Make a histogram and divide every frequency count by the total sample size; that gives the fraction of samples in each cell. Dividing each fraction by the cell width then gives a density whose total area is 1, which approximates the PDF. The cell width should be chosen so that each cell holds enough samples that the heights do not jump up and down too much, but also so that there are not too few cells to approximate the continuous PDF.
 
  • #4
FactChecker said:
Make a histogram and divide every frequency count by the total sample size; that gives the fraction of samples in each cell. Dividing each fraction by the cell width then gives a density whose total area is 1, which approximates the PDF. The cell width should be chosen so that each cell holds enough samples that the heights do not jump up and down too much, but also so that there are not too few cells to approximate the continuous PDF.
Hi,
Thank you so much for your explanation. I have attached an Excel sheet with the heights of 100k employees at the following link: https://docs.google.com/spreadsheets/d/142Ay2BOh5rOd1weO4f7Jbe2-roYoTDRo/edit?usp=sharing&ouid=116301201506347494587&rtpof=true&sd=true
Could you kindly help me understand how to create the PDF by building a histogram and normalising its area to 1? (Just the logic for doing that would be really helpful.)

One other query: after creating the PDF, what does the y-axis (probability density) represent?

Thanks
Naveen
 
  • #5
The steps depend a lot on which statistics software package you are using. I like R, which is free, well respected, and well documented. R has a function, densityplot (in the lattice package), that does it. I don't know what is available in Excel.
If you are doing it all yourself, this is a rough description of the process.
1) get the range of the height data, heightMin & heightMax.
2) divide the range evenly into some number of sub-range cells (with 1000 data points, try 20 cells as a first attempt and adjust if desired)
3) count the number of data points in each cell
4) convert the cell counts into fractions by dividing by the total number of data points (1000 in your example); to get a density whose total area is 1, also divide each fraction by the cell width
5) plot the results.
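
The steps above can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the function name `histogram_density` is hypothetical, and the heights are synthetic (drawn from a normal distribution with an assumed mean of 170 cm and standard deviation of 10 cm) rather than read from the linked spreadsheet.

```python
import numpy as np

def histogram_density(samples, num_cells=20):
    """Approximate a PDF from samples with an equal-width histogram.

    Hypothetical helper written for illustration only.
    """
    samples = np.asarray(samples, dtype=float)
    # Steps 1-3: take the data range, split it into equal-width cells,
    # and count the data points in each cell.
    counts, edges = np.histogram(
        samples, bins=num_cells, range=(samples.min(), samples.max())
    )
    # Step 4: convert counts into fractions of the total sample size.
    fractions = counts / samples.size
    # Divide by the cell width so the total area under the bars is 1.
    density = fractions / np.diff(edges)
    return edges, fractions, density

# Synthetic stand-in for the height data (assumed values, not real data).
rng = np.random.default_rng(0)
heights = rng.normal(loc=170.0, scale=10.0, size=1000)

edges, fractions, density = histogram_density(heights, num_cells=20)
area = np.sum(density * np.diff(edges))
print(f"sum of fractions = {fractions.sum():.3f}, area under density = {area:.3f}")
```

Both printed values should come out as 1 up to rounding: the fractions sum to 1 because every point lands in some cell, and the density has area 1 because each bar's height times its width is that cell's fraction. For step 5, the `density` values can be drawn as bars over `edges` with any plotting tool.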

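Have you had any classes in probability and statistics? The area under the probability density function over an interval gives the fraction of results expected to fall in that interval, so the y-axis is probability per unit of the variable (per unit of height here), not a probability itself.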
 

FAQ: Probability Density Function: Converting Experimental Observations to PDF

1. What is a probability density function (PDF)?

A probability density function is a mathematical function that describes the probability distribution of a continuous random variable. It shows the relative likelihood of different outcomes occurring within a given range of values.

2. How is a PDF calculated from experimental observations?

A PDF is estimated by dividing the number of observations falling within a specific range of values by the total number of observations, and then dividing that result by the width of the range. This gives the probability density over that range; multiplying the density by the width of the range recovers the probability of an observation falling within it.
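
Written as a formula, the estimate described above looks like this. The symbols are assumptions introduced here for illustration: $N$ is the total number of observations, $n_k$ the number falling in cell $k$, and $h_k$ the width of that cell.

$$
\hat{f}(x) = \frac{n_k}{N\,h_k} \quad \text{for } x \text{ in cell } k,
\qquad
P(X \in \text{cell } k) \approx \hat{f}(x)\,h_k = \frac{n_k}{N},
\qquad
\sum_k \frac{n_k}{N\,h_k}\,h_k = \frac{1}{N}\sum_k n_k = 1 .
$$

The last identity is the statement that the area under the estimated density is 1.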

3. What is the difference between a PDF and a histogram?

A histogram is a visual representation of the distribution of a dataset, while a PDF is a mathematical function that describes the distribution. A histogram shows the frequency of observations falling within different ranges, while the area under a PDF over a specific range gives the probability of an observation falling within that range. A histogram normalised to have total area 1 is an approximation of the PDF.

4. Can a PDF be used to make predictions about future observations?

Yes, a PDF can be used to make predictions about future observations by integrating it over a given range of values, which gives the probability of an observation falling within that range. This can be useful in fields such as finance, where estimating the likelihood of future stock prices or market trends is important.
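
As a minimal sketch of that calculation, the probability for a range can be approximated by summing density times width over the part of each histogram cell that overlaps the range. The function name `probability_in_range` and the synthetic height data (assumed mean 170, standard deviation 10) are hypothetical and used only for illustration.

```python
import numpy as np

def probability_in_range(samples, low, high, num_cells=20):
    """Estimate P(low <= X <= high) by integrating a histogram density.

    Hypothetical helper written for illustration only.
    """
    samples = np.asarray(samples, dtype=float)
    # density=True makes NumPy return count / (N * cell width) per cell.
    density, edges = np.histogram(samples, bins=num_cells, density=True)
    # Length of the overlap between each cell and the interval [low, high].
    overlap = np.clip(
        np.minimum(edges[1:], high) - np.maximum(edges[:-1], low), 0.0, None
    )
    return float(np.sum(density * overlap))

# Synthetic stand-in data (assumed values, not real measurements).
rng = np.random.default_rng(1)
heights = rng.normal(loc=170.0, scale=10.0, size=10_000)

estimate = probability_in_range(heights, 160.0, 180.0)
direct = float(np.mean((heights >= 160.0) & (heights <= 180.0)))
print(f"histogram estimate: {estimate:.3f}, direct fraction: {direct:.3f}")
```

The two numbers should be close but not identical, because the histogram treats the density as constant within each cell. Integrating over the full data range returns 1.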

5. Are there any limitations to using a PDF to analyze experimental data?

One limitation of using a PDF is that it assumes a continuous distribution of data, which may not always be the case in real-world experiments. Additionally, the accuracy of a PDF estimated from data depends on the quality and quantity of the experimental data collected, and on the choice of histogram cell width.
