Can the pdf be determined numerically for a given data set?

  • Thread starter: rohitashwa
  • Tags: Data, Pdf, Set
rohitashwa
I have just been learning about the probability density function (pdf) and there is something I need to ask. I understand the idea that for any value v, the pdf (f(v)) gives the probability that a value picked from a data set is less than v. It seems OK to find the mean, variance, skewness, etc. when f(v) is known. However, how is the expression for f(v) arrived at?

If you have a data set given, can the pdf be found numerically?

Thank you.
 
If you have a data set given, can the pdf be found numerically?
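Yes. Set up a bin structure (intervals of x) and sort the data into the bins. The bin counts, divided by the total number of items and by the bin width, approximate the density function (the derivative of the distribution function). The cumulative sums of the counts, divided by the total number of items, approximate the cumulative distribution function, which is the function you describe: the probability that a value is less than v.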
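Here is a minimal sketch of that binning procedure, assuming Python with only the standard library. The function name `histogram_estimates`, the choice of 20 equal-width bins, and the synthetic Gaussian sample are illustrative assumptions, not something from the thread. It also assumes the data are not all identical, since the bin width would otherwise be zero.

```python
import random


def histogram_estimates(data, num_bins=20):
    """Bin the data; return bin edges, a density estimate per bin,
    and the cumulative fraction of the data at each bin's right edge."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_bins  # assumes hi > lo
    counts = [0] * num_bins
    for x in data:
        # Clamp the index so that x == hi lands in the last bin.
        i = min(int((x - lo) / width), num_bins - 1)
        counts[i] += 1

    n = len(data)
    edges = [lo + i * width for i in range(num_bins + 1)]
    # Dividing by n * width makes the bar areas sum to 1.
    density = [c / (n * width) for c in counts]

    # Running total of counts, normalized by n.
    cumulative = []
    running = 0
    for c in counts:
        running += c
        cumulative.append(running / n)
    return edges, density, cumulative


random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(10_000)]  # stand-in data set
edges, density, cumulative = histogram_estimates(sample)
```

The `density` list is the histogram estimate of the density; `cumulative` rises from near 0 to 1 and estimates the probability that a value is below each right-hand bin edge. More bins give finer resolution but noisier estimates, so the bin count is a judgment call.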
 
Thanks. But, could you tell me why the cumulative sums approximate the pdf?
 
rohitashwa said:
Thanks. But, could you tell me why the cumulative sums approximate the pdf?
The underlying assumption is that the given data were generated by sampling from the distribution. Statistical analysis of this kind rests on that assumption: given enough sample data, the sample distribution approximates the underlying probability distribution.
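A small numerical check can make this concrete. The sketch below assumes the data really are independent draws from a known distribution (a standard normal, chosen only for illustration) and measures the largest gap between the sample's cumulative fractions and the true distribution function. The names `normal_cdf` and `max_cdf_gap` are made up for this example.

```python
import math
import random


def normal_cdf(x):
    """Distribution function of the standard normal: P(X < x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def max_cdf_gap(sample, cdf):
    """Largest distance between the sample's step-shaped cumulative
    distribution and cdf, checked on both sides of each jump."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The sample distribution jumps from i/n to (i+1)/n at x.
        gap = max(gap, abs(f - i / n), abs((i + 1) / n - f))
    return gap


random.seed(1)
gaps = {
    n: max_cdf_gap([random.gauss(0.0, 1.0) for _ in range(n)], normal_cdf)
    for n in (100, 10_000)
}
```

Comparing `gaps[100]` with `gaps[10_000]` should show the gap shrinking as the sample grows, which is the sense in which the cumulative sums approximate the distribution. With real data the true distribution is unknown, so this comparison is only possible in a simulation like this one.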
 
Thank you. That was very helpful.
 
There are a couple of ways to get the pdf. For univariate distributions you could "fit" your data to a standard distribution (for example Gaussian, lognormal, or uniform), or you could use numerical analysis to build a distribution from the data through interpolation and other techniques.

If the data happen to fit a "stock" standard distribution, then analyzing them would be a lot easier than analyzing a distribution built by numerical analysis, since the assumptions behind the stock standard distributions are better understood.
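To illustrate the two routes, here is a rough sketch in standard-library Python. The first function fits a Gaussian by taking the sample mean and standard deviation. The second builds a density directly from the data with a Gaussian kernel density estimate, which is one possible numerical technique and is my choice for the example, not necessarily the one meant above. The function names, the synthetic data, and the normal-reference bandwidth rule (1.06 · sigma · n^(-1/5)) are all assumptions made for illustration.

```python
import math
import random
import statistics


def fit_gaussian(data):
    """Parametric route: estimate mean and standard deviation,
    then return them with the corresponding Gaussian density."""
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)

    def pdf(x):
        z = (x - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

    return mu, sigma, pdf


def kernel_density(data, bandwidth):
    """Numerical route: average a small Gaussian bump centred on
    every data point (kernel density estimate)."""
    norm = len(data) * bandwidth * math.sqrt(2.0 * math.pi)

    def pdf(x):
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data) / norm

    return pdf


random.seed(2)
sample = [random.gauss(5.0, 2.0) for _ in range(2_000)]  # stand-in data set

mu, sigma, fitted_pdf = fit_gaussian(sample)
# Normal-reference rule of thumb for the bandwidth (an assumption).
bandwidth = 1.06 * sigma * len(sample) ** (-1 / 5)
kde_pdf = kernel_density(sample, bandwidth)
```

When the data really are close to Gaussian, `fitted_pdf(x)` and `kde_pdf(x)` should roughly agree, and the fitted version is summarized by just two numbers. When they disagree noticeably, that is a hint the stock distribution may be a poor fit.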
 