Can the pdf be determined numerically for a given data set?

rohitashwa · Nov 20, 2009

I have just been learning about the probability density function (pdf) and there is something I need to ask. I understand the idea that for any value v, the pdf (f(v)) gives the probability that a value picked from a data set is less than v. It seems OK to find the mean, variance, skewness, etc. when f(v) is known. However, how is the expression for f(v) arrived at?

If you have a data set given, can the pdf be found numerically?

Thank you.

mathman · Nov 20, 2009

If you have a data set given, can the pdf be found numerically?

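Yes. Actually, you can approximate the density function (the derivative of the pdf, taking "pdf" here to mean the probability distribution function) by setting up a bin structure (x intervals) and sorting the data into the intervals. The bin counts, divided by the total number of items and by the bin width, approximate the density. Cumulative sums of the bin counts (normalized by dividing by the total number of items) will be an approximation to the pdf.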
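The binning procedure described above can be sketched roughly as follows. This is only an illustration under assumptions that are not in the thread: the function name `histogram_estimates`, the choice of 20 equal-width bins, and the synthetic Gaussian sample are all made up for the example.

```python
import random

def histogram_estimates(data, num_bins=20):
    """Bin the data and return (edges, density, cumulative).

    density[i]    : fraction of items in bin i divided by the bin width
    cumulative[i] : fraction of items at or below the right edge of bin i
    """
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        # Clamp the index so the largest value lands in the last bin.
        i = min(int((x - lo) / width), num_bins - 1)
        counts[i] += 1

    n = len(data)
    edges = [lo + k * width for k in range(num_bins + 1)]
    density = [c / (n * width) for c in counts]

    # Running sums of the counts, normalized by the number of items.
    cumulative = []
    running = 0
    for c in counts:
        running += c
        cumulative.append(running / n)
    return edges, density, cumulative

# Hypothetical data set: 10,000 draws from a standard normal.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(10_000)]
edges, density, cumulative = histogram_estimates(sample)
```

With more data you can afford narrower bins; with too few items per bin the density estimate becomes noisy, while the cumulative sums are much less sensitive to the bin choice.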

rohitashwa · Nov 20, 2009

Thanks. But, could you tell me why the cumulative sums approximate the pdf?

mathman · Nov 21, 2009

rohitashwa said:

Thanks. But, could you tell me why the cumulative sums approximate the pdf?

The underlying assumption is that the given data was generated from the pdf. All statistical analysis is based on this assumption, i.e. given enough sample data, the probability distribution can be approximated by the sample distribution.
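A small experiment can make this assumption concrete. The sketch below assumes, purely for illustration, that the data really come from a standard normal distribution, and measures how far the sample's cumulative fraction is from the true distribution function for a small and a large sample. The helper names (`empirical_cdf`, `normal_cdf`, `max_gap`) and the evaluation grid are invented for this example.

```python
import bisect
import math
import random

def empirical_cdf(sorted_data, v):
    """Fraction of the sample that is less than or equal to v."""
    return bisect.bisect_right(sorted_data, v) / len(sorted_data)

def normal_cdf(v):
    """Distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def max_gap(n, rng):
    """Largest gap between sample and true distribution on a grid over [-3, 3]."""
    data = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
    grid = [-3.0 + 0.1 * k for k in range(61)]
    return max(abs(empirical_cdf(data, v) - normal_cdf(v)) for v in grid)

rng = random.Random(1)
gap_small = max_gap(100, rng)      # few data points
gap_large = max_gap(100_000, rng)  # many data points
print(gap_small, gap_large)
```

The gap for the large sample should come out much smaller than for the small one, which is the sense in which the sample distribution approximates the underlying one when enough data are available.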

rohitashwa · Nov 21, 2009

Thank you. That was very helpful.

chiro · Nov 22, 2009

rohitashwa said:

I have just been learning about the probability density function (pdf) and there is something I need to ask. I understand the idea that for any value v, the pdf (f(v)) gives the probability that a value picked from a data set is less than v. It seems OK to find the mean, variance, skewness, etc. when f(v) is known. However, how is the expression for f(v) arrived at?

If you have a data set given, can the pdf be found numerically?

Thank you.

There are a couple of ways to get the pdf. For univariate distributions you could "fit" the data to a standard distribution (say Gaussian, lognormal, uniform, etc.), or you could use numerical analysis to come up with a distribution based on interpolation and other techniques.

If the data happened to fit a "stock" standard distribution, then analyzing it would be a lot easier than analyzing a distribution obtained by numerical analysis, since the assumptions behind the stock standard distributions are more easily understood.
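The two routes can be compared side by side in a short sketch. It makes several assumptions not stated in the thread: the data are synthetic Gaussian draws, the "fit" is done by matching the sample mean and standard deviation, and the numerical route is a Gaussian kernel density estimate (one possible technique, swapped in here for the unspecified "interpolation and other techniques") with a common rule-of-thumb bandwidth. The helper `gaussian_kde` is written for this example and is not a library function.

```python
import random
from statistics import NormalDist

def gaussian_kde(data, bandwidth):
    """Return a function estimating the density as an average of Gaussian bumps."""
    n = len(data)
    kernel = NormalDist(0.0, bandwidth)

    def density(v):
        # Each data point contributes a Gaussian centered on itself.
        return sum(kernel.pdf(v - x) for x in data) / n

    return density

# Hypothetical data set: 2,000 draws with mean 5 and standard deviation 2.
random.seed(2)
sample = [random.gauss(5.0, 2.0) for _ in range(2000)]

# Route 1: fit a stock distribution (Gaussian) from the sample mean and stdev.
fitted = NormalDist.from_samples(sample)

# Route 2: nonparametric estimate; the bandwidth is a rule-of-thumb assumption.
bandwidth = 1.06 * fitted.stdev * len(sample) ** (-1 / 5)
kde = gaussian_kde(sample, bandwidth)

for v in (1.0, 3.0, 5.0, 7.0, 9.0):
    print(v, fitted.pdf(v), kde(v))
```

When the data really do follow the stock distribution, the two estimates should agree closely and the fitted one is summarized by just two numbers. When they disagree noticeably, that is a hint the chosen standard distribution may not describe the data well.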
