How to fit given function to blurred data points?

sceptic · Oct 8, 2014

Is there an established theory or method for fitting the parameters of a function family to data in which each point is given by a probability distribution rather than by exact, error-free coordinates? This seems like a very general problem, so I hope it has already been solved.

Important:

I would like a general method that works with any kind of probability distribution around the data points, not just a Gaussian, which can be described by a single error value such as its variance.

I would like to use all the information that is available, so I am looking for a fully Bayesian solution without unnecessary estimation.

DrDu · Oct 8, 2014

In classical statistics, you would set up the likelihood for your parameters and maximize it.
Bayesian statistics is similar: you multiply the likelihood by the prior distribution of the parameters to obtain (up to normalization) the posterior probability distribution of the parameters.
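
To make this concrete, here is one way to write it down. The notation is an assumption for illustration, not taken from the question: a model function ##f(x;\theta)## with parameters ##\theta##, measured values ##y_i## at known positions ##x_i##, independent points, and a noise density ##g_i## for point ##i##, which need not be Gaussian.

[tex]L(\theta) = \prod_{i=1}^{n} g_i\big(y_i - f(x_i;\theta)\big), \qquad p(\theta \mid y_1,\dots,y_n) \propto p(\theta)\, L(\theta)[/tex]

The classical estimate is the ##\theta## that maximizes ##L(\theta)## (in practice ##\log L##, which turns the product into a sum), while the Bayesian answer is the whole posterior on the right.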

sceptic · Oct 8, 2014

Yes, I know the principles. But I need a practical example with equations, perhaps a book chapter or a paper that treats this kind of problem. What keywords should I search for? The distributions can all be the same, but not Gaussian. Is it practically possible to calculate at all? Maybe in general for lots of data point distributions the problem can exponentially explode, or can't?

mfb · Oct 8, 2014

Likelihood and minimization are good keywords. Usually the point of maximum likelihood is found by iterative numerical approximation, typically by minimizing the negative log-likelihood.

sceptic said:

Maybe in general for lots of data point distributions the problem can exponentially explode, or can't?

No, the cost is typically linear in the number of data points (because you have to calculate the likelihood for each data point). Many free parameters can make the problem time-consuming, especially if they are highly correlated.
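
A minimal sketch of such an iterative fit, under assumptions that are not from the thread: a one-parameter family y = a*x, Laplace (non-Gaussian) noise with a known scale per point, and made-up data. The negative log-likelihood is a single loop over the data points, which is where the linear cost comes from. The golden-section search is only a stand-in for whatever general-purpose minimizer you would use with more parameters; all function names are hypothetical.

```python
import math

def neg_log_likelihood(a, xs, ys, scales):
    """Negative log-likelihood of y_i = a*x_i + Laplace noise (one pass over the data)."""
    total = 0.0
    for x, y, b in zip(xs, ys, scales):
        # -log of the Laplace density exp(-|r|/b) / (2b) at residual r = y - a*x
        total += abs(y - a * x) / b + math.log(2.0 * b)
    return total

def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Iteratively shrink [lo, hi] around the minimum of a unimodal function f."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d  # minimum lies in [lo, d]
        else:
            lo = c  # minimum lies in [c, hi]
    return 0.5 * (lo + hi)

# Illustrative data only (assumed, not from the thread)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.0, 8.2, 9.8]
scales = [1.0, 1.0, 1.0, 1.0, 1.0]

a_hat = golden_section_minimize(
    lambda a: neg_log_likelihood(a, xs, ys, scales), 0.0, 5.0
)
print("maximum-likelihood slope:", a_hat)
```

Swapping in a different noise density only changes the line inside the loop; the rest of the procedure stays the same.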

Number Nine · Oct 13, 2014

There is a concept of an "errors-in-variables" model that deals with this kind of thing, although I'd probably just take a hierarchical approach. As an example, suppose that we have observed points ##(x_1,\dots,x_n)## whose true values come from a normal distribution ##N(\mu,\tau^2)## and which we assume are actually measured with normally distributed error ##N(0,\sigma^2_i)##. If ##x_i## has true value ##\mu_i## (which is unobserved), then we have ##\mu_i \sim N(\mu, \tau^2)## and ##x_i \sim N(\mu_i, \sigma^2_i)##, so the full model for the mean is
[tex]x_i = \mu_i + e_i, \ \ \ \text{where} \ e_i \sim N(0, \sigma^2_i)[/tex]
or
[tex]x_i = \mu + \epsilon_i + e_i, \ \ \ \text{where} \ \epsilon_i \sim N(0, \tau^2)[/tex]
Basically, we just model the error at two different levels.
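
A rough sketch of how this two-level normal model can be fitted, assuming the measurement errors ##\sigma_i## are known and using made-up numbers. Because both levels are normal, the unobserved true values integrate out in closed form, leaving ##x_i \sim N(\mu, \tau^2 + \sigma^2_i)##. The function names and the simple grid over ##\tau## are illustrative choices, not a standard API.

```python
import math

def marginal_log_likelihood(mu, tau, xs, sigmas):
    """log p(x | mu, tau) with the latent true values integrated out."""
    total = 0.0
    for x, s in zip(xs, sigmas):
        var = tau ** 2 + s ** 2  # variances of the two levels add
        total += -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - mu) ** 2 / var
    return total

def mu_hat_given_tau(tau, xs, sigmas):
    """For fixed tau, the maximizing mu is the precision-weighted mean."""
    weights = [1.0 / (tau ** 2 + s ** 2) for s in sigmas]
    return sum(w * x for w, x in zip(weights, xs)) / sum(weights)

def shrunk_true_values(mu, tau, xs, sigmas):
    """Posterior mean of each true value mu_i given x_i, mu and tau (requires tau > 0)."""
    out = []
    for x, s in zip(xs, sigmas):
        prec_data = 1.0 / s ** 2
        prec_prior = 1.0 / tau ** 2
        out.append((prec_data * x + prec_prior * mu) / (prec_data + prec_prior))
    return out

# Illustrative data only (assumed): measurements and their known error scales
xs = [4.8, 5.9, 3.7, 6.4, 5.1]
sigmas = [0.3, 1.0, 0.5, 1.5, 0.2]

# Crude grid search over tau, with mu profiled out at each grid point
tau_grid = [0.05 * k for k in range(1, 61)]
best_tau = max(
    tau_grid,
    key=lambda t: marginal_log_likelihood(mu_hat_given_tau(t, xs, sigmas), t, xs, sigmas),
)
best_mu = mu_hat_given_tau(best_tau, xs, sigmas)

print("mu:", best_mu, "tau:", best_tau)
print("estimated true values:", shrunk_true_values(best_mu, best_tau, xs, sigmas))
```

This gives point estimates only. A fully Bayesian treatment would put priors on ##\mu## and ##\tau## and keep the whole posterior, and with non-normal errors the closed-form marginal is no longer available, so the integral has to be done numerically or by sampling.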

A similar regression model might take the form

[tex]y_i = \alpha + \beta \mu_i + \epsilon_i[/tex]

Note that you can assume any kind of error structure you want; it doesn't have to be normal. The same general approach would still apply.
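
As a sketch of what "any kind of error structure" can look like in practice, the example below assumes a uniform blur of known half-width on each observed x, Laplace noise of known scale on each y, and a flat prior on the unobserved true position. None of these choices come from the thread. The likelihood of one blurred point is the integral over the true position, done here with a simple midpoint rule. All names and data are hypothetical.

```python
import math

def laplace_pdf(r, scale):
    return math.exp(-abs(r) / scale) / (2.0 * scale)

def uniform_pdf(r, half_width):
    return 1.0 / (2.0 * half_width) if abs(r) <= half_width else 0.0

def point_likelihood(x, y, alpha, beta, half_width, scale, n_steps=200):
    """p(x, y | alpha, beta): integrate over the unobserved true position mu.

    Assumes a flat prior on mu, so the integrand is
    p(x | mu) * p(y | mu, alpha, beta), integrated over the support of the x blur.
    """
    lo = x - half_width
    step = 2.0 * half_width / n_steps
    total = 0.0
    for k in range(n_steps):
        mu = lo + (k + 0.5) * step  # midpoint of the k-th cell
        total += uniform_pdf(x - mu, half_width) * laplace_pdf(y - alpha - beta * mu, scale)
    return total * step

def log_likelihood(alpha, beta, xs, ys, half_width, scale):
    """Sum of per-point log-likelihoods; cost grows linearly with the number of points."""
    return sum(
        math.log(point_likelihood(x, y, alpha, beta, half_width, scale))
        for x, y in zip(xs, ys)
    )

# Illustrative data only (assumed), scattered around y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
half_width, scale = 0.2, 0.1

# Compare two candidate parameter sets; a real fit would hand log_likelihood
# to an optimizer or a sampler instead.
ll_a = log_likelihood(1.0, 2.0, xs, ys, half_width, scale)
ll_b = log_likelihood(0.0, 2.5, xs, ys, half_width, scale)
print("log-likelihood at (1.0, 2.0):", ll_a)
print("log-likelihood at (0.0, 2.5):", ll_b)
```

To change the error structure, replace `uniform_pdf` or `laplace_pdf` with another density and adjust the integration range to cover its support. Multiplying the likelihood by priors on alpha and beta gives the unnormalized posterior.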
