Chi-squared fit with errors on both x and y

Malamala · Nov 13, 2019

Hello, I have some data points with errors on both the x and y coordinates. I want to fit a straight line to them, but I am not sure how to take the error on x into account. Normally, when I have only the error on y, I minimize $$\sum\frac{(y_{pred}(x)-y_{measured}(x))^2}{\sigma_y^2}$$
Can I just replace ##\sigma_y^2## with ##\sigma_x^2+\sigma_y^2##? The errors on x and y are not correlated. Thank you!

Vanadium 50 · Nov 13, 2019

No, it's not that simple. See errors-in-variables models and regression dilution. This is an area where you need a very well defined problem to get a statistically valid answer. (Unlike your last problem)
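To see what regression dilution means in practice, here is a minimal simulation sketch. Everything in it is an assumption made for illustration (the true slope of 2, unit-variance true x values, Gaussian noise on x only, and the helper names `ols_slope` and `simulate_dilution`). It fits an ordinary y-on-x least-squares line to data whose x values are noisy and shows that the fitted slope is pulled toward zero.

```python
import random

def ols_slope(xs, ys):
    """Ordinary least-squares slope of y on x (treats x as exact)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def simulate_dilution(true_slope=2.0, sigma_x=1.0, n=20000, seed=0):
    """Fit y on noisy x; the true x values are drawn with unit variance."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [true_slope * xt for xt in x_true]                   # no noise on y
    x_obs = [xt + rng.gauss(0.0, sigma_x) for xt in x_true]  # noise on x only
    return ols_slope(x_obs, y)

slope = simulate_dilution()
# Expect roughly true_slope / (1 + sigma_x**2), i.e. near 1.0 rather than 2.0.
print(slope)
```

With these settings the attenuation factor is var(x_true) / (var(x_true) + sigma_x**2) = 1/2, so ignoring the x errors roughly halves the slope.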

Dale · Nov 13, 2019

It is also called orthogonal distance regression.

Vanadium 50 · Nov 13, 2019

Dale said:

It is also called orthogonal distance regression.

Yes. You start with the obvious thing - a line y = mx + b, and you try and do a least-squares fit using the perpendicular distances between the points and the candidate line instead of the y-distances. Problem is that doesn't always get you a unique unbiased solution.

That's why you need to specify what you are looking for very carefully.
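As one concrete way of specifying the problem, here is a sketch of Deming regression, which has a closed-form solution. It is not necessarily what anyone in this thread had in mind. It assumes that the errors are independent and that the ratio `delta` = (variance of the y errors) / (variance of the x errors) is known and the same for every point. With `delta = 1` it reduces to the perpendicular-distance fit described above. The function name `deming_fit` is made up for this example.

```python
import math

def deming_fit(xs, ys, delta=1.0):
    """Fit y = m*x + b when both x and y carry errors.

    delta is the assumed ratio var(y errors) / var(x errors),
    taken to be identical for all points.
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; the slope is undefined")
    d = syy - delta * sxx
    # Closed-form Deming slope; the sign of m follows the sign of sxy.
    m = (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)
    b = my - m * mx
    return m, b

# Noise-free points on y = 2x + 1 are recovered exactly.
m, b = deming_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], delta=1.0)
print(m, b)
```

Note that the answer depends on `delta`: a different assumed error ratio gives a different line, which is part of why the question has to be posed carefully.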

Stephen Tashi · Nov 13, 2019

Here's another Wikipedia link - total least squares https://en.wikipedia.org/wiki/Total_least_squares

Vanadium 50 · Nov 16, 2019

Even though this appears to be a drive-by posting, I'll make one more comment.

If you minimize a function of Δy only, it's clear what you are doing. If you minimize something like Δx² + Δy² it's not even guaranteed that you have a number with consistent dimensions: suppose y is temperature and x is time. What units would Δx² + Δy² even be in?

To get a well-defined answer, one needs to pose a much, much better defined question. And even then it may not exist.
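For illustration, here is one common way of posing the question so that the units work out. It assumes independent Gaussian errors with known ##\sigma_{x,i}## and ##\sigma_{y,i}## on each point, and it introduces unknown true abscissas ##X_i## (notation added here, not from the original question). Each residual is divided by its own uncertainty, so every term is dimensionless:
$$\chi^2(m,b,\{X_i\}) = \sum_i\left[\frac{(x_i-X_i)^2}{\sigma_{x,i}^2}+\frac{(y_i-mX_i-b)^2}{\sigma_{y,i}^2}\right]$$
Minimizing over each ##X_i## analytically leaves
$$\chi^2(m,b) = \sum_i\frac{(y_i-mx_i-b)^2}{\sigma_{y,i}^2+m^2\sigma_{x,i}^2}$$
Under these assumptions the denominator is ##\sigma_y^2+m^2\sigma_x^2##, not ##\sigma_x^2+\sigma_y^2##. The slope ##m## carries units of y/x, so ##m^2\sigma_x^2## has the same units as ##\sigma_y^2##.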

WWGD · Dec 11, 2019

Vanadium 50 said:

Even though this appears to be a drive-by posting, I'll make one more comment.

If you minimize a function of Δy only, it's clear what you are doing. If you minimize something like Δx² + Δy² it's not even guaranteed that you have a number with consistent dimensions: suppose y is temperature and x is time. What units would Δx² + Δy² even be in?

To get a well-defined answer, one needs to pose a much, much better defined question. And even then it may not exist.

Maybe if you standardize your variables you can avoid the issue with units? I understand that is one of the reasons for standardization.

Malamala · Dec 11, 2019

WWGD said:

Maybe if you standardize your variables you can avoid the issue with units? I understand that is one of the reasons for standardization.

What do you mean by this?

WWGD · Dec 11, 2019

Malamala said:

What do you mean by this?

I was replying to @Vanadium 50 regarding his statement on mixed units in the expression ##\sqrt{\Delta x^2 + \Delta y^2}##. If you standardize your variables (assuming normality of the data or otherwise), the resulting variable is unitless from algebra alone (you are dividing two expressions with the same units), so you at least avoid the issue of mixed units. Seems like something @Stephen Tashi may know about.
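A small sketch of what standardizing does to the units, with made-up time values and a hypothetical helper `standardize`: the same measurements expressed in seconds and in minutes give identical z-scores, because the unit cancels when you divide by the standard deviation.

```python
import statistics

def standardize(values):
    """Return z-scores: (value - sample mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

times_s = [0.0, 30.0, 60.0, 90.0, 120.0]     # made-up times in seconds
times_min = [t / 60.0 for t in times_s]      # the same times in minutes

z_s = standardize(times_s)
z_min = standardize(times_min)

# The z-scores agree up to floating-point rounding, whatever the unit.
print(all(abs(a - b) < 1e-9 for a, b in zip(z_s, z_min)))
```

This only removes the unit mismatch. A perpendicular-distance fit on standardized variables still depends on the scaling you chose, so it does not by itself settle which line is the "right" one.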
