# Systematic uncertainties in linear fit

#### Malamala

Hello! I have some data points ##(x,y)##, with some uncertainties on y that are statistical ##dy_1## and some systematic ##dy_2##. I want to fit this data using a linear function. How exactly should I deal with the different types of uncertainties? Can I just add them in quadrature and perform the fit like that?

What does add them in quadrature mean? I've never heard that phrase.

• gleem and FactChecker
Can I just add them in quadrature and perform the fit like that?
Yes.

It is the square root of the sum of the squares.

• FactChecker
Hello! I have some data points ##(x,y)##, with some uncertainties on y that are statistical ##dy_1## and some systematic ##dy_2##. I want to fit this data using a linear function. How exactly should I deal with the different types of uncertainties? Can I just add them in quadrature and perform the fit like that?
I don't understand the point of this. Do you actually know the uncertainties and can add the numbers or are you wondering how to model the problem with unknown values? In any case, I would hesitate to treat an unknown constant (systematic?) as though it is a random variable.

One thing to watch out for: if the systematic uncertainty means the "true" model is ##y=2x## but you actually measure ##y=2x +0.03x +0.01x^2## because of systematic measuring error, then a normal linear fit will, depending on the range of your data, probably give you a model like ##y=2.05x##. Would you consider this more or less correct than ##2x##?

Thank you all for your replies. Just to clarify, by adding them in quadrature I meant doing ##dy = \sqrt{dy_1^2+dy_2^2}##, then use these ##dy## errors as the errors on y for the fit. What actually confused me a bit and hence I asked the question, is that it seems like, by doing this, the systematic errors also get reduced by ~##\sqrt{N}##, where N is the number of points I have. I thought (but I am really not good at statistics) that this error reduction is true only for statistical errors. Is it true for systematic error, too? Somehow I thought that one is limited by the systematic errors and can't improve that further by adding more data (we can assume that all the points in my case have the same systematic error).
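In code, the quadrature combination described above (with hypothetical uncertainty values) is just:

```python
import numpy as np

# Hypothetical numbers: dy1 = statistical, dy2 = systematic uncertainty on y.
dy1 = np.array([0.10, 0.12, 0.09, 0.11])
dy2 = np.array([0.05, 0.05, 0.05, 0.05])

# Combine in quadrature: square root of the sum of squares.
dy = np.sqrt(dy1**2 + dy2**2)
```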

Just to clarify, by adding them in quadrature I meant doing ##dy = \sqrt{dy_1^2+dy_2^2}##, then use these ##dy## errors as the errors on y for the fit.

This doesn't quite clarify what you are doing. There might be an algorithm for fitting a linear function to data that, in some way, "uses" the standard deviation assumed or estimated for each individual data point ##(x,y)##.

Another possibility is that you are doing a least squares fit of a linear function to data in the usual manner, so the estimated standard deviation of individual data points does not affect how the fit is done. After the fit is done, you wish to "use" the assumed standard deviation and bias to estimate the standard deviation of each individual data point ##y_k## about the value ##\hat{y_k}## predicted by the linear function.

This doesn't quite clarify what you are doing. There might be an algorithm for fitting a linear function to data that, in some way, "uses" the standard deviation assumed or estimated for each individual data point ##(x,y)##.

Another possibility is that you are doing a least squares fit of a linear function to data in the usual manner, so the estimated standard deviation of individual data points does not affect how the fit is done. After the fit is done, you wish to "use" the assumed standard deviation and bias to estimate the standard deviation of each individual data point ##y_k## about the value ##\hat{y_k}## predicted by the linear function.
I am not sure I understand. I do indeed try to do a simple least-squares fit to the data and from there extract the slope and intercept and the uncertainties on them. But why would the error on the data points not matter?

For example, say I ignore the intercept and I try to fit just something of the form ##y=ax##. If I have, say, 4 data points one thing I can do is to get ##a ## from each pair of ##(x,y)## and the relative error will be the error on y, say ##10##%. Then, if I assume that all points have the same uncertainty, the average value of ##a## will be the average of the ##a## obtained from the 4 points, and the uncertainty on ##a## would be ##10/\sqrt{4} = 5##%. Doesn't this mean that the error on my data points is directly reflected in the error on the fit parameters?
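The toy estimate above can be sketched in code (hypothetical data, assuming a true slope of 2 and a 10% relative error on each y):

```python
import numpy as np

rng = np.random.default_rng(0)
true_a = 2.0
x = np.array([1.0, 2.0, 3.0, 4.0])
rel_err = 0.10                                   # 10% relative error on each y
y = true_a * x * (1 + rel_err * rng.standard_normal(4))

# One slope estimate per (x, y) pair, then average them
a_each = y / x
a_mean = a_each.mean()

# If all four estimates carry the same 10% relative error,
# their average carries 10% / sqrt(4) = 5%
rel_err_mean = rel_err / np.sqrt(len(x))
```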

by adding them in quadrature I meant doing ##dy = \sqrt{dy_1^2+dy_2^2}##, then use these ##dy## errors as the errors on y for the fit.
I do indeed try to do a simple least-squares fit to the data and from there extract the slope and intercept and the uncertainties on them.
Hmm, you are correct about "in quadrature" (that is common terminology in my field, but perhaps not for everyone). But how do you "use these dy errors as the errors on y for the fit"? That isn't something I have ever seen in the context of a simple least squares fit.

Usually when someone is doing this sort of thing they will use a weighted least squares fit, with the weighting being the inverse of the variance.
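A minimal numpy sketch of such an inverse-variance-weighted fit, with made-up data (in Python, `scipy.optimize.curve_fit` with `sigma=dy` and `absolute_sigma=True` does the equivalent):

```python
import numpy as np

# Hypothetical data; dy is the combined (quadrature) uncertainty on each y.
x  = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y  = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
dy = np.array([0.2, 0.2, 0.3, 0.3, 0.4])

# Design matrix for y = a*x + b, with inverse-variance weights
X = np.column_stack([x, np.ones_like(x)])
W = np.diag(1.0 / dy**2)

# Weighted least squares: beta = (X^T W X)^-1 X^T W y
cov = np.linalg.inv(X.T @ W @ X)      # parameter covariance matrix
a, b = cov @ X.T @ W @ y
da, db = np.sqrt(np.diag(cov))        # uncertainties on slope and intercept
```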

Hmm, you are correct about "in quadrature" (that is common terminology in my field, but perhaps not for everyone). But how do you "use these dy errors as the errors on y for the fit"? That isn't something I have ever seen in the context of a simple least squares fit.

Usually when someone is doing this sort of thing they will use a weighted least squares fit, with the weighting being the inverse of the variance.
Hmm, I guess this is what I am asking, perhaps? I guess most fitting routines (I am using Python) use the inverse of the variance as a weight. My question was: should I calculate that weight using the combined statistical and systematic errors (i.e. add them in quadrature), or treat them somehow separately, given their different nature?

My question was: should I calculate that weight using the combined statistical and systematic errors (i.e. add them in quadrature), or treat them somehow separately, given their different nature?
You should combine them in quadrature.

However, you need to be aware that if your actual systematic uncertainty is not zero mean then you will be fitting to that as well.

Hmm, I guess this is what I am asking, perhaps? I guess most fitting routines (I am using Python) use the inverse of the variance as a weight. My question was: should I calculate that weight using the combined statistical and systematic errors (i.e. add them in quadrature), or treat them somehow separately, given their different nature?

To give a mathematical answer to a question, it has to be mathematically specific. Otherwise, you can get only other types of answers (e.g. what is customary, what is intuitive).

The term "error" is not specific. As a generality, "error" often refers to the difference between a predicted value and a true value. In the case of fitting a function to data, there is also the "error" between the predicted value and the observed data.

If you indeed know the systematic error associated with the ##y## value of the ##k##-th data point, you can subtract that value from ##y_k## before you do the fit. Then you are fitting to ##y##-data that has zero systematic error.

How is it that you know the systematic errors in the ##y##-data?

• gleem
To give a mathematical answer to a question, it has to be mathematically specific. Otherwise, you can get only other types of answers (e.g. what is customary, what is intuitive).

The term "error" is not specific. As a generality, "error" often refers to the difference between a predicted value and a true value. In the case of fitting a function to data, there is also the "error" between the predicted value and the observed data.

If you indeed know the systematic error associated with the ##y## value of the ##k##-th data point, you can subtract that value from ##y_k## before you do the fit. Then you are fitting to ##y##-data that has zero systematic error.

How is it that you know the systematic errors in the ##y##-data?
@Dale @Stephen Tashi Thank you for your help with this. Just to clarify, my data points already have uncertainties associated with them from the experiment. The statistical ones come basically from counting errors (i.e. ##\sqrt{n}##, where ##n## is the number of measurements done for each value of ##y##), while the systematic ones come from things like limits on the significant digits that the voltmeter displayed. The point is that I know ##(x,y,dy_1,dy_2)## for all my points. I want now to fit a straight line to these points and extract the uncertainty on the slope.

I found this nice page which implements what I need. As you can see, fitting the default values (you can uncheck "include x-error") gives an uncertainty on the slope of ##0.342997##. If I make the y errors 10 times bigger, the error on the slope becomes ##3.42997##. So one thing that confuses me about some of the previous replies is the suggestion that the uncertainty on y doesn't matter for the uncertainty on the parameters of the fit, but it seems it does. Also, intuitively, I would imagine that the better you know the points, the better you know the slope. So what am I missing?

And back to my original question, for the ##\delta y## column of that website, should I use ##\sqrt{dy_1^2+dy_2^2}## or something else?

• Dale
If you know the values of ##dy_1## and ##dy_2##, then what are we doing? You can subtract all the errors from your data and then build a model that contains no errors.

If you know the values of ##dy_1## and ##dy_2##, then what are we doing? You can subtract all the errors from your data and then build a model that contains no errors.
I am not sure I understand. I know the model for my data, it is a linear function. I just need to estimate the uncertainties on the parameters of the model. My main question is if I should treat the statistical and systematic uncertainties on y in the same way.

should I use ##\sqrt{dy_1^2+dy_2^2}## or something else?
I have already answered this for you in posts 3 and 11. Can you please let me know how many times you require the same answer so that I can just do it in one post and get it over with?

If you know the values of ##dy_1## and ##dy_2##, then what are we doing? You can subtract all the errors from your data and then build a model that contains no errors.
It seems "errors" is interpreted to mean "standard deviations".

I have already answered this for you in posts 3 and 11. Can you please let me know how many times you require the same answer so that I can just do it in one post and get it over with?

That answer is ok if ##dy_1, dy_2## are from independent random variables.
If the "systematic error" is due to a limited number of digits on a voltmeter, I'm curious how it is calculated from that number. For example, if a voltmeter displays 3 significant digits, what procedure estimates the systematic "error" in a set of readings that average to 6.23 (using the 3 digits only)? Do I compute the standard deviation of a random variable with a uniform distribution on the interval (6.225, 6.235)? Or is the estimated "systematic error" dependent in some way on the estimate of the population standard deviation from the 3-digit readings?

Do I compute the standard deviation of a random variable with a uniform distribution on the interval (6.225, 6.235)?
Yes, that is one way to do it. See section 4.3.7 in the BIPM's Guide to the Expression of Uncertainty in Measurement (GUM):

https://www.bipm.org/documents/2012...08_E.pdf/cb0ef43f-baa5-11cf-3f85-4dcd86f77bd6

Other ways are to make your own statistical measurements or to consult the manufacturer's documentation.
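For the digit-limit case, the GUM recipe amounts to treating the reading as uniformly distributed over the last-digit interval, whose standard deviation is the half-width divided by ##\sqrt{3}## (the resolution below is an assumed, illustrative value):

```python
import math

# Hypothetical voltmeter: a display resolution of 0.01 V means the true
# value lies within +/- 0.005 V of the reading (rectangular distribution,
# GUM section 4.3.7).
resolution = 0.01
half_width = resolution / 2

# Standard uncertainty of a uniform distribution of half-width a is a / sqrt(3)
u_digit = half_width / math.sqrt(3)
```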

I have already answered this for you in posts 3 and 11. Can you please let me know how many times you require the same answer so that I can just do it in one post and get it over with?
Sorry about that, it's just that in our case one of the systematic uncertainties comes from theory, and we know that the way it behaves is to shift all points by the same (unknown) amount in the same (unknown) direction. The maximum range of that shift is given by ##dy_2##. It just seems to me that adding them in quadrature is not obviously the right approach, given that the statistical uncertainty can shift the points up and down with equal likelihood, while the systematic one induces a coherent shift.

Actually, the conversation after that confused me a bit. I thought that a systematic uncertainty induces a systematic shift, i.e. all the points shift in a well-defined way, it's just that we don't know that shift well enough. But based on what you said above, the systematic shift can behave like a random variable, too? I should look more into the definition of that...

Sorry about that, it's just that in our case one of the systematic uncertainties comes from theory and we know that the way it behaves is to shift all points by the same (unknown) amount in the same (unknown) direction.
What do you mean by "all the points"? Do you mean each set of measurements with ##x = x_1## has its corresponding ##y## values shifted (away from the "true" ##y## value) by some number in the interval ##[-dy_1, dy_1]## that is the same for all measurements with ##x = x_1##. And a set of measurements with a different ##x = x_2## has all its corresponding ##y## values shifted by some number in the possibly different interval ##[-dy_2, dy_2]##?

Or do you mean that each value ##x## has all its corresponding ##y## values shifted by some number in the interval ##[-r,r]## which is the same number regardless of the value of ##x## (i.e. ##r = dy_1 = dy_2 = ...##) ?

And do you use the notation ##dy_2## to define a range of values, or do you mean ##dy_2## to denote the standard deviation of a random variable (e.g. an "uncertainty")?

Or do you mean that each value ##x## has all its corresponding ##y## values shifted by some number in the interval ##[-r,r]## which is the same number regardless of the value of ##x## (i.e. ##r = dy_1 = dy_2 = ...##) ?
This is what I meant. Well, in your notation above I assume ##dy_1## is what I would call the systematic uncertainty on the first point (as in my original post ##dy_1## was the statistical uncertainty).

I guess that what I mean by ##dy_2## is just an unknown number, that shifts the actual value of ##y##. There is no distribution of that number, it is fixed, just unknown.

Just to give a bit more context about ##dy_2## (it seems my understanding of systematic uncertainties was kinda wrong). That term comes from theoretical input in obtaining ##y##. For example, you can think that ##y=wz##, where ##w## is calculated theoretically and ##z## is measured experimentally (##x## is always just measured experimentally). ##w## is calculated using perturbation theory (PT), so we know that the value deviates from the truth by the amount ignored in the PT truncation. We don't know what that deviation is (as we don't know the truth), but as we do the same calculation for all points, it is the same for all points. Btw, is it right to call this a systematic uncertainty? (I did so, as increasing the statistics of the measurement wouldn't reduce it.)

This uncertainty sounds like it is still zero mean, since you don't know the direction. The issue is that it is perfectly correlated. So you will still have the quadrature term, that doesn't go away, but in addition you will also have a covariance term to handle the fact that this error is correlated.
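One way to sketch the covariance-term approach (with hypothetical numbers): put the independent statistical variances on the diagonal of the covariance matrix, add a fully correlated block for the common shift, and do a generalized least squares fit:

```python
import numpy as np

# Hypothetical data: dy1 = independent statistical errors,
# s = scale of the fully correlated (common-shift) systematic.
x   = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y   = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
dy1 = np.array([0.2, 0.2, 0.3, 0.3, 0.4])
s   = 0.1

# Full covariance: independent variances on the diagonal, plus a
# rank-one block s^2 * ones for the perfectly correlated systematic.
C = np.diag(dy1**2) + s**2 * np.ones((len(x), len(x)))

# Generalized least squares for y = a*x + b
X = np.column_stack([x, np.ones_like(x)])
Cinv = np.linalg.inv(C)
cov = np.linalg.inv(X.T @ Cinv @ X)   # parameter covariance
a, b = cov @ X.T @ Cinv @ y
da, db = np.sqrt(np.diag(cov))
```

As expected for a pure common shift, the correlated term inflates the intercept uncertainty much more than the slope uncertainty.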

This uncertainty sounds like it is still zero mean, since you don't know the direction. The issue is that it is perfectly correlated. So you will still have the quadrature term, that doesn't go away, but in addition you will also have a covariance term to handle the fact that this error is correlated.
How exactly would I deal with this covariance term? Is it like an extra term in the overall error, beside the quadrature term?

One thing that confuses me is that, given that this uncertainty is the same for all points, for example it shouldn't have any effect on the slope (right?). If we move all the points up or down by the same amount, the slope doesn't change, just the intercept. On the other hand statistical errors do add an uncertainty in the slope. Wouldn't adding them in quadrature remove this behavior of the systematic uncertainty?

Adding them in quadrature still gives the uncertainty of the true value of a point. The uncertainty in the slope should account for the correlation in uncertainties between different points, which is probably hard to think about in full generality. If the issue really is just that the line is being pushed up and down, then it's super easy to deal with (just ignore it), so perhaps if we are told the shape of what you're actually dealing with we can figure it out for that specific scenario.

Adding them in quadrature still gives the uncertainty of the true value of a point. The uncertainty in the slope should account for the correlation in uncertainties between different points, which is probably hard to think about in full generality. If the issue really is just that the line is being pushed up and down, then it's super easy to deal with (just ignore it), so perhaps if we are told the shape of what you're actually dealing with we can figure it out for that specific scenario.
What do you mean by the shape? So in my case I have these points I want to fit with a straight line ##y=ax+b##. I have some uncertainties on y which are statistical and also this systematic uncertainty (which is a shift of all points up or down by an unknown amount). I want to extract from the fit a and b and the associated uncertainties on a and b.

The shape of the systematic uncertainty. If it's literally adding the exact same number to every measured y value, then you can entirely ignore it when computing the uncertainty of your slope. For the uncertainty of your intercept, you should compute the uncertainty from your statistical data, then add that in quadrature with your uncertainty on the uniform shift. I think that's it and you're done?
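A quick numerical check of the claim that a uniform shift leaves the slope alone (made-up data):

```python
import numpy as np

# Adding the same constant to every y changes only the intercept.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.1, 5.9, 8.0])

a1, b1 = np.polyfit(x, y, 1)          # ordinary least squares line
a2, b2 = np.polyfit(x, y + 0.3, 1)    # every point shifted up by 0.3

# a2 equals a1 (slope unchanged); b2 equals b1 + 0.3 (shift lands in
# the intercept), up to floating-point rounding.
```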

The shape of the systematic uncertainty. If it's literally adding the exact same number to every measured y value, then you can entirely ignore it when computing the uncertainty of your slope. For the uncertainty of your intercept, you should compute the uncertainty from your statistical data, then add that in quadrature with your uncertainty on the uniform shift. I think that's it and you're done?
Thank you! One more question: for the final uncertainty on the intercept, should I add the 2 uncertainties in quadrature, or display them as value(stat)[sys]. I have seen both of them in papers but I've never understood when to use one over the other. I would be inclined to use value(stat)[sys], as adding in quadrature loses information, but I am not sure if there are any rules about when to do that.

I really couldn't tell you. It sounds like a convention for different fields or just different authors.

So in my case I have these points I want to fit with a straight line ##y=ax+b##. I have some uncertainties on y which are statistical and also this systematic uncertainty (which is a shift of all points up or down by an unknown amount). I want to extract from the fit a and b and the associated uncertainties on a and b.

I get the impression that you are asking questions about the behavior of some computer program rather than about a method that you yourself understand in detail. If we don't know what algorithm the program uses to make the linear fit and what algorithm it uses to output an "uncertainty" on the parameters of the fit, we can't give mathematical answers about how to provide the inputs to the program.

Do you have documentation for the algorithms used by the program?

• WWGD