# Basic doubts about measurements and fitting

• Felipe Lincoln
In summary, the conversation discusses a method for measuring the resistance of a single resistor and the issue of error in the measurement. The poster plans to apply a series of voltages and record the corresponding voltage and current readings, with an assumed uncertainty of 5% of each reading. They then plot the data and perform a linear fit using Python, obtaining an array of coefficients and a covariance matrix. The poster is unsure of the relationship between the error given by the fit and the error in the measurements, and asks why they should care about the measurement uncertainty if the fit gives its own error.

## Homework Statement

Let's say I'm doing an experiment to measure the resistance of a single 200 ohm resistor from its V against I graph, where V is the voltage across the resistor's terminals, given by a voltmeter, and I is the current in the circuit, given by an ammeter. The slope ##a## in ##y=ax+b## should be the resistance.
OK, I'll proceed this way: apply a voltage of 0.5 V and write down what I read on the voltmeter and the ammeter; raise the voltage to 0.75 V and write down the voltage across the resistor and the current; and so on up to 5 V. By doing this I'll get a list of data with uncertainties given by my instruments; let's assume it is 5% of the displayed value. So every data point has a different error associated with it.
Now I plot voltage against current with the respective error bars, and I want to do a linear fit of these data. When I do this with Python I get an array with the coefficients and another which is the covariance matrix. I can get the error of each coefficient from this matrix, but the linear fit never asked for the error bars, so the errors given by the fit don't seem to be related to the errors of my measurements. So my question is: what is the relation between the error given by the fit and the error of the measurements? Aren't they related? If they aren't, why should I care about the measurement uncertainty, since the fit will give its own error?

## The Attempt at a Solution

Sorry, I don't know how to fit this kind of question into this template...

Is this really a hypothetical question, or do you have some results you can show us?

In a fairly standard statistical model, the errors in your fit embody information about the errors in your measurements. If we assume that the expected sizes of the errors are independent of ##x## and that different measurements yield statistically independent, zero-mean errors, then the simple linear model is
$$y_i = \alpha + \beta x_i + \epsilon_i, \;\; i = 1,2, \ldots, n.$$
Here, the parameters ##\alpha## and ##\beta## are some unknown values that we want to estimate, and the errors ##\epsilon_1, \epsilon_2, \ldots, \epsilon_n## are an independent sample of mean-zero random variables. For example, these might be your measurement errors.

The simplest case is where the statistical distributions of all the different ##\epsilon_i## are the same, so they all come from a single distribution but are independent random "draws" from that distribution. (Your case of errors that depend on ##x## is a bit more involved, so let's forget about that for now.)

It turns out that the least-squares fit (the fit that minimizes ##S = \sum_i (a+b x_i - y_i)^2##) produces unbiased estimates of ##\alpha## and ##\beta.## What does that mean? Well, in any particular run of the experiment you might get a value of ##a## that is too large compared to ##\alpha##, but in some other run you might get a value that is too small. Similarly for ##b## vs. ##\beta.## If you were to repeat the experiment many times (all at the same ##n## values of ##x_i##) you would get a random "sample" of different ##a##-values that would cluster about some central value, and that central value would be ##\alpha## exactly. Therefore, on average, ##a## would estimate ##\alpha## without any systematic bias. Similarly for ##b## vs. ##\beta.## In a single run of the experiment, your best bet is to take ##a## as your estimate of ##\alpha## and ##b## as your estimate of ##\beta.##
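To make the "repeat the experiment many times" idea concrete, here is a minimal simulation sketch. The true values `ALPHA`, `BETA`, the noise level `SIGMA` and the current values are all made up for illustration, and the errors are assumed Gaussian with the same spread at every point (the simple case above). The notation follows this post: `a` is the intercept and `b` is the slope.

```python
import random


def ols_fit(xs, ys):
    """Ordinary least-squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b


# Hypothetical "true" values, chosen only for illustration.
ALPHA, BETA, SIGMA = 0.0, 200.0, 0.05    # intercept (V), slope (ohm), noise (V)
xs = [0.0025 * k for k in range(1, 11)]  # currents in A, the same in every run

rng = random.Random(0)  # fixed seed so the run is repeatable
runs = 5000
intercepts, slopes = [], []
for _ in range(runs):
    # One simulated "run of the experiment": same x values, fresh errors.
    ys = [ALPHA + BETA * x + rng.gauss(0.0, SIGMA) for x in xs]
    a, b = ols_fit(xs, ys)
    intercepts.append(a)
    slopes.append(b)

mean_a = sum(intercepts) / runs
mean_b = sum(slopes) / runs
print(f"average a over {runs} runs: {mean_a:.4f} (alpha = {ALPHA})")
print(f"average b over {runs} runs: {mean_b:.2f} (beta = {BETA})")
```

Individual runs scatter by a couple of ohms in the slope, but the averages settle close to the true values, which is what "unbiased" means here.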

So, where do your "original" errors ##\epsilon_i## come in? Well, they are estimated by the "fitted errors" (residuals) ##e_i = y_i - a - b x_i.## If ##S^*## is the minimum value of ##S## above (obtained by plugging your solution for ##a, b## into the formula for ##S##), it turns out that ##S^*## is an estimate of ##(n-2) \sigma^2##, where ##\sigma^2## is the variance parameter for the distribution of the errors ##\epsilon##. In other words, ##S^*/(n-2)## estimates ##\sigma^2##, which is typically how a fit can report errors on the coefficients without being told your error bars.
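A rough numerical check of the ##S^*/(n-2)## statement, again with invented numbers (`SIGMA`, the slope of 200 and the current values are assumptions for illustration) and equal-variance Gaussian errors:

```python
import random


def ols_fit(xs, ys):
    """Ordinary least-squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_mean - b * x_mean, b


def residual_variance(xs, ys):
    """Return S*/(n-2), the usual estimate of sigma^2 from the residuals."""
    a, b = ols_fit(xs, ys)
    s_star = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    return s_star / (len(xs) - 2)


SIGMA = 0.05                             # hypothetical noise level in V
xs = [0.0025 * k for k in range(1, 11)]  # hypothetical currents in A

rng = random.Random(1)  # fixed seed so the run is repeatable
estimates = []
for _ in range(5000):
    ys = [200.0 * x + rng.gauss(0.0, SIGMA) for x in xs]
    estimates.append(residual_variance(xs, ys))

mean_estimate = sum(estimates) / len(estimates)
print(f"average of S*/(n-2): {mean_estimate:.6f}")
print(f"true sigma^2:        {SIGMA ** 2:.6f}")
```

A single run with only ten points gives a fairly noisy estimate of ##\sigma^2##; it is the average over many runs that lands on the true value.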

As I said already, things are more complicated if the different ##\epsilon_i## are independent, but come from different distributions, as in your conjectured case where the error distribution varies with the value of ##x##.

Ray Vickson said:
Up to a point.
Consider this extreme situation: the changes in current are so small compared with the gradations on the meter that all readings are the same. Now the fit is perfect. Only the error bars embody the information for the accuracy of the result.
While that is extreme it does illustrate the principle.
Likewise, in the heteroscedastic scenario proposed in post #1 (reading error proportional to value in this case), the error bars can handle this but a fit based purely on the readings will not.

Setting the error bar issue aside and just trying to deal with the heteroscedasticity, it might not be that hard to weight the values. It is complicated by the b term in ax+b: is the uncertainty to be considered as proportional to x or as proportional to y?
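On the practical side, and assuming the Python fit in post #1 was done with NumPy's `polyfit` (the thread does not say which routine was used), weighting is available through its `w` argument. The sketch below uses simulated data: the 200 ohm value, the current range and the noise model (5% of the true voltage, independent from point to point, current treated as exact) are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Hypothetical data: 0.5 V to 5 V in 0.25 V steps across a 200 ohm resistor.
R_TRUE = 200.0
current = np.linspace(0.0025, 0.025, 19)   # A, treated as exact
sigma_v = 0.05 * R_TRUE * current          # V, error bar on each voltage
voltage = R_TRUE * current + rng.normal(0.0, sigma_v)

# Unweighted fit: the error bars are never used; the covariance is scaled
# from the scatter of the residuals about the fitted line.
coef_u, cov_u = np.polyfit(current, voltage, 1, cov=True)

# Weighted fit: w = 1/sigma, and cov="unscaled" takes the sigmas at face
# value instead of rescaling them from the residuals.
coef_w, cov_w = np.polyfit(current, voltage, 1, w=1.0 / sigma_v,
                           cov="unscaled")

slope_u, slope_u_err = coef_u[0], np.sqrt(cov_u[0, 0])
slope_w, slope_w_err = coef_w[0], np.sqrt(cov_w[0, 0])
print(f"unweighted: R = {slope_u:.1f} +/- {slope_u_err:.1f} ohm")
print(f"weighted:   R = {slope_w:.1f} +/- {slope_w_err:.1f} ohm")
```

With `cov="unscaled"` the reported error comes entirely from the error bars you supply, so it only makes sense if those sigmas are trustworthy. This covers only independent errors in y; it does not address errors in the current or a shared calibration error.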

Felipe Lincoln
I think I got it.
In the case I proposed the error increases with x, so if we take some points close to 0 and others far from the origin, the error bars of the far points would be larger. So in this case I would have more of an accuracy error than a precision error? Since the error is 5%, the instrument could be reading 5% lower than the true value, right?

If you are supposing a percentage error in the instrument, it matters whether you consider it to be independent with each measurement or, more likely, systematic. E.g. if one reading is 2% high then it suggests they all are. That would not be representable by error bars in the plot, and certainly would not be hinted at by the goodness of fit of the regression line. You would instead apply that to the coefficient "a" at the end of your calculations.
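A small sketch of why a systematic error is invisible to the fit, using made-up numbers (a 200 ohm resistor and a hypothetical voltmeter that reads 2% high on every reading, with no random noise at all):

```python
def ols_fit(xs, ys):
    """Ordinary least-squares line y = slope*x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


R_TRUE = 200.0     # ohm, hypothetical
GAIN_ERROR = 1.02  # hypothetical: every voltage reading is 2 % too high

currents = [0.0025 + 0.00125 * k for k in range(19)]  # A
true_voltages = [R_TRUE * i for i in currents]
read_voltages = [GAIN_ERROR * v for v in true_voltages]

slope, intercept = ols_fit(currents, read_voltages)
residuals = [v - (slope * i + intercept)
             for i, v in zip(currents, read_voltages)]
max_residual = max(abs(r) for r in residuals)

print(f"fitted slope: {slope:.3f} ohm")
print(f"largest residual: {max_residual:.2e} V")

# The fit cannot see the calibration error, so it is carried separately as
# a relative uncertainty on the slope at the end of the calculation.
systematic_err = 0.02 * slope
print(f"systematic uncertainty on R: +/- {systematic_err:.1f} ohm")
```

The points lie on a perfect straight line, so the residuals and the fit's own covariance say nothing about the 2% offset in the slope.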

One practical point relating to this specific experiment: resistance is affected by temperature. Unless you take steps to keep the resistor at constant temperature your readings may be affected by the order in which they are taken and the duration of each test.

For what it's worth, if we have measurements ##(x_i, y_i)## and each ##y_i## is considered to be independently subject to a normally distributed error with variance ##\sigma_i^2##, then we can apply the standard regression formulae for ##y=ax+b## (written in terms of averages) but using ##(\sum y_i w_i)/\sum w_i## in place of ##(\sum y_i)/n##, ##(\sum x_i y_i w_i)/\sum w_i## in place of ##(\sum x_i y_i)/n##, etc., where ##w_i = 1/\sigma_i^2##.
That gives the unbiased estimates for a and b, but I have not tried to estimate the error in these.
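A minimal sketch of those weighted formulae in plain Python, following the ##y=ax+b## convention of this post (`a` is the slope). The standard errors returned are the usual textbook propagation result for independent errors in y with known ##\sigma_i## and exact x; they were not derived in this thread, so treat them as an assumption. The function name and the sample data are made up for illustration.

```python
import math


def weighted_line_fit(xs, ys, sigmas):
    """Weighted least squares for y = a*x + b with weights w_i = 1/sigma_i^2.

    Returns (a, b, err_a, err_b): slope, intercept and their standard
    errors, assuming independent errors in y and exact x values.
    """
    ws = [1.0 / s ** 2 for s in sigmas]
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    delta = sw * swxx - swx ** 2
    a = (sw * swxy - swx * swy) / delta    # slope
    b = (swxx * swy - swx * swxy) / delta  # intercept
    return a, b, math.sqrt(sw / delta), math.sqrt(swxx / delta)


# Hypothetical noise-free data on y = 200*x + 0.1 with 5 % error bars,
# used only to check that the formulae recover the line.
xs = [0.005, 0.010, 0.015, 0.020, 0.025]
ys = [200.0 * x + 0.1 for x in xs]
sigmas = [0.05 * y for y in ys]

a, b, err_a, err_b = weighted_line_fit(xs, ys, sigmas)
print(f"a = {a:.3f} +/- {err_a:.3f}")
print(f"b = {b:.4f} +/- {err_b:.4f}")
```

Taking the error bars as 5% of y here answers the earlier "proportional to x or to y" question one particular way; using 5% of ##a x## instead would give different weights when ##b## is not zero.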

Felipe Lincoln
It's all more clear now!
Thank you all