Hello,
The first thing to do in such a case (and many others!) is to make a plot of the data:
The blue points are your data and the red lines indicate the given errors. They came with the data points and can be classified as internal errors.
(As opposed to external errors, which follow from the scatter of the data. I hope I have this right; perhaps someone has a good reference?)
I added some ornaments:
The blue line is a linear fit without weights. It yields ##y = (1.43\pm0.41) x + (25.27\pm 0.76)## (the errors are found using Excel | Data | Data Analysis | Regression). With only four data points the error on the error estimate is some 50%, so we really should report only ##y = (1.4\pm0.4) x + (25.3\pm 0.8)##. The blue line goes through the unweighted average point ##\ (x_{\sf avg},y_{\sf avg})=(1.5,27.4)\ ## in the middle.
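For reference, this unweighted fit and its parameter errors can be done by hand with the standard least-squares formulas. A sketch in Python; the x and y arrays below are stand-ins, not the actual data:

```python
import numpy as np

# Stand-in numbers only -- the actual four data points are not reproduced here.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([25.0, 27.5, 27.0, 30.1])

n = len(x)
xbar, ybar = x.mean(), y.mean()
Sxx = np.sum((x - xbar) ** 2)

a = np.sum((x - xbar) * (y - ybar)) / Sxx    # slope
b = ybar - a * xbar                          # intercept

# Residual variance with n - 2 degrees of freedom (two fitted parameters),
# which is what Excel's Regression tool reports as well.
s2 = np.sum((y - (a * x + b)) ** 2) / (n - 2)
sigma_a = np.sqrt(s2 / Sxx)
sigma_b = np.sqrt(s2 * (1.0 / n + xbar**2 / Sxx))
```

Note that ##b = \bar y - a\bar x## is exactly the statement that the line goes through the average point.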
The purple line represents the weighted average ##y## with the dashed lines at ##\pm \sigma##. As you say, all data are consistent with ##\ y = 26.3\pm 0.9##
[Edit] I made an error in the error: I should have said the weighted average is ##\ y = 26.3\pm 0.4##, which means two points are outside ##\pm\sigma##.
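For completeness, the inverse-variance weighted average and its standard error can be sketched in a few lines of Python. The y values below are made up, and the sigmas are only chosen to reproduce the weights quoted further down (0.29, 1.04, 0.24, 0.02):

```python
import numpy as np

# Made-up y values; sigmas chosen so that 1/sigma**2 matches the quoted weights.
y     = np.array([25.0, 26.5, 27.0, 28.0])
sigma = np.array([1.86, 0.98, 2.04, 7.07])

w = 1.0 / sigma**2                        # inverse-variance weights
y_wavg     = np.sum(w * y) / np.sum(w)    # weighted average
sigma_wavg = 1.0 / np.sqrt(np.sum(w))     # its standard error
```

The standard error of the weighted mean is always smaller than the smallest individual error, which is why combining points pays off.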
Now that we have something to look at, we can start asking questions. That is to say: you can start asking yourself questions, because we don't have any idea at all how the data and the error estimates came about.
Statistics hardly apply with so few data points. If you repeated the measurements (with e.g. ##x = -0.5, 0.5, 1.5, 2.5, 3.5##), it could happen that the new points fall exactly on a parabola through the old points. Or the new points could scatter like crazy, while still being consistent with the errors you estimate. There is no way to tell from the outside.
In the first case the internal errors are an over-estimate and you can consider the possibility that they have something systematic in common.
In the second case the error estimates from your fit will be considerable.
Further comment: looking at the data alone, a second-order fit yields an almost perfect result. All the more reason to ask what these data represent and how the results and the error estimates came about!
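Checking a second-order fit is a one-liner. A sketch with stand-in numbers (a higher-degree polynomial can only lower the residual sum of squares, so the question is whether the drop is dramatic):

```python
import numpy as np

# Stand-in data again -- just to show the check, not to reproduce the actual result.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([25.0, 27.5, 27.0, 30.1])

ss = {}
for deg in (1, 2):
    p = np.polyfit(x, y, deg)            # least-squares polynomial of given degree
    resid = y - np.polyval(p, x)
    ss[deg] = np.sum(resid**2)           # residual sum of squares
```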
Malamala said:
I did this fit using Python
That is about as useful as stating that you used a keyboard!
I'm not good at weighted stuff, but I made an attempt to at least visually represent your results:
Same figure with your fit result, now with a weighted average ##\ (x_{\sf avg},y_{\sf avg})=(0.99,26.34)\ ## in the middle.
Malamala said:
again please forgive my lack of knowledge in statistics?
Stop apologizing. You want to learn something and you are making a very good effort!
Malamala said:
my value seems to be 2 sigma away from zero. Does that make sense? Also, why is the covariance -1? I remember that if the variables are so well correlated is not a good sign for a fit. Does my fit makes sense? Is there anything I can do to improve it? Thank you!
As you can see, the fit result is quite consistent with the data. Note that for weighting, only the relative errors count (i.e. the errors relative to each other): a single common factor for ALL errors does not change the result. Due to the relatively large error in the 4th point, it hardly contributes (the weights are 0.29, 1.04, 0.24, 0.02). So it is hardly surprising that the first three points determine the result and the weighted average is very close to the second point.
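This scale invariance is easy to verify numerically. A sketch (made-up y values, sigmas chosen to reproduce the quoted weights):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([25.0, 26.5, 27.0, 28.0])       # made-up y values
sigma = np.array([1.86, 0.98, 2.04, 7.07])   # errors implied by the quoted weights

# np.polyfit's w multiplies the residuals, so pass w = 1/sigma
# (the effective least-squares weight is then 1/sigma**2).
p1 = np.polyfit(x, y, 1, w=1.0 / sigma)
p2 = np.polyfit(x, y, 1, w=1.0 / (10.0 * sigma))   # ALL errors scaled by 10
```

The fitted slope and intercept in `p1` and `p2` come out identical: a common factor drops out of the minimization.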
If the error estimates are exactly right, about 32% of the data will be more than one sigma away from the fit. For your data that would only start to happen if you divided all the error estimates by 3 (i.e. increased all the weights by a factor of 9!). The fourth point would then be at 1 ##\sigma##. More importantly, a slope ##a=0## would look very unlikely (still, with real statistics, 5% of the data fall outside ##\pm 2\sigma##).
Now about this correlation in the estimated errors for ##a## and ##b##. I have shown these average points in the plots for a reason: linear fit lines always go through those points, and the lines can be moved in two independent ways: wiggle (affecting ##a##) and shift parallel up and down (affecting ##b##). Independent, so at that point the errors are uncorrelated. Repeating the fit after subtracting the 0.994 from all ##x## should yield zero correlation. Something worth checking, and easily done.
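That prediction can be checked with a minimal weighted-fit sketch (made-up y values, the weights as quoted above; the covariance formula here takes the weights as absolute, i.e. ##w = 1/\sigma^2##):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([25.0, 26.5, 27.0, 28.0])     # made-up y values
w = np.array([0.29, 1.04, 0.24, 0.02])     # the weights quoted above

def weighted_line_fit(x, y, w):
    # Design matrix for y = a*x + b; parameter covariance is (X^T W X)^-1.
    X = np.column_stack([x, np.ones_like(x)])
    cov = np.linalg.inv(X.T @ (w[:, None] * X))
    params = cov @ (X.T @ (w * y))
    return params, cov

x_wavg = np.sum(w * x) / np.sum(w)
_, cov_raw     = weighted_line_fit(x, y, w)
_, cov_shifted = weighted_line_fit(x - x_wavg, y, w)
```

Before the shift the off-diagonal element of the covariance matrix is negative; after subtracting the weighted average ##x## it vanishes, because the off-diagonal of ##X^T W X## is ##\sum w_i x_i##, which is zero by construction.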
With your original x-scale the wiggling influences the y-axis intercept much more than the shifting up and down, and that's where this -1.03 correlation comes from.
(Actually I am more used to correlation coefficients, but I think your -1.03 is ##\sigma_{ab}##, so to get the correlation coefficient you'd compute ##\displaystyle{\sigma_{ab}\over \sigma_a\sigma_b}##. That way |coefficient| ##\le## 1, as desired.)
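The conversion from covariance to correlation, as a sketch (the matrix entries are hypothetical, only the -1.03 echoes your number):

```python
import numpy as np

# Hypothetical 2x2 parameter covariance matrix; the off-diagonal element
# plays the role of sigma_ab.
cov = np.array([[ 1.44, -1.03],
                [-1.03,  0.81]])

sigma = np.sqrt(np.diag(cov))            # sigma_a, sigma_b
corr  = cov / np.outer(sigma, sigma)     # correlation matrix, entries in [-1, 1]
```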
Malamala said:
Does my fit makes sense? Is there anything I can do to improve it? Thank you!
Yes it does.
The best thing to do is to add more points. Statistics with just a few points is very risky.
You're welcome !