Root Mean Square Error, a straight line fit and a gradient issue

K29
I have some measurements from a physics lab experiment and I am coding a fit for the data in Matlab. [Note: this is not a problem with Matlab; my problem here is the theory.]

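In ordinary statistics, the standard error of the mean is given by:

$$s=\frac{\sigma}{\sqrt{n}} =\sqrt{\frac{\Sigma (\epsilon _i)^2}{n(n-1)}}$$
where ##\sigma## is the sample standard deviation (the root mean square deviation) and ##\epsilon_i## is the deviation of the ##i##-th reading from the mean.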

Now, according to my physics lab manual:

"For large n the standard error of the mean implies 68% confidence interval. For small n this is not reliable and it is necessary to multiply \sigma by a certain factor t, to obtain the appropriate confidence interval."

They then give a table with ##t = 12.7## for ##n = 2## and ##t = 4.3## for ##n = 3## (t is reduced by a factor of 1/3.6 for each n).
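As a rough illustration of the manual's recipe, here is a minimal Python sketch that computes the standard error of the mean and widens it by the small-sample factor. The `T_95` table is an assumption: the quoted values 12.7 and 4.3 match the two-sided 95% Student t table for 1 and 2 degrees of freedom (that is, ##n - 1## for a mean), so that table is used here. The function names and the readings are hypothetical.

```python
import math
import statistics

# Two-sided 95% Student t critical values, keyed by degrees of freedom.
# Assumption: this is the table the lab manual is quoting (12.7, 4.3, ...).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776}


def standard_error_of_mean(readings):
    """Return s = sigma / sqrt(n), with sigma the sample standard deviation."""
    n = len(readings)
    sigma = statistics.stdev(readings)  # divides by n - 1
    return sigma / math.sqrt(n)


def mean_with_interval(readings):
    """Return (mean, half_width) where half_width = t * s for n - 1 degrees of freedom."""
    n = len(readings)
    dof = n - 1  # one parameter (the mean) was estimated from the data
    if dof not in T_95:
        raise ValueError("no tabulated t value for this sample size")
    return statistics.mean(readings), T_95[dof] * standard_error_of_mean(readings)


# Hypothetical readings, for illustration only.
readings = [9.79, 9.82, 9.85]
mean, half_width = mean_with_interval(readings)
print(f"mean = {mean:.3f} +/- {half_width:.3f}")
```

Note that the factor is applied to the standard error ##s##, and that the table is indexed by degrees of freedom rather than by ##n## directly.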

Onwards...

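The root mean square error for the straight line fit is given by:
$$S_{y}=\sqrt{\frac{\Sigma(\delta y_{i}^{2})}{n-2}}$$
where ##\delta y_{i}## is the difference between the ##i##-th measured ##y## value and the fitted line.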

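The error in the gradient of the straight line fit is:
$$S_{m}=S_{y}\sqrt{\frac{n}{n \Sigma (x_{i}^{2})-(\Sigma x_{i})^2 }}$$
(The same expression with ##\Sigma x_{i}^{2}## in place of ##n## in the numerator gives the error in the intercept, not the gradient.)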
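For anyone who wants to check the numbers by hand, here is a minimal Python sketch of an unweighted least-squares line fit that returns ##S_y## together with the standard errors of the gradient and intercept. It uses the standard least-squares results, in which the gradient error has ##n## in the numerator and the intercept error has ##\Sigma x_i^2##. The function name `fit_line` and the data are hypothetical.

```python
import math


def fit_line(x, y):
    """Unweighted least-squares fit y = m*x + c.

    Returns (m, c, s_y, s_m, s_c): gradient, intercept, RMS error of the fit,
    and the standard errors of the gradient and intercept.
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 points so that n - 2 > 0")
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    delta = n * sxx - sx * sx  # n*Sum(x^2) - (Sum x)^2

    m = (n * sxy - sx * sy) / delta
    c = (sxx * sy - sx * sxy) / delta

    # Residuals about the fitted line; two fitted parameters leave n - 2 degrees of freedom.
    ss_res = sum((yi - (m * xi + c)) ** 2 for xi, yi in zip(x, y))
    s_y = math.sqrt(ss_res / (n - 2))

    s_m = s_y * math.sqrt(n / delta)    # error in the gradient
    s_c = s_y * math.sqrt(sxx / delta)  # error in the intercept
    return m, c, s_y, s_m, s_c


# Hypothetical three-point data set, for illustration only.
m, c, s_y, s_m, s_c = fit_line([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(f"m = {m:.3f} +/- {s_m:.3f}, c = {c:.3f} +/- {s_c:.3f}, S_y = {s_y:.3f}")
```

With only 3 points the divisor ##n - 2## is 1, so ##S_y## rests on a single degree of freedom.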

Now for my plot I have only 3 data points. They are, however, very accurate. The R-squared is about 0.98 (the fit explains 98% of the total variation in the data about the average).

But the RMSE is quite large because there are only 3 data points, so my error for the gradient is ridiculously large. I cannot find anywhere how the RMSE equation for the graph is actually derived, so I am having difficulty working out how, if, or where I should multiply the factor t into the RMSE equation for a straight line.

Can anyone please help? Thanks in advance
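
For reference, here is a sketch of where the straight-line expressions come from, under the usual assumptions that the ##x_i## are exact and that each ##y_i## scatters independently about the true line with the same variance. The symbol ##\sigma_y^2## for that variance is notation introduced here. The least-squares gradient is linear in the ##y_i##:

$$m=\frac{\Sigma (x_i-\bar{x})\,y_i}{\Sigma (x_i-\bar{x})^2}$$

so its variance follows by adding the independent contributions:

$$\mathrm{Var}(m)=\frac{\Sigma (x_i-\bar{x})^2\,\sigma_y^2}{\left[\Sigma (x_i-\bar{x})^2\right]^2}=\frac{\sigma_y^2}{\Sigma (x_i-\bar{x})^2}=\frac{n\,\sigma_y^2}{n\Sigma x_i^2-(\Sigma x_i)^2}$$

The unknown ##\sigma_y^2## is estimated from the residuals. Two parameters (gradient and intercept) are fitted, which leaves ##n-2## degrees of freedom:

$$S_y^2=\frac{\Sigma\, \delta y_i^2}{n-2}, \qquad S_m=S_y\sqrt{\frac{n}{n\Sigma x_i^2-(\Sigma x_i)^2}}$$

The analogous expression with ##\Sigma x_i^2## in the numerator is the standard error of the intercept. Because ##S_y## is built from ##n-2## degrees of freedom, the small-sample factor for the gradient would be the t value for ##n-2##, giving an interval of the form ##m \pm t_{n-2}\,S_m##.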
 
The statistics of your experiment come from the fact that you took 3 readings. Think about it like this: out of the "universe of possible readings" you picked 3 of them. If you take enough readings, you expect some randomness to show up.

Your statement that they are "very accurate" may be true. It sounds like it may be based on your domain knowledge. Perhaps you can only take 3 readings because of time or cost constraints. You should document your procedure so that anybody else who wants to repeat it can get comparable results.
 
A technical point: to my knowledge, ##\sigma## usually denotes the standard deviation as a population parameter, while the standard error (S.E.) is the standard deviation of a statistic estimated from sample data.
 
K29 said:
Now for my plot I have only 3 data points. They are, however, very accurate. The R-squared is about 0.98 (the fit explains 98% of the total variation in the data about the average).
By "very accurate" do you mean that you measured them very accurately or that you have some subject-matter reason to think that there is very little random variation in the results, or that the R-squared of the linear regression is large? Those three ways to interpret "very accurate" have very different implications.
K29 said:
But the RMSE is quite large because there are only 3 data points, so my error for the gradient is ridiculously large. I cannot find anywhere how the RMSE equation for the graph is actually derived, so I am having difficulty working out how, if, or where I should multiply the factor t into the RMSE equation for a straight line.
I am not familiar with those multipliers for such small samples, but it sounds like you should just multiply ##S_y## by them.
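Since ##S_m## is proportional to ##S_y##, multiplying ##S_y## by t is the same as multiplying the gradient error by t. A minimal Python sketch of that step follows. It assumes the manual's factors are two-sided 95% Student t values and that the relevant degrees of freedom for a line fit are ##n - 2##; both are assumptions, not something the manual quote confirms. The function name and the data are hypothetical.

```python
import math

# Two-sided 95% Student t critical values, keyed by degrees of freedom.
# Assumption: same table as the lab manual's (12.7, 4.3, ...).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776}


def slope_with_interval(x, y):
    """Return (m, s_m, half_width, r_squared) for an unweighted line fit.

    half_width = t * s_m, with t looked up for n - 2 degrees of freedom.
    """
    n = len(x)
    dof = n - 2  # two fitted parameters: gradient and intercept
    if dof not in T_95:
        raise ValueError("no tabulated t value for this number of points")

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    m = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    c = y_bar - m * x_bar

    ss_res = sum((yi - (m * xi + c)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)

    s_y = math.sqrt(ss_res / dof)  # RMS error of the fit
    s_m = s_y / math.sqrt(sxx)     # equals S_y * sqrt(n / (n*Sum(x^2) - (Sum x)^2))
    r_squared = 1.0 - ss_res / ss_tot
    return m, s_m, T_95[dof] * s_m, r_squared


# Hypothetical three-point data set, for illustration only.
m, s_m, half_width, r2 = slope_with_interval([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(f"m = {m:.3f}, S_m = {s_m:.3f}, 95% half-width = {half_width:.3f}, R^2 = {r2:.3f}")
```

With 3 points there is only one degree of freedom, so the factor is the largest one in the table. A high R-squared and a wide interval on the gradient can therefore coexist.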
 