Root Mean Square Error, a straight line fit and a gradient issue

The discussion centers on calculating the Root Mean Square Error (RMSE) for a straight line fit using a small dataset of three accurate measurements from a physics lab experiment. The RMSE is significantly large due to the limited number of data points, leading to a correspondingly large error in the gradient calculation. The lab manual suggests using a multiplier, t, to adjust the standard deviation for small sample sizes, but the exact application of this factor to the RMSE equation remains unclear. Participants emphasize the importance of documenting the procedure for reproducibility and highlight the distinction between population standard deviation and sample standard error. Overall, the challenge lies in integrating the theoretical adjustments for small sample sizes into the RMSE calculations effectively.
K29
I have some measurements from a physics lab experiment and I am coding in Matlab a fit for the data. [Note this is not a problem with Matlab, my problem here is theory]

In ordinary statistics, the standard error of the mean is given by:

$$s=\frac{\sigma}{\sqrt{n}}=\sqrt{\frac{\sum \epsilon_i^2}{n(n-1)}}$$
where ##\sigma## is the standard deviation, or root mean square deviation.
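As a quick numerical check of the formula above, here is a minimal Python sketch (Python rather than the Matlab mentioned in the post). The three readings are hypothetical, made up purely for illustration; the deviations are taken about the sample mean.

```python
import math
import statistics

# Hypothetical repeated readings of one quantity (illustrative values only).
readings = [9.79, 9.82, 9.80]
n = len(readings)

mean = statistics.mean(readings)
sum_sq_dev = sum((r - mean) ** 2 for r in readings)  # sum of squared deviations

# Sample standard deviation (n - 1 in the denominator).
sigma = math.sqrt(sum_sq_dev / (n - 1))

# Standard error of the mean, written both ways from the formula above.
s_from_sigma = sigma / math.sqrt(n)
s_direct = math.sqrt(sum_sq_dev / (n * (n - 1)))

print(f"sigma = {sigma:.5f}, s = {s_direct:.5f}")
```

Both expressions for s agree, which is all the identity in the formula claims.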

Now, according to my physics lab manual:

"For large n the standard error of the mean implies 68% confidence interval. For small n this is not reliable and it is necessary to multiply ##\sigma## by a certain factor t, to obtain the appropriate confidence interval."

They then give a table with t = 12.7 for n = 2 and t = 4.3 for n = 3 (t is reduced by a factor of 1/3.6 for each n).
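The two quoted values coincide with two-sided 95% Student t critical values for n - 1 degrees of freedom. The manual excerpt does not state the confidence level, so that reading is an assumption. For 1 and 2 degrees of freedom the t distribution has a closed-form CDF, so the values can be checked with a short Python sketch (function names are illustrative):

```python
import math

def t_two_sided_df1(confidence):
    """Critical t for 1 degree of freedom (Cauchy): P(|T| < t) = 2*atan(t)/pi."""
    return math.tan(math.pi * confidence / 2)

def t_two_sided_df2(confidence):
    """Critical t for 2 degrees of freedom: P(|T| < t) = t / sqrt(2 + t^2)."""
    return confidence * math.sqrt(2 / (1 - confidence ** 2))

print(f"n = 2 (1 d.o.f.): t = {t_two_sided_df1(0.95):.2f}")
print(f"n = 3 (2 d.o.f.): t = {t_two_sided_df2(0.95):.2f}")
```

Under that assumption, the factor depends on the number of degrees of freedom rather than on n directly, which matters once parameters are fitted to the data.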

Onwards...

The root mean square error for the straight line fit is given by:
$$S_{y}=\sqrt{\frac{\sum (\delta y_{i})^{2}}{n-2}}$$

The error in the gradient of the straight line fit is:
$$S_{m}=S_{y}\sqrt{\frac{n}{n \sum x_{i}^{2}-\left(\sum x_{i}\right)^{2}}}$$
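A minimal Python sketch of these quantities for an unweighted least-squares line, using three hypothetical data points (not the poster's measurements). It uses the standard textbook result in which the slope uncertainty has n in the numerator under the square root; the expression with ##\sum x_i^2## in the numerator is the uncertainty of the intercept, which the sketch also returns for comparison.

```python
import math

def line_fit_with_errors(x, y):
    """Unweighted least-squares fit y = m*x + c with standard errors."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    delta = n * sxx - sx * sx            # n*sum(x^2) - (sum x)^2

    m = (n * sxy - sx * sy) / delta      # gradient
    c = (sxx * sy - sx * sxy) / delta    # intercept

    # Residual scatter about the line; n - 2 because m and c were both fitted.
    ss_res = sum((yi - (m * xi + c)) ** 2 for xi, yi in zip(x, y))
    s_y = math.sqrt(ss_res / (n - 2))

    s_m = s_y * math.sqrt(n / delta)     # standard error of the gradient
    s_c = s_y * math.sqrt(sxx / delta)   # standard error of the intercept
    return m, c, s_y, s_m, s_c

# Hypothetical data, chosen only to illustrate the calculation.
x = [1.0, 2.0, 3.0]
y = [2.1, 3.9, 6.2]
m, c, s_y, s_m, s_c = line_fit_with_errors(x, y)
print(f"m = {m:.3f} +/- {s_m:.3f}, c = {c:.3f} +/- {s_c:.3f}, S_y = {s_y:.3f}")
```

With only three points there is a single degree of freedom left for estimating the scatter, so S_y itself is poorly determined even when the fit looks good.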

Now for my plot I have only 3 data points. They are, however, very accurate. The R-squared value is about 0.98 (the fit explains 98% of the total variation in the data about the mean).

But the RMSE is quite large because there are only 3 data points, so my error for the gradient is ridiculously large. I cannot find anywhere how the RMSE equation for the graph is actually derived, so I am having difficulty working out how, if, or where I should multiply the factor t into the RMSE equation for a straight line.

Can anyone please help? Thanks in advance
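For reference, a short sketch of where the straight-line formulas come from. It assumes the usual least-squares conditions, which the post does not state: independent errors in y with a common variance ##\sigma_y^2##, and negligible error in x.

```latex
\begin{aligned}
m &= \frac{\sum (x_i-\bar{x})\,y_i}{\sum (x_i-\bar{x})^2}
\quad\Rightarrow\quad
\operatorname{Var}(m) = \frac{\sigma_y^2}{\sum (x_i-\bar{x})^2}
 = \frac{n\,\sigma_y^2}{n\sum x_i^2-\left(\sum x_i\right)^2},\\[4pt]
\sigma_y^2 &\approx S_y^2 = \frac{\sum (\delta y_i)^2}{n-2}
\quad\Rightarrow\quad
S_m = S_y\sqrt{\frac{n}{n\sum x_i^2-\left(\sum x_i\right)^2}} .
\end{aligned}
```

The divisor is n - 2 because two parameters (gradient and intercept) are estimated from the same data. With n = 3 that leaves one degree of freedom, so if the t factor is read as a Student t value, the relevant entry would be the one for a single degree of freedom.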
 
The statistics of your experiment come from the fact that you took 3 readings. Think about it like this: out of the "universe of possible readings" you picked 3 of them. If you take enough readings, you expect some randomness to show up.

Your statement that they are "very accurate" may be true. It sounds like it may be based on your domain knowledge. Perhaps you can only take 3 readings due to time or cost constraints. You should document your procedure so that anybody else who wants to repeat it can get comparable results.
 
A technical point. Standard deviation, at least to my knowledge often denoted ##\sigma##, is used for the population parameter, while S.E. (standard error) is used for the standard deviation of a statistic estimated from sample data.
 
K29 said:
Now for my plot I have only 3 data points. They are, however, very accurate. The R-squared value is about 0.98 (the fit explains 98% of the total variation in the data about the mean).
By "very accurate" do you mean that you measured them very accurately or that you have some subject-matter reason to think that there is very little random variation in the results, or that the R-squared of the linear regression is large? Those three ways to interpret "very accurate" have very different implications.
K29 said:
But the RMSE is quite large because there are only 3 data points, so my error for the gradient is ridiculously large. I cannot find anywhere how the RMSE equation for the graph is actually derived, so I am having difficulty working out how, if, or where I should multiply the factor t into the RMSE equation for a straight line.
I am not familiar with those multipliers for such small samples, but it sounds like you should just multiply ##S_y## by them.
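A hedged Python sketch of what that multiplication could look like. Since the gradient error is proportional to ##S_y##, scaling ##S_y## by t scales the gradient error by the same factor. The data are hypothetical, and the choice of t = 12.706 (two-sided 95% Student t for n - 2 = 1 degree of freedom) is an assumption about how the manual's table is meant to be applied to a line fit.

```python
import math

def slope_and_errors(x, y):
    """Least-squares gradient, residual scatter S_y and gradient error S_m."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    delta = n * sxx - sx * sx
    m = (n * sxy - sx * sy) / delta
    c = (sy - m * sx) / n
    ss_res = sum((yi - m * xi - c) ** 2 for xi, yi in zip(x, y))
    s_y = math.sqrt(ss_res / (n - 2))
    return m, s_y, s_y * math.sqrt(n / delta)

# Hypothetical measurements (illustrative only).
x = [1.0, 2.0, 3.0]
y = [2.1, 3.9, 6.2]
m, s_y, s_m = slope_and_errors(x, y)

# Assumed multiplier: 95% two-sided Student t with n - 2 = 1 degree of freedom.
T_95_DF1 = 12.706
half_width = T_95_DF1 * s_m

print(f"gradient = {m:.2f} +/- {s_m:.2f} (standard error)")
print(f"gradient = {m:.2f} +/- {half_width:.2f} (95% interval, assumed t)")
```

Even with points lying close to the line, a single degree of freedom makes the interval wide, which is consistent with the large gradient error described in the question.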
 