Is There a Way to Fit Lines of Worst Fit Using Statistical Packages?

  • Context: Undergrad
  • Thread starter: Beam me down
  • Tags: Fit, Fitting, Lines
SUMMARY

The discussion centers on the challenge of fitting a line of worst fit to data that has already been analyzed using a linear model. The original poster seeks a systematic approach using statistical packages like SPSS, Excel, or R. While traditional linear regression focuses on minimizing squared errors, participants suggest exploring methods such as Maximum Likelihood Estimation (MLE) and confidence intervals for slope and intercept estimates. R is highlighted as a particularly suitable tool for advanced modeling, including spline methods and graphical representations of confidence bands.

PREREQUISITES
  • Understanding of linear regression principles and error minimization.
  • Familiarity with statistical software packages such as SPSS, Excel, and R.
  • Knowledge of Maximum Likelihood Estimation (MLE) and its applications.
  • Basic concepts of confidence intervals and their significance in statistical analysis.
NEXT STEPS
  • Learn how to implement Maximum Likelihood Estimation (MLE) in R.
  • Explore spline methods for data fitting in R.
  • Research how to create confidence and prediction bands in R for regression analysis.
  • Investigate the limitations of using Excel for advanced statistical analysis.
USEFUL FOR

Students in physics or statistics, data analysts, and researchers looking to enhance their understanding of regression analysis and statistical modeling techniques.

Beam me down
I recently completed an experiment for university-level physics and am currently analyzing the data. The data has been transformed to fit a linear model, and it fits very well, with the line of best fit lying within all error bars. I need to provide estimates of the gradient and intercept, as these have a physical interpretation. I was hoping to do a line of worst fit: the worst line that still fits all error bars.

I can do this to some degree of accuracy by plotting the data by hand, but I was wondering if there was a more systematic approach using any statistical package?

I have access to SPSS, Excel and R, though I am not that familiar with the latter. Though I suppose I could always download a trial version of other packages.

So is there any way to fit lines of worst fit?

Thanks for any help you can provide.
 
What are the error bars you refer to? Standard deviations? In linear regression, the estimated line is not required to pass through the error bars; the fact that all your measurements lie within one standard deviation of the line is completely by chance. The best fit is found by minimizing the sum of squared errors, and there is no such thing as a worst-fit line. Why can't you just use the slope and intercept of your best-fit line as your estimates? If you want, you can find confidence intervals (99%, 95%, etc.) for these estimates.
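
As a rough sketch of what that looks like in R (the data below are made up for illustration and are not the OP's measurements), lm() does the least-squares fit and confint() returns intervals for the intercept and slope:

```r
# Made-up illustrative data (not the OP's measurements)
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)

# Ordinary least-squares fit of y = intercept + slope * x
fit <- lm(y ~ x)

# Point estimates of the intercept and slope
print(coef(fit))

# 95% confidence intervals; rows are "(Intercept)" and "x"
ci <- confint(fit, level = 0.95)
print(ci)
```

The two ends of the slope interval are probably the closest standard counterpart to the steepest and shallowest lines one would draw by hand, although they come from the scatter of the points about the line rather than from the error bars themselves.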
 
Wow - lots of options here. If standard methods are not sufficient, there are many ways to approximate a fit to the data, so you can consider complex transformations for testing and estimation. Currently, the R language might be the most amenable to this sort of modeling. You might start with some spline methods. Jack Dagg
 
By "gradient" do you mean slope? If so, there are traditional confidence interval estimates for the slope (and for the intercept as well). You can also get R to create a graph showing the fitted regression line and both the confidence and prediction bounds (or just one of them) for your data. At the prompt in R, type

help(predict)

for more details.
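
A minimal sketch of such a graph, using made-up data (not the OP's measurements) and an arbitrary choice of colors and line types, might look like this:

```r
# Made-up illustrative data (not the OP's measurements)
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)
fit <- lm(y ~ x)

# Grid of x values at which to evaluate the bands
new_x <- data.frame(x = seq(min(x), max(x), length.out = 50))

# Each call returns a matrix with columns "fit", "lwr" and "upr"
conf_band <- predict(fit, newdata = new_x, interval = "confidence", level = 0.95)
pred_band <- predict(fit, newdata = new_x, interval = "prediction", level = 0.95)

# Data, fitted line, confidence band (dashed), prediction band (dotted)
plot(x, y, pch = 19)
abline(fit)
matlines(new_x$x, conf_band[, c("lwr", "upr")], lty = 2, col = "blue")
matlines(new_x$x, pred_band[, c("lwr", "upr")], lty = 3, col = "red")
```

The confidence band describes the uncertainty in the fitted line itself, while the prediction band describes where a new observation would be expected to fall, so the prediction band is always the wider of the two.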
 
Excellent suggestion, statdad. Sometimes it's easy to overlook established methods. I see a lot of students have an aha moment after they calculate the estimated confidence bands, which increase in width as the data limits are approached. Cheers!
 
I'm not sure how the likelihood idea would apply, or really be useful, in a simple linear regression setting. I'm also leery of using Excel for any statistics work past simple means, but that's not the point of this post.
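
For what it's worth, here is a minimal sketch of what the likelihood approach looks like for a straight-line fit. It assumes independent normal errors with constant variance, which nobody in the thread has stated, and the data and the name negloglik are made up for illustration. Under that assumption the maximum-likelihood slope and intercept coincide with the least-squares ones, so the two fits should agree to within the optimizer's tolerance:

```r
# Made-up illustrative data (not the OP's measurements)
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)

# Negative log-likelihood for y = a + b * x + normal noise.
# par = c(a, b, log(sigma)); working with log(sigma) keeps sigma positive.
negloglik <- function(par) {
  mu <- par[1] + par[2] * x
  sigma <- exp(par[3])
  -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
}

# Crude starting values: flat line through the mean of y
start <- c(mean(y), 0, log(sd(y)))
fit_ml <- optim(start, negloglik, method = "BFGS", control = list(maxit = 500))

# Ordinary least-squares fit for comparison
fit_ls <- lm(y ~ x)

# Intercept and slope from both approaches, side by side
print(rbind(ml = fit_ml$par[1:2], ls = coef(fit_ls)))
```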

I should expand my earlier comment: other software will also plot the prediction and/or confidence bands for a regression; it's not limited to R. And yes, jackdagg, it is amusing to see how students respond to seeing those graphs: I've had some actually ask "Is that why extrapolation with these models is so risky?" (Thanks for the comments too.)
 
statdad said:
I'm not sure how the likelihood idea would apply, or really be useful, in a simple linear regression setting. I'm also leery of using Excel for any statistics work past simple means, but that's not the point of this post.

I should expand my earlier comment: other software will also plot the prediction and/or confidence bands for a regression; it's not limited to R. And yes, jackdagg, it is amusing to see how students respond to seeing those graphs: I've had some actually ask "Is that why extrapolation with these models is so risky?" (Thanks for the comments too.)

I don't know what the OP's requirements were. I simply suggested it as another alternative given the OP's need for a "worst fit" estimate. MLE is robust regardless of the validity of the normal assumption, is used by many frequentist statisticians, and the section I cited allows for the calculation of probabilities and confidence intervals. Have you used Excel for MLE? I didn't mention it, but the citation refers to R. I was simply suggesting other possibilities since the OP said he wasn't familiar with R.

http://www.jstatsoft.org/v30/i07/paper
 
SW VandeCarr said:
I don't know what the OP's requirements were. I simply suggested it as another alternative given the OP's need for a "worst fit" estimate. MLE is robust regardless of the validity of the normal assumption, is used by many frequentist statisticians, and the section I cited allows for the calculation of probabilities and confidence intervals. Have you used Excel for MLE? I didn't mention it, but the citation refers to R. I was simply suggesting other possibilities since the OP said he wasn't familiar with R.

http://www.jstatsoft.org/v30/i07/paper

My response wasn't very organized and a couple of ideas were jumbled. My Excel comment was directed at the OP: I am not a fan of Excel for problems of this type or, as I said, for any reasonably advanced stat work. I referenced R for that because I'm most familiar with it but, as noted, any software worth its salt will get that job done. I wasn't implying you suggested Excel, even if poor wording made it seem that way.

I do disagree with the "likelihood methods are robust" comment - but that's not the point of the OP's inquiry.
 