Find the error in a linear regression

Discussion Overview

The discussion revolves around understanding the concept of error in linear regression, specifically how to calculate it, its implications for prediction, and its usefulness in assessing model fit. Participants explore the relationship between mean squared error and regression line fitting, as well as the potential for improving predictions.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants propose that the mean squared error is a way to quantify the error in linear regression and question its utility for future predictions.
  • Others argue that once a regression line is fitted to minimize mean squared error, the error cannot be used to find a better fitting line for the same data.
  • A participant suggests that fitting a non-linear function might improve predictions if the goal is to reduce mean squared error.
  • There is a discussion about the precision of experimental equipment and its relation to the mean squared error obtained from the data.
  • Some participants inquire whether residual variance is the same in simple and multiple linear regression contexts.
  • One participant mentions that if errors are independent random variables, the model cannot be improved using the errors, but suggests that dependence on model variables might allow for improvements.
  • Concerns are raised about whether typical regression software can enhance a linear regression model based on mean squared error.

Areas of Agreement / Disagreement

Participants express differing views on the utility of error in linear regression, particularly regarding its implications for future predictions and model improvement. There is no consensus on whether the error can be effectively utilized to refine predictions or models.

Contextual Notes

Limitations include the dependence on definitions of error and residual variance, as well as the assumptions regarding the independence of errors and their distribution.

xeon123
Hi,

I am trying to understand how I find the error in linear regression, and what to do with it. I am using linear regression to predict the time of execution based on the size of the input and the number of tasks used in the computer to get the result.

1 - In a linear regression, I calculate the error by finding the difference between each point in the scatter plot and the regression line, $y - \hat{y}$. Is the mean squared error a way to get the error? Can I use it to predict again? Thanks.
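
As an illustration of the calculation being asked about, here is a minimal sketch that fits a line by least squares, computes the residuals $y - \hat{y}$, and averages their squares to get the mean squared error. The data values and the helper names `fit_line` and `mean_squared_error` are hypothetical, and only one predictor (input size) is used to keep the example short.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    a, b = np.polyfit(x, y, deg=1)
    return a, b

def mean_squared_error(y, y_hat):
    """Average of the squared residuals (y - y_hat)."""
    residuals = np.asarray(y) - np.asarray(y_hat)
    return float(np.mean(residuals ** 2))

# Hypothetical measurements: input size vs. execution time in seconds.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

a, b = fit_line(x, y)
y_hat = a * x + b                    # fitted values on the regression line
mse = mean_squared_error(y, y_hat)   # one number summarizing the fit

# A new prediction uses the fitted line, not the error itself.
y_new = a * 6.0 + b
```

The mean squared error does not enter the prediction `y_new`; it only indicates how far off such predictions have typically been on the data used for the fit.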
 
If you fit the original regression line to the data to minimize the mean square error, then you cannot use the information from the error to find a line that makes the mean square error smaller when fit to the same data. (You aren't being specific when you say that you "used linear regression" because there are different ways to fit a regression line to data. The most commonly seen way is to fit the line that minimizes the mean square error.)
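
One way to see this numerically is to fit a second line to the residuals of the first fit. In the sketch below, which uses simulated data and an assumed noise level, the second fit returns a slope and intercept that are zero up to rounding, so the residuals carry no further linear information about the same data.

```python
import numpy as np

# Simulated data (an assumption for illustration): a line plus random noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 3.0 * x + 2.0 + rng.normal(scale=1.0, size=x.size)

# First fit: the least-squares line through the data.
a, b = np.polyfit(x, y, deg=1)
residuals = y - (a * x + b)

# Second fit: try to explain the residuals with another line in x.
a_res, b_res = np.polyfit(x, residuals, deg=1)

# Both coefficients are numerically zero, so adding this "correction"
# to the first line leaves it unchanged.
print(a_res, b_res)
```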

You might be able to improve your prediction if you are willing to fit a non-linear function to the data, if what you mean by "improve" is to make the mean square error between the prediction function and the data smaller.
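
A minimal sketch of that comparison, assuming simulated data with a curved trend and using a quadratic as the non-linear function (the helper `poly_mse` is hypothetical):

```python
import numpy as np

def poly_mse(x, y, degree):
    """Mean squared error of a least-squares polynomial fit of the given degree."""
    coeffs = np.polyfit(x, y, deg=degree)
    return float(np.mean((y - np.polyval(coeffs, x)) ** 2))

# Simulated data (an assumption): a quadratic trend plus random noise.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 40)
y = 0.5 * x**2 + rng.normal(scale=1.0, size=x.size)

mse_linear = poly_mse(x, y, 1)      # straight line
mse_quadratic = poly_mse(x, y, 2)   # curve with one extra term

print(mse_linear, mse_quadratic)
```

Because a straight line is a special case of a quadratic, the quadratic fit can never have a larger mean square error on the same data. A smaller error on the fitted data does not by itself guarantee better predictions on new data.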
 
So maybe the question should be: how can the error be useful in linear regression?
 
The error is one way to quantify how well the regression fits the data.

You are asking questions that express natural human emotions, but they are not precise mathematical questions. It's natural to ask "how is this useful", but this doesn't ask a precise mathematical question unless you can explain what "useful" means in your particular situation.

If you know the precision of the experimental equipment you are using, you can check to see if the mean square error you get from the data is approximately the mean square error that the equipment normally produces.
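
A rough sketch of such a check, assuming the equipment's noise has a known standard deviation (`known_sigma` below is a made-up value) and the data are simulated. The helper `residual_variance` is hypothetical; it divides by n - 2 rather than n, which is the usual unbiased estimate of the noise variance for a straight-line fit.

```python
import numpy as np

def residual_variance(x, y):
    """Noise variance estimate from a straight-line fit: SSR / (n - 2)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (slope * x + intercept)
    return float(np.sum(residuals ** 2) / (x.size - 2))

# Assumed measurement noise of the equipment, in seconds.
known_sigma = 0.5

# Simulated measurements: a linear trend plus noise of that size.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 200)
y = 1.5 * x + 4.0 + rng.normal(scale=known_sigma, size=x.size)

s2 = residual_variance(x, y)
ratio = s2 / known_sigma**2   # close to 1 if the noise accounts for the error

print(s2, ratio)
```

A ratio well above 1 would suggest that the line is missing something beyond the measurement noise.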
 
Are you just trying to estimate the residual variance for a simple linear or multiple linear regression?
 
@chiro

I thought the error (residual variance?) means the same thing in simple and multiple linear regression?

@Stephen

I am using simple and linear regression to predict the time that a task will take before executing it. I know that there is an error between the estimate and the real value after executing the task. I am assuming that the error refers to the $\epsilon$ in the linear regression equation. You said previously that I cannot use the error to find a line that makes the mean square error smaller when fit to the same data. This includes also for the next prediction?
 
It usually is (with a normal distribution), but you can have all kinds of covariance structures, so it's always good to ask.
 
xeon123 said:
This includes also for the next prediction?

If you judge that the errors in a least-squares prediction fit are independent random variables, then you can't improve the model by using the data from the errors. If there is some dependence of the errors on the variables in the model, you might improve the model. For example, if the errors in the prediction y = Ax + b are larger for smaller values of x, then this suggests you would do better trying a function with a different shape, or using two different functions, one for smaller values of x and one for larger values.
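
The two-function idea can be sketched as follows, assuming simulated data whose trend changes slope at x = 3 and a split point chosen by hand (`split` is hypothetical; in practice it would come from inspecting the residuals):

```python
import numpy as np

def sse_line(x, y):
    """Sum of squared residuals of a least-squares line through (x, y)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return float(np.sum((y - (slope * x + intercept)) ** 2))

# Simulated data (an assumption): steep for small x, shallow for large x.
rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 60)
y_true = np.where(x < 3.0, 5.0 * x, 15.0 + 1.0 * (x - 3.0))
y = y_true + rng.normal(scale=0.5, size=x.size)

# Residuals of a single line, summarized separately for small and large x.
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)
split = 3.0                      # hand-picked threshold
small = x < split
abs_res_small = float(np.mean(np.abs(residuals[small])))
abs_res_large = float(np.mean(np.abs(residuals[~small])))

# One line for all the data vs. one line per region.
mse_one_line = sse_line(x, y) / x.size
mse_two_lines = (sse_line(x[small], y[small])
                 + sse_line(x[~small], y[~small])) / x.size

print(abs_res_small, abs_res_large, mse_one_line, mse_two_lines)
```

If the residuals showed no pattern across x, splitting the data like this would bring little benefit.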

If you are asking whether the typical regression software package has some way to take the mean square error of a linear regression and produce a better linear regression, the answer is no. I don't know whether you are approaching this problem by thinking about it in detail or thinking about it only in terms of what you can do with a software package.
 
