Find the error in a linear regression

Summary: To find the error in a linear regression, compute the difference between the predicted values and the actual data points, commonly summarized by the mean squared error (MSE). The MSE quantifies how well the regression fits the data, but it cannot be used to refine the regression line for the same dataset: if the errors are independent of the predictors, the error information cannot improve the model, whereas if there is a dependence, trying a different functional form may yield better predictions. The discussion emphasizes understanding the nature of the errors and their implications for model accuracy; typical regression software does not automatically improve the fitted line based on the MSE.
xeon123
Hi,

I am trying to understand how to find the error in a linear regression and what to do with it. I am using linear regression to predict execution time based on the input size and the number of tasks used on the computer.

1 - In a linear regression, I calculate the error by taking the difference between the regression line and each scatter point, i.e. the residuals (y - \hat{y}). Is the mean squared error (MSE) a way to quantify this error? Can I use it to improve the next prediction? Thanks.
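For concreteness, here is a minimal NumPy sketch of what that looks like (the size/time data below are made up): fit a least-squares line, form the residuals y_i - \hat{y}_i, and average their squares to get the MSE.

```python
import numpy as np

# Hypothetical data: input size x and measured execution time y
x = np.array([10, 20, 30, 40, 50], dtype=float)
y = np.array([1.2, 2.1, 2.9, 4.2, 5.1])

# Least-squares line y_hat = a*x + b (np.polyfit minimizes the squared error)
a, b = np.polyfit(x, y, deg=1)
y_hat = a * x + b

# Residuals e_i = y_i - y_hat_i, and the mean squared error
residuals = y - y_hat
mse = np.mean(residuals ** 2)
print(f"slope={a:.4f}, intercept={b:.4f}, MSE={mse:.4f}")
```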
 
If you fit the original regression line to the data to minimize the mean square error, then you cannot use the information from the error to find a line that makes the mean square error smaller when fit to the same data. (You aren't being specific when you say that you "used linear regression" because there are different ways to fit a regression line to data. The most commonly seen way is to fit the line that minimizes the mean square error.)

You might be able to improve your prediction if you are willing to fit a non-linear function to the data, if what you mean by "improve" is to make the mean square error between the prediction function and the data smaller.
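As a sketch of that idea (the data below are invented, and np.polyfit stands in for whatever fitting routine you actually use), compare the in-sample MSE of a straight line against a quadratic; on the same data the more flexible function can only make the MSE smaller or equal.

```python
import numpy as np

# Hypothetical data where a straight line underfits slightly
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([1.1, 2.3, 4.2, 6.8, 10.1, 14.2, 19.0, 24.3])

for deg in (1, 2):  # degree 1 = line, degree 2 = quadratic
    coeffs = np.polyfit(x, y, deg)       # least-squares fit of that degree
    y_hat = np.polyval(coeffs, x)        # predictions on the same data
    mse = np.mean((y - y_hat) ** 2)
    print(f"degree {deg}: MSE = {mse:.4f}")
```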
 
So maybe the question should be: how can the error be useful in linear regression?
 
The error is one way to quantify how well the regression fits the data.

You are asking questions that express natural human emotions, but they are not precise mathematical questions. It's natural to ask "how is this useful", but this doesn't ask a precise mathematical question unless you can explain what "useful" means in your particular situation.

If you know the precision of the experimental equipment you are using, you can check whether the mean square error you get from the data is approximately the mean square error that the equipment normally produces.
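A rough sketch of that check, assuming you know the standard deviation of the measurement noise (all numbers below are hypothetical):

```python
import numpy as np

# Hypothetical residuals from the fitted line, and the known precision
# (standard deviation) of the timing measurements.
residuals = np.array([0.08, -0.12, 0.05, -0.03, 0.10])
sigma_measurement = 0.1

mse = np.mean(residuals ** 2)
print(f"MSE from the data:               {mse:.4f}")
print(f"Expected from measurement noise: {sigma_measurement ** 2:.4f}")

# If the data MSE is much larger than sigma^2, the misfit of the line,
# not the measurement noise, dominates the error.
```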
 
Are you just trying to estimate the residual variance for a simple linear or multiple linear regression?
 
@chiro

I thought the error (the residual variance?) means the same thing in simple and multiple linear regression?

@Stephen

I am using simple linear regression to predict the time that a task will take before executing it. I know that there is an error between the estimate and the real value observed after executing the task. I am assuming that this error is the '\epsilon' in the linear regression equation. You said previously that I cannot use the error to find a line that makes the mean square error smaller when fit to the same data. Does this also apply to the next prediction?
 
It usually is (with a normal distribution), but you can have all kinds of covariance structures, so it's always good to ask.
 
xeon123 said:
Does this also apply to the next prediction?

If you judge that the errors in a least-squares fit are independent random variables, then you can't improve the model by using the data from the errors. If there is some dependence of the errors on the variables in the model, you might improve the model. For example, if the errors in the prediction y = Ax + b are larger for smaller values of x, this suggests you would do better trying a function with a different shape, or using two different functions, one for smaller values of x and one for larger values.
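A sketch of that kind of diagnostic (all data invented, split point chosen arbitrarily): check whether the residual magnitudes track x, and compare separate fits for small and large x.

```python
import numpy as np

# Hypothetical data whose errors grow with x
rng = np.random.default_rng(0)
x = np.linspace(1, 100, 60)
y = 0.5 * x + 3 + rng.normal(scale=0.05 * x)  # noise scale grows with x

a, b = np.polyfit(x, y, 1)
residuals = y - (a * x + b)

# Simple diagnostic: do the residual magnitudes depend on x?
corr = np.corrcoef(x, np.abs(residuals))[0, 1]
print(f"corr(|residual|, x) = {corr:.2f}")

# One option from the post above: fit separate lines for small and large x
split = 50
for name, mask in (("small x", x < split), ("large x", x >= split)):
    a_i, b_i = np.polyfit(x[mask], y[mask], 1)
    mse_i = np.mean((y[mask] - (a_i * x[mask] + b_i)) ** 2)
    print(f"{name}: slope={a_i:.3f}, MSE={mse_i:.3f}")
```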

If you are asking whether the typical regression software package has some way to take the mean square error of a linear regression and produce a better linear regression, the answer is no. I don't know whether you are approaching this problem by thinking about it in detail or only in terms of what you can do with a software package.
 
