Find the error in a linear regression

In summary, the error (the residual variance) is a measure of how well a linear regression fits the data. It cannot be used to improve a least-squares line fitted to the same data, but a pattern in the errors can indicate that a different model would predict future values better.
  • #1
xeon123
Hi,

I am trying to understand how to find the error in a linear regression, and what to do with it. I am using linear regression to predict execution time from the size of the input and the number of tasks the computer uses to produce the result.

1 - In a linear regression, I calculate the error by finding the difference [itex](y-\hat{y})[/itex] between each point in the scatter plot and the regression line. Is the mean squared error a way to get the error? Can I use it to predict again? Thanks.
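The calculation described here can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it uses a single predictor (input size), the data values are invented, and the function names `fit_line`, `residuals` and `mean_squared_error` are hypothetical helpers, not part of any package.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def residuals(xs, ys, a, b):
    """Differences y - y_hat between observed and predicted values."""
    return [y - (a * x + b) for x, y in zip(xs, ys)]


def mean_squared_error(res):
    """Average of the squared residuals."""
    return sum(r * r for r in res) / len(res)


# Invented example data: input sizes and measured execution times.
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]
times = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(sizes, times)
res = residuals(sizes, times, a, b)
mse = mean_squared_error(res)

print("slope:", a, "intercept:", b)
print("residuals:", res)
print("MSE:", mse, "RMSE:", mse ** 0.5)
```

The square root of the MSE is in the same units as the execution time, so it gives a rough sense of how far a prediction typically lands from the measured value on the data used for the fit.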
 
  • #2
If you fit the original regression line to the data to minimize the mean square error, then you cannot use the information from the error to find a line that makes the mean square error smaller when fit to the same data. (You aren't being specific when you say that you "used linear regression" because there are different ways to fit a regression line to data. The most commonly seen way is to fit the line that minimizes the mean square error.)
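A quick numerical check of this point, as a sketch on invented data with a hypothetical helper `fit_line`: after a least-squares fit, fitting a second line to the residuals themselves gives a slope and intercept of essentially zero, so the residuals contain no further linear correction for the same data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Invented example data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# First fit: the least-squares line through the data.
a, b = fit_line(xs, ys)
res = [y - (a * x + b) for x, y in zip(xs, ys)]

# Second fit: try to explain the residuals with another line in x.
a_res, b_res = fit_line(xs, res)

print("correction slope:", a_res)
print("correction intercept:", b_res)
```

Both correction values come out as zero up to floating-point rounding, which is what minimizing the mean square error guarantees.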

You might be able to improve your prediction if you are willing to fit a non-linear function to the data, provided that what you mean by "improve" is making the mean square error between the prediction function and the data smaller.
 
  • #3
So maybe the question should be: how can the error be useful in a linear regression?
 
  • #4
The error is one way to quantify how well the regression fits the data.

You are asking questions that express natural human emotions, but they are not precise mathematical questions. It's natural to ask "how is this useful", but this doesn't ask a precise mathematical question unless you can explain what "useful" means in your particular situation.

If you know the precision of the experimental equipment you are using, you can check whether the mean square error you get from the data is approximately the mean square error that the equipment normally produces.
 
  • #5
Are you just trying to estimate the residual variance for a simple linear or multiple linear regression?
 
  • #6
@chiro

I thought the error (residual variance?) means the same thing in simple and multiple linear regression?

@Stephen

I am using simple and linear regression to predict the time that a task will take before executing it. I know that there is an error between the estimation and the real value after executing the task. I am assuming that the error refers to the '[itex]\epsilon[/itex]' in the linear regression equation. You said previously that I cannot use the error to find a line that makes the mean square error smaller when fit to the same data. This includes also for the next prediction?
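
For reference, the quantities discussed so far can be written out in standard textbook notation. The symbols [itex]\beta_0[/itex], [itex]\beta_1[/itex], [itex]e_i[/itex], [itex]n[/itex] and [itex]p[/itex] are introduced here for illustration and do not appear in the posts above. The simple linear regression model is

[itex]y_i = \beta_0 + \beta_1 x_i + \epsilon_i[/itex]

The [itex]\epsilon_i[/itex] are not observed. What can be computed after fitting are the residuals, which act as estimates of them:

[itex]e_i = y_i - \hat{y}_i[/itex]

With [itex]n[/itex] data points and [itex]p[/itex] predictors ([itex]p = 1[/itex] for simple linear regression), the mean squared error of the fit and the usual estimate of the residual variance are

[itex]\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} e_i^2, \qquad \hat{\sigma}^2 = \frac{1}{n-p-1}\sum_{i=1}^{n} e_i^2[/itex]

The two differ only in the divisor, so the idea is the same in simple and multiple regression; only the number of fitted coefficients changes.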
 
  • #7
It usually is (with a normal distribution), but you can have all kinds of covariance structures, so it's always good to ask.
 
  • #8
xeon123 said:
This includes also for the next prediction?

If you judge that the errors in a least-squares prediction fit are independent random variables, then you can't improve the model by using the data from the errors. If there is some dependence of the errors on the variables in the model, you might improve the model. For example, if the errors in the prediction y = Ax + b are larger for smaller values of x, then this suggests you would do better trying a function with a different shape, or using two different functions, one for smaller values of x and one for larger values.
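
As a rough sketch of the "two different functions" idea: the data below are invented so that y is nearly flat for small x and rises steeply for large x, the split point is chosen by hand, and `fit_line` and `sum_sq_err` are hypothetical helpers. The residuals of the single line show a systematic sign pattern rather than independent scatter, and fitting one line per range gives a much smaller mean square error.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def sum_sq_err(xs, ys, a, b):
    """Sum of squared residuals of the line y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


# Invented data: flat for small x, steep for large x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.1, 0.9, 1.0, 3.0, 5.1, 6.9, 9.0]

# One line for all the data.
a, b = fit_line(xs, ys)
res_one = [y - (a * x + b) for x, y in zip(xs, ys)]
mse_one = sum_sq_err(xs, ys, a, b) / len(xs)

# Two lines: one for the smaller x values, one for the larger.
split = 4  # index of the hand-picked split point (an assumption)
a_lo, b_lo = fit_line(xs[:split], ys[:split])
a_hi, b_hi = fit_line(xs[split:], ys[split:])
sse_two = (sum_sq_err(xs[:split], ys[:split], a_lo, b_lo)
           + sum_sq_err(xs[split:], ys[split:], a_hi, b_hi))
mse_two = sse_two / len(xs)

print("single-line residuals:", res_one)
print("MSE, one line: ", mse_one)
print("MSE, two lines:", mse_two)
```

A smaller error on the fitted data does not by itself mean better predictions on new data, so a comparison like this is best repeated on measurements that were not used for fitting.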

If you are asking whether the typical regression software package has some way to take the mean square error of a linear regression and produce a better linear regression, the answer is no. I don't know whether you are approaching this problem by thinking about it in detail or thinking about it only in terms of what you can do with a software package.
 

1. What is a linear regression?

A linear regression is a statistical method used to model the relationship between a response variable y and one or more predictor variables x. It assumes that y is a linear function of the predictors plus a random error term, so that in the one-predictor case a change in x produces a proportional change in the predicted y.

2. Why is it important to find errors in a linear regression?

Finding errors in a linear regression is important because it allows us to assess the accuracy and validity of the model. If there are significant errors, it means that the model is not a good fit for the data and may need to be revised or improved.

3. How do you find errors in a linear regression?

To find errors in a linear regression, you can use various statistical methods such as residual analysis, hypothesis testing, and visual inspection of the data. These methods help to identify any patterns or outliers in the data that may be affecting the accuracy of the model.

4. What are some common errors in linear regression?

Some common errors in linear regression include non-linearity, heteroscedasticity (unequal variances), autocorrelation (dependence between data points), and multicollinearity (high correlation between predictor variables). These errors can lead to biased or unreliable results.
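
One of these checks can be sketched briefly. The Durbin-Watson statistic is a standard diagnostic for first-order autocorrelation in residuals taken in their observation order: values near 2 suggest little autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The function name `durbin_watson` and the residual lists below are invented for illustration.

```python
def durbin_watson(res):
    """Durbin-Watson statistic for residuals in observation order."""
    num = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    den = sum(r * r for r in res)
    return num / den


# Invented residual sequences.
runs = [1.0, 1.0, -1.0, -1.0]         # neighbours tend to share a sign
alternating = [1.0, -1.0, 1.0, -1.0]  # neighbours tend to flip sign

dw_runs = durbin_watson(runs)
dw_alternating = durbin_watson(alternating)

print("runs:", dw_runs)
print("alternating:", dw_alternating)
```

The first sequence gives a value below 2 and the second a value above 2, matching the direction of dependence built into each list.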

5. How can errors in a linear regression be corrected?

Errors in a linear regression can be corrected by using advanced techniques such as robust regression, generalized linear models, or nonlinear regression. It is also important to carefully examine the data and consider removing outliers or transforming the variables if necessary. Proper model selection and validation are also crucial for minimizing errors in linear regression.
