Find the error in a linear regression

  1. Jun 21, 2013 #1
    Hi,

    I am trying to understand how to find the error in a linear regression, and what to do with it. I am using linear regression to predict execution time from the size of the input and the number of tasks the computer uses to produce the result.

    1 - In a linear regression, I calculate the error as the difference between each point in the scatter plot and the regression line, [itex](y-\hat{y})[/itex]. Is the mean squared error a way to measure this error? Can I use it when I predict again?
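
    As a minimal sketch of the quantities named above, the residuals [itex](y-\hat{y})[/itex] and the mean squared error can be computed as follows. The data values and the variable names are made up for illustration, and the fit uses ordinary least squares via NumPy, which is an assumption since the thread does not say how the line was fitted.

```python
import numpy as np

# Hypothetical data: input size (x) and measured execution time (y).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Ordinary least-squares line: y_hat = slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

# Residuals are the vertical distances between the points and the line.
residuals = y - y_hat

# Mean squared error: the average of the squared residuals.
mse = np.mean(residuals ** 2)
print("residuals:", residuals)
print("MSE:", mse)
```

    With an intercept in the model, the least-squares residuals sum to zero, so the MSE summarizes their spread rather than their average.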


    Thanks.
     
    Last edited: Jun 21, 2013
  2. jcsd
  3. Jun 21, 2013 #2

    Stephen Tashi

    Science Advisor

    If you fit the original regression line to the data to minimize the mean square error, then you cannot use the information from the error to find a line that makes the mean square error smaller when fit to the same data. (You aren't being specific when you say that you "used linear regression" because there are different ways to fit a regression line to data. The most commonly seen way is to fit the line that minimizes the mean square error.)
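
    A small numerical check of this point, as a sketch on synthetic data (the true slope, intercept, noise level, and perturbation sizes are arbitrary assumptions): once the line is the least-squares fit, nudging its slope or intercept can only make the mean square error on the same data larger.

```python
import numpy as np

# Synthetic data; all numbers here are assumptions for illustration.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 3.0 * x + 2.0 + rng.normal(scale=1.0, size=x.size)

def mse(slope, intercept):
    """Mean square error of the line slope * x + intercept on (x, y)."""
    return np.mean((y - (slope * x + intercept)) ** 2)

# Least-squares fit and its error.
slope, intercept = np.polyfit(x, y, deg=1)
best = mse(slope, intercept)

# Errors of nearby lines, excluding the fitted line itself.
perturbed = [
    mse(slope + ds, intercept + db)
    for ds in (-0.1, 0.0, 0.1)
    for db in (-0.5, 0.0, 0.5)
    if (ds, db) != (0.0, 0.0)
]
print("fitted MSE:", best)
print("smallest perturbed MSE:", min(perturbed))
```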

    You might be able to improve your prediction if you are willing to fit a non-linear function to the data, provided that by "improve" you mean making the mean square error between the prediction function and the data smaller.
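
    To illustrate, here is a sketch that compares a straight line with a quadratic on synthetic data that really is curved. The data-generating formula and the choice of a degree-2 polynomial are assumptions made for the example, not something taken from the original problem.

```python
import numpy as np

# Synthetic curved data; the quadratic trend is an assumption for illustration.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 60)
y = 0.5 * x**2 + 1.0 + rng.normal(scale=1.0, size=x.size)

def fit_mse(degree):
    """Fit a polynomial of the given degree by least squares and return its MSE."""
    coeffs = np.polyfit(x, y, deg=degree)
    return np.mean((y - np.polyval(coeffs, x)) ** 2)

mse_linear = fit_mse(1)
mse_quadratic = fit_mse(2)
print("linear MSE:", mse_linear)
print("quadratic MSE:", mse_quadratic)
```

    A more flexible function never has a larger mean square error on the data it was fitted to, so a smaller value here does not by itself show that predictions on new data will be better.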
     
  4. Jun 21, 2013 #3
    So maybe the question should be: how can the error be useful in linear regression?
     
    Last edited: Jun 21, 2013
  5. Jun 21, 2013 #4

    Stephen Tashi

    Science Advisor

    The error is one way to quantify how well the regression fits the data.
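
    Two common ways to turn the error into a goodness-of-fit number are the root mean square error (in the same units as y) and the coefficient of determination R². The sketch below computes both on made-up data; the data and variable names are hypothetical.

```python
import numpy as np

# Hypothetical data: input size (x) and measured execution time (y).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

# Root mean square error: typical size of a residual, in units of y.
rmse = np.sqrt(np.mean(residuals ** 2))

# R^2: fraction of the variance of y accounted for by the line.
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot
print("RMSE:", rmse)
print("R^2:", r_squared)
```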

    You are asking questions that express natural human emotions, but they are not precise mathematical questions. It's natural to ask "how is this useful", but this doesn't ask a precise mathematical question unless you can explain what "useful" means in your particular situation.

    If you know the precision of the experimental equipment you are using, you can check whether the mean square error you get from the data is approximately the mean square error that the equipment normally produces.
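
    One way such a check might look, as a sketch: simulate measurements with a known noise level, fit the line, and compare the residual standard deviation with that noise level. The value of known_sigma, the true line, and the sample size are all assumptions for the example; with real data, known_sigma would come from the equipment's specification.

```python
import numpy as np

rng = np.random.default_rng(3)
known_sigma = 0.5  # assumed measurement noise of the equipment
n = 500

# Synthetic measurements around an assumed true line y = 2x + 1.
x = np.linspace(0.0, 10.0, n)
y = 2.0 * x + 1.0 + rng.normal(scale=known_sigma, size=n)

slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

# Residual standard deviation; n - 2 because slope and intercept were estimated.
residual_std = np.sqrt(np.sum(residuals ** 2) / (n - 2))

# A ratio near 1 suggests the scatter is about what the equipment alone produces.
ratio = residual_std / known_sigma
print("residual std:", residual_std)
print("ratio to known noise:", ratio)
```

    A ratio well above 1 would hint that something other than measurement noise contributes to the error, for example a relationship that is not really linear.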
     
  6. Jun 21, 2013 #5

    chiro

    Science Advisor

    Are you just trying to estimate the residual variance for a simple linear or multiple linear regression?
     
  7. Jun 22, 2013 #6
    @chiro

    I thought the error (residual variance?) means the same thing in simple and in multiple linear regression?

    @Stephen

    I am using simple and linear regression to predict the time that a task will take before executing it. I know that there is an error between the estimate and the real value after executing the task. I am assuming that the error refers to the '[itex]\epsilon[/itex]' in the linear regression equation. You said previously that I cannot use the error to find a line that makes the mean square error smaller when fit to the same data. Does this also apply to the next prediction?
     
  8. Jun 22, 2013 #7

    chiro

    Science Advisor

    It usually is (with a normal distribution), but you can have all kinds of covariance structures, so it's always good to ask.
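
    For reference, in the usual case of independent errors with equal variance, the standard textbook model and residual variance estimate can be written as below. The symbols n (number of observations), k (number of predictors), and s² are notation introduced here for illustration; simple linear regression is the case k = 1, which gives the divisor n - 2.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \epsilon_i,
\qquad
s^2 = \frac{1}{n - k - 1} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```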
     
  9. Jun 22, 2013 #8

    Stephen Tashi

    Science Advisor

    If you judge that the errors in a least-squares prediction fit are independent random variables, then you can't improve the model by using the data from the errors. If there is some dependence of the errors on the variables in the model, you might improve the model. For example, if the errors in the prediction y = Ax + b are larger for smaller values of x, then this suggests you would do better trying a function with a different shape, or using two different functions, one for smaller values of x and one for larger values.
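
    The following sketch shows one way to look for such dependence and to try the two-function idea. The data are synthetic with a deliberate change of slope at x = 5, and the split point, the three-way binning, and the helper name line_residuals are all assumptions made for the example.

```python
import numpy as np

# Synthetic data with a change of slope at x = 5 (an assumption for illustration).
rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 100)
y = np.where(x < 5.0, x, 3.0 * x - 10.0) + rng.normal(scale=0.2, size=x.size)

def line_residuals(xs, ys):
    """Fit a least-squares line to (xs, ys) and return the residuals."""
    slope, intercept = np.polyfit(xs, ys, deg=1)
    return ys - (slope * xs + intercept)

# Residuals of a single line over all the data.
res_single = line_residuals(x, y)

# If the residuals were unrelated to x, their mean would be near zero in any range of x.
middle = (x > 10.0 / 3.0) & (x < 20.0 / 3.0)
mean_mid = res_single[middle].mean()
mean_outer = res_single[~middle].mean()

# Two separate lines, one for smaller x and one for larger x.
low = x < 5.0
res_two = np.concatenate([
    line_residuals(x[low], y[low]),
    line_residuals(x[~low], y[~low]),
])

mse_single = np.mean(res_single ** 2)
mse_two = np.mean(res_two ** 2)
print("mean residual, middle vs outer:", mean_mid, mean_outer)
print("MSE, one line vs two lines:", mse_single, mse_two)
```

    Here the single line's residuals are systematically negative in the middle of the range and positive at the ends, which is the kind of pattern that suggests a different functional form.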

    If you are asking whether the typical regression software package has some way to take the mean square error of a linear regression and produce a better linear regression, the answer is no. I don't know whether you are approaching this problem by thinking about it in detail or thinking about it only in terms of what you can do with a software package.
     