Need linear regression help

  1. Dec 13, 2014 #1
    [Attached image 16legaq.jpg: data table of building heights and stories]

    I also made a graph which is not pictured.

    1.) Calculate the least squares line. Put the equation in the form of: y-hat = a + bx.
    I got: y hat = 11.304 + 106.218x

    a.) Find correlation coefficient. Is it significant? (use the p-value to decide)
    I got: r = 0.913. No, it is not significant.

    b.) Are there any outliers in the data? If so which point(s)? Why is it an outlier? If there are any, recalculate the least squares line after removing the outlier(s).
    --Got kinda lost here. Any help appreciated!
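
    For anyone following along, here is a minimal Python sketch of parts 1 and a. The data table in this thread is only available as an image, so the (stories, height) pairs below are made up, and the function names are just illustrative. The p-value uses the usual t-test for a correlation, t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom, integrated numerically so that no stats library is needed. It assumes |r| < 1 and a small sample.

```python
import math

def least_squares(xs, ys):
    """Return (a, b, r): intercept and slope of y-hat = a + b*x, and Pearson's r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def correlation_p_value(r, n, steps=20000):
    """Two-sided p-value for H0: no linear correlation (requires |r| < 1)."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the area under the t density between 0 and t
    # (steps must be even).
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3
    # Both tails together are whatever is left outside [-t, t].
    return max(0.0, 1 - 2 * central)

# Hypothetical (stories, height in feet) pairs, NOT the data from the image.
stories = [10, 20, 30, 40, 50, 60]
heights = [150, 210, 400, 430, 640, 690]
a, b, r = least_squares(stories, heights)
p = correlation_p_value(r, len(stories))
print(f"y-hat = {a:.3f} + {b:.3f}x, r = {r:.3f}, p = {p:.4g}")
```

    The correlation is usually called significant when p falls below the chosen level (commonly 0.05), so whether a given r is significant depends on n as well as on r.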
     
  3. Dec 13, 2014 #2

    FactChecker

    Science Advisor
    Gold Member

    Plot your data and the regression line together. Is there a point that is much farther (vertically) from the line than the others? That would be an outlier. Remove that point from the data and redo the linear regression to see whether the correlation is significant without it.
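
    The same check can be sketched numerically. This is only a rough stand-in for looking at the plot: it fits the line, flags the point with the largest vertical residual, and refits without it. The data are made up (the real table is in the image above), and "largest residual" is an informal criterion assumed here, not a formal outlier test.

```python
def fit_line(points):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b = sxy / sxx
    return my - b * mx, b

def largest_residual_index(points):
    """Index of the point farthest (vertically) from the fitted line."""
    a, b = fit_line(points)
    residuals = [y - (a + b * x) for x, y in points]
    return max(range(len(points)), key=lambda i: abs(residuals[i]))

# Hypothetical (stories, height in feet) pairs with one suspicious point.
data = [(10, 120), (20, 240), (30, 365), (40, 470), (50, 1050), (60, 720)]

worst = largest_residual_index(data)
trimmed = data[:worst] + data[worst + 1:]

print("all points:     a = %.2f, b = %.2f" % fit_line(data))
print("suspect point: ", data[worst])
print("without it:     a = %.2f, b = %.2f" % fit_line(trimmed))
```

    A single point far from the line can pull both the slope and the intercept, so comparing the two fits shows how much that one point matters.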
     
  4. Dec 13, 2014 #3

    SteamKing

    Staff Emeritus
    Science Advisor
    Homework Helper

    It's not obvious what x is supposed to represent in the regression equation. Is x supposed to be the number of stories in a building and y-hat the height of the building?
     
  5. Dec 14, 2014 #4
    Yes, sorry-- "stories" is the independent variable (x) and "height" is the dependent variable (y).
     
  6. Dec 14, 2014 #5

    SteamKing

    Staff Emeritus
    Science Advisor
    Homework Helper

    Then you've got a problem with your regression equation. According to it, a 10-story building would be over 1000 feet high.
     
  7. Dec 14, 2014 #6
    Ohhh, I think I just need to switch them around, making the regression equation:

    y hat = 106.218 + 11.304x

    Correct?
     
  8. Dec 14, 2014 #7

    SteamKing

    Staff Emeritus
    Science Advisor
    Homework Helper

    So now, a one-story building is over 100 feet high.

    Nope, that's not going to do it. I think you did something fundamentally wrong in calculating a and b.

    You did use x as the number of stories and y as the heights from your data table? Look carefully, because these data are listed in reverse order in the table.
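
    The plausibility checks used in this exchange amount to plugging a story count into each candidate equation. A small sketch, using the two sets of coefficients posted above (the helper name predict is just for illustration):

```python
def predict(a, b, x):
    """Evaluate y-hat = a + b*x."""
    return a + b * x

# First equation posted: y-hat = 11.304 + 106.218x, checked at 10 stories.
ten_story = predict(11.304, 106.218, 10)

# Swapped equation: y-hat = 106.218 + 11.304x, checked at 1 story.
one_story = predict(106.218, 11.304, 1)

print(f"10 stories, first equation:  {ten_story:.1f} ft")
print(f"1 story, swapped equation:   {one_story:.1f} ft")
```

    Keep in mind that a prediction far below the smallest x in the data is an extrapolation, so an odd-looking value there does not by itself prove the fit is wrong.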
     
  9. Dec 14, 2014 #8
    I'm confused... I also entered it into a linear regression calculator and it gave me the same equation that I got.

    [Attached image 11hf9sw.jpg: output of the linear regression calculator]
     
  10. Dec 14, 2014 #9

    SteamKing

    Staff Emeritus
    Science Advisor
    Homework Helper

    This calculator is giving you the wrong results. It can calculate the mean x and mean y and add up the number of data points, but the rest is incorrect.

    You can make your own calculation using a spreadsheet, and some calculators have linear regression fits built in.
     
  11. Dec 14, 2014 #10

    SteamKing

    Staff Emeritus
    Science Advisor
    Homework Helper

    I had some errors in my check calculations for a and b. This equation is correct (as is the calculator).

    Sorry for the confusion.
     
  12. Dec 15, 2014 #11
    Ok, great! It was driving me crazy-- what a relief. :biggrin:

    Moving on... I had another question for a different problem. By hand, calculate the standard deviation of the residuals.
    [Attached image 21l1krm.jpg: graph for the residuals problem]
    For the least square line I got:
    y = (284.5/114)x - 1.1
    I know the residuals equation is e = y - y hat.... but I'm not exactly sure where to start?
     
  13. Dec 15, 2014 #12
    Actually, I think I'm supposed to use the formula: SEE = √(s/(n-p))
     
  14. Dec 15, 2014 #13

    SteamKing

    Staff Emeritus
    Science Advisor
    Homework Helper

    For the x value of each data point, calculate y-hat according to the regression formula. The actual y value is taken from the graph. The residual for each point is e = y - y-hat. Then calculate the sum of the squares of the residuals and divide this sum by (n-2) to get the variance of the residuals. The standard deviation is the square root of that variance.
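
    As a rough illustration of these steps, here is a short Python sketch. The points and the line coefficients are invented for the example (the real values come from the graph in the problem), and residual_std is just an illustrative name.

```python
import math

def residual_std(xs, ys, a, b):
    """Standard deviation of residuals about y-hat = a + b*x, using n - 2."""
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]  # e = y - y-hat
    sse = sum(e ** 2 for e in residuals)                   # sum of squared residuals
    variance = sse / (len(xs) - 2)                         # divide by n - 2
    return math.sqrt(variance)

# Hypothetical points and a hypothetical fitted line y-hat = 1.0 + 2.0x.
xs = [1, 2, 3, 4, 5]
ys = [3.5, 4.5, 7.5, 8.5, 11.5]
s = residual_std(xs, ys, a=1.0, b=2.0)
print(f"s = {s:.3f}")
```

    Note that the residuals are squared, not the y-hat values, and that the divisor is n - 2 because two parameters (a and b) were estimated from the data.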
     
  15. Dec 15, 2014 #14
    Ok, so after calculating all of that I got: s = 15.7

    Correct?
     
  16. Dec 15, 2014 #15

    SteamKing

    Staff Emeritus
    Science Advisor
    Homework Helper

    Nope. This is way too big for s.
     
  17. Dec 15, 2014 #16
    I accidentally squared all the y-hat values instead. Oops. Now I have s = 3.1
     
  18. Dec 15, 2014 #17

    SteamKing

    Staff Emeritus
    Science Advisor
    Homework Helper

    You're supposed to calculate ∑ e², divide that by (n-2), and take the square root. Your s is still too big. Show your calculations, please.
     
  19. Dec 15, 2014 #18

    FactChecker

    Science Advisor
    Gold Member

    You can see immediately from your plot that one point at about stories = 50, height = 1050 is an outlier. That is the first data point in your table. Did you try removing that and doing the regression again? I think it will help the significance of your regression a lot.
     
  20. Dec 15, 2014 #19
    Yes, I took out that outlier and got a new regression line of y hat = 43.19 + 11.59x
     
  21. Dec 15, 2014 #20
    [Attached image: 6gwkdk.jpg]
     