Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Excel trend line vs regression analysis

  1. Mar 9, 2015 #1
    I did some data analysis with excel fitting some linear, zero intercept data with trend line and the regression analysis tool. The slopes generated by the two methods were different by about 10%. The regression line seemed to be weighted differently, are these two methods different for some data sets?
     
  2. jcsd
  3. Mar 9, 2015 #2

    statdad

    User Avatar
    Homework Helper

    Did you use the regression tool to also have zero slope? If not, that is your problem. If you did - I'm not sure: I don't know what procedure Excel uses to do what it calls a trendline.
    This does point out an important point: Excel is one of the worst tools to use for statistical analysis.
     
  4. Mar 9, 2015 #3
    No they both have zero intercept. I have five groups of data points, with two to three points in each group (concentration its a calibration curve). In some of the groups there is 1 point that is a bit higher than the other. The slope from the regression seems to go through the majority of data points, and seems less influenced by the high values in comparison to the trend line.
     
  5. Mar 9, 2015 #4

    statdad

    User Avatar
    Homework Helper

    Did you plot the points with the scatterplot option, or just make a plot with the y-values? If the latter the trendline will not take account of the x-values (at least, I don't think it does: I haven't used Excel for stat for some time). As I recall it somehow assumes the y-values are equally spaced.

    Edit: I just played with some fake data: when I made a scatterplot and added a trendline I got the least squares regression line. If I simply plotted the y-values in a line plot and added a trendline I got some other equation, not the least squares line. So - I don't know whether that is your issue, but my memory was (at least) mostly on target.
     
  6. Mar 9, 2015 #5
    Lol no I definitely used my own x values. Although I am familiar with that mistake
     
  7. Mar 9, 2015 #6

    statdad

    User Avatar
    Homework Helper

    Messages passing in the darkness of the internet.
    Humor me: try both the trendline and the regression line, but include the intercept and see what happens.
     
  8. Mar 9, 2015 #7

    statdad

    User Avatar
    Homework Helper

    I just played with some more fake data. I started with y = 10x and simply added some personal (non-random) error to the points: three each of x = 4, 6, and 9. Data was graphed in a scatterplot.
    When I fitted the trendline with intercept set to zero the fitted equation was y = 10.248x
    I then used the slope function to get the slope of the slope of the regression line the result was 10.605. The intercept was -2.5. Getting the trendline without forcing the intercept to zero gave the equation y = 10.605x - 2.5 - the same as the regression line.
    If you get your least squares slope from the slope function it assumes the intercept is fitted in the equation .
    But, when I used the regression tool (Analysis toolpak) and told Excel to force the intercept to be zero, the result was a slope of 10.248

    So:
    * fitting the trendline with intercept set to zero seems to be (I will not generalize based on one example) the same as using the regression tool and forcing the intercept to be zero there
    * the slope function ALWAYS gives the slope under the assumption that the intercept has been fit with the equation - there is no way to change it

    A) This is but one reason Excel blows for any statistical work
    B) Is there any good reason you are fitting with the intercept zero? Remember that means the R-sq and correlation values are meaningless.
     
  9. Mar 10, 2015 #8
    Ha. I figured it out all the points in my graph were misregistered to each other by one.
     
  10. Mar 12, 2015 #9

    FactChecker

    User Avatar
    Science Advisor
    Gold Member

    I have a guess at the difference between a "trend line" and regression. In regression, the x value is given ant the error of the regression line is the difference in the y-value from the data point to the line. The regression line minimizes the total sum-squared Y-error. A "trend line" may be the line that gives the smallest errors when the "error" is defined as the 2-dimensional distance from the data points to the line, NOT the y-distance from the data points to the line. Those two different definitions of error will give different slopes.

    CORRECTION: As far as I can tell, the "trend line" in Excel is a regression line. The statistical term for the other type of line I described above is "principle component". It is the single line that best fits the scatter plot. Although I could not find documentation of the Excel "trend line" formula, I think it is a regression line.
     
    Last edited: Mar 12, 2015
  11. Mar 13, 2015 #10

    statdad

    User Avatar
    Homework Helper

    "As far as I can tell, the "trend line" in Excel is a regression line. The statistical term for the other type of line I described above is "principle component". It is the single line that best fits the scatter plot. Although I could not find documentation of the Excel "trend line" formula, I think it is a regression line."

    The trend line does seem to be the regression line, since whether the intercept is or is not included the results are the same as obtained with the regression tool.
     
Know someone interested in this topic? Share this thread via Reddit, Google+, Twitter, or Facebook




Similar Discussions: Excel trend line vs regression analysis
  1. Regression analysis? (Replies: 11)

Loading...