
Linear Regression with Many y for each x

  1. Aug 9, 2015 #1



    Say we collect data points to do a linear regression, but for each ##x_i## we collect several
    values ##y_{i1}, y_{i2},...,y_{ij}##. Is there a standard way of doing linear regression with this type of dataset?
    Would we, e.g., average the ##y_{ij}## and define the average to be ##y_i##, so that we have a single data set ##(x_i, y_i)## to do linear regression on?
    Last edited: Aug 9, 2015
  2. jcsd
  3. Aug 9, 2015 #2



    You can average the repeated readings at each ##x_i## if you wish. That is commonly done when the regression coefficients are found from the normal equations.

    If you are using QR factorization to solve a rectangular system, you can write separate equations for each observation. The resulting regression coefficients should come out the same as with using the normal equations, unless there is something horribly wrong, numerically.
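As a rough check of the claim above, here is a minimal sketch that fits a line both ways: QR on the raw data with one equation per observation, and the normal equations on the averaged data. The data are synthetic and every name (`x_levels`, `n_rep`, `fit_qr`, `fit_normal_eq`) is made up for illustration. It assumes the same number of repeated readings at every ##x_i##; with unequal counts the averaged fit would have to weight each mean by its count to reproduce the raw-data coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 5 distinct x values, 4 repeated y readings at each one.
x_levels = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n_rep = 4
x_raw = np.repeat(x_levels, n_rep)  # 1,1,1,1,2,2,2,2,...
y_raw = 2.0 + 0.5 * x_raw + rng.normal(scale=0.3, size=x_raw.size)

def design(x):
    """Design matrix for the model y = a + b*x."""
    return np.column_stack([np.ones_like(x), x])

def fit_qr(x, y):
    """Solve the rectangular system A c = y in the least-squares sense via QR."""
    A = design(x)
    Q, R = np.linalg.qr(A)          # reduced QR: R is 2 x 2
    return np.linalg.solve(R, Q.T @ y)

def fit_normal_eq(x, y):
    """Solve the normal equations (A^T A) c = A^T y."""
    A = design(x)
    return np.linalg.solve(A.T @ A, A.T @ y)

# One equation per observation, solved by QR.
coef_raw_qr = fit_qr(x_raw, y_raw)

# Average the readings at each x, then use the normal equations.
y_mean = y_raw.reshape(-1, n_rep).mean(axis=1)
coef_avg_ne = fit_normal_eq(x_levels, y_mean)

print("raw data, QR:          ", coef_raw_qr)
print("averaged data, normal: ", coef_avg_ne)
print("coefficients agree:    ", np.allclose(coef_raw_qr, coef_avg_ne))
```

The agreement covers the intercept and slope only; it says nothing about the scatter of the individual readings.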
  4. Aug 10, 2015 #3



    Are you saying that the y values represent different quantities, or are they repeated measurements of the same variable at the same input x? If the former, just analyse each y variable separately. If the latter, do not combine the data. Averaging the y values throws away all the information about the variation of y at the same x. Run a regression on the raw data, with the same x value repeated for each y value obtained.
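To illustrate what the raw data keep and the averages lose, here is a small sketch on synthetic data (all names are hypothetical, and equal replication at each x is assumed only to keep the reshaping simple). It computes the pooled within-group variance of y, often called the pure-error variance, along with the residual variance and slope standard error from the raw-data fit. The within-group variance cannot be computed at all once each group has been collapsed to its mean.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 5 distinct x values, 4 repeated y readings at each one.
x_levels = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n_rep = 4
x_raw = np.repeat(x_levels, n_rep)
y_raw = 2.0 + 0.5 * x_raw + rng.normal(scale=0.3, size=x_raw.size)

# Group the readings by x (rows = x levels, columns = repeats).
y_groups = y_raw.reshape(-1, n_rep)
group_means = y_groups.mean(axis=1)

# Pooled within-group ("pure error") sum of squares and variance.
ss_pure = ((y_groups - group_means[:, None]) ** 2).sum()
df_pure = x_raw.size - x_levels.size       # N - (number of distinct x)
pure_error_var = ss_pure / df_pure

# Ordinary least-squares line through the raw data.
A = np.column_stack([np.ones_like(x_raw), x_raw])
coef, *_ = np.linalg.lstsq(A, y_raw, rcond=None)
resid = y_raw - A @ coef
ss_resid = (resid ** 2).sum()
df_resid = x_raw.size - 2                  # N - (number of coefficients)
resid_var = ss_resid / df_resid

# Standard error of the slope from the raw-data fit.
sxx = ((x_raw - x_raw.mean()) ** 2).sum()
slope_se = np.sqrt(resid_var / sxx)

print("slope and its standard error:", coef[1], slope_se)
print("residual variance (df = %d):" % df_resid, resid_var)
print("pure-error variance (df = %d):" % df_pure, pure_error_var)
```

The residual sum of squares splits into the pure-error part plus a lack-of-fit part, so comparing the two variances gives a check on whether a straight line is adequate; that comparison is only possible with the repeated readings kept.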