
Linear Regression Models (3)

  1. May 14, 2009 #1
    1) "Simple linear regression model:
    Y = β₀ + β₁X + ε
    E(Y) = β₀ + β₁X
    A linear model means that it is linear in the β's, and not necessarily a linear function of X.
    The independent variable X could be W², or ln(W), and so on, for some other independent variable W."


    I have some trouble understanding the last line. I was told that a SIMPLE linear regression model is always a straight-line model: it is a least-squares LINE of best fit. But if X = W², then we have E(Y) = β₀ + β₁W², which is not a straight line in W... how come? Is this allowed?


    2) "A SIMPLE linear regression is a linear regression in which there is only ONE independent variable."

    Now is the following a simple linear regression or a multiple linear regression?
    Y = β₀ + β₁X + β₂X² + ε
    It has only one independent variable X, so is it a simple linear regression? But it just looks a bit funny to me...


    3) "A linear regression model is of the form:
    Y = β₀ + β₁X₁ + β₂X₂ + ... + βₖXₖ + ε
    If there is more than one independent variable, then the model is called a MULTIPLE linear regression model."


    This idea doesn't seem too clear to me. What can the Xᵢ's be? What are some actual examples of a multiple linear model? Does a linear model always have to be a straight line or a plane?

    Thanks for explaining!
     
  3. May 14, 2009 #2

    D H

    Staff Emeritus
    Science Advisor

    Think of it this way. You have a bunch of (x,y) pairs and are trying to find the coefficients a and b for y = ax² + b. Introduce a new variable u = x². Now the equation you are trying to fit is y = au + b: a straight-line fit. Now imagine you have a different set of (x,y) pairs and this time you are trying to find the coefficients a and b for y = b·xᵃ. Introduce two new variables, u = ln(x) and v = ln(y). Taking the log of both sides of y = b·xᵃ and substituting yields v = au + ln(b): again a straight-line fit, with slope a and intercept ln(b).

    A bit of caution with regard to the latter. The linear regression yields the best fit (in the least-squares sense) to v = au + ln(b). This is not necessarily the best fit (in the least-squares sense) to y = b·xᵃ.
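The substitution trick above can be sketched in a few lines of Python. This is a minimal illustration, not from the thread: the data, the true coefficient values, and the use of NumPy's `lstsq` are all my own assumptions.

```python
import numpy as np

# Fit y = a*x^2 + b by substituting u = x^2, so the model becomes
# the straight line y = a*u + b.  Made-up, noise-free data.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x**2 + 1.0                        # true a = 2, b = 1

u = x**2
A = np.column_stack([u, np.ones_like(u)])   # design matrix [u, 1]
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(a, b)                                 # recovers a ≈ 2, b ≈ 1
```

The log-log trick works the same way: fit v = a·u + ln(b) with u = ln(x) and v = ln(y), then recover b as exp(intercept).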


    No. The X and X² are different independent variables as far as the regression goes.


    Fitting salary y to years of schooling s and years of experience e via y = as + be + c is a multiple linear regression. Here, years of schooling and years of experience are independent variables for the regression. Fitting a parabola, y = ax² + bx + c, can also be done as a multiple linear regression. Think of x² and x as being independent variables as far as the regression is concerned.
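The parabola fit can be written out explicitly as a multiple linear regression. A minimal sketch with made-up data (the values and the NumPy approach are my own, not from the thread):

```python
import numpy as np

# Fit the parabola y = a*x^2 + b*x + c by treating x^2 and x as
# two separate regressors in a multiple linear regression.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1.5 * x**2 - 2.0 * x + 0.5              # true (a, b, c) = (1.5, -2, 0.5)

A = np.column_stack([x**2, x, np.ones_like(x)])   # regressors: x^2, x, 1
(a, b, c), *_ = np.linalg.lstsq(A, y, rcond=None)
print(a, b, c)                                    # ≈ 1.5, -2.0, 0.5
```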
     
  4. May 14, 2009 #3
    2) What I think is that the definition of a "simple linear model" is not very well defined; it's ambiguous. I looked at the definitions in 3 different textbooks, but still can't really figure out whether e.g. Y = β₀ + β₁X + β₂X² + ε is a simple linear model or a multiple linear model. There seems to be only ONE independent variable X (X² is also determined by X; it's not a DIFFERENT variable, since once we've measured X, we can determine the values of both X and X²), but it has a β₂ in there. X and X² are related, so I don't see how they can be two separate independent variables...
    Is there a nicer definition of a "simple linear model"?


    3) "A linear regression model is of the form:
    Y = β₀ + β₁X₁ + β₂X₂ + ... + βₖXₖ + ε
    If there is more than one independent variable, then the model is called a MULTIPLE linear regression model."


    Now I have some confusion relating to the above paragraph.
    e.g. Are the following also considered MULTIPLE linear regression models? They are not quite in the exact same form as Y = β₀ + β₁X₁ + β₂X₂ + ... + βₖXₖ + ε, which has k DIFFERENT independent variables X₁, X₂, ..., Xₖ.
    (i) Y = β₀ + β₁X + β₂exp(X) + ε
    (ii) Y = β₀ + β₁X₁ + β₂X₂ + β₃X₁X₂ + β₄X₁² + β₅X₂² + ε

    Are those allowed? Why or why not?


    Thanks a lot!
     
    Last edited: May 15, 2009
  5. May 15, 2009 #4

    D H

    Staff Emeritus
    Science Advisor

    A simple linear regression model has two coefficients. Period.

    Your problem is that you are looking at this the wrong way. Y = β₀ + β₁X + β₂X² + ε is not a simple model because you have three coefficients: β₀, β₁, and β₂. In a sense, the independent variables for the regression are the βᵢ's. As far as the regression equations are concerned, those Xs and Ys are just a bunch of constant N-vectors. The best fit is found by taking the partial derivatives of the sum of the squared errors with respect to each βᵢ: the βᵢ's are variables. The Xs and Ys are not variables as far as the regression equations are concerned. Stop thinking of them as variables and you will have fewer problems.
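The "the βᵢ's are the variables" point can be made concrete: setting the partial derivatives of the squared error to zero leads to the normal equations (AᵀA)β = AᵀY, where the columns of A hold the fixed X data. A minimal sketch, with data and true coefficients made up for illustration:

```python
import numpy as np

# Minimize sum((Y - A @ beta)**2) over beta.  Setting the partial
# derivative with respect to each beta_i to zero gives the normal
# equations (A^T A) beta = A^T Y.  The X and Y data stay fixed.
X = np.linspace(-1.0, 1.0, 9)
Y = 0.7 + 1.3 * X + 0.4 * X**2              # true beta = (0.7, 1.3, 0.4)

A = np.column_stack([np.ones_like(X), X, X**2])   # constant "N-vectors"
beta = np.linalg.solve(A.T @ A, A.T @ Y)
print(beta)                                       # ≈ [0.7, 1.3, 0.4]
```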
     
  6. May 15, 2009 #5
    Thanks for the helpful comments! So I think 2) is solved.

    But I am still puzzled by 3) and I would really appreciate it if anyone can explain that.

     
  7. May 15, 2009 #6

    HallsofIvy

    Staff Emeritus
    Science Advisor

    They aren't linear! That's not to say that those might not be better models for the particular situation (not everything is linear), but anything can be approximated by a linear model, and linear models are much, much easier to work with!
     
  8. May 15, 2009 #7

    D H

    Staff Emeritus
    Science Advisor

    They are linear in the βᵢ's, and as far as linear regression is concerned, that is all that matters. These are linear regression models. Here are a couple that are not linear regressions:

    [tex]Y = \beta_0(1 + \beta_1 X_1)(1 + \beta_2 X_2) + \varepsilon[/tex]
    [tex]Y=\beta_0 + \beta_1X^{\beta_2} + \varepsilon[/tex]
     
    Last edited: May 15, 2009
  9. May 15, 2009 #8
    Yes, I think the trickiest point to notice when first reading through the definition of a linear regression model is that it is linear in the β's, whereas in calculus, when we talk about "linear", we usually mean that the function itself is linear, i.e. a straight line or a plane.

    "A linear regression model is of the form:
    Y = β₀ + β₁X₁ + β₂X₂ + ... + βₖXₖ + ε"


    (i) Y = β₀ + β₁X + β₂exp(X) + ε
    (ii) Y = β₀ + β₁X₁ + β₂X₂ + β₃X₁X₂ + β₄X₁² + β₅X₂² + ε

    For (i), X₁ = X, X₂ = exp(X).
    For (ii), X₃ = X₁X₂, X₄ = X₁², X₅ = X₂².
    The latter X's depend on the previous X's. In particular, X₃ depends on TWO of the previous X's, X₁ AND X₂, which looks a bit funny to me. Are those allowed? Somehow I am having a lot of trouble understanding this... I understand the general form of a multiple linear regression model, but I don't seem to understand specific examples of it like (i) and (ii).
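Model (i) can be sketched the same way: once exp(X) has been computed from the data, it is just another fixed column of the design matrix. A minimal sketch; the data and true coefficients below are made up for illustration.

```python
import numpy as np

# Model (i): Y = b0 + b1*X + b2*exp(X) + eps.  The column exp(X) is
# derived from X, but the regression just sees three fixed columns.
X = np.linspace(0.0, 2.0, 8)
Y = 0.5 + 1.0 * X + 0.25 * np.exp(X)        # true betas: 0.5, 1.0, 0.25

A = np.column_stack([np.ones_like(X), X, np.exp(X)])
beta, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(beta)                                 # ≈ [0.5, 1.0, 0.25]
```

Derived columns like X₁X₂ or X₁² are handled identically: compute them once, stack them as columns, and solve for the β's.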

    Once again, your help is greatly appreciated!
     
    Last edited: May 15, 2009