Linear Regression, Linear Least Squares, Least Squares, Non-linear Least Squares

  • #1
hotvette
It seems to me that Linear Regression and Linear Least Squares are often used interchangeably, but I believe there are subtle differences between the two. From what I can tell (for simplicity, let's assume the uncertainty is in y only), Linear Regression refers to the general case of fitting a straight line to a set of data, where the measure of optimal fit can be almost anything (e.g. sum of vertical differences, sum of absolute values of vertical differences, maximum vertical difference, sum of squares of vertical differences, etc.), whereas Linear Least Squares refers to a specific measure of optimal fit, namely the sum of the squares of the vertical differences.

Actually, it seems to me that Linear Least Squares doesn't necessarily mean that you are fitting a straight line to the data, it just means that the modelling function is linear in the unknowns (e.g. [itex]y = ax^2 + bx + c[/itex] is linear in a, b, and c). Perhaps it is established convention that Linear Least Squares does, in fact, refer to fitting a straight line, whereas Least Squares is the more general case?
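To make that concrete, here's a minimal numpy sketch (my own illustration, with made-up data) of fitting [itex]y = ax^2 + bx + c[/itex] by linear least squares: the model is linear in a, b, c, so the fit reduces to solving a linear system even though the fitted curve is a parabola.

[code]
import numpy as np

# Fake data roughly following y = 2x^2 - 3x + 1 plus noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 30)
y = 2 * x**2 - 3 * x + 1 + rng.normal(scale=0.5, size=x.size)

# Design matrix: one column per unknown (a, b, c)
A = np.column_stack([x**2, x, np.ones_like(x)])

# Linear least squares: minimize ||A @ params - y||^2
params, *_ = np.linalg.lstsq(A, y, rcond=None)
a, b, c = params
print(a, b, c)   # should come out near 2, -3, 1
[/code]

The same machinery works for any model that is linear in the unknowns; only the columns of the design matrix change.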

Lastly, Non-linear Least Squares refers to cases where the modelling function is not linear in the unknowns (e.g. [itex]y = e^{-ax^b}[/itex], where a,b are sought).

Is my understanding correct on this?
 
  • #2
You can "linearize" [tex] y = e^{-ax^b} [/tex] so that you are fitting an [tex] -x^b [/tex] as opposed to an exponential by fitting [tex] \ln(y) = -ax^b [/tex]. This is done all the time, maybe not linear, but a much easier function to fit in the long run. It is still non-linear in the strictest sense.

You are correct in saying that least squares is a more general case. In theory you can fit any polynomial, exponential, logarithmic, etc. model by using least squares.
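For that particular example you can even take the log twice: [itex]\ln(-\ln y) = \ln a + b\ln x[/itex] (this assumes [itex]a > 0[/itex], [itex]x > 0[/itex], and [itex]0 < y < 1[/itex] so the logs are defined), which is genuinely linear in the unknowns [itex]\ln a[/itex] and [itex]b[/itex]. A rough numpy sketch of that idea (my own illustration, with made-up data):

[code]
import numpy as np

# Fake data from y = exp(-a x^b) with a = 0.5, b = 1.5, plus small multiplicative noise
rng = np.random.default_rng(1)
a_true, b_true = 0.5, 1.5
x = np.linspace(0.5, 3.0, 40)
y = np.exp(-a_true * x**b_true) * np.exp(rng.normal(scale=0.02, size=x.size))

# ln(-ln y) = ln a + b ln x  ->  linear in [ln a, b]
u = np.log(-np.log(y))
A = np.column_stack([np.ones_like(x), np.log(x)])
(ln_a, b), *_ = np.linalg.lstsq(A, u, rcond=None)
print(np.exp(ln_a), b)   # should come out near 0.5 and 1.5
[/code]

Keep in mind that fitting in log space weights the errors differently, so the answer generally won't match a least squares fit done on the original y values.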
 
  • #3
In poking around some of my numerical analysis texts I was reminded of the Levenberg-Marquardt method of fitting curves to data. It seems to be one of the more robust nonlinear least squares methods.

http://www.library.cornell.edu/nr/bookcpdf/c15-5.pdf

Take a look...
 
  • #4
Thanks. I agree re [itex]y = e^{-ax^b}[/itex]; bad choice on my part. I've read several web articles on the Levenberg-Marquardt method but don't quite follow them. I've seen what appear to be two versions. The first is for general unconstrained optimization, where you are minimizing an objective function, which for least squares would be [itex]\epsilon^2 = \Sigma(f(x_i)-y_i)^2[/itex]. The update in that version is:

[tex]x^{k+1} = x^k - \alpha \left[H(x^k) + \beta I\right]^{-1}J(x^k)[/tex]

Where x is the list of unknown parameters, [itex]H(x^k)[/itex] is the Hessian matrix of second derivatives of the objective function [itex]\epsilon^2[/itex] with respect to the unknown parameters, [itex]\alpha, \beta[/itex] are parameters that control the stability of the iterative solution, and [itex]J(x^k)[/itex] is the gradient (first derivatives) of the objective function with respect to the unknown parameters. I've actually been successful in using this for non-linear least squares problems, but convergence is extremely sensitive to [itex]\alpha, \beta[/itex]. In this version the y values are used inside the objective function, so we have a single objective function of n unknowns (depending on the complexity of the fitting function) that we are trying to minimize.
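For what it's worth, here is a rough numpy sketch of this first form (my own illustration, not from any of the articles). The gradient and Hessian of [itex]\epsilon^2[/itex] are estimated by finite differences to keep it short, and I've added the usual trick of raising [itex]\beta[/itex] when a step makes things worse and lowering it when a step succeeds, since a fixed [itex]\beta[/itex] is exactly where the sensitivity shows up.

[code]
import numpy as np

def model(p, x):                      # example fitting function y = exp(-a x^b), p = [a, b]
    a, b = p
    return np.exp(-a * x**b)

def sse(p, x, y):                     # objective eps^2: sum of squared residuals
    return np.sum((model(p, x) - y)**2)

def grad_hess(f, p, h=1e-5):          # finite-difference gradient and Hessian of a scalar f
    n = p.size
    g = np.zeros(n)
    H = np.zeros((n, n))
    for i in range(n):
        ei = np.zeros(n); ei[i] = h
        g[i] = (f(p + ei) - f(p - ei)) / (2 * h)
        for j in range(n):
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(p + ei + ej) - f(p + ei - ej)
                       - f(p - ei + ej) + f(p - ei - ej)) / (4 * h**2)
    return g, H

# Fake data from a = 0.5, b = 1.5
rng = np.random.default_rng(2)
x = np.linspace(0.5, 3.0, 40)
y = model(np.array([0.5, 1.5]), x) + rng.normal(scale=0.01, size=x.size)

obj = lambda p: sse(p, x, y)
p = np.array([1.0, 1.0])              # starting guess
alpha, beta = 1.0, 1e-3               # step size and damping

for k in range(200):
    g, H = grad_hess(obj, p)
    step = np.linalg.solve(H + beta * np.eye(p.size), g)
    if obj(p - alpha * step) < obj(p):
        p = p - alpha * step          # accept the step, relax the damping
        beta = max(beta * 0.5, 1e-12)
    else:
        beta *= 10.0                  # reject the step, damp harder
print(p, obj(p))                      # should end up near [0.5, 1.5]
[/code]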

I've also seen articles specifically about using Levenberg-Marquardt for non-linear least squares, using a solution technique that does NOT require 2nd derivatives and is completely analogous to the linear least squares solution:

[tex]a^{k+1} = a^k - \alpha \left[J^TJ + \beta I\right]^{-1}J^T f(x)[/tex]

Where a is the vector of unknown parameters and J is the Jacobian of the fitted values [itex]f(x_i)[/itex] with respect to the unknown parameters. In this form the 2nd derivatives aren't used, and I've seen comments to the effect that the 2nd derivatives can lead to unstable situations. This version I can't get to work at all, and I suspect there is something wrong with my interpretation.

I'd appreciate some help with where I'm going astray with the 2nd version. It seems strange to me that one implementation uses 2nd derivatives and the other doesn't. Many thanks!
 
  • #5
I found the discrepancy. The [itex]f(x)[/itex] in:

[tex]a^{k+1} = a^k - \alpha \left[J^TJ + \beta I\right]^{-1}J^T f(x)[/tex]

is really [itex]f(x)-y[/itex]. I was just using [itex]y[/itex]. I used it successfully and the convergence wasn't nearly as dependent on [itex]\alpha[/itex] and [itex]\beta[/itex] as it was using the first method. I'm amazed that 2 different ways of supposedly using the same method can produce such different results, not in the final answer, but in the complexity of setup (i.e. computing 2nd derivatives vs not) and stability of the solution. I guess that comment I read about 2nd derivatives contributing to instability was right. The net seems to be that the non-linear least squares version of Levenberg-Marquardt is much more stable than the general version for unconstrained optimization. Amazing.
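In case it helps anyone else, here is a rough numpy sketch of that corrected update for the same [itex]y = e^{-ax^b}[/itex] example (my own illustration; the Jacobian is done by finite differences and the [itex]\alpha, \beta[/itex] values are just ones that happened to work):

[code]
import numpy as np

def model(a_params, x):               # y = exp(-a x^b), a_params = [a, b]
    a, b = a_params
    return np.exp(-a * x**b)

def jacobian(a_params, x, h=1e-6):    # finite-difference Jacobian of the model w.r.t. the parameters
    J = np.zeros((x.size, a_params.size))
    for j in range(a_params.size):
        e = np.zeros(a_params.size); e[j] = h
        J[:, j] = (model(a_params + e, x) - model(a_params - e, x)) / (2 * h)
    return J

# Fake data from a = 0.5, b = 1.5
rng = np.random.default_rng(3)
x = np.linspace(0.5, 3.0, 40)
y = model(np.array([0.5, 1.5]), x) + rng.normal(scale=0.01, size=x.size)

a_params = np.array([1.0, 1.0])       # starting guess
alpha, beta = 1.0, 1e-3               # beta may need to be increased if the iteration misbehaves
for k in range(50):
    r = model(a_params, x) - y        # residual f(x) - y, not just y
    J = jacobian(a_params, x)
    step = np.linalg.solve(J.T @ J + beta * np.eye(a_params.size), J.T @ r)
    a_params = a_params - alpha * step
print(a_params)                       # should end up near [0.5, 1.5]
[/code]

Setting [itex]\beta = 0[/itex] here gives plain Gauss-Newton; the damping term is the Levenberg-Marquardt part (Marquardt's variant uses [itex]\beta\,\mathrm{diag}(J^TJ)[/itex] instead of [itex]\beta I[/itex]).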
 

1. What is linear regression?

Linear regression is a statistical method for modelling the relationship between a continuous response variable and one or more explanatory variables. It assumes the relationship is linear and uses a line of best fit to model the data.

2. How is linear least squares different from least squares?

Linear least squares is the special case of least squares in which the model is linear in its unknown parameters, so minimizing the sum of the squared differences between the observed data and the predicted values reduces to solving a linear system. Least squares, more generally, refers to any fit that minimizes the sum of squared errors, whether or not the model is linear in its parameters.

3. What is the difference between linear least squares and non-linear least squares?

Linear least squares applies when the model is linear in its unknown parameters, while non-linear least squares applies when the model is non-linear in its parameters (for example [itex]y = e^{-ax^b}[/itex] with a and b unknown). Non-linear least squares generally has to be solved iteratively, for instance with the Levenberg-Marquardt method, rather than by solving a single linear system.

4. How is linear regression used in scientific research?

Linear regression is commonly used in scientific research to analyze the relationship between two variables and make predictions based on the data. It is also used to determine the significance of the relationship and to identify any potential outliers or influential data points.

5. Can linear regression be used for categorical data?

Linear regression is designed for a continuous response variable. Categorical predictors can be included through indicator (dummy) variables, but for a categorical outcome other methods such as logistic regression or ANOVA-type approaches are more appropriate.
