Linear Regression, Linear Least Squares, Least Squares, Non-linear Least Squares

Discussion Overview

The discussion revolves around the distinctions and applications of Linear Regression, Linear Least Squares, and Non-linear Least Squares methods in data fitting. Participants explore the definitions, methodologies, and nuances of these statistical techniques, with a focus on their theoretical underpinnings and practical implementations.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants propose that Linear Regression and Linear Least Squares are often used interchangeably but suggest there are subtle differences, particularly in the methods of determining optimal fit.
  • It is suggested that Linear Least Squares specifically refers to minimizing the sum of the square of vertical differences, while Linear Regression can encompass various fitting methods.
  • Some argue that Linear Least Squares does not necessarily imply fitting a straight line, as it can refer to any model that is linear in the unknown parameters.
  • Non-linear Least Squares is discussed as a method for fitting models that are not linear in the unknowns, with examples provided.
  • One participant mentions the Levenberg-Marquardt method as a robust approach for nonlinear least squares fitting, highlighting its application and sensitivity to parameters.
  • Another participant reflects on the differences in implementations of the Levenberg-Marquardt method, noting the use of second derivatives in one version and the absence in another, leading to different stability and convergence behaviors.
  • A later reply clarifies a misunderstanding regarding the function used in the non-linear least squares version of the method, indicating that it should be the difference between the model and the observed data.

Areas of Agreement / Disagreement

Participants express differing views on the definitions and applications of Linear Regression and Linear Least Squares, indicating that multiple competing interpretations exist. The discussion on the Levenberg-Marquardt method also reveals unresolved questions about its implementations and stability.

Contextual Notes

There are limitations in the discussion regarding the assumptions made about the definitions of terms and the mathematical steps involved in the Levenberg-Marquardt method. The sensitivity of convergence to parameter choices is noted but not fully resolved.

hotvette
It seems to me that Linear Regression and Linear Least Squares are often used interchangeably, but I believe there are subtle differences between the two. From what I can tell (for simplicity let's assume the uncertainty is in y only), Linear Regression refers to the general case of fitting a straight line to a set of data, but the measure of optimal fit can be almost anything (e.g. sum of vertical differences, sum of absolute values of vertical differences, max vertical difference, sum of squared vertical differences, etc.), whereas Linear Least Squares refers to one specific measure of optimal fit, namely the sum of the squared vertical differences.
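
To make the different measures of fit concrete, here is a minimal sketch that evaluates each of them for one candidate straight line. The data points and the candidate slope and intercept are made up for illustration and are not from this thread.

```python
import numpy as np

# Made-up data and a candidate line y = m*x + c (illustrative values only)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.5, 2.5, 5.5, 6.0])
m, c = 2.0, 1.0

# Vertical differences between the data and the candidate line
r = y - (m * x + c)

sum_signed = r.sum()          # signed differences can cancel each other
sum_abs = np.abs(r).sum()     # sum of absolute vertical differences
max_abs = np.abs(r).max()     # largest vertical difference
sum_sq = (r ** 2).sum()       # the least squares criterion

print(sum_signed, sum_abs, max_abs, sum_sq)
```

Each criterion generally picks a different "best" line; least squares is the one whose minimizer can be found by solving a linear system.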

Actually, it seems to me that Linear Least Squares doesn't necessarily mean that you are fitting a straight line to the data, it just means that the modelling function is linear in the unknowns (e.g. [itex]y = ax^2 + bx + c[/itex] is linear in a, b, and c). Perhaps it is established convention that Linear Least Squares does, in fact, refer to fitting a straight line, whereas Least Squares is the more general case?
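
As a sketch of what "linear in the unknowns" buys you, the quadratic above can be fitted with a single linear solve, because each data point contributes one equation that is linear in a, b, and c. The data below are synthetic (generated from assumed values a = 2, b = -3, c = 1), and the use of NumPy's `lstsq` is just one convenient choice.

```python
import numpy as np

# Synthetic, noise-free data from y = 2x^2 - 3x + 1 (assumed values)
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 2.0 * x**2 - 3.0 * x + 1.0

# Design matrix: one column per unknown. The model is non-linear in x
# but linear in (a, b, c), so this is still a linear least squares problem.
A = np.column_stack((x**2, x, np.ones_like(x)))

# Solve min ||A @ coeffs - y||^2
coeffs, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
a, b, c = coeffs
print(a, b, c)
```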

Lastly, Non-linear Least Squares refers to cases where the modelling function is not linear in the unknowns (e.g. [itex]y = e^{-ax^b}[/itex], where a,b are sought).

Is my understanding correct on this?
 
You can "linearize" [itex]y = e^{-ax^b}[/itex] by taking logarithms and fitting [itex]\ln(y) = -ax^b[/itex], so that you are fitting a power function rather than an exponential. This is done all the time. The transformed model is still non-linear in the strictest sense, since b appears as an exponent, but it is a much easier function to fit.
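
Going one step further than the post above, and only under the assumptions that 0 < y < 1, x > 0 and a > 0, a second logarithm gives [itex]\ln(-\ln y) = \ln a + b \ln x[/itex], which is a straight line in [itex]\ln x[/itex] with slope b and intercept [itex]\ln a[/itex]. A minimal sketch with synthetic data (the true values a = 0.5, b = 1.5 are assumed for illustration):

```python
import numpy as np

# Synthetic, noise-free data from y = exp(-a * x**b) with assumed a, b
a_true, b_true = 0.5, 1.5
x = np.linspace(0.1, 3.0, 30)
y = np.exp(-a_true * x**b_true)

# Double-log transform: ln(-ln y) = ln(a) + b * ln(x)
u = np.log(x)
v = np.log(-np.log(y))

# Straight-line least squares fit in the transformed variables
slope, intercept = np.polyfit(u, v, 1)
b_fit = slope
a_fit = np.exp(intercept)
print(a_fit, b_fit)
```

Note that this minimizes squared error in the transformed variables, not in y itself, so with noisy data the result will generally differ from a direct non-linear least squares fit.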

You are correct in saying that least squares is the more general case. In theory you can fit any polynomial, exponential, logarithmic, or other function using least squares.
 
In poking around some of my numerical analysis texts I was reminded of the Levenberg-Marquardt method of fitting curves to data. It seems to be one of the more robust nonlinear least squares methods.

http://www.library.cornell.edu/nr/bookcpdf/c15-5.pdf

Take a look...
 
Thanks. I agree re [itex]y = e^{-ax^b}[/itex]. Bad choice on my part. I've read several web articles on the Levenberg-Marquardt method but don't quite follow them. I've seen what appear to be two versions. The first is for general unconstrained optimization, where you minimize an objective function, which for least squares would be [itex]\epsilon^2 = \Sigma(f(x_i)-y_i)^2[/itex]:

[tex][x^{k+1}] = [x^k] - \alpha [H(x^k) + \beta I]^{-1}[J(x^k)][/tex]

Where x is the unknown parameter list, [itex]H(x^k)[/itex] is the Hessian matrix of second derivatives of the objective function [itex]\epsilon^2[/itex] with respect to the unknown parameters, [itex]\alpha, \beta[/itex] are parameters that control the stability of the iterative solution, and [itex]J(x^k)[/itex] is the Jacobian of first derivatives of the objective function with respect to the unknown parameters (for a scalar objective, this is its gradient). I've actually been successful in using this for non-linear least squares problems, but convergence is extremely sensitive to [itex]\alpha, \beta[/itex]. In this version, the values for y are used within the objective function. Thus, we have a single scalar function of n unknowns (n depending on the complexity of the fitting function) that we are trying to minimize.

I've also seen articles that specifically address using Levenberg-Marquardt for non-linear least squares, with a solution technique that does NOT require 2nd derivatives and is completely analogous to the linear least squares solution:

[tex]a^{k+1} = a^k - \alpha {[J^TJ + \beta I]^{-1}J^Tf(x)}[/tex]

Where a is the vector of unknown parameters and J is the Jacobian of [itex]y_i = f_i(x_i)[/itex] with respect to the unknown parameters. In this form the 2nd derivatives aren't used, and I've seen comments to the effect that the 2nd derivatives can lead to unstable situations. I can't get this version to work at all, and I suspect there is something wrong with my interpretation.

I'd appreciate some help with where I'm going astray with the 2nd version. It seems strange to me that one implementation uses 2nd derivatives and the other doesn't. Many thanks!
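
A short sketch of how the two forms are related, using symbols that do not appear in the posts above: write [itex]r_i(a) = f(x_i; a) - y_i[/itex] for the residuals, [itex]g[/itex] for the gradient of [itex]\epsilon^2[/itex] (called J in the first formula), and [itex]J_r[/itex] for the Jacobian of the residuals with respect to the parameters (the J of the second formula, since y does not depend on the parameters). Then

[tex]\epsilon^2 = \sum_i r_i^2, \qquad g = 2 J_r^T r, \qquad H = 2 J_r^T J_r + 2 \sum_i r_i \nabla^2 r_i[/tex]

If the second-derivative term in H is dropped, which is a reasonable approximation when the residuals are small or the model is nearly linear in the parameters, the general update becomes

[tex]a^{k+1} = a^k - \alpha [2 J_r^T J_r + \beta I]^{-1} \, 2 J_r^T r = a^k - \alpha [J_r^T J_r + \tfrac{\beta}{2} I]^{-1} J_r^T r[/tex]

which has the same form as the least squares version, with [itex]\beta[/itex] rescaled by a factor of 2. So the version without 2nd derivatives can be read as the general version with an approximate Hessian built from first derivatives only.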
 
I found the discrepancy. The [itex]f(x)[/itex] in:

[tex]a^{k+1} = a^k - \alpha {[J^TJ + \beta I]^{-1}J^Tf(x)}[/tex]

is really [itex]f(x)-y[/itex]. I was just using [itex]y[/itex]. With that fix I used it successfully, and the convergence wasn't nearly as dependent on [itex]\alpha[/itex] and [itex]\beta[/itex] as it was using the first method. I'm amazed that 2 different ways of supposedly using the same method can produce such different results, not in the final answer, but in the complexity of setup (i.e. computing 2nd derivatives vs not) and stability of the solution. I guess that comment I read about 2nd derivatives contributing to instability was right. The upshot seems to be that the non-linear least squares version of Levenberg-Marquardt is much more stable than the general version for unconstrained optimization. Amazing.
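
For anyone who wants to try it, here is a minimal sketch of the second update formula with [itex]f(x)-y[/itex] as the residual, applied to the [itex]y = e^{-ax^b}[/itex] example from earlier in the thread. Several things here are assumptions and not from the posts above: the synthetic data, the starting guess, the function names, and the rule that adjusts [itex]\beta[/itex] automatically (shrink it after a step that lowers the error, grow it after one that doesn't), which is a common variant of the method.

```python
import numpy as np

def model(x, p):
    """Model y = exp(-a * x**b) with parameters p = (a, b)."""
    a, b = p
    return np.exp(-a * x**b)

def jacobian(x, p):
    """First derivatives of the model with respect to a and b."""
    a, b = p
    f = model(x, p)
    xb = x**b
    return np.column_stack((-xb * f, -a * xb * np.log(x) * f))

def levenberg_marquardt(x, y, p0, alpha=1.0, beta=1e-3, max_iter=200, tol=1e-12):
    """Iterate a <- a - alpha * [J^T J + beta I]^-1 J^T (f(x) - y).

    The adaptive beta rule is an assumption, not part of the thread.
    """
    p = np.asarray(p0, dtype=float)
    r = model(x, p) - y              # residual is f(x) - y, not y alone
    cost = r @ r
    for _ in range(max_iter):
        J = jacobian(x, p)
        step = alpha * np.linalg.solve(J.T @ J + beta * np.eye(p.size), J.T @ r)
        if np.linalg.norm(step) < tol:
            break
        p_trial = p - step
        r_trial = model(x, p_trial) - y
        cost_trial = r_trial @ r_trial
        if cost_trial < cost:        # accept the step, trust the model more
            p, r, cost = p_trial, r_trial, cost_trial
            beta = max(beta / 10.0, 1e-12)
        else:                        # reject the step, damp more heavily
            beta *= 10.0
    return p

# Synthetic, noise-free data with assumed true values a = 0.5, b = 1.5
x = np.linspace(0.1, 3.0, 30)
y = model(x, (0.5, 1.5))
p_fit = levenberg_marquardt(x, y, p0=(1.0, 1.0))
print(p_fit)
```

Only first derivatives of the model are needed, and a rejected step costs nothing but an increase in [itex]\beta[/itex], which is one plausible reason this form is less sensitive to the choice of [itex]\alpha[/itex] and [itex]\beta[/itex].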
 