Nonlinear least squares vs OLS

  • #1
fog37
TL;DR Summary
Difference between nonlinear least squares and ordinary least squares
Hello,

I understand that the method of ordinary least squares (OLS) is about finding the coefficients that minimize the sum ##\sum_i (y_i - g(x_i))^2##, where ##g(X)## is the statistical model chosen to fit the data. Besides OLS, there are clearly other coefficient estimation methods (MLE, etc.).

In general, OLS is fair game when the model ##g(X)## is "linear with respect to the parameters" (linear regression, polynomial regression, etc.): any model that is a sum of terms, each term being the product of an estimated coefficient and some function of the variables: ##g(X) = \sum_j \beta_j f_j(X)##, where the ##f_j(X)## act like basis functions. For example, ##g(X)=\beta_0 +\beta_1 X+\beta_2 X^2## is linear in the parameters, and the basis functions are the three functions ##1, X, X^2##...
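For instance, that quadratic model can be fit by OLS precisely because it is linear in the coefficients. A minimal sketch in NumPy (the data and the "true" coefficients here are invented for illustration):

```python
import numpy as np

# Invented data from the quadratic model above, with additive noise
rng = np.random.default_rng(0)
x = np.linspace(0, 5, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(0, 0.3, size=x.size)

# Design matrix whose columns are the basis functions 1, X, X^2
A = np.column_stack([np.ones_like(x), x, x**2])

# OLS: the coefficients minimizing sum((y - A @ beta)**2)
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
print(beta)  # close to [1.0, 2.0, -0.5]
```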

Of course, the OLS approach is valid as long as specific assumptions on the residuals are met. Additionally, after taking the first derivatives with respect to the coefficients and setting them to zero, we arrive at nice analytical formulas for the coefficients.
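(In matrix form, with design matrix ##X## whose columns are the basis functions evaluated at the data, those normal equations give the familiar closed form ##\hat{\beta} = (X^T X)^{-1} X^T y##, assuming ##X^T X## is invertible.)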

That said, what is the issue with using OLS when ##g(X)## is a nonlinear model? I know that sometimes we "convert" a nonlinear model so that it assumes the form of a linear model. That strategy then allows us to use OLS on the new model, based on the transformed variables... That is a useful hack.
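For example (a sketch with invented numbers), the exponential model ##y = a e^{bX}## with multiplicative error linearizes under a log transform, ##\ln y = \ln a + bX##, after which OLS applies:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.1, 4, 40)
y = 2.0 * np.exp(0.7 * x) * rng.lognormal(0.0, 0.1, size=x.size)  # multiplicative noise

# ln(y) = ln(a) + b*x is linear in the parameters (ln(a), b)
A = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
print(np.exp(coef[0]), coef[1])  # roughly a = 2.0 and b = 0.7
```

Note the hack is only innocent when the error really is multiplicative: with additive error, least squares on the log scale minimizes a different criterion than least squares on the original scale.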

But I have been reading about "nonlinear least squares". Isn't it the same approach as OLS, just with a nonlinear model, where we directly plug the nonlinear ##g(X)## into ##\sum_i (y_i - g(x_i))^2##? We may not end up with analytical estimators and may have to solve for the coefficients using some numerical method... But I don't see an issue with applying least squares to nonlinear models...
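That direct approach is what nonlinear least squares packages implement. A sketch using scipy.optimize.curve_fit (invented data again), which minimizes the same sum of squares iteratively:

```python
import numpy as np
from scipy.optimize import curve_fit

def g(x, a, b):
    # Nonlinear in the parameter b: no fixed basis-function expansion
    return a * np.exp(b * x)

rng = np.random.default_rng(2)
x = np.linspace(0, 4, 40)
y = 2.0 * np.exp(0.7 * x) + rng.normal(0, 0.5, size=x.size)  # additive noise

# p0 is the initial guess for the iterative solver
popt, pcov = curve_fit(g, x, y, p0=[1.0, 1.0])
print(popt)  # roughly [2.0, 0.7]
```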

Thank you.
 
  • #2
OLS minimizes the sum of squared errors of the actual samples versus the estimated values. If that is your goal, then that is the thing to do.
 
  • #3
FactChecker said:
OLS minimizes the sum of squared errors of the actual samples versus the estimated values. If that is your goal, then that is the thing to do.
That is the goal, but many resources I read state that OLS is only for linear models, and that puzzled me... Is it because the estimates resulting from applying least squares to a nonlinear model are not as good as they could be?
 
  • #4
I should probably not have called it OLS. If your goal is to minimize the sum-squared-errors, then do that, whether it requires OLS or a numerical technique.
These problems do not exist in a vacuum. You should have a reason for the model you propose and have something that you want to use the results for. That should determine what approach you can use. What you need to be aware of is that the statistical results like confidence intervals of the parameters may not be valid if certain assumptions are not met.
 
  • #5
FactChecker said:
I should probably not have called it OLS. If your goal is to minimize the sum-squared-errors, then do that, whether it requires OLS or a numerical technique.
These problems do not exist in a vacuum. You should have a reason for the model you propose and have something that you want to use the results for. That should determine what approach you can use. What you need to be aware of is that the statistical results like confidence intervals of the parameters may not be valid if certain assumptions are not met.
I see.

Inferential statistics is either about estimation, hypothesis testing, or both. Estimation is really just about coming up with a reasonably good numerical estimate of the parameter: unbiased, consistent, and low variance.

Hypothesis testing focuses on a different task: it hypothesizes a value for the unknown population parameter and uses the limited sample data to check whether that hypothesis (H0) is tenable. Confidence intervals, standard errors, and p-values result from hypothesis testing, not from estimation, correct?

If the required assumptions are not met by the chosen model, estimation may still work just fine...but confidence intervals, standard errors, p-values, etc. will not be reliable, statistically speaking.

For example, in linear regression the response variable ##Y## does not have to be normally distributed for the model to be sound and for us to get good estimates of the slope and intercept. The Gauss-Markov assumptions don't force ##Y## or the residuals to have a normal distribution at all... But confidence intervals, standard errors, and p-values, the output of hypothesis testing, will not be good if ##Y## is not normal, which implies that the residuals will also not be normally distributed...
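A quick simulation seems to back up the estimation half of this (a sketch with invented data): with skewed, clearly non-normal errors, the OLS slope estimates still center on the true value, as Gauss-Markov promises; whether the normal-theory intervals are exact is a separate question that the sketch does not settle.

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0, 1, 30)
A = np.column_stack([np.ones_like(x), x])

slopes = []
for _ in range(5000):
    eps = rng.exponential(1.0, size=x.size) - 1.0  # skewed errors, mean zero
    y = 1.0 + 2.0 * x + eps
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    slopes.append(beta[1])

print(np.mean(slopes))  # close to the true slope 2.0 despite non-normal errors
```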

Am I thinking correctly here?
 
  • #6
fog37 said:
For example, in linear regression the response variable ##Y## does not have to be normally distributed for the model to be sound and for us to get good estimates of the slope and intercept. The Gauss-Markov assumptions don't force ##Y## or the residuals to have a normal distribution at all... But confidence intervals, standard errors, and p-values, the output of hypothesis testing, will not be good if ##Y## is not normal, which implies that the residuals will also not be normally distributed...

Am I thinking correctly here?
When you talk about a normal distribution, you should be talking about the random term, ##\epsilon##, not about ##Y##. There can be many ways that random behavior influences ##Y##. I have not seen you mention that yet. You need to pay special attention to how the random term enters into the equation. Without that, your model is incomplete.
Some example models (simulated in the sketch after this list) are:
##Y = a_0 + a_1 X_1 + a_2 X_2 + \epsilon##
or
##Y = \epsilon \cdot e^{a_0 + a_1 X_1 + a_2 X_2}##
or
##Y = g( X + \epsilon)##
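To see why the placement of ##\epsilon## matters, here is a small simulation (a sketch, with invented parameters and a single predictor) of the first two forms: the additive model has constant scatter around its trend, while the multiplicative one has scatter that grows with the mean.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0, 2, 500))
eps = rng.normal(0, 0.2, size=x.size)

y_add = 1.0 + 2.0 * x + eps                  # Y = a0 + a1*X + eps
y_mul = np.exp(eps) * np.exp(1.0 + 2.0 * x)  # Y = eps_factor * e^(a0 + a1*X)

# Spread of Y around the true trend in the lower vs upper half of X:
lo, hi = x < 1.0, x >= 1.0
r_add = y_add - (1.0 + 2.0 * x)
r_mul = y_mul - np.exp(1.0 + 2.0 * x)
print(r_add[lo].std(), r_add[hi].std())  # roughly equal
print(r_mul[lo].std(), r_mul[hi].std())  # much larger in the upper half
```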
 
  • #7
I see. Your point is that the residuals can be normally distributed (and have equal variance) at each ##X## value... but that does not automatically imply that the observed response variable ##Y## also has normally distributed values...

However, I have always thought that if the error is normal, then ##Y## is also normally distributed...
 
  • #8
fog37 said:
I see. Your point is that the residuals can be normally distributed (and have equal variance) at each ##X## value... but that does not automatically imply that the observed response variable ##Y## also has normally distributed values...

However, I have always thought that if the error is normal, then ##Y## is also normally distributed...
IMO, we shouldn't talk about "residuals" and "error" as though they are a simple normal random variable with a mean of 0. They are the errors of an estimated model versus the true model and can be changed by other errors in the estimated model.
Suppose we have an actual physical relationship ##Y = a_0 + a_1 X + \epsilon##, where ##\epsilon## is a normal variable with a mean of zero, and estimate it with a linear equation ##\hat Y = \hat {a_0} + \hat {a_1} X##.
Then the errors or residuals are ##\hat {\epsilon_i} = y_i - \hat {y_i} = (a_0 - \hat {a_0}) + (a_1 - \hat {a_1})x_i +\epsilon_i##
##\hat {\epsilon_i}## is different from the random term ##\epsilon_i##: it includes a term that depends on ##x_i##.
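A numerical sketch of this point (assuming the simple linear model above, with invented data): the fitted residuals ##\hat{\epsilon}_i## are forced by the fit to sum to zero and to be orthogonal to ##x##, and they are slightly less spread out than the true ##\epsilon_i##, which obey no such constraints.

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(0, 1, 20)
eps = rng.normal(0, 1.0, size=x.size)   # the true random term
y = 1.0 + 2.0 * x + eps                 # Y = a0 + a1*X + eps

A = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta                    # the fitted residuals eps_hat

print(eps.std(), resid.std())           # residuals typically a bit less variable
print(resid.sum(), resid @ x)           # both ~0 by construction; eps is not constrained
```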
 

1. What is the difference between nonlinear least squares (NLS) and ordinary least squares (OLS)?

Ordinary Least Squares (OLS) is used for regression models that are linear in the parameters (which includes, for example, polynomial models that are nonlinear in the variables). It minimizes the sum of the squared differences between observed values and the values predicted by the model, and the minimizing coefficients have a closed-form solution. Nonlinear Least Squares (NLS), on the other hand, is used for fitting models that are nonlinear in the parameters. It minimizes the same sum of squared residuals but requires iterative numerical procedures for parameter estimation.

2. When should I use nonlinear least squares instead of OLS?

Nonlinear least squares should be used when the relationship between the variables in your model cannot be adequately described by a linear equation. This might be the case if the effect of changes in the predictor variables on the response variable changes depending on the value of the predictor variables. If a plot of the data or theoretical understanding of the underlying processes suggests a nonlinear relationship, NLS might be the appropriate method to use.

3. What are the main challenges of using nonlinear least squares?

The main challenges of using nonlinear least squares include the complexity of finding the best fit. Unlike OLS, NLS often requires initial guesses of the parameters and can converge to local minima rather than the global minimum. This method is also more sensitive to outliers and can be significantly affected by the choice of starting values. Additionally, the computational effort is generally higher than for OLS, and assessing the goodness-of-fit can be more complex.
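The starting-value problem is easy to provoke. A sketch with an invented one-parameter model (fitting the frequency of a sinusoid, whose sum of squares has many local minima):

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, b):
    return np.sin(b * x)

rng = np.random.default_rng(6)
x = np.linspace(0, 10, 200)
y = np.sin(3.0 * x) + rng.normal(0, 0.1, size=x.size)

# The sum of squares as a function of b has many local minima; the
# iterative solver typically converges to whichever one is nearest p0.
for b0 in (0.5, 2.0, 3.2):
    popt, _ = curve_fit(model, x, y, p0=[b0])
    print(b0, popt[0])  # only starts near 3.0 tend to recover the true frequency
```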

4. Can I use OLS for nonlinear relationships by transforming the data?

Yes, one common approach to handle nonlinear relationships with OLS is to transform the data so that the relationship becomes linear. For example, if you suspect a logarithmic relationship, you can take the logarithm of the dependent variable, or independent variable, or both, to linearize the data. However, this approach depends heavily on correctly identifying the right transformation, and the interpretation of the results can become less intuitive.

5. How do I evaluate the fit of a nonlinear least squares model?

Evaluating the fit of a nonlinear least squares model typically involves looking at the residual plots, the coefficient of determination (R-squared), and other goodness-of-fit statistics. However, because of the potential complexity of the model, you might also need to consider information criteria such as AIC or BIC, which penalize excessive model complexity. Additionally, it's crucial to check the sensitivity of the estimates to different starting values and ensure that the algorithm has converged to a reasonable solution.
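As a rough sketch of such checks, continuing the invented exponential example from the thread above (the AIC used here is the Gaussian-likelihood version, ##2k + n\ln(\mathrm{RSS}/n)##, up to an additive constant):

```python
import numpy as np
from scipy.optimize import curve_fit

def g(x, a, b):
    return a * np.exp(b * x)

rng = np.random.default_rng(7)
x = np.linspace(0, 4, 60)
y = 2.0 * np.exp(0.7 * x) + rng.normal(0, 0.5, size=x.size)

popt, pcov = curve_fit(g, x, y, p0=[1.0, 1.0])
resid = y - g(x, *popt)

n, k = x.size, len(popt)
rss = float(resid @ resid)
aic = 2 * k + n * np.log(rss / n)             # Gaussian AIC, up to a constant
r2 = 1.0 - rss / float(((y - y.mean()) ** 2).sum())
print(aic, r2)

# Sensitivity check: refit from different starting values
for p0 in ([0.5, 0.3], [5.0, 1.5]):
    print(curve_fit(g, x, y, p0=p0)[0])       # should agree if the fit is stable
```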
