I What is Nonlinear Least Squares in Regression Analysis?

fog37
TL;DR Summary: understanding regression in general
Hello,

Regression analysis is about finding/estimating the coefficients for a particular function ##f## that would best fit the data. The function ##f## could be a straight line, an exponential, a power law, etc. The goal remains the same: finding the coefficients.

If the data does not show a linear trend, we cannot directly use linear regression. For example, in the case of data following an exponential trend, we can take the log of the ##Y## data (leaving the ##X## data alone) and get a straight-line relation between ##\log(Y)## and ##X##. At this point, we can apply least squares and get the required coefficients. That is a nice hack: use logs to turn the problem into a linear regression problem and find the coefficients... The same goes for a power-law relation between ##Y## and ##X##...
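As a minimal sketch of the log trick (made-up data with assumed true coefficients ##A = 2##, ##b = 0.5##): fitting ##\log(Y) = \log(A) + bX## with ordinary linear regression recovers the exponential model's coefficients.

```python
import numpy as np

# Synthetic data following an exponential trend Y = A * exp(b * X),
# with made-up true values A = 2.0, b = 0.5 and multiplicative noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 50)
y = 2.0 * np.exp(0.5 * x) * rng.lognormal(0.0, 0.05, size=x.size)

# Take the log of Y only: log(Y) = log(A) + b * X is linear in X.
slope, intercept = np.polyfit(x, np.log(y), deg=1)
b_hat, A_hat = slope, np.exp(intercept)
print(A_hat, b_hat)  # close to 2.0 and 0.5
```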

A polynomial is simply an extension of the power law. I think we can apply least-squares to minimize the ##MSE## without any log transformation...Is that correct?

What about other more general relationships? I am looking into "nonlinear" least squares. At a high level, is it a technique to find the coefficients using a variation of least squares (I guess ordinary least squares, which minimizes the MSE, is called linear least squares) without having to transform our data so it follows a linear trend?

Thanks for any clarification!
 
fog37 said:
If the data does not show a linear trend, we cannot directly use linear regression. For example, in the case of data following an exponential trend, we can take the log of the ##Y## data (leaving the ##X## data alone) and get a straight-line relation between ##\log(Y)## and ##X##. At this point, we can apply least squares and get the required coefficients. That is a nice hack: use logs to turn the problem into a linear regression problem and find the coefficients...
Yes.
fog37 said:
The same goes for a power law relation between ##Y## and ##X##...
Not sure what the power law model is.
fog37 said:
A polynomial is simply an extension of the power law. I think we can apply least-squares to minimize the ##MSE## without any log transformation...Is that correct?
If you have data ##(y_i, x_i)## and you see that the curve ##Y = a X^2 + b## might fit, you can square your ##x_i## values and apply linear regression. You can extend this to polynomials. The "linear" part of linear regression refers to how the coefficients appear in the model.
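A quick sketch of that idea (made-up data with assumed true values ##a = 3##, ##b = 1##): square the ##x_i##, then solve an ordinary linear least-squares problem in the coefficients.

```python
import numpy as np

# Hypothetical data generated from Y = a X^2 + b with a = 3.0, b = 1.0.
rng = np.random.default_rng(1)
x = np.linspace(-2.0, 2.0, 40)
y = 3.0 * x**2 + 1.0 + rng.normal(0.0, 0.1, size=x.size)

# Treat X^2 as the regressor: the model is linear in the coefficients a, b.
X = np.column_stack([x**2, np.ones_like(x)])
(a_hat, b_hat), *_ = np.linalg.lstsq(X, y, rcond=None)
print(a_hat, b_hat)  # close to 3.0 and 1.0
```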
fog37 said:
What about other more general relationships? I am looking into "nonlinear" least squares. At high level, is it a technique to find the coefficients using a variation of least-squares (I guess the ordinary least-squares which minimizes the MSE is called linear least-squares) without having to transform our data so it follows a linear trend?
No. Linear least squares works, without any transformation, as long as the coefficients appear in the model linearly.
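When the coefficients do enter nonlinearly, nonlinear least squares minimizes the sum of squared residuals iteratively from an initial guess. A sketch with SciPy's `curve_fit` (made-up model and true values; the offset ##c## makes the log trick unusable here):

```python
import numpy as np
from scipy.optimize import curve_fit

# Model in which b appears nonlinearly and the offset c prevents taking logs:
# Y = a * exp(b * X) + c, with assumed true values a=1.5, b=0.8, c=4.0.
def model(x, a, b, c):
    return a * np.exp(b * x) + c

rng = np.random.default_rng(2)
x = np.linspace(0.0, 3.0, 60)
y = 1.5 * np.exp(0.8 * x) + 4.0 + rng.normal(0.0, 0.1, size=x.size)

# curve_fit minimizes the sum of squared residuals iteratively,
# starting from the initial guess p0.
popt, pcov = curve_fit(model, x, y, p0=(1.0, 1.0, 1.0))
print(popt)  # close to [1.5, 0.8, 4.0]
```

Note that, unlike the linear case, the result can depend on the starting point `p0`.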
 
Thank you.

As far as linear regression goes, what happens if the scatter plot of ##Y## vs ##X## shows a good linear trend/association but the required model assumptions are not satisfied (e.g., the residuals are correlated, or the residuals-vs-##X## plot shows unequal variance)? Is that even possible? Or will we not see a linear trend in the scatterplot if the assumptions are not met?

Thanks again
 
fog37 said:
Thank you.

As far as linear regression goes, what happens if the scatter plot of ##Y## vs ##X## shows a good linear trend/association but the required model assumptions (residuals are not uncorrelated
Correlated residuals sound like time-series data. Is that what you mean?
fog37 said:
and have equal variance in the graph residuals vs ##X##)
Do you mean that the variance might be proportional to the ##Y## magnitude? That would imply a model like ##Y = \epsilon X##. I think you should try taking logarithms of both sides: ##\log Y = \log X + \epsilon_l##.
But there are a million similar things that might come up, so it is best to wait until you have a specific case and ask about that.
 
In general you want to transform data that exhibit power-law characteristics until the series is homoskedastic (same variance). So, for example, if your right-hand-side variable is human height, you can leave it alone, but if it is wealth or the market cap of a stock, use logs so the variance does not scale with the value of the right-hand-side variable.
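A small illustration of that point (synthetic data, multiplicative noise by assumption): with ##Y = X\,\epsilon##, the raw residual spread grows with ##X##, while after taking logs the spread is roughly constant.

```python
import numpy as np

# Multiplicative noise Y = X * eps makes the spread grow with X,
# while log(Y) = log(X) + log(eps) has roughly constant spread.
rng = np.random.default_rng(3)
x = np.linspace(1.0, 100.0, 500)
y = x * rng.lognormal(0.0, 0.2, size=x.size)

resid_raw = y - x                  # spread grows with x
resid_log = np.log(y) - np.log(x)  # spread roughly constant

lo, hi = resid_raw[x < 20], resid_raw[x > 80]
print(hi.std() / lo.std())         # much greater than 1
lo_l, hi_l = resid_log[x < 20], resid_log[x > 80]
print(hi_l.std() / lo_l.std())     # close to 1
```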
 