# Least-squares estimation of linear regression coefficients

1. Jul 14, 2008

### DMTN

AFAIK, there are two basic types of linear regression:
y = ax + b and y = ax^2 + bx + c
But I have to do the same with the function y = a sin(x) + b cos(x).
Here is what I have done:

We have:
$$\begin{array}{l} \frac{\partial L}{\partial a} = 0 \\ \frac{\partial L}{\partial b} = 0 \end{array}$$

Continue:
$$\begin{array}{l} \frac{\partial L}{\partial a} = \sum\limits_{i = 1}^n 2\left[ f_i - \left( a\sin (\frac{\pi x}{2}) + b\cos (\frac{\pi x}{2}) \right) \right]\left( - \sin (\frac{\pi x}{2}) \right) \\ \frac{\partial L}{\partial b} = \sum\limits_{i = 1}^n 2\left[ f_i - \left( a\sin (\frac{\pi x}{2}) + b\cos (\frac{\pi x}{2}) \right) \right]\left( - \cos (\frac{\pi x}{2}) \right) \end{array}$$

At last, I have:

$$\left( \begin{array}{cc} \sin^2 \left( \frac{\pi x}{2} \right) & \sin \left( \frac{\pi x}{2} \right)\cos \left( \frac{\pi x}{2} \right) \\ \sin \left( \frac{\pi x}{2} \right)\cos \left( \frac{\pi x}{2} \right) & \cos^2 \left( \frac{\pi x}{2} \right) \end{array} \right)\left( \begin{array}{l} a \\ b \end{array} \right) = \left( \begin{array}{l} f_i\sin \left( \frac{\pi x}{2} \right) \\ f_i\cos \left( \frac{\pi x}{2} \right) \end{array} \right)$$

What should I do now? Any suggestions for this situation would be appreciated.

2. Jul 14, 2008

### mathman

It doesn't look right at all. For starters, you should have xi as the argument for each i, not x. Then the known quantities in the matrix and the r.h.s. vector will all have summations over i.
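
As a sketch of what the corrected system would presumably look like, every known entry becomes a sum over the samples. The shorthand $$\theta_i = \frac{\pi x_i}{2}$$ is notation introduced here for readability, not something from the original post:

$$\left( \begin{array}{cc} \sum\limits_{i=1}^n \sin^2\theta_i & \sum\limits_{i=1}^n \sin\theta_i\cos\theta_i \\ \sum\limits_{i=1}^n \sin\theta_i\cos\theta_i & \sum\limits_{i=1}^n \cos^2\theta_i \end{array} \right)\left( \begin{array}{c} a \\ b \end{array} \right) = \left( \begin{array}{c} \sum\limits_{i=1}^n f_i\sin\theta_i \\ \sum\limits_{i=1}^n f_i\cos\theta_i \end{array} \right)$$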

3. Jul 15, 2008

### hotvette

Looks like you are trying to develop what are called the 'normal equations':

$$A^TAc = A^Ty$$

Check out the 1st attachment in the following thread:

The normal equations are fine from a mathematical standpoint, but in computational practice it is usually not a good idea to use them. It's better to factor A using QR or SVD. Example using QR:

$$Rc = Q^Ty$$

http://www.alkires.com/teaching/ee103/Rec8_LLSAndQRFactorization.htm
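
To make the comparison concrete, here is a minimal NumPy sketch that solves the same fit once through the normal equations and once through a QR factorization. The sample points, noise level, "true" coefficients, and function names are all made up for illustration.

```python
import numpy as np

def fit_normal_equations(A, y):
    """Solve A^T A c = A^T y directly (fine mathematically, less robust numerically)."""
    return np.linalg.solve(A.T @ A, A.T @ y)

def fit_qr(A, y):
    """Factor A = QR (reduced form) and solve R c = Q^T y."""
    Q, R = np.linalg.qr(A)
    return np.linalg.solve(R, Q.T @ y)

# Hypothetical data: y = 2 sin(x) - 1.5 cos(x) plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 3.0, 50)
A = np.column_stack([np.sin(x), np.cos(x)])
y = 2.0 * np.sin(x) - 1.5 * np.cos(x) + 0.01 * rng.standard_normal(x.size)

c_ne = fit_normal_equations(A, y)
c_qr = fit_qr(A, y)
print("normal equations:", c_ne)
print("QR factorization:", c_qr)
```

On a well-conditioned problem like this one the two results should agree closely; the difference between the methods shows up when the columns of A are nearly dependent.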

4. Jul 17, 2008

### zyh

Great tutorials on the least-squares method.
To the OP:

It is very simple. You can write the equation as below:
$$\left[\begin{array}{cc} \sin(x) & \cos(x)\end{array}\right]\left[\begin{array}{c} a\\ b\end{array}\right]=y$$

and for each x_i and y_i, you get the equation
$$\left[\begin{array}{cc} \sin(x_{i}) & \cos(x_{i})\end{array}\right]\left[\begin{array}{c} a\\ b\end{array}\right]=y_{i}$$

Then stack these equations together, one row per sample, and you get $$AC=Y$$
where $$C=\left[\begin{array}{c} a\\ b\end{array}\right]$$
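
A small sketch of this row-by-row construction, assuming NumPy and a handful of made-up sample points (the values of `x`, `a_true`, and `b_true` are hypothetical). Each sample contributes one row `[sin(x_i), cos(x_i)]`, and `numpy.linalg.lstsq` solves the stacked system in the least-squares sense:

```python
import numpy as np

# Hypothetical samples generated from known coefficients (no noise).
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
a_true, b_true = 3.0, -2.0
y = a_true * np.sin(x) + b_true * np.cos(x)

# One row [sin(x_i), cos(x_i)] per sample, stacked into A.
A = np.column_stack([np.sin(x), np.cos(x)])

# Least-squares solution of A C = Y; also returns the rank of A.
C, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
a_hat, b_hat = C
print(a_hat, b_hat, rank)
```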

5. Jul 25, 2008

### ssd

There is a problem in your approach. sin(x) and cos(x) are not uncorrelated, so the matrix (A'A) may be singular, depending on the sample values.

6. Jul 25, 2008

### ssd

The problem may be tackled in the following way:
Write a sin(x) + b cos(x) = c sin(d + x), where c = sqrt(a^2 + b^2) and sin(d) = b/c.
Start with any arbitrary value of d. Regress to find c in the usual way and find the residual sum of squares (RSS). Now vary d and repeat the previous procedure, again finding the RSS. Compare this value of the RSS with the previous one and check how the RSS decreases as d varies. Keep repeating the procedure until the RSS does not decrease further (or you are satisfied with a very small value of the RSS). Choose this pair of c and d, then solve to find a and b.
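
One possible reading of this procedure, sketched in NumPy. For a fixed d the model has a single unknown c, which has the closed form c = Σ y_i s_i / Σ s_i² with s_i = sin(x_i + d). The brute-force grid over d is a simplification chosen here, not part of the post; angles are in radians, and the function names and data are hypothetical. The sketch also assumes sin(x_i + d) is not zero for every sample.

```python
import numpy as np

def fit_c_for_fixed_d(x, y, d):
    """One-parameter regression of y on sin(x + d); returns (c, RSS)."""
    s = np.sin(x + d)
    c = (s @ y) / (s @ s)
    rss = np.sum((y - c * s) ** 2)
    return c, rss

def scan_phase(x, y, n_grid=3600):
    """Try d on a grid over [0, pi) and keep the pair (c, d) with the smallest RSS.

    Half a period is enough because c is allowed to be negative:
    c*sin(d + pi + x) equals -c*sin(d + x).
    """
    best_c, best_d, best_rss = 0.0, 0.0, np.inf
    for d in np.linspace(0.0, np.pi, n_grid, endpoint=False):
        c, rss = fit_c_for_fixed_d(x, y, d)
        if rss < best_rss:
            best_c, best_d, best_rss = c, d, rss
    return best_c, best_d, best_rss

# Hypothetical noise-free data from a = 2, b = -1.5.
x = np.linspace(0.0, 3.0, 40)
y = 2.0 * np.sin(x) - 1.5 * np.cos(x)

c, d, rss = scan_phase(x, y)
a_hat, b_hat = c * np.cos(d), c * np.sin(d)  # from c*sin(d + x) expanded
print(a_hat, b_hat, rss)
```

The accuracy of `a_hat` and `b_hat` here is limited by the grid spacing in d.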

7. Jul 25, 2008

### zyh

Great, ssd, this is a wonderful algorithm.
Let me explain it in more detail.
We can rewrite the equation as below:
$$y=a\sin(x)+b\cos(x)=\sqrt{a^{2}+b^{2}}\left(\frac{a}{\sqrt{a^{2}+b^{2}}}\sin(x)+\frac{b}{\sqrt{a^{2}+b^{2}}}\cos(x)\right)$$
So I can simply define two variables, c and d, as below:
define variable c: $$c=\sqrt{a^{2}+b^{2}}$$

define variable d: $$\sin(d)=\frac{b}{c},\cos(d)=\frac{a}{c}$$

So we get $$y=c\sin(d+x)$$
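
As a quick check of this substitution (a worked step added here, using only the definitions of c and d above), the angle-addition formula gives

$$c\sin(d+x)=c\sin(d)\cos(x)+c\cos(d)\sin(x)=b\cos(x)+a\sin(x)$$

so once c and d are known, the original coefficients follow as $$a=c\cos(d)$$ and $$b=c\sin(d)$$.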

As you said, "start with any arbitrary value of d". It is then very simple to find c in the usual way, because there is only one unknown variable, c, in
$$y=c\sin(d+x)$$
It is also easy to get c and, from it, the RSS.

But my question is: how should d vary? That is, should the next value of d be bigger than the previous one, or smaller? Is there a way to make the RSS converge to a smaller value?

Thank you!

8. Jul 26, 2008

### ssd

We started with an arbitrary value of d. Next, change d to d+10, say. We could also change d to d-10, or by any other arbitrary amount. Now we have to check whether the RSS increases or decreases. If it increases, then we have to change d in the other direction. In brief, we vary d so that, at the termination point of the algorithm, the RSS increases if d is changed in either direction. That is, choose d so that the RSS has at least a local minimum at that value of d.
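
A minimal sketch of one way to turn this description into code, assuming NumPy and angles in radians. The rule used here (keep stepping while the RSS drops, otherwise reverse direction and halve the step) is one common choice and an assumption of this sketch; the post itself does not fix a step rule. Function names, the starting step, and the data are hypothetical.

```python
import numpy as np

def rss_at(x, y, d):
    """For a fixed d, regress y on sin(x + d) and return (RSS, c)."""
    s = np.sin(x + d)
    c = (s @ y) / (s @ s)
    return np.sum((y - c * s) ** 2), c

def search_phase(x, y, d0=0.0, step=0.5, tol=1e-10, max_iter=10_000):
    """Move d while the RSS decreases; on an increase, reverse and halve the step."""
    d = d0
    rss, c = rss_at(x, y, d)
    for _ in range(max_iter):
        if abs(step) < tol:
            break
        rss_new, c_new = rss_at(x, y, d + step)
        if rss_new < rss:
            d, rss, c = d + step, rss_new, c_new  # accept the move
        else:
            step = -step / 2.0                    # wrong way or overshoot
    return c, d, rss

# Hypothetical data: a = 2, b = -1.5, small noise.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 3.0, 40)
y = 2.0 * np.sin(x) - 1.5 * np.cos(x) + 0.01 * rng.standard_normal(x.size)

c, d, rss = search_phase(x, y)
a_hat, b_hat = c * np.cos(d), c * np.sin(d)
print(a_hat, b_hat, rss)
```

At termination, a small change of d in either direction no longer lowers the RSS, which is the stopping condition described above.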

9. Jul 26, 2008

### zyh

Hi, I'm glad to discuss this topic with you, but I think it's still numerically difficult to state the algorithm like this, because I don't know whether to use d = d + 10, d = d + 100, or some other value. It seems too arbitrary.
Let me take some time to analyze this idea.

10. Jul 26, 2008

### ssd

Basically, the first change in d has to be arbitrary unless we have other information. Generally it does not make a big difference whether the increment is 10 or 100, since we started with an arbitrary d. What matters is detecting the direction of further changes. In practice, a suitable computer program detects the minimum of the RSS almost instantly with this method.

Looking forward to your thoughts.

11. Jul 27, 2008

### zyh

Hi ssd, I think you've made a mistake about the linear least-squares method.
Look here:
http://en.wikipedia.org/wiki/Linear_least_squares#The_general_problem
So, for the equation AC = Y,
where $$A=\left[\begin{array}{cc} \sin(x_{1}) & \cos(x_{1})\\ \sin(x_{2}) & \cos(x_{2})\\ \sin(x_{3}) & \cos(x_{3})\\ \cdots & \cdots\end{array}\right]$$

even if x1 = x2, A can still have linearly independent columns.
So I think the regular algorithm still applies.
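
This claim can be checked numerically. The sketch below (NumPy, made-up sample values) repeats one sample and confirms that A still has rank 2. It also checks the identity det(AᵀA) = Σ_{i<j} sin²(x_i − x_j), which follows from Lagrange's identity and is added here as a side note: A'A is singular only when all samples differ from one another by multiples of π.

```python
import numpy as np
from itertools import combinations

# Hypothetical samples with a repeated value: x1 == x2.
x = np.array([0.3, 0.3, 1.1, 2.0])
A = np.column_stack([np.sin(x), np.cos(x)])

rank = np.linalg.matrix_rank(A)

# det(A^T A) computed directly and via the pairwise-difference formula.
det_direct = np.linalg.det(A.T @ A)
det_pairs = sum(np.sin(xi - xj) ** 2 for xi, xj in combinations(x, 2))

print(rank, det_direct, det_pairs)
```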

12. Jul 27, 2008

### ssd

I don't understand what you mean by 'even x1=x2'. Do you mean two columns of A are identical? Then of course the columns of A are not all independent. If you mean the first two rows are identical, then that is not really relevant here.
Now look at my first post. I said A'A may be singular depending on the sample values. That is, we cannot eliminate the chance of singularity (I stated this generally for correlated columns, and in our particular problem the columns are correlated).
With my approach you get the same answer if there is no singularity, and if there is one, the right answer is still obtained.
Furthermore, if one has an equation of the form y = a + b.Sin(x) + c.d^x, the method I stated remains a handy approach.
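
For the model y = a + b sin(x) + c d^x, the same idea might be sketched as follows: for each trial d the remaining coefficients a, b, c enter linearly, so they come from an ordinary least-squares solve, and d is chosen to minimize the RSS. The grid over d, the data, and the function names are assumptions of this sketch. The grid deliberately avoids d = 1, where d^x is constant and coincides with the intercept column.

```python
import numpy as np

def fit_linear_part(x, y, d):
    """For fixed d, solve for (a, b, c) in y = a + b*sin(x) + c*d**x by least squares."""
    A = np.column_stack([np.ones_like(x), np.sin(x), d ** x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ coef) ** 2)
    return coef, rss

def scan_d(x, y, d_grid):
    """Return the d on the grid (with its coefficients) giving the smallest RSS."""
    best_d, best_coef, best_rss = None, None, np.inf
    for d in d_grid:
        coef, rss = fit_linear_part(x, y, d)
        if rss < best_rss:
            best_d, best_coef, best_rss = d, coef, rss
    return best_d, best_coef, best_rss

# Hypothetical noise-free data with a = 1, b = 2, c = 0.5, d = 1.3.
x = np.linspace(0.0, 4.0, 60)
y = 1.0 + 2.0 * np.sin(x) + 0.5 * 1.3 ** x

d_grid = np.linspace(1.05, 2.0, 96)  # spacing 0.01, includes 1.3
d_hat, (a_hat, b_hat, c_hat), rss = scan_d(x, y, d_grid)
print(d_hat, a_hat, b_hat, c_hat, rss)
```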

Last edited: Jul 27, 2008
13. Jul 27, 2008

### zyh

Hi, let me clarify my thoughts.

I was considering whether A'A is singular in a case such as x1 == x2 (where x1, x2, x3, ... are all sampled values of x).
The first two rows of A are then identical, but that does not affect the linear dependence of the columns, because normally there are many samples x_i.

You said that A'A may be singular. I do agree! This singular condition may exist not only in the problem y = a sin(x) + b cos(x), but in every LSM problem.

consider:
$$AC=Y$$.
(http://en.wikipedia.org/wiki/Linear_least_squares#The_general_problem)

If rank(A) < rank([A, Y]), these equations have no exact solution,
so the LSM can be applied.

Let's consider the augmented equation:
$$A^{T}AC=A^{T}Y$$
Because $$rank(A^{T}A)=rank(\left[A^{T}A,A^{T}Y\right])$$ always holds, the augmented equations do have solutions.
This divides into two cases:
1. Singular: if A'A is singular, then we have an infinite number of solutions.
2. Non-singular: we have only "ONE solution".

So, if we check that rank(A) = dimension of C, we can always get the "ONE solution". Otherwise, I don't think there is a fixed handy approach.
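
To illustrate the two cases, here is a small NumPy sketch of a deliberately singular example. The samples are chosen (hypothetically) to differ by multiples of π, so the columns of A are proportional. The explicit tolerances passed to `matrix_rank` and `lstsq` are a choice made for this sketch, so that round-off in sin and cos does not hide the rank deficiency.

```python
import numpy as np

# Samples spaced by pi: every row is +/- [sin(0.4), cos(0.4)], so rank(A) = 1.
x = 0.4 + np.pi * np.arange(3)
A = np.column_stack([np.sin(x), np.cos(x)])
y = A @ np.array([1.0, 2.0])

rank = np.linalg.matrix_rank(A, tol=1e-10)

# lstsq returns the minimum-norm member of the infinite solution set.
c_min, _, _, _ = np.linalg.lstsq(A, y, rcond=1e-10)

# Adding any multiple of a null-space direction gives another exact solution.
null_dir = np.array([np.cos(0.4), -np.sin(0.4)])
c_other = c_min + 5.0 * null_dir

print(rank)
print(np.allclose(A @ c_min, y), np.allclose(A @ c_other, y))
```

Checking that the rank of A equals the number of unknowns before solving, as suggested above, is what rules this situation out.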