Homework Help: Linear Algebra Problem

1. May 29, 2005

Mathman23

The following formula $$E(a,b) = \sum_{j=1} ^{N} (y_{j} - (a + bx_{j}))^2$$ is used to measure the distance between the points $$(x_1,y_1),(x_2,y_2), \ldots, (x_N, y_N) \in \mathbb{R}^2$$ and the line $$y=a+bx$$.

I need to find the point(s) $$(a,b) \in \mathbb{R}^2$$ at which E(a,b) is minimal, using the tools of linear algebra.

I'm given the following vectors:

$$x = \left [ \begin{array}{c} x_{1} \\ \vdots \\ x_{N} \end{array} \right ], y = \left [ \begin{array}{c} y_{1} \\ \vdots \\ y_{N} \end{array} \right ], \ \ \overline{x} = \frac{1}{N} \sum_{j=1} ^N x_{i}, \ \ \overline{y} = \frac{1}{N} \sum_{j=1} ^N y_{i}$$

I'm tasked with calculating $$\frac{dE}{da} , \frac{dE}{db}$$ and showing that setting $$\frac{dE}{da}=0 , \frac{dE}{db}= 0$$ leads to the linear system

$$A \left [ \begin{array}{c} a \\ b \end{array} \right ] = c$$, where $$A = \left [ \begin{array}{cc} \frac{1}{x} \ \ \overline{x} \\ \ \ \frac{x \cdot x}{N} \end{array} \right ]$$ and $$c = \left [ \begin{array}{c} \overline{y} \\ \frac{x \cdot y}{N}\end{array} \right ]$$

Does anybody have any hints or ideas on how to solve this assignment?

Sincerely, Fred

2. May 29, 2005

OlderDan

First, let's make sure the problem is understood. This looks to me like the problem of linear regression. E(a,b) is the sum of the squared deviations in the y (vertical) direction of the data points from the line y = a + bx. Your task is to minimize E(a,b) by finding the values of a and b that result in that minimum. That point is found by taking the partial derivatives of E with respect to a and b and setting each to zero. Your derivatives should be

$$\frac{\partial E}{\partial a}=0 , \frac{\partial E}{\partial b}= 0$$

$$E(a,b) = \sum_{j=1} ^{N} (y_{j} - (a + bx_{j}))^2$$

What is missing in the matrix?

$$A = \left [ \begin{array}{cc} \frac{1}{x} \ \ \overline{x} \\ \ \ \frac{x \cdot x}{N} \end{array} \right ]$$

Looks like it should be this

$$A = \left[ {\begin{array}{*{20}c} 1 & {\overline x } \\ {\overline x } & {\frac{{ x \cdot x }}{N}} \\ \end{array}} \right]$$

Last edited: May 29, 2005
3. May 29, 2005

OlderDan

Yep, I missed the index. Good catch. I will fix that in my previous post as well.

I missed the bottom half of those partial derivatives too. I fixed those as well.

And then there are the other index problems from the OP

$$\overline{x} = \frac{1}{N} \sum_{j=1} ^N x_{j}, \ \ \overline{y} = \frac{1}{N} \sum_{j=1} ^N y_{j}$$

Last edited: May 29, 2005
4. May 29, 2005

Mathman23

Hi

The matrix is supposed to look like this.

$$A = \left [ \begin{array}{cc} 1 & \overline{x} \\ \overline{x} & \frac{x \cdot x}{N} \end{array} \right ]$$

/Fred

Last edited: May 29, 2005
5. May 29, 2005

OlderDan

I think you copied it wrong. I just added what I think it should be to my first post.

6. May 29, 2005

Mathman23

Then, to take the partial derivatives of E, is the first step to write out the complete sum formula?

/Fred

7. May 29, 2005

OlderDan

Setting the two partial derivatives of E to zero gives you two equations for a and b. The partial with respect to a is simpler and yields an equation in a and b that involves only the averages of x and y. The partial with respect to b is a bit more complicated, involving the dot product of x and y and the dot product of x with itself. With the corrected A matrix you should be able to show that those equations are equivalent to the matrix equation you posted.

8. May 29, 2005

Mathman23

How do I take the partial derivative of that sum-formula?

Sincerely

Fred

9. May 29, 2005

OlderDan

When you take the partial derivative with respect to a, everything else is constant. You know that the derivative of a sum of functions is the sum of the derivatives of the functions. In other words, the derivative "moves inside" the sum. You will be taking the derivative of every term in the sum, and you will be left with a sum of terms where each term is itself a sum of other terms. You will want to separate those into separate sums. For the derivative with respect to a, there will be three sums: a sum of all the y, a sum of all the x times a constant, and the sum of a constant. You can reduce that equation to an equation involving the two constants (a and b) and the averages of x and y. If you get that one, I think you will see how to do the derivative with respect to b, but it is a bit more complicated.

Give it a try and post your result. If you don't get it right, I or someone else will post it.

10. May 30, 2005

Mathman23

Hello Dan,

If I differentiate $$E(a,b)$$ first with respect to a and then with respect to b, I get the following two equations.

$$\begin{array}{cc} \frac{\partial E}{ \partial a} = 2(ax_{j} - y_{j} +b_{j}) x_{j} \ \ \mathrm{and} \ \ \frac{\partial E}{ \partial b} = 2(b_{j} + ax_{j} -y_{j}) \end{array}$$

Is this what you mean?

Again, thank you very much for your help.

Sincerely and Best Regards
Fred

Last edited: May 30, 2005
11. May 30, 2005

OlderDan

The derivatives are still sums of terms

$$E(a,b) = \sum_{j=1} ^{N} (y_{j} - (a + bx_{j}))^2$$

$$E(a,b) = \sum_{j=1} ^{N} (y_{j} - a - bx_{j})^2$$

$$\frac{\partial E}{ \partial a} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-1) = -2\sum_{j=1} ^{N} y_{j} + 2a\sum_{j=1} ^{N} 1 + 2b\sum_{j=1} ^{N} x_{j} = 0$$

$$\frac{\partial E}{ \partial b} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-x_{j}) = -2\sum_{j=1} ^{N} x_{j}y_{j} + 2a\sum_{j=1} ^{N} x_{j} + 2b\sum_{j=1} ^{N} x_{j}^2 = 0$$

Divide each of these results by -2N and you should start to identify some terms as simple averages. One will reduce to the parameter a. Other terms will involve the dot product of x and y, and the dot product of x with itself. The resulting equations can be put into the matrix form you were given.
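As a sanity check on the two sums above, here is a minimal Python sketch that compares the analytic partial derivatives with central finite differences. The sample data and the names `E`, `grad_E`, and `numeric_grad` are made up for illustration; they are not part of the assignment.

```python
def E(a, b, xs, ys):
    """Sum of squared vertical deviations from the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def grad_E(a, b, xs, ys):
    """Analytic partials: sum of 2(y - a - b x)(-1) and sum of 2(y - a - b x)(-x)."""
    dE_da = sum(-2.0 * (y - a - b * x) for x, y in zip(xs, ys))
    dE_db = sum(-2.0 * x * (y - a - b * x) for x, y in zip(xs, ys))
    return dE_da, dE_db


def numeric_grad(a, b, xs, ys, h=1e-6):
    """Central finite differences, used only to cross-check grad_E."""
    dE_da = (E(a + h, b, xs, ys) - E(a - h, b, xs, ys)) / (2 * h)
    dE_db = (E(a, b + h, xs, ys) - E(a, b - h, xs, ys)) / (2 * h)
    return dE_da, dE_db


# Hypothetical sample data (not from the original problem).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.9, 5.1, 7.0]

analytic = grad_E(0.5, 1.5, xs, ys)
numeric = numeric_grad(0.5, 1.5, xs, ys)
print(analytic, numeric)
```

If the two printed pairs agree to several decimal places, the hand-derived sums are at least consistent with the definition of E.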

Last edited: May 31, 2005
12. May 31, 2005

Mathman23

I divide the three terms in the first and second equations by $$-2N$$
and get the following results:

$$-2 \sum _{j=1} ^{N} y_{j} \rightarrow \frac{-N y}{2} \rightarrow \frac{-N}{2} \sum_{j=1} ^ {n} y$$

$$-2a \sum _{j=1} ^{N} 1 \rightarrow a \rightarrow a \sum_{j=1} ^{N} 1$$

$$-2b \sum _{j=1} ^{N} x_{j} \rightarrow bx \rightarrow b \sum_{j=1} ^{N} x$$

The three sums in the second part yield:

$$-2 \sum _{j=1} ^{N} x_{j} y_{j} \rightarrow xy \rightarrow \sum_{j=1} ^{N} xy$$

$$-2a \sum _{j=1} ^{N} x_{j} \rightarrow ax \rightarrow a \sum_{j=1} ^{N} x$$

$$-2b \sum _{j=1} ^{N} {x_{j}}^2 \rightarrow b x^2 \rightarrow b \sum_{j=1} ^{N} x^2$$

If I insert the above in an array A I get the following:

$$A= \left[ \begin{array}{cc} 1 & x \\ x & x^2 \end{array} \right ]$$

Which can also be written as:

$$A \left [ \begin{array}{c} a \\ b \end{array} \right ] = ??$$

But how do I conclude that the product of these two arrays equal the array

$$C = \left[ \begin{array}{c} y \\ xy \end{array} \right ]$$ ??

Sincerely and Best regards

/Fred

Last edited: May 31, 2005
13. May 31, 2005

OlderDan

You have not divided by -2N. I see I made a foolish sign error in my previous post. I will go back and fix my error, but here are the correct equations

$$\frac{\partial E}{ \partial a} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-1) = -2\sum_{j=1} ^{N} y_{j} +2a\sum_{j=1} ^{N} 1 +2b\sum_{j=1} ^{N} x_{j} = 0$$

$$\frac{\partial E}{ \partial b} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-x_{j}) = -2\sum_{j=1} ^{N} x_{j}y_{j} + 2a\sum_{j=1} ^{N} x_{j} + 2b\sum_{j=1} ^{N} x_{j}^2 = 0$$

$$-2\sum_{j=1} ^{N} y_{j} + 2a\sum_{j=1} ^{N} 1 + 2b\sum_{j=1} ^{N} x_{j} = 0 \rightarrow \frac{1}{N} \sum_{j=1} ^{N} y_{j} - \frac{a}{N}\sum_{j=1} ^{N} 1 - \frac{b}{N} \sum_{j=1} ^{N} x_{j} = \overline{y} - a - b \overline{x} = 0$$

See if you can fix the second equation and take it from there. Keep in mind that a sum over a product of corresponding vector components is a dot product.

14. Jun 1, 2005

Mathman23

In the second equation I get the following:

$$-2\sum_{j=1} ^{N} x_{j}y_{j} + 2a\sum_{j=1} ^{N} x_{j} + 2b\sum_{j=1} ^{N} x_{j}^2 = 0 \rightarrow \frac{1}{N} \sum_{j=1} ^{N} x_{j} y_{j} - \frac{a}{N} \sum_{j=1} ^{N} x_{j} - \frac{b}{N} \sum_{j=1} ^{N} x_{j}^2 = \frac{x \cdot y}{N} - a \overline{x} - b \, \frac{x \cdot x}{N} = 0$$

Combining your result with mine, I get the following set of equations:

$$\begin{array}{ccc} a + b \, \overline{x} & = & \overline{y} \\ a \, \overline{x} + b \, \frac{x \cdot x}{N} & = & \frac{x \cdot y}{N} \end{array}$$

This can also be written as the inhomogeneous linear equation present in the original problem:

$$\left[ {\begin{array}{*{20}c} 1 & {\overline x } \\ {\overline x } & {\frac{{ x \cdot x }}{N}} \\\end{array}} \right] \cdot \left [ \begin{array}{c} a \\ b \end{array} \right ] = \left [ \begin{array}{c} \overline{y} \\ \frac{x \cdot y}{N}\end{array} \right ]$$
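To see the system in action, here is a minimal sketch in plain Python that builds A and c from data and solves for (a, b) with Cramer's rule. The function name `fit_line` and the sample data are hypothetical, and the sketch assumes that det(A) is nonzero.

```python
def fit_line(xs, ys):
    """Solve A [a, b]^T = c for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    xx = sum(x * x for x in xs) / n              # (x . x) / N
    xy = sum(x * y for x, y in zip(xs, ys)) / n  # (x . y) / N

    # A = [[1, x_bar], [x_bar, xx]], c = [y_bar, xy]
    det = 1.0 * xx - x_bar * x_bar
    if det == 0:
        raise ValueError("A is singular; the system has no unique solution")

    # Cramer's rule for the 2x2 system.
    a = (y_bar * xx - x_bar * xy) / det
    b = (xy - x_bar * y_bar) / det
    return a, b


# Hypothetical data lying exactly on the line y = 1 + 2x.
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)
```

Because the sample points lie exactly on a line, the recovered intercept and slope should match that line; with noisy data the same code returns the least-squares fit.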

One question remains: how does one prove that solving the above system gives the values of a and b that minimize E(a,b)?

Sincerely and Best Regards,

Fred

15. Jun 1, 2005

OlderDan

In the calculus of one variable you learned that maxima and minima are found by setting the first derivative of a function to zero. In multivariate calculus, the analogous thing is setting the partial derivatives equal to zero. For a function of two variables you have

$$df(a,b) = \frac{\partial f(a,b)}{\partial a}da + \frac{\partial f(a,b)}{\partial b}db$$

If you change a slightly without changing b, the change in f(a,b) is just the first term on the right. When you change b without changing a, the change in f(a,b) is just the second term. When both those terms are zero, because the partial derivatives are both zero, an arbitrary small change in both a and b gives no first-order change in f(a,b). Roughly speaking, there are three possibilities. If both partials represent minima, the function is at a minimum. If both are maxima, the function is at a maximum. If one is a minimum and the other is a maximum, the point is referred to as a "saddle point" because the graph of f(a,b) has the shape of a saddle. In this problem, the point is a minimum. You could prove that by taking second derivatives. I'll leave that for another problem.
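For reference, here is a sketch of how that second-derivative check could go, assuming E is the sum of squares written in the earlier posts. Differentiating the first-derivative sums once more gives

$$\frac{\partial^2 E}{\partial a^2} = 2N, \ \ \frac{\partial^2 E}{\partial a \, \partial b} = 2\sum_{j=1} ^{N} x_{j}, \ \ \frac{\partial^2 E}{\partial b^2} = 2\sum_{j=1} ^{N} x_{j}^2$$

$$\frac{\partial^2 E}{\partial a^2} \, \frac{\partial^2 E}{\partial b^2} - \left[ \frac{\partial^2 E}{\partial a \, \partial b} \right]^2 = 4\left[ N\sum_{j=1} ^{N} x_{j}^2 - \left[ \sum_{j=1} ^{N} x_{j} \right]^2 \right] \geq 0$$

The inequality is the Cauchy-Schwarz inequality applied to x and the vector of all ones, and it is strict unless all the $$x_j$$ are equal. Together with $$2N > 0$$, a strictly positive value is the standard two-variable condition for a minimum.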

16. Jun 1, 2005

Mathman23

Hi

I remember that from my calculus course. Thank you :-)

I was informed today that I also need to calculate the determinant of A.

$$A= \left[ {\begin{array}{*{20}c} 1 & {\overline x } \\ {\overline x } & {\frac{{ x \cdot x }}{N}} \\\end{array}} \right]$$

I get $$det(A) = \frac{x \cdot x}{N}$$

But if that's correct, for which values is A invertible?

Sincerely and Best Regards.

Fred

17. Jun 1, 2005

OlderDan

You are missing a term in the determinant.

18. Jun 2, 2005

Mathman23

Hi

I have re-calculated the determinant for A.

I get the following result

$$det(A) = (\frac{1}{N} -1) x^2 \rightarrow (\frac{1}{N} -1) \sum_{j=1} ^{N} (-x_{j}^2)$$

To answer my second question I need to set the above expression equal to zero, in order to determine for which values the matrix A is invertible.

$$det(A) = (\frac{1}{N} -1) x^2 = 0$$

But for which variable do I need to solve the above equation ?

Sincerely and Best Regards

Fred

19. Jun 2, 2005

OlderDan

That is still not correct

$$\overline{x} = \frac{1}{N} \sum_{j=1} ^N x_{j}$$

The second term in the determinant is

$$-\overline{x}^2 = -\left[\frac{1}{N} \sum_{j=1} ^N x_{j}\right]^2 = -\frac{1}{N^2}\left[ \sum_{j=1} ^N x_{j}\right]^2$$

The square of the sum is not the same thing as the sum of the squares:

$$\frac{x \cdot x}{N} = \frac{1}{N} \sum_{j=1} ^N x_{j}^2$$

20. Jun 2, 2005

Mathman23

For which values is A invertible, then?

$$-\overline{x}^2 = -\left[\frac{1}{N} \sum_{j=1} ^N x_{j}\right]^2 = -\frac{1}{N^2}\left[ \sum_{j=1} ^N x_{j}\right]^2$$
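For what it's worth, putting both terms together gives $$\det(A) = \frac{x \cdot x}{N} - \overline{x}^2$$, the mean of the squares minus the square of the mean, which is the population variance of the x values. That quantity is zero exactly when all the $$x_j$$ are equal, so A is invertible whenever at least two of the x values differ. Below is a minimal Python sketch to check this numerically; the function names `det_A` and `population_variance` and the sample data are hypothetical.

```python
def det_A(xs):
    """det(A) = (x . x)/N - x_bar**2 for A = [[1, x_bar], [x_bar, (x . x)/N]]."""
    n = len(xs)
    x_bar = sum(xs) / n
    mean_sq = sum(x * x for x in xs) / n
    return mean_sq - x_bar ** 2


def population_variance(xs):
    """Mean squared deviation from the mean, for comparison with det_A."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / n


spread = [0.0, 1.0, 2.0, 3.0]  # distinct x values: A is invertible
same = [2.0, 2.0, 2.0, 2.0]    # identical x values: A is singular

print(det_A(spread), population_variance(spread))
print(det_A(same))
```

Geometrically, identical x values mean all the data points sit on one vertical line, so no unique line of the form y = a + bx fits them best.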