# Linear Algebra Problem

by Mathman23
Tags: algebra, linear
P: 255
The following formula

$$E(a,b) = \sum_{j=1} ^{N} (y_{i} - (a + b_{i})^2$$

is used to measure the distance between the points $$(x_1,y_1),(x_2,y_2), \ldots, (x_N, y_N) \in \mathbb{R}^2$$ and the line $$y=a+bx$$.

I need to find the set of points $$(a,b) \in \mathbb{R}^2$$ at which E(a,b) is minimal, using tools of linear algebra.

I'm given the following vectors:

$$x = \left [ \begin{array}{c} x_{1} \\ \vdots \\ x_{N} \end{array} \right ], y = \left [ \begin{array}{c} y_{1} \\ \vdots \\ y_{N} \end{array} \right ], \ \ \overline{x} = \frac{1}{N} \sum_{j=1} ^N x_{i}, \ \ \overline{y} = \frac{1}{N} \sum_{j=1} ^N y_{i}$$

I'm tasked with calculating $$\frac{dE}{da} , \frac{dE}{db}$$ and showing that setting $$\frac{dE}{da}=0 , \frac{dE}{db}= 0$$ leads to the linear system

$$A \left [ \begin{array}{c} a \\ b \end{array} \right ] = c$$

where

$$A = \left [ \begin{array}{cc} \frac{1}{x} \ \ \overline{x} \\ \ \ \frac{x \cdot x}{N} \end{array} \right ]$$

and

$$c = \left [ \begin{array}{c} \overline{y} \\ \frac{x \cdot y}{N}\end{array} \right ]$$

Does anybody have any hints or ideas for how to solve this assignment?

Many thanks in advance.

Sincerely,
Fred
HW Helper
Sci Advisor
P: 3,033
First, let's make sure the problem is understood. This looks to me like the problem of linear regression. E(a,b) is the sum of the squared deviations in the y (vertical) direction of the data points from the line y = a + bx. Your task is to minimize E(a,b) by finding the values of a and b that result in that minimum. That point is found by taking the partial derivatives of E with respect to a and b and setting each to zero. Your derivatives should be

$$\frac{\partial E}{\partial a}=0 , \frac{\partial E}{\partial b}= 0$$

and your first equation should read

$$E(a,b) = \sum_{j=1} ^{N} (y_{j} - (a + bx_{j}))^2$$

What is missing in the matrix?

$$A = \left [ \begin{array}{cc} \frac{1}{x} \ \ \overline{x} \\ \ \ \frac{x \cdot x}{N} \end{array} \right ]$$

It looks like it should be this:

$$A = \left[ {\begin{array}{*{20}c} 1 & {\overline x } \\ {\overline x } & {\frac{{ x \cdot x }}{N}} \\ \end{array}} \right]$$
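
As a quick illustration of the quantity being minimized, here is a minimal Python sketch that evaluates E(a,b) for a given line. The function name `squared_error` and the sample points are made up for illustration and are not part of the original problem.

```python
def squared_error(a, b, xs, ys):
    """Sum of squared vertical deviations of the points (x_j, y_j) from y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

e_exact = squared_error(1.0, 2.0, xs, ys)  # the true line: no deviation at all
e_off = squared_error(0.0, 2.0, xs, ys)    # intercept off by 1 at every point
```

With these points the true line gives E = 0, while shifting the intercept by 1 moves the line 1 unit away from each of the four points and gives E = 4.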
HW Helper
Sci Advisor
P: 3,033
 Quote by arildno Actually, his first equation should read: $$E(a,b)=\sum_{j=1}^{N}(y_{j}-(a+bx_{j}))^{2}$$ EDIT: I see OlderDan edited his reply; the index must agree too, of course.
Yep, I missed the index. Good catch. I will fix that in my previous post as well.

I missed the bottom half of those partial derivatives too. I fixed those as well.

And then there are the other index problems from the OP:

$$\overline{x} = \frac{1}{N} \sum_{j=1} ^N x_{j}, \ \ \overline{y} = \frac{1}{N} \sum_{j=1} ^N y_{j}$$

P: 255

Hi

The matrix is supposed to look like this.

$$A = \left [ \begin{array}{cc} 1 & \overline{x} \\ \overline{x} & \frac{x \cdot x}{N} \end{array} \right ]$$

/Fred
HW Helper
Sci Advisor
P: 3,033
 Quote by Mathman23 Hi, I'm not sure what is missing in the matrix. That's how my professor presented it. /Fred
I think you copied it wrong. I just added what I think it should be to my first post.
P: 255
Hi Dan, and thanks for your answer.

Next, to take the partial derivatives of E: is the first step to write out the complete sum formula?

/Fred

HW Helper
Sci Advisor
P: 3,033
Setting the two partial derivatives of E to zero gives you two equations for a and b. The partial with respect to a is simpler and yields an equation in a and b that involves only the averages of x and y. The partial with respect to b is a bit more complicated, involving the dot product of x and y and the dot product of x with itself. With the corrected A matrix you should be able to show that those equations are equivalent to the matrix equation you posted.
P: 255
How do I take the partial derivative of that sum-formula?

Sincerely,

Fred
HW Helper
Sci Advisor
P: 3,033
When you take the partial derivative with respect to a, everything else is held constant. You know that the derivative of a sum of functions is the sum of the derivatives of the functions. In other words, the derivative "moves inside" the sum. You will be taking the derivative of every term in the sum, and you will be left with a sum of terms where each term is itself a sum of other terms. You will want to split those into separate sums. For the derivative with respect to a, there will be three sums: a sum of all the y, a sum of all the x times a constant, and the sum of a constant. You can reduce that equation to an equation involving the two constants (a and b) and the averages of x and y. If you get that one, I think you will see how to do the derivative with respect to b, but it is a bit more complicated.

Give it a try and post your result. If you don't get it right, I or someone else will post it.
P: 255
Hello Dan,

If I differentiate $$E(a,b)$$ first with respect to a and then with respect to b, I get the following two equations.

$$\begin{array}{cc} \frac{\partial E}{ \partial a} = 2(ax_{j} - y_{j} +b_{j}) x_{j} \ \ \mathrm{and} \ \ \frac{\partial E}{ \partial b} = 2(b_{j} + ax_{j} -y_{j}) \end{array}$$

Is this what you mean? Again, thank you very much for your help.

Sincerely and best regards,
Fred
HW Helper
Sci Advisor
P: 3,033
The derivatives are still sums of terms:

$$E(a,b) = \sum_{j=1} ^{N} (y_{j} - (a + bx_{j}))^2$$

$$E(a,b) = \sum_{j=1} ^{N} (y_{j} - a - bx_{j})^2$$

$$\frac{\partial E}{ \partial a} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-1) = -2\sum_{j=1} ^{N} y_{j} + 2a\sum_{j=1} ^{N} 1 + 2b\sum_{j=1} ^{N} x_{j} = 0$$

$$\frac{\partial E}{ \partial b} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-x_{j}) = -2\sum_{j=1} ^{N} x_{j}y_{j} + 2a\sum_{j=1} ^{N} x_{j} + 2b\sum_{j=1} ^{N} x_{j}^2 = 0$$

Divide each of these results by -2N and you should start to identify some terms as simple averages. One will reduce to the parameter a. Other terms will involve the dot product of x and y, and the dot product of x with itself. The resulting equations can be put into the matrix form you were given.
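
One way to sanity-check the two partial derivatives above is to compare them with central finite differences. The following is only a sketch: the function names, the sample data, and the trial values of a and b are all assumptions chosen for illustration.

```python
def squared_error(a, b, xs, ys):
    """E(a, b): sum of squared vertical deviations from the line y = a + b*x."""
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))


def analytic_gradient(a, b, xs, ys):
    """Partial derivatives of E, written term by term as in the sums above."""
    dE_da = sum(2 * (y - a - b * x) * (-1) for x, y in zip(xs, ys))
    dE_db = sum(2 * (y - a - b * x) * (-x) for x, y in zip(xs, ys))
    return dE_da, dE_db


def numeric_gradient(a, b, xs, ys, h=1e-6):
    """Central-difference approximation of the same two partial derivatives."""
    dE_da = (squared_error(a + h, b, xs, ys) - squared_error(a - h, b, xs, ys)) / (2 * h)
    dE_db = (squared_error(a, b + h, xs, ys) - squared_error(a, b - h, xs, ys)) / (2 * h)
    return dE_da, dE_db


# Hypothetical sample data and an arbitrary trial line y = 0.5 + 0.5x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 2.0, 4.0]

analytic = analytic_gradient(0.5, 0.5, xs, ys)
numeric = numeric_gradient(0.5, 0.5, xs, ys)
```

For this data the analytic partials are -8 and -16, and the finite-difference values agree with them up to rounding error.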
P: 255
Hello Dan, and many thanks for your answer.

I divide the three terms in the first and second equations by $$-2N$$
and get the following results:

$$-2 \sum _{j=1} ^{N} y_{j} \rightarrow \frac{-N y}{2} \rightarrow \frac{-N}{2} \sum_{j=1} ^ {n} y$$

$$-2a \sum _{j=1} ^{N} 1 \rightarrow a \rightarrow a \sum_{j=1} ^{N} 1$$

$$-2b \sum _{j=1} ^{N} x_{j} \rightarrow bx \rightarrow b \sum_{j=1} ^{N} x$$

The three sums in the second part yields:

$$-2 \sum _{j=1} ^{N} x_{j} y_{j} \rightarrow xy \rightarrow \sum_{j=1} ^{N} xy$$

$$-2a \sum _{j=1} ^{N} x_{j} \rightarrow ax \rightarrow a \sum_{j=1} ^{N} x$$

$$-2b \sum _{j=1} ^{N} {x_{j}}^2 \rightarrow b x^2 \rightarrow b \sum_{j=1} ^{N} x^2$$

If I insert the above in an array A I get the following:

$$A= \left[ \begin{array}{cc} 1 & x \\ x & x^2 \end{array} \right ]$$

Which can also be written as:

$$A \left [ \begin{array}{c} a \\ b \end{array} \right ] = ??$$

But how do I conclude that the product of these two arrays equals the array

$$C = \left[ \begin{array}{c} y \\ xy \end{array} \right ]$$ ??

Sincerely and Best regards

/Fred

 Quote by OlderDan $$\frac{\partial E}{ \partial b} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-x_{j}) = -2\sum_{j=1} ^{N} x_{j}y_{j} -2a\sum_{j=1} ^{N} x_{j} -2b\sum_{j=1} ^{N} x_{j}^2 = 0$$ Divide each of these results by -2N and you should start to identify some terms as simple averages. One will reduce to the parameter a. Other terms will involve the dot product of x and y, and the dot product of x with itself. The resulting equations can be put into the matrix form you were given.
HW Helper
Sci Advisor
P: 3,033
You have not divided by -2N. I see I made a foolish sign error in my previous post. I will go back and fix my error, but here are the correct equations:

$$\frac{\partial E}{ \partial a} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-1) = -2\sum_{j=1} ^{N} y_{j} +2a\sum_{j=1} ^{N} 1 +2b\sum_{j=1} ^{N} x_{j} = 0$$

$$\frac{\partial E}{ \partial b} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-x_{j}) = -2\sum_{j=1} ^{N} x_{j}y_{j} + 2a\sum_{j=1} ^{N} x_{j} + 2b\sum_{j=1} ^{N} x_{j}^2 = 0$$

$$-2\sum_{j=1} ^{N} y_{j} + 2a\sum_{j=1} ^{N} 1 + 2b\sum_{j=1} ^{N} x_{j} = 0 \rightarrow \frac{1}{N} \sum_{j=1} ^{N} y_{j} - \frac{a}{N}\sum_{j=1} ^{N} 1 - \frac{b}{N} \sum_{j=1} ^{N} x_{j} = \overline{y} - a - b \overline{x} = 0$$

See if you can fix the second equation and take it from there. Keep in mind that a sum over a product of corresponding vector components is a dot product.
P: 255
Hi Dan, and many thanks for your answers.

In the second equation I get the following:

$$-2\sum_{j=1} ^{N} x_{j}y_{j} + 2a\sum_{j=1} ^{N} x_{j} + 2b\sum_{j=1} ^{N} x_{j}^2 = 0 \rightarrow \sum_{j=1} ^{N} \frac{x_{j} y_{j}}{N} - \frac{a}{N} \sum_{j=1} ^{N} x_{j} - b \sum_{j=1} ^{N} \frac{x_{j}^2}{N} = \frac{x \cdot y}{N} - a \overline{x} - b \frac{x \cdot x}{N} = 0$$

If I combine your result with mine, I get the following set of equations:

$$\begin{array}{ccc} a + b \, \overline{x} & = & \overline{y} \\ a \, \overline{x} + b \, \frac{x \cdot x}{N} & = & \frac{x \cdot y}{N} \end{array}$$

This can also be written as the inhomogeneous linear system presented in the original problem:

$$\left[ {\begin{array}{*{20}c} 1 & {\overline x } \\ {\overline x } & {\frac{{ x \cdot x }}{N}} \\\end{array}} \right] \cdot \left [ \begin{array}{c} a \\ b \end{array} \right ] = \left [ \begin{array}{c} \overline{y} \\ \frac{x \cdot y}{N}\end{array} \right ]$$
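
To make the system concrete, here is a small Python sketch that builds the entries of A and c from data and solves for a and b by substitution (the first row gives a in terms of b). The function name `fit_line` and the sample data are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Solve [[1, x_bar], [x_bar, xx]] [a, b]^T = [y_bar, xy]^T for (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    xx = sum(x * x for x in xs) / n               # (x . x) / N
    xy = sum(x * y for x, y in zip(xs, ys)) / n   # (x . y) / N

    denom = xx - x_bar * x_bar
    if denom == 0:
        raise ValueError("all x values are equal; the system has no unique solution")

    # First row: a = y_bar - b * x_bar. Substituting into the second row gives b.
    b = (xy - x_bar * y_bar) / denom
    a = y_bar - b * x_bar
    return a, b


# Hypothetical sample data.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 2.0, 4.0]
a, b = fit_line(xs, ys)
```

For this data both a and b come out as approximately 0.9, and the first equation can be checked by hand: 0.9 + 0.9 · 1.5 = 2.25, which is the average of the y values.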

One question remains: how does one prove that solving the above system gives the minimum value of E(a,b)?

Sincerely and Best Regards,

Fred
HW Helper
Sci Advisor
P: 3,033
In the calculus of one variable you learned that maxima and minima are found by setting the first derivative of a function to zero. In multivariate calculus, the analogous thing is setting the partial derivatives equal to zero. For a function of two variables you have

$$df(a,b) = \frac{\partial f(a,b)}{\partial a}da + \frac{\partial f(a,b)}{\partial b}db$$

If you change a slightly without changing b, the change in f(a,b) is just the first term on the right. When you change b without changing a, the change in f(a,b) is just the second term. When both those terms are zero, because the partial derivatives are both zero, an arbitrary small change in both a and b gives no first-order change in f(a,b). There are three possibilities. If both partials represent minima, the function is at a minimum. If both are maxima, the function is at a maximum. If one is a minimum and the other is a maximum, the point is referred to as a "saddle point" because the graph of f(a,b) has the shape of a saddle. In this problem, the point is a minimum. You could prove that by taking second derivatives. I'll leave that for another problem.
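
As a sketch of the second-derivative argument for this particular E (this step is not worked out in the thread, and u and v are just names introduced here for small changes in a and b), the second partial derivatives are

$$\frac{\partial^2 E}{\partial a^2} = 2N, \qquad \frac{\partial^2 E}{\partial a \, \partial b} = 2\sum_{j=1}^{N} x_{j}, \qquad \frac{\partial^2 E}{\partial b^2} = 2\sum_{j=1}^{N} x_{j}^2$$

and the second-order change in E for a change (u, v) in (a, b) is governed by

$$u^2 \frac{\partial^2 E}{\partial a^2} + 2uv \frac{\partial^2 E}{\partial a \, \partial b} + v^2 \frac{\partial^2 E}{\partial b^2} = 2\sum_{j=1}^{N} (u + v x_{j})^2 \geq 0$$

Because this is a sum of squares, it is never negative, so the stationary point cannot be a maximum or a saddle point. It is strictly positive for every nonzero (u, v) unless all the x_j are equal, and since E is quadratic in a and b, the stationary point is then the unique minimum.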
P: 255
Hi

I remember that from my calculus course. Thank you :-)

I was informed today that I also need to calculate the determinant of A.

$$A= \left[ {\begin{array}{*{20}c} 1 & {\overline x } \\ {\overline x } & {\frac{{ x \cdot x }}{N}} \\\end{array}} \right]$$

I get $$det(A) = \frac{x \cdot x}{N}$$

But if that is correct, for which values is A invertible?

Sincerely and best regards,

Fred

HW Helper
Sci Advisor
P: 3,033
You are missing a term in the determinant.
P: 255
Hi

I have re-calculated the determinant for A.

I get the following result

$$det(A) = (\frac{1}{N} -1) x^2 \rightarrow (\frac{1}{N} -1) \sum_{j=1} ^{N} (-x_{j}^2)$$

To answer my second question, I need to set the above expression equal to zero in order to determine for which values the matrix A is invertible.

$$det(A) = (\frac{1}{N} -1) x^2 = 0$$

But for which variable do I need to solve the above equation ?

Sincerely and Best Regards

Fred
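
For reference, here is a sketch of where the missing term comes from, using the 2×2 rule that the determinant is the product of the diagonal entries minus the product of the off-diagonal entries:

$$\det(A) = 1 \cdot \frac{x \cdot x}{N} - \overline{x} \cdot \overline{x} = \frac{1}{N}\sum_{j=1}^{N} x_{j}^2 - \overline{x}^2 = \frac{1}{N}\sum_{j=1}^{N} (x_{j} - \overline{x})^2$$

The last form is a sum of squares, so it is zero only when every x_j equals the average. A is therefore invertible exactly when the x_j are not all equal; the condition is on the data, so there is no equation in a or b to solve.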