Linear Algebra Problem


by Mathman23
Tags: algebra, linear
Mathman23
#1
May29-05, 01:19 PM
P: 255
The following formula [tex]E(a,b) = \sum_{j=1} ^{N} (y_{i} - (a + b_{i})^2[/tex] is used to measure the distance between the points [tex](x_1,y_1),(x_2,y_2), \ldots, (x_N, y_N) \in \mathbb{R}^2 [/tex] and the line [tex]y=a+bx[/tex]

I need to find the point [tex](a,b) \in \mathbb{R}^2[/tex] where E(a,b) is minimized, using tools of linear algebra.

I'm given the following vectors:

[tex]x = \left [ \begin{array}{c} x_{1} \\ \vdots \\ x_{N} \end{array} \right ], y = \left [ \begin{array}{c} y_{1} \\ \vdots \\ y_{N} \end{array} \right ], \ \ \overline{x} = \frac{1}{N} \sum_{j=1} ^N x_{i}, \ \ \overline{y} = \frac{1}{N} \sum_{j=1} ^N y_{i} [/tex]

I'm tasked with calculating [tex]\frac{dE}{da}, \frac{dE}{db}[/tex] and showing that [tex]\frac{dE}{da}=0, \frac{dE}{db}=0[/tex] leads to the linear system

[tex]A \left [ \begin{array}{c} a \\ b \end{array} \right ] = c[/tex], where [tex]A = \left [ \begin{array}{cc} \frac{1}{x} \ \ \overline{x} \\ \ \ \frac{x \cdot x}{N} \end{array} \right ][/tex] and [tex] c = \left [ \begin{array}{c} \overline{y} \\ \frac{x \cdot y}{N}\end{array} \right ] [/tex]

Anybody have any hints/ideas for how to solve this assignment?

Many thanks in advance

Sincerely, Fred
OlderDan
#2
May29-05, 01:45 PM
Sci Advisor
HW Helper
P: 3,033
Quote by Mathman23
The following formula [tex]E(a,b) = \sum_{j=1} ^{N} (y_{i} - (a + b_{i})^2[/tex] is used to measure the distance between the points [tex](x_1,y_1),(x_2,y_2), \ldots, (x_N, y_N) \in \mathbb{R}^2 [/tex] and the line [tex]y=a+bx[/tex]

I need to find the point [tex](a,b) \in \mathbb{R}^2[/tex] where E(a,b) is minimized, using tools of linear algebra.

I'm given the following vectors:

[tex]x = \left [ \begin{array}{c} x_{1} \\ \vdots \\ x_{N} \end{array} \right ], y = \left [ \begin{array}{c} y_{1} \\ \vdots \\ y_{N} \end{array} \right ], \ \ \overline{x} = \frac{1}{N} \sum_{j=1} ^N x_{i}, \ \ \overline{y} = \frac{1}{N} \sum_{j=1} ^N y_{i} [/tex]

I'm tasked with calculating [tex]\frac{dE}{da}, \frac{dE}{db}[/tex] and showing that [tex]\frac{dE}{da}=0, \frac{dE}{db}=0[/tex] leads to the linear system

[tex]A \left [ \begin{array}{c} a \\ b \end{array} \right ] = c[/tex], where [tex]A = \left [ \begin{array}{cc} \frac{1}{x} \ \ \overline{x} \\ \ \ \frac{x \cdot x}{N} \end{array} \right ][/tex] and [tex] c = \left [ \begin{array}{c} \overline{y} \\ \frac{x \cdot y}{N}\end{array} \right ] [/tex]

Anybody have any hints/ideas for how to solve this assignment?

Many thanks in advance

Sincerely, Fred
First, let's make sure the problem is understood. This looks to me like the problem of linear regression. E(a,b) is the sum of the squared deviations in the y (vertical) direction of the data points from the line y = a + bx. Your task is to minimize E(a,b) by finding the values of a and b that result in that minimum. That point is found by taking the partial derivatives of E with respect to a and b and setting each to zero. Your derivatives should be

[tex] \frac{\partial E}{\partial a}=0 , \ \frac{\partial E}{\partial b}= 0[/tex]

and your first equation should read

[tex]E(a,b) = \sum_{j=1} ^{N} (y_{j} - (a + bx_{j}))^2[/tex]

What is missing in the matrix?

[tex]A = \left [ \begin{array}{cc} \frac{1}{x} \ \ \overline{x} \\ \ \ \frac{x \cdot x}{N} \end{array} \right ][/tex]

Looks like it should be this

[tex]A = \left[ {\begin{array}{*{20}c}
1 & {\overline x } \\
{\overline x } & {\frac{{ x \cdot x }}{N}} \\
\end{array}} \right][/tex]
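For anyone who wants to see the corrected system in action, here is a quick numerical sanity check (just a sketch, assuming numpy is available; the data vectors are made up for illustration):

[code]
# Build the 2x2 system A [a, b]^T = c from made-up data and compare the
# solution with an independent least-squares fit (numpy assumed).
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])
N = len(x)

A = np.array([[1.0,      x.mean()],
              [x.mean(), x @ x / N]])
c = np.array([y.mean(), x @ y / N])

a, b = np.linalg.solve(A, c)
print(a, b)                # intercept a and slope b from the 2x2 system
print(np.polyfit(x, y, 1)) # numpy's fit, which lists slope before intercept
[/code]

The two printouts should agree, up to the ordering noted in the comment.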
OlderDan
#3
May29-05, 01:55 PM
Sci Advisor
HW Helper
P: 3,033
Quote by arildno
Actually, his first equation should read:
[tex]E(a,b)=\sum_{j=1}^{N}(y_{j}-(a+bx_{j}))^{2}[/tex]
EDIT:
I see OlderDan edited his reply; the index must agree too, of course.
Yep, I missed the index. Good catch; I will fix that in my previous post also.

I missed the bottom half of those partial derivatives too. I fixed those also.

And then there are the other index problems from the OP

[tex] \overline{x} = \frac{1}{N} \sum_{j=1} ^N x_{j}, \ \ \overline{y} = \frac{1}{N} \sum_{j=1} ^N y_{j} [/tex]

Mathman23
#4
May29-05, 02:51 PM
P: 255

Hi

The matrix is supposed to look like this.

[tex]A = \left [ \begin{array}{cc} 1 & \overline{x} \\ \overline{x} & \frac{x \cdot x}{N} \end{array} \right ][/tex]

/Fred
OlderDan
#5
May29-05, 02:54 PM
Sci Advisor
HW Helper
P: 3,033
Quote by Mathman23
Hi

I'm not sure what is missing in the matrix. That's how my professor presented it.

/Fred
I think you copied it wrong. I just added what I think it should be to my first post.
Mathman23
#6
May29-05, 03:07 PM
P: 255
Hi Dan, and thanks for your answer,

Now I'm taking the partial derivatives of E. Is the first step to write out the complete sum formula?

/Fred

Quote by OlderDan
First, let's make sure the problem is understood. This looks to me like the problem of linear regression. E(a,b) is the sum of the squared deviations in the y (vertical) direction of the data points from the line y = a + bx. Your task is to minimize E(a,b) by finding the values of a and b that result in that minimum. That point is found by taking the partial derivatives of E with respect to a and b and setting each to zero. Your derivatives should be

[tex] \frac{\partial E}{\partial a}=0 , \ \frac{\partial E}{\partial b}= 0[/tex]

and your first equation should read

[tex]E(a,b) = \sum_{j=1} ^{N} (y_{j} - (a + bx_{j}))^2[/tex]

What is missing in the matrix?

[tex]A = \left [ \begin{array}{cc} \frac{1}{x} \ \ \overline{x} \\ \ \ \frac{x \cdot x}{N} \end{array} \right ][/tex]

Looks like it should be this

[tex]A = \left[ {\begin{array}{*{20}c}
1 & {\overline x } \\
{\overline x } & {\frac{{ x \cdot x }}{N}} \\
\end{array}} \right][/tex]
OlderDan
#7
May29-05, 03:25 PM
Sci Advisor
HW Helper
P: 3,033
Quote by Mathman23
Hi Dan, and thanks for your answer,

Now I'm taking the partial derivatives of E. Is the first step to write out the complete sum formula?

/Fred
Setting the two partial derivatives of E to zero gives you two equations for a and b. The partial with respect to a is simpler and yields an equation in a and b that involves only the averages of x and y. The partial with respect to b is a bit more complicated, involving the dot product of x and y and the dot product of x with itself. With the corrected A matrix you should be able to show that those equations are equivalent to the matrix equation you posted.
Mathman23
#8
May29-05, 03:53 PM
P: 255
Quote by OlderDan
Setting the two partial derivatives of E to zero gives you two equations for a and b. The partial with respect to a is simpler and yields an equation in a and b that involves only the averages of x and y. The partial with respect to b is a bit more complicated, involving the dot product of x and y and the dot product of x with itself. With the corrected A matrix you should be able to show that those equations are equivalent to the matrix equation you posted.
How do I take the partial derivative of that sum formula?

Sincerely,

Fred
OlderDan
#9
May29-05, 06:23 PM
Sci Advisor
HW Helper
P: 3,033
Quote by Mathman23
How do I take the partial derivative of that sum formula?

Sincerely,

Fred
When you take the partial derivative with respect to a, everything else is constant. You know that the derivative of a sum of functions is the sum of the derivatives of the functions. In other words, the derivative "moves inside" the sum. You will be taking the derivative of every term in the sum, and you will be left with a sum of terms where each term is itself a sum of other terms. You will want to separate those into separate sums. For the derivative with respect to a, there will be three sums: a sum of all the y, a sum of all the x times a constant, and the sum of a constant. You can reduce that equation to an equation involving the two constants (a and b) and the averages of x and y. If you get that one, I think you will see how to do the derivative with respect to b, but it is a bit more complicated.

Give it a try and post your result. If you don't get it right, I or someone else will post it.
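If you want to check the algebra mechanically, here is a small symbolic sketch (assuming sympy; the x and y values are invented just to have something concrete):

[code]
# Verify symbolically that dE/da splits into the three sums described above
# (sympy assumed; the data values are arbitrary illustrative numbers).
import sympy as sp

a, b = sp.symbols('a b')
xs = [0, 1, 2, 4]
ys = [1, 3, 5, 9]
N = len(xs)

# E(a,b) as the sum of squared residuals
E = sum((yj - (a + b*xj))**2 for xj, yj in zip(xs, ys))

dE_da = sp.expand(sp.diff(E, a))
three_sums = sp.expand(-2*sum(ys) + 2*a*N + 2*b*sum(xs))
print(sp.simplify(dE_da - three_sums))  # prints 0 if the split is right
[/code]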
Mathman23
#10
May30-05, 06:32 AM
P: 255
Hello Dan,

If I differentiate [tex]E(a,b)[/tex] first with respect to a and then with respect to b, I get the following two equations:

[tex]\begin{array}{cc} \frac{\partial E}{ \partial a} = 2(ax_{j} - y_{j} +b_{j}) x_{j} \ \ \mathrm{and} \ \ \frac{\partial E}{ \partial b} = 2(b_{j} + ax_{j} -y_{j}) \end{array}[/tex]

Is this what you mean?

Again, thank you very much for your help.

Sincerely and best regards,
Fred
OlderDan
#11
May30-05, 12:06 PM
Sci Advisor
HW Helper
P: 3,033
Quote by Mathman23
Hello Dan,

If I differentiate [tex]E(a,b)[/tex] first with respect to a and then with respect to b, I get the following two equations:

[tex]\begin{array}{cc} \frac{\partial E}{ \partial a} = 2(ax_{j} - y_{j} +b_{j}) x_{j} \ \ \mathrm{and} \ \ \frac{\partial E}{ \partial b} = 2(b_{j} + ax_{j} -y_{j}) \end{array}[/tex]

Is this what you mean?

Again, thank you very much for your help.

Sincerely and best regards,
Fred
The derivatives are still sums of terms

[tex]E(a,b) = \sum_{j=1} ^{N} (y_{j} - (a + bx_{j}))^2[/tex]

[tex]E(a,b) = \sum_{j=1} ^{N} (y_{j} - a - bx_{j})^2[/tex]

[tex] \frac{\partial E}{ \partial a} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-1) = -2\sum_{j=1} ^{N} y_{j} + 2a\sum_{j=1} ^{N} 1 + 2b\sum_{j=1} ^{N} x_{j} = 0[/tex]

[tex] \frac{\partial E}{ \partial b} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-x_{j}) = -2\sum_{j=1} ^{N} x_{j}y_{j} + 2a\sum_{j=1} ^{N} x_{j} + 2b\sum_{j=1} ^{N} x_{j}^2 = 0[/tex]

Divide each of these results by -2N and you should start to identify some terms as simple averages. One will reduce to the parameter a. Other terms will involve the dot product of x and y, and the dot product of x with itself. The resulting equations can be put into the matrix form you were given.
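As a numerical cross-check of those two formulas, here is a sketch (assuming numpy; the data and the trial point are made up):

[code]
# Compare the closed-form partial derivatives above against
# central finite-difference estimates (numpy assumed; data illustrative).
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])
N = len(x)
a, b = 2.0, 1.5  # an arbitrary trial point, not the minimizer

E = lambda a, b: np.sum((y - a - b*x)**2)

dE_da = -2*y.sum() + 2*a*N + 2*b*x.sum()
dE_db = -2*np.sum(x*y) + 2*a*x.sum() + 2*b*np.sum(x*x)

h = 1e-6
print(dE_da, (E(a + h, b) - E(a - h, b)) / (2*h))  # should agree
print(dE_db, (E(a, b + h) - E(a, b - h)) / (2*h))  # should agree
[/code]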
Mathman23
#12
May31-05, 08:15 AM
P: 255
Hello Dan, and many thanks for your answer,

I divide the three terms in the first and second equations by [tex]-2N[/tex]
and get the following results:


[tex]-2 \sum _{j=1} ^{N} y_{j} \rightarrow \frac{-N y}{2} \rightarrow \frac{-N}{2} \sum_{j=1} ^ {n} y[/tex]

[tex]-2a \sum _{j=1} ^{N} 1 \rightarrow a \rightarrow a \sum_{j=1} ^{N} 1 [/tex]

[tex]-2b \sum _{j=1} ^{N} x_{j} \rightarrow bx \rightarrow b \sum_{j=1} ^{N} x[/tex]

The three sums in the second part yields:

[tex]-2 \sum _{j=1} ^{N} x_{j} y_{j} \rightarrow xy \rightarrow \sum_{j=1} ^{N} xy[/tex]

[tex]-2a \sum _{j=1} ^{N} x_{j} \rightarrow ax \rightarrow a \sum_{j=1} ^{N} x [/tex]

[tex]-2b \sum _{j=1} ^{N} {x_{j}}^2 \rightarrow b x^2 \rightarrow b \sum_{j=1} ^{N} x^2[/tex]

If I insert the above into the matrix A, I get the following:

[tex] A= \left[ \begin{array}{cc} 1 & x \\ x & x^2 \end{array} \right ][/tex]

which can also be written as:

[tex] A \left [ \begin{array}{c} a \\ b \end{array} \right ] = ??[/tex]

But how do I conclude that the product of these two matrices equals the matrix

[tex] C = \left[ \begin{array}{c} y \\ xy \end{array} \right ][/tex] ?

Sincerely and Best regards

/Fred

Quote by OlderDan
[tex] \frac{\partial E}{ \partial b} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-x_{j}) = -2\sum_{j=1} ^{N} x_{j}y_{j} -2a\sum_{j=1} ^{N} x_{j} -2b\sum_{j=1} ^{N} x_{j}^2 = 0[/tex]

Divide each of these results by -2N and you should start to identify some terms as simple averages. One will reduce to the parameter a. Other terms will involve the dot product of x and y, and the dot product of x with itself. The resulting equations can be put into the matrix form you were given.
OlderDan
#13
May31-05, 12:02 PM
Sci Advisor
HW Helper
P: 3,033
Quote by Mathman23
Hello Dan, and many thanks for your answer,

I divide the three terms in the first and second equations by [tex]-2N[/tex]
and get the following results:


[tex]-2 \sum _{j=1} ^{N} y_{j} \rightarrow \frac{-N y}{2} \rightarrow \frac{-N}{2} \sum_{j=1} ^ {n} y[/tex]
You have not divided by -2N. I see I made a foolish sign error in my previous post. I will go back and fix my error, but here are the correct equations

[tex] \frac{\partial E}{ \partial a} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-1) = -2\sum_{j=1} ^{N} y_{j} +2a\sum_{j=1} ^{N} 1 +2b\sum_{j=1} ^{N} x_{j} = 0[/tex]

[tex] \frac{\partial E}{ \partial b} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-x_{j}) = -2\sum_{j=1} ^{N} x_{j}y_{j} + 2a\sum_{j=1} ^{N} x_{j} + 2b\sum_{j=1} ^{N} x_{j}^2 = 0[/tex]

[tex] -2\sum_{j=1} ^{N} y_{j} + 2a\sum_{j=1} ^{N} 1 + 2b\sum_{j=1} ^{N} x_{j} = 0 \rightarrow \frac{1}{N} \sum_{j=1} ^{N} y_{j} - \frac{a}{N}\sum_{j=1} ^{N} 1 - \frac{b}{N} \sum_{j=1} ^{N} x_{j} = \overline{y} - a - b \overline{x} = 0[/tex]

See if you can fix the second equation and take it from there. Keep in mind that a sum over a product of corresponding vector components is a dot product.
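In symbols, the identities to reach for are

[tex] \sum_{j=1}^{N} x_{j} = N\overline{x}, \quad \sum_{j=1}^{N} x_{j} y_{j} = x \cdot y, \quad \sum_{j=1}^{N} x_{j}^2 = x \cdot x [/tex]

where x and y are the data vectors defined in the first post.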
Mathman23
#14
Jun1-05, 05:00 AM
P: 255
Quote by OlderDan
You have not divided by -2N. I see I made a foolish sign error in my previous post. I will go back and fix my error, but here are the correct equations

[tex] \frac{\partial E}{ \partial a} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-1) = -2\sum_{j=1} ^{N} y_{j} +2a\sum_{j=1} ^{N} 1 +2b\sum_{j=1} ^{N} x_{j} = 0[/tex]

[tex] \frac{\partial E}{ \partial b} = \sum_{j=1} ^{N} 2(y_{j} - a - bx_{j})(-x_{j}) = -2\sum_{j=1} ^{N} x_{j}y_{j} + 2a\sum_{j=1} ^{N} x_{j} + 2b\sum_{j=1} ^{N} x_{j}^2 = 0[/tex]

[tex] -2\sum_{j=1} ^{N} y_{j} + 2a\sum_{j=1} ^{N} 1 + 2b\sum_{j=1} ^{N} x_{j} = 0 \rightarrow \frac{1}{N} \sum_{j=1} ^{N} y_{j} - \frac{a}{N}\sum_{j=1} ^{N} 1 - \frac{b}{N} \sum_{j=1} ^{N} x_{j} = \overline{y} - a - b \overline{x} = 0[/tex]

See if you can fix the second equation and take it from there. Keep in mind that a sum over a product of corresponding vector components is a dot product.
Hi Dan, and many thanks for your answers,

In the second equation I get the following:

[tex] -2\sum_{j=1} ^{N} x_{j}y_{j} + 2a\sum_{j=1} ^{N} x_{j} + 2b\sum_{j=1} ^{N} x_{j}^2 = 0 \rightarrow \sum_{j=1} ^{N} \frac{x_{j} y_{j}}{N} - \frac{a}{N} \sum_{j=1} ^{N} x_{j} - b \sum_{j=1} ^{N} \frac{x_{j}^2}{N} = \frac{x \cdot y}{N} - a \overline{x} - b \, \frac{x \cdot x}{N} = 0 [/tex]

If I combine your result with mine, I get the following set of equations:

[tex]\begin{array}{ccc}
a + b \, \overline{x} & = & \overline{y} \\ a \, \overline{x} + b \, \frac{x \cdot x}{N} & = & \frac{x \cdot y}{N} \end{array}[/tex]

This can also be written as the inhomogeneous linear system given in the original problem:

[tex]\left[ {\begin{array}{*{20}c} 1 & {\overline x } \\ {\overline x } & {\frac{{ x \cdot x }}{N}} \\\end{array}} \right] \cdot \left [ \begin{array}{c} a \\ b \end{array} \right ] = \left [ \begin{array}{c} \overline{y} \\ \frac{x \cdot y}{N}\end{array} \right ] [/tex]

One question then remains: how does one prove that solving the above system yields the minimum value of E(a,b)?

Sincerely and Best Regards,

Fred
OlderDan
#15
Jun1-05, 03:58 PM
Sci Advisor
HW Helper
P: 3,033
Quote by Mathman23
One question then remains: how does one prove that solving the above system yields the minimum value of E(a,b)?

Fred
In the calculus of one variable you learned that maxima and minima are found by setting the first derivative of a function to zero. In multivariate calculus, the analogous thing is setting the partial derivatives equal to zero. For a function of two variables you have

[tex] df(a,b) = \frac{\partial f(a,b)}{\partial a}da + \frac{\partial f(a,b)}{\partial b}db[/tex]

If you change a slightly without changing b, the change in f(a,b) is just the first term on the right. When you change b without changing a, the change in f(a,b) is just the second term. When both those terms are zero, because the partial derivatives are both zero, an arbitrary small change in both a and b gives no change in f(a,b). There are three possibilities. If both partials represent minima, the function is at a minimum. If both represent maxima, the function is at a maximum. If one is a minimum and the other is a maximum, it is referred to as a "saddle point" because the graph of f(a,b) has the shape of a saddle. In this problem, the point is a minimum. You could prove that by taking second derivatives. I'll leave that for another problem.
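For the curious, here is a sketch of that second-derivative check. The second partials of E are constant:

[tex] \frac{\partial^2 E}{\partial a^2} = 2N, \quad \frac{\partial^2 E}{\partial a \, \partial b} = 2\sum_{j=1}^{N} x_{j} = 2N\overline{x}, \quad \frac{\partial^2 E}{\partial b^2} = 2\sum_{j=1}^{N} x_{j}^2 [/tex]

so the determinant of the matrix of second derivatives is [tex]4N\sum_{j=1}^{N} x_{j}^2 - 4N^2\overline{x}^{\,2} = 4N^2 \left( \frac{x \cdot x}{N} - \overline{x}^{\,2} \right)[/tex], which is nonnegative by the Cauchy-Schwarz inequality. Since [tex]\frac{\partial^2 E}{\partial a^2} = 2N > 0[/tex], the critical point is a minimum whenever the x_j are not all equal.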
Mathman23
#16
Jun1-05, 04:33 PM
P: 255
Hi

I remember that from my calculus course. Thank you :-)

I was informed today that I also need to calculate the determinant of A.

[tex]A= \left[ {\begin{array}{*{20}c} 1 & {\overline x } \\ {\overline x } & {\frac{{ x \cdot x }}{N}} \\\end{array}} \right] [/tex]

I get [tex]det(A) = \frac{x \cdot x}{N}[/tex]

But if that's correct, for which values is A then invertible?

Sincerely and best regards,

Fred

Quote by OlderDan
In the calculus of one variable you learned that maxima and minima are found by setting the first derivative of a function to zero. In multivariate calculus, the analogous thing is setting the partial derivatives equal to zero. For a function of two variables you have

[tex] df(a,b) = \frac{\partial f(a,b)}{\partial a}da + \frac{\partial f(a,b)}{\partial b}db[/tex]

If you change a slightly without changing b, the change in f(a,b) is just the first term on the right. When you change b without changing a, the change in f(a,b) is just the second term. When both those terms are zero, because the partial derivatives are both zero, an arbitrary small change in both a and b gives no change in f(a,b). There are three possibilities. If both partials represent minima, the function is at a minimum. If both represent maxima, the function is at a maximum. If one is a minimum and the other is a maximum, it is referred to as a "saddle point" because the graph of f(a,b) has the shape of a saddle. In this problem, the point is a minimum. You could prove that by taking second derivatives. I'll leave that for another problem.
OlderDan
#17
Jun1-05, 07:40 PM
Sci Advisor
HW Helper
P: 3,033
Quote by Mathman23
Hi

I remember that from my calculus course. Thank you :-)

I was informed today that I also need to calculate the determinant of A.

[tex]A= \left[ {\begin{array}{*{20}c} 1 & {\overline x } \\ {\overline x } & {\frac{{ x \cdot x }}{N}} \\\end{array}} \right] [/tex]

I get [tex]det(A) = \frac{x \cdot x}{N}[/tex]

But if that's correct, for which values is A then invertible?

Sincerely and best regards,

Fred
You are missing a term in the determinant.
Mathman23
#18
Jun2-05, 04:48 AM
P: 255
Hi

I have recalculated the determinant of A.

I get the following result:

[tex]det(A) = (\frac{1}{N} -1) x^2 \rightarrow (\frac{1}{N} -1) \sum_{j=1} ^{N} (-x_{j}^2) [/tex]

To answer my second question, I need to set the above expression equal to zero in order to determine for which values the matrix A is invertible.

[tex]det(A) = (\frac{1}{N} -1) x^2 = 0[/tex]

But for which variable do I need to solve the above equation?

Sincerely and Best Regards

Fred
Quote by OlderDan
You are missing a term in the determinant.
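For completeness: once the missing term is included, the determinant of the corrected matrix works out to

[tex] \det(A) = \frac{x \cdot x}{N} - \overline{x}^{\,2} [/tex]

By the Cauchy-Schwarz inequality this is nonnegative, and it vanishes exactly when all the x_j are equal, so A is invertible precisely when the data points do not all share the same x-coordinate.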

