Why Do We Square the Deviation in Least Squares Regression?

ElijahRockers
Homework Statement



http://www.math.tamu.edu/~vargo/courses/251/HW7.pdf

Given a set of points ##(x_i, y_i)## and assuming ##f(x_i)## is linear, the deviation measured is ##F(m,b)=\sum_{i}^{n}(y_i - f(x_i))^2##. There are a few different questions about this in the link above.

The Attempt at a Solution



I'm not sure why the expression ##\sum_{i}^{n}(y_i - f(x_i))^2## is squared.

Part one says to find the partials of this expression with respect to m and b. Here's my take on it: ##y_i## comes from the set of points, and ##f(x_i)## would be equal to ##mx_i+b##.

Using the chain rule,

\frac{\partial F}{\partial m} = \sum_i^n 2x_i(y_i-(mx_i+b))

\frac{\partial F}{\partial b} = \sum_i^n 2(y_i-(mx_i+b))

Are these correct?

I can kind of see where this question is going. We can use these derivatives to find the 'best fit'; I'm guessing that where the change in the deviation is zero is probably where the deviation is a minimum.

Sooo, dF. Does he mean both of the partials? Either one? Or are my expressions wrong to begin with? I haven't got any real experience with summation notation, but I took a stab at solving for b by setting dF/db = 0.

b=\frac{\sum_i^n 2(y_i - mx_i)}{2n}

That was just kind of a shot in the dark. If someone could give me a push in the right direction I'd appreciate it! Thanks :)
 
ElijahRockers said:
Using the chain rule,

\frac{\partial F}{\partial m} = \sum_i^n 2x_i(y_i-(mx_i+b))

\frac{\partial F}{\partial b} = \sum_i^n 2(y_i-(mx_i+b))

Are these correct?
They're both missing a negative sign, but otherwise, they're correct.
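To see where the sign comes from, here is the chain-rule step written out for a single term of the sum (a routine computation, spelled out only for reference):

\frac{\partial}{\partial b}\bigl(y_i-(mx_i+b)\bigr)^2 = 2\bigl(y_i-(mx_i+b)\bigr)\cdot\frac{\partial}{\partial b}\bigl(y_i-(mx_i+b)\bigr) = -2\bigl(y_i-(mx_i+b)\bigr),

and the inner derivative with respect to m is ##-x_i##, which gives ##-2x_i\bigl(y_i-(mx_i+b)\bigr)## for that term.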

Sooo, dF. Does he mean both of the partials? Either one? Or are my expressions wrong to begin with?
Both partials need to be equal to 0.

I haven't got any real experience with summation notation, but I took a stab at solving for b by setting dF/db = 0.

b=\frac{\sum_i^n 2(y_i - mx_i)}{2n}

That was just kind of a shot in the dark. If someone could give me a push in the right direction I'd appreciate it! Thanks :)
You're trying to obtain expressions for m and b that depend only on the x and y values. Your result for b still has m in it. The two equations ##\partial F/\partial m = 0## and ##\partial F/\partial b = 0## can be written in the form
\begin{align*}
A m + B b &= E \\
C m + D b &= G
\end{align*} where A, B, C, D, E, and G are combinations of the x and y values (writing G instead of F to avoid confusion with the function F(m,b)). You can solve this system of equations to get the results you want.
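If it has been a while since you solved a 2×2 linear system, the general solution (this is just Cramer's rule, assuming ##AD - BC \neq 0##) is
\begin{align*}
m &= \frac{ED - BG}{AD - BC}, & b &= \frac{AG - CE}{AD - BC},
\end{align*}
which you can also get by ordinary elimination or substitution.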
 
ElijahRockers said:

I'm not sure why the expression ##\sum_{i}^{n}(y_i - f(x_i))^2## is squared.
It's not. It is the individual terms ##y_i - f(x_i)## inside the sum that are squared. One reason for the square is to make sure that errors on opposite sides of the line, positive and negative, do not cancel. More fundamentally (there are many ways to keep errors from canceling), it mimics the formula for distance in n dimensions, ##\sqrt{\sum_i (x_i - y_i)^2}##: F(m,b) is the squared Euclidean distance between the point ##(y_1, \dots, y_n)## and the point ##(f(x_1), \dots, f(x_n))##.
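As a quick numerical illustration (with made-up residuals): if one point lies 1 above the line and another lies 1 below it, the signed deviations give ##(+1) + (-1) = 0##, which looks like a perfect fit, while the squared deviations give ##(+1)^2 + (-1)^2 = 2##, which correctly reports that the line misses both points.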

 
vela said:
They're both missing a negative sign, but otherwise, they're correct.

Ahhh, I see that now. Thanks. But the negative sign won't change where the partials are zero, since it's just an overall factor of -1, right?

So after more careful consideration, here's a second stab at it...

\frac{\partial F}{\partial b} = 0: \sum_i^n (mx_i + b) = \sum_i^n y_i
\frac{\partial F}{\partial m} = 0: \sum_i^n (mx_i^2 + bx_i) = \sum_i^n y_ix_i

But wouldn't the extra x_i cancel out in the second equation? Then I'd be left with two identical equations and that wouldn't help me.

And thanks Ivy, that clears things up a great deal. :)
 
The ##x_i## are different for each i; it's not a constant multiple of the whole equation. Here is a sample system of equations like what you have. Suppose that ##x_1 = 1##, ##x_2 = 2##, ##y_1 = 3##, ##y_2 = 4##:
(1*m + b) + (2*m + b) = 3 + 4
1*(1*m + b) + 2*(2*m + b) = 1*3 + 2*4

Doing some algebra, these equations become
3m+2b=7
5m+3b=11

Notice that these are not equivalent equations. Solving for m and b will, in this case, give you the line passing through the two prescribed points.
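As a quick check: eliminating b (multiply the first equation by 3 and the second by 2, then subtract) gives ##m = 1##, and then ##b = 2##; the line ##y = x + 2## does pass through ##(1,3)## and ##(2,4)##.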
 
Office_Shredder said:
Suppose that ##x_1 = 1##, ##x_2 = 2##, ##y_1 = 3##, ##y_2 = 4##:
(1*m + b) + (2*m + b) = 3 + 4
1*(1*m + b) + 2*(2*m + b) = 1*3 + 2*4

Isn't that basically the same thing I have, just with the sums expanded?

Ok I took another shot...

nb + (\sum_i^n x_i)m = \sum_i^n y_i
(\sum_i^n x_i )b + (\sum_i^n x_i^2)m = \sum_i^n y_ix_i

Is that right? If so, could I solve by substitution?

Thanks for all the help.
 
Yes, that's exactly what you need. Now solve it by your favorite method for solving two equations in two unknowns.
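If you want to sanity-check the algebra numerically, here is a minimal Python sketch (the data points below are made up for illustration, not taken from the assignment) that builds this 2×2 system from the sums and compares the answer with numpy.polyfit:

import numpy as np

# Made-up sample data (any set of points with at least two distinct x values will do).
x = np.array([1.0, 2.0, 4.0, 5.0])
y = np.array([3.0, 4.0, 8.0, 9.0])
n = len(x)

# The normal equations from this thread:
#   n*b           + (sum x_i)   * m = sum y_i
#   (sum x_i) * b + (sum x_i^2) * m = sum x_i * y_i
A = np.array([[n,       x.sum()],
              [x.sum(), (x**2).sum()]])
rhs = np.array([y.sum(), (x * y).sum()])

b, m = np.linalg.solve(A, rhs)   # unknowns ordered (b, m) to match the matrix A
print(f"by hand: m = {m:.4f}, b = {b:.4f}")

# Cross-check against numpy's built-in degree-1 least-squares fit.
m_np, b_np = np.polyfit(x, y, 1)
print(f"polyfit: m = {m_np:.4f}, b = {b_np:.4f}")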
 