# Minimization and least squares/ridge regression

1. Jul 18, 2012

### monkaez

1. The problem statement, all variables and given/known data

f(x; a) = a_0 + (a_1, a_2, a_3, ..., a_d) * x

min_a (Xa - Y)^t o^(-1) (Xa - Y)

a = (a_0, a_1, a_2, a_3, ..., a_d)^t

2. Relevant equations

Y = (y_1, y_2, y_3, ..., y_k)^t

X = design matrix
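For concreteness, here is a sketch of what the design matrix would look like for this model. The indexing is an assumption, not something stated in the problem: there are k observations x_1, ..., x_k, each with d components x_{i1}, ..., x_{id}, and a_0 is the intercept (the first entry of a). Each row is one observation with a leading 1, so that the i-th entry of Xa equals f(x_i; a).

$$
X = \begin{pmatrix}
1 & x_{11} & \cdots & x_{1d} \\
1 & x_{21} & \cdots & x_{2d} \\
\vdots & \vdots & & \vdots \\
1 & x_{k1} & \cdots & x_{kd}
\end{pmatrix},
\qquad
(Xa)_i = a_0 + \sum_{j=1}^{d} a_j x_{ij} = f(x_i; a)
$$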

3. The attempt at a solution

To minimize, write

(X(a + delta a) - Y)^t o^(-1) (X(a + delta a) - Y)

= (Xa - Y)^t o^(-1) (Xa - Y) + (delta a)^t X^t o^(-1) (Xa - Y) + (Xa - Y)^t o^(-1) X (delta a) + O((delta a)^t * (delta a))

= (Xa - Y)^t o^(-1) (Xa - Y) + 2*(delta a)^t X^t o^(-1) (Xa - Y) + O((delta a)^t * (delta a))

a = (X^t o^(-1) X)^(-1) X^t o^(-1) Y

This is taken directly from my professor's notes, and I have no clue how it proves that the resulting expression is always the minimum.

The solution a above is just the solution of the standard least squares problem if o = σ^2 * I (a scalar variance σ^2 times the identity matrix):

a = (X^t X)^(-1)*X^t*Y
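Both formulas can be checked numerically. The following is a minimal sketch, not part of the notes: the data are random, `Sigma` stands for the matrix written o above and is assumed to be symmetric positive definite, and all names (`objective`, `weighted_solution`, the sizes `k` and `d`) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
k, d = 50, 3  # hypothetical sizes: k observations, d features

# Design matrix: a column of ones (for a_0) followed by the d feature columns.
X = np.column_stack([np.ones(k), rng.normal(size=(k, d))])
a_true = np.array([1.0, -2.0, 0.5, 3.0])
Y = X @ a_true + rng.normal(scale=0.3, size=k)

# A random symmetric positive definite matrix playing the role of o.
B = rng.normal(size=(k, k))
Sigma = B @ B.T + k * np.eye(k)
Sigma_inv = np.linalg.inv(Sigma)


def objective(a, W):
    """Return (Xa - Y)^t W (Xa - Y) for a weight matrix W."""
    r = X @ a - Y
    return r @ W @ r


def weighted_solution(W):
    """Solve the normal equations (X^t W X) a = X^t W Y."""
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)


# Closed-form solution with W = o^(-1).
a_gls = weighted_solution(Sigma_inv)

# Random perturbations delta a should never lower the objective.
base = objective(a_gls, Sigma_inv)
increases = [
    objective(a_gls + rng.normal(scale=0.1, size=d + 1), Sigma_inv) - base
    for _ in range(1000)
]
min_increase = min(increases)

# Special case o = sigma^2 * I: the scalar cancels and ordinary least squares remains.
sigma2 = 0.09
a_iso = weighted_solution(np.eye(k) / sigma2)
a_ols = np.linalg.solve(X.T @ X, X.T @ Y)

print("smallest change in objective under perturbation:", min_increase)
print("o = sigma^2 * I matches ordinary least squares:", np.allclose(a_iso, a_ols))
```

Using `np.linalg.solve` on the normal equations rather than forming `(X^t W X)^(-1)` explicitly is a design choice for numerical stability; it computes the same a.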

I guess my main issue is understanding how the minimization process works and how he drops the terms in the O notation. Any input is greatly appreciated. (I tried using LaTeX but the code doesn't render for me and I need to read more about it.)
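One way to see what happens to the O term is to write the expansion out exactly. This is a sketch under assumptions not stated in the notes: $\Sigma$ denotes the matrix written o above and is taken to be symmetric positive definite, $\delta a$ is the perturbation, and $J$ is a name introduced here for the objective.

$$
J(a) = (Xa - Y)^T \Sigma^{-1} (Xa - Y)
$$

$$
J(a + \delta a) = J(a) + 2\,\delta a^T X^T \Sigma^{-1} (Xa - Y) + \delta a^T X^T \Sigma^{-1} X\,\delta a
$$

Because $J$ is quadratic in $a$, there are no further terms, so the O term is exactly $\delta a^T X^T \Sigma^{-1} X\,\delta a$. The linear term vanishes for every $\delta a$ only if

$$
X^T \Sigma^{-1} (Xa - Y) = 0
\quad\Longrightarrow\quad
a = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} Y ,
$$

which needs $X^T \Sigma^{-1} X$ to be invertible (for example, $X$ of full column rank). At that $a$,

$$
J(a + \delta a) - J(a) = (X\,\delta a)^T \Sigma^{-1} (X\,\delta a) \ge 0 ,
$$

since $\Sigma^{-1}$ is positive definite. Under these assumptions the second-order term is not so much dropped as shown to be non-negative, which is what makes the stationary point a minimum rather than a maximum or saddle.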

Last edited: Jul 18, 2012