Suppose we have an observation $y = c + e$, where $c$ is an unknown constant and $e$ is an error with pdf $\exp(-e-1)$ for $e > -1$. We want to determine the least squares estimator of $c$, given by the $c^*$ that minimizes the error cost function $E(c) = \tfrac{1}{2}(y-c)^2$. Minimizing the error cost is done by setting the derivative with respect to $c$ to zero, $dE/dc = -(y-c) = 0$, so $c^* = y$. Shouldn't it take into account the distribution of the error?

I understand the matrix case, where $y = Hc + e$ and $E(c) = e^T e$:

$$E(c) = (y-Hc)^T (y-Hc) = y^T y - c^T H^T y - y^T H c + c^T H^T H c.$$

The derivative with respect to $c$ is $-2 y^T H + 2 c^T H^T H = 0$, which gives $c = (H^T H)^{-1} H^T y$.

I guess I am just confused about what to do in the scalar case.
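
As a sanity check on the algebra, here is a minimal NumPy sketch that solves the normal equations numerically and compares the result with NumPy's own least squares routine. The function names `least_squares` and `sample_error`, the random test matrix, and the "true" values of c are all made up for illustration. The error is simulated as a unit exponential shifted left by 1, which is what the pdf exp(-e-1) for e > -1 describes.

```python
import numpy as np

def least_squares(H, y):
    """Solve the normal equations (H^T H) c = H^T y for c."""
    H = np.asarray(H, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.linalg.solve(H.T @ H, H.T @ y)

rng = np.random.default_rng(0)  # fixed seed, illustration only

def sample_error(size):
    """Draw errors with pdf exp(-e-1) for e > -1 (shifted unit exponential)."""
    return rng.exponential(scale=1.0, size=size) - 1.0

# Matrix case: y = H c + e with an arbitrary (assumed) H and c.
H = rng.normal(size=(50, 3))
c_true = np.array([1.0, -2.0, 0.5])
y = H @ c_true + sample_error(50)

c_hat = least_squares(H, y)                   # closed form from the derivation
c_ref = np.linalg.lstsq(H, y, rcond=None)[0]  # NumPy's solver, for comparison

# Scalar case: a single observation, H is the 1x1 matrix [1].
y_scalar = 3.0 + sample_error(1)
c_scalar = least_squares(np.ones((1, 1)), y_scalar)

print("matrix case, closed form:", c_hat)
print("matrix case, lstsq      :", c_ref)
print("scalar case, y and c*   :", y_scalar, c_scalar)
```

With H set to the 1x1 matrix [1], the same formula reduces to c* = y, matching the scalar derivative calculation. Note that the pdf appears only in how the data are simulated here; the closed-form solution itself never uses it.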