In least-squares linear regression, say we have y = Xb + e, where y, b, and e are vectors and X is a matrix: y holds the observations, b the coefficients, and e the error terms.
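So we need to minimize e'e = (y-Xb)'(y-Xb) = y'y - y'Xb - b'X'y + b'X'Xb. Since y'Xb is a scalar, it equals its own transpose b'X'y, so the two middle terms combine to -2b'X'y. Taking the derivative of e'e with respect to b then gives 0 - 2X'y + 2X'Xb...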
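
As a sanity check on the derivative above, here is a minimal NumPy sketch that compares -2X'y + 2X'Xb against a finite-difference gradient of e'e. The data, the sizes n and p, and the helper names sse and grad_sse are all made up for illustration. The last step sets the derivative to zero and solves X'Xb = X'y, which assumes X'X is invertible.

```python
import numpy as np

# Synthetic data (illustrative assumption, not from the post).
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
b_true = np.array([1.0, -2.0, 0.5])
y = X @ b_true + 0.1 * rng.normal(size=n)


def sse(b):
    """Sum of squared errors e'e = (y - Xb)'(y - Xb)."""
    e = y - X @ b
    return e @ e


def grad_sse(b):
    """Analytic derivative of e'e with respect to b: -2X'y + 2X'Xb."""
    return -2 * X.T @ y + 2 * X.T @ X @ b


# Central finite differences at an arbitrary point b0.
b0 = rng.normal(size=p)
h = 1e-6
basis = np.eye(p)
numeric = np.array(
    [(sse(b0 + h * basis[j]) - sse(b0 - h * basis[j])) / (2 * h) for j in range(p)]
)

# Setting the derivative to zero gives X'Xb = X'y (assumes X'X is invertible).
b_hat = np.linalg.solve(X.T @ X, X.T @ y)

print("analytic gradient:", grad_sse(b0))
print("numeric gradient: ", numeric)
print("b_hat:", b_hat)
```

The two gradients should agree to several decimal places, and grad_sse(b_hat) should be numerically zero. In practice np.linalg.lstsq(X, y, rcond=None) is usually preferred over forming X'X explicitly, since it is better conditioned.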