Hello,

I'm trying to understand this proof: http://en.wikipedia.org/wiki/Proofs...st_squares#Least_squares_estimator_for_.CE.B2

Can someone quickly talk me through the differentiation step, bearing in mind I've never learned how to differentiate with respect to a vector? The most confusing points for me are:

1. Why are they differentiating with respect to the transpose b' rather than just b?
2. Where does the -2X'y term come from?
3. Is there any assumption here that X is square?

Thanks for any help,
Mike