The closest point to ##b## in the space spanned by the columns of ##A## is ##A\hat{x}##, where:
$$\hat{\mathbf{x}}=(A^TA)^{-1}A^T\mathbf{b}$$
My question is, shouldn't this solution ##\hat{x}## depend on the choice of distance function over the vector space? Choosing two different distance functions might give...
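To make the question concrete, here is a minimal sketch, assuming ##A## has a single column ##a## so that the formula reduces to a ratio of dot products. It compares the ordinary Euclidean distance with a weighted squared distance ##\sum_i w_i (b_i - x a_i)^2##, for which the analogous formula is ##(A^TWA)^{-1}A^TWb## with ##W=\mathrm{diag}(w)##. The function name `project_coefficient`, the weights, and the numbers are made up for illustration only.

```python
def project_coefficient(a, b, weights=None):
    """Return the x minimizing sum_i w_i * (b_i - x * a_i)**2.

    With weights=None every w_i is 1, which is the Euclidean case and
    matches (A^T A)^{-1} A^T b for a one-column A.
    """
    if weights is None:
        weights = [1.0] * len(a)
    # One-column version of A^T W b
    numerator = sum(w * ai * bi for w, ai, bi in zip(weights, a, b))
    # One-column version of A^T W A
    denominator = sum(w * ai * ai for w, ai in zip(weights, a))
    return numerator / denominator


# Hypothetical data: a spans a line in R^3, b does not lie on it.
a = [1.0, 1.0, 1.0]
b = [0.0, 0.0, 3.0]

x_euclid = project_coefficient(a, b)
x_weighted = project_coefficient(a, b, weights=[1.0, 1.0, 4.0])

print(x_euclid)    # 1.0
print(x_weighted)  # 2.0
```

In this toy case the two distance functions do give different answers, which suggests the formula above is tied to the Euclidean distance (the one induced by the standard dot product) rather than to an arbitrary choice of distance.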