Hi everyone,

I'm currently learning about finite-element methods and I'm having trouble understanding the motivation for the Galerkin method. Most textbooks I've managed to find are either overly simple or too advanced for my current understanding.

I understand that unless your basis functions happen to reproduce the exact solution, you'll get a residual. I also understand why it might be useful to integrate that residual over the element (to get a measure of the total deviation). But I don't understand why the Galerkin method weights the residual by the shape functions and sets the result equal to zero.

If I were deriving this myself, I would have approached it as a minimization problem: minimize some measure of the residual to get the 'best' fit, by taking the derivative of the residual integral with respect to the unknowns, setting it equal to zero, and solving. I can also see variations of this, like minimizing the square of the residual (a least-squares fit?).

The Galerkin method seems to do none of these things. How does weighting the residual by the shape functions and setting the result equal to zero give you a good solution? I'd like to understand why the Galerkin method is good and why it became so common.
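To make the comparison concrete, here is a minimal sketch of the two approaches as I currently understand them. Everything in it is my own assumption for illustration: the toy problem -u'' = f on (0, 1) with u(0) = u(1) = 0, the global polynomial basis phi_k(x) = x^k (1 - x) used in place of real element shape functions, and the helper names. Written this way, both methods set a weighted integral of the residual to zero, and the only difference is the weight: the basis function itself for Galerkin, versus the derivative of the residual with respect to each unknown for least squares.

```python
import numpy as np

# Toy problem (assumed for illustration): -u'' = f on (0, 1), u(0) = u(1) = 0,
# with f chosen so that the exact solution is u(x) = sin(pi x).
def f(x):
    return np.pi**2 * np.sin(np.pi * x)

def u_exact(x):
    return np.sin(np.pi * x)

# Hypothetical global basis phi_k(x) = x^k (1 - x), k = 1..n.
# Each phi_k vanishes at x = 0 and x = 1, so the boundary conditions hold.
def phi(k, x):
    return x**k * (1 - x)

def d2phi(k, x):
    # phi_k = x^k - x^(k+1), so phi_k'' = k(k-1) x^(k-2) - (k+1) k x^(k-1)
    first = k * (k - 1) * x**(k - 2) if k >= 2 else np.zeros_like(x)
    return first - (k + 1) * k * x**(k - 1)

def solve_weighted_residual(n, method):
    """Solve  integral( w_i * R ) dx = 0  for i = 1..n, where
    R = -sum_k c_k phi_k'' - f is the residual of the trial solution.

    method == "galerkin":      w_i = phi_i
    method == "least_squares": w_i = dR/dc_i = -phi_i''
                               (this is d/dc_i of integral(R^2) set to zero)
    """
    # Gauss-Legendre quadrature mapped from [-1, 1] to [0, 1]
    nodes, weights = np.polynomial.legendre.leggauss(20)
    x = 0.5 * (nodes + 1.0)
    w = 0.5 * weights

    ks = range(1, n + 1)
    P = np.array([phi(k, x) for k in ks])       # basis values, shape (n, nq)
    L = -np.array([d2phi(k, x) for k in ks])    # operator applied to basis

    if method == "galerkin":
        W = P
    elif method == "least_squares":
        W = L
    else:
        raise ValueError(f"unknown method: {method}")

    A = (W * w) @ L.T      # A[i, k] = integral( w_i * (-phi_k'') )
    b = (W * w) @ f(x)     # b[i]    = integral( w_i * f )
    return np.linalg.solve(A, b)

def evaluate(c, x):
    # Trial solution u_h(x) = sum_k c_k phi_k(x)
    return sum(ck * phi(k, x) for k, ck in enumerate(c, start=1))

if __name__ == "__main__":
    xs = np.linspace(0.0, 1.0, 101)
    for method in ("galerkin", "least_squares"):
        c = solve_weighted_residual(3, method)
        err = np.max(np.abs(evaluate(c, xs) - u_exact(xs)))
        print(f"{method:14s} coefficients = {c}, max error = {err:.2e}")
```

With a single basis function the two methods visibly disagree: working the integrals by hand, I get c = 12/pi for Galerkin and c = pi for least squares. So they really are different conditions, which is what prompts my question about why the Galerkin choice of weight is the preferred one.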