Hi everyone, I'm currently learning about finite-element methods and I'm having trouble understanding the motivation for the Galerkin method. Most textbooks I've managed to find are either overly simple or more advanced than my current understanding. I understand that unless you happen to select the exact solution as your basis functions, you'll get a residual. I also understand why it might be useful to integrate that residual over the element (to give you a measure of the total deviation). But I don't understand why the Galerkin method weights the residual by the shape functions and sets it equal to zero. If I were deriving this stuff I would have approached it as a minimization problem: minimize some measure of the residual to get the 'best' fit, i.e. take the derivative of the residual integral with respect to the unknowns, set it equal to zero and solve. I can also see variations of this, like minimizing the square of the residual (least-squares fit?). The Galerkin method seems to do none of these things. How does weighting the residual by the shape functions and setting the result equal to zero give you a good solution? I'd like to understand why the Galerkin method is good and why it became so common.
It seems I've found a potential answer in a book called "Finite Element Method - The Basis and Fundamentals" by Zienkiewicz and Taylor. This answer surprises me somewhat and maybe someone can shed some light on it. I believe the significance of matrix symmetry is that it can be used to reduce the computational cost associated with inverting the matrix. However, aren't there many other reasons that the stiffness matrix can become non-symmetric? Contact, for example? In situations where the stiffness matrix is already non-symmetric, would some other finite-element formulation provide better results?
Matrix symmetry or nonsymmetry depends (or should depend!) on the underlying physics. For example, a non-symmetric stiffness matrix means that the structure can do a non-zero amount of work by moving around a closed path. That may be correct if you are modelling friction, fluid-structure interaction, etc., but in general it's a bad idea to have a math model that is fundamentally inconsistent with the physics, so you don't really have a free choice about whether a stiffness matrix should be symmetric or not. Certainly for linear elasticity, a non-symmetric stiffness is just wrong. The Maxwell-Betti reciprocal theorem (first published in the 1870s IIRC) is a simple explanation why.
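The closed-path statement above can be checked numerically with a small sketch. Everything in it is an assumption made for illustration: a hypothetical 2-DOF linear "structure" with force f = K u, a unit-circle path in displacement space, and made-up matrix entries. It only shows the general idea, not any particular FE model.

```python
import numpy as np

def closed_path_work(K, n_steps=2000):
    """Work of the linear force f = K u as u goes once around the unit circle."""
    t = np.linspace(0.0, 2.0 * np.pi, n_steps, endpoint=False)
    u = np.stack([np.cos(t), np.sin(t)], axis=1)        # points on the path
    du_dt = np.stack([-np.sin(t), np.cos(t)], axis=1)   # path tangent
    power = np.einsum("ij,ij->i", u @ K.T, du_dt)       # (K u) . du/dt at each point
    return power.sum() * (2.0 * np.pi / n_steps)        # rectangle rule on a periodic integrand

# Illustrative matrices only (not taken from any real model)
K_sym = np.array([[2.0, 0.5], [0.5, 1.0]])
K_nonsym = np.array([[2.0, 0.5], [1.5, 1.0]])

work_sym = closed_path_work(K_sym)
work_nonsym = closed_path_work(K_nonsym)
print("symmetric K:    ", work_sym)
print("non-symmetric K:", work_nonsym)
```

For this particular path the work comes out as pi * (K[1, 0] - K[0, 1]), so it vanishes exactly when the off-diagonal entries match.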
Thanks for the reply AlephZero. I'm actually trying to understand the more general case for the finite-element method and why it is most commonly done the way it is. Do symmetric stiffness matrices result from other (non-Galerkin) finite-element formulations? Why is the Galerkin method so common (presumably something makes it 'better' than the alternatives)?
The "classical" way to derive FE formulations is from a variational principle. That's fine for situations where the variational principle is "obvious" from the physics, but you might want to apply FE to an arbitrary ODE or PDE where it is not at all obvious what the variational formulation of the problem is. The Galerkin method has the pragmatic advantage of being simple to apply in any situation (especially when combined with numerical integration to compute the element matrices), and in practice it often gives results identical to those of a more mathematically sophisticated approach. Provided the choice of element shape functions is sensible (e.g. they can represent low order polynomial functions exactly, whatever the geometry of the element) it is very unlikely to produce nonsense results - it might not be "the best" formulation, but it's almost always "good enough".
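As a rough illustration of how mechanical the Galerkin recipe is, here is a minimal sketch for a 1D model problem. The problem itself is an assumption chosen for the example: u'' - a*u + f(x) = 0 on [0, L] with u(0) = u(L) = 0, linear shape functions, and 2-point Gauss quadrature for the element integrals. The function name `galerkin_1d` and the test load are hypothetical.

```python
import numpy as np

def galerkin_1d(a, f, L=1.0, n_el=16):
    """Galerkin FE solve of u'' - a*u + f(x) = 0 on [0, L], u(0) = u(L) = 0,
    with linear shape functions and 2-point Gauss quadrature."""
    nodes = np.linspace(0.0, L, n_el + 1)
    K = np.zeros((n_el + 1, n_el + 1))
    b = np.zeros(n_el + 1)
    gauss_pts = np.array([-1.0, 1.0]) / np.sqrt(3.0)   # points on [-1, 1], weights are 1
    for e in range(n_el):
        x0, x1 = nodes[e], nodes[e + 1]
        h = x1 - x0
        dN = np.array([-1.0, 1.0]) / h                 # shape function derivatives
        for xi in gauss_pts:
            N = np.array([0.5 * (1.0 - xi), 0.5 * (1.0 + xi)])
            x = 0.5 * (x0 + x1) + 0.5 * h * xi         # quadrature point in physical space
            w = 0.5 * h                                # quadrature weight times Jacobian
            # Residual weighted by N_i and integrated by parts:
            #   integral(N_i' u' + a N_i u) = integral(N_i f)
            K[e:e + 2, e:e + 2] += w * (np.outer(dN, dN) + a * np.outer(N, N))
            b[e:e + 2] += w * N * f(x)
    u = np.zeros(n_el + 1)
    u[1:-1] = np.linalg.solve(K[1:-1, 1:-1], b[1:-1])  # boundary values stay zero
    return nodes, u, K

# Load chosen so that the exact solution is sin(pi x) when a = 1 and L = 1
a = 1.0
f = lambda x: (np.pi**2 + 1.0) * np.sin(np.pi * x)
nodes, u, K = galerkin_1d(a, f)
max_err = np.max(np.abs(u - np.sin(np.pi * nodes)))
print("max nodal error:", max_err)
print("K symmetric:", np.allclose(K, K.T))
```

Nothing in the assembly loop needed a variational principle, only the ODE, the shape functions and a quadrature rule. For this particular ODE the matrix still comes out symmetric.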
I have another question (perhaps this is a silly one). What is the variational method? Is that a name for the physics-based derivations, such as the virtual work principle used to derive the solid-mechanics finite-element formulation? (Or is the variational method another method?) So perhaps part of the reason for the adoption of the Galerkin method is that it produces the exact same results as the variational methods? Consistency in the results is nice. Any thoughts on my question about why the residual is not simply minimized in some way, without any weighting functions?
(Lost my reply due to automatic log out) Thanks. I've spent a while trying to decode the link. There is some pretty heavy math, but I think I've figured some of it out... I do have a question if you have time. What is J(w) from page 5? Where does it come from? What does it mean? I've followed the proof based on the definition, but why do we want to minimize J(w)? My best guess so far is that it is nothing but a reformulation of the problem. By that I mean, if you use this definition, you end up with a condition identical to the weak form of the equation. Which means if you satisfy either the weak form or the minimization form then you satisfy the other. (That can't be the only significance of it though, because we could just manipulate the strong form to get the weak one with a lot less effort.) J(w) doesn't appear to be a manipulated version of the strong form (at least not in a way I've managed to reproduce). It also doesn't appear to be a measure of the residual (at least I can't see how it is). What am I missing?
You can prove that, for symmetric positive-definite problems, the Galerkin approximation is the best approximation in the energy norm, i.e. the error in the energy norm is minimized by the Galerkin approximation. You can prove this using the orthogonality condition. This is why the Galerkin method is used. You can also try to minimize the residual in the L2 norm using the least squares method. Unsurprisingly, this method is called the least squares finite element method. J(w) is called a functional. This is what you want to minimize. If you have the ODE y''-ay+f=0 defined on [0,L], then the functional F(v) belonging to this ODE is the integral of (v')^2+av^2-2vf over the domain [0,L]. If you find the v that minimizes F(v), then you obtain the solution of the ODE. And yes, you don't actually need the variational form; it is equivalent to multiplying the ODE by a test function and integrating to get the weak form. But knowing that the Galerkin formulation is actually a minimization problem of a functional gives you access to a lot of tools from functional analysis that help you prove that solutions exist, are unique, and that the errors are bounded. I don't have the book anymore, but I remember that the book by Strang and Fix treats the mathematics behind the finite element method in detail, at least in more detail than I can recall right now.
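The minimization claim can be checked numerically with a short sketch. The setup is assumed for illustration: the ODE y''-ay+f=0 with a constant load f0, boundary conditions y(0) = y(L) = 0 (not stated above), and linear elements on a uniform mesh, for which the element integrals are known in closed form. With v written as a sum of shape functions with coefficients c, the functional F(v) becomes c.K.c - 2 c.b.

```python
import numpy as np

def assemble(a, f0, L=1.0, n_el=8):
    """K and b for y'' - a*y + f0 = 0, y(0) = y(L) = 0, linear elements, constant f0.
    Only the interior unknowns are kept."""
    h = L / n_el
    n = n_el - 1
    main = (2.0 / h + a * 4.0 * h / 6.0) * np.ones(n)   # integral of N_i'^2 + a N_i^2
    off = (-1.0 / h + a * h / 6.0) * np.ones(n - 1)     # coupling between neighbours
    K = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    b = f0 * h * np.ones(n)                             # integral of N_i * f0
    return K, b

def functional(c, K, b):
    """F(v) = integral of (v')^2 + a*v^2 - 2*v*f for v = sum_j c_j N_j."""
    return c @ K @ c - 2.0 * c @ b

K, b = assemble(a=3.0, f0=1.0)
c_galerkin = np.linalg.solve(K, b)        # Galerkin equations: K c = b
F_galerkin = functional(c_galerkin, K, b)

# Evaluate F at randomly perturbed coefficient vectors
rng = np.random.default_rng(0)
F_perturbed = [functional(c_galerkin + 0.1 * rng.standard_normal(b.size), K, b)
               for _ in range(1000)]
print("F at Galerkin solution:   ", F_galerkin)
print("smallest F after perturbing:", min(F_perturbed))
```

For a perturbation d, F(c + d) - F(c) equals d.K.d when K c = b, which is the squared energy norm of the perturbation. It is positive here because K is symmetric positive definite, so no perturbed candidate can beat the Galerkin coefficients.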
Thanks bigfooted, that's the sort of thing I'm trying to learn. Part of my question is how do I get J(w)? I don't see what operations were used to get the integrand of the functional. For example, how does (v')^2+av^2-2vf come from y''-ay+f=0? (In general, what is the procedure for finding a functional for a given ODE?) I will look for the Strang and Fix book. Thanks again.
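One way to connect the two expressions, sketched here under the assumption of boundary conditions y(0) = y(L) = 0 and test functions w that vanish at both ends (the thread does not state the boundary conditions), is to go from the ODE to the weak form and then recognize the weak form as the first variation of a quadratic functional:

```latex
% Step 1: multiply the ODE by a test function w and integrate by parts
% (the boundary term vanishes because w(0) = w(L) = 0).
\int_0^L \left( y'' - a y + f \right) w \, dx = 0
\quad\Longrightarrow\quad
\int_0^L \left( y' w' + a\, y\, w - f\, w \right) dx = 0 .

% Step 2: expand the functional F(v) = \int_0^L ( v'^2 + a v^2 - 2 v f ) dx
% around y in the direction w.
F(y + w) = F(y)
  + 2 \int_0^L \left( y' w' + a\, y\, w - f\, w \right) dx
  + \int_0^L \left( w'^2 + a\, w^2 \right) dx .

% Step 3: the middle term is twice the weak form, so it vanishes for every w
% exactly when y satisfies the weak form. The last term is non-negative for a >= 0,
% so that y is a minimizer of F.
```

This guess-and-check route works because the left-hand side of the weak form is symmetric in y and w. For operators where it is not (e.g. with a first-derivative term), a functional of this simple quadratic form generally does not exist, while the weighted-residual (Galerkin) statement can still be written down.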
Is this the book you're referring to? "An Analysis of the Finite-Element Method" by Gilbert Strang and George J. Fix
Yes that's the book... Hmm a new edition eh? You can also get the old edition from 1973 for $20 at abebooks.