Solution of ODEs by Continuous Groups by George Emmanuel
Chapter 2 Continuous One-Parameter Groups-I
Meditation 7 Symbol of the Infinitesimal Transformation
In books on Lie Groups that I own, the pages where the "symbol of the infinitesimal transformation" appears in a Taylor series are all wrinkled because I have handled them so much over the years trying to understand the terse explanations of such a profound result.
After the most recent bout of study, I have reached the following conclusion:
It isn't a profound result. That is, expanding the Taylor series isn't. It only looks profound if you make the natural choice for what function to expand. That choice will be wrong. The result will look profound and mysterious because it is the wrong answer for that function. You'll spend hours trying to prove it's actually correct. (Or maybe you won't check the result and will head into later confusion.)
The choice of what function to expand may be profound and the implications may be profound. I'll worry about that aspect later.
Make the following definitions:
Let T(x,y,\alpha) = ( \ T_x(x,y,\alpha), T_y(x,y,\alpha )\ ) denote a 1-parameter transformation.
Let f(x,y) be a real valued function whose domain is the xy-plane.
Let u_x(x,y) = D_\alpha T_x(x,y,\alpha)_{\ |\alpha=0}
Let u_y(x,y) = D_\alpha T_y(x,y,\alpha)_{\ |\alpha =0}
u_x and u_y are the infinitesimal elements.
(If you think of (T_x,T_y) sweeping out a path as "time" \alpha varies, then (u_x, u_y) is a tangent vector to that path at the point (x,y). )
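To make the definitions concrete, here is a sketch with sympy. The rotation group is my own stand-in example; the text's T is general.

```python
# Compute the infinitesimal elements u_x, u_y for a sample 1-parameter
# transformation (the rotation group, my example, not from the text).
import sympy as sp

x, y, a = sp.symbols('x y alpha')

# T(x, y, alpha): rotate the point (x, y) through angle alpha.
# Note T(x, y, 0) is the identity, as the parameterization convention requires.
T_x = x*sp.cos(a) - y*sp.sin(a)
T_y = x*sp.sin(a) + y*sp.cos(a)

# u_x = D_alpha T_x at alpha = 0;  u_y = D_alpha T_y at alpha = 0
u_x = sp.diff(T_x, a).subs(a, 0)
u_y = sp.diff(T_y, a).subs(a, 0)

print(u_x, u_y)   # -y x  (the tangent vector to the circular path at (x, y))
```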
Let U be the differential operator defined by the operation on the function g(x,y) by:
[eq. 7.1] U g(x,y) = u_x(x,y) \frac{\partial}{\partial x} g(x,y) + u_y(x,y)\frac{\partial}{\partial y} g(x,y)
The operator U is "the symbol of the infinitesimal transformation".
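A minimal sketch of U as an executable operator, again using the rotation group's infinitesimal elements u_x = -y, u_y = x (my example, not the text's):

```python
# The operator U of eq. 7.1, made executable for the rotation group's
# infinitesimal elements u_x = -y, u_y = x (an assumed example).
import sympy as sp

x, y = sp.symbols('x y')
u_x, u_y = -y, x

def U(g):
    """Symbol of the infinitesimal transformation applied to g(x, y)."""
    return sp.expand(u_x*sp.diff(g, x) + u_y*sp.diff(g, y))

print(U(x**2 + y**2))   # 0
print(U(x*y))           # x**2 - y**2
```

That U(x^2 + y^2) = 0 just reflects that the squared radius is unchanged by rotation.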
If you've made several abortive attempts to understand Lie groups, you've read the remark that the group can be determined by "what goes on in the neighborhood of the identity transformation". So it is very natural to look at Taylor series expansion about \alpha = 0 since, by convention, we are supposed to use a parameterization so that T(x,y,0) is the identity function.
The old time Lie group books expand a function they call f(x_1,y_1) in a Taylor series. The coordinates (x_1,y_1) are a transformation of the point (x,y). The expansion is neatly expressed in terms of the differential operator U.
[eq. 7.2] f(x_1,y_1) = f(x,y) + U f\ \alpha + \frac{1}{2!}U^2 f \ \alpha^2 + \frac{1}{3!}U^3 f \ \alpha^3 + ...
Ok, what function is it that they are expanding?
To me, the natural function to expand would be f(T_x(x,y,\alpha),T_y(x,y,\alpha)). I spent the previous post motivating interest in this function. I suppose it wasn't a waste. Group actions will probably come into play in solving ODEs and Lie group books do mention expanding f(T_x(x,y,\alpha),T_y(x,y,\alpha)) in Taylor series. But I don't yet see how it fits in with this post.
If you expand f(T_x(x,y,\alpha),T_y(x,y,\alpha)) in Taylor series about \alpha = 0, the first two terms come out to be the desired f(x,y) + U f. After that, things go wrong. I think the function that the old time books actually expand is f(x + \alpha\ u_x(x,y), y + \alpha\ u_y(x,y)). Page 3 of the PDF of
http://deepblue.lib.umich.edu/handle/2027.42/33514 says this. Emmanuel doesn't make it clear.
Why would they want to expand that function? Notice that they are expanding a function that is itself an approximation. A linear approximation of T(x,y,\alpha) using first derivatives is:
T_x(x,y,\alpha) \approx T_x(x,y,0) + \alpha D_\alpha T_x(x,y,\alpha)_{\ |\alpha = 0} = x + \alpha\ u_x(x,y)
T_y(x,y,\alpha) \approx T_y(x,y,0) + \alpha D_\alpha T_y(x,y,\alpha)_{\ |\alpha = 0} = y + \alpha\ u_y(x,y)
So f(x + \alpha u_x(x,y), y + \alpha u_y(x,y)) is a linear approximation for f(T_x(x,y,\alpha),T_y(x,y,\alpha) ).
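Here is a symbolic check of both claims, under my stand-in choices of the rotation group and a sample f: the expansions of f(T_x,T_y) and of the linear approximation agree at orders 0 and 1 in \alpha, then part ways.

```python
# Compare the alpha-expansion of f(T_x, T_y) with that of the linear
# approximation f(x + alpha u_x, y + alpha u_y).  The rotation group and
# the sample f are my own choices, not from the text.
import sympy as sp

x, y, a = sp.symbols('x y alpha')
T_x = x*sp.cos(a) - y*sp.sin(a)
T_y = x*sp.sin(a) + y*sp.cos(a)
u_x, u_y = -y, x                       # D_alpha T at alpha = 0

F = lambda X, Y: X**2                  # sample f, arbitrary
exact  = F(T_x, T_y)                   # f(T_x, T_y)
approx = F(x + a*u_x, y + a*u_y)       # f of the linear approximation

# Coefficient of alpha^k in the Taylor expansion about alpha = 0
coeff = lambda e, k: sp.diff(e, a, k).subs(a, 0)/sp.factorial(k)

print(sp.simplify(coeff(exact, 1) - coeff(approx, 1)))  # 0: first-order terms agree
print(sp.simplify(coeff(exact, 2) - coeff(approx, 2)))  # -x**2: second-order terms differ
```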
After all, suppose a second semester calculus student came up to you and said "I want to approximate f(g(x)) using a Taylor series. Would it be all right if I just expand f(g(0) + x g'(0)) instead? That would make the differentiation simpler."
You might say "You know, your eyes aren't quite focusing at the same spot. Have you ever suffered a serious head injury?"
One guess is that the old time books are thinking that the linear approximations are exactly correct when \alpha is the "infinitesimally small" \delta \alpha, so if we expand the approximation in Taylor series and keep track of infinitesimals correctly, something useful will come out. The above linear approximations with an infinitesimal value for \alpha might be the "infinitesimal transformation", but I need to think about that more. I just want to get the expansion over with.
We assume the existence of all derivatives involved. Use the "D" notation for differentiation.
[eq. 7.3] f(x + \alpha u_x(x,y), y + \alpha u_y(x,y)) =
f(x + \alpha u_x(x,y), y + \alpha u_y(x,y))_{\ |\alpha = 0}
+ D_\alpha f(\ x + \alpha u_x(x,y), y + \alpha u_y(x,y)\ )_{\ |\alpha=0}\ \alpha
+ \frac{1}{2!} D^2_\alpha f(\ x + \alpha u_x(x,y), y + \alpha u_y(x,y)\ )_{\ |\alpha=0}\ \alpha^2
+ \frac{1}{3!} D^3_\alpha f(\ x + \alpha u_x(x,y), y + \alpha u_y(x,y)\ )_{\ |\alpha=0}\ \alpha^3
+ ...
The first term on the right hand side is obviously f(x + (0)u_x(x,y), y + (0)u_y(x,y)) = f(x,y).
Working out the differentiation needed for the second term can get confusing because of the traditional notation for partial derivatives. I'll digress to illustrate this. If you define a function by saying w(A,B) = A^2 + B and you set about to do the differentiation D_\theta w(g(x,\theta),h(x,\theta)), you don't have a problem expressing this as:
D_\theta w(g(x,\theta), h(x,\theta)) = \frac{\partial w}{\partial A} \frac{\partial g}{\partial \theta} + \frac{\partial w}{\partial B} \frac{\partial h}{\partial \theta}
= 2A_{|_{A=g(x,\theta)}} \frac{\partial g}{\partial \theta} + (1)\frac{\partial h}{\partial \theta} = 2 g(x,\theta) \frac{\partial g}{\partial \theta} + \frac{\partial h}{\partial \theta}
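This computation can be sanity-checked with sympy, using concrete (and arbitrary, my own) choices for g and h:

```python
# Sanity check of the digression's chain-rule computation for
# w(A, B) = A^2 + B, with sample choices of g and h (my own).
import sympy as sp

x, th = sp.symbols('x theta')
g = x*th          # g(x, theta), arbitrary
h = x + th        # h(x, theta), arbitrary

w = g**2 + h                                 # w(A, B) = A^2 + B at A = g, B = h
lhs = sp.diff(w, th)                         # D_theta w(g, h), done directly
rhs = 2*g*sp.diff(g, th) + sp.diff(h, th)    # 2 g g_theta + h_theta, per the chain rule
print(sp.simplify(lhs - rhs))                # 0
```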
However, suppose you were unlucky enough to have stated the definition of w as w(x,y) = x^2 + y. Then the analogous calculation begins:
D_\theta w(g(x,\theta), h(x,\theta)) = \frac{\partial w}{\partial x} \frac{\partial g}{\partial \theta} + \frac{\partial w}{\partial y} \frac{\partial h}{\partial \theta}
That notation only makes sense to someone who understands that \frac{\partial w}{\partial x} means "the derivative of w with respect to the first of its two arguments" instead of "the derivative of w with respect to x no matter where the x appears in the expression".
We are in an unlucky situation because the natural way to define f is as f(x,y), and we want to differentiate an expression where some functions involving x are put into both arguments of f.
So let's temporarily state the function f as:
f = f(A,B)
A = x +\alpha\ u_x(x,y)
B = y +\alpha\ u_y(x,y).
Then the notation for the result is:
D_\alpha f(x + \alpha u_x(x,y), y + \alpha u_y(x,y)) = \frac{\partial f}{\partial A} u_x(x,y) + \frac{\partial f}{\partial B} u_y(x,y)
Set \alpha = 0 and this gives:
D_\alpha f(x + \alpha u_x(x,y), y + \alpha u_y(x,y))_{\ |\alpha = 0} = \frac{\partial f}{\partial A}_{\ |(x,y)} u_x(x,y) + \frac{\partial f}{\partial B}_{\ |(x,y)} u_y(x,y)
i.e. all evaluations take place at the point (x,y).
With the understanding that \frac{\partial f}{\partial x} will mean "the partial derivative of f with respect to its first argument", we can replace \frac{\partial f}{\partial A} with \frac{\partial f}{\partial x}. Similarly we can replace \frac{\partial f}{\partial B} with \frac{\partial f}{\partial y}
Doing that and changing the order of factors we get
D_\alpha f(x + \alpha u_x(x,y), y + \alpha u_y(x,y))_{\ |\alpha = 0} = u_x(x,y) \frac{\partial f}{\partial x}_{\ |(x,y)} + u_y(x,y) \frac{\partial f}{\partial y}_{\ |(x,y)}
= U f
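This result can be checked symbolically, with u_x, u_y and f all left generic (a sketch, not part of the text):

```python
# Symbolic check: D_alpha f(x + alpha u_x, y + alpha u_y) at alpha = 0
# comes out to U f, with u_x, u_y, f all generic functions.
import sympy as sp

x, y, a = sp.symbols('x y alpha')
u_x = sp.Function('u_x')(x, y)   # generic infinitesimal elements
u_y = sp.Function('u_y')(x, y)
f = sp.Function('f')

expr = f(x + a*u_x, y + a*u_y)

# First alpha-derivative, evaluated at alpha = 0
d1 = sp.diff(expr, a).subs(a, 0).doit()

# U f as in eq. 7.1
Uf = u_x*sp.diff(f(x, y), x) + u_y*sp.diff(f(x, y), y)

print(sp.simplify(d1 - Uf))   # 0
```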
I'm not going to give a formal proof of eq. 7.2, but I am going to work out the third term since it is the one that shows I'm expanding the correct function.
The differentiation involved is:
D^2_\alpha f(x + \alpha u_x(x,y), y + \alpha u_y(x,y)) = D_\alpha( \frac{\partial f}{\partial A} u_x(x,y) + \frac{\partial f}{\partial B} u_y(x,y) )
Remembering that each of \frac{\partial f}{\partial A} and \frac{\partial f}{\partial B} has two arguments (A,B) we have:
= \frac{\partial}{\partial A}( \frac{\partial f}{\partial A} u_x(x,y) )\ u_x(x,y) + \frac{\partial}{\partial B}(\frac{\partial f}{\partial A} u_x(x,y))\ u_y(x,y) + \frac{\partial}{\partial A}( \frac{\partial f}{\partial B} u_y(x,y) )\ u_x(x,y) + \frac{\partial}{\partial B}(\frac{\partial f}{\partial B} u_y(x,y))\ u_y(x,y)
= \frac{\partial^2 f}{\partial A^2} u_x^2(x,y) + \frac{\partial^2 f}{\partial B \partial A} u_y(x,y) u_x(x,y) + \frac{\partial^2 f}{\partial A \partial B} u_x(x,y) u_y(x,y) + \frac{\partial^2 f}{\partial B^2} u_y^2(x,y)
The functions derived from f are each evaluated at (x + \alpha\ u_x, y + \alpha\ u_y). (Taking an additional partial derivative is what produces the additional factors of u_x and u_y, by the chain rule.)
Setting \alpha = 0 evaluates the functions of f at (x,y).
With the understanding that \frac{\partial}{\partial A} can be denoted \frac{\partial}{\partial x} etc. we have:
= u_x^2(x,y) \frac{\partial^2 f}{\partial x^2} + u_y(x,y) u_x(x,y) \frac{\partial^2 f}{\partial y \partial x} + u_x(x,y) u_y(x,y) \frac{\partial^2 f}{\partial x \partial y} + u_y^2(x,y) \frac{\partial^2 f}{\partial y^2}
The above expression is equal to U^2 \ f, which is:
U^2 f = ( u_x(x,y)\frac{\partial}{\partial x} + u_y(x,y)\frac{\partial}{\partial y} )^2 f
\ \ \ = ( u_x^2(x,y)\frac{\partial^2}{\partial x^2} + u_x(x,y) u_y(x,y)\frac{\partial}{\partial x}\frac{\partial}{\partial y} + u_y(x,y) u_x(x,y)\frac{\partial}{\partial y}\frac{\partial}{\partial x} + u_y^2(x,y)\frac{\partial^2}{\partial y^2} )\ f
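Finally, a symbolic check (with the rotation group and a concrete sample f, both my own choices) that the \alpha^2 derivative of f(x + \alpha u_x, y + \alpha u_y) matches this expanded form of the square, combining the two equal mixed partials into one term:

```python
# Check that D^2_alpha f(x + alpha u_x, y + alpha u_y) at alpha = 0 equals
# u_x^2 f_xx + 2 u_x u_y f_xy + u_y^2 f_yy (equal mixed partials combined).
# The rotation group and the sample f are assumed examples, not the text's.
import sympy as sp

x, y, a = sp.symbols('x y alpha')
u_x, u_y = -y, x                        # rotation group again
F = lambda X, Y: X**2*Y + Y**3          # sample f

expr = F(x + a*u_x, y + a*u_y)
term2 = sp.diff(expr, a, 2).subs(a, 0)  # D^2_alpha at alpha = 0

f = F(x, y)
U2f = (u_x**2*sp.diff(f, x, 2) + 2*u_x*u_y*sp.diff(f, x, y)
       + u_y**2*sp.diff(f, y, 2))
print(sp.expand(term2 - U2f))           # 0
```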