# Using Lie Groups to Solve & Understand First Order ODE's

Hey guys, I'm really interested in finding out how to deal with differential equations from the point of view of Lie theory, just sticking to first-order, first-degree equations to get the hang of what you're doing.

What do I know as regards Lie groups?

Solving separable equations somehow exploits the fact that the constant of integration $$C \ = \ y \ - \ \int f(x) dx$$ is a one-parameter group mapping solutions into solutions, & further that the method of change of variables is (apparently?) nothing more than a method of finding a coordinate system in which a one-parameter group of translations/rotations/...??? is admitted so that separation of variables is possible (not sure if that's only what SoV is good for, that just seems to be the implication!).
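To make that first point concrete, here's a minimal sympy sketch (the choice f(x) = cos x is my own illustration, not anything from the thread): if y(x) solves y' = f(x), then so does the translate y(x) + ε, so the constant of integration really does generate a one-parameter group of translations mapping solutions into solutions.

```python
import sympy as sp

x, eps = sp.symbols('x eps')
f = sp.cos(x)                 # illustrative choice of f(x)
y = sp.integrate(f, x)        # one particular solution, y = sin(x)
y_shifted = y + eps           # translated solution y -> y + eps

# Both satisfy y' - f(x) = 0, so the translation maps solutions to solutions.
assert sp.simplify(sp.diff(y, x) - f) == 0
assert sp.simplify(sp.diff(y_shifted, x) - f) == 0
```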

Solving Euler-Homogeneous equations somehow exploits the fact that the differential equation y' = f(y/x) admits a group of scalings, T(x,y) = (ax,ay), as in this link (bottom of page 23), thus because of this one can use Lie theory to solve these equations as well.
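Here's a hedged sympy check of that scaling claim, with f left as an arbitrary symbolic function: substituting (x, y) -> (ax, ay) into the right-hand side of y' = f(y/x) leaves it unchanged, and since dy/dx is also scale-invariant, the equation admits the scaling group.

```python
import sympy as sp

x, y, a = sp.symbols('x y a', positive=True)
f = sp.Function('f')

rhs = f(y / x)                 # right-hand side of y' = f(y/x)
scaled = rhs.subs({x: a * x, y: a * y}, simultaneous=True)

# The a's cancel inside f, so the ODE is invariant under T(x,y) = (ax, ay).
assert sp.simplify(scaled - rhs) == 0
```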

I've tried to teach myself this material a while ago & failed, built up a bit of a mental block when trying again & failed again, went off asking grad students & professors who hadn't come across much of this material, & so am now here with another attempt. Basically all I need is for someone to explain what's going on with Lie groups in general, in light of what I've said I know about them, & to kind of give the intuition behind what the general process is, how powerful it is etc... I was thinking maybe along the lines of the first chapter of this book, but whatever you think really, would just be good to have someone to ask questions of who knows this stuff!

bigfooted
I'm not an expert but I do have a handful of books on symmetry analysis. They should be available in the (university) library and the first two mentioned below I found suitable for self-studying.
Peter Hydon (1) has written a very nice introduction to symmetry analysis. It is a very compact book and only covers some aspects, but especially the first couple of chapters are a good read. The book of Hans Stephani (2) is also very good and he treats some subjects in more detail. I like the way they explain things. The book of Bluman and Kumei (3) is a very important classic but I find it harder to read, especially for self-studying. It's also a more proof-based mathematics book, which makes it more difficult to read if you want to understand all the proofs. I highly recommend studying these books in this order: 1->2->3 or 2->3 or maybe just 2.

The idea is that a (local point) symmetry of an ODE will allow you to reduce its order, and for first order the ODE can be reduced to quadrature. Unfortunately, for first order ODE's there is no systematic way of finding a symmetry, even though you can prove that infinitely many exist.
But you can work backwards: You can find out what the most general ODE is that has a certain symmetry.
So you classify first order ODE's and use a transformation based on the known symmetry of that ODE class, like what you would do for a Bernoulli ODE for instance.
It gets more complicated for the Riccati ODE, but you've probably noticed that already (to find the symmetry you have to solve the Riccati ODE first).
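As a concrete instance of the "known transformation for an ODE class" idea for Bernoulli equations, here's a sympy sketch (the example y' + y = y², i.e. p = q = 1, n = 2, is my own illustrative choice): the classical substitution v = y^(1-n) turns the Bernoulli equation into a linear ODE, which we solve & transform back.

```python
import sympy as sp

x, C = sp.symbols('x C')
v = sp.Function('v')

# For y' + y = y^2, the substitution v = 1/y gives the linear ODE v' - v = -1.
lin = sp.Eq(v(x).diff(x) - v(x), -1)
v_sol = sp.dsolve(lin).rhs                       # v = C1*exp(x) + 1
y_sol = 1 / v_sol.subs(sp.Symbol('C1'), C)       # transform back: y = 1/v

# Verify y = 1/(C*e^x + 1) solves the original nonlinear equation.
residual = y_sol.diff(x) + y_sol - y_sol**2
assert sp.simplify(residual) == 0
```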

Edgardo Cheb-Terrab has written a number of papers, some of them are online on arxiv. He has written a large part of the ODE solvers for the Maple software package, based on symmetry analysis. They are very good papers and he explains very well how you can use symmetry analysis to systematically solve ODE's.

I was also shocked to learn that 1. there is such a thing as symmetry analysis and 2. that nobody seems to know about it. It is the most powerful tool for solving nonlinear ODE's (for linear ODE's we have differential Galois theory) and it is the tool that connects all ODE solving 'tricks'. One ring to bring them all and in the darkness bind them. But maybe Tolkien wasn't talking about Lie rings... hmmm...

See this thread: https://www.physicsforums.com/showthread.php?t=598191&highlight=groups The thread didn't explain things concretely enough to suit me and nobody took me up on going through Emanuel's book.
Ages ago I read the first few chapters of Emanuel & that is the reason I'm creating this thread - it's the one that gave me the mental block!

Thanks for the links, I'd checked all these books (& more) after reading Emanuel but the mental block was too much for me back then & I had so much other stuff to do that this side-project was eventually put on hold. Maybe if we go ahead with this idea for reading Emanuel & you see any little things we post that can be added to by referencing one of the books you've mentioned that would be great (but knowing me I'll need to refer to one or another of them soon enough :p)

Stephen Tashi
I haven't read Emanuel's book - being a retired guy, I'm very busy and I only attend to those parts of the World where it's willing to supply me with adequate motivation. Yes, I'd like to go through the book and post in a thread about it. I don't know if we should use this thread to do it. I think there is a section of the forum dedicated to particular books - however, I don't usually visit it.

The only mental block I have about concrete treatments of Lie groups is that they all use the same time honored notation for the 2D case and I don't like the letters they use. I'd prefer to see a notation that uses subscripts to show whether a thing applies to the x or y coordinate.

Great stuff! While I think the textbook section is more just general discussion about the books, if a mod wants to transfer our posts from here into a thread on Emanuel's book that's cool with me - we can't do it ourselves as we can't create threads in that part of the forum.

I think the best way to do this is to use the Feynman method, i.e. act as if you were teaching someone else the theory. One way we could do this is by writing up a chapter, chapter by chapter, & adding our own thoughts, ideas, questions etc... Another way we could do it is if one writes something up & posts it, the other can just add their comments on it, & take turns or whatever. Another way is to use two different but similar books & write up our thoughts on each (i.e. Emanuel & Cohen, since Emanuel says he follows Cohen closely, but Cohen is so old that it's bound to contain clarity!). Whatever you think really, I'm open to none, all or more suggestions :tongue:

Stephen Tashi
Let's begin without a plan. I'll find where I put the book tomorrow and post something about it. I think I have the old Cohen book somewhere too. It may take longer to find.

Tonight, I'll just post some uninformed speculation. Maybe some other forum member will offer to reform me.

I gather that the high class way to think of physics is to think of a "phase space" (if that's the right term.)

A low class way to think of the 1-dimensional "falling body" is to think of it as one particular problem. In that way of thinking, we are given the mass of the body, the position of the body at some time (usually t = 0), the velocity of the body at some time and (assuming a constant acceleration due to gravity) we solve a simple differential equation by doing integration and find the formula for the position and velocity of the body at subsequent times. Since the physics is reversible we can also find the position and velocity of the body at previous times.

The high class way to think of the falling body problem is to think of a space (m,x,v,t) consisting of all possible falling body problems. In general, different falling body problems have different answers. But there will be sets of problems in this space that have the same answer. For example there will be some point (m1,x1,v1,t=0) that has the same answer ("answer" = formulas for x and v) as the point (m2=m1,x2,v2,t=5) because the answer for (m1,x1,v1,t=0) will predict that the state of the body at time t=5 will be (m1,x2,v2,t=5).

From this abstract point of view, we can define a transformation of the space into itself that is a function of one parameter, namely time. We let U(T) be the transformation that sends (m,x1,v1,t) to the point where the body is predicted to be at time T later. So U(T) "acting" on (m,x1,v1,t) = (m,x2,v2,t+T), where x2 and v2 are the position and velocity predicted for t + T by the answer to the falling body problem with initial conditions (m,x1,v1,t).

I think the transformations U(T) are a 1-parameter group. U(0) is the identity transformation. The multiplication U(T1)U(T2) is interpreted as applying U(T2) first and then applying U(T1) to its result; this amounts to the same thing as U(T1+T2). U(T) has the inverse transformation U(-T). (Some part of this argument must depend on the fact that physics tells us that U(T) is a 1-to-1 transformation on the space; intuitively, this is because a given falling body problem doesn't have two different answers. Associativity is just a matter of definition: U(T1)(U(T2)U(T3)) and (U(T1)U(T2))U(T3) are both defined to amount to applying the transformations in order from right to left.)
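The group laws for U(T) can be checked numerically; here's a rough Python sketch. The state drops the mass m, since it doesn't enter the dynamics, and g = 9.81 is just an illustrative value with x measured upward.

```python
G = 9.81  # illustrative gravitational acceleration, x measured upward

def U(T):
    """Return the transformation that evolves a state (x, v, t) by time T."""
    def act(state):
        x, v, t = state
        return (x + v * T - 0.5 * G * T**2, v - G * T, t + T)
    return act

s0 = (10.0, 2.0, 0.0)
T1, T2 = 1.3, 2.7

# Group laws: composition, identity, inverse (up to float round-off).
composed = U(T1)(U(T2)(s0))
direct = U(T1 + T2)(s0)
assert all(abs(a - b) < 1e-9 for a, b in zip(composed, direct))  # U(T1)U(T2) = U(T1+T2)
assert U(0.0)(s0) == s0                                          # U(0) is the identity
back = U(-T1)(U(T1)(s0))
assert all(abs(a - b) < 1e-9 for a, b in zip(back, s0))          # U(-T) inverts U(T)
```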

So there is a Lie group that is related to differential equations.

Cool, without a plan & uninformed speculation is good with me. That's a nice way to look at a basic physics problem, & just based off it I already see a lot more clearly how Noether's theorem can be understood in terms of Lie groups, i.e. if your problem was invariant under time translations we'd have conservation of energy etc... Obviously one goal of Lie groups for me will be to prove Noether's theorem using them!

strangerep
I'm interested in this (or possibly merely related) topic in the context of finding maximal symmetry groups for dynamical equations in physics. (E.g., lurking therein is a "3rd way" to "derive" special relativity by finding all such dynamical symmetries of the equation of free motion.)

Anyway,... the only textbook I've (partially) studied is this one:

P. J. Olver, "Applications of Lie Groups to Differential Equations", Springer, 2nd Ed.
https://www.amazon.com/dp/0387950001/?tag=pfamazon01-20&tag=pfamazon01-20

I haven't yet looked at the other textbooks mentioned earlier in this thread, so I'd be interested if anyone who's read Olver as well as the others can tell me where Olver fits in the hierarchy? I.e., is Olver more/less difficult than the others? Different/expanded subject range? Etc?

(BTW, I got the feeling from Olver that there's still a lot of open problems and unexplored territory here, since papers continue to be published on the subject. There are some computer programs for finding the equations that must be solved to find the Lie algebra generators -- which is the relatively easy part, imho, -- but the task of solving the resulting coupled PDEs is much more tedious.)

For me I would definitely think Olver is too much - over 130 pages of Lie groups, manifolds, forms etc... all leading up to 6 pages on first order ode's tells me I'll have no idea how to solve any potential type of first order ode after all that. I feel as though I'd be falling foul of my favourite mathoverflow quote:

Knowing that the Riemann-Hilbert correspondence is an equivalence of triangulated categories may feel empowering, but as a matter of technique, it is mere stardust compared with the power of being able to compute the monodromy of a Fuchsian differential equation by hand.
While I'd definitely want to know the subject from the perspective Olver takes, I couldn't do it until I knew the classical way of approaching it, akin to the way I wouldn't like to know ode's on manifolds ala Arnol'd until I'd learned all the classical tricks I could to be sure I could get by.

But this could be even more interesting: if you wanted to study that book in concert with us we could all get the best of both worlds & cover all bases.

strangerep
Well, ok, I'll keep an eye on this thread.

But I don't have my own copy of Emanuel, and the price for a new copy is a bit steep: around USD 158 on Amazon. The vendors offering "used" copies at more reasonable prices don't offer international shipping. So I'll just have to follow along without what extracts Amazon and Google Books will let me read online.

Stephen Tashi
Solution of ODEs by Continuous Groups by George Emanuel

Let's start simple.

Meditation 1 "Differential Equations"

Chapter 1, p3:
The idea of separation of variables is quite simple. Suppose we have a first-order ODE in its most general form
$$f(x,y,dy/dx) = 0$$
where $f$ is an arbitrary function of its three arguments. If this equation can be written as
$$M(y)dy = N(x) dx$$
where $M$ is a function only of $y$ and $N$ is a function only of $x$, then there is a solution
$$\int{M\ dy} = \int{N \ dx }\ + \ a$$ where $a$ is a constant of integration.
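A quick sympy illustration of this quadrature, with the illustrative choices M(y) = y and N(x) = x (so y dy = x dx integrates to y²/2 = x²/2 + a):

```python
import sympy as sp

x = sp.Symbol('x')
y = sp.Function('y')

M = lambda w: w   # M(y) = y, illustrative choice
N = lambda w: w   # N(x) = x, illustrative choice

# Solve M(y) dy/dx = N(x); dsolve returns the explicit branches
# y = +/- sqrt(x^2 + C1) of the quadrature y^2/2 = x^2/2 + a.
sol = sp.dsolve(sp.Eq(M(y(x)) * y(x).diff(x), N(x)))
sols = sol if isinstance(sol, list) else [sol]

# Each branch satisfies the original ODE.
residuals = [sp.simplify(M(s.rhs) * s.rhs.diff(x) - N(x)) for s in sols]
assert all(r == 0 for r in residuals)
```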

The Leibnitz notation is an impediment to understanding things precisely. It's difficult to answer simple questions about what it means. For example:

Functions have domains and ranges and equations have solution sets. An equation is a propositional function whose range is the set of two values {True, False}. A solution to an equation is a value in the domain of the propositional function that makes the function return True.

$$f(x,y,dy/dx) = 0$$
where $f$ is an arbitrary function of its three arguments.
If a solution to a differential equation is a function, why does this differential equation have a "function of its three arguments"? Shouldn't it be a propositional function whose domain is a set of functions, each solution represented by a single argument?

$$M(y)dy = N(x) dx$$
Is this a propositional function whose domain is a set of functions or is its domain a set of pairs of functions $x$ and $y$ ?

I wrote answers to a few such questions. This topic might be too elementary to be of interest, so I won't post those thoughts now.

Meditation 2: "Separation Of Variables"

"Separation Of Variables" is defined on page 3 as success in manipulating the differential equation into a certain form. I'm used to the context where "separation of variables" means expressing a function $f(x,y)$ as $h(x) g(y)$. This context involves a function of two variables. So can we relate this context to the "separation of variables" method in manipulating ODEs?

A purely mathematical digression is the question of whether there is a useful way to define a more general "separation of variables". For example, if we can write $f(x,y)$ as the sum of two functions $r(x) + s(y)$ then we have, in a manner of speaking, separated the variables. A generalized definition would be "A separation of variables of the function $f(x,y)$ of two variables is a binary operation $B$, a function $g(x)$ of $x$ alone and a function $h(y)$ of $y$ alone such that $f(x,y) = B(g(x),h(y))$". I wonder if that leads to anything interesting.
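One small observation on that generalized definition: with B = addition, an additively separated f(x,y) = r(x) + s(y) becomes multiplicatively separated after exponentiating, since exp(f) = e^{r(x)} e^{s(y)}. A tiny sympy sketch (r and s are my own illustrative choices):

```python
import sympy as sp

x, y = sp.symbols('x y')
r = x**2          # illustrative choice of r(x)
s = sp.sin(y)     # illustrative choice of s(y)
f = r + s         # additively separated f(x, y)

# expand() splits the exponential of a sum, exposing the
# multiplicative separation exp(f) = exp(r(x)) * exp(s(y)).
assert sp.expand(sp.exp(f)) == sp.exp(r) * sp.exp(s)
```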

Meditation 3: The Book In A Pea Shell

The chapter gives a summary of the content of the book. It says that if a differential equation is "invariant" under the transformations defined by a (continuous) group then this reveals what substitutions we can make to transform the differential equation into a separable differential equation. If it teaches me that, I'll be happy.

On Meditation 1: "Differential Equations"

Since we're mainly working with functions of only one or two variables, there are three possible representations (explicit, implicit & parametric) we're going to have to be fluent with in dealing with this stuff. There's no real issue of one being more fundamental than the others as far as differential equations are concerned, & I can think of situations where we're gonna need all three... Furthermore I think you're using a definition of function used in logic, whereas these definitions are actually valid if you think in terms of axiomatic set theory & not dumbed-down or high-school bastardizations of concepts, something I can justify if you really want to get into the nitty gritty :tongue:

As a consequence of this perspective of functions from three viewpoints we can understand the solution of differential equations of the form

$$\frac{dy}{dx} \ = \ f(x,y)$$

as finding an explicit function that acts as a solution, & solving

$$M(x,y)dx \ + \ N(x,y)dy \ = 0$$

as finding an implicit (one-parameter family of) function(s) that acts as a solution. The craziest implication of this, however, is given under "Lesson 7: Stay away from differentials" in this essay, which berates the Leibniz notation pretty badly yet illustrates the deep relationship of the parametric perspective of functions to the other two & offers an interpretation of what the notation actually means via trajectories & vector fields - thus all three methods have some real value! In fact, already just by thinking in terms of different representations of functions we've derived a geometric interpretation of integrating factors! Let's see if we can use this Lie theory thing to shed any light on this picture, or get a Lie-theoretic version of it.

On Meditation 2: "Separation of Variables"

In the context of ODE's, separation of variables is literally always defined either via the explicit representation, stating that y' = f(x,y) is separable if f(x,y) = g(x)h(y), or via the implicit representation, stating that M(x,y)dx + N(x,y)dy = 0 is separable if it can be written as A(x)B(y)dx + C(x)D(y)dy = 0. Emanuel acknowledges this by stating that the general first order ode f(x,y,y') = 0 is separable if it can be reduced to the form M(x,y)dx + N(x,y)dy = 0. Now it's obviously abuse of notation to write dx & dy terms, & we should be phrasing everything in terms of differential forms if we want to be rigorous, but it's an abuse of notation that, according to the article I linked to above, actually encodes the parametric definition of a function within it, & is extremely useful when deriving integrating factors allowing us to solve ode's, thus we'll have to live with it :tongue:
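For what it's worth, sympy can test the explicit-form criterion f(x,y) = g(x)h(y) directly via its separatevars function; a small sketch with two illustrative right-hand sides:

```python
import sympy as sp

x, y = sp.symbols('x y')

f_sep = x * y + x     # = x * (y + 1): separable as g(x) h(y)
f_nonsep = x + y      # not expressible as a product g(x) h(y)

# separatevars returns a dict of factors on success, None on failure.
assert sp.separatevars(f_sep, dict=True, symbols=(x, y)) is not None
assert sp.separatevars(f_nonsep, dict=True, symbols=(x, y)) is None
```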

As regards separation of variables having any general kind of definition, page one of this paper & the links therein should indicate there isn't a finished definition yet; however there are books on applying Lie theory to pde's, which I'm using this thread to work towards, that approach the topic of separation of variables as best as is theoretically possible (as far as I know, more on this later!). I'd imagine your mentioning the additive separation of variables is motivated by something like the additive method used in the Hamilton-Jacobi equation, that's the only place I've ever seen it used so far. I have no idea if it would work for ode's, so if you can find an example of it working I'd love it!

I'll try & get back with something more substantive asap.
Cool, definitely do keep an eye on it. Emanuel's book is extremely similar to the book by Cohen I linked to, so that's a good option if you're interested.

Stephen Tashi
On Meditation 1: "Differential Equations"

Since we're mainly working with functions of only one or two variables, there are three possible representations we're going to have to be fluent with in dealing with this stuff
Those approaches are already too imprecise to satisfy a stickler like me. I'll post a thread in the General Math section (someday) about how to precisely define a "differential equation" instead of digressing on it here.

On Meditation 2: "Separation of Variables"

In the context of ODE's, separation of variables is literally always either defined via the explicit representation as stating that y' = f(x,y) is separable if f(x,y) = g(x)h(y), or in the implicit representation as stating that M(x,y)dx + N(x,y)dy = 0 is Separable if M(x,y)dx + N(x,y)dy = 0 = A(x)B(y)dx + C(x)D(y)dy = 0.
Let's show the two definitions are equivalent - if they are.

I like Rota's paper http://www.ega-math.narod.ru/Tasks/GCRota.htm that you linked. I don't know how to view equations like $M(x)dx + N(y)dy = 0$ in the context of differential forms.

As regards separation of variables having any general kind of definition, page one of this paper & the links therein should indicate there isn't a finished definition yet
I don't understand that paper, but I do understand that my definition is an utter failure. The problem with mine is that any function f(x,y) of two variables "is" a binary operation: namely, it can be used to define the binary operation B(x,y) = f(x,y). Thus f(x,y) = B(h(x), g(y)) where h and g are both the identity function. Perhaps the search for a good generalization of "separation of variables" must focus on using "simple" binary operations, however we can define those.

Since the group invariance is going to reveal the proper substitutions to make, it would be useful to understand if using the technique of substitution just amounts to changing coordinates. Does it? Or are there some technicalities?

Stephen Tashi
Solution of ODEs by Continuous Groups by George Emanuel

Chapter 2 Continuous One-Parameter Groups-I

Meditation 4: Group Concept

[ Emanuel doesn't explain many of the concepts about groups that are emphasized in a course on group theory, so apparently they aren't needed. I'll digress to cover a few of them, as a review for my own sake.]

I memorized the chant "closed, associative, identity, inverse" when I first encountered groups.

I prefer to think of a group as set of 1-to-1 functions from some set (or "space") onto itself. The group operation is composition of functions. So the group operation, which we denote as if it were multiplication $(f)(g)$, is $f(g(x))$. (Sometimes people prefer to define it "backwards", so that $(f)(g)$ means $g(f(x))$. Let's not do that. )

The mathematical definition of a group is more abstract than this way of thinking. A group has a set of elements and these can be arbitrary things - they don't have to be functions. A group has a binary operation defined on it that need not be defined using the composition of functions. "Closed, associative, identity, inverse" is a chant for remembering what properties the set and the operation must satisfy.

Emanuel's approach is to state the abstract definition of a "group" and then focus his attention on "groups of transformations". "Transformation" is just another word for "function", so thinking of a group as a set of functions is consistent with his approach.

Also there is a sense in which nothing is lost by thinking of groups as sets of functions. A result called "Cayley's Theorem" says that any abstract group can be exactly imitated by some group of functions that are 1-to-1 mappings of some set onto itself. Of course it doesn't actually say "exactly imitated by", it says "is isomorphic to", but I haven't defined "is isomorphic to" yet in this article.

Thinking of the group operation as the composition of functions makes it obvious that the operation (even though it is customarily called "multiplication") need not be commutative. It's clear that $f(g(x))$ and $g(f(x))$ can be different functions.

To be a group, a set of 1-to-1 functions $G$ can't be just any arbitrary set of 1-to-1 functions. There have to be enough of them to satisfy the properties of a group.

"Closed": if $f$ and $g$ are any two functions in $G$ then $f(g(x))$ also must be in it. Notice the condition that the functions of $G$ map "some space onto itself" is important. If both $f$ and $g$ mapped apples to oranges then $f(g(x))$ wouldn't be defined since it gives $f$ the job of mapping an orange to something.

"Associative": This holds for composing functions. If you think about what is done to evaluate $((f)(g))(h)$ vs $(f)((g)(h))$, you see that the only choice in both cases is to apply $f,g,h$, working from right to left, so to speak. It doesn't matter whether the functions are mappings of the real numbers, or points in 2-D space etc., you still apply the functions in that order. You begin by finding $h(x)$ then do $g(h(x))$ then do $f(g(h(x)))$.

"Identity": $G$ must contain the identity function. One important consequence of this is that if you have the thought "Let's divide the group $G$ into two non-overlapping smaller groups", you are out of luck. The identity is a unique element of $G$. (This can be proven.) So if you divide $G$ into two non-overlapping sets, only one of them has the identity in it. The other one can't be a group.

"Inverse": For any function $f$ in $G$, $G$ must also contain $f^{-1}$, the inverse function of $f$. Since we're assuming 1-to-1 functions, there is no problem with the existence of $f^{-1}$, but you must make sure $G$ contains $f^{-1}$.

A cheap way to create a group is to pick a set $\Omega$ and say "Let $S$ be the group consisting of all 1-to-1 functions that map $\Omega$ onto itself". (Sophisticated people will understand that you mean the group operation to be defined as the composition of functions.)

The cheap way makes it easy to verify that the set of functions satisfies "closed, associative, identity, inverse". For example, if $f$ and $g$ are 1-to-1 functions in $S$ then $f(g(x))$ is defined and also a 1-to-1 mapping from $\Omega$ onto itself. So $f(g(x))$ must be in $S$ since we said $S$ contains all such 1-to-1 functions. Thus you have the "Closed" property handed to you.

If there are a finite number of elements in a group we say the group is a "finite group". If we have a finite group of functions, the term "finite group" means that the group has a finite number of functions. It doesn't imply that the domain and range of the functions is finite in any respect. It also doesn't mean that the functions are bounded in some way.

I mentioned that we lose nothing by thinking of a group as a set of 1-to-1 functions that map some space $\Omega$ onto itself. For finite groups there is a more specific result:

Any finite group $G$ is exactly imitated by some group of functions that are 1-to-1 mappings of a finite set $\Omega$ onto itself.

Note: this doesn't say you must use the group of all possible 1-to-1 mappings of $\Omega$ onto itself. It's possible to have a group of 1-to-1 functions mapping $\Omega$ onto itself that has fewer than "all possible" such functions.

For example, let $S$ be the group of all possible 1-to-1 mappings of the set $\{\ 1,2,3,4 \ \}$ onto itself. ( $S$ is called the "symmetric group" on that set.) Let $H$ be the set of 1-to-1 functions of the set $\{\ 1,2,3,4 \ \}$ onto itself that map the element $4$ to itself. It turns out that the functions in $H$ also form a group. $H$ has fewer functions in it than $S$.
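The example of H can be checked by brute force in Python; here's a sketch that represents each permutation as a dict and verifies the "closed, identity, inverse" parts of the chant (associativity is automatic for composition of functions):

```python
from itertools import permutations

domain = (1, 2, 3, 4)

# S: every bijection of {1,2,3,4} onto itself; H: those that fix 4.
S = [dict(zip(domain, img)) for img in permutations(domain)]
H = [f for f in S if f[4] == 4]

def compose(f, g):
    """The group operation f*g, meaning f(g(x))."""
    return {k: f[g[k]] for k in domain}

identity = {k: k for k in domain}

assert all(compose(f, g) in H for f in H for g in H)              # closed
assert identity in H                                              # identity
assert all(any(compose(f, g) == identity for g in H) for f in H)  # inverse
assert len(S) == 24 and len(H) == 6   # H has fewer functions than S
```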

I'll continue this meditation in another post and define "exactly imitated". I conclude this post by telling you about some disagreeable stuff we've skipped.

One of the painful adjustments that students must make in group theory is to learn a new definition for "permutation". A permutation (in group theory) is defined to be a 1-to-1 function of a set onto itself. So, in group theory, a permutation is no longer "an arrangement of n distinct objects". Since a permutation is a function, we can talk about multiplying permutations together, because we can compose two functions, and multiplying two permutations is defined as composing them as functions.

The poor student who is longing for the days when a permutation was an arrangement of things gets further confused by the shorthand notation used to describe a permutation as a function. The notation somewhat looks like it gives an arrangement of things, but if you try to interpret it that way you get hopelessly lost.

The compensation for this is that the student can rephrase the result about finite groups given above to sound more imposing. It becomes:

Any finite group can be exactly imitated by some permutation group on a finite set $\Omega$.

We just used the phrase "permutation group on a finite set $\Omega$" instead of saying "group of 1-to-1 functions of a finite set $\Omega$ onto itself".

Judging off your explanation of permutations I see that you've read this Arnold essay. If anybody reads this thread, has read Arnold's ODE's book & is interested in contributing, it would be amazing if they could explain in a bit of detail how Arnold's exposition of Lie theory in there relates to what we're doing, or any other advanced book really.

Cohen Ch.1 Part a: "Transformations"

Structure of Cohen's Book:
Cohen's book is titled "An Introduction to the Lie Theory of One-Parameter Groups". The introduction says that a knowledge of ode's is not strictly necessary for this book thus it should be fine for people who are only learning ode's to read along - hopefully this thread will make it easier for people learning ode's! My favourite thing about this book is that the intro says it retains Lie's original proofs & mode of presentation to a large extent! The hope is that translating this stuff to manifolds will become a simple exercise in formalism & notation. It does some basic theory, then first order ode's, some second order ode's, linear first order pde's then more second order ode's.

Structure of Chapter 1:
The chapter is broken up into 11 sections, but really I think there are only two topics discussed, the first being "transformations" & the second being "invariants", thus I'll post on transformations first.

Chapter 1: "Lie's Theory of One-Parameter Groups"
01.01 - Groups of Transformations
Motivation for Lie Group of Transformations
In this motivation section I'll go through Cohen's explanation, point out how it differs from the modern definitions & go through issues of notation etc... I think it'll be fascinating to see the history & see if we all understand it, call me on, or add, anything you can!
Cohen says that a set of transformations constitutes a group if:
"the product of any two transformations is equal to some transformation of the aggregate".
In other words, according to Cohen the transformations

$$x_1 \ = \phi(x_0,y_0,\alpha), \ y_1 \ = \psi(x_0,y_0,\alpha)$$

form a group if given

$$x_2 \ = \phi(x_1,y_1,\beta), \ y_2 \ = \psi(x_1,y_1,\beta)$$

we have that:

$$x_2 \ = \phi(x_1,y_1,\beta) \ = \phi(\phi(x_0,y_0,\alpha),\psi(x_0,y_0,\alpha),\beta) \ = \phi(x_0,y_0,\gamma(\alpha,\beta))$$

$$y_2 \ = \psi(x_1,y_1,\beta) \ = \psi(\phi(x_0,y_0,\alpha),\psi(x_0,y_0,\alpha),\beta) \ = \psi(x_0,y_0,\gamma(\alpha,\beta))$$

He then labels the transformations by Tα etc... & rephrases the above as TβTα = Tγ (actually he does it in the reverse order, which Stephen mentioned we wouldn't use because it sucks!). In a comment he says that φ & ψ are real-valued analytic functions of the variables & further says they are independent w.r.t. x & y, as in they're not functions of each other. Notice though that he doesn't give an actual definition of a group; it's more like he's saying that this set of functions forms a group because of reason X, or it could just be because the book is so old... This definition looks to me like it encodes closure only. He could be relying on the fact that the set of functions satisfies associativity trivially, something that makes a lot of sense since Emanuel stresses the point that we'll never need to check associativity, or he could just be using an earlier definition of a group which actually relies on the structure of the set of functions under composition... It's extremely interesting though that he gives this as his starting point, because this is basically the definition of a one-parameter Lie group of transformations that we'll actually be using, more or less, whereas it seems here that he's actually saying this is the (1911!) definition of a group! I'm not sure; in any case this is just an interesting side-note.
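Cohen's closure condition is easy to check symbolically for the scaling group mentioned earlier in the thread, φ(x,y,α) = αx, ψ(x,y,α) = αy, where the composition law comes out as γ(α,β) = αβ, the identity sits at parameter value 1 & the inverse parameter is 1/α. A sympy sketch:

```python
import sympy as sp

x0, y0, a, b = sp.symbols('x0 y0 a b', positive=True)

phi = lambda xx, yy, p: p * xx   # scaling group: phi(x, y, alpha) = alpha*x
psi = lambda xx, yy, p: p * yy   # psi(x, y, alpha) = alpha*y

# Apply the a-map, then the b-map, per Cohen's closure condition.
x1, y1 = phi(x0, y0, a), psi(x0, y0, a)
x2, y2 = phi(x1, y1, b), psi(x1, y1, b)

gamma = a * b                    # composition law gamma(alpha, beta) = alpha*beta
assert sp.simplify(x2 - phi(x0, y0, gamma)) == 0
assert sp.simplify(y2 - psi(x0, y0, gamma)) == 0

# Identity at parameter 1, inverse at parameter 1/a.
assert phi(x0, y0, 1) == x0
assert sp.simplify(phi(phi(x0, y0, a), y0, 1/a) - x0) == 0
```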

Then he discusses the concept of an inverse & calls a group like the above with an inverse a Lie group. Thus if the transformations
$$x_1 \ = \phi(x_0,y_0,\alpha_0), \ y_1 \ = \psi(x_0,y_0,\alpha_0)$$
can be put in the form
$$x_0 \ = \phi(x_1,y_1,\alpha_1(\alpha_0)), \ y_0 \ = \psi(x_1,y_1,\alpha_1(\alpha_0))$$
we're dealing with a Lie group according to these definitions. This tallies with the modern definition as far as I can see, based on the implicit assumption of analyticity & the fact that everything is real here, but things can be far more general & again we'll have to be more careful with our definitions (though you have to love these classical definitions for how natural everything is!).

Finally, since we've allowed inverses, performing two mutually inverse transformations gives the identity. In other words there must always exist a parameter value δ such that
$$x_1 \ = \phi(x_0,y_0,\delta) \ = \ x_0, \ y_1 \ = \psi(x_0,y_0,\delta) \ = \ y_0$$
He then notes in a comment that there exist groups for which this is not possible, but they won't be considered here. It could be that he's taking a lot for granted & shows the identity axiom as if it were a trivial consequence of his construction, or else that the identity axiom is actually not part of the classical definition of a group, but either way you have to love how naturally the identity axiom falls out of this, even though the modern definitions in group theory would place the identity axiom before the inverse axiom (i.e. magma ---> semi-group ---> monoid ---> group).

Now, how do we reconcile this with the modern definitions?
The above construction implicitly encodes three types of mathematical structure (as discussed on this page). The group structure is encoded in the entire explanation, albeit in a weird way... I don't see any mention of associativity; his definition seems to be based on closure. However it also seems like he never even defined a group, so it could just be that he is not defining groups, he's giving an example & omitting axioms, relying on the set structure as obviously implying associativity, who knows... The topological structure is encoded in the continuity of φ & ψ & their inverses. In terms of modern group theory this extra structure is not a trivial addition; it invites a world of complexity! The manifold structure is encoded in the analyticity of φ & ψ, another monster of complexity... Luckily the stuff that translates to manifolds for us will be just basic calculus, so no need to worry. Thus I found a great definition in Bluman that will work for us without defining manifolds or topological groups & is still perfectly rigorous.

Definition of Lie Group of Transformations

The set S of mappings of the form
$$T \ : \ \mathbb{R^2} \ \times \ M \ \rightarrow \ \mathbb{R^2} \ | \ (\vec{x}_0,t) \ \mapsto \ T(\vec{x}_0,t) \ = \vec{x}_1$$

forms a one-parameter Lie group of transformations, with respect to the group (M ⊆ ℝ,ψ),
under the operation
$$\phi \ : \ S \ \times \ S \ \rightarrow \ S \ | \ (T_1,T_0) \ \mapsto \ \phi(T_1,T_0)$$

where the map φ(T₁,T₀) is defined by

$$\phi (T_1,T_0) \ : \ \mathbb{R^2} \ \times \ M \ \rightarrow \ \mathbb{R^2} \ | \ ( \vec{x}_0,t_0) \ \mapsto \ \phi (T_1,T_0) ( \vec{x}_0,t_0) \ = T_1 (T_0 ( \vec{x}_0,t_0),t_1) \ = \ T_1 ( \vec{x}_1,t_1) \ = \vec{x}_2$$

provided that:

a) Topology: t varies continuously on M ⊆ ℝ such that T maps x₀ to T(x₀,t) = x₁ injectively,
b) Group Theory: There is an identity for a certain t (= 0 or 1 when it makes sense), T(x₀,0) = x₀, & the operations φ & ψ interact as:
$$\phi (T_1,T_0) ( \vec{x}_0,t_0) \ = T_1 (T_0 ( \vec{x}_0,t_0),t_1) \ = \ T ( \vec{x}_0,\psi(t_0,t_1)) \ = \vec{x}_2$$
c) Manifold Theory: ψ in (M,ψ) is analytic w.r.t. both arguments & each T on ℝ²×M is analytic w.r.t. t & infinitely differentiable w.r.t. x.
Thus in this definition we have a group (M,ψ) encoded within our "one-parameter Lie group of transformations" (S,φ). Note I included ℝ² in the definition (nice notation) but more generally it's for some subset of ℝⁿ. When you grasp what I've written I really encourage you to read page 36 of Bluman that I linked to, just to check what I've written, as he spells it out a bit more than I did. Note that my x₁ = (x₁,y₁) = (x₁(x₀,y₀,α),y₁(x₀,y₀,α)); in Emanuel he basically just says that the transformations x₁(x₀,y₀,α) & y₁(x₀,y₀,α) should form a group w.r.t. the α term & ignores a lot of the notation. This is a bit of a monster definition though, let's see how we actually use it:

Examples of Lie Group of Transformations
a) Translations T(x,y,ε) = (x + ε,y)
b) Rotations T(x,y,θ) = (xcos(θ) - ysin(θ),xsin(θ) + ycos(θ))
c) Affine Transformations of the form T(x,y,λ) = (λx,y)
d) Similitude Transformations T(x,y,λ) = (λx,λy)
e) Arbitrary Examples
T(x,y,λ) = (λx,y/λ)
T(x,y,λ) = (λ²x,λy)
T(x,y,λ) = (λ²x,λ²y)
T(x,y,λ) = (x + 2λ,y + 3λ)
T(x,y,λ) = (λx + (1 - λ)y,y)
T(x,y,θ) = (xcosh(θ) + ysinh(θ),xsinh(θ) + ycosh(θ))
f) Non-Examples
T(x,y,λ) = (λ/x,y)
g) Re-Parametrizations
λ = sin(θ) in the rotation gives T(x,y,λ) = (x√(1 - λ²) - λy,λx + y√(1 - λ²)) etc...

But how do we show that any of these are lie groups of transformations? The quick way is to just look at what you're given & verify the λ term turns everything into a group under compositions (the rotation example is a good one to work out on pen & paper to see this explicitly!). Being a bit more careful, I'd use the a), b), c)'s:
a) Define (M,ψ) to be a group in such a way that T (in, say, T(x,y,λ) = (λx,y/λ) or T(x,y,ε) = (x + ε,y)) makes sense, is continuous & is injective w.r.t. t (thus (M,ψ) in T(x,y,λ) = (λx,y/λ) couldn't be (ℝ,+) here since we'd have division by zero, whereas in T(x,y,ε) = (x + ε,y) it could be ℝ!).
b) Define your identity (T(x,y,1) = (1x,y/1) = (x,y) & T(x,y,0) = (x + 0,y) = (x,y)) & ensure the whole T(x,ψ(δ,ε)) = T(T(x,δ),ε) axiom holds:
T((x,y),δ + ε) = (x + δ + ε,y) = T((x + δ,y),ε) = T(T((x,y),δ),ε)
c) I'm not really sure yet, I think this is just part of the construction to ensure smoothness etc... Come back to it (not even referred to in any of the examples I've seen but I'm sure we'll find a serious use for it).
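To make a) & b) concrete, here's a quick numeric sketch (my own, not from any of the books) checking the group axioms for the scaling example T(x,y,λ) = (λx,y/λ), where the parameter group (M,ψ) is the nonzero reals under multiplication:

```python
import math

def T(x, y, lam):
    # the scaling example T(x, y, λ) = (λx, y/λ); λ must be nonzero
    return (lam * x, y / lam)

def compose(lam1, lam0, x, y):
    # apply T(·, λ0) first, then T(·, λ1)
    x1, y1 = T(x, y, lam0)
    return T(x1, y1, lam1)

x0, y0 = 3.0, 7.0
a, b = 2.5, 4.0

# closure: T(·, a) after T(·, b) equals the single map T(·, a*b),
# i.e. ψ(a, b) = a*b
xc, yc = compose(a, b, x0, y0)
xd, yd = T(x0, y0, a * b)
assert math.isclose(xc, xd) and math.isclose(yc, yd)

# identity: the parameter value λ = 1 leaves every point fixed
assert T(x0, y0, 1.0) == (x0, y0)

# inverses: λ and 1/λ undo each other
xi, yi = compose(a, 1.0 / a, x0, y0)
assert math.isclose(xi, x0) and math.isclose(yi, y0)
```

The same shape of check works for any of the examples above; only T and ψ change.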

Theory of Lie Group of Transformations
The main theoretical tool I can gather at this stage is the infinitesimal transformation & its consequences, which I'll explain soon; however, in Emanuel there's a nice proof of something Cohen just states with examples, like the one I gave above in g) Re-Parametrizations.
Basically if we have an injective coordinate transformation F(x₀,y₀) = (u,v) = (u(x₀,y₀),v(x₀,y₀)) we can invert to get (x₀,y₀) = F⁻¹(u,v) = (x₀(u,v),y₀(u,v)), then

$$(x_1,y_1) \ = T(x_0,y_0,\alpha) \ = \ T(x_0(u,v),y_0(u,v),\alpha) \ = \ T'(u,v,\alpha)$$

which implies that the group methods we'll be using to solve ode's will be coordinate independent!
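Here's a small numeric illustration of that coordinate independence (my own example): under the polar coordinate change F(x,y) = (r,θ), the rotation group turns into a pure translation of θ, exactly the kind of simplification we're after:

```python
import math

def rotate(x, y, a):
    # the rotation group T(x, y, α) in Cartesian coordinates
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def to_polar(x, y):
    # the invertible coordinate change F(x, y) = (r, θ)
    return (math.hypot(x, y), math.atan2(y, x))

x0, y0, alpha = 1.0, 2.0, 0.3
x1, y1 = rotate(x0, y0, alpha)
r0, t0 = to_polar(x0, y0)
r1, t1 = to_polar(x1, y1)

# in the (r, θ) chart the same group reads T'(r, θ, α) = (r, θ + α):
# a translation, so the variables separate
assert math.isclose(r1, r0)
assert math.isclose(t1, t0 + alpha)
```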

Interpretation of Lie Group of Transformations
Cohen gives a geometric interpretation by talking about the transformations T as transforming points (x₀,y₀) to other points (x₁,y₁) along some curve (due to continuity in α of T(x,y,α)!), thus sweeping out a 'path-curve' of the group (M,ψ). In other words, as α varies T transforms points along a curve to other points on that curve, hence the name "point transformation" is sometimes used in this context (e.g. Emanuel) since we're just transforming points to other points on the same curve. For an illustration of why I call this classical way of doing things an exercise in notation when translating to the modern context, check out the Bluman link I gave, end of page 36 & the picture on page 37, to see this classical explanation re-interpreted in terms of flows... This implies that we're working with a parametric representation of some curve (!!!) & thus if we eliminate the parameter we get our original curve (that Rota essay ringing a bell?).

I'll get to the infinitesimal transformations as soon as I can.

Stephen Tashi
Solution of ODEs by Continuous Groups by George Emmanuel

Chapter 2 Continuous One-Parameter Groups-I

Meditation 5 Group Concept - continuous transformation groups - their notation

I don't like the notation used by Emmanuel. It's apparently the traditional way to do things, but as an exercise for my own benefit, I'm going to use subscripts to indicate whether things apply to the $x$ or $y$ coordinate instead of using different Greek letters for each.

The groups considered in this chapter are sets of functions that are 1-to-1 mappings of the plane onto itself. They won't be the set of all such functions; they will be special subsets of it. To completely describe such a function $T$ we will need two real valued functions, one to describe how it maps the x-coordinate and one to describe how it maps the y-coordinate. For the time being, I'll represent this as $T(x,y) = (\ T_x(x,y),\ T_y(x,y)\ )$.

It's tempting to call $T$ a vector valued function of a vector. Technically, a pair of coordinates is not necessarily a vector, so I won't write $T(x,y)$ as $\vec{T}(x,y)$ or $\vec{T}(\vec{p})$. You'll just have to remember that $T$ is a pair of functions, one for each coordinate.

An example Emmanuel uses is the group $S$ of all functions that rotate the points in the plane about the origin. (In group theory texts, this group is called SO(2), pronounced "ess-oh-two" or "the special orthogonal group in two dimensions"). The group operation is the composition of functions. The composition of two rotation functions is a rotation function. By saying that $S$ is the group of all rotations of the plane about the origin, we take care of "closed" and "identity" and "inverse". (We regard the identity function as a rotation of zero degrees.) "Associative" always holds for the composition of functions.

As an example, one element $T$ in the group $S$ is the rotation of points (counterclockwise) by the angle $\frac{\pi}{4}$.
$$T(x,y) = (\ \cos(\frac{\pi}{4}) x - \sin(\frac{\pi}{4})y,\ \sin(\frac{\pi}{4}) x + \cos(\frac{\pi}{4})y \ )$$

or we can represent $T$ as the pair of functions

$T_x(x,y) = \ \cos(\frac{\pi}{4}) x \ - \ \sin(\frac{\pi}{4})y$
$T_y(x,y) = \ \sin(\frac{\pi}{4}) x \ + \ \cos(\frac{\pi}{4})y$

When I try to deduce the formulas for a rotation from simple geometry, I get confused. I only find simple geometric diagrams useful for determining the signs and placement of the trig functions in the formulas, given that I do remember that $\sin$ and $\cos$ are involved. It's helpful to have studied the particular kind of vector valued functions of vectors that are represented by matrices and know that rotations of a vector are given by matrices of the form:

$$\begin{pmatrix} T_x \\ T_y \end{pmatrix} = \begin{pmatrix} \cos(\alpha) & -\sin(\alpha) \\ \sin(\alpha) & \cos(\alpha) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
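A quick numeric check of the matrix picture (my own sketch): multiplying the matrices for angles $\alpha$ and $\beta$ gives the matrix for $\alpha + \beta$, which is the group law in disguise:

```python
import math

def R(a):
    # 2x2 rotation matrix, stored as nested lists
    return [[math.cos(a), -math.sin(a)],
            [math.sin(a),  math.cos(a)]]

def matmul(A, B):
    # plain 2x2 matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b = 0.7, 1.1
P = matmul(R(a), R(b))
Q = R(a + b)
# composing rotations multiplies their matrices, and R(a) R(b) = R(a + b)
assert all(math.isclose(P[i][j], Q[i][j], abs_tol=1e-12)
           for i in range(2) for j in range(2))
```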

The group $S$ has an uncountable infinity of elements. There is an element for each possible rotation angle. Let's look for a way to describe them without assigning a different letter to each individual function in the group. The natural way is to put the rotation angle $\alpha$ into the notation. There are two common approaches to accomplish this, indexes and coordinates. I think Emmanuel is using coordinates.

It's an interesting digression to compare the two approaches.

The natural notation for indexing an element of $S$ would be $T_\alpha$ to indicate the function that does a rotation by angle $\alpha$. We can ignore protests from those poor lost souls who think that indexes must be integers. High class mathematicians know that a set of real numbers can also be used to index things. The requirement is that we establish a 1-to-1 function between the set used to index and the things that are indexed. (For example, people who study continuous stochastic processes that take place in time do this - whether they know it or not. The high-class definition of a stochastic process is that it is an indexed collection of (not necessarily independent) random variables. A random process in time is a collection of random variables indexed by the set of real numbers that we use for times).

The natural notation for assigning coordinates is just to list the coordinates in parentheses. We ignore protests from poor lost souls who think that functions cannot be points. High class mathematicians know that anything can be considered a point in some space.

The indexing method requires that an element of the group have 1 and only 1 index. By contrast, in coordinate systems, the same "point" can have several different coordinates. (For example, for points in polar coordinates, $(r,\theta) = (r, \theta + 2 \pi) = (r,\theta + 4\pi)$.) In the example of the group $S$, Emmanuel uses expressions like $\alpha + \beta$ when adding angles and he doesn't say anything about having to modify the result so it lies in the interval $[0,2\pi)$. So I think he's using coordinates, not indexes.

The groups in this chapter are "1-parameter groups". We will consider them as points in a 1-dimensional space, so they have 1 coordinate. (I'm going to call the "parameter" the "coordinate".) The usual way to denote a "point" with a 1-dimensional coordinate is just to write a variable representing that number. If we did that, a function in the group $S$ with coordinate $\alpha$ would be denoted by $\alpha =(\alpha_x(x,y),\alpha_y(x,y) )$. Emmanuel prefers to put the coordinate of the function in the argument list with $(x,y)$, so a function gets to have both a name like $T$ and a coordinate like $\alpha$.

I'll go along with that, and the full notation for a function $T$ in $S$ will be:

$$T(x,y,\alpha) = ( \ T_x(x,y,\alpha),\ T_y(x,y,\alpha) \ )$$.

The fact that a function in $S$ is denoted by both a name and a coordinate can create some minor confusion. For example, consider the typical math-sounding phrase "Let $T(x,y,\alpha)$ and $W(x,y,\alpha)$ be two functions in the group $S$ ....". The two functions are actually the same function because they have the same coordinate $\alpha$.

The group operation is, of course, composition of functions. If we didn't have to worry about the coordinates of functions, we'd be in familiar territory. For example if $T$ and $W$ are two functions in $S$ then since the group operation of "multiplication" is defined by the composition of functions:

$V = (T)(W)$ (here the product notation means the group operation)

$$= (\ T_x( W_x(x,y), W_y(x,y)),\ T_y(W_x(x,y),W_y(x,y))\ )$$.

That's like what you see when you compose 2-D vector-valued functions of 2-D vectors.

But when we write all our functions with the family name $"T"$, they are distinguished only by their coordinate. So we must compute compositions of functions like $T(x,y,\alpha)$ and $T(x,y,\beta)$.

One notation for a composition is

$T(x,y,\theta) = T(x,y,\alpha) T(x,y,\beta)$ (indicating the group operation)

$= (\ T_x(T_x(x,y,\beta),T_y(x,y,\beta),\alpha),\ T_y(T_x(x,y,\beta),T_y(x,y,\beta),\alpha) \ )$

It's slightly easier on the eyes to write the info for each result as separate equation:

$T_x(x,y,\theta) = \ T_x(\ T_x(x,y,\beta), \ T_y(x,y,\beta),\alpha)$
$T_y(x,y,\theta) = \ T_y(\ T_x(x,y,\beta), \ T_y(x,y,\beta),\alpha)$

That notation will pass muster in a class of students who are already lost. However, suppose someone asks "How do you find $\theta$?"

In the concrete example at hand, we have a group of rotations. If you apply a rotation of angle $\beta$ and then apply a rotation of $\alpha$ this amounts to applying a rotation of $\alpha + \beta$. So, in the example at hand $\theta = \alpha + \beta$.

Now suppose we are in the general situation and the group of functions isn't known to be rotations. What is the "honest" math notation? The coordinate $\theta$ of the product is a function of the information in the factors, so we should write it as a function $\theta(....)$. What should the arguments of that function be?

The arguments of the function $\theta(....)$ should not include any variables that denote points on the 2-D plane. This is because $\theta$ is supposed to be a function that results from composing two other functions and there is nothing in that thought that says we only compose them at a particular location $(x,y)$. The parameters $\alpha, \beta$ are the designations of two elements in the group and (by analogy to the multiplication table for a finite group) the designation of the result is only a function of the designations of the two elements of the group that are the factors. So we should write $\theta(\alpha,\beta)$.

In the examples in this chapter $\theta(\alpha,\beta) = \alpha + \beta$.

Another special property of the examples in this chapter is that the coordinate of the identity function is always zero, i.e. $T_x (x,y,0) = x, \ T_y(x,y,0) = y$.

I haven't dug out my copy of Cohen's book. As I recall, he goes into these matters in detail. In Emmanuel's presentation, a "1-parameter continuous group" is simply a group of functions that map the 2D plane onto itself. He doesn't restrict these transformations to be nice in any way. (Think about how mathematicians can invent all sorts of crazy functions to disturb people.) I think Cohen has a more restrictive definition. From that definition, he shows (as I recall) that one can always assign the coordinates for the functions in the group in such a way so that $\theta(\alpha,\beta) = \alpha + \beta$. I found that very counter intuitive. If he gave a proof, I got lost in the Greek letters.

Both Cohen and Emmanuel adopt the convention that $T(x,y,\alpha)$ will be the identity function when $\alpha = 0$. It isn't controversial that this can be arranged. If you had assigned coordinates so that $T(x,y,35.2)$ was the identity function, you could make a new assignment of coordinates by subtracting $35.2$ from the original coordinates assignments.

Let's look briefly at a 2-parameter group of transformations. Why? Because I can read the mind of people who think like physicists. Perhaps people who think that way didn't read past the place where I called the parameter of the group a "coordinate". They already have in mind that the parameter of the group is "time" and that as you vary "time" $\alpha$, the function $T(x,y,\alpha)$ is just a way to generate a position vector that starts at $(x,y)$ at time 0 and moves elsewhere as time progresses. That view needs a slight modification.

Let $D$ be the group whose elements are all functions that map the 2D plane onto itself by translating each point a given distance in a given direction. Also include the identity transformation as one of them. There are various ways to designate the elements of this group with coordinates. One could adopt a polar-coordinate-style scheme using the magnitude and direction of the displacement. It seems simplest to use cartesian style coordinates $(\alpha_x,\alpha_y)$ where each coordinate gives the displacement the function makes in the respective coordinate.

If we must write that out, let's do it as 2 coordinate equations:

$T_x(x,y,\alpha_x,\alpha_y) = x + \alpha_x$.
$T_y(x,y,\alpha_x,\alpha_y) = y + \alpha_y$

This method of assigning coordinates makes $T(x,y,0,0)$ the identity function.

The group operation is still denoted as multiplication and implemented as the composition of functions. I won't write out an example of that in detail. I will write down the shorthand notation for it where we use a symbol like $T(x,y,\alpha_x,\alpha_y)$ to stand for a pair of coordinate functions and multiplication to stand for the operation of composing the functions.

$T(x,y,\theta_x,\theta_y) = T(x,y,\alpha_x,\alpha_y) T(x,y,\beta_x,\beta_y)$

Again the question arises, what are the arguments of the $\theta$'s ? In this particular example $\theta_x = \alpha_x + \beta_x , \ \theta_y = \alpha_y + \beta_y$.

However in the general case they must be written as:

$\theta_x(\alpha_x,\alpha_y,\beta_x,\beta_y),\ \theta_y(\alpha_x,\alpha_y,\beta_x,\beta_y)$

This is because you can't determine the result unless you specify the particular functions involved in the group operation, and you need two coordinates per function to specify them precisely.

I haven't peeked at anything about 2-parameter groups yet, so I'm really curious if there is a theorem that says you can always assign coordinates so the functions $\theta_x, \theta_y$ have a simple form.

Stephen Tashi
Solution of ODEs by Continuous Groups by George Emmanuel

Chapter 2 Continuous One-Parameter Groups-I

Meditation 6. Context For The "symbol of the infinitesimal transformation"

Emmanuel defines an infinitesimal transformation to be an expression that has some Leibnizian $\delta$'s sitting by themselves. The only "infinitesimal" things he clearly defines are the "infinitesimal elements" and the "symbol of the infinitesimal transformation", which is a differential operator. So I'll deal with the "symbol of the infinitesimal transformation" before worrying about the infinitesimal transformation. Before dealing with the "symbol of the infinitesimal transformation" itself, I'll devote this post to establishing the context for it.

Taking a general view of the subject of this section, a group $G$ that has been defined as a set of functions on one set $\Omega$ can often be regarded (simultaneously) as a set of functions on a completely different set $\Psi$. It's useful to have a definition expressing the idea that the functions of $G$ have an orderly behavior as functions on another space $\Psi$ but don't necessarily form a group as functions on $\Psi$. One definition that expresses this idea is the definition of a "group action" on a set $\Psi$.

I won't try to explain the formalities of a "group action" in this post. (I myself would need to review them!) I'll only describe a simple way to regard a group of functions that are defined as mappings of the plane onto itself as also being functions that map real valued functions on the plane to other real valued functions.

As usual, let $G$ be a 1-parameter group of functions that map the 2-D plane to itself. Let $F(x,y)$ be a real valued function on the plane. ($F(x,y)$ is not an element of $G$. The function $F$ maps a pair of coordinates to a single real number, not to a pair of numbers.) You can imagine $F(x,y)$ displayed as a surface above the xy-plane in 3D by setting $z = F(x,y)$. A function $g$ that is an element of the group $G$ maps points $(x,y)$ to different points. We can visualize the result: the surface of $F(x,y)$ is moved along with the points. So $g$ maps $F(x,y)$ to a different function.

Let $\Psi$ be the set of all real valued functions on the plane. There is nothing in the definition of $G$ that says we must also regard an element of $G$ as a function that maps $\Psi$ into itself. But, since definitions in mathematics are arbitrary, we may define a way to associate each element of $G$ with a function that does that.

Since several types of functions are being discussed here, it may help if I start calling the elements of $G$ "transformations" instead of "functions". There is no difference in what the two words mean, but "transformations" reminds us that they are functions that map the plane to itself.

To give a definition in precise terms, let's first use the notation for 1-parameter transformations $T(x,y,\alpha)$ that doesn't list both its coordinate functions. We define how $T(x,y,\alpha)$ maps the function $F(x,y)$ to another function by saying that it sends $F(x,y)$ to the "new" function $G(x,y) = F( T(x,y,\alpha) )$.

If we want to show the details, we make the definition that the transformation
$T(x,y,\alpha) = (\ T_x(x,y,\alpha),\ T_y(x,y,\alpha) \ )$
"acts" to map the function $F(x,y)$ to the function $G(x,y)$ given by
$G(x,y) = F( T_x(x,y,\alpha), T_y(x,y,\alpha) )$

To relate this to the book, on page 13 section 2.3 Global Group Equations, Emmanuel considers a function denoted by $f(x_1,y_1)$. This amounts to the same thing as the function $F( T_x(x,y,\alpha),T_y(x,y,\alpha) )$ since the coordinates $(x_1,y_1)$ are understood to be the result of transforming the point $(x,y)$ by a 1-parameter transformation.

Let's do some examples of transformations "acting" on functions. I'll use lowercase letters for the real valued functions. Even though I haven't defined an "action", I'll use some notation for it, which employs a period ".". The notation $g(x,y) = f(x,y).T(x,y,\alpha)$ indicates that the 1-parameter transformation $T(x,y,\alpha)$ "acts" to map the function $f(x,y)$ to the function $g(x,y)$. (It might seem more natural to write the transformation "$T$" on the left hand side of the function $f$. I may explain in a later post why it's better to write it on the right side.)

Example 6.1 $f(x,y) = 3x + y^2$ Let $T(x,y,\alpha)$ be an element of the rotation group $S$ defined in a previous post.

$g(x,y) = f(x,y).T(x,y,\alpha) = f ( T_x(x,y,\alpha),T_y(x,y,\alpha))$
$= 3(T_x(x,y,\alpha)) + ( T_y(x,y,\alpha))^2$
$= 3 ( x \cos(\alpha) - y \sin(\alpha)) + ( x \sin(\alpha) + y \cos(\alpha) )^2$
$= 3x \cos(\alpha) - 3y \sin(\alpha) + x^2 \sin^2(\alpha) + y^2 \cos^2(\alpha) + 2xy \sin(\alpha)\cos(\alpha)$

In the above example, it may seem that a "simple" 2 variable polynomial function $f$ has been mapped to a complicated trig function. However, keep in mind that $\alpha$ is a constant because we are looking at what a particular element of the group $S$ does. So the messy looking $g(x,y)$ is also a polynomial function because the terms involving $\alpha$ are constants.

Two simple, yet important examples:

Example 6.2 $f(x,y) = x$

$f(x,y).T(x,y,\alpha) = f ( T_x(x,y,\alpha),T_y(x,y,\alpha)) = T_x(x,y,\alpha)$

Example 6.3 $f(x,y) = y$

$f(x,y).T(x,y,\alpha) = f ( T_x(x,y,\alpha),T_y(x,y,\alpha)) = T_y(x,y,\alpha)$

Example 6.4: Let $T$ be an element of the rotation group $S$. The elements of $S$ move a point $(x,y)$ to another point that is the same distance from the origin. So we would expect a real valued function $f(x,y)$ that takes constant values on circles about the origin will be "transformed into itself" by $T$.

$f(x,y ) = x^2 + y^2$ is such a function.

$g(x,y) = f(x,y).T(x,y,\alpha) = f ( T_x(x,y,\alpha),T_y(x,y,\alpha))$
$= (T_x(x,y,\alpha))^2 + (T_y(x,y,\alpha))^2$
$= ( x \cos(\alpha) - y \sin(\alpha) )^2 + ( x \sin(\alpha) + y \cos(\alpha) )^2$
$= x^2 \cos^2(\alpha) + y^2 \sin^2(\alpha) -2xy \cos(\alpha)\sin(\alpha)$
$\ \ + x^2\sin^2(\alpha) + y^2 \cos^2(\alpha) + 2xy \sin(\alpha) \cos(\alpha)$
$= x^2 \cos^2(\alpha) + x^2\sin^2(\alpha) + y^2 \sin^2(\alpha) + y^2 \cos^2(\alpha)$
$= x^2 (\cos^2(\alpha) + \sin^2(\alpha)) + y^2 ( \sin^2(\alpha) + \cos^2(\alpha))$
$= x^2(1) + y^2(1) = x^2 + y^2$
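That computation can be double-checked numerically (a throwaway sketch of mine, not from Emmanuel):

```python
import math

def rotate(x, y, a):
    # an element of the rotation group S, with coordinate a
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def f(x, y):
    # constant on circles about the origin
    return x * x + y * y

for a in (0.0, 0.5, 1.3, math.pi / 2, 2.0):
    x1, y1 = rotate(3.0, -4.0, a)
    # f(x1, y1) = f(x, y) for every rotation angle: f is invariant under S
    assert math.isclose(f(x1, y1), f(3.0, -4.0))
```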

The general idea of "invariants" is important in mathematics and physics so I suspect functions that are "invariant" under all transformations of a particular group (as in the above example) are important.

Stephen Tashi
Solution of ODEs by Continuous Groups by George Emmanuel

Chapter 2 Continuous One-Parameter Groups-I

Meditation 7 Symbol of the Infinitesimal Transformation

In books on Lie Groups that I own, the pages where the "symbol of the infinitesimal transformation" appears in a Taylor series are all wrinkled because I have handled them so much over the years trying to understand the terse explanations of such a profound result.

After the most recent bout of study, I have reached the following conclusion:

It isn't a profound result - I mean expanding the Taylor series isn't. It only looks profound if you make the natural choice for what function to expand. That choice will be wrong. The result will look profound and mysterious because it is the wrong answer for that function. You'll spend hours trying to prove it's actually correct. ( Or maybe you won't check the result and head into later confusion.)

The choice of what function to expand may be profound and the implications may be profound. I'll worry about that aspect later.

Make the following definitions:

Let $T(x,y,\alpha) = ( \ T_x(x,y,\alpha), T_y(x,y,\alpha)\ )$ denote a 1-parameter transformation.

Let $f(x,y)$ be a real valued function whose domain is the xy-plane.

Let $u_x(x,y) = D_\alpha T_x(x,y,\alpha)_{\ |\alpha=0}$
Let $u_y(x,y) = D_\alpha T_y(x,y,\alpha)_{\ |\alpha =0}$

$u_x$ and $u_y$ are the infinitesimal elements

(If you think of $(T_x,T_y)$ sweeping out a path as "time" $\alpha$ varies, then $(u_x, u_y)$ is a tangent vector to that path at the point $(x,y)$. )

Let $U$ be the differential operator defined by the operation on the function $g(x,y)$ by:
[eq. 7.1] $U g(x,y) = u_x(x,y) \frac{\partial}{\partial x} g(x,y) + u_y(x,y)\frac{\partial}{\partial y} g(x,y)$

The operator $U$ is "the symbol of the infinitesimal transformation" .
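As a sanity check on the definitions (my own sketch), for the rotation group we can compute the infinitesimal elements numerically and compare them with the hand calculation $u_x = -y$, $u_y = x$, which gives $U = -y \frac{\partial}{\partial x} + x \frac{\partial}{\partial y}$:

```python
import math

def rotate(x, y, a):
    # the rotation group T(x, y, alpha)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def infinitesimal(x, y, h=1e-6):
    # hypothetical helper: D_alpha T(x, y, alpha) at alpha = 0,
    # estimated by a central difference
    xp, yp = rotate(x, y, h)
    xm, ym = rotate(x, y, -h)
    return ((xp - xm) / (2 * h), (yp - ym) / (2 * h))

x, y = 2.0, 5.0
ux, uy = infinitesimal(x, y)
# for the rotation group, u_x(x, y) = -y and u_y(x, y) = x
assert math.isclose(ux, -y, rel_tol=1e-6)
assert math.isclose(uy, x, rel_tol=1e-6)
```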

If you've made several abortive attempts to understand Lie groups, you've read the remark that the group can be determined by "what goes on in the neighborhood of the identity transformation". So it is very natural to look at Taylor series expansion about $\alpha = 0$ since, by convention, we are supposed to use a parameterization so that $T(x,y,0)$ is the identity function.

The old time Lie group books expand a function they call $f(x_1,y_1)$ in a Taylor series. The coordinates $(x_1,y_1)$ are a transformation of the point $(x,y)$. The expansion is neatly expressed in terms of the differential operator $U$.

[eq. 7.2] $$f(x_1,y_1) = f(x,y) + U f\ \alpha + \frac{1}{2!}U^2 f \ \alpha^2 + \frac{1}{3!}U^3 f \ \alpha^3 + ...$$
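Before asking what function is being expanded, eq. 7.2 can at least be verified numerically for the rotation group with $f(x,y) = x$ (my own check): there $U = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}$, so the iterates cycle $f = x$, $Uf = -y$, $U^2 f = -x$, $U^3 f = y$, and the series should sum to $f(x_1,y_1) = x_1 = x\cos\alpha - y\sin\alpha$:

```python
import math

x, y, alpha = 2.0, 3.0, 0.4
# with U = -y d/dx + x d/dy and f(x, y) = x, the iterates U^n f cycle:
# f = x, Uf = -y, U^2 f = -x, U^3 f = y, then repeat
cycle = [x, -y, -x, y]
series = sum(cycle[n % 4] * alpha ** n / math.factorial(n) for n in range(30))

# direct value of f(x1, y1) = x1 under a rotation by alpha
x1 = x * math.cos(alpha) - y * math.sin(alpha)
assert math.isclose(series, x1)
```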

Ok, what function is it that they are expanding?

To me, the natural function to expand would be $f(T_x(x,y,\alpha),T_y(x,y,\alpha))$. I spent the previous post motivating interest in this function. I suppose it wasn't a waste. Group actions will probably come into play in solving ODEs and Lie group books do mention expanding $f(T_x(x,y,\alpha),T_y(x,y,\alpha))$ in Taylor series. But I don't yet see how it fits in with this post.

If you expand $f(T_x(x,y,\alpha),T_y(x,y,\alpha))$ in Taylor series about $\alpha = 0$, the first two terms come out to be the desired $f(x,y) + U f$. After that, things go wrong.

I think the function that the old time books actually expand is $f(x + \alpha\ u_x(x,y), y + \alpha\ u_y(x,y))$. Page 3 of the PDF of http://deepblue.lib.umich.edu/handle/2027.42/33514 says this. Emmanuel doesn't make it clear.

Why would they want to expand that function? Notice that they are expanding a function that is itself an approximation. A linear approximation of $T(x,y,\alpha)$ using first derivatives is:

$T_x(x,y,\alpha) \approx T_x(x,y,0) + \alpha D_\alpha T_x(x,y,\alpha)_{\ |\alpha = 0} = x + \alpha\ u_x(x,y)$

$T_y(x,y,\alpha) \approx T_y(x,y,0) + \alpha D_\alpha T_y(x,y,\alpha)_{\ |\alpha = 0} = y + \alpha\ u_y(x,y)$

So $f(x + \alpha u_x(x,y), y + \alpha u_y(x,y))$ is a linear approximation for $f(T_x(x,y,\alpha),T_y(x,y,\alpha) )$.

After all, suppose a second semester calculus student came up to you and said "I want to approximate $f(g(x))$ using a Taylor series. Would it be all right if I just expand $f(g(0) + x g'(0))$ instead? That would make the differentiation simpler."

You might say "You know, your eyes aren't quite focusing at the same spot. Have you ever suffered a serious head injury?"

One guess is that the old time books are thinking that the linear approximations are exactly correct when $\alpha$ is the "infinitesimally small" $\delta \alpha$, so if we expand the approximation in Taylor series and keep track of infinitesimals correctly, something useful will come out. The above linear approximations with an infinitesimal value for $\alpha$ might be the "infinitesimal transformation" - but I need to think about that more. I just want to get the expansion over with.

We assume the existence of all derivatives involved. Use the "$D$" notation for differentiation.

[ eq. 7.3] $f(x + \alpha u_x(x,y), y + \alpha u_y(x,y)) =$

$f(x +\alpha u_x(x,y), y + \alpha u_y(x,y))_{\ |\alpha = 0}$

$+ D_\alpha f(\ x + \alpha u_x(x,y), y + \alpha u_y(x,y)\ )_{|_{\alpha=0}}\ \alpha$

$+ \frac{1}{2!} D^2_\alpha f(\ x + \alpha u_x(x,y), y + \alpha u_y(x,y)\ )_{|_{\alpha=0}}\ \alpha^2$

$+ \frac{1}{3!} D^3_\alpha f(\ x + \alpha u_x(x,y), y + \alpha u_y(x,y)\ )_{|_{\alpha=0}}\ \alpha^3$

$+ ....$

The first term on the right hand side is obviously $f( x + (0)u_x(x,y),\ y+(0)u_y(x,y)) = f(x,y)$.

Working out the differentiation needed for the second term can get confusing because of the traditional notation for partial derivatives. I'll digress to illustrate this. If you define a function by saying $w(A,B) = A^2 + B$ and you set about to do the differentiation $D_\theta w(g(x,\theta),h(x,\theta))$, you have no problem expressing this as:

$$D_\theta w(g(x,\theta), h(x,\theta)) = \frac{\partial w}{\partial A} \frac{\partial g}{\partial \theta} + \frac{\partial w}{\partial B} \frac{\partial h}{\partial \theta}$$
$$= 2A_{|_{A=g(x,\theta)}} \frac{\partial g}{\partial \theta} + (1)\frac{\partial h}{\partial \theta} = 2 g(x,\theta) \frac{\partial g}{\partial \theta} + \frac{\partial h}{\partial \theta}$$
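
The digression can be spot-checked symbolically; the particular $g$ and $h$ below are arbitrary choices of mine, used only to exercise the formula:

```python
# Spot-check of the digression: with w(A,B) = A**2 + B, the chain rule
# gives D_theta w(g,h) = 2*g*dg/dtheta + dh/dtheta.
import sympy as sp

x, th = sp.symbols('x theta')
g = sp.sin(x*th)   # arbitrary choice
h = x + th**2      # arbitrary choice

lhs = sp.diff(g**2 + h, th)                 # differentiate w(g,h) directly
rhs = 2*g*sp.diff(g, th) + sp.diff(h, th)   # the chain-rule formula
print(sp.simplify(lhs - rhs))               # 0
```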

However, suppose you were unlucky enough to have stated the definition of $w$ as $w(x,y) = x^2 + y$. Then the analogous calculation begins:

$$D_\theta w(g(x,\theta), h(x,\theta)) = \frac{\partial w}{\partial x} \frac{\partial g}{\partial \theta} + \frac{\partial w}{\partial y} \frac{\partial h}{\partial \theta}$$

That notation only makes sense to someone who understands that $\frac{\partial w}{\partial x}$ means "the derivative of $w$ with respect to the first of its two arguments" instead of "the derivative of $w$ with respect to $x$ no matter where the $x$ appears in the expression".

We are in an unlucky situation because the natural way to define $f$ is as $f(x,y)$, and we want to differentiate an expression where some functions involving $x$ are put into both arguments of $f$.

So let's temporarily state the function $f$ as:

$f = f(A,B)$
$A = x +\alpha\ u_x(x,y)$
$B = y +\alpha\ u_y(x,y)$.

Then the notation for the result is:

$$D_\alpha f(x + \alpha u_x(x,y), y + \alpha u_y(x,y)) = \frac{\partial f}{\partial A} u_x(x,y) + \frac{\partial f}{\partial B} u_y(x,y)$$

Set $\alpha = 0$ and this gives:

$$D_\alpha f(x + \alpha u_x(x,y), y + \alpha u_y(x,y))_{\ |\alpha = 0} = \frac{\partial f}{\partial A}_{\ |(x,y)} u_x(x,y) + \frac{\partial f}{\partial B}_{\ |(x,y)} u_y(x,y)$$

i.e. all evaluations take place at the point $(x,y)$.

With the understanding that $\frac{\partial f}{\partial x}$ will mean "the partial derivative of $f$ with respect to its first argument", we can replace $\frac{\partial f}{\partial A}$ with $\frac{\partial f}{\partial x}$. Similarly we can replace $\frac{\partial f}{\partial B}$ with $\frac{\partial f}{\partial y}$

Doing that and changing the order of factors we get

$$D_\alpha f(x + \alpha u_x(x,y), y + \alpha u_y(x,y))_{\ |\alpha = 0} = u_x(x,y) \frac{\partial f}{\partial x}_{\ |(x,y)} + u_y(x,y) \frac{\partial f}{\partial y}_{\ |(x,y)}$$

$= U f$
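
This first-order result is easy to confirm with a computer algebra system, keeping $f$, $u_x$, $u_y$ as generic undefined functions (a sketch, assuming SymPy):

```python
# Symbolic confirmation that D_alpha f(x + alpha*u_x, y + alpha*u_y)
# evaluated at alpha = 0 is exactly U f, with f, u_x, u_y left generic.
import sympy as sp

x, y, a = sp.symbols('x y alpha')
f = sp.Function('f')
ux = sp.Function('u_x')(x, y)
uy = sp.Function('u_y')(x, y)

expr = f(x + a*ux, y + a*uy)
first = sp.diff(expr, a).subs(a, 0).doit()

# U f, reading "partial f / partial x" as the derivative in f's first slot
Uf = ux*sp.diff(f(x, y), x) + uy*sp.diff(f(x, y), y)
print(sp.simplify(first - Uf))   # 0
```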

I'm not going to give a formal proof of eq. 7.2, but I am going to work out the third term since it is the one that shows I'm expanding the correct function.

The differentiation involved is:

$$D^2_\alpha f(x + \alpha u_x(x,y), y + \alpha u_y(x,y)) = D_\alpha( \frac{\partial f}{\partial A} u_x(x,y) + \frac{\partial f}{\partial B} u_y(x,y) )$$

Remembering that each of $\frac{\partial f}{\partial A}$ and $\frac{\partial f}{\partial B}$ has two arguments $(A,B)$ we have:

$$= \frac{\partial}{\partial A}\big( \frac{\partial f}{\partial A} u_x(x,y) \big)\, u_x(x,y) + \frac{\partial}{\partial B}\big(\frac{\partial f}{\partial A} u_x(x,y)\big)\, u_y(x,y) + \frac{\partial}{\partial A}\big( \frac{\partial f}{\partial B} u_y(x,y) \big)\, u_x(x,y) \ +\ \frac{\partial}{\partial B}\big(\frac{\partial f}{\partial B} u_y(x,y)\big)\, u_y(x,y)$$

$$= \frac{\partial^2 f}{\partial A^2} u_x^2(x,y) + \frac{\partial^2 f}{\partial B \partial A} u_y(x,y) u_x(x,y) \ + \frac{\partial^2 f}{\partial A \partial B} u_x(x,y) u_y(x,y) + \frac{\partial^2 f}{\partial B^2} u_y^2(x,y)$$

The functions derived from $f$ are each evaluated at $(x + \alpha\ u_x,\ y + \alpha\ u_y)$. (Taking an additional partial derivative is what produces the additional factors of $u_x$ and $u_y$, by the chain rule.)

Setting $\alpha = 0$ evaluates the functions of $f$ at $(x,y)$.

With the understanding that $\frac{\partial}{\partial A}$ can be denoted $\frac{\partial}{\partial x}$ etc. we have:

$$= u_x^2(x,y) \frac{\partial^2 f}{\partial x^2}+ u_y(x,y) u_x(x,y) \frac{\partial^2 f}{\partial y \partial x} \ + u_x(x,y) u_y(x,y) \frac{\partial^2 f}{\partial x \partial y} + u_y^2(x,y) \frac{\partial^2 f}{\partial y^2}$$

The above expression is equal to $U^2 \ f$, which is:

$$U^2 f = ( u_x(x,y)\frac{\partial}{\partial x} + u_y(x,y)\frac{\partial}{\partial y} )^2 f$$

$$\ \ \ = \big( u_x^2(x,y)\frac{\partial^2}{\partial x^2} + u_x(x,y) u_y(x,y)\frac{\partial}{\partial x}\frac{\partial}{\partial y} + u_y(x,y) u_x(x,y)\frac{\partial}{\partial y}\frac{\partial}{\partial x} + u_y^2(x,y)\frac{\partial^2}{\partial y^2}\big)\ f$$

The usual way to denote a "point" with a 1-dimensional coordinate is just to write a variable representing that number. If we did that, a function in the group $S$ with coordinate $\alpha$ would be denoted by $\alpha =(\alpha_x(x,y),\alpha_y(x,y) )$. Emanuel prefers to put the coordinate of the function in the argument list with $(x,y)$. So a function gets to have both a name like $T$ and a coordinate like $\alpha$.

I'll go along with that, and the full notation for a function $T$ in $S$ will be:

$$T(x,y,\alpha) = ( \ T_x(x,y,\alpha),\ T_y(x,y,\alpha) \ )$$.

The fact that a function in $S$ is denoted by both a name and a coordinate can create some minor confusion.
When you write $\alpha =(\alpha_x(x,y),\alpha_y(x,y) )$ what you're saying is that the image of the function $\alpha$ at the point $(x,y)$, i.e. $(\alpha_x(x,y),\alpha_y(x,y) )$, is equal to the actual function $\alpha$, but that's abusing the notation a bit, & if I just go with it knowing what you mean I end up plugging functions into the arguments of the cosines & sines in the examples you've given, thus calling them numbers while simultaneously calling them functions. I think the confusion arises because of notation really: what we're doing is linking a function in one group to the parameter $\alpha$ in its own group. So we're actually dealing with two different groups, intimately related to each other, living inside some structure we're going to call a one-parameter group.

If you try to construct a one-parameter group along these lines you end up with something like (S,M,φ,ψ,I,e), using the notation from my post. Here S & M are just sets, S a set of functions & M ⊆ ℝ a subset of the real numbers. Further, (M,ψ,e) is defined to be a group, & $\alpha$ is actually an element of M [in your example of rotations using α+β as parameters we have the group (ℝ,+,0)]. The main thing is to turn (S,φ,I) into a group, & this is done in a roundabout way. Basically we say that the operation φ acting on functions in the set S turns this substructure into a group if the new function φ(T₀,T₁) inside S satisfies certain properties that relate it to the group (M,ψ,e): by ensuring the image of the function φ(T₀,T₁), which includes parameters like $\alpha$ (it seems to me you were saying $\alpha$ is in S whereas it's actually just in the domain of the functions in S), satisfies the axioms I posted, we establish a link between these two groups (substructures) within the structure (S,M,φ,ψ,I,e) & so are allowed to call (S,M,φ,ψ,I,e) a one-parameter group.
When I gave the specific definition in my last post I chose it because it's the nicestest one I found with regard to this issue; most expositions don't make the distinctions clear. Other than that, good stuff, will post more asap.

Stephen Tashi
Science Advisor

When you write $\alpha =(\alpha_x(x,y),\alpha_y(x,y) )$ what you're saying is that the image of the function $\alpha$ at the point $(x,y)$, i.e. $(\alpha_x(x,y),\alpha_y(x,y) )$, is equal to the actual function $\alpha$, but that's abusing the notation a bit
I agree that it's abusing notation. Using the coordinates of any structure to "stand for it" is a minor abuse of notation in some contexts. A bad abuse (which I did suggest) is to write an expression that says the coordinates of the thing are "=" to the thing. The equivalence relation "=" is defined for coordinates of things, and there may be a different equivalence relation defined on the things themselves. Technically, to set a coordinate of a thing equal to the thing, I'd have to define an equivalence relation on a set that contained both the things and their coordinates.

Stephen Tashi
Solution of ODEs by Continuous Groups by George Emanuel

Chapter 2 Continuous One-Parameter Groups

Meditation 8 Symbol Of The Infinitesimal Transformation - continued

$f\big(T_x(x,y,\alpha),T_y(x,y,\alpha)\big) = f(x,y) + U\ f + \frac{1}{2} U^2\ f + ....$

I posted another thread asking about the general theory of "infinitesimal transformations": https://www.physicsforums.com/showthread.php?p=4441687#post4441687 The discussion was helpful. Lovinia gave a proof of the expansion that works for 1-parameter groups that are matrix groups. (Many of Emanuel's examples can be stated as matrix groups.) Jostpuur corrected my interpretation (given in Meditation 7) of what $U^2$ means and, in a series of posts I don't understand yet, derived the expansion from the point of view of differential equations.

I remain stubbornly committed to investigating whether the expansion can be derived directly by differentiation together with the basic facts we assume about 1-parameter groups.

After all, the old books claim the result is straightforward (e.g. "An Introduction to the Lie Theory of One-Parameter Groups" by Abraham Cohen, 1911, http://archive.org/details/introlietheory00coherich, page 30 of the PDF, page 14 of the book). So is it or isn't it? I'm not stubborn enough to want an inductive proof. I'll be happy just to get the first 3 terms of it. Is that too much to ask?

I'll devote this post just to computing the derivatives involved in the 3rd term and the corresponding result given by applying $U^2$. As far as I can see, they aren't the same.

I'll use some capital letters for the variables. It helps me avoid confusion.

$f(S,W)$
$T_x(A,B,C)$
$T_y(A,B,C)$

[eq.8.1]:
$$D_\alpha\ f(T_x(x,y,\alpha),T_y(x,y,\alpha)) = \frac{\partial f}{\partial S} \frac{\partial T_x}{\partial C} + \frac{\partial f}{\partial W} \frac{\partial T_y}{\partial C}$$

The functions on the right hand side are evaluated at:
$$S = T_x(x,y,\alpha),\ W = T_y(x,y,\alpha)$$
$$A = x,\ B = y,\ C = \alpha$$

$$D^2_\alpha f(T_x(x,y,\alpha),T_y(x,y,\alpha)) = D_\alpha\ (\frac{\partial f}{\partial S} \frac{\partial T_x}{\partial C} + \frac{\partial f}{\partial W} \frac{\partial T_y}{\partial C})$$

[eq. 8.2]
$$= \big( \frac{\partial^2 f}{\partial S^2} \frac{\partial T_x}{\partial C} + \frac{\partial^2 f }{\partial W \partial S}\frac{\partial T_y}{\partial C} \big) \frac{\partial T_x}{\partial C} + \frac{\partial f}{\partial S}\frac{\partial^2 T_x}{\partial C^2}$$

$$\ + \big( \frac{\partial^2 f}{\partial S \partial W} \frac{\partial T_x}{\partial C} + \frac{\partial^2 f }{\partial W^2}\frac{\partial T_y}{\partial C} \big) \frac{\partial T_y}{\partial C} + \frac{\partial f}{\partial W}\frac{\partial^2 T_y}{\partial C^2}$$

The partial derivatives of $f$ are evaluated at:
$$S = T_x(x,y,\alpha),\ W = T_y(x,y,\alpha)$$
The derivatives of $T_x, T_y$ are evaluated at:
$$A = x,\ B = y,\ C = \alpha$$

Evaluate 8.1 at $\alpha = 0$ using the conventions we assume for a 1-parameter transformation $$S = T_x(x,y,0) = x ,\ W=T_y(x,y,0) = y$$
and the definitions of the "infinitesimal elements"
$$u_x(S,W) = \frac{\partial T_x}{\partial C}_{|\ \alpha = 0},\ u_y(S,W) = \frac{\partial T_y}{\partial C}_{|\ \alpha = 0}$$
to obtain:

$$D_\alpha\ f(T_x(x,y,\alpha),T_y(x,y,\alpha))_{|\ \alpha = 0} = \frac{\partial f}{\partial S} u_x + \frac{\partial f}{\partial W} u_y$$
[eq. 8.3]
$$= u_x \frac{\partial f}{\partial S} + u_y \frac{\partial f}{\partial W}$$

Where the functions are evaluated at $S = x,\ W = y ,\ A = x,\ B = y,\ C =0$
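
As a sanity check on eq. 8.3, here is the same computation done symbolically on the rotation group (the group is my concrete example, not one from Emanuel; $f$ stays a generic undefined function):

```python
# Check of eq. 8.3 on the rotation group: D_alpha f(T_x, T_y) at
# alpha = 0 reduces to u_x f_x + u_y f_y.
import sympy as sp

x, y, a = sp.symbols('x y alpha')
f = sp.Function('f')

Tx = x*sp.cos(a) - y*sp.sin(a)
Ty = x*sp.sin(a) + y*sp.cos(a)
ux = sp.diff(Tx, a).subs(a, 0)   # -y
uy = sp.diff(Ty, a).subs(a, 0)   # x

lhs = sp.diff(f(Tx, Ty), a).subs(a, 0).doit()
rhs = ux*sp.diff(f(x, y), x) + uy*sp.diff(f(x, y), y)
print(sp.simplify(lhs - rhs))   # 0
```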

Evaluate 8.2 at $\alpha = 0$ using the same facts as above to obtain:

$$D^2_\alpha f(T_x(x,y,\alpha),T_y(x,y,\alpha))_{|\ \alpha = 0} = \big( \frac{\partial^2 f}{\partial S^2} u_x + \frac{\partial^2 f }{\partial W \partial S}u_y \big) u_x + \frac{\partial f}{\partial S}\frac{\partial^2 T_x}{\partial C^2}$$
$$\ + \big( \frac{\partial^2 f}{\partial S \partial W}u_x + \frac{\partial^2 f }{\partial W^2}u_y \big) u_y + \frac{\partial f}{\partial W}\frac{\partial^2 T_y}{\partial C^2}$$

[Eq.8.4]

$$= u_x^2 \frac{\partial^2 f}{\partial S^2} + u_x u_y \big( \frac{\partial^2 f }{\partial W \partial S} + \ \frac{\partial^2 f}{\partial S \partial W}\big) + u_y^2 \frac{\partial^2 f }{\partial W^2}+ \frac{\partial f}{\partial S}\frac{\partial^2 T_x}{\partial C^2} + \frac{\partial f}{\partial W}\frac{\partial^2 T_y}{\partial C^2}$$

Where the functions are evaluated at $S = x,\ W = y , A =x,\ B =y,\ C = 0$

Now I'll do the supposedly analogous calculations using the operator $U$, which is defined in terms of the "infinitesimal elements" $u_x(S,W),\ u_y(S,W)$ and its action on a function $g(S,W)$ by:

[eq. 8.10]
$$U g(S,W) = u_x(S,W) \frac{\partial g}{\partial S}+ u_y(S,W) \frac{\partial g}{\partial W}$$

Applying $U$ once to $f(T_x(x,y,\alpha),T_y(x,y,\alpha))$

[eq. 8.11]
$$U \ f(T_x(x,y,\alpha),T_y(x,y,\alpha)) = u_x \frac {\partial f}{\partial S} + u_y \frac{\partial f}{\partial W}$$

where the functions are evaluated at $S = T_x(x,y,\alpha),\ W=T_y(x,y,\alpha), \ A = x, \ B = y, \ C = \alpha$.

Using the corrected definition of what it means to apply $U$ twice (which makes things much more complicated than my interpretation in Meditation 7) we get:

$$U^2 \ f(T_x(x,y,\alpha),T_y(x,y,\alpha)) = U\ \big( u_x \frac {\partial f}{\partial S} + u_y \frac{\partial f}{\partial W}\big)$$

$$= u_x \frac{\partial}{\partial S} \big(u_x \frac {\partial f}{\partial S} + u_y \frac{\partial f}{\partial W}\big) + u_y \frac{\partial}{\partial W}\big( u_x \frac {\partial f}{\partial S} + u_y \frac{\partial f}{\partial W}\big)$$

$$= u_x \big( \frac{\partial u_x}{\partial S}\frac{\partial f}{\partial S} + u_x \frac{\partial^2 f}{\partial S^2} + \frac{\partial u_y}{\partial S}\frac{ \partial f}{\partial W} + u_y \frac{\partial^2 f}{\partial S \partial W} \big)$$

$$\ + u_y \big( \frac{\partial u_x}{\partial W}\frac{\partial f}{\partial S} + u_x \frac{\partial^2 f}{\partial W \partial S} + \frac{\partial u_y}{\partial W}\frac{ \partial f}{\partial W} + u_y \frac{\partial^2 f}{\partial W^2} \big)$$

[eq. 8.12]

$$= u_x^2 \frac{\partial^2 f}{\partial S^2} + u_x \big( \frac{\partial u_x}{\partial S}\frac{\partial f}{\partial S} + \frac{\partial u_y}{\partial S}\frac{\partial f}{\partial W} \big)$$
$$\ + u_x u_y \big( \frac{\partial^2 f}{\partial S \partial W} + \frac{\partial^2 f}{\partial W \partial S} \big)$$
$$\ + u_y \big( \frac{\partial u_x}{\partial W}\frac{\partial f}{\partial S} + \frac{\partial u_y}{\partial W} \frac{\partial f}{\partial W}\big) + u_y^2 \frac{\partial^2 f}{\partial W^2}$$

The functions are evaluated at $S = T_x(x,y,\alpha), \ W = T_y(x,y,\alpha),\ A = x,\ B = y,\ C= \alpha$

Comparing eq. 8.12 to eq. 8.4 we see that straightforward calculus does not show that $D^2_\alpha f(T_x(x,y,\alpha),T_y(x,y,\alpha))_{|\ \alpha = 0} = U^2 f(T_x(x,y,\alpha),T_y(x,y,\alpha))$

We need more juice. I'll consider this further in the next meditation.

strangerep
[...Meditations on Emanuel...]
Hey Stephen!

Have you abandoned this stuff? Or maybe got bored/lonely being here by yourself for too long?
I've kinda been sitting back waiting for you (and others) to get through the basic stuff in Emanuel.

Unfortunately, I don't have my own copy of Emanuel as it seems a bit expensive (imho) for what it covers. Olver's treatment of Lie Groups & Differential Equations is quite difficult, so I figured I needed a range of other books on the subject, and recently placed orders for the following (listed in increasing order of difficulty):

Peter E. Hydon,
Symmetry Methods for Differential Equations: A Beginner's Guide.
https://www.amazon.com/gp/product/0521497868/?tag=pfamazon01-20&tag=pfamazon01-20

Hans Stephani,
Differential Equations: Their Solution Using Symmetries
https://www.amazon.com/gp/product/0521366895/?tag=pfamazon01-20&tag=pfamazon01-20

L. V. Ovsyannikov,
Lectures on the Theory of Group Properties of Differential Equations
https://www.amazon.com/gp/product/9814460818/?tag=pfamazon01-20&tag=pfamazon01-20

I hope you get back to "meditating" soon. 