What's the deal on infinitesimal operators?

jostpuur · Jul 8, 2013

Oh dear, it could be there is something wrong in my "answer".

I think I was supposed to prove

[tex]
D_{\alpha} (f\circ T(\alpha))(x,y) = (U(f\circ T(\alpha)))(x,y)
[/tex]

but the result that I proved was

[tex]
D_{\alpha} (f\circ T(\alpha))(x,y) = ((Uf)\circ T(\alpha))(x,y)
[/tex]

I'm feeling a little confused now...

jostpuur · Jul 8, 2013

If a function has been written in polar coordinates like [itex]f(r,\theta)[/itex], then

[tex]
e^{\theta\frac{\partial}{\partial \theta}} f(r,\theta_0)
[/tex]

is the Taylor series with respect to the angle variable. Hence the operator should be seen as a rotation operator. On the other hand

[tex]
\frac{\partial}{\partial\theta} = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}
[/tex]

holds too, so we should be able to write the rotation operator in the Cartesian coordinates. Let's define

[tex]
\phi(x,y,\theta) = e^{\theta(-y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y})}f(x,y)
[/tex]

and

[tex]
\psi(x,y,\theta) = f(x\cos\theta - y\sin\theta,\; x\sin\theta + y\cos\theta)
[/tex]

Now

[tex]
\phi(x,y,\theta) = \psi(x,y,\theta)
[/tex]

is supposed to hold, right?

[tex]
\frac{\partial}{\partial\theta}\phi(x,y,\theta) = \Big(-y\frac{\partial}{\partial x}+ x\frac{\partial}{\partial y}\Big)\phi(x,y,\theta)
[/tex]

[tex]
\frac{\partial}{\partial\theta}\psi(x,y,\theta)
= (-x\sin\theta - y\cos\theta)\frac{\partial f}{\partial x}
+ (x\cos\theta - y\sin\theta)\frac{\partial f}{\partial y}
[/tex]
[tex]
= -y\Big(\cos\theta\frac{\partial f}{\partial x} +\sin\theta \frac{\partial f}{\partial y}\Big)
+ x\Big(\cos\theta\frac{\partial f}{\partial y} - \sin\theta \frac{\partial f}{\partial x}\Big)
[/tex]
[tex]
= \Big(-y\frac{\partial}{\partial x} + x \frac{\partial}{\partial y}\Big)\psi(x,y,\theta)
[/tex]

Here the equations

[tex]
\frac{\partial \psi}{\partial x} = \cos\theta \frac{\partial f}{\partial x} +\sin\theta \frac{\partial f}{\partial y}
[/tex]
[tex]
\frac{\partial \psi}{\partial y} = -\sin\theta \frac{\partial f}{\partial x} +\cos\theta\frac{\partial f}{\partial y}
[/tex]

were used.

Ok so [itex]\phi[/itex] and [itex]\psi[/itex] satisfy the same PDE and therefore it should be possible to prove that they are the same thing rigorously.
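A quick numerical sanity check of the claim [itex]\phi = \psi[/itex] for the rotation case can be sketched as follows. It assumes a polynomial test function (the coefficients below are arbitrary, not from the thread), stores it as a dictionary of monomial coefficients, and sums the exponential series term by term; the helper names are hypothetical.

```python
import math

def apply_U(poly):
    """Apply U = -y d/dx + x d/dy to a polynomial {(i, j): c} meaning sum c x^i y^j."""
    out = {}
    for (i, j), c in poly.items():
        if i > 0:  # -y * d/dx (x^i y^j) = -i x^(i-1) y^(j+1)
            key = (i - 1, j + 1)
            out[key] = out.get(key, 0.0) - i * c
        if j > 0:  # x * d/dy (x^i y^j) = j x^(i+1) y^(j-1)
            key = (i + 1, j - 1)
            out[key] = out.get(key, 0.0) + j * c
    return out

def evaluate(poly, x, y):
    return sum(c * x**i * y**j for (i, j), c in poly.items())

def phi(poly, x, y, theta, terms=40):
    """Partial sum of the series exp(theta U) f evaluated at (x, y)."""
    total, term, coeff = 0.0, dict(poly), 1.0
    for n in range(terms):
        total += coeff * evaluate(term, x, y)   # adds theta^n / n! * (U^n f)(x, y)
        term = apply_U(term)
        coeff *= theta / (n + 1)
    return total

def psi(poly, x, y, theta):
    """f evaluated at the rotated point."""
    c, s = math.cos(theta), math.sin(theta)
    return evaluate(poly, x * c - y * s, x * s + y * c)

# Assumed test data: f(x, y) = x^2 + 3xy - 2y at an arbitrary point and angle.
f = {(2, 0): 1.0, (1, 1): 3.0, (0, 1): -2.0}
x0, y0, theta0 = 0.7, -1.3, 0.9
gap = abs(phi(f, x0, y0, theta0) - psi(f, x0, y0, theta0))
print(gap)
```

Because [itex]U[/itex] preserves the total degree of a polynomial, the series converges quickly here; this says nothing about non-analytic [itex]f[/itex].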

I'm sure a similar manipulation should be possible in a more general case, but I did something wrong in my previous attempt, and the pieces didn't all fit together for some reason. It is still a little mysterious, but I'm getting optimistic about this now.

Stephen Tashi · Jul 8, 2013

jostpuur said:

This looks like a mistake to me. The operator [itex]U^2[/itex] behaves like this:

[tex]
U^2f = u_x\frac{\partial}{\partial x}(Uf) + u_y\frac{\partial}{\partial y}(Uf)
[/tex]
[tex]
= u_x\Big(\frac{\partial u_x}{\partial x}\frac{\partial f}{\partial x}
+ u_x\frac{\partial^2 f}{\partial x^2}
+ \frac{\partial u_y}{\partial x}\frac{\partial f}{\partial y}
+ u_y\frac{\partial^2 f}{\partial x\partial y}\Big)
[/tex]
[tex]
+ u_y\Big(\frac{\partial u_x}{\partial y}\frac{\partial f}{\partial x}
+ u_x\frac{\partial^2 f}{\partial x \partial y}
+ \frac{\partial u_y}{\partial y} \frac{\partial f}{\partial y}
+ u_y\frac{\partial^2 f}{\partial y^2}\Big)
[/tex]

If you compute derivatives of [itex]f(x+\alpha u_x,y+\alpha u_y)[/itex] with respect to [itex]\alpha[/itex], the partial derivatives of [itex]u_x[/itex] and [itex]u_y[/itex] will never appear.

I see your point. I was trying to interpret [itex] U^n f [/itex] as an operation that is applied only to the original function [itex] f [/itex] and not to the function that is the result of previous operations.

jostpuur · Jul 8, 2013

Just in case somebody feels unsure about how to interpret the [itex]U^n[/itex], I would recommend studying the example

[tex]
\frac{\partial}{\partial\theta} = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}
[/tex]

I didn't check this now myself, but I'm sure that with some example, it can be verified that

[tex]
\frac{\partial^2}{\partial\theta^2},\frac{\partial^3}{\partial\theta^3},\ldots
[/tex]

will not turn out correctly, if it is assumed that [itex]\frac{\partial}{\partial x},\frac{\partial}{\partial y}[/itex] would commute with [itex]x,y[/itex].
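A sketch of such a check, assuming only that [itex]f[/itex] is twice continuously differentiable: applying the operator twice and keeping the derivatives that act on the coefficients [itex]x[/itex] and [itex]y[/itex] gives

[tex]
\Big(-y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}\Big)^2 f
= y^2\frac{\partial^2 f}{\partial x^2} - 2xy\frac{\partial^2 f}{\partial x\partial y} + x^2\frac{\partial^2 f}{\partial y^2}
- x\frac{\partial f}{\partial x} - y\frac{\partial f}{\partial y}
[/tex]

The last two terms are exactly what is lost if [itex]\frac{\partial}{\partial x},\frac{\partial}{\partial y}[/itex] are treated as commuting with [itex]x,y[/itex]. For instance, with [itex]f(x,y) = x = r\cos\theta[/itex] the full expression gives [itex]-x[/itex], in agreement with [itex]\frac{\partial^2}{\partial\theta^2}(r\cos\theta) = -r\cos\theta[/itex], whereas the commuting version gives [itex]0[/itex].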

I thought that I recognized Stephen Tashi's problem as something that I had thought through earlier, but now it became clear that I hadn't thought it through completely after all. This seems to be becoming a good thread, although it also contains a lot of mistakes now, some of them from me. :blushing:

I'll try to take a break now, and return later to see what's happened.

Stephen Tashi · Jul 8, 2013

With the correction and also adding the evaluation [itex] \alpha = 0 [/itex] to the left hand side,

[tex]
D_{\alpha} f(T_x(x,y,\alpha),T_y(x,y,\alpha))_{|\ \alpha = 0} = U (f(T_x(x,y,\alpha),T_y(x,y,\alpha)))
[/tex]

is my interpretation of what old books on Lie groups prove, and then they immediately assert something that (I think) amounts to

[tex] D^n_{\alpha} f(T_x(x,y,\alpha),T_y(x,y,\alpha))_{|\ \alpha = 0} = U^n (f(T_x(x,y,\alpha),T_y(x,y,\alpha))) [/tex]

as if the result is just straightforward calculus. (Perhaps with your correction to my interpretation of [itex] U^2 [/itex], it is.)

In those books, the right hand side is denoted as [itex] U( f(x_1,y_1) ) [/itex].

In previous pages, notation amounting to [itex] x_1 = T_x(x,y,\alpha),\ y_1 = T_y(x,y,\alpha) [/itex] is employed, and in other places the same symbols are also defined implicitly using "infinitesimals" as [itex] x_1 + \delta x_1 = T_x(x,y,\alpha + \delta \alpha),\ y_1 +\delta y_1 = T_y(x,y,\alpha + \delta \alpha) [/itex]. So I think the interpretation in the first equation is correct.

For matrix groups, lovina's proof works. (Perhaps even my wrong interpretation of [itex] U^2 [/itex] works.)

Stephen Tashi · Jul 8, 2013

Many people react to questions about the expansion in [itex] U [/itex] by bringing up differential equations. Intuitively, I understand why. We trace out the curve followed by the point [itex](T_x(x,y,\alpha),T_y(x,y,\alpha))[/itex] and the value of function [itex] f(T_x,T_y) [/itex] as it varies along that curve, thinking of [itex] \alpha [/itex] as "time". The derivative of [itex] f(x,y) [/itex] with respect to [itex] \alpha [/itex] depends on the gradient of [itex] f [/itex] and the directional derivative of the curve at [itex] (x,y) [/itex].

Since the curve is a group of transformations, the directional derivative at [itex] (x,y) [/itex] is determined by what the transformation does to the point [itex] (x,y) [/itex] by values of [itex] \alpha [/itex] close to zero.

I'd think that some advanced calculus book somewhere would treat expanding a function f(x,y) in Taylor series "along a curve". Then the problem for adapting this result to 1-parameter Lie groups would be to show how that result becomes simpler when the curve is generated by a transformation group.

jostpuur · Jul 10, 2013

One new notation:

[tex]
T(\alpha)(x,y) = \big(T_x(x,y,\alpha),T_y(x,y,\alpha)\big)
[/tex]

so that [itex]T(\alpha)[/itex] is a function with two variables.

Stephen Tashi said:

[tex] D^n_{\alpha} f(T_x(x,y,\alpha),T_y(x,y,\alpha))_{|\ \alpha = 0} = U^n (f(T_x(x,y,\alpha),T_y(x,y,\alpha))) \quad\quad\quad (1)
[/tex]

Expressions like this are dangerous, since there could be ambiguity about whether we mean

[tex]
\big(U(f\circ T(\alpha))\big)(x,y)\quad\quad\quad (2)
[/tex]

or

[tex]
\big((Uf)\circ T(\alpha)\big)(x,y)\quad\quad\quad (3)
[/tex]

The difference between these is that (2) involves weights

[tex]
u_x(x,y),\quad u_y(x,y)
[/tex]

while (3) involves weights

[tex]
u_x\big(T_x(x,y,\alpha),T_y(x,y,\alpha)\big),\quad u_y\big(T_x(x,y,\alpha),T_y(x,y,\alpha)\big)
[/tex]

I made some mistakes with these issues in my previous attempt to deal with the infinitesimal problem through PDEs.

Being fully strict with the notation, [itex]f(x_0,y_0)[/itex] is only a real number once [itex]x_0,y_0[/itex] have been fixed. An operation by [itex]U[/itex] from the left on a real number is not defined, hence [itex]U(f(x_0,y_0))[/itex] is not defined. I wouldn't get too strict merely for the sake of being strict, but critical ambiguity should be avoided.

jostpuur · Jul 10, 2013

Stephen Tashi said:

Many people react to questions about the expansion in [itex] U [/itex] by bringing up differential equations. Intuitively, I understand why.

IMO the PDE approach is extremely critical. The reason is that sometimes the exponential series can diverge, but the transformation itself may still make sense. Even then, the transformation could be seen as generated by some PDE. The exponential operator could then be seen as a formal notation for an operator that generates the transformation.

For example, the translation operator is well defined even if Taylor series diverge at some points. Also, a transport equation (PDE) will generate the translation very well for differentiable functions whose Taylor series diverges at some points.
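A small illustration of this point, using a hypothetical one-variable test function [itex]f(x) = 1/(1+x^2)[/itex] that is not from the thread: the formal series [itex]e^{a\frac{d}{dx}}f(0)[/itex] is the Maclaurin series evaluated at [itex]a[/itex], which diverges for [itex]|a|>1[/itex], while the translated value [itex]f(0+a)[/itex] is perfectly well defined.

```python
def f(x):
    """Assumed test function; smooth on the whole real line."""
    return 1.0 / (1.0 + x * x)

def taylor_partial_sum(a, n_terms):
    """Partial sum of the Maclaurin series of 1/(1+x^2) at x = a: sum of (-1)^k a^(2k)."""
    return sum((-1) ** k * a ** (2 * k) for k in range(n_terms))

a = 2.0
translated = f(0.0 + a)  # the translation itself is unproblematic
partial_sums = [taylor_partial_sum(a, n) for n in (5, 10, 20)]

print(translated)
print(partial_sums)  # magnitudes grow without bound instead of settling down
```

The transport equation [itex]\partial_a \phi = \partial_x \phi[/itex] with initial data [itex]f[/itex] is solved by [itex]\phi(x,a) = f(x+a)[/itex] regardless of whether the series converges.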

Stephen Tashi said:

Is there a treatment of "infinitesimal operators" that is rigorous from the epsilon-delta point of view?

In looking for material on the infinitesimal transformations of Lie groups, I find many things online about infinitesimal operators. Most seem to be by people who take the idea of infinitesimals seriously, and I don't think they are talking about the rigorous approach to infinitesimals a la Abraham Robinson and "nonstandard analysis".

I have never seen any such rigorous material. My own replies were a combination of knowledge of PDEs, groups, and some thinking of my own.

Every time I have read about Lie groups, it has been about actions on some finite dimensional manifolds. Now these differential operators are acting on infinite dimensional vector spaces. Are these infinite dimensional Lie groups? I guess not. The groups themselves are still finite dimensional?

Returning to the opening post... it would be interesting to learn about some pedagogical material on this. It seems replies containing links to such did not appear yet.

jostpuur · Jul 10, 2013

By comparing my posts #34 and #37, it is clear that #34 contained some fatal mistakes. I would like to outline the goal more clearly now.

We define two functions [itex]\phi,\psi[/itex] by formulas

[tex]
\phi(x,y,\alpha) = (e^{\alpha U}f)(x,y)
[/tex]

[tex]
\psi(x,y,\alpha) = f\big(T_x(x,y,\alpha),T_y(x,y,\alpha)\big)
[/tex]

Now [itex]\phi(x,y,0)=\psi(x,y,0)[/itex] holds obviously, and

[tex]
\frac{\partial\phi(x,y,\alpha)}{\partial\alpha} = U\phi(x,y,\alpha)
[/tex]

holds almost by the definition. The goal should be to prove that [itex]\psi[/itex] satisfies the same PDE, which would then imply [itex]\phi=\psi[/itex].

[tex]
\frac{\partial\psi(x,y,\alpha)}{\partial\alpha}
= \frac{\partial T_x(x,y,\alpha)}{\partial\alpha} \frac{\partial f}{\partial x}
+ \frac{\partial T_y(x,y,\alpha)}{\partial\alpha} \frac{\partial f}{\partial y}
[/tex]

[tex]
\frac{\partial\psi}{\partial x} =\frac{\partial f}{\partial x}\frac{\partial T_x}{\partial x}
+ \frac{\partial f}{\partial y}\frac{\partial T_y}{\partial x}
[/tex]

[tex]
\frac{\partial\psi}{\partial y} =\frac{\partial f}{\partial x}\frac{\partial T_x}{\partial y}
+ \frac{\partial f}{\partial y}\frac{\partial T_y}{\partial y}
[/tex]

In the end we want [itex]U\psi[/itex] to appear, so we need to get the [itex]\frac{\partial\psi}{\partial\alpha}[/itex] written in such form that [itex]\frac{\partial\psi}{\partial x}[/itex] and [itex]\frac{\partial\psi}{\partial y}[/itex] would be present, and not the derivatives of [itex]f[/itex].

It's not immediately obvious how to accomplish that. The properties of the [itex]T_x,T_y[/itex] should be studied in more detail.

Stephen Tashi · Jul 10, 2013

jostpuur said:

One new notation:

[tex]
T(\alpha)(x,y) = \big(T_x(x,y,\alpha),T_y(x,y,\alpha)\big)
[/tex]

so that [itex]T(\alpha)[/itex] is a function with two variables.

Expressions like this are dangerous, since there could be ambiguity about whether we mean

[tex]
\big(U(f\circ T(\alpha))\big)(x,y)\quad\quad\quad (2)
[/tex]

or

[tex]
\big((Uf)\circ T(\alpha)\big)(x,y)\quad\quad\quad (3)
[/tex]

What about
[tex]
U( f\circ (T(\alpha)(x,y) )) \quad\quad\quad (4)
[/tex]

which is how I'd translate

[tex] D_{\alpha} f(T_x(x,y,\alpha),T_y(x,y,\alpha))_{|\ \alpha = 0} = U (f(T_x(x,y,\alpha),T_y(x,y,\alpha))) [/tex]

into your notation.

Of course, I'm just guessing at what the old books mean. The result I'm trying to translate into modern calculus is shown in "An Introduction to the Lie Theory of One-Parameter Groups" by Abraham Cohen (1911) http://archive.org/details/introlietheory00coherich page 30 of the PDF, page 14 of the book.

The expansion is also asserted in the modern book Solution of Ordinary Differential Equations by Continuous Groups by George Emanuel (2001), page 13.

jostpuur · Jul 10, 2013

Continuing the work from post #44

When you don't know how to arrive at the goal, the basic trick is to see what happens if you work backwards from the goal. It turns out like this:

[tex]
U\psi(x,y,\alpha) = \frac{\partial T_x(x,y,0)}{\partial\alpha}
\frac{\partial\psi(x,y,\alpha)}{\partial x}
+ \frac{\partial T_y(x,y,0)}{\partial\alpha}\frac{\partial\psi(x,y,\alpha)}{\partial y}
[/tex]
[tex]
=\Big(\frac{\partial T_x(x,y,0)}{\partial\alpha}\frac{\partial T_x(x,y,\alpha)}{\partial x}
+ \frac{\partial T_y(x,y,0)}{\partial\alpha}\frac{\partial T_x(x,y,\alpha)}{\partial y}\Big)
\frac{\partial f(T_x(x,y,\alpha),T_y(x,y,\alpha))}{\partial x}
[/tex]
[tex]
+\Big(\frac{\partial T_x(x,y,0)}{\partial\alpha}\frac{\partial T_y(x,y,\alpha)}{\partial x}
+\frac{\partial T_y(x,y,0)}{\partial\alpha}\frac{\partial T_y(x,y,\alpha)}{\partial y}\Big)
\frac{\partial f(T_x(x,y,\alpha),T_y(x,y,\alpha))}{\partial y}
[/tex]

Comparing this with equation of post #44 we see that [itex]\psi[/itex] will satisfy the desired PDE if the equations

[tex]
\frac{\partial T_x(x,y,\alpha)}{\partial\alpha}
= \frac{\partial T_x(x,y,0)}{\partial\alpha}\frac{\partial T_x(x,y,\alpha)}{\partial x}
+ \frac{\partial T_y(x,y,0)}{\partial\alpha}\frac{\partial T_x(x,y,\alpha)}{\partial y}
[/tex]
[tex]
\frac{\partial T_y(x,y,\alpha)}{\partial\alpha}
= \frac{\partial T_x(x,y,0)}{\partial\alpha}\frac{\partial T_y(x,y,\alpha)}{\partial x}
+\frac{\partial T_y(x,y,0)}{\partial\alpha}\frac{\partial T_y(x,y,\alpha)}{\partial y}
[/tex]

hold.

Do these follow from some axioms about the transformation?

jostpuur · Jul 10, 2013

Stephen Tashi said:

What about
[tex]
U( f\circ (T(\alpha)(x,y) )) \quad\quad\quad (4)
[/tex]

This contains two mistakes. You probably meant

[tex]
U((f\circ T(\alpha))(x,y))
[/tex]

?

The problem with this is that [itex](f\circ T(\alpha))(x,y)[/itex] is a real number, and we cannot operate with [itex]U[/itex] from the left on real numbers. Equation (4) also fails because [itex]T(\alpha)(x,y)[/itex] is a point on the plane, and [itex]f\circ (x',y')[/itex] is not defined for any point [itex](x',y')[/itex]; only [itex]f(x',y')[/itex] is.

Notice that the difference between equations (2) and (3) was that the weights [itex]u_x,u_y[/itex] were evaluated at different points. There are only two obvious ways to evaluate [itex]u_x,u_y[/itex], so the ambiguous expressions will consequently have at most two obvious interpretations.

Stephen Tashi · Jul 10, 2013

jostpuur said:

This contains two mistakes. You probably meant

[tex]
U((f\circ T(\alpha))(x,y))
[/tex]

I won't say I mean that since I don't know what it means yet.

Is the convention you're using that any function [itex] G [/itex] mapping 3 variables [itex] (x,y,\alpha) [/itex] to a point on the plane [itex] (G_x(x,y,\alpha), G_y(x,y,\alpha) ) [/itex] will be denoted as [itex] G(\alpha) (x,y) [/itex]? Or is this honor reserved for [itex] T(\alpha)(x,y) [/itex] alone?

I don't see [itex] f \circ T [/itex] as a function that maps 3 variables to a point on the plane and I don't mean that.

The problem with this is that [itex](f\circ T(\alpha))(x,y)[/itex] is a real number, and we cannot operate with [itex]U[/itex] from the left on real numbers.

I can't interpret [itex](f\circ T(\alpha))(x,y)[/itex]. The two arguments [itex] (x,y) [/itex] sitting outside the parentheses confuse me.

Returning to the notation [itex] T(x,y,\alpha) = (T_x(x,y,\alpha), T_y(x,y,\alpha)) [/itex]
I consider [itex]f\circ T(x,y,\alpha)[/itex] as a real valued function of 2 variables where each of those variables is replaced by a real valued function of 3 variables. So [itex] f [/itex] is only a specific real number when [itex] x, y, \alpha [/itex] are specific real numbers.

Equation (4) also fails because [itex]T(\alpha)(x,y)[/itex] is a point on the plane, and [itex]f\circ (x',y')[/itex] is not defined for any point [itex](x',y')[/itex]; only [itex]f(x',y')[/itex] is.

I don't understand what the primes mean, but I think that argument is a further objection to something that I don't mean anyway.

Notice that the difference between equations (2) and (3) was that the weights [itex]u_x,u_y[/itex] were evaluated at different points. There are only two obvious ways to evaluate [itex]u_x,u_y[/itex], so the ambiguous expressions will consequently have at most two obvious interpretations.

I heartily agree that where the weights are to be evaluated needs precise notation. Denoting a real valued function [itex] g [/itex] as [itex] g(A,B) [/itex] to avoid the confusing proliferation of x's and y's, I think that [itex] U [/itex] is the operator

[tex] U g(A,B) = u_x(A,B)\frac{\partial}{\partial A} g(A,B) + u_y(A,B) \frac{\partial}{\partial B} g(A,B) [/tex]

So in the case that [itex] A = T_x(x,y,\alpha),\ B = T_y(x,y,\alpha) [/itex], those are the values where the derivatives and weights are evaluated. An [itex] \alpha [/itex] is thus introduced into this evaluation. When we set [itex] \alpha = 0 [/itex] to get the Taylor coefficient we have [itex] T_x(x,y,0) = x, \ T_y(x,y,0) = y [/itex] so the evaluation in the Taylor coefficient is done at [itex] (A,B) = (x,y) [/itex].

Of course, I'm not yet certain how the old books are defining [itex] U [/itex].

Stephen Tashi · Jul 10, 2013

jostpuur said:

Comparing this with equation of post #44 we see that [itex]\psi[/itex] will satisfy the desired PDE if the equations

[tex]
\frac{\partial T_x(x,y,\alpha)}{\partial\alpha}
= \frac{\partial T_x(x,y,0)}{\partial\alpha}\frac{\partial T_x(x,y,\alpha)}{\partial x}
+ \frac{\partial T_y(x,y,0)}{\partial\alpha}\frac{\partial T_x(x,y,\alpha)}{\partial y}
[/tex]
[tex]
\frac{\partial T_y(x,y,\alpha)}{\partial\alpha}
= \frac{\partial T_x(x,y,0)}{\partial\alpha}\frac{\partial T_y(x,y,\alpha)}{\partial x}
+\frac{\partial T_y(x,y,0)}{\partial\alpha}\frac{\partial T_y(x,y,\alpha)}{\partial y}
[/tex]

hold.

Do these follow from some axioms about the transformation?

I think they do.

I don't follow the entire argument yet, but I at least understand the above question.

We assume the coordinate functions obey the composition law:

[itex] T_x( T_x(x,y,\beta), T_y(x,y,\beta),\alpha) = T_x( x,y,\alpha + \beta) [/itex] [1]
[itex] T_y( T_x(x,y,\beta), T_y(x,y,\beta),\alpha) = T_y( x,y,\alpha + \beta) [/itex] [2]

I copy the technique that lovinia showed for matrix groups.

Consider [itex] T_x [/itex] to be [itex]T_x(A,B,C) [/itex] to avoid confusion. Differentiate both sides of [1] with respect to [itex] \beta [/itex] :

[3]:
[tex] \frac{\partial T_x}{\partial A} \frac{\partial T_x}{\partial C} + \frac{\partial T_x}{\partial B} \frac{\partial T_y}{\partial C} = \frac{\partial T_x}{\partial C}[/tex]

The evaluations of these functions are:

Left hand side:

[itex] \frac{\partial T_x}{\partial A} [/itex] at [itex] A = T_x(x,y,\beta),\ B = T_y(x,y,\beta),\ C = \alpha [/itex]
[itex] \frac{\partial T_x}{\partial B} [/itex] at [itex] A = T_x(x,y,\beta),\ B = T_y(x,y,\beta),\ C = \alpha[/itex]
[itex] \frac{\partial T_x}{\partial C} [/itex] at [itex] A = x,\ B = y ,\ C = \beta [/itex]
[itex] \frac{\partial T_y}{\partial C} [/itex] at [itex] A = x,\ B = y ,\ C = \beta [/itex]

Right hand side:
[itex] \frac{\partial T_x}{\partial C} [/itex] at [itex] A = x,\ B = y,\ C = \alpha + \beta [/itex]

Set [itex] \beta = 0 [/itex] in [3]. This sets [itex] T_x(x,y,\beta)= x,\ T_y(x,y,\beta) = y [/itex] so we have the same functions with the evaluations:

[itex] \frac{\partial T_x}{\partial A} [/itex] at [itex] A = x,\ B = y ,\ C = \alpha [/itex]
[itex] \frac{\partial T_x}{\partial B} [/itex] at [itex] A = x,\ B = y, \ C = \alpha [/itex]
[itex] \frac{\partial T_x}{\partial C}[/itex] at [itex] A = x,\ B = y,\ C = 0 [/itex]
[itex] \frac{\partial T_y}{\partial C}[/itex] at [itex] A = x,\ B = y,\ C = 0 [/itex]

Right hand side:
[itex] \frac{\partial T_x}{\partial C} [/itex] at [itex] A =x,\ B = y,\ C = \alpha[/itex]

I interpret the above right hand side as the same thing as [itex] \frac{\partial T_x}{\partial \alpha} [/itex]

Using your notation ( [itex]\frac{\partial T_x}{\partial x} [/itex] instead of [itex]\frac{\partial T_x}{\partial A} [/itex] etc.), I interpret the above left hand side to be the same thing as

[tex] \frac{\partial T_x}{\partial x}(x,y,\alpha) \frac{\partial T_x}{\partial \alpha}(x,y,0) + \frac{\partial T_x}{\partial y}(x,y,\alpha) \frac{\partial T_y}{\partial \alpha}(x,y,0) [/tex]

so I get the equation you needed for the derivatives of [itex] T_x [/itex].
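The identity can be spot-checked numerically. The sketch below assumes a specific hypothetical one-parameter group, [itex]T(\alpha)(x,y) = (x,y)/(1-\alpha x)[/itex], chosen only because it is nonlinear and satisfies the composition law; derivatives are approximated by central finite differences, so the residuals are small but not exactly zero.

```python
def T(x, y, a):
    """Assumed example group (not from the thread): T(a)(x, y) = (x, y) / (1 - a x)."""
    d = 1.0 - a * x
    return (x / d, y / d)

def partial(g, args, k, h=1e-6):
    """Central finite difference of g with respect to its k-th argument."""
    up = list(args); up[k] += h
    dn = list(args); dn[k] -= h
    return (g(*up) - g(*dn)) / (2.0 * h)

def component(c):
    """Return T_x (c = 0) or T_y (c = 1) as a scalar function of (x, y, a)."""
    return lambda X, Y, A: T(X, Y, A)[c]

def residuals(x, y, a):
    """|dT_c/da - (u_x dT_c/dx + u_y dT_c/dy)| for c = x, y, with u = dT/da at a = 0."""
    u = [partial(component(c), (x, y, 0.0), 2) for c in (0, 1)]
    out = []
    for c in (0, 1):
        lhs = partial(component(c), (x, y, a), 2)
        rhs = (u[0] * partial(component(c), (x, y, a), 0)
               + u[1] * partial(component(c), (x, y, a), 1))
        out.append(abs(lhs - rhs))
    return out

x0, y0, a0, b0 = 0.4, -0.8, 0.5, 0.3
# Composition law [1], [2]: T(a) applied to T(b)(x, y) equals T(a + b)(x, y).
composed = T(*T(x0, y0, b0), a0)
direct = T(x0, y0, a0 + b0)
res = residuals(x0, y0, a0)
print(composed, direct, res)
```

A passing check for one group is of course no proof; it only guards against sign or evaluation-point slips in the derivation.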

jostpuur · Jul 11, 2013

Stephen Tashi said:

We assume the coordinate functions obey the composition law:

[itex] T_x( T_x(x,y,\beta), T_y(x,y,\beta),\alpha) = T_x( x,y,\alpha + \beta) [/itex] [1]
[itex] T_y( T_x(x,y,\beta), T_y(x,y,\beta),\alpha) = T_y( x,y,\alpha + \beta) [/itex] [2]

I copy the technique that lovinia showed for matrix groups.

Consider [itex] T_x [/itex] to be [itex]T_x(A,B,C) [/itex] to avoid confusion. Differentiate both sides of [1] with respect to [itex] \beta [/itex] :

I had only considered derivatives with respect to [itex]\alpha[/itex] from this equation. The derivative with respect to [itex]\beta[/itex] was the trick.

I don't see many problems with this anymore. IMO this is reasonably settled now.
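As a numerical sanity check of the settled claim, the sketch below compares [itex]\partial\psi/\partial\alpha[/itex] with both readings (2) and (3) from earlier in the thread. The group [itex]T(\alpha)(x,y) = (x,y)/(1-\alpha x)[/itex] and the test function [itex]f(x,y)=xy[/itex] are assumptions made for illustration; for them [itex]u_x = x^2[/itex] and [itex]u_y = xy[/itex] by direct differentiation.

```python
def T(x, y, a):
    """Assumed example group: T(a)(x, y) = (x, y) / (1 - a x)."""
    d = 1.0 - a * x
    return (x / d, y / d)

def f(x, y):
    return x * y  # arbitrary smooth test function

def u(x, y):
    """Infinitesimal elements of this T (dT/da at a = 0), computed by hand."""
    return (x * x, x * y)

def diff(g, args, k, h=1e-6):
    """Central finite difference of g with respect to its k-th argument."""
    up = list(args); up[k] += h
    dn = list(args); dn[k] -= h
    return (g(*up) - g(*dn)) / (2.0 * h)

def psi(x, y, a):
    return f(*T(x, y, a))

def U_of_psi(x, y, a):
    """Reading (2): U applied to f o T(a), weights evaluated at (x, y)."""
    ux, uy = u(x, y)
    return ux * diff(psi, (x, y, a), 0) + uy * diff(psi, (x, y, a), 1)

def Uf_at_T(x, y, a):
    """Reading (3): Uf evaluated at T(a)(x, y), weights evaluated there too."""
    X, Y = T(x, y, a)
    ux, uy = u(X, Y)
    return ux * diff(f, (X, Y), 0) + uy * diff(f, (X, Y), 1)

x0, y0, a0 = 0.4, -0.8, 0.5
dpsi = diff(psi, (x0, y0, a0), 2)
print(dpsi, U_of_psi(x0, y0, a0), Uf_at_T(x0, y0, a0))
```

For this example all three numbers agree up to finite-difference error, which is what one expects when [itex]T[/itex] really is a one-parameter group.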

The two dimensional plane was chosen for simplicity of study in the beginning. It could be interesting to check the same thing in higher dimensions, and get some guarantees for the uniqueness of solutions of PDEs.

jostpuur · Jul 11, 2013

https://en.wikipedia.org/wiki/Flow_(mathematics)

https://en.wikipedia.org/wiki/Vector_flow

bolbteppa · Jul 14, 2013

Sorry for such a late response, here we go:

Stephen Tashi said:

An example that I've given in another thread (on using Lie Groups to solve differential equations) is the treatment of the "infinitesimal operator" of a 1-parameter Lie group.

Avoiding the traditional zoo of Greek letters, I prefer the terminology:

Let [itex] T(x,y,\alpha) = ( \ T_x(x,y,\alpha), T_y(x,y,\alpha )\ ) [/itex] denote an element of a Lie group of 1-parameter transformations of the xy-plane onto itself.

Let [itex] f(x,y) [/itex] be a real valued function whose domain is the xy-plane.

Let [itex] u_x(x,y) = D_\alpha \ T_x(x,y,\alpha)_{\ |\alpha=0} [/itex]
Let [itex] u_y(x,y) = D_\alpha \ T_y(x,y,\alpha)_{\ |\alpha =0} [/itex]

([itex] u_x [/itex] and [itex] u_y [/itex] are the " infinitesimal elements ".)

Let [itex] U [/itex] be the differential operator defined by the operation on the function [itex]g(x,y)[/itex] by:

[itex] U g(x,y) = u_x(x,y) \frac{\partial}{\partial x} g(x,y) + u_y(x,y)\frac{\partial}{\partial y} g(x,y) [/itex]

(The operator [itex] U [/itex] is "the symbol of the infinitesimal transformation" .)

I think the notation is the source of the problem here, it gets a bit confusing not specifying which point you're starting from etc... I'll just derive it the way I would do it, substituting your notation into what I've done at one or two points so we can compare the results.

If I set [itex]x_0 = \phi(x_0,y_0,\alpha_0), \ y_0 = \psi(x_0,y_0,\alpha_0)[/itex] & [itex]x_1 = \phi(x_0,y_0,\alpha_0 + \delta \alpha), \ y_1 = \psi(x_0,y_0,\alpha_0 + \delta \alpha)[/itex] then

[tex] \Delta x = x_1 - x_0 = \phi(x_0,y_0,\alpha_0 + \delta \alpha) - \phi(x_0,y_0,\alpha_0) = \delta x + \varepsilon_{error}(\delta x) \approx \delta x = \frac{\partial \phi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \delta \alpha[/tex]

Similarly [itex] \delta y = \frac{\partial \psi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \delta \alpha [/itex]

Therefore [itex](x_1,y_1) = T(x_0,y_0,\alpha) = (\phi(x_0,y_0,\alpha),\psi(x_0,y_0,\alpha))[/itex] w/ [itex]T(x_0,y_0,\alpha_0) = (x_0,y_0)[/itex] &

[tex](x_1,y_1) = T(x_0,y_0,\alpha) = (\phi(x_0,y_0,\alpha),\psi(x_0,y_0,\alpha)) \\
\ \ \ \ \ \ \ \ \ \ \ \ \ = (\phi(x_0,y_0,\alpha_0) + \frac{\partial \phi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \delta \alpha,\psi(x_0,y_0,\alpha_0) + \frac{\partial \psi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0}\delta \alpha)[/tex]

Using your notation I have

[tex] T_x(x_0,y_0,\alpha)= \phi(x_0,y_0,\alpha) = \phi(x_0,y_0,\alpha_0) + \frac{\partial \phi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \delta \alpha \\
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ = T_x(x_0,y_0,\alpha_0) + \frac{\partial T_x}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0}\delta \alpha = T_x(x_0,y_0,\alpha_0) + u_x(x_0,y_0)\delta \alpha[/tex]

Now we want to deal with [itex]f(x_1,y_1) = f(\phi(x_0,y_0,\alpha),\psi(x_0,y_0,\alpha))[/itex].

Since the Taylor series expansion of [itex]f[/itex] is

[tex]f(x_1,y_1) = f(x_0,y_0) + \frac{\partial f}{\partial x}(x_0,y_0)\delta x + \frac{\partial f}{\partial y}(x_0,y_0)\delta y + \frac{1}{2!}(\delta x \frac{\partial}{\partial x} + \delta y \frac{\partial}{\partial y})^2f|_{(x,y) = (x_0,y_0)} + ...[/tex]

written more succinctly as

[tex]f(x_1,y_1) = \sum_{n=0}^\infty \frac{1}{n!}(\delta x \frac{\partial}{\partial x} + \delta y \frac{\partial}{\partial y})^nf|_{(x,y) = (x_0,y_0)}[/tex]

we see that simply by substituting in our [itex] \delta y = \frac{\partial \psi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \delta \alpha [/itex] terms we get:

[tex]f(x_1,y_1) = \sum_{n=0}^\infty \frac{1}{n!}(\delta x \frac{\partial}{\partial x} + \delta y \frac{\partial}{\partial y})^nf|_{(x,y) = (x_0,y_0)} \\
\ \ \ \ \ \ \ \ \ \ \ \ \ \ = \sum_{n=0}^\infty \frac{1}{n!} (\frac{\partial \phi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \delta \alpha \frac{\partial}{\partial x} + \frac{\partial \psi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \delta \alpha \frac{\partial}{\partial y})^nf|_{(x,y) = (x_0,y_0)} \\
\ \ \ \ \ \ \ \ \ \ \ \ \ \ = \sum_{n=0}^\infty \frac{(\delta \alpha)^n}{n!}(\frac{\partial \phi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \frac{\partial}{\partial x} + \frac{\partial \psi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \frac{\partial}{\partial y})^nf|_{(x,y) = (x_0,y_0)} \\
\ \ \ \ \ \ \ \ \ \ \ \ \ \ = \sum_{n=0}^\infty \frac{(\delta \alpha)^n}{n!}(\varepsilon(x_0,y_0) \frac{\partial}{\partial x} + \eta(x_0,y_0) \frac{\partial}{\partial y})^nf|_{(x,y) = (x_0,y_0)}[/tex]

Stephen Tashi said:

Every book that takes a concrete approach to Lie Groups proves a result that says

[tex]f(x_1,y_1) = f(x,y) + U f\ \alpha + \frac{1}{2!}U^2 f \ \alpha^2 + \frac{1}{3!}U^3 f \ \alpha^3 + ..[/tex]

by using Taylor series.

However, the function they are expanding is (to me) unclear.

If I try to expand [itex] f(x_1,y_1) = f(T_x(x,y,\alpha),T_y(x,y,\alpha)) [/itex] in Taylor series, only the first two terms of that result work.

Just substitute [itex] T_x(x_0,y_0,\alpha)= \phi(x_0,y_0,\alpha)[/itex] into what I derived above:

[tex] \sum_{n=0}^\infty \frac{(\delta \alpha)^n}{n!}(\frac{\partial \phi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \frac{\partial}{\partial x} + \frac{\partial \psi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \frac{\partial}{\partial y})^nf|_{(x,y) = (x_0,y_0)} \\
= \sum_{n=0}^\infty \frac{(\delta \alpha)^n}{n!}(\frac{\partial T_x}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \frac{\partial}{\partial x} + \frac{\partial T_y}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \frac{\partial}{\partial y})^nf|_{(x,y) = (x_0,y_0)}\\
= \sum_{n=0}^\infty \frac{(\delta \alpha)^n}{n!}(u_x \frac{\partial}{\partial x} + u_y \frac{\partial}{\partial y})^nf|_{(x,y) = (x_0,y_0)}[/tex]

(I don't like the [itex]u_x[/itex], [itex]\varepsilon[/itex] or [itex]\eta[/itex] terms, a) because the subscript looks like a partial derivative & b) because defining [itex]u_x[/itex] or [itex]\eta[/itex] etc... makes me forget what we're doing, I think including the partial derivatives explicitly is least confusing, though that may be due to inexperience)

Stephen Tashi said:

If I expand [itex] f(x_1,y_1) = f( x + \alpha \ u_x(x,y), y + \alpha \ u_y(x,y) ) [/itex] then I get the desired result. So I think this is equivalent to what the books do because they do not give an elaborate proof of the result. They present it as being "just calculus" and expanding [itex] f(x_1,y_1) = f( x + \alpha \ u_x(x,y), y + \alpha \ u_y(x,y) ) [/itex] is indeed just calculus.

This is exactly equivalent to what I wrote above, only instead of [itex]\alpha u_x(x,y) [/itex] I used [itex]\frac{\partial \phi}{\partial \alpha}\delta \alpha[/itex], the fact we can write both things basically amounts to whether we use [itex]\alpha - 0[/itex] or [itex]\alpha - \alpha_0[/itex] in our notation (where [itex]\alpha_0[/itex] & [itex] \alpha = 0[/itex] do the same thing!). If there's any ambiguity here just let me know.

Stephen Tashi said:

The books then proceed to give examples where the above result is applied to expand [itex] f(x_1,y_1) = f(T_x(x,y,\alpha),T_y(x,y,\alpha)) [/itex] I haven't found any source that justifies this expansion except by using the concept of infinitesimals.

Hopefully what I wrote is enough, it all relies on nothing more than Taylor series thus you could just modify a more rigorous proof of Taylor's theorem I'd imagine. This is all a mix of what's in Cohen & Emanuel so hopefully it's alright!

Stephen Tashi said:

Many people react to questions about the expansion in [itex] U [/itex] by bringing up differential equations. Intuitively, I understand why. We trace out the curve followed by the point [itex](T_x(x,y,\alpha),T_y(x,y,\alpha))[/itex] and the value of function [itex] f(T_x,T_y) [/itex] as it varies along that curve, thinking of [itex] \alpha [/itex] as "time". The derivative of [itex] f(x,y) [/itex] with respect to [itex] \alpha [/itex] depends on the gradient of [itex] f [/itex] and the directional derivative of the curve at [itex] (x,y) [/itex].

Since the curve is a group of transformations, the directional derivative at [itex] (x,y) [/itex] is determined by what the transformation does to the point [itex] (x,y) [/itex] by values of [itex] \alpha [/itex] close to zero.

I'd think that some advanced calculus book somewhere would treat expanding a function f(x,y) in Taylor series "along a curve". Then the problem for adapting this result to 1-parameter Lie groups would be to show how that result becomes simpler when the curve is generated by a transformation group.

The proof of Taylor's theorem in two variables follows exactly the same approach we're following, almost word for word as far as I can see. You could substitute the [itex]t[/itex] in their proof with [itex]\alpha[/itex] for us & get the same result, just modifying a tiny bit of notation, thus you are indeed incorporating the idea of a directional derivative in a sense. Their proof just assumes you've got a line in your domain between the two points you're deriving the Taylor series with, the transformation group idea is analogous to saying we can move our second point along any point on this line, there isn't really any difference between them as far as I can see (on an intuitive level, modulo the formalism of establishing what transformation groups are to allow for curved paths between those points etc...). Furthermore references to differential equations & matrix exponentials are implicitly encoding our basic ideas as far as I can see. Doing it any other way is merely a formalistic way of justifying our intuition in the context of a formalism we've built up to get the answer we already know we're looking for.

jostpuur · Jul 14, 2013

bolbteppa said:

[tex]f(x_1,y_1) = \sum_{n=0}^\infty \frac{1}{n!}(\delta x \frac{\partial}{\partial x} + \delta y \frac{\partial}{\partial y})^nf|_{(x,y) = (x_0,y_0)}[/tex]

Am I understanding correctly, that you have written [itex]f[/itex] in [itex](x_1,y_1)[/itex] as a Taylor series with respect to the point [itex](x_0,y_0)[/itex], under the assumption that these points are infinitesimally close to each other? Doesn't look very useful.

Stephen Tashi said:

If I expand [itex] f(x_1,y_1) = f( x + \alpha \ u_x(x,y), y + \alpha \ u_y(x,y) ) [/itex] then I get the desired result.

This is exactly equivalent to what I wrote above, only instead of [itex]\alpha u_x(x,y) [/itex] I used [itex]\frac{\partial \phi}{\partial \alpha}\delta \alpha[/itex],...

What Tashi wrote was incorrect, so this isn't looking good. Explanation here #32

Stephen Tashi · Jul 14, 2013

bolbteppa said:

We see that simply by substituting in our [itex] \delta y = \frac{\partial \psi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0} \delta \alpha [/itex] terms we get:

How is such a substitution justified by standard calculus? You're putting an approximation into an infinite series of terms.

It looks to me like you are reproducing what the old books do using an argument based on infinitesimals. You do supply more details than the old books. However, if we use standard calculus, we obtain the 3rd term of the Taylor series that I got in post #24 of https://www.physicsforums.com/showthread.php?t=697036&page=2. I think I can get the third term to work out without any infinitesimal reasoning. I'll try to accomplish that in my next post in that thread.

I think the old books "have something going for them" and the way they reason would be useful to know. However, I want it translated to (or at least clearly related to) the modern viewpoint of calculus.

Stephen Tashi · Jul 14, 2013

jostpuur said:

What Tashi wrote was incorrect, so this isn't looking good. Explanation here #32

Expanding that function was my intended goal; my method was incorrect. It would be interesting to see if the correct interpretation of [itex] U^n [/itex] makes any difference in the final result for the simple arguments that I used in [itex]f [/itex].

bolbteppa · Jul 14, 2013

jostpuur said:

Am I understanding correctly, that you have written [itex]f[/itex] in [itex](x_1,y_1)[/itex] as a Taylor series with respect to the point [itex](x_0,y_0)[/itex], under the assumption that these points are infinitesimally close to each other? Doesn't look very useful.

Our goal in the other thread referred to is to use Lie groups to solve ordinary differential equations. There is a theorem, which I hope we'll prove, that the transformations [itex]T[/itex] of the form [itex] T(x_0,y_0,\alpha) = (\phi(x_0,y_0,\alpha),\psi(x_0,y_0,\alpha))[/itex] of a one-parameter group can be put in one-one correspondence with transformations of a group generated by the infinitesimal generator

[tex]Uf \ = \ \varepsilon(x_0,y_0) \frac{\partial f}{\partial x} + \eta(x_0,y_0)\frac{\partial f}{\partial y} = \frac{\partial \phi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0}\frac{\partial f}{\partial x} + \frac{\partial \psi}{\partial \alpha}|_{\alpha = \alpha_0}\frac{\partial f}{\partial y}[/tex]

The idea is that those coefficients are basically the coefficients of the tangent vector to the curve (thus need to be evaluated at a point!) & we'll use this local idea to generate our entire (global) curve & also solve some differential equations... What you would say doesn't look very useful is something someone like Lie would say has the potential to revolutionize an entire subject

jostpuur said:

What Tashi wrote was incorrect, so this isn't looking good. Explanation here #32

First of all you actually made the mistake when you wrote [itex] \frac{\partial u_x}{\partial x}[/itex], the [itex]u_x[/itex] should be written as [itex]u_x = u_x(x_0,y_0,\alpha)[/itex], in the context of what we're doing it is only a function of [itex]\alpha[/itex] being evaluated at [itex]x = x_0, y = y_0[/itex], thus the last sentence of your post is also a misinterpretation of what we're actually doing as those partials are never meant to show up. Second if what Tashi wrote was incorrect, & I said what I'd written was equivalent to what he'd written, then it should have been easy to spot the flaw in what I'd written also. The reason you're allowed to interpret it as an operator acting on [itex]f[/itex] is because you end up evaluating the derivatives at [itex]x = x_0, y = y_0[/itex] thus it is nothing but a shortcut to an assumption you're making in the derivation of the Taylor series, in the Taylor series you have to evaluate all partials at the point [itex](x_0,y_0)[/itex]. Thus what he wrote is fine, & this is the method used in those old scary books where they went out of their way to prove the theorem I referred to above. What you've basically done is to take Taylor's theorem:

[tex] f(x_0 + \delta x,y_0 + \delta y) = f(x_0,y_0) + \frac{\partial f}{\partial x}|_{(x_0,y_0)} \delta x + \frac{\partial f}{\partial y}|_{(x_0,y_0)} \delta y + \frac{1}{2!}(\frac{\partial ^2 f}{\partial x^2}|_{(x_0,y_0)} \delta x^2 + 2\frac{\partial ^2 f}{\partial x \partial y}|_{(x_0,y_0)}\delta x \delta y + \frac{\partial ^2 f}{\partial y^2}|_{(x_0,y_0)} \delta y ^2) + ... \\
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ = \sum_{n = 0}^\infty \frac{1}{n!}(\delta x \frac{\partial}{\partial x}|_{(x_0,y_0)} + \delta y \frac{\partial}{\partial y}|_{(x_0,y_0)})^n f \\
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ = \sum_{n = 0}^\infty \frac{1}{n!}U^n f
[/tex]

which I'm sure you know how to use & then, because of sloppy notation said that, for instance, in the second term above:

[tex] U^2 f = U(\delta x \frac{\partial f}{\partial x} + \delta y \frac{\partial f}{\partial y})\\
\ \ \ \ \ \ \ = (U \delta x) \frac{\partial f}{\partial x} + \delta x (U \frac{\partial f}{\partial x}) + ...[/tex]

which is obviously wrong here, this does not equal the Taylor series I wrote above when you expand it out (if you can make sense of it), yet that's exactly what you did, albeit with the 'dx' & 'dy' replaced by something equivalent to them (re-read my derivation).

Stephen Tashi said:

How is such a substitution justified by standard calculus? You're putting an approximation into an infinite series of terms.

What I'm doing is replacing one real number by another slightly smaller real number, where the numbers were so small to begin with that the difference between them doesn't matter. The infinite series [itex] x_1 = x_0 + Ux|_{x_0} \delta \alpha + ...[/itex] is assumed to converge, i.e. you are supposed to choose [itex] \alpha[/itex] so small that the above series converges, i.e. so small that the difference between [itex]x_1[/itex] & [itex]x_0[/itex] is negligible - where these are two real numbers - not variables (sloppy notation again). Again it is by doing this that we'll end up, hopefully, proving the theorem I mentioned above (or at least implicitly using it). The reason the books are alright in using [itex]x[/itex] & not [itex]x_0[/itex] in the notation is that the point [itex]x_0[/itex] is arbitrary & it's a standard abuse of notation for authors not to be so pedantic all the time, but it can lead you into trouble (as this thread clearly shows) so use this as a lesson to become familiar with what we're doing. For me I had trouble with these things when learning how to translate multivariable calculus to manifolds & found a lot of flawed assumptions I was making were based on this exact abbreviation in notation so don't feel bad :tongue2:

Stephen Tashi said:

It looks to me like you are reproducing what the old books do using an argument based on infinitesimals. You do supply more details than the old books.

I'm just expanding a Taylor series as is done in Emanuel & Cohen, the only difference is that I'm being more careful than they are in specifying my initial points [itex](x_0,y_0)[/itex] with subscripts.

Stephen Tashi said:

However, if we use standard calculus, we obtain the 3rd term of the Taylor series that I got in post #24 of https://www.physicsforums.com/showthread.php?t=697036&page=2. I think I can get the third term to work out without any infinitesimal reasoning. I'll try to accomplish that in my next post in that thread.

I'll be honest, your notation is very confusing, it's already managed to lead jostpuur into doing something equivalent to differentiating the "dx" & "dy" terms in [itex]\frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy[/itex] & some of what you're doing is not consistent with standard practice. When you refer to your "corrected definition" you are actually doing something completely incorrect, I've explained above why what you're doing is wrong. What you had originally was correct, so if you are now managing to get the third term without infinitesimal reasoning it's a complete fluke (though I don't see anything you wrote in your last post in that thread as being correct). I honestly would be more careful with my notation, with so many S's, T's & W's thrown around it's very easy to get confused, there are very standard conventions for doing everything you're doing, like replacing an x in a partial with an x' or X when using the chain rule. If you want to use tons of letters that's fine, but it'll probably lead to more problems unless you're really careful so keep an eye on it.

Stephen Tashi said:

I think the old books "have something going for them" and the way they reason would be useful to know. However, I want it translated to (or at least clearly related to) the modern viewpoint of calculus.

What we're doing is on the level of modern calculus. If you want to do this from an analysis standpoint you'll have to be comfortable with differential forms, be willing to start constructing tangent & cotangent spaces & phrase everything you're doing in terms of the definition of a one-parameter group I gave in my post in the other thread, something encoding about 4 layers of pedanticism not present in anything we're seeing, & we'd probably have to invoke jets etc... just to do what we're doing. It's far too much for us, far too roundabout a way of getting the exact same answer we're striving for based off pure intuition, which is why I'm not trying to learn the manifold perspective of this theory until I know what I'm doing because all intuition is lost. Even in the general relativity lectures by Susskind on youtube he talks about having to go back to basic tensor analysis when doing his own work, not being able to invoke the formalism naturally because it's tough to do... Turning this into a modern perspective will become a matter of notation once we're familiar with it.

jostpuur · Jul 14, 2013

bolbteppa said:

First of all you actually made the mistake when you wrote [itex] \frac{\partial u_x}{\partial x}[/itex], the [itex]u_x[/itex] should be written as [itex]u_x = u_x(x_0,y_0,\alpha)[/itex], in the context of what we're doing it is only a function of [itex]\alpha[/itex] being evaluated at [itex]x = x_0, y = y_0[/itex]

This is as wrong as 0=1.

If [itex]\mathcal{C}^{\infty}(\mathbb{R}^2)[/itex] is the set of smooth functions [itex]f:\mathbb{R}^2\to\mathbb{R}[/itex], the operator [itex]U[/itex] can be seen as a mapping [itex]U:\mathcal{C}^{\infty}(\mathbb{R}^2)\to\mathcal{C}^{\infty}(\mathbb{R}^2)[/itex]. That means that if [itex]f[/itex] is a smooth function on the plane, also [itex]Uf[/itex] is a smooth function on the plane. In other words

[tex]
f\in\mathcal{C}^{\infty}(\mathbb{R}^2)\quad\implies\quad
Uf\in\mathcal{C}^{\infty}(\mathbb{R}^2)
[/tex]

And the smooth function [itex]Uf[/itex] is defined by the formula

[tex]
(Uf)(x,y) = u_x(x,y)\frac{\partial f(x,y)}{\partial x} + u_y(x,y)\frac{\partial f(x,y)}{\partial y}
[/tex]

In other words there is no special point [itex](x_0,y_0)[/itex] which would be substituted into [itex]u_x,u_y[/itex] always independent of the input of [itex]Uf[/itex].

If you want to compute the partial derivative of [itex]Uf[/itex] with respect to x, the definition is this:

[tex]
\frac{\partial (Uf)(x,y)}{\partial x} = \lim_{\Delta x\to 0} \frac{(Uf)(x+\Delta x, y) - (Uf)(x,y)}{\Delta x}
[/tex]

and the [itex]u_x,u_y[/itex] will not be constants.

Finally, when you want to study [itex]U^2f = U(Uf)[/itex], you will need the partial derivatives of [itex]Uf[/itex].

jostpuur · Jul 14, 2013

After all, [itex]U(Uf)[/itex] is a smooth function defined by the formula

[tex]
(U(Uf))(x,y) = u_x(x,y)\frac{\partial (Uf)(x,y)}{\partial x} + u_y(x,y)\frac{\partial (Uf)(x,y)}{\partial y}
[/tex]
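A small sketch of what this formula gives in a concrete case (my own illustration, under stated assumptions): polynomials in x, y are stored as dictionaries, the generator is the rotation one with [itex]u_x = -y,\ u_y = x[/itex], and the test function is [itex]f(x,y) = x^2[/itex]. The helper names (`add`, `mul`, `dx`, `dy`, `U`, `frozen_square`) are hypothetical. It computes [itex]U(Uf)[/itex] as defined above and, for comparison, the expression [itex]u_x^2 f_{xx} + 2u_xu_yf_{xy} + u_y^2 f_{yy}[/itex] in which the coefficients are not differentiated.

```python
# Polynomials in x, y stored as {(i, j): coeff}, meaning coeff * x**i * y**j.

def add(p, q):
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def mul(p, q):
    out = {}
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            key = (i + k, j + l)
            out[key] = out.get(key, 0) + a * b
    return {k: c for k, c in out.items() if c != 0}

def dx(p):
    return {(i - 1, j): c * i for (i, j), c in p.items() if i > 0}

def dy(p):
    return {(i, j - 1): c * j for (i, j), c in p.items() if j > 0}

# Rotation generator (assumed for this example): u_x = -y, u_y = x.
UX = {(0, 1): -1}
UY = {(1, 0): 1}

def U(p):
    """(Uf)(x, y) = u_x f_x + u_y f_y, with u_x, u_y treated as functions of (x, y)."""
    return add(mul(UX, dx(p)), mul(UY, dy(p)))

def frozen_square(p):
    """u_x^2 f_xx + 2 u_x u_y f_xy + u_y^2 f_yy: coefficients NOT differentiated."""
    t1 = mul(mul(UX, UX), dx(dx(p)))
    t2 = mul({(0, 0): 2}, mul(mul(UX, UY), dx(dy(p))))
    t3 = mul(mul(UY, UY), dy(dy(p)))
    return add(add(t1, t2), t3)

f = {(2, 0): 1}              # f(x, y) = x**2
Uf = U(f)                    # -2*x*y
UUf = U(Uf)                  # 2*y**2 - 2*x**2
frozen = frozen_square(f)    # 2*y**2
print(Uf, UUf, frozen)
```

For this f, differentiating [itex](x\cos\alpha - y\sin\alpha)^2[/itex] twice with respect to [itex]\alpha[/itex] by hand and setting [itex]\alpha = 0[/itex] gives [itex]2y^2 - 2x^2[/itex], which agrees with `UUf` and not with `frozen`.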

...

Wait a minute, what was alpha doing in [itex]u_x(x,y,\alpha)[/itex]? The [itex]u_x[/itex] is defined as

[tex]
u_x(x,y) = \frac{\partial T_x(x,y,0)}{\partial \alpha}
[/tex]

where the parameters of [itex]T_x[/itex] are [itex]T_x(x,y,\alpha)[/itex] by convention.

bolbteppa · Jul 14, 2013

If you are going to invoke the language akin to that of the language of manifolds to analyze this then you should really see how what you've written makes no sense in our context because you are conflating terms from the cotangent space basis with vectors in the tangent space, picking them out of their respective spaces & just mushing them together in some magical operator [itex]U[/itex] - do you not see that? That alone should convince you what you're writing is flawed, but forget about the [itex]U[/itex] operator, go back to where I wrote out Taylor's theorem to second order (before I ever even mentioned [itex]U[/itex]) & replace the [itex]\delta x[/itex] & [itex]\delta y[/itex] terms with [itex] \frac{\partial \phi}{\partial \alpha}|_{\alpha = \alpha_0}\delta \alpha[/itex] & [itex] \frac{\partial \psi}{\partial \alpha}|_{\alpha = \alpha_0}\delta \alpha[/itex], then compare your two answers to have another reason why what you've written is wrong. You will now unequivocally see you are misinterpreting the meaning of [itex]U[/itex], because we are using what [itex]U[/itex] is actually derived from. Another way you should see this is wrong is when you try to explain to me why what you did makes perfect sense yet what I did (when I wrote [itex](U\delta x)f + \delta x (Uf) + ...[/itex]), i.e. follow your own logic, is illogical (or is it?).

Yes there is no special point, as I made clear in my post, but in the derivation of Taylor's theorem you need to specify the point you're working from in order to derive the series expansion, again as I made clear in my post. Only then can you invoke the arbitrariness of the point we're deriving things from, which implies the operator [itex]U[/itex] does not act on the coefficients (this should be obvious because you are getting the wrong answer based on the little test I've given you above). Furthermore you've ignored my illustration of the flaw in what you've done - why is it in your case it makes perfect sense yet when I use your own logic on a basic Taylor series it becomes nonsensical? You have not answered this.

jostpuur said:

Wait a minute, what was alpha doing in [itex]u_x(x,y,\alpha)[/itex]?

Sorry that was a mistake, I only deleted one of the [itex](x_0,y_0,\alpha)|_{\alpha = \alpha_0}[/itex] it seems, corrected now.

Stephen Tashi · Jul 14, 2013

I don't understand most of the issues in the last few posts, so I proclaim my neutrality.

bolbteppa said:

What I'm doing is replacing one real number by another slightly smaller real number, where the numbers were so small to begin with that the difference between them doesn't matter. The infinite series [itex] x_1 = x_0 + Ux|_{x_0} \delta \alpha + ...[/itex] is assumed to converge, i.e. you are supposed to choose [itex] \alpha[/itex] so small that the above series converges, i.e. so small that the difference between [itex]x_1[/itex] & [itex]x_0[/itex] is negligible - where these are two real numbers - not variables (sloppy notation again).

It seems to me that if I try to put this in a modern context, that I should avoid the idea of "a real number so close to another real number that it doesn't matter". Is there any way to look at it as an iterated limit? - along the lines of:

Limit as something (perhaps [itex] \delta x, \delta y [/itex]?) approaches 0 of ( Limit as n -> infinity of a series of terms that are a function of alpha and the something).
Swap the limits.
= limit as n->infinity of ( limit of each term as the something ->0)
The limit of each term is an appropriate term for a power series in alpha
= Limit as n->infinity of a Taylor series in alpha.

Emanuel remarks that he isn't concerned with the radius of convergence of the Taylor series. I don't know why. My intuition is confused by the fact that the Taylor series expansion in alpha makes no (overt) assumption that alpha is small. Yet deriving it seems to depend delicately on making (x1,y1) close to (x0,y0). I don't understand why (x0,y0) is needed at all. From some of what you just wrote, I have the thought that the Taylor series is first approached as a series in [itex] \delta x , \delta y [/itex] and that the [itex] \alpha [/itex] gets in there by a substitution.

I think my derivation begun in the other thread will work. I don't think it will be a fluke if I show it utilizes the result of post #49 of this thread.

bolbteppa · Jul 14, 2013

Stephen Tashi said:

It seems to me that if I try to put this in a modern context, that I should avoid the idea of "a real number so close to another real number that it doesn't matter". Is there any way to look at it as an iterated limit? - along the lines of:

Limit as something (perhaps [itex] \delta x, \delta y [/itex]?) approaches 0 of ( Limit as n -> infinity of a series of terms that are a function of alpha and the something).
Swap the limits.
= limit as n->infinity of ( limit of each term as the something ->0)
The limit of each term is an appropriate term for a power series in alpha
= Limit as n->infinity of a Taylor series in alpha.

Sorry, I should have explained the whole point of this when I said the Lie series for [itex]x_1[/itex] converges in my last post. If you assume [itex] x_1 = x_0 + Ux|_{x_0} \delta \alpha + ...[/itex] converges then you have shown that [itex]x_1[/itex] can be expressed in terms of the infinitesimal generator [itex]\varepsilon(x_0,y_0) = \frac{\partial \phi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0}[/itex] because the [itex]U^nx|_{x_0}[/itex]'s when worked out only involve [itex]\varepsilon (x_0,y_0)[/itex] (as in, just above equations 2.18 in Emanuel). What does this all mean? It means we can generate our one-parameter group of transformations [itex]T(x_0,y_0,\alpha) = (x_1,y_1)[/itex] using infinitesimal transformations of the form [itex]\varepsilon(x_0,y_0) = \frac{\partial \phi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0}[/itex] because we've shown that [itex]x_1[/itex] can be expressed in terms of the [itex]\varepsilon(x_0,y_0) = \frac{\partial \phi}{\partial \alpha}(x_0,y_0,\alpha)|_{\alpha = \alpha_0}[/itex] to the first power. Thinking about it this does put transformations [itex]T[/itex] in one-one correspondence with transformations of the group generated by the infinitesimal generator... The only issue I see as regards limits would be as to what your radius of convergence would be so as to also allow for approximations (like I'll mention below), though I might be wrong there, so that would be an interesting discussion but it's in the realm of Taylor series error approximation & nothing crazier, as far as I can see (correct me if I'm wrong, which is likely).

Therefore Emanuel's example on page 14 just below equations 2.18 of him starting with the infinitesimal generator [itex]- y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}[/itex] & ending up with the global one-parameter group of transformations [itex](x_1,y_1) = (x_0\cos(\alpha) - y_0\sin(\alpha),x_0\sin(\alpha) + y_0 \cos(\alpha))[/itex] makes perfect sense. If you try the other method you've mentioned you'll end up with a completely wrong answer in this simple example (try it).
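Here is a short sketch of that Lie series computation done numerically (my own illustration, not code from Emanuel). It assumes the generator [itex]U = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}[/itex] and uses the fact that [itex]U[/itex] maps a linear form [itex]ax + by[/itex] to [itex]bx - ay[/itex], so a pair `(a, b)` is enough to track every iterate. The names `U_linear` and `lie_series` are hypothetical.

```python
import math

def U_linear(form):
    """Apply U = -y d/dx + x d/dy to a*x + b*y, stored as the pair (a, b)."""
    a, b = form
    # U(a*x + b*y) = -y*a + x*b = b*x - a*y
    return (b, -a)

def lie_series(form, alpha, terms=30):
    """Partial sum of sum_n alpha**n / n! * U**n applied to the linear form."""
    total_a, total_b = 0.0, 0.0
    current = form
    for n in range(terms):
        weight = alpha ** n / math.factorial(n)
        total_a += weight * current[0]
        total_b += weight * current[1]
        current = U_linear(current)
    return total_a, total_b

alpha = 0.8
x1 = lie_series((1, 0), alpha)  # coefficients of x and y in x_1
y1 = lie_series((0, 1), alpha)  # coefficients of x and y in y_1
print(x1, (math.cos(alpha), -math.sin(alpha)))
print(y1, (math.sin(alpha), math.cos(alpha)))
```

The partial sums reproduce the coefficients of [itex]x\cos\alpha - y\sin\alpha[/itex] and [itex]x\sin\alpha + y\cos\alpha[/itex] to rounding error.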

More can be said on this, we can use this idea to show how every transformation group is isomorphic to a translation group thus establishing the whole integration constant as our symmetry group when integrating differential equations, but there's no point in doing that until this makes sense.

Stephen Tashi said:

Emanuel remarks that he isn't concerned with the radius of convergence of the Taylor series. I don't know why. My intuition is confused by the fact that the Taylor series expansion in alpha makes no (overt) assumption that alpha is small. Yet deriving it seems to depend delicately on making (x1,y1) close to (x0,y0). I don't understand why (x0,y0) is needed at all. From some of what you just wrote, I have the thought that the Taylor series is first approached as a series in [itex] \delta x , \delta y [/itex] and that the [itex] \alpha [/itex] gets in there by a substitution.

Yeah your intuition is right, the issue about this being small is to ensure convergence of the global group equations we derive from the infinitesimal generator. I do not know the details of this, but if you look at the example of generating the infinitesimal generator of the rotation group given in Cohen on page 8 you see there comes a point where they derive the infinitesimal generator by assuming [itex]\alpha[/itex] is close to zero so that [itex]\cos(\alpha) = 1[/itex], similarly for [itex]\sin(\alpha) = \delta \alpha[/itex]. This is the kind of thing they mean when [itex]\alpha[/itex] is small, & the justification is that in doing this you end up showing you can actually re-derive the original one-parameter group of transformations via a Lie series (as is done in Emanuel).

Stephen Tashi said:

I think my derivation begun in the other thread will work. I don't think it will be a fluke if I show it utilizes the result of post #49 of this thread.

You are basing your derivation off differentiating the "dx" & "dy" in [itex] \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy[/itex], you may as well differentiate the plus sign :tongue2: Please go ahead with it & hopefully you'll see what I mean when it doesn't work out, try applying your method to the example Emanuel gave, of deriving the rotation group from the infinitesimal transformations (below equations 2.18 on page 14) as a test before you go any further, this will illustrate what I mean when I'm saying you will differentiate the "dx" & "dy" terms, you'll not only not make sense you'll get the wrong answer when you try to generate the global rotation group starting from its infinitesimal transformations (Emanuel's example).

Stephen Tashi · Jul 14, 2013

Not that the original poster of a thread can ever direct it! - but here is my simplistic view of the situation.

1. The straightforward Taylor expansion for the function in question is:
[tex] f(T_x(x,y,\alpha),T_y(x,y,\alpha)) =[/tex]
[tex]\ f(x,y) + (D_\alpha f(T_x(x,y,\alpha),T_y(x,y,\alpha))|_{\alpha = 0})\ \alpha + \frac{1}{2!}\ (D^2_\alpha f(T_x(x,y,\alpha),T_y(x,y,\alpha))|_{\alpha = 0})\ \alpha^2 + ... [/tex]
The coefficients of the powers of [itex] \alpha [/itex] do not depend on [itex] \alpha [/itex].

2. Without saying that I know what the operator [itex] U [/itex] or its powers are, the claim to be proven is:
[tex] f(T_x(x,y,\alpha),T_y(x,y,\alpha)) = [/tex]
[tex]\ f(x,y) +(U f(T_x(x,y,\alpha),T_y(x,y,\alpha)))\ \alpha + \frac{1}{2!}\ (U^2 f(T_x(x,y,\alpha),T_y(x,y,\alpha)))\ \alpha^2 + ... [/tex]

Edit: Taking bolbteppa's suggestion in a subsequent post that I think of the concrete example of the rotation group, I think the claim is actually:
[tex] f(T_x(x,y,\alpha),T_y(x,y,\alpha)) = [/tex]
[tex]\ f(x,y) +(U f(x,y))\ \alpha + \frac{1}{2!}\ (U^2 f(x,y))\ \alpha^2 + ... [/tex]

So the evaluations of the coefficients take place at [itex] (x,y) [/itex].

3. There are two possibilities to consider about 2. Either the coefficients of the powers of [itex] \alpha [/itex] are themselves functions of [itex] \alpha [/itex] or they are not.

If they are not, then the coefficients of the respective powers of [itex] \alpha [/itex] in 2. should equal the respective coefficients in the Taylor expansion in 1. I see no fancy-dan footwork with manifolds, infinitesimals, or otherwise that can avoid this conclusion. Corresponding coefficients may not look the same when you compute them initially, but you should be able to prove they are the same.

If the coefficients in 2. are functions of [itex] \alpha [/itex] then 2. is a much more complicated expression than a simple power series in [itex] \alpha [/itex]. The coefficients of the corresponding powers of [itex] \alpha [/itex] don't have to match because those in 2. would not be constant with respect to [itex] \alpha [/itex] while those in 1. are.

4. As best I can tell, the "infinitesimal elements" [itex] u_x, u_y [/itex] are not functions of [itex] \alpha [/itex] since they are defined as evaluating derivatives at the value [itex] \alpha = 0 [/itex].

5. The old books use a notation for [itex] U [/itex] that amounts to [itex] U = u_x f_x + u_y f_y [/itex]. What they mean by [itex] f_x, f_y [/itex] is unclear to me. If [itex] f_x [/itex] means "the partial derivative of [itex] f [/itex] with respect to its first argument" then [itex] U [/itex] applied to [itex] f(T_x(x,y,\alpha),T_y(x,y,\alpha) ) [/itex] is going to produce a result that is a function of [itex] \alpha [/itex] because the partial derivatives of [itex] f [/itex] will be evaluated at the arguments [itex] (T_x(x,y,\alpha),T_y(x,y,\alpha)) [/itex].

Edit: However, if my revised opinion of the claim 2. is correct the operator [itex] U [/itex] is applied to [itex] f(x,y) [/itex] not [itex] f(T_x(x,y,\alpha), T_y(x,y,\alpha) ) [/itex].

(This is a separate issue than the question of whether [itex] U^2 [/itex] implies differentiating the factors [itex] u_x, u_y [/itex] upon the second application of [itex] U [/itex].)
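One way to test the coefficient-matching argument in point 3 numerically is sketched below (an illustration under my own assumptions, not a proof). It takes the rotation group, an arbitrary polynomial test function [itex]f(x,y) = x^2y[/itex], and hand-computed expressions for [itex]Uf[/itex] and [itex]U(Uf)[/itex] in which [itex]u_x = -y,\ u_y = x[/itex] are treated as functions of [itex](x,y)[/itex]. These are compared with finite-difference estimates of the first and second [itex]\alpha[/itex]-derivatives of [itex]f(T_x,T_y)[/itex] at [itex]\alpha = 0[/itex]. The step size and sample point are arbitrary.

```python
import math

def T(x, y, alpha):
    """Rotation by alpha (the one-parameter group assumed for this check)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return x * c - y * s, x * s + y * c

def f(x, y):
    return x * x * y

def Uf(x, y):
    # Uf = -y * f_x + x * f_y = -2*x*y**2 + x**3
    return -y * (2 * x * y) + x * (x * x)

def UUf(x, y):
    # d(Uf)/dx = 3*x**2 - 2*y**2 and d(Uf)/dy = -4*x*y, then apply U again
    return -y * (3 * x * x - 2 * y * y) + x * (-4 * x * y)

def g(x, y, alpha):
    return f(*T(x, y, alpha))

x, y, h = 0.6, -0.4, 1e-4
d1 = (g(x, y, h) - g(x, y, -h)) / (2 * h)                    # estimates D_alpha at alpha = 0
d2 = (g(x, y, h) - 2 * g(x, y, 0.0) + g(x, y, -h)) / (h * h)  # estimates D_alpha^2 at alpha = 0
print(d1, Uf(x, y))
print(d2, UUf(x, y))
```

If the two columns agree to within finite-difference error, the first two coefficients of the expansions in 1. and in the revised 2. match for this particular f and point.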

Stephen Tashi · Jul 15, 2013

bolbteppa said:

You are basing your derivation off differentiating the "dx" & "dy" in [itex] \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy[/itex], you may as well differentiate the plus sign :tongue2: Please go ahead with it & hopefully you'll see what I mean when it doesn't work out, try applying your method to the example Emanuel gave, of deriving the rotation group from the infinitesimal transformations (below equations 2.18 on page 14) as a test before you go any further, this will illustrate what I mean when I'm saying you will differentiate the "dx" & "dy" terms, you'll not only not make sense you'll get the wrong answer when you try to generate the global rotation group starting from its infinitesimal transformations (Emanuel's example).

I'm not differentiating any infinitesimals. I'm differentiating functions. For the rotation group

[itex] T_x(x,y,\alpha) = x \cos(\alpha) - y \sin(\alpha) [/itex]
[itex] u_x(x,y) = D_\alpha \ T_x(x,y,\alpha)_{|\ \alpha = 0} = - y [/itex]
[itex] T_y(x,y,\alpha) = x \sin(\alpha) + y \cos(\alpha) [/itex]
[itex] u_y(x,y) = D_\alpha \ T_y(x,y,\alpha)_{|\ \alpha = 0} = x [/itex]

So the results are consistent with those at the bottom of page 14 in Emanuel's book:

[itex] U x = u_x \frac{\partial x}{\partial x} + u_y \frac{\partial x}{\partial y} [/itex]
[itex] = -y (1) + x (0) = -y [/itex]

[itex] U^2 x = U( Ux) = U (-y) = u_x \frac{\partial (-y)}{\partial x} + u_y \frac{\partial (-y)}{\partial y} [/itex]
[itex] = (-y)(0) + x (-1) = -x [/itex]

For matrix groups [itex] u_x [/itex] and [itex] u_y [/itex] are linear functions of [itex] x [/itex] and [itex] y [/itex]. I think if we want an example to show a problem with interpreting [itex] U^2 [/itex] as involving terms like [itex] \frac{\partial ^2 u_x}{\partial x \partial y} [/itex] we need an example where such terms are nonzero.
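A candidate for such an example (my own suggestion, not taken from either book, so treat it as an assumption): the one-variable group [itex]T_x(x,\alpha) = \frac{x}{1-\alpha x}[/itex], for which [itex]u_x(x) = x^2[/itex] is not linear. If [itex]U = x^2\frac{\partial}{\partial x}[/itex] acts on its own coefficient, then [itex]U(c\,x^k) = c\,k\,x^{k+1}[/itex], and the sketch below sums the resulting series. The function name `lie_series_x` and the sample values are hypothetical.

```python
import math

def lie_series_x(x, alpha, terms=60):
    """Partial sum of sum_n alpha**n / n! * (U**n x) for U = x**2 d/dx."""
    coeff, power = 1, 1  # current iterate U**n x = coeff * x**power
    total = 0.0
    for n in range(terms):
        total += alpha ** n / math.factorial(n) * coeff * x ** power
        # U(c * x**k) = x**2 * c * k * x**(k-1) = c*k * x**(k+1)
        coeff, power = coeff * power, power + 1
    return total

x, alpha = 0.5, 0.6                 # chosen so that |alpha * x| < 1
series = lie_series_x(x, alpha)
closed_form = x / (1 - alpha * x)   # the finite transformation assumed above
frozen = x + alpha * x ** 2         # what is left if u_x is never differentiated
print(series, closed_form, frozen)
```

With the coefficient held fixed, every term beyond the first order vanishes because [itex]\frac{\partial^2 x}{\partial x^2} = 0[/itex], so the `frozen` value stops at [itex]x + \alpha x^2[/itex] and does not reach the closed form, while the series with [itex]U^n x = n!\,x^{n+1}[/itex] does.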

Edit: Thinking about what you meant, I believe one of my problems (here and in the other thread) is thinking that [itex] U [/itex] is applied to [itex] f(T_x(x,y,\alpha),T_y(x,y,\alpha)) [/itex]. Actually the power series applies the operator [itex] U [/itex] to [itex] f(x,y) [/itex].

bolbteppa · Jul 15, 2013

Stephen Tashi said:

4. As best I can tell, the "infinitesimal elements" [itex] u_x, u_y [/itex] are not functions of [itex] \alpha [/itex] since they are defined as evaluating derivatives at the value [itex] \alpha = 0 [/itex].

Yeah, they are functions of x & y, though again the operator [itex]U[/itex] does not act on them because it is derived under the assumption that x & y are fixed at a point, but then because this fixed point was arbitrary we can consider them as variables. This is made clear in the Taylor series I posted in my first post yesterday.

Stephen Tashi said:

5. The old books use a notation for [itex] U [/itex] that amounts to [itex] U = u_x f_x + u_y f_y [/itex]. What they mean by [itex] f_x, f_y [/itex] is unclear to me. If [itex] f_x [/itex] means "the partial derivative of [itex] f [/itex] with respect to its first argument" then [itex] U [/itex] applied to [itex] f(T_x(x,y,\alpha),T_y(x,y,\alpha) ) [/itex] is going to produce a result that is a function of [itex] \alpha [/itex] because the partial derivatives of [itex] f [/itex] will be evaluated at the arguments [itex] (T_x(x,y,\alpha),T_y(x,y,\alpha)) [/itex].
...

Edit: However, if my revised opinion of the claim 2. is correct the operator [itex] U [/itex] is applied to [itex] f(x,y) [/itex] not [itex] f(T_x(x,y,\alpha), T_y(x,y,\alpha) ) [/itex].

The problem here amounts to lack of familiarity with conventions of notation. In this case you should be writing [itex] f(T_x(x_0,y_0,\alpha), T_y(x_0,y_0,\alpha) ) [/itex] so that when [itex]\alpha[/itex] is not zero (or [itex]\alpha_0[/itex]) we can write [itex] f(T_x(x_0,y_0,\alpha), T_y(x_0,y_0,\alpha) ) = f(x,y)[/itex] thus applying [itex]U[/itex] to f makes perfect sense, when applied to f we find that we should write [itex]U|_{(x_0,y_0)}f[/itex], & note that because of the [itex]|_{(x_0,y_0)}[/itex] we are not dealing with something that is a function of [itex]\alpha[/itex] in the end.

Stephen Tashi said:

Edit: Thinking about what you meant, I believe one of my problems (here and in the other thread) is thinking that [itex] U [/itex] is applied to [itex] f(T_x(x,y,\alpha),T_y(x,y,\alpha)) [/itex]. Actually the power series applies the operator [itex] U [/itex] to [itex] f(x,y) [/itex].

I've been trying to find a link to a proof of Taylor's theorem for multivariable functions (or even the second derivative test) analogous to the way it's done in Thomas calculus but I can't find one. What we are doing is treating [itex] f(T_x(x,y,\alpha),T_y(x,y,\alpha)) [/itex] as a function of one variable, [itex] \alpha[/itex] or [itex] \delta \alpha[/itex] whichever you prefer, then expanding this single variable function [itex]g(\alpha)[/itex] out in a Taylor series. However by using the chain rule we inadvertently end up with a Taylor expression for [itex]f[/itex] in two variables. If you know Taylor's theorem in two variables this should be no problem (if you don't I'll post an explicit proof & an example or two no problem so that we're on the same page). The proof offers a perfect example of a moment analogous to the above where you might be confused into differentiating the [itex]u_x[/itex] terms (they will be written as [itex]\frac{dx}{dt}[/itex] in the proof) but you'll see it's not done there & it should be clear from your derivation why you shouldn't do it. Note that this is the entire idea, no [itex]U[/itex] is required, we really should forget about it as it's just notation. We may introduce the [itex]U[/itex] notation if we wish, afterwards, but the idea is that it lessens confusion whereas right now it's only creating more confusion thus forget about it. This can all be solved by going back to the derivation of Taylor's theorem, something you should try to do. There's an interesting comment in Cohen about the [itex]U[/itex] that is highly relevant:

Since [itex]Uf[/itex] can be written when the infinitesimal transformation is known, & conversely, [itex]\delta x = \varepsilon(x,y)\delta \alpha[/itex], ... is known when [itex]Uf[/itex] is given, [itex]Uf[/itex] is said to represent [itex]\delta x = \varepsilon(x,y)\delta \alpha[/itex], ... . For convenience of language we'll speak of "the infinitesimal transformation [itex]Uf[/itex]" instead of "the transformation represented by [itex]Uf[/itex]." But it must be borne in mind that [itex]Uf[/itex] is not a transformation, it is only representative of one.

A really, really short way to sum all this up is that we are using the chain rule repeatedly, in a way that turns things into Taylor's theorem. The [itex]U[/itex] operator acts on [itex]f[/itex] with respect to the x & y variables in the function (the first half of the chain rule), and the rest of the chain rule is already taken care of by the coefficients inside the [itex]U[/itex] operator, thus it's just a notational shortcut for the chain rule. Please try re-reading my first post from yesterday & point out any step in it you find iffy or unclear.

I think I know where we're going with all this stuff now, but I'll hold off until I've cemented everything.

jostpuur · Jul 15, 2013

I took a closer look at the Taylor series question.

jostpuur said:

[tex]
\phi(x,y,\alpha) = (e^{\alpha U}f)(x,y)
[/tex]

[tex]
\psi(x,y,\alpha) = f\big(T_x(x,y,\alpha),T_y(x,y,\alpha)\big)
[/tex]

In posts #44 #46 #49 #50 we succeeded in proving that

[tex]
\frac{\partial}{\partial \alpha}\psi(x,y,\alpha) = U\psi(x,y,\alpha)\quad\quad\quad (*)
[/tex]

holds. By the assumption that the PDE has a unique solution (which hasn't been verified, but probably holds anyway), this implied [itex]\phi=\psi[/itex].
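As a sanity check of (*), here is a minimal numerical sketch for the rotation group used earlier in the thread, where [itex]u_x = -y[/itex] and [itex]u_y = x[/itex]. The test function `f`, the sample point and the step size `h` are arbitrary choices made for illustration, not something taken from the thread, and the derivatives are only approximated by central differences.

```python
import math

def T(x, y, a):
    """Rotation by angle a (the example group from earlier in the thread)."""
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def f(s, w):
    """Arbitrary smooth test function (an assumption for this sketch)."""
    return math.sin(s) * math.exp(w) + s * w ** 2

def psi(x, y, a):
    """psi(x, y, alpha) = f(T_x(x, y, alpha), T_y(x, y, alpha))."""
    return f(*T(x, y, a))

def check_star(x, y, a, h=1e-5):
    """Return (d psi / d alpha, U psi), both via central differences."""
    d_alpha = (psi(x, y, a + h) - psi(x, y, a - h)) / (2 * h)
    d_x = (psi(x + h, y, a) - psi(x - h, y, a)) / (2 * h)
    d_y = (psi(x, y + h, a) - psi(x, y - h, a)) / (2 * h)
    u_x, u_y = -y, x  # weights of U for the rotation group
    return d_alpha, u_x * d_x + u_y * d_y

lhs, rhs = check_star(0.7, -0.4, 0.9)
print(lhs, rhs)  # the two numbers should agree up to discretisation error
```

Note that [itex]U[/itex] here differentiates [itex]\psi[/itex] with respect to its own first two parameters, with weights evaluated at the untransformed point [itex](x,y)[/itex].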

If we assume that the Taylor series converge we get

[tex]
e^{\beta D_{\alpha}}\big(f(T_x(x,y,\alpha),T_y(x,y,\alpha))\big)\Big|_{\alpha=0}
= f(T_x(x,y,\beta),T_y(x,y,\beta)) = (e^{\beta U}f)(x,y)
[/tex]

based on what was proven earlier.

Could it be that you want to prove this in some other way, not using the PDE interpretation? It doesn't look impossible.

The operators [itex]\frac{\partial}{\partial\alpha}[/itex] and [itex]U[/itex] commute, because the weights in [itex]U[/itex] do not depend on alpha, and everything is smooth so that the partial derivatives commute too. So the induction step

[tex]
\frac{\partial}{\partial\alpha}U^n\psi(x,y,\alpha) = U^{n+1}\psi(x,y,\alpha)
[/tex]

is clear. This implies that [itex]e^{\beta\frac{\partial}{\partial\alpha}}[/itex] and [itex]e^{\beta U}[/itex] do the same thing when you operate on [itex]\psi[/itex] with them. Assuming that the Taylor series converge we get

[tex]
\psi(x,y,\beta) = e^{\beta\frac{\partial}{\partial\alpha}}\psi(x,y,0) = e^{\beta U}\psi(x,y,0) = e^{\beta U}f(x,y)
[/tex]

This way the PDE interpretation was not used, but still the technical result (*) which was proven in #44 #46 #49 #50 was needed.
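The identity [itex]\psi(x,y,\beta) = e^{\beta U}f(x,y)[/itex] can be tested numerically for the rotation group when [itex]f[/itex] is a polynomial, because then [itex]U^n f[/itex] can be computed exactly term by term. The following is only a sketch: the dictionary representation of polynomials, the sample polynomial and the truncation at 40 terms are my own choices for illustration.

```python
import math

def apply_U(poly):
    """Apply U = -y d/dx + x d/dy to a polynomial.

    A polynomial is a dict {(i, j): c} standing for the sum of c * x**i * y**j.
    """
    out = {}
    for (i, j), c in poly.items():
        if i > 0:  # -y * d/dx (x^i y^j) = -i * x^(i-1) * y^(j+1)
            key = (i - 1, j + 1)
            out[key] = out.get(key, 0.0) - i * c
        if j > 0:  # x * d/dy (x^i y^j) = j * x^(i+1) * y^(j-1)
            key = (i + 1, j - 1)
            out[key] = out.get(key, 0.0) + j * c
    return out

def evaluate(poly, x, y):
    return sum(c * x ** i * y ** j for (i, j), c in poly.items())

def lie_series(poly, x, y, beta, terms=40):
    """Truncated series: sum over n of beta^n / n! * (U^n f)(x, y)."""
    total = 0.0
    term = dict(poly)  # holds U^n f, starting with n = 0
    for n in range(terms):
        total += beta ** n / math.factorial(n) * evaluate(term, x, y)
        term = apply_U(term)
    return total

def rotated(poly, x, y, beta):
    """f evaluated at the rotated point (T_x(x, y, beta), T_y(x, y, beta))."""
    c, s = math.cos(beta), math.sin(beta)
    return evaluate(poly, x * c - y * s, x * s + y * c)

f = {(2, 0): 1.0, (1, 1): 3.0, (0, 1): -2.0}  # x^2 + 3xy - 2y
x0, y0, beta = 0.7, -0.4, 0.9
print(lie_series(f, x0, y0, beta), rotated(f, x0, y0, beta))
```

For [itex]f(x,y)=x[/itex] the series can be summed by hand: [itex]Ux=-y[/itex], [itex]U^2x=-x[/itex], and so on, which gives [itex]x\cos\beta - y\sin\beta[/itex], as expected.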

Stephen Tashi said:

jostpuur said:

Do these follow from some axioms about the transformation?

I think they do.

I don't follow the entire argument yet, but I at least understand the above question.

IMO you should take a closer look. The keys to the Taylor series results are there.

Stephen Tashi said:

I can't interpret [itex](f\circ T(\alpha))(x,y)[/itex]. The two arguments [itex] (x,y) [/itex] sitting outside the parentheses confuse me.

It should not be seen as confusing. For example, if [itex]f,g:\mathbb{R}\to\mathbb{R}[/itex] are mappings, then also [itex]f+g[/itex] is a mapping. This means that we can denote [itex](f+g):\mathbb{R}\to\mathbb{R}[/itex] and the mapping is defined by

[tex]
(f+g)(x) = f(x) + g(x)
[/tex]

Also [itex]f\circ g[/itex] is a mapping, which is defined by

[tex]
(f\circ g)(x) = f(g(x))
[/tex]

This is how parentheses are usually used.

jostpuur · Jul 15, 2013

I'm going to add some details. In the previous post I said that [itex]\frac{\partial}{\partial\alpha}[/itex] and [itex]U[/itex] commute. This can get tricky with some notations though. But just to be clear, here's the proof. Assume that [itex]\xi(x,y,\alpha)[/itex] is some three parameter function. Then

[tex]
\frac{\partial}{\partial\alpha} U \xi(x,y,\alpha)
= \frac{\partial}{\partial\alpha}\Big(u_x(x,y)\frac{\partial \xi(x,y,\alpha)}{\partial x}
+ u_y(x,y)\frac{\partial \xi(x,y,\alpha)}{\partial y}\Big)
[/tex]
[tex]
= u_x(x,y)\frac{\partial^2\xi(x,y,\alpha)}{\partial\alpha\partial x}
+u_y(x,y)\frac{\partial^2\xi(x,y,\alpha)}{\partial\alpha\partial y}
[/tex]
[tex]
= u_x(x,y)\frac{\partial}{\partial x}\Big(\frac{\partial \xi(x,y,\alpha)}{\partial\alpha}\Big)
+ u_y(x,y)\frac{\partial}{\partial y}\Big(\frac{\partial \xi(x,y,\alpha)}{\partial\alpha}\Big)
=U\frac{\partial}{\partial\alpha}\xi(x,y,\alpha)
[/tex]

So they commute. At least here. Then we can substitute [itex]\xi=U^{n-1}\psi[/itex], [itex]\xi=U^{n-2}\psi[/itex] and so on, when commuting [itex]\frac{\partial}{\partial\alpha}[/itex] and [itex]U^n[/itex].
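Combined with (*), the commutation gives [itex]\frac{\partial^2\psi}{\partial\alpha^2} = U(U\psi)[/itex]. Below is a rough numerical check of this second order statement for the rotation group. The test function, the sample point and the step size are assumptions made for the sketch, and the nested finite differences limit the accuracy to a few digits.

```python
import math

def T(x, y, a):
    """Rotation by angle a."""
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def f(s, w):
    """Arbitrary smooth test function (an assumption for this sketch)."""
    return math.sin(s) * math.exp(w) + s * w ** 2

def psi(x, y, a):
    return f(*T(x, y, a))

def U(g, h=1e-3):
    """Return the function U g, where U = -y d/dx + x d/dy.

    g is any function of (x, y, alpha); derivatives use central differences.
    """
    def Ug(x, y, a):
        d_x = (g(x + h, y, a) - g(x - h, y, a)) / (2 * h)
        d_y = (g(x, y + h, a) - g(x, y - h, a)) / (2 * h)
        return -y * d_x + x * d_y
    return Ug

def d2_alpha(g, x, y, a, h=1e-3):
    """Second derivative of g with respect to alpha (central difference)."""
    return (g(x, y, a + h) - 2 * g(x, y, a) + g(x, y, a - h)) / h ** 2

x0, y0, a0 = 0.7, -0.4, 0.9
second_derivative = d2_alpha(psi, x0, y0, a0)
U_squared = U(U(psi))(x0, y0, a0)
print(second_derivative, U_squared)  # should agree to several digits
```

Because `U` takes a function of all three parameters and returns another one, there is no ambiguity about what is being differentiated, which is the point of working with [itex]\psi[/itex].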

The stuff gets tricky if we are operating on the expression

[tex]
f(T_x(x,y,\alpha),T_y(x,y,\alpha))
[/tex]

If you take the derivative with respect to alpha (like operating with [itex]D_{\alpha}[/itex] or [itex]\frac{d}{d\alpha}[/itex]), the partial derivatives [itex]\frac{\partial f}{\partial x}[/itex] and [itex]\frac{\partial f}{\partial y}[/itex] will appear with some additional factors, again depending on all three parameters. So do the operators commute now too?

I can admit I feel a little confused by some of these questions. This is why I defined the functions [itex]\phi,\psi[/itex]. They enabled me to avoid the confusion and ambiguities.

The fact is that partial derivatives are not well defined if we are not clear about what the functions are and what their parameters are. So operating with [itex]\frac{\partial}{\partial x}[/itex] isn't necessarily allowed if we don't know the function on the right. Consequently, operating with [itex]U[/itex] isn't always allowed either.

Stephen Tashi · Jul 20, 2013

One approach to proving the expansion

[eq 9.1]
[tex] f(T_x(x,y,\alpha),T_y(x,y,\alpha)) = f(x,y) + U f(x,y)\ \alpha + \frac{1}{2!} U^2 f(x,y)\ \alpha^2 + ... [/tex].

using "just calculus" is to define [itex] U [/itex] so that [itex] U^n f(T_x(x,y,\alpha),T_y(x,y,\alpha)) [/itex] is (exactly) equal to [itex] D^n_\alpha f(T_x(x,y,\alpha), T_y(x,y,\alpha)) [/itex].

That will ensure
[eq. 9.2]
[tex] (U^n f(T_x(x,y,\alpha),T_y(x,y,\alpha)))_{|\ \alpha = 0 } = (D^n_\alpha f(T_x(x,y,\alpha),T_y(x,y,\alpha)))_{|\ \alpha = 0} [/tex].

The only question left open will be whether
[eq. 9.3]
[tex] (U^n f(T_x(x,y,\alpha),T_y(x,y,\alpha)))_{|\ \alpha = 0 } = U^n f(x,y) [/tex]

Settling that question may or may not require more advanced mathematics.

I'll summarize this approach (and ignore criticism that I use too many letters, in view of the fact that the alternatives feel free to employ [itex]\phi,\psi,\eta,\xi, U, u_x, u_y,f,f_1, x, y, x_0,y_0,x_1,y_1,\epsilon, \delta x, \delta y, \delta \alpha [/itex]).

The elements of the 1-parameter group are the mappings [tex]T(C): (A,B) \rightarrow ( T_x(A,B,C), T_y(A,B,C))[/tex].

We assume the parameterization with [itex]C [/itex] is done so that

[Eq. 9.4]
[tex]T(0) = (A,B)[/tex] (i.e. [itex]T(0) [/itex] is the identity map.)

[Eq. 9.5]
[tex]T( T(\beta), \alpha) = T(\beta + \alpha)[/tex].

Assume all derivatives mentioned in this post exist.

The following two results were proved in post #49:

Theorem 9-1:

[Eq. 9.6]
[tex]\frac{\partial T_x}{\partial C} =\frac{\partial T_x}{\partial C}_{|\ C = 0}\frac{\partial T_x}{\partial A} + \frac{\partial T_y}{\partial C}_{|\ C = 0}\frac{\partial T_x}{\partial B}[/tex]
[Eq. 9.7]
[tex]\frac{\partial T_y}{\partial C} =\frac{\partial T_x}{\partial C}_{|\ C = 0}\frac{\partial T_y}{\partial A} + \frac{\partial T_y}{\partial C}_{|\ C = 0}\frac{\partial T_y}{\partial B}[/tex]

To prove theorem 9-1, let [itex]C = \alpha + \beta[/itex], differentiate both sides of the coordinate equations implied by [itex]T(\alpha + \beta) = T( T(\alpha),\beta)[/itex] with respect to [itex]\beta [/itex]. Set [itex]\beta = 0[/itex]. Then deeply contemplate the verbal interpretation of the notation in the result!

Develop condensed notation for Theorem 9-1 by making the definitions:

[Eq. 9.8]
[tex]u_x(A,B) = \frac{\partial T_x}{\partial C}_{|\ C = 0}[/tex]
[Eq. 9.9]
[tex]u_y(A,B) = \frac{\partial T_y}{\partial C}_{|\ C = 0}[/tex]

[Eq. 9.10]
[tex]U_x(A,B,C) = u_x \frac{\partial T_x}{\partial A} + u_y \frac{\partial T_x}{\partial B}[/tex]
[Eq 9.11]
[tex]U_y(A,B,C) = u_x \frac{\partial T_y}{\partial A} + u_y \frac{\partial T_y}{\partial B}[/tex]

With that notation Theorem 9-1 amounts to:

[Eq. 9.12]
[tex]\frac{\partial T_x}{\partial C} = U_x[/tex]
[Eq 9.13]
[tex]\frac{\partial T_y}{\partial C}= U_y[/tex]
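Eq. 9.12 and 9.13 can be checked numerically for a concrete group. The sketch below does this with central differences for the rotation group and for a scaling group [itex]T_x = Ae^{C},\ T_y = Be^{2C}[/itex]. The scaling group, the sample point and the helper names are my own illustrative choices and are not part of the derivation above.

```python
import math

def rotation(A, B, C):
    return (A * math.cos(C) - B * math.sin(C),
            A * math.sin(C) + B * math.cos(C))

def scaling(A, B, C):
    """T_x = A e^C, T_y = B e^(2C): a second example group (an assumption)."""
    return (A * math.exp(C), B * math.exp(2 * C))

def partials(T, A, B, C, h=1e-5):
    """Central-difference partials of (T_x, T_y) with respect to A, B and C."""
    dA = [(p - m) / (2 * h) for p, m in zip(T(A + h, B, C), T(A - h, B, C))]
    dB = [(p - m) / (2 * h) for p, m in zip(T(A, B + h, C), T(A, B - h, C))]
    dC = [(p - m) / (2 * h) for p, m in zip(T(A, B, C + h), T(A, B, C - h))]
    return dA, dB, dC

def theorem_9_1_residual(T, A, B, C):
    """Largest gap between the two sides of Eq. 9.6 and Eq. 9.7."""
    dA, dB, dC = partials(T, A, B, C)
    _, _, dC0 = partials(T, A, B, 0.0)
    u_x, u_y = dC0  # Eq. 9.8 and Eq. 9.9
    return max(abs(dC[k] - (u_x * dA[k] + u_y * dB[k])) for k in range(2))

for group in (rotation, scaling):
    # Residuals should be tiny (limited only by the finite differences).
    print(group.__name__, theorem_9_1_residual(group, 0.7, -0.4, 0.9))
```

Both maps satisfy Eq. 9.4 and Eq. 9.5, so a residual that is not close to zero would point to a mistake in the check rather than in the theorem.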

Define the differential operator [itex]U[/itex] acting on a real valued function [itex]f[/itex] of two variables [itex](S,W)[/itex] by:

[Eq. 9.14]
[tex]U f = U_x \frac{\partial f}{\partial S} + U_y \frac{\partial f}{\partial W}[/tex]

An equivalent definition of [itex] U [/itex] for a matrix group might be written in a simpler manner. In a matrix group, [itex]T_x[/itex] and [itex] T_y[/itex] are linear in [itex]A[/itex] and [itex]B[/itex]. Operations such as [itex]\frac{\partial T_x}{\partial A}[/itex] or [itex]\frac{\partial T_y}{\partial B}[/itex] will "pick-off" the functions of [itex]C[/itex] that are the coefficients of [itex]A[/itex] and [itex]B[/itex]. (For example, [itex] T_x(x,y,\alpha) = x \cos(\alpha) - y \sin(\alpha),\ \frac{\partial T_x}{\partial x} = \cos(\alpha) [/itex].) For matrix groups, it may be simpler to write a definition of [itex]U[/itex] that specifies the functions that are "picked off" directly by stating them as a matrix rather than doing this implicitly via the above definitions of [itex]U_x, U_y[/itex].

For [itex]U[/itex] to be well defined, we must have specified a particular 1-parameter group since its definition depends on the definitions of [itex]U_x, U_y[/itex]. The definition of [itex]U[/itex] does not enforce any relationship between the variables [itex]S,W[/itex] and the variables [itex]A,B,C[/itex] involved with the group.

The most important application of [itex]U[/itex] will be in the particular case when [itex]S = T_x(A,B,C), W=T_y(A,B,C)[/itex].

Theorem 9-2: Let [itex]f(S,W)[/itex] be a real valued function with [itex]S = T_x(A,B,C), W=T_y(A,B,C)[/itex]. Then

[Eq. 9.15]
[tex]\frac{\partial f}{\partial C} = U f[/tex]

This is proven by using the chain rule, theorem 9-1 and the definitions relating to [itex] U [/itex] above.

If [itex]f(S,W)[/itex] is understood to be evaluated at [itex]S = T_x(A,B,C), W=T_y(A,B,C)[/itex] then functions derived from it such as [itex]\frac{\partial f}{\partial S}[/itex] or [itex]\frac{\partial f}{\partial C}[/itex] are understood to be evaluated at the same values. Hence Theorem 9-2 applies to them also and so we have results such as:

[Eq. 9.16]
[tex]\frac{\partial}{\partial C} \ (\frac{\partial f}{\partial S}) = U \ \frac{\partial f}{\partial S}[/tex]

[Eq. 9.17]
[tex]\frac{\partial^2 f}{\partial C^2} = \frac{\partial}{\partial C} \ (\frac{\partial f}{\partial C}) = U \ \frac{\partial f}{\partial C} = U (U f)[/tex]

Using the values [itex]A = x, B= y, C =\alpha[/itex] the above reasoning shows (at least informally) that for any given integer [itex]n > 0[/itex]

[Eq 9.18]
[tex]D^n_\alpha f(T_x(x,y,\alpha),T_y(x,y,\alpha)) = U^n f(T_x(x,y,\alpha),T_y(x,y,\alpha)) [/tex]

We now come to the question of how to prove eq 9.3, which said
[tex] (U^n f(T_x(x,y,\alpha),T_y(x,y,\alpha)))_{|\ \alpha = 0} = U^n f(x,y) [/tex]
(and also, whether it is actually true for all 1-parameter groups!)

I can't decide whether this result requires some sophisticated continuity argument about differential operators or whether the result follows just from verbally interpreting the notation and definitions involved with [itex] U [/itex].

We can contrast the meanings of:

[expression 9-A]
[tex] U^n\ f(x,y) [/tex]

[expression 9-B]
[tex] (U^n\ f(T_x(x,y,\alpha),T_y(x,y,\alpha)))_{|\ \alpha = 0} [/tex]

Using eq. 9.4, we can rewrite [itex] f(x,y) [/itex] so expression 9-A becomes:

[expression 9-C]
[tex] U^n\ f( T_x(x,y,0), T_y(x,y,0)) [/tex]

This establishes that the variables in the definition of [itex] U^n f(x,y) [/itex] are [itex] S = T_x(A,B,C), W = T_y(A,B,C), A = x, B = y, C = 0 [/itex]. As I interpret 9-C, it means that the relationships among the variables are enforced (including the fact that [itex]C = 0[/itex]) and then the operator [itex] U^n [/itex] is applied to [itex] f(S,W) [/itex].

My interpretation of expression 9-B is that the relationships [itex] A = x, B = y , C =\alpha, S = T_x(A,B,C), W= T_y(A,B,C) [/itex] are enforced. The operation of [itex] U^n [/itex] is applied to [itex] f(S,W)[/itex]. Then [itex] C [/itex] is set to zero after all that is done.

The question of whether expression 9-C equals expression 9-B hinges on whether the operation of setting [itex] C = 0 [/itex] commutes with the operation of applying [itex] U^n [/itex].

I don't know much about differential operators, but I'll speculate that proving 9.3 by a continuity property of operators would involve something like showing:
[tex] (U^n f(T_x(x,y,\alpha),T_y(x,y,\alpha)))_{|\ \alpha = 0} = \lim_{\alpha \rightarrow 0} U^n f(T_x(x,y,\alpha),T_y(x,y,\alpha)) [/tex]
[tex] = U^n \big(\lim_{\alpha \rightarrow 0} f(T_x(x,y,\alpha),T_y(x,y,\alpha))\big) = U^n f(T_x(x,y,0),T_y(x,y,0)) = U^n f(x,y) [/tex]

The significant feature of how [itex] U [/itex] has been defined is that it does not involve any differentiations with respect to [itex] C [/itex]. If there is a simple proof of 9.3 by parsing definitions, it would involve the claim that all functions involved in [itex] U^n [/itex], such as [itex] U_x(A,B,C), U_y(A,B,C), f(T_x(A,B,C),T_y(A,B,C))[/itex] and their various partial derivatives with respect to [itex] A, B [/itex] give the same results, whether you first substitute [itex] C = 0 [/itex] and perform the differentiations with respect to [itex] A, B [/itex] or whether you do the differentiations first and then substitute [itex] C = 0 [/itex].

jostpuur · Jul 23, 2013

Is there still a problem with this? I already thought that everything was reasonably solved.

Stephen Tashi · Jul 23, 2013

jostpuur said:

Is there still a problem with this? I already thought that everything was reasonably solved.

My goal is to define [itex] U [/itex] explicitly as a differential operator. (I don't know why definitions of [itex] U [/itex] need to be so indirect and obfuscated. If [itex] U [/itex] is a differential operator then one idea for defining it is to define it in terms of differentiation - or is that too far out?)

I don't think the definition of [itex] U [/itex] as a differential operator that I offered in post #6 works in general, so I have proposed a different one. The definition I proposed in post #6, when applied to [itex] f( T_x(x,y,\alpha), T_y(x,y,\alpha) ) [/itex], involves differentiating [itex] f [/itex] with respect to its first and second arguments, but does not include the factors that come from differentiating [itex] T_x [/itex] and [itex] T_y [/itex] with respect to their first and second arguments. Perhaps in your work, you were already using the definition that I propose in post #67.
