# Lorentz invariance of Klein-Gordon eqn & Maxwell Lagrangian

1. Apr 20, 2016

### spaghetti3451

1. The problem statement, all variables and given/known data

1. Show directly that if $\varphi(x)$ satisfies the Klein-Gordon equation, then $\varphi(\Lambda^{-1}x)$ also satisfies this equation for any Lorentz transformation $\Lambda$.

2. Show that $\mathcal{L}_{Maxwell}=-\frac{1}{4}F_{\mu\nu}F^{\mu\nu}$ is invariant under the Lorentz transformation $x \rightarrow \Lambda x$.

2. Relevant equations

3. The attempt at a solution

The Klein-Gordon equation is $\partial^{\mu}\partial_{\mu}\varphi(x) + m^{2}\varphi(x)=0$.

Under a Lorentz transformation $x \rightarrow \Lambda x$, the Klein-Gordon equation becomes

${(\Lambda^{-1})_{\rho}}^{\mu}\partial^{\rho}{(\Lambda^{-1})_{\sigma}}^{\mu}\partial^{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0$

$\implies {(\Lambda^{-1})_{\rho}}^{\mu}{(\Lambda^{-1})_{\sigma}}^{\mu}\partial^{\rho}\partial^{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0$

$\implies {(\Lambda^{-1})_{\rho}}^{\mu}{\Lambda^{\mu}}_{\sigma}\partial^{\rho}\partial^{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0$

Am I correct so far?

2. Apr 20, 2016

### strangerep

You've misplaced one of your $\mu$'s. One must be upstairs, and the other downstairs, as in your original KG eqn.

3. Apr 21, 2016

### spaghetti3451

Ok. Let me start again.

The Klein-Gordon equation is $\partial^{\mu}\partial_{\mu}\varphi(x) + m^{2}\varphi(x)=0$.

Under a Lorentz transformation $x \rightarrow \Lambda x$, the Klein-Gordon equation becomes

${(\Lambda^{-1})_{\rho}}^{\mu}\partial^{\rho}{(\Lambda^{-1})^{\sigma}}_{\mu}\partial_{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0$

$\implies {(\Lambda^{-1})_{\rho}}^{\mu}{(\Lambda^{-1})^{\sigma}}_{\mu}\partial^{\rho}\partial_{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0$

$\implies {(\Lambda^{-1})_{\rho}}^{\mu}{(\Lambda)_{\mu}}^{\sigma}\partial^{\rho}\partial_{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0$

$\implies {\eta_{\rho}}^{\sigma}\partial^{\rho}\partial_{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0$

$\implies \partial^{\rho}\partial_{\rho}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0$

Doesn't this mean that the Klein-Gordon equation is Lorentz invariant?
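As a numerical sanity check on the derivation above, here is a minimal sketch that applies the Klein-Gordon operator by central differences to a plane-wave solution and to its boosted version $\varphi(\Lambda^{-1}x)$. The signature $(+,-,-,-)$, the units $c=1$, the sample wave vector, and the helper names `boost_x` and `kg_residual` are assumptions made for illustration, not something fixed by the problem statement.

```python
import numpy as np

# Assumed conventions: signature (+,-,-,-), c = 1, mass m = 1.
eta = np.diag([1.0, -1.0, -1.0, -1.0])
m = 1.0

def boost_x(beta):
    """Boost along x with velocity beta (|beta| < 1)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

# On-shell wave vector k^mu, so that cos(k.x) solves the Klein-Gordon equation.
k_space = np.array([0.3, -0.2, 0.5])
k = np.concatenate(([np.sqrt(m**2 + k_space @ k_space)], k_space))

def phi(x):
    """Plane-wave solution cos(k_mu x^mu)."""
    return np.cos(k @ eta @ x)

def kg_residual(f, x, h=1e-3):
    """(d^mu d_mu + m^2) f at the point x, using central differences."""
    box = 0.0
    for mu in range(4):
        e = np.zeros(4)
        e[mu] = h
        box += eta[mu, mu] * (f(x + e) - 2.0 * f(x) + f(x - e)) / h**2
    return box + m**2 * f(x)

Lam = boost_x(0.6)
Lam_inv = np.linalg.inv(Lam)

def phi_boosted(x):
    """The transformed field, phi evaluated at Lambda^{-1} x."""
    return phi(Lam_inv @ x)

x0 = np.array([0.7, -1.1, 0.4, 2.0])
# Both residuals should vanish up to finite-difference error.
print(kg_residual(phi, x0), kg_residual(phi_boosted, x0))
```

This only tests one solution at one point, so it is a plausibility check rather than a proof; the index manipulation above is what establishes the general result.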

4. Apr 21, 2016

### strangerep

It looks ok to me.

5. Apr 21, 2016

### spaghetti3451

In retrospect, I think I should put brackets around $\partial^{\rho}\partial_{\sigma}\phi$ starting from the fourth line of my previous post.

That's because we have $(\partial^{\rho}\partial_{\sigma}\phi)(\Lambda^{-1}x)$, i.e. $\rho$ and $\sigma$ ought to be indices of the argument $\Lambda^{-1}x$, not of the argument $x$.

6. Apr 21, 2016

### strangerep

Indeed. I wasn't sure how pedantic you wanted to get. One could rephrase the problem as: show that the equation $$\eta^{\mu\nu} \frac{\partial^2}{\partial x^\mu \, \partial x^\nu} \, \phi(x) ~+~ m^2 \phi(x) ~=~ 0$$remains form-invariant under the change of coordinates $x^\mu \to y^\mu = \Lambda^\mu_{~\sigma} x^\sigma$ ...

Strictly speaking, you'd also have to show that the domain of $y$ for which the equation makes sense coincides with the original domain of $x$.

7. Apr 22, 2016

### spaghetti3451

How'd you go about proving that the domains of $x$ and $y$ coincide?

8. Apr 22, 2016

### strangerep

Well, first state the domain of $x$. All of Minkowski space, right? Then prove that the coordinate mappings specified by any Lorentz transformation are bijective. (This is actually rather easy -- don't over-think it. If you need a simpler warmup, consider ordinary 3-space rotations first, then generalize to Lorentz.)

9. Apr 22, 2016

### strangerep

Another way to make the original problem notationally simpler is to note that $m^2 \phi$ is already a Lorentz scalar. Then re-express the 1st term like a linear algebra equation. I.e., think of $\partial$ as a 4-component column vector. The Laplacian operator (strictly, its Minkowski-space analogue, the d'Alembertian) can then be expressed as $$\Delta ~=~ \partial^T \eta \partial ~,$$where $\eta$ is the 4x4 Minkowski metric matrix. Also note that since $\Lambda$ is independent of $x$, it can pass through the derivative operators if necessary. In the transformed coordinate system, the new Laplacian, i.e., $\Delta'$ is $$\Delta' ~=~ \partial'^T \eta' \partial' ~=~ (\Lambda\partial)^T \eta (\Lambda\partial) ~=~ \partial^T \Lambda^T \eta \Lambda \partial ~=~ \partial^T \eta \partial ~=~ \Delta~,$$since $\eta$ is invariant under Lorentz transformations (so $\eta' = \eta$), and every Lorentz matrix satisfies $\Lambda^T \eta \Lambda = \eta$. Here $\Lambda$ stands for whichever Lorentz matrix acts on the components of $\partial$. Note that $\Lambda^T = \Lambda^{-1}$ holds only for pure rotations, not for boosts, so the metric has to sit between $\Lambda^T$ and $\Lambda$.
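The index-free manipulation can be tried out numerically. The sketch below is illustrative only: it assumes the signature $(+,-,-,-)$, builds a sample Lorentz matrix from a boost and a rotation (the helpers `boost_x` and `rotation_z` are hypothetical names), and uses an ordinary 4-component column `p` as a stand-in for the column of derivatives. It checks that the quadratic form $p^T \eta p$ is unchanged when $p$ is multiplied by a Lorentz matrix.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed signature (+,-,-,-)

def boost_x(beta):
    """Boost along x with velocity beta (|beta| < 1)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

def rotation_z(theta):
    """Spatial rotation about the z axis by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.eye(4)
    R[1, 1], R[1, 2] = c, -s
    R[2, 1], R[2, 2] = s, c
    return R

Lam = boost_x(0.6) @ rotation_z(0.4)

# The metric-preserving property that the algebra relies on.
metric_preserved = np.allclose(Lam.T @ eta @ Lam, eta)

# Stand-in for the column of derivatives: any 4-component column.
p = np.array([1.3, 0.2, -0.7, 0.5])
p_new = Lam @ p
form_preserved = np.isclose(p_new @ eta @ p_new, p @ eta @ p)

print(metric_preserved, form_preserved)
```

Replacing `Lam` by a generic invertible matrix should make both checks fail, which is a quick way to see that the invariance is specific to Lorentz matrices.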

[Edit: I just realized I was using the wrong symbol for the Laplacian. Corrected now.]

Last edited: Apr 22, 2016

10. Apr 22, 2016

### spaghetti3451

To show that the coordinate transformation specified by some Lorentz transformation is bijective, we need to prove that the Lorentz transformation is both injective and surjective.

An injective transformation is one that maps distinct elements of its domain to distinct elements of its codomain. Given two distinct spacetime points $x_{1}$ and $x_{2}$ in our domain, a particular Lorentz transformation $\Lambda$ maps them to different spacetime points $y_{1}$ and $y_{2}$ in our codomain.

A surjective transformation is one for which every element of the codomain is the image of at least one element of the domain. Given a spacetime point $y$ in our codomain, a particular Lorentz transformation $\Lambda$ maps at least one spacetime point $x$ in our domain to $y$.

Not really sure if this constitutes a proof, though.

This is really nice!

11. Apr 22, 2016

### strangerep

You can probably take a shortcut by appealing to the fact that Lorentz transformations form a group. Then, having shown that any point in Minkowski space is mapped by an arbitrary Lorentz transformation into another point in Minkowski space, you're done -- because every element of a group has an inverse, and that's only possible here if the transformations are indeed bijective. (If it failed the injectivity requirement, a well-defined inverse transformation wouldn't exist.)
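To make the existence of the inverse concrete, here is a small sketch under assumed conventions (signature $(+,-,-,-)$, a sample boost built by the hypothetical helper `boost_x`). It uses the fact that $\Lambda^T \eta \Lambda = \eta$ together with $\eta^2 = 1$ gives the explicit inverse $\Lambda^{-1} = \eta \Lambda^T \eta$, and then checks that mapping a point forward and back recovers it.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed signature (+,-,-,-)

def boost_x(beta):
    """Boost along x with velocity beta (|beta| < 1)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

Lam = boost_x(0.8)

# Explicit inverse built from the metric-preserving property.
Lam_inv = eta @ Lam.T @ eta
inverse_ok = np.allclose(Lam_inv @ Lam, np.eye(4))

# A nonzero determinant means the linear map is invertible, hence bijective.
det_ok = np.isclose(abs(np.linalg.det(Lam)), 1.0)

# Round trip: x -> y = Lam x -> Lam_inv y returns the original point.
x = np.array([2.0, -1.0, 0.5, 3.0])
y = Lam @ x
round_trip_ok = np.allclose(Lam_inv @ y, x)

print(inverse_ok, det_ok, round_trip_ok)
```

Since a Lorentz transformation is a linear map on all of Minkowski space, having an inverse that is defined everywhere is exactly the bijectivity being asked for.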

Yes -- it's good to become "bilingual". I.e., able to work with both index-free and index-full notations.

12. Apr 22, 2016

### spaghetti3451

Let me try question $2$.

First, let me do it the hard way:

$\mathcal{L}_{\text{Maxwell}}=-\frac{1}{4}F_{\mu\nu}F^{\mu\nu}=-\frac{1}{4}(\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu})(\partial^{\mu}A^{\nu}-\partial^{\nu}A^{\mu})$

$\rightarrow -\frac{1}{4}[{(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)_{\nu}}^{\sigma}(\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)-{(\Lambda^{-1})^{\rho}}_{\nu}{(\Lambda)_{\mu}}^{\sigma}(\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)][{(\Lambda^{-1})_{\alpha}}^{\mu}{(\Lambda)^{\nu}}_{\beta}(\partial^{\alpha}A^{\beta})(\Lambda^{-1}x)-{(\Lambda^{-1})_{\alpha}}^{\nu}{(\Lambda)^{\mu}}_{\beta}(\partial^{\alpha}A^{\beta})(\Lambda^{-1}x)]$

$= -\frac{1}{4}[{(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)_{\nu}}^{\sigma}{(\Lambda^{-1})_{\alpha}}^{\mu}{(\Lambda)^{\nu}}_{\beta}-{(\Lambda^{-1})^{\rho}}_{\nu}{(\Lambda)_{\mu}}^{\sigma}{(\Lambda^{-1})_{\alpha}}^{\mu}{(\Lambda)^{\nu}}_{\beta}-{(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)_{\nu}}^{\sigma}{(\Lambda^{-1})_{\alpha}}^{\nu}{(\Lambda)^{\mu}}_{\beta}+{(\Lambda^{-1})^{\rho}}_{\nu}{(\Lambda)_{\mu}}^{\sigma}{(\Lambda^{-1})_{\alpha}}^{\nu}{(\Lambda)^{\mu}}_{\beta}](\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)(\partial^{\alpha}A^{\beta})(\Lambda^{-1}x)$

$= -\frac{1}{4}[{(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)^{\mu}}_{\alpha}{(\Lambda^{-1})^{\sigma}}_{\nu}{(\Lambda)^{\nu}}_{\beta}-{(\Lambda^{-1})^{\rho}}_{\nu}{(\Lambda)^{\nu}}_{\beta}{(\Lambda^{-1})_{\alpha}}^{\mu}{(\Lambda)_{\mu}}^{\sigma}-{(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)^{\mu}}_{\beta}{(\Lambda^{-1})_{\alpha}}^{\nu}{(\Lambda)_{\nu}}^{\sigma}+{(\Lambda^{-1})^{\rho}}_{\nu}{(\Lambda)^{\nu}}_{\alpha}{(\Lambda^{-1})^{\sigma}}_{\mu}{(\Lambda)^{\mu}}_{\beta}](\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)(\partial^{\alpha}A^{\beta})(\Lambda^{-1}x)$

$= -\frac{1}{4}[{\eta^{\rho}}_{\alpha}{\eta^{\sigma}}_{\beta}-{\eta^{\rho}}_{\beta}{\eta_{\alpha}}^{\sigma}-{\eta^{\rho}}_{\beta}{\eta_{\alpha}}^{\sigma}+{\eta^{\rho}}_{\alpha}{\eta^{\sigma}}_{\beta}](\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)(\partial^{\alpha}A^{\beta})(\Lambda^{-1}x)$

$=-\frac{1}{4}[(\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)(\partial^{\rho}A^{\sigma})(\Lambda^{-1}x)-(\partial_{\rho}A_{\alpha})(\Lambda^{-1}x)(\partial^{\alpha}A^{\rho})(\Lambda^{-1}x)-(\partial_{\rho}A_{\alpha})(\Lambda^{-1}x)(\partial^{\alpha}A^{\rho})(\Lambda^{-1}x)+(\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)(\partial^{\rho}A^{\sigma})(\Lambda^{-1}x)]$

$=-\frac{1}{4}[(\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)-(\partial_{\sigma}A_{\rho})(\Lambda^{-1}x)][(\partial^{\rho}A^{\sigma})(\Lambda^{-1}x)-(\partial^{\sigma}A^{\rho})(\Lambda^{-1}x)]$

$=-\frac{1}{4}F_{\rho\sigma}(\Lambda^{-1}x)F^{\rho\sigma}(\Lambda^{-1}x)$.

Now, the easy way:

$\mathcal{L}_{\text{Maxwell}} = -\frac{1}{4}F_{\mu\nu}(x)F^{\mu\nu}(x)$

$\rightarrow -\frac{1}{4}{(\Lambda)_{\mu}}^{\rho}{(\Lambda)_{\nu}}^{\sigma}{(\Lambda)^{\mu}}_{\alpha}{(\Lambda)^{\nu}}_{\beta}F_{\rho\sigma}(\Lambda^{-1}x)F^{\alpha\beta}(\Lambda^{-1}x)$

$= -\frac{1}{4}{(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)^{\mu}}_{\alpha}{(\Lambda^{-1})^{\sigma}}_{\nu}{(\Lambda)^{\nu}}_{\beta}F_{\rho\sigma}(\Lambda^{-1}x)F^{\alpha\beta}(\Lambda^{-1}x)$

$= -\frac{1}{4} {\eta^{\rho}}_{\alpha} {\eta^{\sigma}}_{\beta} F_{\rho\sigma}(\Lambda^{-1}x)F^{\alpha\beta}(\Lambda^{-1}x)$

$= -\frac{1}{4} F_{\rho\sigma}(\Lambda^{-1}x)F^{\rho\sigma}(\Lambda^{-1}x)$.
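The cancellation of the four $\Lambda$ factors can also be checked numerically at a single point. The sketch below is only illustrative: it assumes the signature $(+,-,-,-)$, fills an antisymmetric $F^{\mu\nu}$ with random numbers rather than a physical field, and uses hypothetical helper names (`boost_x`, `rotation_z`, `lower`, `invariant`). It compares $F_{\mu\nu}F^{\mu\nu}$ before and after transforming both indices.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed signature (+,-,-,-)

def boost_x(beta):
    """Boost along x with velocity beta (|beta| < 1)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

def rotation_z(theta):
    """Spatial rotation about the z axis by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.eye(4)
    R[1, 1], R[1, 2] = c, -s
    R[2, 1], R[2, 2] = s, c
    return R

def lower(T_up):
    """Lower both indices: T_{mu nu} = eta_{mu a} T^{a b} eta_{b nu}."""
    return eta @ T_up @ eta

def invariant(F_up):
    """Double contraction F_{mu nu} F^{mu nu}."""
    return np.einsum("mn,mn->", lower(F_up), F_up)

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
F_up = M - M.T  # antisymmetric sample values for F^{mu nu}

Lam = boost_x(0.6) @ rotation_z(0.4)
# F'^{mu nu} = Lam^mu_a Lam^nu_b F^{a b}, written as a matrix product.
F_up_new = Lam @ F_up @ Lam.T

print(invariant(F_up), invariant(F_up_new))
```

This covers only the index structure at one point; the shift of the argument to $\Lambda^{-1}x$ is handled by the derivation above, not by this check.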

What do you think?

13. Apr 22, 2016

### strangerep

Well, I guess it's ok to thrash yourself in this masochistic fashion once in a while, if you feel that your sins warrant it.

Personally, I would have simply shown that $F_{\mu\nu}$ does indeed transform as a 2nd-rank tensor under Lorentz transformations (for which a cut-down edit of your first bit would be enough). Then show that any contraction between upper and lower indices of a vector results in a scalar. Then generalize this to a double contraction over the 2 indices in a 2nd-rank tensor (which is also a scalar, hence Lorentz-invariant).

I.e., build up a little toolkit for yourself so that you can eventually just look at a tensorial expression involving contractions and understand what rank tensor the overall expression represents.

14. Apr 24, 2016

### spaghetti3451

I did not notice that $F_{\mu\nu}F^{\mu\nu}$ involved a double contraction. Obviously, a (double) contraction results in a scalar.

But, let me try to explicitly demonstrate the Lorentz invariance of $\mathcal{L}_{Maxwell}$ using your guidelines anyway.

Firstly, I need to show that $F_{\mu\nu}$ transforms as a $(0,2)$ tensor under Lorentz transformations. To demonstrate this fact, I will use the fact that $F_{\mu\nu}=\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu}$ and that $A_{\mu}$ is a vector.

$F_{\mu\nu}(x) = (\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu})(x)$
$\rightarrow {(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)_{\nu}}^{\sigma}(\partial_{\rho}A_{\sigma}-\partial_{\sigma}A_{\rho})(\Lambda^{-1}x)$
$= {(\Lambda)_{\mu}}^{\rho}{(\Lambda)_{\nu}}^{\sigma}(\partial_{\rho}A_{\sigma}-\partial_{\sigma}A_{\rho})(\Lambda^{-1}x)$
$= {(\Lambda)_{\mu}}^{\rho}{(\Lambda)_{\nu}}^{\sigma}F_{\rho\sigma}(\Lambda^{-1}x)$

so that $F_{\mu\nu}$ transforms as a $(0,2)$ tensor under Lorentz transformations.

Then I need to show that any contraction between the upper index of one vector and the lower index of another vector results in a scalar. And finally, I need to generalize this to a double contraction over the 2 indices in a 2nd-rank tensor.

To show that the contractions lead to a scalar, don't I have to Lorentz transform both $F^{\mu\nu}(x)$ and $F_{\mu\nu}(x)$ and show that the resulting Lorentz transformation matrices cancel out among themselves to give us $F^{\mu\nu}(\Lambda^{-1}x)$ and $F_{\mu\nu}(\Lambda^{-1}x)$? That's exactly what I've done in the second proof of my previous post without really knowing that I was Lorentz transforming the scalar $F^{\mu\nu}F_{\mu\nu}$.

So, I guess I can use your guidelines as an informal way to convince myself of the Lorentz invariance of $\mathcal{L}_{Maxwell}$, and use my second proof of the previous post as a formal way to prove the Lorentz invariance of $\mathcal{L}_{Maxwell}$.

15. Apr 24, 2016

### strangerep

You could shorten this further by just considering $\partial_{\mu}A_{\nu}$ first. Then appeal to the fact that the sum of 2 tensors of the same rank (i.e., with the same sets of free indices) is also a tensor of that rank.

You could shorten this by simply noting that a contraction like $v^\alpha v_\alpha$ can be written similarly to what I did in post #9, i.e., $v^T \eta v$. For the double contraction on a 2-tensor, you can write it like this: $$F^{\mu\nu} F_{\mu\nu} ~=~ Tr ( F^T \eta F \eta ) ~.$$Then replace all quantities by their dashed versions, substitute in the Lorentz matrices, and perform manipulations that generalize what I did in post #9. But, tbh, this is probably overkill, because... (see below)...

Well, it's more than just "informal". When one writes $v^\alpha w_\alpha$, that's an inner product on vectors in Minkowski space. Lorentz transformations are designed to preserve all such inner products. Indeed, that's one way of defining the Lorentz group. So $v^\alpha w_\alpha$ is automatically a Lorentz scalar by definition.

16. Apr 25, 2016

### spaghetti3451

Can you please explain why $F^{\mu\nu} F_{\mu\nu} ~=~ Tr ( F^T \eta F \eta ) ~$? I have never seen a double contraction written in index-free notation before, hence the query.

Weren't Lorentz transformations originally designed as a consequence of the postulates of special relativity? I always thought that an inner product is a Lorentz scalar due to the Lorentz transformation rules of a $4$-vector, not the other way round.

17. Apr 25, 2016

### strangerep

$$F^{\mu\nu} F_{\mu\nu} = F^{\mu\nu} F^{\alpha\beta} \eta_{\mu\alpha} \eta_{\nu\beta} = F^{\mu\nu} \eta_{\mu\alpha} F^{\alpha\beta} \eta_{\nu\beta} = (F^T)^{\nu\mu} \eta_{\mu\alpha} F^{\alpha\beta} \eta_{\beta\nu}~,$$where, in the last step I've used the definition of "transpose" and the symmetry of the metric $\eta$. The "Tr" operation is just there to perform the same task in the index-free version as contracting over the last index $\nu$.

If you're still having trouble with this, try writing out the ordinary product of two 3x3 matrices $A,B$ and the resultant matrix $C$ (i.e., $AB=C$) in component notation. And do you understand what $Tr(C)$ means in that context?
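If it helps, the component bookkeeping can be spelled out with `numpy.einsum`, where repeated index letters are summed over. This is a sketch with random sample matrices and an assumed signature $(+,-,-,-)$; it first does the $AB=C$ and $Tr(C)$ warm-up and then compares the double contraction with the trace form.

```python
import numpy as np

rng = np.random.default_rng(1)

# Warm-up: C_ik = A_ij B_jk and Tr(C) = C_ii for 3x3 matrices.
A = rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3))
C = np.einsum("ij,jk->ik", A, B)
product_ok = np.allclose(C, A @ B)
trace_ok = np.isclose(np.einsum("ii->", C), np.trace(C))

# Double contraction versus trace form for an antisymmetric F^{mu nu}.
eta = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed signature (+,-,-,-)
M = rng.normal(size=(4, 4))
F = M - M.T                      # sample values for F^{mu nu}
F_low = eta @ F @ eta            # F_{mu nu}, both indices lowered
double_contraction = np.einsum("mn,mn->", F, F_low)
trace_form = np.trace(F.T @ eta @ F @ eta)
identity_ok = np.isclose(double_contraction, trace_form)

print(product_ok, trace_ok, identity_ok)
```

The trace plays the role of the final sum over $\nu$: the matrix product $F^T \eta F \eta$ leaves two free indices, and setting them equal and summing closes the chain.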

Lorentz transformations were discovered to be applicable in that context (actually they were discovered earlier than Einstein, but let's not get into that).

The logic is bidirectional. The modern definition of the Lorentz group is simply the group of matrices that preserve an indefinite inner product of the form $ds^2 = -dt^2 + dx^2 + dy^2 + dz^2$. Hence the group is often written as $O(1,3)$, or $SO(1,3)$ for the subgroup of transformations with unit determinant. One can represent elements of the (abstract) Lorentz group as coordinate transformations in Minkowski space but there are also other possible representations. (A "representation" of a group is a mapping of the elements of the abstract group to operators on some vector space that respects the group multiplication.)
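Taken as a definition, this turns into a simple membership test. The sketch below uses the signature of the $ds^2$ written above; the function name `is_lorentz` and the sample matrices are assumptions for illustration.

```python
import numpy as np

# Metric matching ds^2 = -dt^2 + dx^2 + dy^2 + dz^2.
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def is_lorentz(L, tol=1e-12):
    """True if L preserves the indefinite inner product, L^T eta L = eta."""
    return np.allclose(L.T @ eta @ L, eta, atol=tol)

def boost_x(beta):
    """Boost along x with velocity beta (|beta| < 1)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

boost = boost_x(0.5)
dilation = 2.0 * np.eye(4)               # rescales the interval, so not Lorentz
parity = np.diag([1.0, -1.0, -1.0, -1.0])  # preserves the form, determinant -1

print(is_lorentz(boost), is_lorentz(dilation), is_lorentz(parity))
print(np.linalg.det(boost), np.linalg.det(parity))
```

The determinant is what separates the proper transformations from the rest: the boost has determinant $+1$, while parity preserves the inner product with determinant $-1$.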

18. Apr 26, 2016

Thanks!