# Lorentz invariance and equation of motion for a scalar field

Hi there,

I just saw some lectures where they claim that the Klein-Gordon equation is the lowest-order Lorentz-invariant equation for a scalar field.
But I could easily come up with a Lorentz invariant equation that is first order, e.g.
$$(M^\mu\partial_\mu + m^2)\phi=0$$
where M is a generic matrix.
Now, something should be wrong with this equation, because, as Dirac showed, if we want a first order equation the field needs to be a spinor.

But I don't clearly understand why this first order equation is not Lorentz invariant. I mean, ##M^\mu\partial_\mu## is a scalar, so the equation is invariant, isn't it?
Is it maybe because the matrix M changes form by changing reference system, so that we could find privileged systems (e.g. a reference where the matrix is diagonal)?

hilbert2
Gold Member
Let's say there's only one space coordinate, ##x##, in addition to the time coordinate ##t##. Then that first order equation would be

##\left(a\frac{\partial}{\partial t} - b\frac{\partial}{\partial x} + m^2 \right)\phi (x,t) = 0##,

with ##a## and ##b## some constants. This kind of equation is called an advection equation, as it has a first time derivative and a first space derivative, but there's also the ##m^2 \phi## term, which acts like a source term that depends on how large ##\phi## already is at some point.

You could first assume that ##m=0## and read a bit about the advection equation to deduce whether it would be of any use as a field equation in physics.

vanhees71
Gold Member
2021 Award
For the scalar field to be interpretable as free particles it should also imply the "on-shell condition", i.e.,
$$(\Box+m^2) \Phi(x)=0.$$
The heuristic approach, however, leads in the simplest case to Dirac spinors, i.e., spin-1/2 particles and antiparticles rather than spin-0 particles.
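
To make the on-shell condition concrete, here is a short worked check with a plane-wave ansatz. It is only a sketch: the metric signature ##(+,-,-,-)##, the plane-wave form, and treating the ##M^\mu## of the opening post as ordinary numbers are assumptions that the thread does not state explicitly.
$$\Phi(x) = e^{-i p_\mu x^\mu} \quad\Rightarrow\quad (\Box + m^2)\Phi = (-p_\mu p^\mu + m^2)\,\Phi = 0 \quad\Rightarrow\quad E^2 = \vec{p}^{\,2} + m^2 .$$
The same ansatz in the proposed first-order equation gives instead
$$(M^\mu \partial_\mu + m^2)\Phi = (-i M^\mu p_\mu + m^2)\,\Phi = 0 \quad\Rightarrow\quad M^\mu p_\mu = -i m^2 ,$$
which is linear in the momentum and therefore does not reproduce the relativistic energy-momentum relation.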

For a systematic understanding, it's necessary to study the famous analysis on the unitary representations of the Poincare group and their realization through local field operators. A very good introduction is given in

R. Sexl, H. Urbantke, Relativity, Groups, Particles, Springer

Let's say there's only one space coordinate, ##x##, in addition to the time coordinate ##t##. Then that first order equation would be

##\left(a\frac{\partial}{\partial t} - b\frac{\partial}{\partial x} + m^2 \right)\phi (x,t) = 0##,

with ##a## and ##b## some constants. This kind of equation is called an advection equation, as it has a first time derivative and a first space derivative, but there's also the ##m^2 \phi## term, which acts like a source term that depends on how large ##\phi## already is at some point.

You could first assume that ##m=0## and read a bit about the advection equation to deduce whether it would be of any use as a field equation in physics.

I've read something about this equation, but I still don't get why we cannot use it. Does it have to do with the fact that, as pointed out by vanhees71, such an equation does not satisfy the on-shell condition?

For the scalar field to be interpretable as free particles it should also imply the "on-shell condition", i.e.,
$$(\Box+m^2) \Phi(x)=0.$$
The heuristic approach, however, leads in the simplest case to Dirac spinors, i.e., spin-1/2 particles and antiparticles rather than spin-0 particles.

For a systematic understanding, it's necessary to study the famous analysis on the unitary representations of the Poincare group and their realization through local field operators. A very good introduction is given in

R. Sexl, H. Urbantke, Relativity, Groups, Particles, Springer
Thanks for the reference. I admit my knowledge of group theory is still in its infancy. I still have to read the book, but as far as I remember the Lorentz group, being noncompact, has no unitary representations, does it?

vanhees71
Gold Member
2021 Award
The Lorentz group or, more importantly, the entire Poincare group has no nontrivial unitary finite-dimensional representations, but fortunately it has many physically useful "infinite-dimensional" unitary representations. That's why the quantum mechanical Hilbert spaces of relativistic (as well as nonrelativistic) systems are infinite-dimensional.

Demystifier
Gold Member
But I could easily come up with a Lorentz invariant equation that is first order, e.g.
$$(M^\mu\partial_\mu + m^2)\phi=0$$
where M is a generic matrix.
Now, something should be wrong with this equation, because, as Dirac showed, if we want a first order equation the field needs to be a spinor.

But I don't clearly understand why this first order equation is not Lorentz invariant. I mean, ##M^\mu\partial_\mu## is a scalar, so the equation is invariant, isn't it?
Is it maybe because the matrix M changes form by changing reference system, so that we could find privileged systems (e.g. a reference where the matrix is diagonal)?
Since you suggest that ##\phi## transforms as a scalar and ##M^\mu## as a vector under Lorentz transformations, you will probably find it interesting that the Dirac equation can also be interpreted in that way: https://lanl.arxiv.org/abs/1309.7070

hilbert2
Gold Member
I've read something about this equation, but I still don't get why we cannot use it. Does it have to do with the fact that, as pointed out by vanhees71, such an equation does not satisfy the on-shell condition?

If you have a 1D advection equation for a function ##\phi (x,t)##, the time evolution of an initial state ##\phi (x,t_0 )## is just a translation with constant speed ##v##:

##\phi (x,t_0 + \Delta t) = \phi (x + v\Delta t,t_0 )##.

In the 2D or 3D cases, it is a similar translation in the direction of some velocity vector ##\vec{v}##. There's not much room for any interesting physics in that kind of time evolution. The term dependent on ##m^2##, if not zero, will only make the norm of the function ##\phi## grow or decrease exponentially (if it's normalizable in the first place).
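
As a rough numerical illustration of this behaviour (a minimal sketch, not part of the original argument): for the 1D equation ##(a\partial_t - b\partial_x + m^2)\phi = 0## the initial profile is carried along at ##v = b/a## and, for ##m \neq 0##, multiplied by ##e^{-m^2 t/a}##. The Gaussian profile `f`, the parameter values and all function names below are assumptions chosen only for illustration.

```python
import math

def f(x):
    """Arbitrary smooth initial profile (a Gaussian, chosen only for illustration)."""
    return math.exp(-x * x)

def phi(x, t, a, b, m):
    """Candidate solution: profile translated with v = b/a and rescaled by exp(-m^2 t / a)."""
    return f(x + (b / a) * t) * math.exp(-(m ** 2 / a) * t)

def residual(x, t, a, b, m, h=1e-5):
    """Central-difference estimate of (a d/dt - b d/dx + m^2) phi at the point (x, t)."""
    dphi_dt = (phi(x, t + h, a, b, m) - phi(x, t - h, a, b, m)) / (2 * h)
    dphi_dx = (phi(x + h, t, a, b, m) - phi(x - h, t, a, b, m)) / (2 * h)
    return a * dphi_dt - b * dphi_dx + m ** 2 * phi(x, t, a, b, m)

# Hypothetical parameter values; any a != 0 works.
a, b, m = 2.0, 3.0, 0.5

# Largest residual on a small grid of sample points; it should be tiny
# (limited only by finite-difference and rounding error).
worst = max(abs(residual(x / 10, t / 10, a, b, m))
            for x in range(-20, 21) for t in range(0, 11))
print(worst)
```

The residual stays at the level of finite-difference error, which is consistent with the solution being nothing more than a rigidly translated, exponentially rescaled copy of the initial data.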

samalkhaiat
But I could easily come up with a Lorentz invariant equation that is first order, e.g.
$$(M^\mu\partial_\mu + m^2)\phi=0$$
where M is a generic matrix.
Now, something should be wrong with this equation
How about, almost everything is wrong with that equation:
1) If, as you say, $M^{\mu}$ is not $\partial^{\mu}$ but a “generic matrix”, then the expression $M^{\mu}\partial_{\mu} + m^{2}$ is meaningless because the two terms have different physical units. In natural units, the dimension of the first term is $\mbox{cm}^{-1}$ while the dimension of $m^{2}$ is $\mbox{cm}^{-2}$.
2) You said that $\phi$ is a scalar field. In 4-dimensional spacetime, a (real) scalar field can be described either by a single function or (equivalently) by a 5-component field treating $(\phi , \partial_{\mu}\phi )$ as independent variables. In the first case (i.e., when $\phi$ is a 1-component field), your “matrices” $M^{\mu}$ must be $1 \times 1$ matrices. Thus, you must take $M^{\mu} = \partial^{\mu}$ so that the correct dispersion relation $E^{2} = P^{2} + m^{2}$ holds. In the second (5-component) case, the correct first-order equation (for a real scalar field) looks exactly like the Dirac equation
$$\left( i \Gamma^{\mu} \partial_{\mu} - m \right) \Psi (x) = 0 , \qquad (1)$$
with
$$\Psi = \left( \varphi , \psi_{0} , \psi_{1}, \psi_{2} , \psi_{3} \right)^{T} ,$$
and the $\Gamma$’s are a set of four $5 \times 5$ matrices satisfying the Duffin-Kemmer algebra
$$\Gamma^{\mu}\Gamma^{\rho}\Gamma^{\nu} + \Gamma^{\nu}\Gamma^{\rho}\Gamma^{\mu} = \eta^{\mu \rho}\Gamma^{\nu} + \eta^{\nu\rho}\Gamma^{\mu} . \qquad (2)$$
Notice that Eq. (1), the Duffin-Kemmer equation, has no $m^{2}$ term in it. Within representation theory, the Duffin-Kemmer equation has a beautiful interpretation. However, since your knowledge of group theory is (unfortunately) poor, below I will only show you how to obtain the DK equation (1) from the following Klein-Gordon equation of a real (1-component) scalar field
$$\partial^{\mu}\partial_{\mu} \phi (x) + m^{2} \phi (x) = 0 . \qquad (3)$$
So we want to reduce (3) to a system of five (coupled) first-order equations.
To do this we define a scalar field by
$$\varphi (x) = \sqrt{m} \phi (x) ,$$
and a 4-vector field by
$$\psi_{\mu}(x) = \frac{1}{\sqrt{m}} \partial_{\mu}\phi (x) .$$
Thus, the KG equation (3) can be replaced by the following equivalent system of first-order equations
$$\partial^{\mu}\psi_{\mu} (x) + m \varphi (x) = 0 ,$$
$$\partial_{\mu} \varphi (x) - m \psi_{\mu}(x) = 0 .$$
These can easily be rewritten as the matrix equation
$$\begin{pmatrix} - m & - \partial_{0} & \partial_{1} & \partial_{2} & \partial_{3} \\ \partial_{0} & - m & 0 & 0 & 0 \\ \partial_{1} & 0 & - m & 0 & 0 \\ \partial_{2} & 0 & 0 & - m & 0 \\ \partial_{3} & 0 & 0 & 0 & - m \end{pmatrix} \begin{pmatrix} \varphi (x) \\ \psi_{0}(x) \\ \psi_{1}(x) \\ \psi_{2}(x) \\ \psi_{3}(x) \end{pmatrix} = 0 .$$
This is exactly the Duffin-Kemmer equation (1) with the $\Gamma$’s given by
$$\Gamma^{0} = \begin{pmatrix} 0 & i & 0 & 0 & 0 \\ - i & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} , \ \ \ \Gamma^{1} = \begin{pmatrix} 0 & 0 & - i & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ - i & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} ,$$
$$\Gamma^{2} = \begin{pmatrix} 0 & 0 & 0 & - i & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ - i & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} , \ \ \ \Gamma^{3} = \begin{pmatrix} 0 & 0 & 0 & 0 & - i \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ - i & 0 & 0 & 0 & 0 \end{pmatrix} ,$$
and the Duffin-Kemmer field $\Psi (x)$ is given by
$$\Psi (x) = \begin{pmatrix} \varphi (x) \\ \psi_{0}(x) \\ \psi_{1}(x) \\ \psi_{2}(x) \\ \psi_{3}(x) \end{pmatrix} .$$
Similarly, you can equivalently write the Proca equation
$$\partial_{\mu}F^{\mu\nu} + m^{2}A^{\nu} = 0 ,$$
which describes a free massive spin-1 field, as a Duffin-Kemmer equation with four $10 \times 10$ $\Gamma$'s. I leave this as an exercise for you.
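
The claim that the four $5 \times 5$ matrices listed in this post satisfy the Duffin-Kemmer algebra (2) can be checked by brute force. The following is a minimal sketch in pure Python. The helper names (`gamma`, `matmul`, `dk_defect`) are made up for this illustration, and the metric $\eta = \mathrm{diag}(+1,-1,-1,-1)$ is the signature assumed here, as implied by the signs in the matrix equation above.

```python
from itertools import product

N = 5
ETA = [1, -1, -1, -1]  # diagonal of the Minkowski metric, signature (+,-,-,-) assumed

def gamma(mu):
    """Gamma^mu as listed in the post: only entries (0, mu+1) and (mu+1, 0) are nonzero."""
    g = [[0j] * N for _ in range(N)]
    g[0][mu + 1] = 1j if mu == 0 else -1j
    g[mu + 1][0] = -1j
    return g

def matmul(a, b):
    """Plain product of two N x N matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(N)] for i in range(N)]

def scale(c, a):
    return [[c * a[i][j] for j in range(N)] for i in range(N)]

G = [gamma(mu) for mu in range(4)]

def dk_defect(mu, rho, nu):
    """Largest entry of LHS - RHS of the Duffin-Kemmer relation (2) for given indices."""
    lhs = add(matmul(matmul(G[mu], G[rho]), G[nu]),
              matmul(matmul(G[nu], G[rho]), G[mu]))
    eta_mu_rho = ETA[mu] if mu == rho else 0
    eta_nu_rho = ETA[nu] if nu == rho else 0
    rhs = add(scale(eta_mu_rho, G[nu]), scale(eta_nu_rho, G[mu]))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(N) for j in range(N))

# Check all 64 index combinations (mu, rho, nu).
max_defect = max(dk_defect(mu, rho, nu) for mu, rho, nu in product(range(4), repeat=3))
print(max_defect)
```

A vanishing `max_defect` means relation (2) holds for every choice of indices with these particular matrices. It says nothing about other representations of the algebra.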

hilbert2
Gold Member
How about, almost everything is wrong with that equation:
1) If, as you say, $M^{\mu}$ is not $\partial^{\mu}$ but a “generic matrix”, then the expression $M^{\mu}\partial_{\mu} + m^{2}$ is meaningless because the two terms have different physical units. In natural units, the dimension of the first term is $\mbox{cm}^{-1}$ while the dimension of $m^{2}$ is $\mbox{cm}^{-2}$.

I was thinking that the ##M^\mu## is just a four-vector where the components are simple numbers with freely chosen dimensions. It can be called a "matrix" with only one row or column in it. I guess you're assuming here that the components of ##M^\mu## can be matrices that act in the space of some indices other than the Minkowski ones.

samalkhaiat
I was thinking that the ##M^\mu## is just a four-vector where the components are simple numbers with freely chosen dimensions.
This thread is about relativistic field equations. So, in relativistic field theories, the expression $M^{\mu}\partial_{\mu} + m^{2}$ is a meaningful differential operator if and only if $M^{\mu} = \partial^{\mu}$.
I guess you're assuming here that the components of ##M^\mu## can be matrices that act in the space of some indices other than the Minkowski ones.
No, I made no such assumption because it is not true in general. The indices on the Duffin-Kemmer field are spacetime indices acted upon by the matrices $\Gamma^{\mu}$. So, the lessons from #8 are: (1) first-order relativistic field equations have no $m^{2}$ term in them, and (2) all relativistic multi-component fields satisfy the Klein-Gordon equation (which has an $m^{2}$ term) component by component.
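
Lesson (2) can be checked directly for the five-component system of #8. This is a sketch that assumes $m \neq 0$ and uses only the two first-order equations $\partial^{\mu}\psi_{\mu} + m\varphi = 0$ and $\partial_{\mu}\varphi - m\psi_{\mu} = 0$ written there. Eliminating $\psi_{\mu}$ gives
$$\psi_{\mu} = \frac{1}{m}\partial_{\mu}\varphi \quad\Rightarrow\quad \partial^{\mu}\psi_{\mu} + m\varphi = \frac{1}{m}\left( \partial^{\mu}\partial_{\mu} + m^{2} \right)\varphi = 0 ,$$
and, since partial derivatives commute,
$$\left( \partial^{\mu}\partial_{\mu} + m^{2} \right)\psi_{\nu} = \frac{1}{m}\,\partial_{\nu}\left( \partial^{\mu}\partial_{\mu} + m^{2} \right)\varphi = 0 .$$
So each of the five components satisfies the Klein-Gordon equation, even though the first-order system itself contains only $m$ and not $m^{2}$.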

hilbert2