# Dirac's solution to the Klein-Gordon equation

Dirac wanted to fix the problems with the Klein-Gordon equation by seeking a new solution to it.

He wanted a relativistic solution, so it makes sense that the solution needed to satisfy Einstein's energy-momentum relation.

But why did it need to be of first order in the time and spatial derivatives?

Is that a requirement for it to be invariant under Lorentz transformations?

(Don't be too hard on me, I'm just getting started with this stuff :) )

vanhees71
Gold Member
The original argument by Dirac was flawed. His heuristics, however, led to a most important step on the way to our modern understanding, namely the Dirac equation for spin-1/2 particles. What Dirac first missed was the fact that there's no consistent one-particle theory for interacting relativistic particles (or even for the simpler case of relativistic particles in an external field). Nowadays the reason for this is obvious, because we have a lot of experience with relativistic particles from high-energy particle accelerators: at relativistic energies, in interactions among particles or of particles moving in an external field, there's always some probability that new particles get created, or that the original particles get destroyed and other particles or radiation are produced. That's why a single-particle interpretation of the Dirac equation (and of any other relativistic wave equation) fails.

Dirac's true genius concerning the discovery of his wave equation thus is that he realized precisely this: you need a many-particle picture to interpret the equation correctly. Then he had another ingenious idea, namely the "Dirac sea". The idea was to get rid of the "negative-energy states" by assuming that these states are all occupied by electrons. In interactions at high enough energies you can kick electrons out of this sea, and since we interpret the completely filled sea (with no other electrons present) as "the vacuum", the holes in the sea appear as positively charged particles with all other properties (mass and spin) the same as for electrons. That was the prediction of the existence of anti-electrons, or positrons. Dirac could even work out quantum electrodynamics in terms of this hole theory. This formulation of QED is, however, pretty complicated, and it's clear that the "Dirac sea" is another heuristic, because we don't see an infinite negative charge anywhere.

That's why nowadays we don't teach the old-fashioned hole theory anymore but formulate relativistic QT right away as relativistic QFT, which is the natural formulation for the situation that particles get created and destroyed in interactions.

stevendaryl
Staff Emeritus
The Klein-Gordon equation has a big problem if it is to be interpreted as the relativistic generalization of the Schrödinger equation. For the Schrödinger equation, you can form a "conserved current" (in units with ##\hbar = 1##):

##\rho = \psi^* \psi##
##\overrightarrow{J} = \frac{-i}{2m} (\psi^* (\nabla \psi) - (\nabla \psi^*) \psi)##

##\frac{\partial \rho}{\partial t} + \nabla \cdot \overrightarrow{J} = 0##

This allows ##\rho## to be interpreted as a probability density. If you try to come up with an analogous conserved current for the massive Klein-Gordon equation, you can find one:

##\rho = \frac{-i}{2m} (\dot{\phi^*} \phi - \phi^* \dot{\phi})##
##\overrightarrow{J} = \frac{-i}{2m} (\phi^* (\nabla \phi) - (\nabla \phi^*) \phi)##

This has the same nonrelativistic limit as the Schrödinger current, if you make the identification ##\phi = e^{-i m t} \psi##. However, the ##\rho## for Klein-Gordon is not guaranteed to be positive, so you can't interpret it as a probability density. It can be reinterpreted as a charge density, where a negative sign indicates the presence of oppositely charged anti-particles, but not as a probability. So the Klein-Gordon equation just can't be interpreted as giving probabilities for particle positions, the way that the Schrödinger equation can.
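To make the sign problem concrete, here is a minimal sketch (not part of the original post) that evaluates the Klein-Gordon density for plane waves ##\phi = e^{-iEt + ipx}## with ##E = \pm\sqrt{p^2 + m^2}##, in units with ##\hbar = c = 1##. The function name `kg_density` and the sample values of ##m## and ##p## are assumptions chosen only for illustration.

```python
import cmath
import math

def kg_density(E, p, m, t=0.0, x=0.0):
    """Klein-Gordon density rho = (-i/2m) (conj(phi_dot) phi - conj(phi) phi_dot)
    for the plane wave phi = exp(-i E t + i p x), with hbar = c = 1."""
    phi = cmath.exp(-1j * E * t + 1j * p * x)
    phi_dot = -1j * E * phi  # exact time derivative of the plane wave
    rho = (-1j / (2 * m)) * (phi_dot.conjugate() * phi - phi.conjugate() * phi_dot)
    return rho.real  # the imaginary part vanishes identically

# Illustrative values (assumed, not from the thread)
m, p = 1.0, 0.75
E = math.sqrt(p**2 + m**2)

rho_plus = kg_density(+E, p, m)   # positive-energy solution: rho = +E/m
rho_minus = kg_density(-E, p, m)  # negative-energy solution: rho = -E/m
print(rho_plus, rho_minus)        # first value positive, second negative
```

For a plane wave the density reduces to ##\rho = E/m##, so the negative-energy solutions give a negative ##\rho## everywhere, which is exactly what rules out a probability interpretation.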

In contrast, there is a conserved current for the Dirac equation such that the corresponding ##\rho## is never negative:

##\rho = \psi^\dagger \psi##
##\overrightarrow{J} = \psi^\dagger \overrightarrow{\alpha} \psi##

(where ##\overrightarrow{\alpha}## is a vector made from the Dirac matrices)

So ##\psi^\dagger \psi## can be interpreted as a probability density, and nothing comparable exists for the Klein-Gordon equation.
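As a small numerical illustration (a sketch, not something from the original post), the density ##\psi^\dagger \psi## and the current ##\psi^\dagger \overrightarrow{\alpha} \psi## can be evaluated for an arbitrary four-component spinor. The Dirac representation ##\alpha_k = \begin{pmatrix} 0 & \sigma_k \\ \sigma_k & 0 \end{pmatrix}## is assumed here, and the helper names (`alpha`, `sandwich`, `dirac_density`, `dirac_current`) and the sample spinor are hypothetical.

```python
import math

# Pauli matrices sigma_x, sigma_y, sigma_z
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def alpha(k):
    """Dirac alpha matrix [[0, sigma_k], [sigma_k, 0]] (Dirac representation, assumed)."""
    s = SIGMA[k]
    return [
        [0, 0, s[0][0], s[0][1]],
        [0, 0, s[1][0], s[1][1]],
        [s[0][0], s[0][1], 0, 0],
        [s[1][0], s[1][1], 0, 0],
    ]

def sandwich(psi, mat):
    """Return psi^dagger M psi for a 4-component spinor psi."""
    return sum(complex(psi[a]).conjugate() * mat[a][b] * psi[b]
               for a in range(4) for b in range(4))

def dirac_density(psi):
    """rho = psi^dagger psi, a sum of squared moduli and hence never negative."""
    return sum(abs(c) ** 2 for c in psi)

def dirac_current(psi):
    """J_k = psi^dagger alpha_k psi; real because each alpha_k is Hermitian."""
    return [sandwich(psi, alpha(k)).real for k in range(3)]

# Arbitrary sample spinor (illustrative values only)
psi = [1 + 2j, -0.5j, 0.3, -1 + 1j]
rho = dirac_density(psi)
J = dirac_current(psi)
print(rho, J, math.sqrt(sum(j * j for j in J)) <= rho)
```

Because ##\rho## is a sum of squared moduli it cannot be negative for any spinor, and since each ##\alpha_k## has eigenvalues ##\pm 1##, the magnitude of the current never exceeds ##\rho## (in units with ##c = 1##).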

But why did it need to be of first order in time- and spatial-derivatives?

Is that a requirement for it to be invariant under Lorentz transformations?
The equation does not need to be of the first order for the following reason.

The Dirac equation is a system of four first-order equations for four components of the Dirac spinor. However, one can algebraically eliminate two of the components from the Dirac equation in the chiral representation of the gamma-matrices and obtain an equivalent system of two second-order equations for two components (O. Laporte, G. E. Uhlenbeck, Phys. Rev., vol. 37, p. 1380 (1931); R. P. Feynman, M. Gell-Mann, Phys. Rev., vol. 109, p. 193 (1958)). Moreover, in a general case, another component can be algebraically eliminated, yielding an equivalent fourth-order equation for just one component (please see references in https://www.physicsforums.com/threa...-and-particles-with-spin.563974/#post-3690162).

So Dirac derived an epically successful equation based on a wrong assumption. I guess the end sanctifies the means in this case.
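For the free equation, the elimination of two components can be sketched as follows. This is only an illustration under assumptions not stated in the thread: one common chiral (Weyl) convention with ##\psi = (\psi_L, \psi_R)##, ##\sigma^\mu = (1, \overrightarrow{\sigma})##, ##\bar{\sigma}^\mu = (1, -\overrightarrow{\sigma})##, units with ##\hbar = c = 1##, and no external field.

```latex
\begin{align}
i\,\sigma^\mu \partial_\mu \psi_R &= m\,\psi_L, \qquad
i\,\bar{\sigma}^\mu \partial_\mu \psi_L = m\,\psi_R
  && \text{(Dirac equation in two-component form)} \\
\psi_R &= \frac{i}{m}\,\bar{\sigma}^\nu \partial_\nu \psi_L
  && \text{(algebraic elimination of } \psi_R\text{)} \\
-\,\sigma^\mu \bar{\sigma}^\nu \partial_\mu \partial_\nu \psi_L &= m^2\,\psi_L \\
\sigma^\mu \bar{\sigma}^\nu \partial_\mu \partial_\nu
  &= (\partial_t + \vec{\sigma}\cdot\nabla)(\partial_t - \vec{\sigma}\cdot\nabla)
   = \partial_t^2 - \nabla^2 \\
(\partial_t^2 - \nabla^2 + m^2)\,\psi_L &= 0
  && \text{(second-order equation for two components)}
\end{align}
```

In an external electromagnetic field the derivatives become covariant derivatives that no longer commute, and the second-order equation picks up an additional term coupling the spin to the field strength; that case is the one treated in the references cited above.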

stevendaryl
Staff Emeritus
The issue that I pointed out is that the quantity ##\psi^\dagger \psi## in the Dirac equation is a nonnegative quantity whose integral is conserved. So it's a candidate for a probability density. In the case of the Klein-Gordon equation, the conserved current is not always positive, so it can't be interpreted as a probability density.

I think that this is related to its being first-order. If you have a first-order equation of the form ##\dot{\psi} = - i H \psi##, then you can form a nonnegative density by taking ##\rho = \psi^\dagger \psi##. Then you have ##\frac{\partial \rho}{\partial t} = -i [(\psi^\dagger H \psi) - (\psi^\dagger H \psi)^*]##. To have a conserved current, you have to be able to write ##-i [(\psi^\dagger H \psi) - (\psi^\dagger H \psi)^*]## as ##-\nabla \cdot \overrightarrow{J}## for some quantity ##\overrightarrow{J}##. You can do that with Schrödinger and with Dirac, but not with Klein-Gordon.
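For the Dirac case this step can be written out explicitly. The following is a sketch assuming the free Dirac Hamiltonian ##H = -i\,\overrightarrow{\alpha} \cdot \nabla + \beta m## with Hermitian ##\overrightarrow{\alpha}## and ##\beta## and units ##\hbar = c = 1##; this form of ##H## is not spelled out in the post.

```latex
\begin{align}
\psi^\dagger H \psi
  &= -i\,\psi^\dagger\,\vec{\alpha}\cdot\nabla\psi + m\,\psi^\dagger \beta\,\psi \\
(\psi^\dagger H \psi)^*
  &= +i\,(\nabla\psi^\dagger)\cdot\vec{\alpha}\,\psi + m\,\psi^\dagger \beta\,\psi \\
\psi^\dagger H \psi - (\psi^\dagger H \psi)^*
  &= -i\left[\psi^\dagger\,\vec{\alpha}\cdot\nabla\psi
     + (\nabla\psi^\dagger)\cdot\vec{\alpha}\,\psi\right]
   = -i\,\nabla\cdot\left(\psi^\dagger \vec{\alpha}\,\psi\right) \\
\frac{\partial \rho}{\partial t}
  &= -i\left[-i\,\nabla\cdot\left(\psi^\dagger \vec{\alpha}\,\psi\right)\right]
   = -\nabla\cdot\left(\psi^\dagger \vec{\alpha}\,\psi\right)
\end{align}
```

The mass term drops out because ##\psi^\dagger \beta \psi## is real, and what remains is a total divergence, which identifies ##\overrightarrow{J} = \psi^\dagger \overrightarrow{\alpha} \psi##.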

weirdoguy
The system of second-order equations and the fourth-order equation that I mentioned are equivalent to the Dirac equation and have the same current with the same positive-definite zeroth component, so one can have this property without first-order equations.