Why does the Dirac equation lead to spin 1/2?

Vampyr

Summary
Why does the derivation of the Dirac equation naturally lead to spin ½ particles? The equation is derived from very general starting assumptions, so which of these assumptions has to be wrong to give us a spin-0 or spin-1 particle?

I have tried to search for an answer and got as far as this quote from Dirac himself from the Principles of Quantum Mechanics:
"We are led to the value $\frac{1}{2}\hbar$ for the spin of the electron by an argument depending simply on general principles of quantum theory and relativity. One could apply the same argument to other kinds of elementary particles and one would be led to the same conclusion, that the spin angular momentum is a half quantum. This would be satisfactory for the proton and the neutron, but there are some kinds of elementary particle (e.g. the photon and certain kinds of meson) whose spins are known experimentally to be different from $\frac{1}{2}\hbar$, so we have a discrepancy between our theory and experiment. The answer is to be found in a hidden assumption in our work. Our argument is valid only provided the position of the particle is an observable. If this assumption holds, the particle must have a spin angular momentum of half a quantum. For those particles that have a different spin the assumption must be false and any dynamical variables $x_1, x_2, x_3$ that may be introduced to describe the position of the particle cannot be observables in accordance with our general theory. For such particles there is no true Schrödinger representation. One might be able to introduce a quasi wave function involving the dynamical variables $x_1, x_2, x_3$, but it would not have the correct physical interpretation of a wave function - that the square of its modulus gives the probability density. For such particles there is still a momentum representation, which is sufficient for practical purposes."

I don’t understand what this means though. The Schrodinger equation produces wave functions where the position is an observable, yet it is only for spin-0 particles. What is Dirac trying to say here? Seems like it is quite a deep and fundamental point about the difference between spin ½ and other spin particles.


George Jones

Staff Emeritus
Science Advisor
Gold Member
Summary: Why does the derivation of the Dirac equation naturally lead to spin ½ particles?
I think that it is the other way around: $m>0$ and $s=1/2$ lead to the Dirac equation!

This viewpoint was solidified by Wigner (1939), Bargmann and Wigner (1948), and others.

stevendaryl

Staff Emeritus
Science Advisor
I don’t understand what this means though. The Schrodinger equation produces wave functions where the position is an observable, yet it is only for spin-0 particles. What is Dirac trying to say here? Seems like it is quite a deep and fundamental point about the difference between spin ½ and other spin particles.
I don't really understand the argument, but the Schrodinger equation is nonrelativistic. If you try to come up with a relativistic wave equation that reduces to the Schrodinger equation in the nonrelativistic limit, then the spin-zero case corresponds to the Klein-Gordon equation. But the KG equation cannot be interpreted as an equation for a wave function, because there is nothing corresponding to a probability density for solutions of the KG equation. The quantity that corresponds to $|\psi|^2$ in the nonrelativistic limit is

$i ( \psi^* \frac{\partial \psi}{\partial t} - \psi \frac{\partial \psi^*}{\partial t})$

In the nonrelativistic limit, this becomes $2m |\psi|^2$. But unlike the nonrelativistic case, that quantity is not always positive, so it can't be interpreted as a probability density.

In contrast, the quantity $\psi^\dagger \psi$ for the Dirac equation is always positive, and so can be interpreted as a probability distribution.
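The sign problem can be seen numerically. The following is only a minimal sketch, assuming units with $\hbar = c = 1$, evaluating plane waves at $x = 0$, and using arbitrary illustrative values for `m` and `k`; the helper name `kg_density` is made up for this example. It evaluates $i ( \psi^* \partial_t \psi - \psi \partial_t \psi^*)$ for a positive-frequency and a negative-frequency plane wave.

```python
import cmath
import math

def kg_density(psi, t, dt=1e-6):
    """Klein-Gordon 'density' i(psi* d_t psi - psi d_t psi*) at time t,
    with the time derivative taken by a central finite difference."""
    dpsi = (psi(t + dt) - psi(t - dt)) / (2 * dt)
    value = 1j * (psi(t).conjugate() * dpsi - psi(t) * dpsi.conjugate())
    return value.real  # the combination is real by construction

m, k = 1.0, 0.5                 # illustrative mass and wavenumber (assumed values)
E = math.sqrt(k * k + m * m)    # relativistic dispersion relation E^2 = k^2 + m^2

def positive(t):
    return cmath.exp(-1j * E * t)   # positive-frequency solution at x = 0

def negative(t):
    return cmath.exp(+1j * E * t)   # negative-frequency solution at x = 0

rho_pos = kg_density(positive, 0.3)
rho_neg = kg_density(negative, 0.3)
print(rho_pos, rho_neg)   # approximately +2E and -2E
```

Both waves have $|\psi|^2 = 1$, yet the Klein-Gordon density comes out with opposite signs, which is why it cannot serve as a probability density.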

dextercioby

Science Advisor
Homework Helper
Summary: Why does the derivation of the Dirac equation naturally lead to spin ½ particles? [...]
Because of the fundamental set of assumptions made by Dirac in 1928:

- The equation is linear (first order) in both the three space derivatives and the time derivative.
- Upon "squaring" the equation, the Klein-Gordon equation (which is basically the Einstein mass-energy-momentum formula upon Schrödinger quantization) is obtained.

These assumptions lead to a 4-tuple of wavefunctions and to the Dirac Clifford algebra, which leads to the Dirac 4x4 matrices, which in turn lead to spin 1/2.
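The chain "Clifford algebra, then 4x4 matrices, then spin 1/2" can be checked directly. This is a minimal sketch under stated assumptions: it uses the Dirac representation of the gamma matrices and takes the rotation generators on spinors to be $S_i = \frac{i}{2} \gamma^j \gamma^k$ for cyclic $(i, j, k)$. Neither choice is spelled out in the post above, and the helper names are made up for this example.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices
I4 = block(I2, Z2, Z2, I2)

# Dirac representation (assumed): gamma^0 = diag(I, -I), gamma^i = [[0, s_i], [-s_i, 0]]
gamma = [block(I2, Z2, Z2, scale(-1, I2))]
gamma += [block(Z2, s, scale(-1, s), Z2) for s in sigma]
eta = [1, -1, -1, -1]  # Minkowski metric, signature (+, -, -, -)

# Clifford algebra: {gamma^mu, gamma^nu} = 2 eta^{mu nu} * identity
clifford_ok = all(
    close(add(matmul(gamma[mu], gamma[nu]), matmul(gamma[nu], gamma[mu])),
          scale(2 * eta[mu] if mu == nu else 0, I4))
    for mu in range(4) for nu in range(4))

# Spin operators S_x, S_y, S_z built from products of spatial gamma matrices
S = [scale(0.5j, matmul(gamma[j], gamma[k])) for j, k in [(2, 3), (3, 1), (1, 2)]]
S_squared = add(add(matmul(S[0], S[0]), matmul(S[1], S[1])), matmul(S[2], S[2]))

# s(s + 1) = 3/4 corresponds to s = 1/2
spin_half_ok = close(S_squared, scale(0.75, I4))
print(clifford_ok, spin_half_ok)
```

If both checks pass, $S^2 = \frac{3}{4}$ on every component, i.e. $s(s+1)$ with $s = 1/2$, so the matrices that solve the Clifford algebra carry spin 1/2 automatically.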

Vampyr

The quantity that corresponds to $|\psi|^2$ in the nonrelativistic limit is

$i ( \psi^* \frac{\partial \psi}{\partial t} - \psi \frac{\partial \psi^*}{\partial t})$

in the nonrelativistic limit, this becomes $2m |\psi|^2$. But unlike the nonrelativistic case, that quantity is not always positive, so it can't be interpreted as a probability density.
Not sure if there is a typo in your sentence or not, but are you saying that $|\psi|^2$ only applies in the nonrelativistic case, and that in relativistic cases the more complicated version needs to be used to get the probability density? I've not seen that more complicated version before and am only familiar with $|\psi|^2$ as the probability density.

The Schrodinger equation lets us define a $|\psi|^2$ probability density, but I'll accept we can't use this in general because it is not relativistic. The Klein-Gordon equation for spin-0 doesn't have a well-defined probability density, so it fits Dirac's argument about position not being an observable. What about spin-1 particles and the Proca equation? Or, even more generally, other spins in both the massive and massless cases? Is it only the spin-1/2 Dirac equation where the particle's position is an observable? I don't understand why, starting out from general assumptions, only the spin-1/2 solution falls out of Dirac's equation.

Because of the fundamental set of assumptions made by Dirac in 1928:

- The equation is linear (first order) in both the three space derivatives and the time derivative.
I guess what I am getting at is why don't the photon, gluon (spin 1 massless), graviton (spin 2 massless) or W/Z bosons (spin 1 massive) fall out of Dirac's equation? Don't they all follow the same "fundamental set of assumptions" as spin 1/2 massive particles? Apparently not - there must be something in Dirac's assumptions that rules out all of these. From his own quote in my original post it sounds like the vital assumption is that position has to be an observable. It's not obvious to me why that only applies to spin 1/2 massive particles.

stevendaryl

Staff Emeritus
Science Advisor
Not sure if there is a typo in your sentence or not, but are you saying that $|\psi|^2$ only applies in the nonrelativistic case, and that in relativistic cases the more complicated version needs to be used to get the probability density? I've not seen that more complicated version before and am only familiar with $|\psi|^2$ as the probability density.
For solutions of the Klein-Gordon equation, $|\psi|^2$ can't be interpreted as a probability, because it doesn't always integrate to 1. It isn't conserved.

Nonrelativistically, we can reason as follows: Let $P(t) = \int \psi^* \psi dx$. (Just consider one spatial dimension for simplicity). That should be the total probability, and should add up to 1. We can show that it's a constant (and so we can make it equal to 1 by normalizing our wave function appropriately) by taking a time derivative:

$\frac{dP}{dt} = \int (\dfrac{d \psi^*}{dt} \psi + \psi^* \dfrac{d\psi}{dt}) dx$

Since $\psi$ obeys the Schrodinger equation, we have:

$\dfrac{d\psi}{dt} = -i (\frac{-1}{2m} \nabla^2 + V(x)) \psi$

$\dfrac{d\psi^*}{dt} = +i (\frac{-1}{2m} \nabla^2 + V(x)) \psi^*$

So $\dfrac{d}{dt} \psi^* \psi = -i (\psi^* (\frac{-1}{2m} \nabla^2 + V(x)) \psi - \psi (\frac{-1}{2m} \nabla^2 + V(x)) \psi^*)$

The terms involving $V$ cancel, leaving:

$\dfrac{d}{dt} \psi^* \psi = \frac{i}{2m} (\psi^* \nabla^2 \psi - \psi \nabla^2 \psi^*)$
$= \frac{i}{2m} \nabla \cdot (\psi^* \nabla \psi - \psi \nabla \psi^*)$

When you integrate something of the form $\nabla \cdot \overrightarrow{J}$ over all space, you get 0 if $\overrightarrow{J}$ goes to zero at infinity. (In our case, $\overrightarrow{J} = \psi^* \nabla \psi - \psi \nabla \psi^*$, up to a constant factor.) So we have:

$\int \dfrac{d}{dt} \psi^* \psi dx = 0$

So $P(t)$ is a constant, independent of $t$, and so can be interpreted as a probability. With the Klein-Gordon equation, however, the corresponding quantity is not a constant of the motion: in general, $\frac{dP}{dt} \neq 0$. For the Klein-Gordon equation, you get a different conserved quantity:

$Q(t) = i \int (\psi^* \frac{\partial \psi}{\partial t} - \psi \frac{\partial \psi^*}{\partial t}) dx$

Since this can be positive or negative, it can be interpreted as a total charge, but not as a probability.
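To make the difference concrete, here is a minimal sketch, assuming $\hbar = c = 1$ and a superposition of a positive-frequency and a negative-frequency plane wave with the same wavenumber. The amplitudes `a` and `b` and the values of `m` and `k` are arbitrary choices for illustration. Because both terms share the factor $e^{ikx}$, the densities are uniform in $x$, so comparing densities at different times is the same as comparing the integrals over a periodic box.

```python
import cmath
import math

m, k = 1.0, 0.5                 # illustrative mass and wavenumber (assumed values)
E = math.sqrt(k * k + m * m)    # E^2 = k^2 + m^2
a, b = 0.8, 0.6                 # amplitudes of the two frequency components (assumed)

def psi(t):
    """Klein-Gordon solution at x = 0: positive- plus negative-frequency part."""
    return a * cmath.exp(-1j * E * t) + b * cmath.exp(+1j * E * t)

def dpsi_dt(t):
    """Exact time derivative of psi."""
    return -1j * E * a * cmath.exp(-1j * E * t) + 1j * E * b * cmath.exp(+1j * E * t)

def norm_density(t):
    """The would-be probability density |psi|^2."""
    return abs(psi(t)) ** 2

def charge_density(t):
    """The Klein-Gordon density i(psi* d_t psi - psi d_t psi*)."""
    value = 1j * (psi(t).conjugate() * dpsi_dt(t) - psi(t) * dpsi_dt(t).conjugate())
    return value.real

times = [0.0, 0.5, 1.0, 1.5]
norms = [norm_density(t) for t in times]
charges = [charge_density(t) for t in times]

for t, n, q in zip(times, norms, charges):
    print(f"t={t:.1f}  |psi|^2={n:.4f}  KG density={q:.4f}")
```

In this sketch $|\psi|^2$ oscillates in time while the Klein-Gordon density stays fixed at $2E(|a|^2 - |b|^2)$, which becomes negative as soon as $|b| > |a|$.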

The Schrodinger equation lets us define a $|\psi|^2$ probability density, but I'll accept we can't use this in general because it is not relativistic. The Klein-Gordon equation for spin-0 doesn't have a well-defined probability density, so it fits Dirac's argument about position not being an observable. What about spin-1 particles and the Proca equation? Or, even more generally, other spins in both the massive and massless cases? Is it only the spin-1/2 Dirac equation where the particle's position is an observable? I don't understand why, starting out from general assumptions, only the spin-1/2 solution falls out of Dirac's equation.
As I said, I don't really know Dirac's argument. I was just speaking as to why the interpretation of $\psi^* \psi$ as a probability density doesn't work relativistically for a spin-zero particle.

vanhees71

Science Advisor
Gold Member
It depends on how you "derive" the Dirac equation. Dirac's "derivation" is pure magic and deep mathematical intuition. His idea was to solve the problem with the negative-energy states of the Klein-Gordon equation by finding a first-order-in-time differential equation like the non-relativistic Schrödinger equation. Due to the space-time structure of special relativity, this suggests that the equation should also be linear in the spatial derivatives. Then you need an operator of the form $\gamma^{\mu} \partial_{\mu}$ to be covariant and linear in the space-time derivatives. Then you'd also like to fulfill the on-shell condition for free particles, i.e., $\Box \psi=-m^2 \psi$. This leads to the conclusion that the $\gamma^{\mu}$ coefficients should fulfill the Clifford-algebra properties of Minkowski spacetime, i.e., $\{\gamma^{\mu},\gamma^{\nu} \}=2 \eta^{\mu \nu}$. Then Dirac realized that this can only be fulfilled by matrices of (at least) $\mathbb{C}^{4 \times 4}$ type.

Analysing the resulting free-particle equation $(\mathrm{i} \gamma^{\mu} \partial_{\mu} -m)\psi=0$ further, he looked for the plane-wave solutions ("momentum eigenstates"). This led him to the conclusion that the equation describes spin-1/2 particles, with one pair of solutions at positive and one pair at negative frequencies. Thus the solutions with positive frequencies can immediately be identified with particles of positive energy. As long as only free particles are concerned, you can just say that the negative-frequency solutions are unphysical.
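The two-plus-two structure of the plane-wave solutions can be verified for a sample momentum. This is a minimal sketch under stated assumptions: it uses the Dirac representation, writes the equation in Hamiltonian form $H = \vec{\alpha} \cdot \vec{p} + \beta m$ (with $\hbar = c = 1$), and picks arbitrary values for `m` and `p`. The spinor formulas in `positive_spinor` and `negative_spinor` are the standard textbook ones, not something taken from the post.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices

m = 1.0                      # illustrative mass (assumed value)
p = [0.3, -0.4, 1.2]         # illustrative momentum (assumed value)
E = math.sqrt(sum(c * c for c in p) + m * m)

# 2x2 matrix sigma . p
sp = [[sum(p[i] * sigma[i][r][c] for i in range(3)) for c in range(2)]
      for r in range(2)]

# H = alpha . p + beta m in the Dirac representation: [[m, sigma.p], [sigma.p, -m]]
H = [[m, 0, sp[0][0], sp[0][1]],
     [0, m, sp[1][0], sp[1][1]],
     [sp[0][0], sp[0][1], -m, 0],
     [sp[1][0], sp[1][1], 0, -m]]

# H^2 = E^2 * identity forces every eigenvalue to be +E or -E;
# trace(H) = 0 then forces two of each.
H2 = matmul(H, H)
h_squared_ok = all(abs(H2[i][j] - (E * E if i == j else 0)) < 1e-12
                   for i in range(4) for j in range(4))
trace_H = sum(H[i][i] for i in range(4))

def positive_spinor(phi):
    """u = (phi, sigma.p phi / (E + m)), expected to satisfy H u = +E u."""
    return phi + [x / (E + m) for x in matvec(sp, phi)]

def negative_spinor(chi):
    """v = (-sigma.p chi / (E + m), chi), expected to satisfy H v = -E v."""
    return [-x / (E + m) for x in matvec(sp, chi)] + chi

def residual(spinor, energy):
    """Largest component of H*spinor - energy*spinor."""
    return max(abs(x - energy * y) for x, y in zip(matvec(H, spinor), spinor))

residuals = [residual(positive_spinor([1, 0]), E),
             residual(positive_spinor([0, 1]), E),
             residual(negative_spinor([1, 0]), -E),
             residual(negative_spinor([0, 1]), -E)]
print(h_squared_ok, abs(trace_H), max(residuals))
```

The two independent choices of the 2-component spinor at each sign of the energy are the two spin states, which is the "pair" structure described above.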

However, when building the full theory including interactions (in Dirac's time, of course, electromagnetic interactions), you run into trouble. You can't simply neglect the negative-frequency solutions. After some erroneous ideas, Dirac finally came to the idea to fill up all negative-frequency states and define this state as the "vacuum" (the so-called Dirac-sea picture). Then only holes in the Dirac sea are of physical significance, and those can be interpreted as positively charged electrons running in the opposite direction but with positive frequency. That's how the necessity of the existence of anti-matter was predicted. Although it is an awkward formulation of what's today known as QED, it's still equivalent to the modern QFT formulation of the same theory with all fields quantized, giving a quite consistent picture of relativistic QT as a many-body theory where particles and antiparticles can be created and destroyed in reactions, which is nowadays just what's frequently done in high-energy particle accelerators.

A more systematic derivation of the relativistic types of particles in terms of local relativistic QFT is to analyze the unitary representations of the proper orthochronous Poincaré group, which systematically leads to the notion of massive and massless field quanta, spin (or rather angular momentum), the spin-statistics theorem, and the CPT theorem. For a comprehensive treatment, see

S. Weinberg, Quantum Theory of Fields, Vol 1, Cambridge University Press.

bhobba

Mentor
It depends on how you "derive" the Dirac equation. Dirac's "derivation" is pure magic and deep mathematical intuition.
Other physicists were in awe of his mathematical intuition, except Pauli, who was in awe of nobody, not even Einstein. Although I seem to recall Heisenberg thought his hole theory 'trash' - but not the equation itself. Landau went to hear a lecture on it (holes) and reported back to his colleagues - useless. His 'discovery' of the equation left others shaking their heads - this guy is just too clever by half. Einstein revered him so much he carried a copy of his Principles Of Quantum Mechanics everywhere he went - it was his bible on QM. The equation is sheer brilliance - the hole interpretation not so much - although everyone agreed it was still the work of a master. Interestingly, as vanhees71 points out, when looked at correctly it was still basically right - maybe Landau, Heisenberg and Pauli should have been a bit less critical.

A good book for these little gems about the great man is:

Thanks
Bill


samalkhaiat

Science Advisor
I guess what I am getting at is why don't the photon, gluon (spin 1 massless), graviton (spin 2 massless) or W/Z bosons (spin 1 massive) fall out of Dirac's equation?
They do. There are two (related) issues you need to be aware of:
1) All fields (of arbitrary spin) can be made to satisfy Dirac-like equations. To see that happening for massive spin = 0, 1, and 3/2 fields, see
https://www.physicsforums.com/threads/lorentz-invariance-and-equation-of-motion-for-a-scalar-field.957092/post-6071636

https://www.physicsforums.com/threads/how-to-construct-a-spin-3-2-theory-from-the-ground-up.941219/post-5954706
2) Dirac’s equation is fundamental in one, and only one, aspect and that is the prediction of anti-particles and, therefore, the birth of relativistic QFT. Spin-1/2 is not a consequence of the Dirac equation, i.e., it is not a relativistic effect. The concept of spin appears (as pointed out by dextercioby) naturally in the linearization method of Dirac. This is because linearizing the (relevant) dispersion relation is closely related to finding the double-valued representation of the (corresponding) kinematical symmetry group. Indeed, had Schrodinger linearized his operator, $E - \frac{1}{2m}P^{2} \equiv i\partial_{t} + \frac{1}{2m} \nabla^{2}$, bi-spinors would have carried his name (instead of Dirac's), and his linearized equation would have predicted the correct value for the electron magnetic moment. Following Dirac (as Lévy-Leblond did in his PhD thesis), one requires the wave equation to be of first order in all space and time derivatives ($P = - i \nabla \ , E = i \partial_{t}$). Thus, for the most general linear wave equation, one writes
$$\Pi (E , P)\Psi (x) \equiv \frac{1}{\sqrt{2m}} \left( A E + \Gamma_{i}P_{i} + C \right) \Psi (x) = 0 , \ \ \ (1)$$ where $A, \Gamma_{i}, C$ are linear operators to be determined together with the dimension of the vector space they act on. For the solutions of (1) to satisfy the free Schrodinger equation $$\left( i\partial_{t} + \frac{1}{2m} \nabla^{2}\right) \Psi (x) = 0 ,$$ there must exist some operator $\bar{\Pi}(E , P) \equiv \frac{1}{\sqrt{2m}} \left( \bar{A}E + \bar{\Gamma}_{i}P_{i} + \bar{C} \right)$ such that $$\bar{\Pi} \ \Pi \Psi (x) = \left( E - \frac{1}{2m}P^{2}\right) \Psi (x) .$$ By expanding this and identifying the various monomials in $E$ and $P_{i}$, we obtain the following set of conditions
$$\bar{A} A = 0, \ \ \bar{A} C + \bar{C} A = 2m, \ \ \bar{C} C = 0 ,$$$$\bar{A} \Gamma_{i} + \bar{\Gamma}_{i} A = 0 , \ \ \bar{C} \Gamma_{i} + \bar{\Gamma}_{i} C = 0 ,$$$$\bar{\Gamma}_{i}\Gamma_{j} + \bar{\Gamma}_{j}\Gamma_{i} = - 2 \delta_{ij} , \ \ i,j = 1,2,3 .$$
At first these conditions look very messy, however the magic appears if we define the following operators $$\Gamma_{4} = i \left( A + \frac{1}{2m} C \right) , \ \ \ \Gamma_{5} = \left( A - \frac{1}{2m} C \right) , \ \ \ (2)$$$$\bar{\Gamma}_{4} = i \left( \bar{A} + \frac{1}{2m} \bar{C} \right) , \ \ \ \bar{\Gamma}_{5} = \left( \bar{A} - \frac{1}{2m} \bar{C} \right) .$$
Using these, the above conditions can be rewritten as $$\bar{\Gamma}_{a} \Gamma_{b} + \bar{\Gamma}_{b} \Gamma_{a} = - 2 \delta_{ab} , \ \ (a,b) = 1,2, \cdots , 5 . \ \ \ \ (3)$$ This is actually a 4-dimensional Euclidian Clifford algebra in disguise. Indeed, if we write ($\alpha , \beta = 1,2, \cdots , 4$) $$\Gamma_{\beta} = \eta \gamma_{\beta} , \ \ \ \Gamma_{5} = i \eta ,$$$$\bar{\Gamma}_{\alpha} = - \gamma_{\alpha} \eta^{-1} , \ \ \bar{\Gamma}_{5} = i \eta^{-1} ,$$ for some arbitrary non-singular matrix $\eta$, then the algebra (3) gives us the Euclidian 4-dimensional Clifford (or Dirac) algebra $$\big\{ \gamma_{\alpha} , \gamma_{\beta} \big\} = 2 \delta_{\alpha \beta} , \ \ (\alpha , \beta) = 1 \ \mbox{to} \ 4 . \ \ \ \ (4)$$ Thus, all irreducible representations of the algebra (3) can be obtained from the well-known irreducible representations of the Dirac algebra (4). This means that all irreducible representations of our algebra (3) are 4-dimensional and are equivalent. Thus, we can choose the following (standard) realization $$\Gamma_{i} = \begin{pmatrix} \sigma_{i} & 0 \\ 0 & \sigma_{i} \end{pmatrix} , \ \ i = 1 , 2 , 3 \ ,$$$$\Gamma_{4} = i \begin{pmatrix} 0 & I_{2} \\ I_{2} & 0 \end{pmatrix} , \ \ \Gamma_{5} = \begin{pmatrix} 0 & - I_{2} \\ I_{2} & 0 \end{pmatrix} .$$ Now, using (2) we find $$A = \begin{pmatrix} 0 & 0 \\ I_{2} & 0 \end{pmatrix} , \ \ C = 2m \begin{pmatrix} 0 & I_{2} \\ 0 & 0 \end{pmatrix} .$$ Thus, the wave function $\Psi$ is a 4-component object, which we write as $$\Psi (x) = \begin{pmatrix} \varphi (x) \\ \chi (x) \end{pmatrix},$$ where $\varphi$ and $\chi$ are 2-component functions. 
And finally, for the linearized Schrodinger equation (1), we obtain the following coupled equations $$E \varphi (t, x) + \left(\vec{\sigma} \cdot \vec{P} \right) \chi (t,x) = 0 , \ \ \ (5a)$$$$\left( \vec{\sigma} \cdot \vec{P} \right) \varphi (t,x) + 2m \chi (t,x) = 0 , \ \ \ (5b)$$ or $$i\partial_{t} \varphi (t,x) - i \left(\sigma \cdot \nabla \right) \chi (t,x) = 0 ,$$$$- i \left( \sigma \cdot \nabla \right) \varphi (t,x) + 2m \chi (t,x) = 0.$$ It is easy to check that each component of the “Schrodinger bi-spinor” $$\Psi = \begin{pmatrix} \varphi (t,x) \\ \frac{i}{2m} ( \sigma \cdot \nabla ) \varphi (t,x) \end{pmatrix} ,$$ satisfies the ordinary Schrodinger equation. The fact that the linearized Schrodinger equation (5) describes a non-relativistic particle of mass $m$ and spin 1/2 can be checked from the covariance of the Schrodinger equation with respect to the Galilei group: $$\begin{pmatrix} \varphi (t,x) \\ \chi (t,x) \end{pmatrix} \to e^{i\Lambda (t,x)} \begin{pmatrix} D^{(1/2)}(R) & 0 \\ - \frac{\vec{\sigma} \cdot \vec{v}}{2} D^{(1/2)}(R) & D^{(1/2)}(R) \end{pmatrix} \begin{pmatrix} \varphi (t,x) \\ \chi (t,x) \end{pmatrix} ,$$ where $D^{(1/2)}(R)$ is the 2-dimensional projective representation of $SO(3)$ and $\Lambda = \frac{1}{2}mv^{2}t + m \vec{v} \cdot (R \vec{x})$ is the usual phase factor. This method is a bit complicated, but we can instead reason as follows: In order to describe a non-relativistic particle of mass $m$ and spin 1/2, the linear equations (5a) and (5b) must be the correct non-relativistic limit of the Dirac equations $$(\mathcal{E} - m) \varphi + ( \sigma \cdot P) \chi = 0 , \ \ \ (6a)$$$$(\sigma \cdot P) \varphi + (\mathcal{E} + m) \chi = 0, \ \ \ (6b)$$ where $\mathcal{E}$ is the total energy (i.e., mass + kinetic) and $\varphi$,$\chi$ are, respectively, the “big” and “small” 2-component spinors of the Dirac bi-spinor.
Indeed, it is rather trivial to see that (5a,b) are the nonrelativistic limit of (6a,b): In the non-relativistic limit $\mathcal{E} = E + m, \ m \gg E$, (6a,b) go to (5a,b). The fundamental difference though is the fact that the linearized Schrodinger equations do not predict anti-particles.
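The factorization behind equations (1) and (5a,b) can be checked numerically in momentum space, where $E$ and $P_{i}$ are ordinary numbers. This is a minimal sketch under stated assumptions: the post does not give the barred operators explicitly, so the code assumes one admissible choice, $\bar{A} = A$, $\bar{C} = C$, $\bar{\Gamma}_{i} = - \Gamma_{i}$ (it corresponds to $\eta = - i \Gamma_{5}$, which satisfies $\eta^{2} = 1$). The numerical values of `m`, `p` and `E_off` and the helper names are arbitrary illustrations.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices

m = 1.0                                  # illustrative mass (assumed value)
A = block(Z2, Z2, I2, Z2)                # A from the realization in the post
C = scale(2 * m, block(Z2, I2, Z2, Z2))  # C from the realization in the post
Gamma = [block(s, Z2, Z2, s) for s in sigma]

def linear_operator(E, p, sign=+1):
    """E*A + sign*(p . Gamma) + C: sqrt(2m)*Pi for sign=+1,
    and the assumed sqrt(2m)*Pi-bar for sign=-1."""
    out = add(scale(E, A), C)
    for p_i, G in zip(p, Gamma):
        out = add(out, scale(sign * p_i, G))
    return out

p = [0.3, -0.4, 1.2]                     # illustrative momentum (assumed value)
p2 = sum(c * c for c in p)

# Off-shell check: Pi-bar * Pi = E - p^2/(2m), i.e. (2mE - p^2) * identity
# before dividing by 2m.
E_off = 0.7
product = matmul(linear_operator(E_off, p, -1), linear_operator(E_off, p, +1))
factorization_ok = all(
    abs(product[i][j] - ((2 * m * E_off - p2) if i == j else 0)) < 1e-12
    for i in range(4) for j in range(4))

# On-shell check: with E = p^2/(2m), the bi-spinor (phi, -sigma.p phi/(2m))
# should solve (5a) and (5b).
E_on = p2 / (2 * m)
sp = [[sum(p[i] * sigma[i][r][c] for i in range(3)) for c in range(2)]
      for r in range(2)]
phi = [1, 0]
chi = [-x / (2 * m) for x in matvec(sp, phi)]
on_shell_residual = max(abs(x) for x in matvec(linear_operator(E_on, p), phi + chi))
print(factorization_ok, on_shell_residual)
```

Here $\chi = - (\vec{\sigma} \cdot \vec{P}) \varphi / 2m$ is the momentum-space form of the lower component of the "Schrodinger bi-spinor", since $P = - i \nabla$.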
