Uniqueness of quantization of Dirac field

1. Apr 25, 2008

pxb

Let's have a theory involving a Dirac field $$\psi$$. This theory is described by some Lagrangian density $$\mathcal{L}(\psi,\partial_\mu\psi)$$. Taking $$\psi$$ as the canonical dynamical variable, its conjugate momentum is defined as
$$\pi=\frac{\partial\mathcal{L}}{\partial(\partial_0\psi)}$$
Then quantization simply means imposing the canonical anticommutation relation of the type $$\{\psi,\pi \}_{e.t.}=i\delta^3(x-y)$$.

OK. But I wonder whether this procedure is unique. It is well known that the free (massless) Dirac field can be described by the Lagrangian $$\mathcal{L}=i\bar\psi\gamma^\mu\partial_\mu\psi$$, as well as by the Lagrangian
$$\mathcal{L}=\frac{i}{2}\bar\psi\gamma^\mu(\partial_\mu\psi)-\frac{i}{2}(\partial_\mu\bar\psi)\gamma^\mu\psi$$
That's because the two Lagrangians differ only by a total divergence and hence give the same equations of motion. The problem is that they give different conjugate momenta: the former gives $$\pi=i\psi^\dag$$, the latter
$$\pi=\frac{i}{2}\psi^\dag$$
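
As a quick check of the total-divergence statement, here is the difference of the two Lagrangians written out. The labels $$\mathcal{L}_1$$ (the unsymmetrized form) and $$\mathcal{L}_2$$ (the symmetrized form) are introduced here only for this check, and the fields are treated as classical, so only the product rule is used:

```latex
\begin{align*}
\mathcal{L}_1 - \mathcal{L}_2
  &= i\bar\psi\gamma^\mu\partial_\mu\psi
     - \Big[\tfrac{i}{2}\bar\psi\gamma^\mu(\partial_\mu\psi)
            - \tfrac{i}{2}(\partial_\mu\bar\psi)\gamma^\mu\psi\Big] \\
  &= \tfrac{i}{2}\bar\psi\gamma^\mu(\partial_\mu\psi)
     + \tfrac{i}{2}(\partial_\mu\bar\psi)\gamma^\mu\psi \\
  &= \tfrac{i}{2}\,\partial_\mu\big(\bar\psi\gamma^\mu\psi\big).
\end{align*}
```

So the two differ by half the divergence of the current $$\bar\psi\gamma^\mu\psi$$, which does not affect the Euler-Lagrange equations.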

Where's the problem? Does it mean that the creation/annihilation operators within each quantization have different anticommutation relations (differing by a factor of 1/2)? Or something else? Please help, thanks.

2. Apr 25, 2008

jostpuur

Are you sure you know what you mean by that partial derivative? It is difficult to say that you are doing it wrong, because everybody seems to be doing this stuff like that, but I'll criticize it anyway. That is a derivative with respect to a four-component variable, so its meaning is not quite clear. It would be safer to consider partial derivatives with respect to individual components. Also, physicists often say that $\phi^*$ is constant with respect to $\phi$, and that you can calculate

$$\frac{\partial}{\partial\phi} |\phi|^2 = \phi^*,$$

but IMO this notation doesn't serve any other purpose than confusion. If you can get correct results with that formula, it is of course impossible to prevent physicists from using it...

The safe way is to consider the real and imaginary parts as independent real variables, and deal with real derivatives. So with the Dirac field, there are eight real components. I would define the canonical momenta like this

$$\textrm{Re}(\pi_a) = \frac{\partial\mathcal{L}}{\partial(\partial_0\textrm{Re}(\psi_a))}$$

$$\textrm{Im}(\pi_a) = \frac{\partial\mathcal{L}}{\partial(\partial_0\textrm{Im}(\psi_a))}$$

and then set notation

$$\pi_a = \textrm{Re}(\pi_a) + i\textrm{Im}(\pi_a),\quad\forall\; a\in\{1,2,3,4\},$$

$$\pi = (\pi_a)_{a\in\{1,2,3,4\}}.$$

I am slightly confused about why both of these are popular among physicists. The first one is not real, and gives complex actions. IMO we should choose the Lagrangian so that it is real, but it could be that not everybody agrees with me here.

I think the biggest problem here is that even when $\mathcal{L}(\psi,\partial_0\psi)$ is fixed, $\mathcal{H}(\psi,\pi)$ is still not unique! The components of $\pi$ don't depend on $\partial_0\psi$, but instead on $\psi$, and you can write the Hamilton's function in infinitely many different ways.

Sorry that I cannot give very canonical answers. I've been fighting with this for some time myself too... And you probably notice that I'm not doing everything in the same way as these things are usually done in the physics books, so you should not agree with me without caution. In any case, I hope my comments make sense.

Last edited: Apr 25, 2008
3. Apr 26, 2008

jostpuur

I hope my post didn't look too suspicious. I'm not really disagreeing with the formula

$$\pi=i\psi^{\dagger}$$

although I would derive it more carefully, so that the meaning is clearer. Don't you find it disturbing, for example, that the right side is made a horizontal vector with the dagger, while the left side could easily be interpreted as a vertical vector? Your formula

$$\pi = \frac{i}{2}\psi^{\dagger}$$

seems to be derived with the assumption that $\overline{\psi}$ is a constant with respect to $\psi$, which I don't fully agree with.

This is how I would deal with the canonical momenta. We have

$$\frac{i}{2}\overline{\psi}\gamma^{\mu}\partial_{\mu}\psi - \frac{i}{2}(\partial_{\mu}\overline{\psi})\gamma^{\mu}\psi = -\textrm{Im}(\overline{\psi}\gamma^{\mu}\partial_{\mu}\psi).$$

In Weyl representation this becomes

$$-\textrm{Im}\big(\psi_{34}^{\dagger}\sigma^{\mu}\partial_{\mu}\psi_{34} + \psi_{12}^{\dagger}\overline{\sigma}^{\mu}\partial_{\mu}\psi_{12}\big),$$

with notation

$$\psi = \left(\begin{array}{c} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \\ \end{array}\right) =\left(\begin{array}{c} \psi_{12} \\ \psi_{34} \\ \end{array}\right),\quad \psi_{12},\psi_{34}\in\mathbb{C}^2$$

$$\sigma^{\mu}=(1,\boldsymbol{\sigma})^{\mu},\quad \overline{\sigma}^{\mu} = (1,-\boldsymbol{\sigma})^{\mu}$$

For the purpose of computing canonical momenta, we can ignore those terms which are not multiplied by $\partial_0\psi_a$ components, so we are left with

$$-\textrm{Im}\Big(\sum_{a=1}^4 \psi_a^* \partial_0 \psi_a\Big) = \sum_{a=1}^4\Big( \textrm{Im}(\psi_a)\partial_0 \textrm{Re}(\psi_a) - \textrm{Re}(\psi_a)\partial_0\textrm{Im}(\psi_a)\Big).$$

The components of canonical momenta are then

$$\textrm{Re}(\pi_a) = \textrm{Im}(\psi_a),\quad \textrm{Im}(\pi_a) = -\textrm{Re}(\psi_a),$$

which can be written more compactly as

$$\pi_a = \textrm{Im}(\psi_a) - i\textrm{Re}(\psi_a) = -i\psi_a,\quad \pi=-i\psi.$$

If you want to have this as a horizontal vector, we can write it like this

$$\pi^{\dagger} = i\psi^{\dagger}.$$

Last edited: Apr 26, 2008
4. Apr 26, 2008

pxb

Hi jostpuur, thanks for interesting responses. Let me comment a bit on them:

Only the first one is popular in physics; the second is occasionally introduced in textbooks, but not widely used. I don't know why a Lagrangian (and consequently the action) should be real, it only has to be hermitian (and of course consistent with the symmetries of the system - Lorentz symmetry, internal symmetries, ...). In fact, both the mentioned Lagrangians give the same action, since they only differ by a divergence, something like $$\partial_\mu(\bar\psi\gamma^\mu\psi)$$, and hence the integral (i.e. the action) $$\int\mathrm{d}^4x\mathcal{L}$$ over the whole space is the same for both Lagrangians. (Provided of course that $$\psi$$ is "well behaved" at infinity, which is the usual assumption in physics.)

If $$\psi$$ is a vertical vector, then it's OK that $$\pi$$ is a horizontal vector. What's the problem? For example, the Hamiltonian is something like $$\mathcal{H}=\pi\psi-\mathcal{L}$$, so in any case $$\pi\psi$$ should be a scalar.

This is the same as in mathematics when dealing with a function $$f$$ of one complex variable $$z=a+ib$$. You can understand the function $$f$$ either as a function of two real variables $$a$$, $$b$$ or as a function of $$z$$, $$\bar z$$. So, for example, the Cauchy-Riemann equations can be expressed in terms of $$\partial_a f, \partial_b f$$, as well as equivalently in terms of $$\partial_z f, \partial_{\bar z} f$$.
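
To spell this out, here is a sketch of the change of variables, assuming the usual convention that $$z$$ and $$\bar z$$ are treated as formally independent, and writing $$f=u+iv$$ with real $$u$$, $$v$$ (names introduced here only for illustration):

```latex
\begin{align*}
\partial_z &= \tfrac12\big(\partial_a - i\,\partial_b\big), \\
\partial_{\bar z} &= \tfrac12\big(\partial_a + i\,\partial_b\big), \\
\partial_{\bar z} f
  &= \tfrac12\big[(\partial_a u - \partial_b v)
     + i\,(\partial_a v + \partial_b u)\big].
\end{align*}
```

Hence $$\partial_{\bar z} f = 0$$ is the same statement as the Cauchy-Riemann pair $$\partial_a u=\partial_b v$$, $$\partial_b u=-\partial_a v$$.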

The comments above probably look a little bit silly to you as a mathematician, but in physics this is really the common state of the art. Anyway, the rest of what you say looks quite interesting, especially your derivation of $$\pi$$, although I don't understand it well yet and have to go through it.

5. Apr 26, 2008

jostpuur

You are right. This is purely a matter of convention.

What do you mean by hermitian? If we interpret a number $c\in\mathbb{C}$ as an operator $f\mapsto cf$, then the operator is hermitian precisely when $c\in\mathbb{R}$. So is hermitian only another way to say that the Lagrangian is real?

They do not give the same action! They give the same equations of motion, but not the same action. When we are deriving the EOM with the action principle, we integrate the Lagrange's function over some interval $[t_1,t_2]$, call it the action, and demand the action to be extremized. So the action along some path is

$$S=\int\limits_{t_1}^{t_2} dt\; L(q(t),t)$$

Now the Lagrange's function is given by integrating the Lagrangian density over space like this

$$L = \int\limits_{\mathbb{R}^3}d^3x\; \mathcal{L}.$$

So the action is given by a four dimensional integral

$$S = \int\limits_{[t_1,t_2]\times\mathbb{R}^3} d^4x\; \mathcal{L}.$$

When taking the complex conjugate, in order to examine the existence of an imaginary component, the integration by parts goes like this

$$(i\overline{\psi}\gamma^{\mu}\partial_{\mu}\psi)^* = -i(\partial_{\mu}\overline{\psi})\gamma^{\mu}\psi = -i\partial_{\mu}(\overline{\psi}\gamma^{\mu}\psi) + i\overline{\psi}\gamma^{\mu}\partial_{\mu}\psi$$

In the integration, the spatial part

$$\partial_i(\overline{\psi}\gamma^i\psi)$$

vanishes, because we can assume that $\psi$ approaches zero quickly enough when $|x|\to\infty$. However, the substitution

$$\overline{\psi}(t_2)\gamma^0\psi(t_2) - \overline{\psi}(t_1)\gamma^0\psi(t_1)$$

does not vanish in general, and as a consequence $S$ is not real in general, and cannot be equal to the other action obtained from the real Lagrangian density.
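
Written out, and assuming the spatial surface term is dropped as above, the identity for the complex conjugate integrates to the following. Here $S_1$ denotes the action built from $i\overline{\psi}\gamma^{\mu}\partial_{\mu}\psi$ (a label introduced only for this sketch), and $\overline{\psi}\gamma^0\psi=\psi^{\dagger}\psi$ because $(\gamma^0)^2=1$:

```latex
\begin{align*}
S_1 - S_1^*
  &= i\int\limits_{[t_1,t_2]\times\mathbb{R}^3} d^4x\;
     \partial_{\mu}\big(\overline{\psi}\gamma^{\mu}\psi\big) \\
  &= i\int\limits_{\mathbb{R}^3} d^3x\;
     \Big[\psi^{\dagger}\psi\Big]_{t=t_1}^{t=t_2}, \\
\textrm{Im}(S_1)
  &= \frac{1}{2}\int\limits_{\mathbb{R}^3} d^3x\;
     \Big(\psi^{\dagger}\psi\big|_{t_2} - \psi^{\dagger}\psi\big|_{t_1}\Big).
\end{align*}
```

The imaginary part is therefore controlled entirely by the values of $\int d^3x\,\psi^{\dagger}\psi$ at the two endpoint times.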

Notice that the situation is slightly different from the one where we are deriving the EOM with the action principle. According to the action principle, we substitute $\psi(x)\mapsto \psi(x) + \alpha\xi(x)$, where $\xi$ is some arbitrary variation, and we assume that $\xi=0$ when $t=t_1$ or $t=t_2$. When we start calculating

$$0 = D_{\alpha} \int d^4x\; \mathcal{L}(\psi + \alpha\xi)\Big|_{\alpha=0} = \cdots$$

and perform integration by parts, the substitutions to $t=t_1$ and $t=t_2$ vanish because $\xi$ vanishes there.

In fact I've understood the argument "assume that the complex conjugate is constant" to be something like this. Suppose we have a function $f(x_1,x_2)=x_1x_2$. Then we have partial derivatives

$$\partial_1 f(x_1,x_2) = x_2,\quad\quad \partial_2 f(x_1,x_2) = x_1.$$

Then substitute $x_1=z^*$ and $x_2=z$ and we have

$$\partial_1 f(z^*, z) = z,\quad\quad \partial_2 f(z^*, z) = z^*.$$

It makes sense to use notation

$$\frac{\partial f}{\partial z^*} = \partial_1 f,\quad\quad \frac{\partial f}{\partial z} = \partial_2 f,$$

and when you substitute $f=z^* z = |z|^2$ into this, we have

$$\frac{\partial}{\partial z} |z|^2 = z^*.$$
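
As a cross-check that uses only real derivatives, one can write $z=a+ib$ with real $a$, $b$ and read $\partial/\partial z$ as $\frac12(\partial_a - i\partial_b)$. This reading is an assumption made here; it is the standard convention, but it is not spelled out above:

```latex
\begin{align*}
|z|^2 &= a^2 + b^2, \\
\frac{\partial}{\partial z}|z|^2
  &= \tfrac12\big(\partial_a - i\,\partial_b\big)(a^2+b^2)
   = \tfrac12\,(2a - 2ib) = a - ib = z^*, \\
\frac{\partial}{\partial z^*}|z|^2
  &= \tfrac12\big(\partial_a + i\,\partial_b\big)(a^2+b^2)
   = \tfrac12\,(2a + 2ib) = z.
\end{align*}
```

Both agree with the formal rule of treating $z^*$ as constant when differentiating with respect to $z$.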

Okay. So taking a derivative with respect to some complex variable, of a function that is not really differentiable in the complex sense, is not necessarily wrong, but all I'm saying is that it's dangerous. You have to know what you are doing. I don't think I've ever seen Lagrangians like $\mathcal{L}(\partial_{\mu}\phi^*, \partial_{\mu}\phi, \phi^*, \phi)$. Physics books usually derive the Euler-Lagrange equations with real fields, then substitute complex ones in, and start calculating without much explanation.

I prefer to at least check, that I can get the same results with more primitive calculations. This way one doesn't get lost so easily. Calculations are not always so elegant this way, but it's not a big crime to check things.

6. Apr 26, 2008

Mr.Slava

It is not a Lagrangian because this operator is not hermitian $$\mathcal{L}^{\dagger} = - \mathcal{L} \ne \mathcal{L}$$.

7. Apr 26, 2008

reilly

1. Differentiation by operators is commonplace in Functional Analysis, and is perfectly legit, PROVIDED that operator order is preserved. Such differentiation is sometimes called functional differentiation, and is extensively used in path integral approaches, both quantum and classical.
2. Remember that the Lagrange Equations start from minimizing the action. The two Lagrangians, as has already been noticed, differ by a 4-divergence, which does not contribute to the minimization -- provided that fields vanish at infinity; just more integration by parts. Lanczos discusses such cases in his Variational Principles of Mechanics.

Regards,
Reilly Atkinson

8. Apr 27, 2008

pxb

OK, so first of all some unimportant remarks:

You are completely right. I somehow forgot that $\mathcal{L}$ is a scalar, so hermitian == real. I also overlooked that $$\mathcal{L}=i\bar\psi\gamma^\mu\partial_\mu\psi$$ is not hermitian (i.e. real). Mea culpa ...

If you put $t_1=-\infty$ and $t_2=+\infty$ and suppose that $\psi$ vanishes for $t_{1,2}=\pm\infty$, as is usually assumed in physics, they do.

Now back to your derivation of $\pi$. The hermitian Lagrangian

$$\mathcal{L}=\frac{i}{2}\bar\psi\gamma^\mu(\partial _\mu\psi)-\frac{i}{2}(\partial_\mu\bar\psi)\gamma^\mu\psi$$

can be rewritten as (using the summation convention)

$$\mathcal{L}=\textrm{Im}(\psi_a)\partial_0 \textrm{Re}(\psi_a) - \textrm{Re}(\psi_a)\partial_0\textrm{Im}(\psi_a)$$

plus terms involving derivatives with respect to the spatial variables. (BTW: this can be derived using no particular representation of gamma matrices, you don't need the Weyl representation ;) .) Hence we have

$$\frac{\partial\mathcal{L}}{\partial(\partial_0\textrm{Re}(\psi_a))}=\textrm{Im}(\psi_a)$$
$$\frac{\partial\mathcal{L}}{\partial(\partial_0\textrm{Im}(\psi_a))}=-\textrm{Re}(\psi_a)$$
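
The remark that no particular representation is needed can be sketched as follows, using only the definition $$\bar\psi=\psi^\dagger\gamma^0$$ and $$(\gamma^0)^2=1$$, and treating the field components as ordinary complex numbers. Only the $$\mu=0$$ terms are kept, since the spatial-derivative terms do not contribute to the canonical momenta:

```latex
\begin{align*}
\tfrac{i}{2}\bar\psi\gamma^0(\partial_0\psi)
 - \tfrac{i}{2}(\partial_0\bar\psi)\gamma^0\psi
  &= \tfrac{i}{2}\,\psi^\dagger(\partial_0\psi)
   - \tfrac{i}{2}\,(\partial_0\psi^\dagger)\psi \\
  &= \tfrac{i}{2}\big[\psi^\dagger\partial_0\psi
     - (\psi^\dagger\partial_0\psi)^*\big] \\
  &= -\textrm{Im}\big(\psi_a^*\,\partial_0\psi_a\big) \\
  &= \textrm{Im}(\psi_a)\,\partial_0\textrm{Re}(\psi_a)
   - \textrm{Re}(\psi_a)\,\partial_0\textrm{Im}(\psi_a).
\end{align*}
```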

Now there are (at least) two approaches for deriving $$\pi_a$$:

1. Your approach: You simply define

$$\pi_a =\textrm{Re}(\pi_a)+i\textrm{Im}(\pi_a) =\frac{\partial\mathcal{L}}{\partial(\partial_0\textrm{Re}(\psi_a))} + i\frac{\partial\mathcal{L}}{\partial(\partial_0\textrm{Im}(\psi_a))} =-i\psi_a$$

2. "My" approach: Let $$z=a+ib$$ be a complex number and $$f(a,b)$$ a function of its real and imaginary parts. They can be expressed in terms of $$z, z^*$$ as

$$a=\frac{z+z^*}{2}$$
$$b=\frac{z-z^*}{2i}$$

Now I define the derivative of $$f(a,b)$$ with respect to $$z$$ as

$$\frac{\partial f}{\partial z} = \frac{\partial f}{\partial a}\frac{\partial a}{\partial z}+\frac{\partial f}{\partial b}\frac{\partial b}{\partial z} = \frac{1}{2} \left( \frac{\partial f}{\partial a}+\frac{1}{i}\frac{\partial f}{\partial b} \right)$$

Now, using these results, I calculate $$\pi_a$$ as

$$\pi_a = \frac{\partial\mathcal{L}}{\partial(\partial_0(\psi_a))} = \frac{1}{2} \left( \frac{\partial\mathcal{L}}{\partial(\partial_0\textrm{Re}(\psi_a))} + \frac{1}{i}\frac{\partial\mathcal{L}}{\partial(\partial_0\textrm{Im}(\psi_a))} \right) = \frac{i}{2}\psi_a^*$$

Both results, mine and yours, are different. They also differ from the "textbook" correct (?) result $$\pi_a = i\psi_a^*$$. So my ultimate question is which result is correct and how it should be correctly derived. I stress that the canonical dynamical coordinate is in all cases $$\psi_a$$, so when promoting $$\psi_a, \pi_a$$ to operators, they should satisfy

$$\{ \psi_a(x), \pi_b(y) \} _{e.t.} = \delta_{ab}\delta^3(\vec{x}-\vec{y})$$
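
One observation that may help compare the two results, offered as a sketch rather than as an answer to the question: with the convention $$\partial/\partial z^* = \frac12(\partial_a + i\partial_b)$$, approach 1 is twice the derivative with respect to $$\partial_0\psi_a^*$$, while approach 2 is the derivative with respect to $$\partial_0\psi_a$$. The labels $$\pi_a^{(1)}$$ and $$\pi_a^{(2)}$$ are introduced here only for illustration, and $$\mathcal{L}$$ is assumed real:

```latex
\begin{align*}
\pi_a^{(1)}
  &= \frac{\partial\mathcal{L}}{\partial(\partial_0\textrm{Re}(\psi_a))}
   + i\,\frac{\partial\mathcal{L}}{\partial(\partial_0\textrm{Im}(\psi_a))}
   = 2\,\frac{\partial\mathcal{L}}{\partial(\partial_0\psi_a^*)}, \\
\pi_a^{(2)}
  &= \frac{\partial\mathcal{L}}{\partial(\partial_0\psi_a)}, \\
\pi_a^{(1)} &= 2\,\big(\pi_a^{(2)}\big)^*
  = 2\,\Big(\frac{i}{2}\psi_a^*\Big)^* = -i\psi_a.
\end{align*}
```

So the two computations are consistent with each other; they differ by a complex conjugation and a factor of 2 in what is being called the momentum. This does not by itself decide which normalization belongs in the anticommutator.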

9. Apr 27, 2008

samalkhaiat

10. Apr 27, 2008

reilly

pxb -- Do check a field theory course book, or even something like Schweber's QED and the Men Who Made It. This is old stuff. Good question, but it can probably be answered with half an hour or so on Google.

No way can a canonical commutation rule have a "1/2" in it, on the rhs.

Strictly speaking, Fermion and Boson creation operators are not unique: if a is a destruction operator, then exp(iX) a + A is also one, along with a' -> exp(-iX) a' + A*, with ' indicating the adjoint. You can make this statement clear by looking at the coherent states generated by a and the transformed a.

Again, a 4-divergence in a Lagrangian is effectively zero; it does not count.
Regards,
Reilly Atkinson

11. Apr 28, 2008

pxb

Hi Reilly, thank you for reply.

Of course I tried to consult Google before posting my question, but maybe I didn't try hard enough...

Exactly. As far as I understand the subject, the canonical quantization is defined by insisting on the (anti)commutation relation of the type

$$\{ \psi(x), \pi(y) \} = \delta(x-y)$$

So whatever the $$\psi$$ and $$\pi$$ are, the r.h.s. is always the same, without any factor of 1/2. And if the dynamical variable is fixed to be THE $$\psi$$ from the Lagrangian, then $$\pi$$ should be unique. It doesn't matter which Lagrangian is used, since both give the same EOM, in particular the same propagator.

Well, yes, I didn't check this explicitly, but it could be the case. The uniqueness I have in mind is (when expressed in terms of creation and annihilation operators) rather whether the destruction operator $$a$$ is defined up to a multiplicative constant (not an additive constant) or not. I think it should be unique, i.e. the relation $$\{ a, a^\dag\}=1$$ should hold, otherwise the Feynman rules would not be unique. My only problem is that I cannot derive this uniqueness from the process of canonical quantization.

12. Apr 28, 2008

jostpuur

I don't think we are usually interested in the limit $(t_1,t_2)\to (-\infty,\infty)$. Recall how we derive Newton's law from the Lagrange's function $L=T-U$. We consider paths $x(t)$ in some finite time interval, and assume that in the variation $x(t)+\alpha\xi(t)$, we have $\xi(t_1)=0=\xi(t_2)$. If we set the time interval to be $]-\infty,\infty[$, we would have an infinite action, and extremizing it would be somewhat unpleasant.

In fact it is difficult to tell what happens to those spatial integrals when you let time go to $\pm\infty$. Suppose you have some normalization-conserving wave packet; its spatial integral will remain constant as time varies.