# Spin in Bohmian mechanics

2019 Award
A paper that appeared today may be very relevant here:
http://lanl.arxiv.org/abs/1603.02541
Equation (8) defines the conditional wave function of a microscopic physical system in terms of the wave function of the universe and the true positions of the particles in the environment (= the universe except the microscopic system). Such a definition is essential for the interpretation of quantum mechanical systems which are not usually analyzed together with the whole universe.

It is well-known that electrons, protons and neutrons, the constituents of ordinary matter, are fermions and hence have spin. Thus the wave function of the universe must account for particles with spin 1/2. But equation (8) ignores the spin. What is the correct generalization to spinning particles?
the generalization is trivial. [...] But of course, to see that something is trivial requires understanding. That's why textbooks have exercises, to test your understanding.
I admit that I lack understanding. But textbooks usually give at least explicit computations the first time something is encountered, so that readers have a template to do the exercises for themselves. Please provide the template so that I can improve my understanding. After all, a triviality should be easy to explain with all details.

Demystifier
Thus the wave function of the universe must account for particles with spin 1/2. But equation (8) ignores the spin. What is the correct generalization to spinning particles?
That's really simple. Eq. (8) remains the same, but in (9) replace ##|\psi|^2## with ##\psi^{\dagger}\psi##.

For a very detailed description of the measurement of spin in Bohmian mechanics I recommend
http://arxiv.org/abs/1305.1280

2019 Award
That's really simple. Eq. (8) remains the same, but in (9) replace ##|\psi|^2## with ##\psi^{\dagger}\psi##.
Ah, thanks for the hint, but I understand that only partially. Could you please explain one more thing to me?

It seems to me that on the right there are ##N+M## such indices while on the left one needs only ##N## of them. Please give me another hint so that I can understand what happens with the ##M## indices belonging to the spinning particles of the environment?

For a very detailed description of the measurement of spin in Bohmian mechanics I recommend
http://arxiv.org/abs/1305.1280
Thanks a lot for the link. I hope to be illuminated with respect to my failure to extend your argument regarding the derivation of the Born rule to the case with spin.

Demystifier
It seems to me that on the right there are ##N+M## such indices while on the left one needs only ##N## of them. Please give me another hint so that I can understand what happens with the ##M## indices belonging to the spinning particles of the environment?
This is an excellent question!!!
In Bohmian mechanics, nobody has so far studied the spin degrees of freedom of the environment. So to answer your question, I had to think about something I had never thought of before. The answer I will give you is something entirely new, and conceptually very interesting. A refined version of the answer that follows might be worth a separate paper.

Both sides have ##N+M## indices. But the left-hand side is the conditional wave function of the subsystem. Normally, the wave function of the subsystem should have only ##N## indices. So what is the meaning of the extra ##M## indices?

The point is that the conditional wave function is an object that does not exist in standard QM. Even though it describes the subsystem, it depends on the environment. First, it depends on the environment trajectories ##Y(t)##. Second, it also depends on the environment spin indices.

In a certain relevant limit, however, both dependences turn out to vanish. I cannot describe this in detail here, because it would require a separate research paper. But you gave me a good idea, thanks!

2019 Award
you gave me a good idea, thanks!
Perhaps you might then also reconsider your reluctance to help me with the exercise you had posed before
so that I can see the complete argument in the degenerate case. I don't ask questions if they are completely trivial!

Demystifier
so that I can see the complete argument in the degenerate case. I don't ask questions if they are completely trivial!

A general measurement is described by a POVM. However, any POVM measurement can be viewed as a projective measurement in some higher-dimensional Hilbert space. So, without loss of generality, we can assume that we have a projective measurement defined by some projectors ##\pi_k## satisfying ##\sum_k \pi_k =1##. These projectors can be further decomposed as
$$\pi_k = \sum_l |k,l\rangle\langle k,l |$$
where the set of all ##|k,l\rangle## forms a complete basis for the measured system. Such a measurement distinguishes different ##k## but not different ##l##, so ##l## can be said to label the degeneracy of the measured observable.
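To make the decomposition concrete, here is a minimal sketch that builds the projectors for a toy 4-dimensional system and checks that they are orthogonal projectors summing to the identity. The choice of two values each for ##k## and ##l##, the standard basis vectors standing in for ##|k,l\rangle##, and all helper names are illustrative assumptions, not part of the argument above.

```python
# Toy check of pi_k = sum_l |k,l><k,l| in a 4-dimensional space.
# The labelling (k, l) with k, l in {0, 1} is an illustrative assumption.

DIM = 4

def basis(k, l):
    """Standard basis vector standing in for |k,l>."""
    v = [0j] * DIM
    v[2 * k + l] = 1 + 0j
    return v

def outer(u, v):
    """Return the matrix |u><v| for vectors given as lists of complex numbers."""
    return [[ui * vj.conjugate() for vj in v] for ui in u]

def mat_add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_mul(a, b):
    """Matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

zero = [[0j] * DIM for _ in range(DIM)]
identity = [[(1 + 0j) if i == j else 0j for j in range(DIM)] for i in range(DIM)]

# Build pi_k by summing over the degeneracy label l.
projectors = {}
for k in (0, 1):
    pk = zero
    for l in (0, 1):
        pk = mat_add(pk, outer(basis(k, l), basis(k, l)))
    projectors[k] = pk

total = mat_add(projectors[0], projectors[1])
print(total == identity)                                      # completeness
print(mat_mul(projectors[0], projectors[0]) == projectors[0]) # idempotence
print(mat_mul(projectors[0], projectors[1]) == zero)          # orthogonality
```

Each ##\pi_k## here has rank 2, which is exactly the statement that the outcome ##k## does not resolve ##l##.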

Before measurement, let the state of the measured system be
$$|\psi\rangle = \sum_{k ,l}c_{kl}|k,l\rangle$$
During the measurement, the measured system will become entangled with the apparatus and its environment, so the total state will be
$$|\Psi\rangle = \sum_{k ,l}c_{kl}|k,l\rangle |A_{k}\rangle |E_{k}\rangle$$
where ##|A_{k}\rangle## are macroscopically distinguishable states of the apparatus and ##|E_{k}\rangle## are macroscopically distinguishable states of the environment. Let me further introduce the notation ##|A_{k}\rangle |E_{k}\rangle \equiv |\Phi_{k}\rangle##, so that the equation above can be written as
$$|\Psi\rangle = \sum_{k ,l}c_{kl}|k,l\rangle |\Phi_{k}\rangle$$
This can also be written in terms of wave functions as
$$\Psi(x,y) = \sum_{k ,l}c_{kl}\psi_{kl}(x) \Phi_{k}(y)$$
where ##x## is the collective coordinate for particle positions of the measured system, while ##y## is the collective coordinate for particle positions of the apparatus+environment. For simplicity, I assume that there are no spin degrees of freedom.

Since ##\Phi_{k}(y)## are macroscopically distinguishable, we have an (approximate) equation
$$\Phi^*_{k'}(y) \Phi_{k}(y)=0 \;\; \text{for} \;\; k\neq k' \;\; \text{(Eq. 1)}$$
The probability density is
$$P(x,y) =|\Psi(x,y)|^2 \;\; \text{(Eq. 2)}$$
which is valid in both standard and Bohmian QM. Using (Eq. 1) this becomes
$$P(x,y) = \sum_{k}|\Phi_{k}(y)|^2 |\psi_{k}(x)|^2$$
where
$$\psi_{k}(x)=\sum_{l}c_{kl}\psi_{kl}(x)$$

The probability density in the ##y##-space is obtained by integrating over ##x##
$$P(y) =\int dx P(x,y) = \sum_{k} |\Phi_{k}(y)|^2 \sum_{l}|c_{kl}|^2$$
where we have used orthogonality and standard normalization of the basis ##\psi_{kl}(x)##.

Now let ##\sigma_k## be the region in the ##y##-space in which ##\Phi_{k}(y)## is not negligible. Then the probability of finding the apparatus+environment inside ##\sigma_k## is
$$P_k=\int_{\sigma_k}dy P(y) = \sum_{l}|c_{kl}|^2$$
where we used standard normalization of ##\Phi_{k}(y)##. The obtained result ##P_k=\sum_{l}|c_{kl}|^2## is nothing but the Born rule in the degenerate case, so we have proved the Born rule in the ##k##-space with degeneracy from the Born rule (Eq. 2) in the position space. Q.E.D.

Note that the whole proof does not depend much on Bohmian mechanics, so in this sense there is nothing really new in it, and it can be considered "trivial" by an expert in standard QM, even if he/she does not know much about Bohmian mechanics.
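As a sanity check of the steps above, here is a minimal numerical sketch on a discretized toy model. Everything specific in it is an assumption made for illustration: the 4-point ##x## grid, the Hadamard-type orthonormal basis standing in for ##\psi_{kl}(x)##, the pointer functions ##\Phi_k(y)## with exactly disjoint supports ##\sigma_k## (an idealized version of Eq. 1), and the particular coefficients ##c_{kl}##. Sums over grid points replace the integrals.

```python
import math

# Coefficients c_kl of the measured system; arbitrary values with sum |c_kl|^2 = 1.
c = {(0, 0): 0.3, (0, 1): 0.4j, (1, 0): 0.5, (1, 1): math.sqrt(0.5)}

NX, NY = 4, 6

# Orthonormal basis psi_kl(x) on a 4-point grid: rows of a Hadamard matrix / 2.
H = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]

def psi(k, l, x):
    """Toy stand-in for psi_kl(x)."""
    return H[2 * k + l][x] / 2.0

# Regions sigma_k in y-space; Phi_k vanishes outside sigma_k, so Eq. 1 holds exactly.
sigma = {0: range(0, 3), 1: range(3, 6)}

def phi(k, y):
    """Normalised pointer wave function Phi_k(y) supported on sigma_k."""
    return 1.0 / math.sqrt(3) if y in sigma[k] else 0.0

def Psi(x, y):
    """Total wave function Psi(x,y) = sum_{k,l} c_kl psi_kl(x) Phi_k(y)."""
    return sum(c[k, l] * psi(k, l, x) * phi(k, y) for (k, l) in c)

# Eq. 2: position-space Born rule.
P_xy = [[abs(Psi(x, y)) ** 2 for y in range(NY)] for x in range(NX)]

# Marginal over x, then probability of finding y inside sigma_k.
P_y = [sum(P_xy[x][y] for x in range(NX)) for y in range(NY)]
P_k = {k: sum(P_y[y] for y in sigma[k]) for k in sigma}

# Degenerate Born rule: sum_l |c_kl|^2.
born = {k: sum(abs(c[k, l]) ** 2 for l in (0, 1)) for k in sigma}

for k in sigma:
    print(k, P_k[k], born[k])
```

The cross terms between different ##l## do not vanish pointwise in this basis; they only cancel after summing over ##x##, which is where orthonormality of the ##\psi_{kl}## enters. The cross terms between different ##k## vanish because the supports of the ##\Phi_k## are disjoint.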

2019 Award
During the measurement, the measured system will become entangled with the apparatus and its environment, so the total state will be
$$|\Psi\rangle = \sum_{k ,l}c_{kl}|k,l\rangle |A_{k}\rangle |E_{k}\rangle$$
where ##|A_{k}\rangle## are macroscopically distinguishable states of the apparatus and ##|E_{k}\rangle## are macroscopically distinguishable states of the environment.
Why is there no microscopic dependence of the macroscopically distinguishable states on ##l##? Most states in the tensor product of the two Hilbert spaces cannot be represented in the form of this equation, since if the index set over which ##k## runs is finite, the states of this form form only a finite-dimensional subspace ##S## tiny compared to the Hilbert space of the universe. Thus you either need to assume a more general form of entanglement, or you need to show why the dynamics of the universe doesn't lead you outside the tiny subspace ##S##.

Demystifier
Why is there no microscopic dependence of the macroscopically distinguishable states on ##l##? Most states in the tensor product of the two Hilbert spaces cannot be represented in the form of this equation, since if the index set over which ##k## runs is finite, the states of this form form only a finite-dimensional subspace ##S## tiny compared to the Hilbert space of the universe. Thus you either need to assume a more general form of entanglement, or you need to show why the dynamics of the universe doesn't lead you outside the tiny subspace ##S##.
I knew you would ask that.

Here we don't have a typical entanglement with apparatus+environment, but precisely the kind of apparatus+environment that measures ##k## but not ##l##. If macroscopically distinguishable states depended on ##l##, then, by looking at the macroscopic state, we could distinguish different values of ##l##, which would mean that we measured ##l##. So if we want an apparatus that does not measure ##l##, then we must have macroscopic states that do not depend on ##l##.

This is the general argument, but in specific cases, of course, one has to show that interaction is such that the above is satisfied. In practice, this usually means that the interaction Hamiltonian commutes with an appropriate operator ##L## with eigenvalues ##l##.
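The commutation condition can be illustrated with a small matrix sketch. It assumes, purely for illustration, an interaction of the form ##H_{\rm int}=\sum_k \pi_k\otimes B_k##, a 4-dimensional system with basis ##|k,l\rangle##, and a 2-level stand-in for the apparatus; the pointer couplings `B` are hypothetical and nothing here is taken from a specific physical model.

```python
def diag(entries):
    """Diagonal matrix with the given entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[p][q] for j in range(len(a)) for q in range(len(b))]
            for i in range(len(a)) for p in range(len(b))]

def mat_mul(a, b):
    """Matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# System basis ordered as |k,l> = |0,0>, |0,1>, |1,0>, |1,1>.
pi = {0: diag([1, 1, 0, 0]), 1: diag([0, 0, 1, 1])}  # projectors pi_k
L = diag([0, 1, 0, 1])                               # operator with eigenvalues l

# Hypothetical couplings B_k that push the apparatus differently for each k.
B = {0: [[0, 1], [1, 0]], 1: [[0, -1], [-1, 0]]}
I2 = diag([1, 1])

# H_int depends on the system only through pi_k, never through l.
H_int = mat_add(kron(pi[0], B[0]), kron(pi[1], B[1]))
L_full = kron(L, I2)

commutator = mat_sub(mat_mul(H_int, L_full), mat_mul(L_full, H_int))
print(all(entry == 0 for row in commutator for entry in row))
```

Because this ##H_{\rm int}## acts on the system only through the ##\pi_k##, the evolution it generates cannot correlate the apparatus with ##l##, which is the toy version of the statement that ##l## is not measured.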

2019 Award
I knew you would ask that.
Then you should have made your subsequent comment directly in the derivation, and it would have saved me a question.
So if we want an apparatus that does not measure ##l##, then we must have macroscopic states that do not depend on ##l##. [...] This usually means that the interaction Hamiltonian commutes with an appropriate operator ##L## with eigenvalues ##l##.
OK. Thus for a binary measurement, what you seem to claim is that the state ##\Psi(t)## at any time ##t## has the form ##\psi_1\Phi_1(t)+\psi_2\Phi_2(t)## where ##\psi_k## is an eigenstate of the projector ##\pi_k## to the eigenvalue 1 determined by the initial state of the measured system, and ##\Phi_k(t)## is a state of detector plus environment. If the measurement begins at a positive time then ##\Phi_1(0)=\Phi_2(0)=\Phi_0## is the initial state of detector plus environment. This is the only way I can see that your setup can hold generically and incorporates your commutation assumption. Is this correct?

Demystifier
Then you should have made your subsequent comment directly in the derivation, and it would have saved me a question.
Well, I was not completely sure that you would ask it, so I wanted to test you.

OK. Thus for a binary measurement, what you seem to claim is that the state ##\Psi(t)## at any time ##t## has the form ##\psi_1\Phi_1(t)+\psi_2\Phi_2(t)## where ##\psi_k## is an eigenstate of the projector ##\pi_k## to the eigenvalue 1 determined by the initial state of the measured system, and ##\Phi_k(t)## is a state of detector plus environment. If the measurement begins at a positive time then ##\Phi_1(0)=\Phi_2(0)=\Phi_0## is the initial state of detector plus environment. This is the only way I can see that your setup can hold generically and incorporates your commutation assumption. Is this correct?
Yes, that's correct.

Demystifier
Both sides have ##N+M## indices. But the left-hand side is the conditional wave function of the subsystem. Normally, the wave function of the subsystem should have only ##N## indices. So what is the meaning of the extra ##M## indices?
In the meantime I have realized that it is more trivial than I thought. The only point of calculating the conditional wave function is to calculate the conditional probability density from it. The probability density, of course, does not have any indices because it is calculated from the wave function ##\psi_{ab...}## as
$$P=\psi^{\dagger}\psi\equiv \sum_{a,b,...} \psi^*_{ab...} \psi_{ab...}$$
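A minimal sketch of this prescription on a toy model: the wave function carries one spin index `a` for the system and one spin index `b` for the environment, and the conditional density is obtained by evaluating at the actual environment configuration and summing over all indices. The grid sizes, the amplitude function `Psi`, and the normalization over ##x## are illustrative assumptions.

```python
import cmath

NA, NB, NX, NY = 2, 2, 3, 2  # hypothetical numbers of spin values and grid points

def Psi(a, b, x, y):
    """Arbitrary toy amplitude with system spin index a and environment spin index b."""
    return cmath.exp(1j * (a + 2 * b)) * (1 + x + 0.5 * y * (a - b))

def density(x, y):
    """psi^dagger psi: sum over ALL spinor indices, system (a) and environment (b)."""
    return sum(abs(Psi(a, b, x, y)) ** 2 for a in range(NA) for b in range(NB))

def conditional_density(Y):
    """P(x | Y): evaluate at the actual environment configuration Y, normalise over x."""
    weights = [density(x, Y) for x in range(NX)]
    norm = sum(weights)
    return [w / norm for w in weights]

p = conditional_density(Y=1)
print(p, sum(p))
```

The result `p` has one entry per system position and no spin index left over, for the system or for the environment.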

2019 Award
OK. Thus for a binary measurement, what you seem to claim is that the state ##\Psi(t)## at any time ##t## has the form ##\psi_1\Phi_1(t)+\psi_2\Phi_2(t)## where ##\psi_k## is an eigenstate of the projector ##\pi_k## to the eigenvalue 1 determined by the initial state of the measured system, and ##\Phi_k(t)## is a state of detector plus environment. If the measurement begins at a positive time then ##\Phi_1(0)=\Phi_2(0)=\Phi_0## is the initial state of detector plus environment. This is the only way I can see that your setup can hold generically and incorporates your commutation assumption. Is this correct?
Yes, that's correct.
Good; now I understand a bit more. The fact that you didn't seem to use any assumptions at all was what had lost me. Now we found out that one of the unstated but used assumptions was that the Hamiltonian must commute with all unmeasured operators from a complete collection. It is common practice in physics to state the assumptions used in an argument; so I hope you'll not lose patience if I want to find out the full list of assumptions that are needed to make your argument work.

You state that after the measurement, ##\Phi_1(t)## and ##\Phi_2(t)## are orthogonal. But initially they are equal. Thus this is a gradual process. I don't see yet why they should become orthogonal after some time. It seems to me that they can be very complicated superpositions of detector and environment states, while you claim that they are orthogonal tensor product states. To be able to derive this, one surely must assume another property of the measurement process, and give conditions on the Hamiltonian that guarantee it.

Demystifier
Good; now I understand a bit more. The fact that you didn't seem to use any assumptions at all was what had lost me. Now we found out that one of the unstated but used assumptions was that the Hamiltonian must commute with all unmeasured operators from a complete collection. It is common practice in physics to state the assumptions used in an argument; so I hope you'll not lose patience if I want to find out the full list of assumptions that are needed to make your argument work.
I am glad that we made progress. But in practice it is almost impossible to state all assumptions explicitly. One states explicitly only those assumptions which are not "obvious", but what is obvious to one person may not be obvious to another.

You state that after the measurement, ##\Phi_1(t)## and ##\Phi_2(t)## are orthogonal. But initially they are equal. Thus this is a gradual process. I don't see yet why they should become orthogonal after some time. It seems to me that they can be very complicated superpositions of detector and environment states, while you claim that they are orthogonal tensor product states. To be able to derive this, one surely must assume another property of the measurement process, and give conditions on the Hamiltonian that guarantee it.
They are approximately orthogonal, not exactly orthogonal. The scalar product between them typically decays with time as ##e^{-t/\tau}## where ##\tau## is the decoherence time. For a macroscopic apparatus/environment the decoherence time is typically very short (a tiny fraction of a second); for more details you need to see a review on decoherence. (I can suggest some reviews if you want.)
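To get a feeling for the exponential law, here is a minimal sketch that computes how long it takes for the overlap to fall below a chosen threshold. The value of `tau` and the threshold are hypothetical numbers picked only for illustration; real decoherence times depend strongly on the system and environment.

```python
import math

def overlap(t, tau):
    """Magnitude of the scalar product under the exponential model e^{-t/tau}."""
    return math.exp(-t / tau)

def time_to_reach(epsilon, tau):
    """Time at which the overlap has dropped to epsilon: t = tau * ln(1/epsilon)."""
    return tau * math.log(1.0 / epsilon)

tau = 1e-20                          # hypothetical decoherence time in seconds
t_star = time_to_reach(1e-12, tau)   # time for the overlap to reach 1e-12

print(overlap(0.0, tau))   # equal states at t = 0 give overlap 1
print(t_star / tau)        # number of decoherence times needed
```

Since the required time grows only logarithmically in ##1/\epsilon##, a few dozen decoherence times are enough to make the overlap negligible for any practical purpose, which is the sense in which the states are "approximately orthogonal".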

2019 Award
They are approximately orthogonal, not exactly orthogonal. The scalar product between them typically decays with time as ##e^{-t/\tau}## where ##\tau## is the decoherence time.
OK. You also claim that the ##\Phi(t)## are ultimately tensor product states between detector and environment. Is this also due to decoherence? It is not really relevant for the argument; but since you assumed it I am curious where it comes from.

Demystifier
OK. You also claim that the ##\Phi(t)## are ultimately tensor product states between detector and environment. Is this also due to decoherence? It is not really relevant for the argument; but since you assumed it I am curious where it comes from.
Yes, it also comes from decoherence.

2019 Award
In practice, this usually means that the interaction Hamiltonian commutes with an appropriate operator ##L## with eigenvalues ##l##.
I am still confused when I try to apply this to the particular situation of a Stern-Gerlach experiment. In the case of a spin up/down measurement, ##l## is particle position, hence ##L## is the position operator of the particle. Thus the interaction Hamiltonian is assumed to commute with position. But I cannot see how this squares with the fact that the magnet bends the particle paths.

Demystifier
I am still confused when I try to apply this to the particular situation of a Stern-Gerlach experiment. In the case of a spin up/down measurement, ##l## is particle position, hence ##L## is the position operator of the particle. Thus the interaction Hamiltonian is assumed to commute with position. But I cannot see how this squares with the fact that the magnet bends the particle paths.
1) Actually, it is the total Hamiltonian (not merely the interaction Hamiltonian) that would need to commute with an operator in order to not measure this observable. But in practice the free Hamiltonian can often (not always!) be neglected, in which case the requirement can be reduced to a requirement on the interaction Hamiltonian.

2) For spin measurement in http://arxiv.org/abs/1305.1280 you can see that the relevant Hamiltonian is (13). In this Hamiltonian the free kinetic part cannot be neglected and therefore the relevant Hamiltonian does not commute with position.

3) Indeed, with SG apparatus it is not true that position is not measured. By SG apparatus you also measure whether the particle goes up or down, which is a (partial) measurement of position. Of course, position commutes with spin, so nothing forbids a simultaneous measurement of both spin and position.

4) The Hamiltonian (13) commutes with momentum operators ##P_x## and ##P_y##, so the degeneracy label ##l## includes ##p_x## and ##p_y##.

2019 Award
1) Actually, it is the total Hamiltonian (not merely the interaction Hamiltonian) that would need to commute with an operator in order to not measure this observable. But in practice the free Hamiltonian can often (not always!) be neglected, in which case the requirement can be reduced to a requirement on the interaction Hamiltonian.
3) Indeed, with SG apparatus it is not true that position is not measured. By SG apparatus you also measure whether the particle goes up or down, which is a (partial) measurement of position. Of course, position commutes with spin, so nothing forbids a simultaneous measurement of both spin and position.
Thanks; this explains part of my concern, but not how it affects your proof. How are (1) and (3) accounted for in your proof in post #6? Wasn't it supposed to be a proof of the most general case? Or what must be modified to be general?

Demystifier
Thanks; this explains part of my concern, but not how it affects your proof. How are (1) and (3) accounted for in your proof in post #6? Wasn't it supposed to be a proof of the most general case? Or what must be modified?
I don't see any inconsistency with the general case in post #6. In particular, precisely because I wanted to be very general and abstract, in #6 I said nothing about Hamiltonians. I mentioned Hamiltonians in #8, but there I used the word "usually", which doesn't mean "always".

2019 Award
Ok, given the leeway you allow yourself in remaining vague, I have no other questions.

But it would be a much better piece of work (and could become a useful reference) if you would reformulate your result and proof, perhaps for an insight article?

To be maximally useful the result should be formulated as a theorem that explicitly states the silent assumptions used, in particular, that the argument depends a lot on the results of decoherence theory! Then you should give the complete proof (essentially everything discussed above) - but without the irrelevant split of the complement of the measured system into detector and rest of the universe, which complicates the notation and contributes no generality but detracts from the relevant details. If you want to complicate the notation the effort is better spent by allowing for spin. Also, you should introduce ##\psi_k## directly after the second formula, which keeps the remainder simpler. Finally you should give the spin measurement as a particular, nontrivial example in which you verify these conditions - including an explicit description of the (not completely trivial) POVM that actually does the job here, and references to a derivation of the decoherence properties relevant in this particular case. This would be valuable not only for purists like me but for everyone interested in the subject.

Had I been a referee of your paper I'd have requested a major revision and would have made this a requirement for publication!

Demystifier
Ok, given the leeway you allow yourself in remaining vague, I have no other questions.

But it would be a much better piece of work (and could become a useful reference) if you would reformulate your result and proof, perhaps for an insight article?

To be maximally useful the result should be formulated as a theorem that explicitly states the silent assumptions used, in particular, that the argument depends a lot on the results of decoherence theory! Then you should give the complete proof (essentially everything discussed above) - but without the irrelevant split of the complement of the measured system into detector and rest of the universe, which complicates the notation and contributes no generality but detracts from the relevant details. If you want to complicate the notation the effort is better spent by allowing for spin. Also, you should introduce ##\psi_k## directly after the second formula, which keeps the remainder simpler. Finally you should give the spin measurement as a particular, nontrivial example in which you verify these conditions - including an explicit description of the (not completely trivial) POVM that actually does the job here, and references to a derivation of the decoherence properties relevant in this particular case. This would be valuable not only for purists like me but for everyone interested in the subject.

Had I been a referee of your paper I'd have requested a major revision and would have made this a requirement for publication!
You are making a good point, but it's impossible to satisfy everybody. Another referee, who already knows all this, might require that the paper be shortened because there is no point in repeating the "well known" stuff. Anyway, that's why I don't write books (and admire those who do, including you): I find it too exhausting to make explicit all the details that I know only implicitly. In a short paper, or an even shorter forum post, one can concentrate on a smaller number of points that one finds more important in a given context, and this suits my writer-personality much better.

2019 Award
might require that the paper be shortened because there is no point in repeating the "well known" stuff.
Stating clearly the assumptions made is not repeating well-known stuff but standard practice in science. So is giving clear references to what one actually used.

Demystifier