# A simple explanation of the Born rule?

• I

## Summary:

Perhaps the Born rule can be understood by considering quantum transitions going both forward and backward in time simultaneously.

## Main Question or Discussion Point

The probability that an initial quantum state ##|\psi_i\rangle## becomes the final quantum state ##|\psi_f\rangle## is given by

\begin{eqnarray*}
P(i \rightarrow f) &=& |\langle\psi_f|\psi_i\rangle|^2 \tag{1}\\
&=& \langle\psi_f|\psi_i\rangle^*\langle\psi_f|\psi_i\rangle \\
&=& \langle\psi_i|\psi_f\rangle\langle\psi_f|\psi_i\rangle.
\end{eqnarray*}

Equation (1) seems to show that the probability for the transition (##i\rightarrow f##) can be interpreted as the system both moving forward in time (##i\rightarrow f##) with amplitude ##\langle\psi_f|\psi_i\rangle## and backward in time (##f\rightarrow i##) with amplitude ##\langle\psi_i|\psi_f\rangle## simultaneously.
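As a quick numerical check of the algebra in Equation (1), here is a minimal sketch in plain Python. The two states are hypothetical, chosen only so that the overlap is genuinely complex, and `inner` is an illustrative helper rather than anything from the thread. It only confirms that the squared modulus equals the product of the two conjugate amplitudes; it says nothing about how that product should be interpreted.

```python
import math

def inner(bra, ket):
    """Return <bra|ket>, conjugating the components of the first argument."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

s = 1 / math.sqrt(2)
# Hypothetical normalized two-level states, chosen only for illustration.
psi_i = [0.6 + 0j, 0.8j]
psi_f = [s + 0j, s + 0j]

amp_forward = inner(psi_f, psi_i)    # <psi_f|psi_i>
amp_backward = inner(psi_i, psi_f)   # <psi_i|psi_f>

p_modulus = abs(amp_forward) ** 2               # |<psi_f|psi_i>|^2
p_product = (amp_backward * amp_forward).real   # <psi_i|psi_f><psi_f|psi_i>

print(p_modulus, p_product)
```

The two printed values agree up to rounding error, because `amp_backward` is simply the complex conjugate of `amp_forward`.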

Does this reasoning help to explain the Born rule? (Is it like the Transactional Interpretation of QM?)

I guess we must experience the macroscopic direction of time (##i\rightarrow f##) in accord with increasing entropy in an expanding universe whereas microscopically QM works both forwards and backwards in time.

## Answers and Replies

DrClaude
Mentor
The probability that an initial quantum state ##|\psi_i\rangle## becomes the final quantum state ##|\psi_f\rangle## is given by

\begin{eqnarray*}
P(i \rightarrow f) &=& |\langle\psi_f|\psi_i\rangle|^2 \tag{1}\\
&=& \langle\psi_f|\psi_i\rangle^*\langle\psi_f|\psi_i\rangle \\
&=& \langle\psi_i|\psi_f\rangle\langle\psi_f|\psi_i\rangle.
\end{eqnarray*}

Equation (1) seems to show that the probability for the transition (##i\rightarrow f##) can be interpreted as the system both moving forward in time (##i\rightarrow f##) with amplitude ##\langle\psi_f|\psi_i\rangle## and backward in time (##f\rightarrow i##) with amplitude ##\langle\psi_i|\psi_f\rangle## simultaneously.
No. That is the probability that a system in state ##|\psi_i\rangle## at a given time will be measured in state ##|\psi_f\rangle## at that time. There would need to be some time-evolution operator in there to be a statement about the evolution of a system.

Does this reasoning help to explain the Born rule? (Is it like the Transactional Interpretation of QM?)
The Born rule is a postulate. It can't be explained by anything, just tested empirically.

Demystifier
Gold Member
The Born rule is a postulate. It can't be explained by anything, just tested empirically.
Well, Bohmian mechanics sort of explains the Born rule through the subquantum H-theorem. https://www.mdpi.com/1099-4300/20/6/422

DarMM
Gold Member
There are derivations of the Born rule, though they get quite technical and I wouldn't look at them when you're starting off.

One of the classics is Gleason's theorem. It says that if you assume sharp* measurements are correctly represented by projection operators (which is how QM represents them), then the Born rule (or really a more general form of it) is the only noncontextual way to assign probabilities to measurement outcomes. Here noncontextual means "does not depend on what measurement the outcome is a part of". A very simple example: in a set of three particles, the probability of finding one particle's spin aligned or anti-aligned along a given axis doesn't depend on the axes you choose to measure for the others.

There have since been improvements on this, like the work of Adán Cabello, who has found that if you assume:
(i) There are sharp measurements, or at least one can get arbitrarily close to one
(ii) They have discrete outcomes
(iii) It is possible to repeat a measurement and have it be statistically independent of the previous run
(iv) Information cannot be transmitted faster than light
then the set of probabilities that can be assigned to these experiments is the one given by QM (with the Born rule as a special case). That is, quantum correlations are the most general possible probabilities in a world with repeatable sharp measurements that have discrete outcomes and no spacelike propagation of information.

https://journals.aps.org/pra/abstract/10.1103/PhysRevA.100.032120

https://arxiv.org/abs/1801.06347

*Sharp means that if the measurement is repeated immediately, it gives the same result.

This is an improved version of the argument incorporating DrClaude's suggestion about time-evolution operators:

The probability that an initial quantum state ##|\psi_i\rangle## evolves to become the final quantum state ##|\psi_f\rangle## is given by

\begin{eqnarray*}
P_{i \rightarrow f} &=& |\langle\psi_f|U_{i \rightarrow f}|\psi_i\rangle|^2 \tag{1}\\
&=& \langle\psi_f|U_{i \rightarrow f}|\psi_i\rangle^*\langle\psi_f|U_{i \rightarrow f}|\psi_i\rangle \\
&=& \langle\psi_i|U^\dagger_{i \rightarrow f}|\psi_f\rangle\langle\psi_f|U_{i \rightarrow f}|\psi_i\rangle \\
&=& \langle\psi_i|U_{f \rightarrow i}|\psi_f\rangle\langle\psi_f|U_{i \rightarrow f}|\psi_i\rangle.
\end{eqnarray*}

Equation (1) seems to show that the probability ##P_{i\rightarrow f}## can be interpreted as the system evolving both forwards in time and backwards in time simultaneously.
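The identity used in the last two lines of the derivation, that conjugating the forward amplitude gives the amplitude built from ##U^\dagger##, can be checked numerically. The sketch below assumes a hypothetical two-level system whose evolution operator is ##U = e^{-i\theta\sigma_x}##; the angle, the states and the helper functions are illustrative choices, not something specified in the thread.

```python
import math

def inner(bra, ket):
    """Return <bra|ket>, conjugating the components of the first argument."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

def matvec(m, v):
    """Apply the square matrix m to the column vector v."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

def dagger(m):
    """Return the conjugate transpose of the square matrix m."""
    n = len(m)
    return [[m[c][r].conjugate() for c in range(n)] for r in range(n)]

theta = 0.3  # hypothetical rotation angle (assumption)
# U = exp(-i * theta * sigma_x), a unitary 2x2 matrix.
U = [[math.cos(theta) + 0j, -1j * math.sin(theta)],
     [-1j * math.sin(theta), math.cos(theta) + 0j]]

psi_i = [1 + 0j, 0j]   # hypothetical initial state
psi_f = [0j, 1 + 0j]   # hypothetical final state

amp_forward = inner(psi_f, matvec(U, psi_i))            # <psi_f|U|psi_i>
amp_backward = inner(psi_i, matvec(dagger(U), psi_f))   # <psi_i|U^dagger|psi_f>

p = (amp_backward * amp_forward).real
print(p, math.sin(theta) ** 2)
```

For this particular choice the probability works out to ##\sin^2\theta##, and `amp_backward` is the complex conjugate of `amp_forward`, as the derivation requires.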

Does this reasoning help to explain the Born rule?

DarMM
Gold Member
Well, there are results showing that quantum probabilities are the most general probabilities compatible with both reversibility and discrete outcomes.

I'm not sure, though, what you mean when you say your calculation there explains the Born rule.

Well, I guess I'm applying Gell-Mann's totalitarian principle (https://en.wikipedia.org/wiki/Totalitarian_principle): in QM, "Everything not forbidden is compulsory".

At the quantum level, below observable probabilities, there is nothing to stop time flowing both forwards and backwards.

Demystifier
Gold Member
Gleason's theorem ... where noncontextual means "does not depend on what measurement the outcome is a part of".
For comparison, how would you then explain the meaning of noncontextual in the Kochen-Specker theorem (as a theorem that proves that QM is contextual)?

DarMM
Gold Member
For comparison, how would you then explain the meaning of noncontextual in the Kochen-Specker theorem (as a theorem that proves that QM is contextual)?
This is a bit long-winded, half because this is an "I" thread and half because I just want to get everything down for lurkers at a similar level to the OP.

Gleason's theorem shows the Born rule as the only way to assign probabilities to outcomes modelled by projectors. Obviously probabilities are real numbers ##p \in \left[0,1\right]##. We then find this is necessarily of the form ##Tr\left(\rho E\right)## with ##E## a PVM element.
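To make the form ##Tr\left(\rho E\right)## concrete, here is a small sketch with a hypothetical state in a three-dimensional Hilbert space and two different PVMs that share the projector onto `e1`. The state, the bases and the helper `prob` are assumptions made for illustration. The sketch illustrates noncontextual probability assignment; it does not prove Gleason's theorem.

```python
import math

def prob(rho, v):
    """Return Tr(rho |v><v|) = <v|rho|v> for a unit vector v."""
    n = len(v)
    total = sum(v[r].conjugate() * rho[r][c] * v[c]
                for r in range(n) for c in range(n))
    return total.real

s = 1 / math.sqrt(2)
# Hypothetical pure state, used to build the density matrix rho = |psi><psi|.
psi = [0.5 + 0j, 0.5j, s + 0j]
rho = [[a * b.conjugate() for b in psi] for a in psi]

e1 = [1 + 0j, 0j, 0j]
# Two orthonormal bases (PVMs of rank-one projectors) that both contain e1.
context_a = [e1, [0j, 1 + 0j, 0j], [0j, 0j, 1 + 0j]]
context_b = [e1, [0j, s + 0j, s + 0j], [0j, s + 0j, -s + 0j]]

probs_a = [prob(rho, v) for v in context_a]
probs_b = [prob(rho, v) for v in context_b]

print(probs_a, sum(probs_a))
print(probs_b, sum(probs_b))
```

The probability assigned to the projector onto `e1` is the same in both lists, and each list sums to 1 up to rounding error. That is what "probability assignment is noncontextual" amounts to here: `prob` never needs to be told which PVM the projector belongs to.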

If we then want to refine this to value assignment, we need a truth functional that dictates which projectors are true and which are not. So rather than ##p \in \left[0,1\right]## we need ##\nu \in \left\{0,1\right\}##. We've already found from Gleason's theorem that the only way to assign any element of the real interval ##\left[0,1\right]## is via a density matrix ##\rho##. Since a density matrix gives a continuous function on PVM elements, it simply isn't possible for it to assign ##\nu \in \left\{0,1\right\}## to all projectors: a continuous function on the connected set of rank-one projectors that takes only the values 0 and 1 would have to be constant, which is incompatible with the values within each PVM summing to 1. Thus, as a corollary of Gleason's theorem, truth-value assignment is impossible for outcomes alone.

The Kochen-Specker theorem is just a stronger form of this result. Above we found that values cannot be assigned to all projectors consistently. The KS theorem shows that this occurs even for a finite set of projectors: as few as 18 in a 4D Hilbert space, for example. In general dimensions one only needs 39 projectors. This is much stronger, as you might have avoided the corollary of Gleason's theorem above by saying that maybe not all projectors correspond to physically possible measurements, which is probably true in general.[1]

Thus simple value assignment is not possible. We then know that to get around this you can assign values if they are of the form ##\nu_{M}\left(E\right)##, i.e. the value given to a projector depends on the PVM/decomposition of the identity being evaluated during an experiment.

This is where there is some slight variation in how the word contextual is used. Basically, "contextual" means "depends on the PVM in which the projector occurs". Probability assignment is not contextual. Value assignment, if you attempt it, is contextual.

Since QM only assigns probabilities some authors say it is noncontextual. Other authors just take contextual to mean "if you were to do value assignment it would be contextual, regardless of whether you do it or not". Thus they would say QM is contextual even though it doesn't actually assign values.

The latter meaning is winning out because it turns out that the inability to assign values noncontextually is what characterizes setups where quantum computers can give a computational speedup. This is true regardless of whether you actually think there are values or not, so it's best to have a name for it.

[1] We've discussed this before: it's responsible for the fact that interference observables for macroscopic systems probably have no physical meaning, and thus the macroscopic world has only one context of commuting observables. As Asher Peres says in his famous monograph, this allows us to define "classical" to mean that all physically realizable observables commute.

Demystifier
Gold Member
Probability assignment is not contextual. Value assignment, if you attempt it, is contextual.
I guess that would be the short version of your long answer.

DarMM
Gold Member
I expected a short simple answer, perhaps differing from "does not depend on what measurement the outcome is a part of" only in a couple of words.
Well, it depends on which version of the Kochen-Specker theorem one is referring to. The most general version has quite a long-winded statement of what "noncontextual" means. As I mentioned, it also depends on what one means by contextual. Under certain definitions QM is not contextual.

Demystifier
Gold Member
Well, it depends on which version of the Kochen-Specker theorem one is referring to. The most general version has quite a long-winded statement of what "noncontextual" means. As I mentioned, it also depends on what one means by contextual. Under certain definitions QM is not contextual.
Thanks for the answer. Note, however, that in the meantime I deleted my question and wrote another comment.

DarMM
Gold Member
I guess that would be the short version of your long answer.
Yes that'd probably be the gist of it.

Just to say in the most general version of the theorem there are two levels of contextuality possible for a value assignment:
$$\nu_{M,\Theta}\left(E\right)$$
where the truth value of a projector can depend on both the PVM under which it is evaluated ##M## and the physical realization of that PVM ##\Theta##.

bhobba
Mentor
Here is a simple 'proof' based on the wrong assumption von Neumann made in his proof of no hidden variables: that expectation values must be additive in the observables. It's an interesting story how, due to von Neumann's well-deserved reputation, nobody looked at it carefully until about when Gleason came up with his theorem. The flaw was picked up by Grete Hermann, but she was ignored; after all, it's John von Neumann. Not one of science's finest hours. After Gleason's proof, criticism of the error came thick and fast; as Bell said, once you understand it you see it's silly. But for starting students you can use it to give a slick, simple justification of Born's rule, while pointing out that the assumption is wrong. You can, however, prove the rule assuming noncontextuality, which is what Gleason does. Still, it's better than pulling it out of a hat, as is usually done.

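The assumption is quite intuitive: the expectation of the observable ##A+B##, ##E(A+B)##, is ##E(A) + E(B)##.

First, it's easy to check that ##\langle b_i|O|b_j\rangle = \mathrm{Tr}\left(O\,|b_j\rangle\langle b_i|\right)##, so

$$O = \sum_{i,j} \langle b_i|O|b_j\rangle\, |b_i\rangle\langle b_j| = \sum_{i,j} \mathrm{Tr}\left(O\,|b_j\rangle\langle b_i|\right) |b_i\rangle\langle b_j|.$$

Now we use the linearity we have assumed:

$$E(O) = \sum_{i,j} \mathrm{Tr}\left(O\,|b_j\rangle\langle b_i|\right) E\left(|b_i\rangle\langle b_j|\right) = \mathrm{Tr}\Big(O \sum_{i,j} E\left(|b_i\rangle\langle b_j|\right) |b_j\rangle\langle b_i|\Big).$$

Define ##P = \sum_{i,j} E\left(|b_i\rangle\langle b_j|\right) |b_j\rangle\langle b_i|## and we have ##E(O) = \mathrm{Tr}(OP)##.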

##P##, by definition, is called the state of the quantum system. The following are easily seen. Since ##E(I) = 1##, ##\mathrm{Tr}(P) = 1##, so ##P## has unit trace. Note also that for any unit vector ##|u\rangle##, the projector ##|u\rangle\langle u|## has only the eigenvalues 0 and 1, so its expectation ##E\left(|u\rangle\langle u|\right) = \mathrm{Tr}\left(|u\rangle\langle u| P\right) = \langle u|P|u\rangle## is non-negative. A nice challenge is to make that more rigorous. Hence ##P## is a positive operator of unit trace.
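The construction of P can be tried out numerically. The sketch below needs some concrete linear expectation functional to start from, so it assumes, purely for illustration, that E is secretly given by a hypothetical 2x2 density matrix `rho`. That is circular as a derivation, but it shows that the recipe "P = sum of E(|bi><bj|) |bj><bi|" recovers the state and reproduces E(O) = Trace (OP). The matrices and helper names are made up for the example.

```python
def trace_of_product(a, b):
    """Return Trace(a b) for two square matrices of equal size."""
    n = len(a)
    return sum(a[r][c] * b[c][r] for r in range(n) for c in range(n))

def ketbra(i, j, n):
    """Matrix of |b_i><b_j| in the basis {b_k}: a single 1 at row i, column j."""
    return [[1 + 0j if (r, c) == (i, j) else 0j for c in range(n)]
            for r in range(n)]

n = 2
# Hypothetical density matrix (Hermitian, unit trace, positive), an assumption.
rho = [[0.7 + 0j, 0.2 - 0.1j],
       [0.2 + 0.1j, 0.3 + 0j]]

def E(X):
    """Assumed linear expectation functional, here taken to be Trace(rho X)."""
    return trace_of_product(rho, X)

# Build P = sum_ij E(|b_i><b_j|) |b_j><b_i| entry by entry.
P = [[0j] * n for _ in range(n)]
for i in range(n):
    for j in range(n):
        P[j][i] += E(ketbra(i, j, n))   # coefficient of |b_j><b_i|

# Hypothetical Hermitian observable used to compare both sides.
O = [[1 + 0j, 2 - 1j],
     [2 + 1j, -1 + 0j]]

lhs = E(O)                     # expectation from the functional
rhs = trace_of_product(O, P)   # Trace(O P) with the reconstructed P
trace_P = sum(P[k][k] for k in range(n))

print(lhs, rhs, trace_P)
```

Both expectation values agree, P comes out equal to `rho`, and its trace is 1 up to rounding error.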

Thanks
Bill
