# Deriving the Born Rule from the operator-centric formulation

1. Apr 25, 2012

### Hurkyl

Staff Emeritus
The issue of deriving the Born rule comes up from time to time in the forum, and I'm always a little mystified by some of the opinions people have. I recently fleshed out the details of the derivation I'm about to present and realized it pretty much makes the Born rule a triviality, so I thought I'd present it to see how others take it.

Also, it shows off the operator-focused picture of quantum mechanics, which I like.

As an aside, this is an "external" derivation. (For the record, I generally prefer "internal" ones, but those seem to stir up much controversy -- especially from those who think internal treatments are impossible by definition.)

Recall that the main thing about the operator-centric picture of quantum mechanics is that the central notion is that there is an algebra of operators. This algebra is supposed to represent what sorts of "measurements" can be done, and possibly more general things. There are a variety of features of this algebra, but the main thing I want is the following: there is a quantity $||x||$ that has the meaning
$||x||$ is the magnitude of the largest possible outcome of the measurement $x$.
(really this should be phrased as a supremum, rather than in terms of a maximum)

One property of a "quantum state" $\psi$ is that you can combine a state with an operator $x$ to produce a complex number $E_\psi(x)$ to be interpreted as something like the "expected value of the operator over that state". This function has various properties, such as
$$| E_\psi(x) | \leq || x ||$$
which is obvious from its intended interpretation. As an aside, by listing out an adequate list of the properties that $E_\psi$ must satisfy, one can even go so far as to define a quantum state to be a function that satisfies those properties.
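As a quick numerical sanity check of that bound, here is a minimal sketch. It assumes a finite-dimensional matrix representation with $E_\psi(x) = \langle\psi|x|\psi\rangle$ and the operator 2-norm standing in for $||x||$; neither assumption is needed by the argument itself, and the helper names `expectation` and `operator_norm` are made up for illustration.

```python
import numpy as np

def expectation(psi, x):
    """Return <psi|x|psi> for a normalized state vector psi."""
    return np.vdot(psi, x @ psi)  # vdot conjugates its first argument

def operator_norm(x):
    """Largest singular value of x, used here as a stand-in for ||x||."""
    return np.linalg.norm(x, 2)

rng = np.random.default_rng(0)
n = 4
# A random complex matrix; it need not be Hermitian for the bound to hold.
x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
# A random normalized state vector.
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)

e = expectation(psi, x)
bound_holds = abs(e) <= operator_norm(x) + 1e-12
print(abs(e), operator_norm(x), bound_holds)
```

In this representation the bound is just Cauchy-Schwarz, so the check should pass for any normalized `psi`.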

If we assume our operators form a C*-algebra, the properties of $E_\psi$ can be summed up by saying it's a "positive linear functional of norm 1". I'm not making that assumption, but I'm going to make some similar requirements on $E_\psi$ to satisfy the properties I require below.

Now, given that we intend to interpret $E_\psi(x)$ as the "expected value" of the operator x, we can also compute the higher moments $m_k = E_\psi(x^k)$ of the operator x over the state $\psi$.

Now, I'm pretty sure the properties of $E_\psi$ ensure that there is an actual probability distribution that has these moments $m_k$ about zero -- and that this probability distribution is unique -- and thus we can say that this is the probability distribution of the outcomes of $x$ over the state $\psi$. But I feel (ATM) it's enough to treat the case of a qubit, where we don't need anything fancy.
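To make the moment idea concrete, here is a hedged sketch for the finite-dimensional case only. It assumes $x$ is a Hermitian matrix and $E_\psi(x) = \langle\psi|x|\psi\rangle$, and compares the moments $m_k$ with those of the discrete distribution that puts weight $|\langle e_i|\psi\rangle|^2$ on each eigenvalue. The function names are hypothetical.

```python
import numpy as np

def moments_from_state(psi, x, kmax):
    """m_k = <psi|x^k|psi> for k = 0..kmax."""
    return [np.vdot(psi, np.linalg.matrix_power(x, k) @ psi).real
            for k in range(kmax + 1)]

def spectral_distribution(psi, x):
    """Eigenvalues of Hermitian x and the weights |<e_i|psi>|^2."""
    eigvals, eigvecs = np.linalg.eigh(x)
    weights = np.abs(eigvecs.conj().T @ psi) ** 2
    return eigvals, weights

rng = np.random.default_rng(1)
n = 3
a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
x = (a + a.conj().T) / 2          # Hermitian "observable"
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)

outcomes, probs = spectral_distribution(psi, x)
m_state = moments_from_state(psi, x, 5)
m_dist = [float(np.sum(probs * outcomes ** k)) for k in range(6)]
print(np.allclose(m_state, m_dist))
```

This only illustrates existence in a small matrix example; it says nothing about uniqueness or about operators that are not matrices.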

The operator corresponding to measuring a qubit about some orientation satisfies $x^2 = x$; intuitively, this is because the "outcomes" are 0 and 1. The properties of $E_\psi$ ensure $0 \leq E_\psi(x) \leq 1$, and clearly all of the moments $E_\psi(x^k)$ with $k \geq 1$ are equal, since $x^k = x$. This corresponds to the probability distribution
• $P(x = 0) = 1 - E_\psi(x)$
• $P(x = 1) = E_\psi(x)$
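The qubit case can be checked numerically. The sketch below assumes a concrete 2x2 representation in which $x$ is the projector onto spin-up along a direction $(\theta, \phi)$; the helper `projector` and the sample state are illustrative choices, not part of the argument.

```python
import numpy as np

def projector(theta, phi):
    """Projector onto the spin-up state along direction (theta, phi)."""
    v = np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])
    return np.outer(v, v.conj())

x = projector(0.7, 1.3)
psi = np.array([0.6, 0.8j])       # normalized: 0.36 + 0.64 = 1

# x^2 = x, so every moment E_psi(x^k) with k >= 1 equals E_psi(x).
idempotent = np.allclose(x @ x, x)
moments = [np.vdot(psi, np.linalg.matrix_power(x, k) @ psi).real
           for k in range(1, 6)]

p1 = moments[0]                   # P(x = 1) = E_psi(x)
p0 = 1 - p1                       # P(x = 0) = 1 - E_psi(x)
# Moments of the distribution on {0, 1} with these weights.
bernoulli_moments = [0 ** k * p0 + 1 ** k * p1 for k in range(1, 6)]
print(idempotent, np.allclose(moments, bernoulli_moments))
```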

Born rule derived. QED

Oh wait, you were expecting me to prove the ket version, that the probability of measuring the state $a |0\rangle + b |1\rangle$ to be 0 or 1 is $|a|^2$ or $|b|^2$, weren't you? It's a mere triviality at this point; the very definition of the phrase
The ket $|\psi\rangle$ represents the quantum state $\psi$
(in some representation of our algebra as being actual operators) is that we have an identity
$$E_\psi(x) = \langle \psi | x | \psi \rangle$$
The usual form of the Born rule is not some foundational assumption of quantum mechanics; it's just a simple consequence of what we mean by using a ket to represent a quantum state.
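For the ket $a|0\rangle + b|1\rangle$, the identity above can be evaluated directly. This is a small sketch with arbitrarily chosen amplitudes; `P0` and `P1` are the projectors onto $|0\rangle$ and $|1\rangle$ in the computational basis, and `E` is a hypothetical helper name.

```python
import numpy as np

def E(psi, x):
    """E_psi(x) = <psi|x|psi> in a matrix representation."""
    return np.vdot(psi, x @ psi)

a, b = 0.6, 0.8j                  # sample amplitudes with |a|^2 + |b|^2 = 1
psi = np.array([a, b])

P0 = np.array([[1, 0], [0, 0]])   # projector onto |0>
P1 = np.array([[0, 0], [0, 1]])   # projector onto |1>

prob0 = E(psi, P0).real           # should agree with |a|^2
prob1 = E(psi, P1).real           # should agree with |b|^2
print(prob0, abs(a) ** 2, prob1, abs(b) ** 2)
```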

I assert that opinions otherwise are really talking about something like "Why does time evolution take the form of the Schrödinger equation if we use the ket representation of a state?"

What time evolution really is, is a foundational assumption of quantum mechanics. That probabilities correspond to something in the real world (and what precisely they correspond to) is also a foundational assumption and a matter of interpretation.

But the formula for computing probabilities from kets is a banality, and the specific form of time evolution in a particular way of representing things is just a matter of translation. If we had some other representation, the equation would take some other form.

Last edited: Apr 25, 2012
2. Apr 25, 2012

### Fredrik

Staff Emeritus
I don't understand all the details of the proof you sketched, but I don't have a problem with this kind of derivation. It's certainly possible to prove that if we want to use "the rest of QM" (which doesn't mention probabilities) to assign probabilities to measurement results, we must use the Born rule. However, I don't think it's possible to use "the rest of QM" to show that we should be assigning non-trivial probabilities to measurement results.

I also have a problem with derivations that assume that the Hilbert space of a composite system is the tensor product of the Hilbert spaces of the subsystems, because this is not a natural assumption. It's something we would normally derive from the Born rule. The tensor product stuff can be derived from other assumptions, but I think all of those can be derived from the Born rule too. I would need a good reason to think of these alternative assumptions as the reason for the Born rule rather than the other way round.

You're probably already familiar with this, but there's a very cool derivation of the Born rule in the Hilbert space approach too. You define a state as a probability measure on the lattice of Hilbert subspaces, and prove that for each state μ, there's a unique state operator ρ, such that for all Hilbert subspaces M, we have $\mu(M)=\operatorname{Tr}(\rho P_M)$, where $P_M$ is the projection operator for the subspace M. This is Gleason's theorem (which requires a Hilbert space of dimension at least 3). The formula reduces to the Born rule when ρ is pure and M is 1-dimensional:
$$\operatorname{Tr}(\rho P_M) =\operatorname{Tr}\big(|\psi\rangle\langle\psi |\phi\rangle\langle\phi|\big) =|\langle\psi|\phi\rangle|^2.$$
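The last identity is easy to verify numerically. A minimal sketch, assuming $\rho = |\psi\rangle\langle\psi|$ and $P_M = |\phi\rangle\langle\phi|$ for randomly chosen unit vectors in a 3-dimensional space (the variable names are illustrative):

```python
import numpy as np

def normalize(v):
    """Scale a vector to unit norm."""
    return v / np.linalg.norm(v)

rng = np.random.default_rng(2)
psi = normalize(rng.normal(size=3) + 1j * rng.normal(size=3))
phi = normalize(rng.normal(size=3) + 1j * rng.normal(size=3))

rho = np.outer(psi, psi.conj())   # pure state operator |psi><psi|
P_M = np.outer(phi, phi.conj())   # projector onto the 1-dimensional subspace M

lhs = np.trace(rho @ P_M).real    # Tr(rho P_M)
rhs = abs(np.vdot(psi, phi)) ** 2 # |<psi|phi>|^2
print(lhs, rhs)
```

This checks only the final reduction to the Born rule, not Gleason's theorem itself.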

3. Apr 25, 2012

### Hurkyl

Staff Emeritus
There is a natural thing in the operator version.

Let $\mathcal{A}, \mathcal{B}$ be the algebras of "observables" on two disjoint (but not necessarily "independent") systems.

For the union of the two systems, the natural thing is to simply define the algebra of observables as the algebra that contains $\mathcal{A}$ and $\mathcal{B}$ with no other postulates. In the language of category theory, we want the coproduct (which can also be viewed as a special kind of pushout). For C*-algebras, this algebra is called the "free product" of $\mathcal{A}$ and $\mathcal{B}$.

The tensor product $\mathcal{A} \otimes \mathcal{B}$ comes from making the additional requirement that the two algebras commute with each other; that $AB = BA$ for $A \in \mathcal{A}$ and $B \in \mathcal{B}$.

There are a variety of plausible arguments for commutativity:
• One could just take it as a definition of "independent", and observe that this definition agrees empirically with our real world notion of "independent"
• One could argue that the set of "outcomes" of AB ought to be the products of an outcome from A with an outcome of B, and that the outcomes of BA ought to be the same, and so we should insist AB=BA
• One could make a mathematical statement of independence:
$$E_\psi(A_1 B_1 A_2 B_2 \cdots A_n B_n) = E_\psi(A_1 A_2 \cdots A_n) E_\psi(B_1 B_2 \cdots B_n)$$
that indicates we shouldn't care about the ordering of products
• One could argue that every state of the joint system should be expressible in terms of product states -- i.e. choosing a state $\psi$ from the first system and a state $\varphi$ from the second system, and we should have
$$E_{\psi, \varphi}(A_1 B_1 A_2 B_2 \cdots A_n B_n) = E_\psi(A_1 A_2 \cdots A_n) E_\varphi(B_1 B_2 \cdots B_n)$$
and so no information is lost by asserting commutativity.

In any case, it's clear that if you have unitary representations of $\mathcal{A}$ and $\mathcal{B}$ acting on Hilbert spaces $\mathcal{H}_1, \mathcal{H}_2$, then there is a representation of $\mathcal{A} \otimes \mathcal{B}$ acting on $\mathcal{H}_1 \otimes \mathcal{H}_2$. I bet there's some compelling statement about irreducible representations that I don't know that one can invoke here.
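Here is a small sketch of that last point in the matrix case, assuming $\mathcal{A}$ and $\mathcal{B}$ are represented by 2x2 and 3x3 matrices and embedded as $A \otimes 1$ and $1 \otimes B$ via `np.kron`. The helper names are hypothetical. It checks that the embedded operators commute and that expectations over a product state factorize.

```python
import numpy as np

def embed_first(A, dim_b):
    """Represent A on the joint space as A (x) 1."""
    return np.kron(A, np.eye(dim_b))

def embed_second(B, dim_a):
    """Represent B on the joint space as 1 (x) B."""
    return np.kron(np.eye(dim_a), B)

def unit(v):
    return v / np.linalg.norm(v)

rng = np.random.default_rng(3)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A1 = embed_first(A, 3)
B1 = embed_second(B, 2)

# Operators from the two subsystems commute on the tensor product space.
commute = np.allclose(A1 @ B1, B1 @ A1)

# For a product state, E(AB) = E_psi(A) * E_varphi(B).
psi = unit(rng.normal(size=2) + 1j * rng.normal(size=2))
varphi = unit(rng.normal(size=3) + 1j * rng.normal(size=3))
joint = np.kron(psi, varphi)
lhs = np.vdot(joint, A1 @ B1 @ joint)
rhs = np.vdot(psi, A @ psi) * np.vdot(varphi, B @ varphi)
factorizes = np.isclose(lhs, rhs)
print(commute, factorizes)
```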

(Most of my experience with this stuff is in commutative algebra without norms being involved, so there are probably some technical details omitted. E.g. I think there are "maximum" and "minimum" tensor products, and I want the maximum. I had never heard of such a thing before today.)

4. Apr 25, 2012

### Gordon Watson

Hurkyl,

Quick comment, in passing; may not be relevant.

As I see it: "It is trivially true (being an identity), that any probability density ρ(x) can be represented by the absolute-square of a complex Fourier polynomial ψ(x)."

PS: Froehner ... http://www.fritz-froehner.de/link01.htm ... has another (earlier, ca 1915) proof. So Born was late on the scene; sort of.

GW

Last edited by a moderator: May 5, 2017
5. Apr 26, 2012

### strangerep

OK. A "spectral norm", I guess?

OK, so we're talking (presumably) about a projection operator.

I would have expected
$$P(x = 1) ~=~ \left| E_\psi(x) \right|^2$$
(or did I miss something?)

BTW, if we start with a set of projection operators with discrete finite spectra, the Born rule is just the N-dimensional version of
$$\cos^2(\theta) + \cos^2(\pi/2 - \theta) ~=~ 1 ~.$$
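One way to read the N-dimensional version is that the squared direction cosines of a unit vector with respect to any orthonormal basis sum to 1. The sketch below assumes real vectors for simplicity; the random basis comes from a QR decomposition and is purely illustrative.

```python
import numpy as np

# Two-dimensional case from the formula above.
theta = 0.4
two_dim = np.cos(theta) ** 2 + np.cos(np.pi / 2 - theta) ** 2

# N-dimensional case: a random unit vector and a random orthonormal basis.
rng = np.random.default_rng(4)
N = 5
psi = rng.normal(size=N)
psi /= np.linalg.norm(psi)
basis, _ = np.linalg.qr(rng.normal(size=(N, N)))  # columns are orthonormal

cosines = basis.T @ psi           # cos(theta_i) = <e_i|psi>
total = np.sum(cosines ** 2)
print(two_dim, total)
```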

6. Apr 27, 2012

### Physics Monkey

Isn't this because $E_\psi(x) = 0 \cdot P(x=0) + 1 \cdot P(x=1)$ which gives $P(x=1) = E_\psi(x)$?

7. Apr 27, 2012

### Hurkyl

Staff Emeritus
There are a variety of ways to define the norm. For a C*-algebra, the supremum of the magnitude of the elements of the spectrum is such a definition. I was mainly just taking an aside to point out that the norm has a very natural interpretation independent of any issue of probability, since I've previously seen it claimed otherwise.
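For matrices this can be checked directly, with one caveat worth flagging: the spectral description of the norm holds for self-adjoint (more generally, normal) elements, not for arbitrary ones. The sketch below uses the operator 2-norm as the C*-norm of a matrix algebra and includes a nilpotent matrix as a non-normal counterexample.

```python
import numpy as np

rng = np.random.default_rng(5)
a = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
x = (a + a.conj().T) / 2          # Hermitian, so its spectrum is real

# Largest magnitude of an element of the spectrum vs. the operator norm.
spectral_radius = np.max(np.abs(np.linalg.eigvalsh(x)))
op_norm = np.linalg.norm(x, 2)

# A non-normal element: the spectrum is {0} but the norm is not 0.
nilpotent = np.array([[0.0, 1.0], [0.0, 0.0]])
nil_radius = np.max(np.abs(np.linalg.eigvals(nilpotent)))
nil_norm = np.linalg.norm(nilpotent, 2)
print(spectral_radius, op_norm, nil_radius, nil_norm)
```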

I haven't said a single thing about wavefunctions. They don't enter the picture until after "QED". As far as this derivation is concerned, $E_\psi$ is simply the function whose value is the alleged 'expected value' of an operator. In C*-terms, one can work out that this means $E_\psi$ is a positive linear functional of norm 1.

The high level view of my argument is that there are two important things about observables and states:
• Observables form an algebra of some sort,
• You can combine a state with an observable to produce a probability distribution.
The Born rule appears not as a fundamental hypothesis of QM, but instead merely as a choice of using a certain scheme for representing states with computationally convenient mathematical objects.

The issue is simply muddled because historically, the representation came first.

8. Apr 27, 2012

### strangerep

I don't see where the first equation came from. ($\psi$ was an arbitrary state, iiuc.)

9. Apr 27, 2012

### strangerep

I have no problem with this part. (I prefer the operator-centric approach also.)

It just seemed that your initial use of $E_\psi(x)$ was inconsistent with its later representation as $\langle \psi|x|\psi\rangle$. Shouldn't the latter have been $|\langle \psi|x|\psi\rangle|^2$ ?

10. Apr 28, 2012

### Hurkyl

Staff Emeritus
No, the former is what we want -- the former is quadratic in $|\psi \rangle$ which is where the usual squared amplitudes come from. The latter would give fourth powers. Also, the latter cannot be correct, for the reason that it is always real, whereas $E_\psi(i) = i E_\psi(1) = i$.
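Both points can be seen in a small matrix example. This is a sketch assuming $E_\psi(x) = \langle\psi|x|\psi\rangle$ in a 2x2 representation; the deliberately unnormalized vector `scaled` is there only to expose how each expression scales with the ket.

```python
import numpy as np

def E(psi, x):
    """E_psi(x) = <psi|x|psi> in a matrix representation."""
    return np.vdot(psi, x @ psi)

psi = np.array([0.6, 0.8j])       # normalized ket
identity = np.eye(2)

# Linearity: E_psi(i * 1) = i * E_psi(1) = i, which is not real.
value = E(psi, 1j * identity)

# Scaling the ket by c multiplies <psi|x|psi> by |c|^2 (quadratic),
# while |<psi|x|psi>|^2 picks up |c|^4 (quartic).
x = np.array([[0, 0], [0, 1]])    # projector onto |1>
c = 3.0
scaled = c * psi
ratio_linear = (E(scaled, x) / E(psi, x)).real
ratio_squared = abs(E(scaled, x)) ** 2 / abs(E(psi, x)) ** 2
print(value, ratio_linear, ratio_squared)
```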

11. Apr 28, 2012

### strangerep

OK, I see what you're saying now...
Linear functional acting on a projection operator.

BTW, what about observables with unbounded spectra?
Your original post doesn't seem to cover this.