Deriving the Born Rule from the operator-centric formulation

In summary, the conversation discusses different methods of deriving the Born rule in quantum mechanics, with one being an operator-centric approach and the other being a Hilbert space approach. The operator-centric approach involves using an algebra of operators and defining a function that represents the expected value of an operator over a state. The Hilbert space approach involves defining a state as a probability measure on the lattice of Hilbert subspaces and using Gleason's theorem to derive the Born rule. Both approaches ultimately lead to the same result: the Born rule is a consequence of the interpretation of kets as quantum states.
  • #1
Hurkyl
The issue of deriving the Born rule comes up from time to time in the forum, and I'm always a little mystified by some of the opinions people have. I recently fleshed out the details of the derivation I'm about to present and realized it pretty much makes the Born rule a triviality, so I thought I'd present it to see how others take it.

Also, it shows off the operator-focused picture of quantum mechanics, which I like. :smile:

As an aside, this is an "external" derivation. (For the record, I generally prefer "internal" ones, but those seem to stir up much controversy -- especially from those who think internal treatments are impossible by definition.)

Recall that the main thing about the operator-centric picture of quantum mechanics is that the central notion is an algebra of operators. This algebra is supposed to represent what sorts of "measurements" can be done, and possibly more general things. There are a variety of features of this algebra, but the main thing I want is the following: there is a quantity [itex]||x||[/itex] that has the meaning
[itex]||x||[/itex] is the magnitude of the largest outcome possible of the measurement.​
(really this should be phrased as a supremum, rather than in terms of a maximum)

One property of a "quantum state" [itex]\psi[/itex] is that you can combine a state with an operator [itex]x[/itex] to produce a complex number [itex]E_\psi(x)[/itex] to be interpreted as something like the "expected value of the operator over that state". This function has various properties, such as
[tex]| E_\psi(x) | \leq || x ||[/tex]​
which is obvious from its intended interpretation. As an aside, by listing out an adequate list of the properties that [itex]E_\psi[/itex] must satisfy, one can even go so far as to define a quantum state to be a function that satisfies those properties.

If we assume our operators form a C*-algebra, the properties of [itex]E_\psi[/itex] can be summed up by saying it's a "positive linear functional of norm 1". I'm not making that assumption, but I'm going to make some similar requirements on [itex]E_\psi[/itex] to satisfy the properties I require below.

Now, given that we intend to interpret [itex]E_\psi(x)[/itex] as the "expected value" of the operator x, we can also compute the higher moments [itex]m_k = E_\psi(x^k)[/itex] of the operator x over the state [itex]\psi[/itex].

Now, I'm pretty sure the properties of [itex]E_\psi[/itex] ensure that there is an actual probability distribution that has these moments [itex]m_k[/itex] about zero -- and that probability distribution is unique -- and thus we can say that this is the probability distribution of the outcomes of x over the state [itex]\psi[/itex]. But I feel (ATM) it's enough to treat the case of a qubit, where we don't need anything fancy.
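
As a rough numerical illustration of the moment claim, here is a minimal sketch. It assumes a concrete matrix representation in which [itex]E_\psi(x) = \langle \psi | x | \psi \rangle[/itex], which the argument itself does not need at this stage; the matrix `x` and the vector `psi` are hypothetical values chosen only for the example. It checks that the moments computed from the state agree with the moments about zero of a distribution supported on the eigenvalues of x.

```python
import numpy as np

# Hypothetical Hermitian observable and normalized ket, for illustration only.
x = np.array([[2, 1, 0],
              [1, 0, 1j],
              [0, -1j, -1]], dtype=complex)
psi = np.array([1, 1j, -1], dtype=complex) / np.sqrt(3)

# Candidate distribution: the outcomes are the eigenvalues of x, weighted by
# the squared overlaps of psi with the corresponding eigenvectors.
outcomes, eigenvectors = np.linalg.eigh(x)
probabilities = np.abs(eigenvectors.conj().T @ psi) ** 2

# Moments m_k = E_psi(x^k), computed directly from the state ...
moments_from_state = [
    np.vdot(psi, np.linalg.matrix_power(x, k) @ psi).real for k in range(6)
]
# ... and the moments about zero of the candidate distribution.
moments_from_distribution = [
    float(np.sum(probabilities * outcomes ** k)) for k in range(6)
]

print(np.allclose(moments_from_state, moments_from_distribution))
```

This only checks agreement for one example; it says nothing about uniqueness, which is the part that needs the properties of [itex]E_\psi[/itex].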

The operator corresponding to measuring a qubit about some orientation satisfies [itex]x^2 = x[/itex]; intuitively because the "outcomes" are 0 and 1. The properties of E ensure [itex]0 \leq E_\psi(x) \leq 1[/itex] and clearly, all of the moments [itex]E_\psi(x^k)[/itex] are equal. This corresponds to the probability distribution
  • [itex]P(x = 0) = 1 - E_\psi(x)[/itex]
  • [itex]P(x = 1) = E_\psi(x)[/itex]
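
One way to spell out this step, as a sketch that assumes only that the outcomes are 0 and 1 (the symbol [itex]p[/itex] is introduced here for convenience): a distribution on [itex]\{0, 1\}[/itex] with [itex]P(x = 1) = p[/itex] has moments about zero
[tex]m_k = 0^k \cdot (1 - p) + 1^k \cdot p = p \qquad (k \geq 1)[/tex]
and matching these against [itex]m_k = E_\psi(x^k) = E_\psi(x)[/itex] forces [itex]p = E_\psi(x)[/itex], with [itex]P(x = 0) = 1 - p[/itex] fixed by normalization.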

Born rule derived. QED

Oh wait, you were expecting me to prove the ket version, that the probability of measuring the state [itex]a |0\rangle + b |1\rangle[/itex] to be 0 or 1 is [itex]|a|^2[/itex] or [itex]|b|^2[/itex], weren't you? It's a mere triviality at this point; the very definition of the phrase
The ket [itex]|\psi\rangle[/itex] represents the quantum state [itex]\psi[/itex]​
(in some representation of our algebra as being actual operators) is that we have an identity
[tex]E_\psi(x) = \langle \psi | x | \psi \rangle[/tex]​
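
A quick numerical check of this identity for the qubit case, as a minimal sketch: the amplitudes `a` and `b` are hypothetical values chosen for illustration, and `x` is taken to be the projector onto [itex]|1\rangle[/itex] in the standard basis, which is an assumption about the representation.

```python
import numpy as np

def expectation(psi, x):
    """E_psi(x) = <psi| x |psi> for a normalized ket psi (1-D complex array)."""
    return np.vdot(psi, x @ psi)  # vdot conjugates its first argument

# Hypothetical amplitudes with |a|^2 + |b|^2 = 1.
a, b = 0.6, 0.8j
psi = np.array([a, b], dtype=complex)

# Projector onto |1>: the measurement whose outcomes are 0 and 1.
x = np.array([[0, 0], [0, 1]], dtype=complex)

p1 = expectation(psi, x)              # P(x = 1) = E_psi(x)
p0 = expectation(psi, np.eye(2) - x)  # P(x = 0) = 1 - E_psi(x)

# Since x^2 = x, every moment E_psi(x^k) with k >= 1 equals E_psi(x).
moments = [expectation(psi, np.linalg.matrix_power(x, k)) for k in range(1, 5)]

print(p0.real, p1.real, abs(a) ** 2, abs(b) ** 2)
```

The expectation value is quadratic in the ket, which is where the squared amplitudes come from; nothing is squared afterwards.
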
The usual form of the Born rule is not some foundational assumption of quantum mechanics; it's just a simple consequence of what we mean by using a ket to represent a quantum state.

I assert that opinions otherwise are really talking about something like "Why does time evolution take the form of the Schrödinger equation if we use the ket representation of a state?"

What time evolution really is, is a foundational assumption of quantum mechanics. That probabilities correspond to something in the real world (and what precisely they correspond to) is also a foundational assumption and a matter of interpretation.

But the formula for computing probabilities from kets is a banality, and the specific form of time evolution in a particular way of representing things is just a matter of translation. If we had some other representation, the equation would take some other form.
 
  • #2
I don't understand all the details of the proof you sketched, but I don't have a problem with this kind of derivation. It's certainly possible to prove that if we want to use "the rest of QM" (which doesn't mention probabilities) to assign probabilities to measurement results, we must use the Born rule. However, I don't think it's possible to use "the rest of QM" to show that we should be assigning non-trivial probabilities to measurement results.

I also have a problem with derivations that assume that the Hilbert space of a composite system is the tensor product of the Hilbert spaces of the subsystems, because this is not a natural assumption. It's something we would normally derive from the Born rule. The tensor product stuff can be derived from other assumptions, but I think all of those can be derived from the Born rule too. I would need a good reason to think of these alternative assumptions as the reason for the Born rule rather than the other way round.

You're probably already familiar with this, but there's a very cool derivation of the Born rule in the Hilbert space approach too. You define a state as a probability measure on the lattice of Hilbert subspaces, and prove that for each state μ, there's a unique state operator ρ, such that for all Hilbert subspaces M, we have ##\mu(M)=\operatorname{Tr}(\rho P_M)##, where ##P_M## is the projection operator for the subspace M. This is Gleason's theorem. The formula reduces to the Born rule when ρ is pure and M is 1-dimensional:
$$\operatorname{Tr}(\rho P_M) =\operatorname{Tr}\big(|\psi\rangle\langle\psi |\phi\rangle\langle\phi|\big) =|\langle\psi|\phi\rangle|^2.$$
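
The last identity is easy to check numerically. A minimal sketch, assuming ##\rho = |\psi\rangle\langle\psi|## and ##P_M = |\phi\rangle\langle\phi|## for randomly chosen unit vectors (the dimension 3 and the seed are arbitrary choices made for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unit_vector(dim):
    """Return a random normalized complex vector of the given dimension."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

psi = random_unit_vector(3)
phi = random_unit_vector(3)

rho = np.outer(psi, psi.conj())  # pure state operator |psi><psi|
P_M = np.outer(phi, phi.conj())  # projector onto the 1-dimensional subspace M

lhs = np.trace(rho @ P_M).real    # Tr(rho P_M)
rhs = abs(np.vdot(psi, phi)) ** 2  # |<psi|phi>|^2

print(np.isclose(lhs, rhs))
```

This only verifies the pure, 1-dimensional special case; it does not illustrate Gleason's theorem itself, which is the statement that every such measure arises from some ρ.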
 
  • #3
Fredrik said:
I also have a problem with derivations that assume that the Hilbert space of a composite system is the tensor product of the Hilbert spaces of the subsystems, because this is not a natural assumption.
There is a natural thing in the operator version.

Let [itex]\mathcal{A}, \mathcal{B}[/itex] be the algebras of "observables" on two disjoint (but not necessarily "independent") systems.

For the union of the two systems, the natural thing is to simply define the algebra of observables as the algebra that contains [itex]\mathcal{A}[/itex] and [itex]\mathcal{B}[/itex] with no other postulates. In the language of category theory, we want the coproduct (which can also be viewed as a special kind of pushout). For C*-algebras, this algebra is called the "free product" of [itex]\mathcal{A}[/itex] and [itex]\mathcal{B}[/itex].

The tensor product [itex]\mathcal{A} \otimes \mathcal{B}[/itex] comes from making the additional requirement that the two algebras commute with each other; that [itex]AB = BA[/itex] for [itex]A \in \mathcal{A}[/itex] and [itex]B \in \mathcal{B}[/itex].

There are a variety of plausible arguments for commutativity:
  • One could just take it as a definition of "independent", and observe that this definition agrees empirically with our real world notion of "independent"
  • One could argue that the set of "outcomes" of AB ought to be the products of an outcome from A with an outcome of B, and that the outcomes of BA ought to be the same, and so we should insist AB=BA
  • One could make a mathematical statement of independence:
    [tex]E_\psi(A_1 B_1 A_2 B_2 \cdots A_n B_n) = E_\psi(A_1 A_2 \cdots A_n) E_\psi(B_1 B_2 \cdots B_n)[/tex]
    that indicates we shouldn't care about the ordering of products
  • One could argue that every state of the joint system should be expressible in terms of product states -- i.e. choosing a state [itex]\psi[/itex] from the first system and a state [itex]\varphi[/itex] from the second system, and we should have
    [tex]E_{\psi, \varphi}(A_1 B_1 A_2 B_2 \cdots A_n B_n) = E_\psi(A_1 A_2 \cdots A_n) E_\varphi(B_1 B_2 \cdots B_n)[/tex]
    and so no information is lost by asserting commutativity.

In any case, it's clear that if you have unitary representations of [itex]\mathcal{A}[/itex] and [itex]\mathcal{B}[/itex] acting on Hilbert spaces [itex]\mathcal{H}_1, \mathcal{H}_2[/itex], then there is a representation of [itex]\mathcal{A} \otimes \mathcal{B}[/itex] acting on [itex]\mathcal{H}_1 \otimes \mathcal{H}_2[/itex]. I bet there's some compelling statement about irreducible representations that I don't know that one can invoke here.

(Most of my experience with this stuff is in commutative algebra without norms being involved, so there are probably some technical details omitted. E.g. I think there are "maximum" and "minimum" tensor products, and I want the maximum. I had never heard of such a thing before today.)
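
For finite-dimensional matrices the construction can be sketched concretely. This is only an illustration under assumed choices: `A` and `B` are hypothetical single-qubit observables, and the representation on [itex]\mathcal{H}_1 \otimes \mathcal{H}_2[/itex] is built with the Kronecker product, so that A acts as [itex]A \otimes 1[/itex] and B as [itex]1 \otimes B[/itex].

```python
import numpy as np

def expectation(psi, x):
    """E_psi(x) = <psi| x |psi> for a normalized ket psi."""
    return np.vdot(psi, x @ psi)

# Hypothetical observables on the two subsystems.
A = np.array([[0, 1], [1, 0]], dtype=complex)   # acts on H_1
B = np.array([[1, 0], [0, -1]], dtype=complex)  # acts on H_2
I2 = np.eye(2, dtype=complex)

A_joint = np.kron(A, I2)  # A (x) 1 on H_1 (x) H_2
B_joint = np.kron(I2, B)  # 1 (x) B on H_1 (x) H_2

# The two embedded algebras commute with each other: AB = BA.
commute = np.allclose(A_joint @ B_joint, B_joint @ A_joint)

# Product state: expectation values factor, as in the bullet above.
psi = np.array([1, 1], dtype=complex) / np.sqrt(2)
phi = np.array([0.6, 0.8], dtype=complex)
product_state = np.kron(psi, phi)

joint = expectation(product_state, A_joint @ B_joint)
factored = expectation(psi, A) * expectation(phi, B)

print(commute, np.isclose(joint, factored))
```

Nothing here touches the norm-related subtleties (such as the choice between different tensor products); in finite dimensions those do not arise.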
 
  • #4
Hurkyl said:
The issue of deriving the Born rule comes up from time to time in the forum, and I'm always a little mystified by some of the opinions people have. I recently fleshed out the details of the derivation I'm about to present and realized it pretty much makes the Born rule a triviality, so I thought I'd present it to see how others take it.

...

Hurkyl,

Quick comment, in passing; may not be relevant.

As I see it: "It is trivially true (being an identity), that any probability density ρ(x) can be represented by the absolute-square of a complex Fourier polynomial ψ(x)."

Can supply more info later, if relevant.

PS: Froehner ... http://www.fritz-froehner.de/link01.htm ... has another (earlier, ca 1915) proof. So Born was late on the scene; sort of.

GW
 
  • #5
Hurkyl said:
[itex]||x||[/itex] is the magnitude of the largest outcome possible of the measurement.​
OK. A "spectral norm", I guess?

The operator corresponding to measuring a qubit about some orientation satisfies [itex]x^2 = x[/itex]; intuitively because the "outcomes" are 0 and 1.
OK, so we're talking (presumably) about a projection operator.

The properties of E ensure [itex]0 \leq E_\psi(x) \leq 1[/itex] and clearly, all of the moments [itex]E_\psi(x^k)[/itex] are equal. This corresponds to the probability distribution
  • [itex]P(x = 0) = 1 - E_\psi(x)[/itex]
  • [itex]P(x = 1) = E_\psi(x)[/itex]
I would have expected
$$
P(x = 1) ~=~ \left| E_\psi(x) \right|^2
$$
(or did I miss something?)

BTW, if we start with a set of projection operators with discrete finite spectra, the Born rule is just the N-dimensional version of
$$
\cos^2(\theta) + \cos^2(\pi/2 - \theta) ~=~ 1 ~.
$$
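
To spell out the N-dimensional statement (a sketch; the orthonormal basis ##\{|e_i\rangle\}## and the normalized ket ##|\psi\rangle## are notation assumed here):
$$
\sum_{i=1}^{N} \left| \langle e_i | \psi \rangle \right|^2 ~=~ \langle \psi | \psi \rangle ~=~ 1 ~.
$$
For ##N = 2## and a real unit vector at angle ##\theta## to ##|e_1\rangle##, the two terms are ##\cos^2(\theta)## and ##\cos^2(\pi/2 - \theta) = \sin^2(\theta)##.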
 
  • #6
strangerep said:
I would have expected
$$
P(x = 1) ~=~ \left| E_\psi(x) \right|^2
$$
(or did I miss something?)

Isn't this because [itex] E_\psi(x) = 0 \cdot P(x=0) + 1 \cdot P(x=1) [/itex] which gives [itex] P(x=1) = E_\psi(x) [/itex]?
 
  • #7
strangerep said:
OK. A "spectral norm", I guess?
There are a variety of ways to define the norm. For a C*-algebra, the supremum of the magnitude of the elements of the spectrum is such a definition. I was mainly just taking an aside to point out that the norm has a very natural interpretation independent of any issue of probability, since I've previously seen it claimed otherwise.

I would have expected
$$
P(x = 1) ~=~ \left| E_\psi(x) \right|^2
$$
(or did I miss something?)
I haven't said a single thing about wavefunctions. They don't enter the picture until after "QED". As far as this derivation is concerned, [itex]E_\psi[/itex] is simply the function whose value is the alleged 'expected value' of an operator. In C*-terms, one can work out that this means [itex]E_\psi[/itex] is a positive linear functional of norm 1.


The high level view of my argument is that there are two important things about observables and states:
  • Observables form an algebra of some sort,
  • You can combine a state with an observable to produce a probability distribution.
The Born rule appears not as a fundamental hypothesis of QM, but instead merely as a choice of using a certain scheme for representing states with computationally convenient mathematical objects.

The issue is simply muddled because historically, the representation came first.
 
  • #8
Physics Monkey said:
Isn't this because [itex] E_\psi(x) = 0*P(x=0) + 1*P(x=1) [/itex] which gives [itex] P(x=1) = E_\psi(x) [/itex]?
I don't see where the first equation came from. (##\psi## was an arbitrary state, iiuc.)
 
  • #9
Hurkyl said:
The high level view of my argument is that there are two important things about observables and states:
  • Observables form an algebra of some sort,
  • You can combine a state with an observable to produce a probability distribution.
I have no problem with this part. (I prefer the operator-centric approach also.)

It just seemed that your initial use of ##E_\psi(x)## was inconsistent with its later representation as ##\langle \psi|x|\psi\rangle##. Shouldn't the latter have been ##|\langle \psi|x|\psi\rangle|^2## ?
 
  • #10
strangerep said:
I have no problem with this part. (I prefer the operator-centric approach also.)

It just seemed that your initial use of ##E_\psi(x)## was inconsistent with its later representation as ##\langle \psi|x|\psi\rangle##. Shouldn't the latter have been ##|\langle \psi|x|\psi\rangle|^2## ?
No, the former is what we want -- the former is quadratic in [itex]|\psi \rangle[/itex] which is where the usual squared amplitudes come from. The latter would give fourth powers. Also, the latter cannot be correct, for the reason that it is always real, whereas [itex]E_\psi(i) = i E_\psi(1) = i[/itex].
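
To make the power counting concrete, take the qubit ket [itex]|\psi\rangle = a |0\rangle + b |1\rangle[/itex] from the first post and, as an assumption for illustration, let [itex]x = |1\rangle\langle 1|[/itex] be the projector onto [itex]|1\rangle[/itex]:
[tex]\langle \psi | x | \psi \rangle = |b|^2, \qquad \left| \langle \psi | x | \psi \rangle \right|^2 = |b|^4[/tex]
Only the first expression matches the usual probability [itex]|b|^2[/itex] of measuring 1.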
 
  • #11
Hurkyl said:
No, the former is what we want -- the former is quadratic in [itex]|\psi \rangle[/itex] which is where the usual squared amplitudes come from. [...]
OK, I see what you're saying now...
Linear functional acting on a projection operator.

BTW, what about observables with unbounded spectra?
Your original post doesn't seem to cover this.
 

1. What is the Born Rule?

The Born Rule is a key principle in quantum mechanics that describes how the probabilities of different outcomes of a quantum measurement are related to the state of the system being measured. It states that the probability of obtaining a particular measurement result is equal to the squared magnitude of that result's wavefunction coefficient.

2. What is the operator-centric formulation?

The operator-centric formulation is a mathematical framework used in quantum mechanics in which the central notion is an algebra of operators representing the measurements that can be done. A quantum state is then something that can be combined with an operator to produce an expected value, rather than a wavefunction that is given in advance.

3. How is the Born Rule derived from the operator-centric formulation?

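In the thread's argument, a state [itex]\psi[/itex] is characterized by the function [itex]E_\psi[/itex] that assigns each operator its expected value. For a measurement whose outcomes are 0 and 1, the operator satisfies [itex]x^2 = x[/itex], so all the moments [itex]E_\psi(x^k)[/itex] are equal and the outcome distribution is [itex]P(x = 1) = E_\psi(x)[/itex], [itex]P(x = 0) = 1 - E_\psi(x)[/itex]. The expectation value is not squared. The familiar squared amplitudes appear only once the state is represented by a ket, because [itex]E_\psi(x) = \langle \psi | x | \psi \rangle[/itex] is quadratic in [itex]|\psi\rangle[/itex].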

4. Why is the Born Rule important in quantum mechanics?

The Born Rule is important because it allows us to make predictions about the behavior of quantum systems and verify those predictions through experiments. It also provides a mathematical link between the abstract mathematical framework of quantum mechanics and the observable results of measurements.

5. Are there any alternative formulations of the Born Rule?

Yes. The thread mentions a derivation in the Hilbert space approach: a state is defined as a probability measure on the lattice of Hilbert subspaces, and Gleason's theorem shows that each such measure has the form [itex]\mu(M) = \operatorname{Tr}(\rho P_M)[/itex] for a unique state operator ρ. This reduces to the usual squared-amplitude form of the Born rule when ρ is pure and M is 1-dimensional.
