# Probability theory and quantum mechanics

1. Apr 19, 2014

### V0ODO0CH1LD

In probability theory a sample space is a set containing all possible outcomes of an experiment and an event is a subset of the sample space (an element of its power set).

I think it would be natural to think of the basis of the vector space representing a quantum system as a sample space and the quantum states of the system as events, right? But then it seems like an event shouldn't be simply a subset of the sample space. Does that mean that the concept of an event can be further abstracted? How does quantum mechanics fit in with probability theory? Or does quantum mechanics use a different definition of probability?

I guess my question is: is the probability used in quantum mechanics an abstraction or an instance of ordinary probability theory?

2. Apr 19, 2014

### chogg

"The" basis? :-) Which basis?

There can be many bases for the Hilbert space, and you can choose whichever you like. Each choice corresponds to some observable. The elements of that basis are (one possible choice for) the events, when the corresponding observable is measured. Because that's where probabilities enter quantum mechanics: when a measurement is performed.

I think you have more or less the right idea. I just want to emphasize that:
• Probabilities in quantum mechanics correspond to some particular observable being measured, and
• Not just any quantum state can be an event; it must be one of the eigenstates of the observable being measured.

As for the nature of probabilities in quantum mechanics: they are of the same sort as probabilities anywhere else.

Does that help?

3. Apr 19, 2014

### Staff: Mentor

Actually QM is an extension of probability theory that allows for continuous transformations between pure states.

The argument goes something like this. Suppose we have a system with 2 states represented by the vectors [0,1] and [1,0]. These states are called pure. These can be randomly presented for observation and you get the vector [p1, p2] where p1 and p2 give the probabilities of observing each pure state. Such states are called mixed. Probability theory is basically the theory of mixed states where the pure states are the usual basis vectors.

Now consider the matrix A, with rows [0, 1] and [1, 0], that say after 1 second transforms one pure state to the other. But what happens when A is applied for half a second? Well, that would be a matrix U with U^2 = A. You can work this out and lo and behold U is complex. Apply it to a pure state and you get a complex vector. This is something new. It's not a mixed state - but you are forced to it if you want continuous transformations between pure states.
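To make the half-second step concrete, here is a minimal sketch in plain Python. The helper name `matmul` and the particular choice of U are assumptions for illustration; A has several square roots, and since det A = -1 none of them can be real. The sketch checks that U times U reproduces A and that applying U to the pure state [1, 0] gives a vector with complex entries.

```python
# Sketch: a "half-second" version U of the swap matrix A, i.e. U * U = A.
# matmul and this particular U are illustrative choices, not from the thread.

def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# A swaps the two pure states [1, 0] and [0, 1].
A = [[0, 1], [1, 0]]

# One square root of A; its entries are necessarily complex.
U = [[(1 + 1j) / 2, (1 - 1j) / 2],
     [(1 - 1j) / 2, (1 + 1j) / 2]]

U_squared = matmul(U, U)

# Applying U to the column vector [1, 0] picks out the first column of U.
half_way = [U[0][0], U[1][0]]

# Squared magnitudes of the components; they sum to 1.
weights = [abs(z) ** 2 for z in half_way]

print(U_squared)
print(half_way, weights)
```

The vector `half_way` is neither [1, 0] nor [0, 1], and it is not a list of probabilities either, which is the "something new" the argument points to.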

QM is basically the theory where pure states are complex vectors, and it's the theory that makes sense out of such weird pure states. There is really only one reasonable way to do it - by the Born rule (you make the assumption of non-contextuality, i.e. the probability is not basis dependent, plus a few other things need to go into it) - as shown by Gleason's theorem:
http://kof.physto.se/cond_mat_page/theses/helena-master.pdf [Broken]

But it can also be done without such high powered mathematical machinery:
http://www.scottaaronson.com/democritus/lec9.html
http://arxiv.org/pdf/quant-ph/0101012.pdf

Thanks
Bill

Last edited by a moderator: May 6, 2017

4. Apr 19, 2014

### V0ODO0CH1LD

I was just wondering if the probability used in quantum mechanics was an abstraction or an instance of probability theory (defined using Kolmogorov's axioms).

Also, if your argument for not every quantum state being an event is that not all of them can be the result of an experiment, then it's okay, because not every event is the outcome of an experiment anyway. I mean "rolling an odd number" is an event in the sample space of "faces of a die", but you can't actually roll "odd". The outcomes of experiments are members of the sample space, not of the set of events. The probability of rolling an odd number is 50%, but there is no possible outcome with that probability when you roll a die.
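The die example can be written down directly. This is a small sketch assuming a fair die; the names `sample_space`, `prob` and `odd` are made up for illustration. It shows that "odd" is a subset of the sample space (an event) with probability 1/2, while every single outcome has probability 1/6.

```python
from fractions import Fraction

# Sample space: the faces of a die (assumed fair for this sketch).
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Uniform probability measure; an event is any subset of the sample space."""
    assert event <= sample_space, "an event must be a subset of the sample space"
    return Fraction(len(event), len(sample_space))

# "Rolling an odd number" is an event, not an outcome.
odd = {1, 3, 5}
p_odd = prob(odd)

# Each outcome corresponds to a one-element event.
outcome_probs = {face: prob({face}) for face in sample_space}

# The event "odd" is not itself a member of the sample space.
odd_is_outcome = frozenset(odd) in sample_space

print(p_odd, outcome_probs, odd_is_outcome)
```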

As for your first point, is it related to a theme I've been noticing in quantum mechanics which is that single probabilities aren't as useful as conditional probabilities?

5. Apr 19, 2014

### Staff: Mentor

It is. It's what's called a generalised probability model.

Bog standard probability theory as detailed by the Kolmogorov axioms is the simplest generalised probability model, dealing with pure states that are the usual basis vectors. QM is the extension that allows continuous transformations between such pure states.

Unfortunately beginning QM books are often a bit remiss in detailing the full theory. States are not the elements of a vector space. They are in fact positive operators of unit trace. A state of the form |u><u| is called pure. A convex sum of pure states is called mixed. It can be shown all states are either pure or mixed. Only pure states can be put in correspondence with the elements of a vector space.

Basically a state is a generalisation of probability. A von Neumann measurement is described by a resolution of the identity Ei, where the probability of outcome i is given by the Born rule, which is Trace(P Ei), where P is the state. The event space is the Ei.
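As a rough illustration of Trace(P Ei), here is a sketch for a single qubit in plain Python. The helper names (`outer`, `matmul`, `trace`, `born`) and the choice of states and bases are assumptions made for the example. It compares a pure state with a mixed state under two different resolutions of the identity.

```python
# Sketch of the Born rule p_i = Trace(P Ei) for one qubit, standard library only.
# outer, matmul, trace and born are hypothetical helper names for this example.

def outer(u, v):
    """Matrix |u><v| for vectors given as lists of numbers."""
    return [[a * complex(b).conjugate() for b in v] for a in u]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def born(P, resolution):
    """Probabilities Trace(P Ei) for each projector Ei in the resolution."""
    return [complex(trace(matmul(P, Ei))).real for Ei in resolution]

s = 2 ** -0.5
zero, one = [1, 0], [0, 1]
plus, minus = [s, s], [s, -s]

pure = outer(plus, plus)        # pure state |+><+|
mixed = [[0.5, 0], [0, 0.5]]    # convex sum 0.5 |0><0| + 0.5 |1><1|

# Two resolutions of the identity, one per orthonormal basis.
z_basis = [outer(zero, zero), outer(one, one)]
x_basis = [outer(plus, plus), outer(minus, minus)]

pure_z, pure_x = born(pure, z_basis), born(pure, x_basis)
mixed_z, mixed_x = born(mixed, z_basis), born(mixed, x_basis)

print(pure_z, pure_x)    # roughly [0.5, 0.5] and [1, 0]
print(mixed_z, mixed_x)  # roughly [0.5, 0.5] in both bases
```

In this example the pure and the mixed state give the same statistics in the first basis but differ in the second, which is one way to see that a state carries more information than a single probability vector.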

I have zero idea why you would say that.

Thanks
Bill

Last edited: Apr 20, 2014

6. Apr 20, 2014

### Fredrik

Probability theory is the mathematics of probability measures. The domain of a probability measure is a $\sigma$-algebra of subsets of the sample space. $\sigma$-algebras are lattices. The concept of "probability measure" can be generalized to lattices. The set of all Hilbert subspaces of a quantum theory's Hilbert space is a lattice. Gleason's theorem tells us that there's a bijection between (generalized) probability measures on this lattice and state operators (a.k.a. density matrices), and gives us a formula that we can use to calculate the probability that the probability measure identified by the state operator associates with a given subspace. When the subspace is 1-dimensional, and the state operator is pure (i.e. equal to a projection operator for a 1-dimensional subspace), this formula is just the Born rule.

Since the probability measures in various quantum theories are defined on lattices that aren't $\sigma$-algebras, we can say that quantum theory isn't probability theory, it's generalized probability theory.
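One standard way to see the difference, not spelled out in the post, is the distributive law: intersections and unions of subsets always distribute, while meets (intersections) and joins (spans) of subspaces need not. The sketch below checks this for three lines in the plane. The helpers `rank`, `dim_join` and `dim_meet` are hypothetical names written for this example, and the subspaces are compared only through their dimensions.

```python
# Sketch: subsets obey the distributive law, subspaces of a vector space need not.
# rank, dim_join and dim_meet are hypothetical helpers for this illustration.

def rank(vectors, tol=1e-9):
    """Rank of a list of real vectors, by Gaussian elimination."""
    rows = [[float(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if abs(rows[i][c]) > tol), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def dim_join(*subspaces):
    """Dimension of the span of several subspaces (each a list of spanning vectors)."""
    return rank([v for s in subspaces for v in s])

def dim_meet(s, t):
    """Dimension of the intersection: dim S + dim T - dim(S + T)."""
    return rank(s) + rank(t) - dim_join(s, t)

# Classical events: subsets of a sample space.
A, B, C = {1, 2}, {2, 3}, {1, 3}
sets_distribute = (A & (B | C)) == ((A & B) | (A & C))

# Quantum-style events: three distinct lines through the origin in the plane.
a, b, c = [[1, 0]], [[0, 1]], [[1, 1]]
b_join_c = b + c                                  # spans the whole plane
lhs_dim = dim_meet(a, b_join_c)                   # dim of a meet (b join c)
rhs_dim_bound = dim_meet(a, b) + dim_meet(a, c)   # upper bound on dim of (a meet b) join (a meet c)

print(sets_distribute, lhs_dim, rhs_dim_bound)
```

Here the left-hand side is the whole line `a` (dimension 1) while the right-hand side is the zero subspace, so the two sides of the distributive law differ.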

The set of bounded linear operators on a Hilbert space is a C*-algebra. Some of its subsets (including the set itself) satisfy the definition of a von Neumann algebra. The set of projection operators in a von Neumann algebra is a lattice. The branch of mathematical physics that studies probability measures on these lattices is called quantum probability theory or quantum measure theory. I believe that there's also a version of Gleason's theorem for probability measures on these lattices.

I don't understand much of it myself. If you're not afraid to have your head explode, check out the article "Quantum probability theory" by Rédei and Summers (http://arxiv.org/abs/quant-ph/0601158) or perhaps the book "Quantum measure theory" by Jan Hamhalter.

Last edited: Apr 20, 2014

7. Apr 22, 2014

### V0ODO0CH1LD

Is Gleason's theorem a proof of Born's rule? As in, Born's rule is an empirical result translated into mathematics and Gleason's theorem gives a "natural" mathematical proof of it?

8. Apr 22, 2014

### Staff: Mentor

Yes it is.

But you have to look at what goes into it.

The primary axiom of QM is that a von Neumann quantum measurement is described by a resolution of the identity Ei such that the probability of outcome i is determined by Ei.

For Gleason's theorem see:
http://kof.physto.se/cond_mat_page/theses/helena-master.pdf [Broken]
'Let p be a measure on the closed subspaces of a separable (real or complex) Hilbert space H with dim H >= 3. There exists a positive semi-definite self-adjoint operator P of the trace class (with trace 1) such that for all closed subspaces A of H, p(A) = Tr(P P_A), where P_A is the orthogonal projection of H onto A. In particular, any assignment of probabilities to the vectors in H has to be of this form.'

What it says is that the only probability measure that can be defined on the Ei (since they are projection operators they are isomorphic to subspaces - the Ei are the A in the theorem) is via the Born rule, i.e. there exists a positive operator of unit trace, P, such that the probability of outcome i is Trace(P Ei).

By definition P is the state of the system and it is easily seen to be the analogue of probability in QM.

However it's a horrid thing to prove. It's much easier to start with the generalised measurement postulate which goes like this:

A generalised quantum measurement is described by a POVM which is a set of positive operators Ei such that ∑ Ei = 1. The probability of outcome i is determined by Ei.
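For a concrete POVM that is not a resolution of the identity into projectors, here is a sketch of the three-outcome "trine" measurement on a real qubit. The choice of this POVM, the test state, and the helper names (`outer`, `scale`, `trace_of_product`) are assumptions for illustration; the probabilities are computed with the same trace formula used for projectors.

```python
import math

# Sketch: a three-outcome POVM on a two-dimensional (real) state space.
# outer, scale and trace_of_product are hypothetical helpers for this example.

def outer(u, v):
    """Matrix |u><v| for real vectors."""
    return [[a * b for b in v] for a in u]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def trace_of_product(P, E):
    """Trace(P E) without forming the full matrix product."""
    n = len(P)
    return sum(P[i][k] * E[k][i] for i in range(n) for k in range(n))

# Three unit vectors at angles 0, 60 and 120 degrees.
angles = [k * math.pi / 3 for k in range(3)]
vectors = [[math.cos(t), math.sin(t)] for t in angles]

# Each Ei = (2/3)|psi_i><psi_i| is positive but is not a projector.
povm = [scale(2 / 3, outer(v, v)) for v in vectors]

# The Ei sum to the identity.
total = [[sum(E[i][j] for E in povm) for j in range(2)] for i in range(2)]

# State P = |0><0|; outcome probabilities via Trace(P Ei).
P = [[1, 0], [0, 0]]
probs = [trace_of_product(P, E) for E in povm]

print(total)
print(probs, sum(probs))
```

There are three outcomes in a two-dimensional space, so the Ei cannot be mutually orthogonal projectors, yet the probabilities are still non-negative and sum to 1.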

The proof of the Born rule from that is much easier - it's in the paper posted above.

So what's the catch? The way I have described it has hidden a lot of issues. The main one is non-contextuality, which you can look up.

Thanks
Bill

Last edited by a moderator: May 6, 2017

9. Apr 22, 2014

### Fredrik

The Born rule isn't just a formula. It's a correspondence rule. It relates a real-world concept (possible measurement results) with something in the mathematics (assigned probabilities). It's hard to argue that anything is a proof of a correspondence rule, since it's not just mathematics.

However, when you compare the old and new approaches to QM, it seems that we have made some kind of progress.

The old approach: We include a correspondence rule that says that preparation procedures are represented by state vectors, and we include the Born rule, which specifies how the theory assigns probabilities to possible results of measurements.

The new approach: Instead of the two correspondence rules mentioned above, we include just one, that says that preparation procedures are represented by probability measures on the lattice of Hilbert subspaces (which represent yes-no measuring devices, both in the old approach and the new). Then we prove that there's a bijective correspondence $\rho\leftrightarrow\mu_\rho$ between state operators and probability measures. Here $\rho$ is an arbitrary state operator, and $\mu_\rho$ is the probability measure defined by $\mu_\rho(M)=\operatorname{Tr}(P_M \rho)$ for all Hilbert subspaces $M$. $P_M$ is the projection operator associated with the subspace $M$. When $\rho=|\alpha\rangle\langle\alpha|$ and $P_M=|\beta\rangle\langle\beta|$, we have $\operatorname{Tr}(P_M \rho)=\left|\langle\alpha|\beta\rangle\right|^2$. So we have recovered the Born rule.
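The last step can be spelled out in one line. Assuming $|\alpha\rangle$ and $|\beta\rangle$ are unit vectors (which the post leaves implicit), and using the fact that $\operatorname{Tr}(|\beta\rangle\langle\alpha|)=\langle\alpha|\beta\rangle$:

$$\operatorname{Tr}(P_M\rho)=\operatorname{Tr}\big(|\beta\rangle\langle\beta|\alpha\rangle\langle\alpha|\big)=\langle\beta|\alpha\rangle\operatorname{Tr}\big(|\beta\rangle\langle\alpha|\big)=\langle\beta|\alpha\rangle\langle\alpha|\beta\rangle=\left|\langle\alpha|\beta\rangle\right|^2$$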

Does this mean that we have "derived" it? We certainly haven't derived it from Hilbert space mathematics and the other correspondence rules in the old approach. But you could say that we have derived it (in a sense that isn't 100% mathematical) from Hilbert space mathematics and a slightly different set of correspondence rules. We took the old approach, discarded the Born rule, and replaced the state vector rule with a similar but more intuitive (and apparently stronger) rule. The result was a theory that makes the same predictions as the old one.

I should also add that attempts to "derive" the Born rule are usually based on the far more ambitious (and in my opinion fundamentally misguided) idea that "QM minus the Born rule" should describe what is actually happening in the universe, including the operation of measuring devices, and should therefore be able to tell us why the Born rule is so accurate.

Last edited: Apr 22, 2014

10. Apr 22, 2014

### Staff: Mentor

The sense in which it's derived depends on your starting axioms, but really it's in the 'hidden' assumptions.

QM can in fact be derived from just 2 axioms as found in Ballentine - Quantum Mechanics - A Modern Development.

The first axiom is basically the von Neumann measurement hypothesis I gave before, i.e. 'The primary axiom of QM is that a von Neumann quantum measurement is described by a resolution of the identity Ei such that the probability of outcome i is determined by Ei.'

The second is the Born rule.

What Gleason does is derive the second from the first so you really have just one axiom.

To apply Gleason we have to make a few assumptions - some are obvious - others not so obvious.

First we need the so called strong superposition principle, which says, in the way I have presented it, that any resolution of the identity corresponds, at least in principle, to a measurement. Then you need reasonable continuity arguments - but they are really part of the fact you are dealing with a Hilbert space - so are not strictly speaking assumptions. But it turns out the most far reaching assumption is in fact the most innocuous looking. We are assuming that the probability does not depend on what resolution of the identity the Ei is part of. This is trivial mathematically, but has far reaching consequences physically, being associated with the issue of non-contextuality.

QM these days can be presented very elegantly and in a way that seems very natural, e.g.:
http://arxiv.org/abs/quant-ph/0205039

But be wary, it hides a lot of unstated assumptions.

IMHO this way, via Gleason's theorem, is the correct way to proceed, and is the way I personally approach it, but all sorts of 'issues' are swept under the carpet. It seems almost like magic - one can develop QM from just one axiom - the operative word here is 'seems' - there are really all sorts of assumptions being made along the way - it's just that they seem very reasonable and natural.

Thanks
Bill

Last edited: Apr 22, 2014

11. Apr 23, 2014

### Fredrik

I was thinking (at first) that you also have to strengthen the one you keep (e.g. by changing the thing that's associated with preparation procedures, from state vectors to probability measures on the lattice of subspaces), but on second thought, this doesn't seem to be necessary.

I would however say that you still need a part of the second "axiom", because the first one doesn't say what the mathematical representation of measuring devices is supposed to be used for. We can drop the part of the Born rule that specifies the formula that assigns probabilities, but not the part that says that probabilities are to be assigned.

So we still have a second "axiom". It's easy to overlook, because this is not one of the things that sets QM apart from other theories. It's something that QM has in common with all of them. The property of falsifiability. A set of statements that doesn't associate probabilities with possible results of measurements isn't falsifiable, and therefore isn't a theory.

By the way, I stopped using the word "axiom" the way you used it here after a discussion with A. Neumaier a few years ago, where he had a strong negative reaction to this usage of the word. I decided that I agree with him about that. It's confusing to use the term "axiom" for statements that aren't purely mathematical. I've been using the term "correspondence rule" since.

My view is similar. I view QM as probability theory, generalized from $\sigma$-algebras to lattices. (Actually a sub-class of the class of all lattices, defined by a number of technical requirements I don't have memorized. Orthomodularity and stuff). Gleason's theorem is the cornerstone of this view.

12. Apr 23, 2014

### Staff: Mentor

Without doubt hidden assumptions are being made. That is just another.

The purpose of such treatments is not logical completeness but to present it in a way that seems natural and illuminating.

I say, for example, Ballentine's approach uses just two axioms, and it's true that is explicitly what he states, but there are hidden ones all through it, such as in his derivation of the Schrödinger equation etc - he assumes - but does not state explicitly as an axiom - that probabilities are coordinate system invariant. It follows immediately from the POR - and he mentions that - but not that it is a new axiom - which of course it is.

I think words generally gain their exact meaning via context. The context in pure math say is different to the context in physics. In physics it seems to be synonymous with important assumption.

Well that certainly is the view of the mathematically most potent and penetrating version we have, as found in say the Geometry of Quantum Theory by Varadarajan. But that comes at a cost - it's mathematically - how do mathematicians express it - non-trivial - translation: it's HARD.

What is it one wag said about when mathematicians get hold of a physical theory? It becomes more exact and penetrating - mathematically that is - but it's unrecognisable from what founded it, such as the formulation of classical mechanics based on symplectic forms on manifolds - and the geometrical view of QM is an outgrowth of that.

Thanks
Bill