# The 7 Basic Rules of Quantum Mechanics

For reference purposes, and to help focus Physics Forums discussions of interpretation questions on the real issues, there is a need to fix the common ground. There is no consensus about the interpretation of quantum mechanics, and, not surprisingly, there is disagreement even among the mentors and science advisors here on Physics Forums. But the following formulation in terms of 7 basic rules of quantum mechanics was *agreed upon among the science advisors* of Physics Forums in a long and partially heated internal discussion on “Best Practice to Handle Interpretations in Quantum Physics”, September 24 – October 29, 2017, based on a first draft by @atyy and several improved versions by @tom.stoer. Other significant contributors to the discussions included @fresh_42, @kith, @stevendaryl, and @vanhees71.

I slightly expanded the final version and added headings and links to make it suitable as an Insight article. A revised version of this article is published as Section 1.1 of my recent book

- Coherent Quantum Physics: A Reinterpretation of the Tradition, de Gruyter, Berlin 2019.

**The 7 Basic Rules**

The basic rules reflect what is almost universally taught as the basics in quantum physics courses around the world. Often they are stated in terms of axioms or postulates, but this is not essential for their practical validity. In some interpretations, some of these rules are not considered fundamental rules but only valid as empirical or effective rules for practical purposes.

These rules describe the basis of the quantum formalism and are found in almost all introductory quantum mechanics textbooks, among them: Basdevant 2016; Cohen-Tannoudji, Diu and Laloe 1977; Dirac 1930, 1967; Gasiorowicz 2003; Greiner 2008; Griffiths and Schroeter 2018; Landau and Lifshitz 1958, 1977; Liboff 2003; McIntyre 2012; Messiah 1961; Peebles 1992; Rae and Napolitano 2015; Sakurai 2010; Shankar 2016; Weinberg 2013. [Even Ballentine 1998, who rejects rule (7) = his process (9.9) as fundamental, derives it at the bottom of p.243 as an effective rule.] There are generalizations of these rules (e.g., Auletta, Fortunato and Parisi 2009; Busch, Grabowski and Lahti 2001; Nielsen and Chuang 2011) for degenerate eigenvalues, for mixed states, and for measurements not defined by self-adjoint operators but by POVMs. These generalizations are necessary to be able to apply quantum mechanics to all situations encountered in practice. The basic rules are carefully formulated so that they are correct as they stand and at the same time fully compatible with these generalizations.

When stating the rules, *italic text corresponds to the physical system, its preparation, measurement, measured values, etc.*; non-italic text corresponds to mere mathematical objects that represent the physical system, etc.

1. A *quantum system* is described using a Hilbert space ##\mathcal{H}##. Often, this Hilbert space is assumed to be separable.
2. A pure state of a quantum system is represented by a normalized vector ##|\psi \rangle## in ##\mathcal{H}##; state vectors differing only by a phase factor of absolute value 1 represent the same state. In the position representation, where the Hilbert space is the space of square-integrable functions of a position vector ##x##, ##\psi(x)## is called the *wave function* of the system.
3. The *time evolution of an isolated quantum system* represented by the state vector ##|\psi(t)\rangle## is given by

   $$\mathrm{i} \hbar\frac{\mathrm{d}}{\mathrm{d} t} |\psi(t) \rangle = H \, |\psi(t) \rangle$$

   where ##H## is the Hamilton operator and ##\hbar## is Planck’s constant. This is the **Schrödinger equation**.

   This rule is valid in the formulation of quantum mechanics called the Schrödinger picture. There are other, equivalent formulations of the time evolution, especially the Heisenberg picture and the Dirac (interaction) pictures, where time evolution is entirely or partially shifted from the state vector to the operators.
4. An *observable of a quantum system* is represented by a Hermitian operator ##A## with real spectrum acting on a dense subspace of ##\mathcal{H}##.
5. The *possible measured values of a measurement of an observable* are the spectral values of the corresponding operator ##A##. In case of a discrete spectrum, these are the eigenvalues ##a## satisfying ##A\, |a\rangle = a\, |a\rangle##.
6. Let ##\{|a\rangle\}## be a complete set of (generalized) eigenvectors of the self-adjoint operator ##A## with spectral values ##a##. Let the *quantum system be prepared in a state* represented by the state vector ##|\psi\rangle##. If a *measurement of the observable* corresponding to ##A## is performed, the *probability* ##p_\psi(a)## to find the *measured value* ##a## is given by

   $$p_\psi(a) = |\langle a | \psi\rangle|^2$$

   This is the **Born rule**, in a formulation that assumes that all eigenvalues are nondegenerate.
7. For *successive, non-destructive projective measurements* with discrete results, each *measurement with measured value* ##a## can be regarded as *preparation* of a new state whose state vector is the corresponding eigenvector ##|a\rangle##, to be used for the calculation of subsequent time evolution and *further measurements*. This is the **von Neumann projection postulate**.
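As a worked illustration of rules (5)-(7), consider a spin-1/2 system. This example is chosen here for concreteness and is not part of the agreed rules; the angles ##\theta## and ##\varphi## are just parameters of the illustrative state. Take ##A = \sigma_z## with nondegenerate eigenvalues ##a = \pm 1## and eigenvectors ##|{+}\rangle##, ##|{-}\rangle##, and prepare

$$|\psi\rangle = \cos\tfrac{\theta}{2}\, |{+}\rangle + e^{\mathrm{i}\varphi} \sin\tfrac{\theta}{2}\, |{-}\rangle .$$

Rule (6) then gives

$$p_\psi(+1) = |\langle + | \psi\rangle|^2 = \cos^2\tfrac{\theta}{2}, \qquad p_\psi(-1) = |\langle - | \psi\rangle|^2 = \sin^2\tfrac{\theta}{2},$$

which sum to 1 and do not depend on the relative phase ##\varphi##. By rule (7), if a non-destructive projective measurement yields ##+1##, subsequent calculations start from ##|{+}\rangle##, so an immediate repetition of the same measurement yields ##+1## with probability 1.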

**Formal Comments**

(2) To be precise, a pure state is not represented by a unit vector but by a unit ray, i.e. the equivalence class $$[\psi] = \{\, e^{\mathrm{i} \varphi} |\psi \rangle : \varphi \in \mathbb{R} \,\}$$ with ##|\psi \rangle## being a normalized vector in ##\mathcal{H}##.

Equivalently, a pure state can be represented by a rank 1 density operator ##\rho=|\psi \rangle\langle\psi|## satisfying ##\rho^2=\rho=\rho^*## and ##\mathrm{Tr}\,\rho=1##. Mixed states are represented by more general (non-idempotent) Hermitian density operators of trace 1.
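The density-operator properties just listed can be checked numerically on a small example. The following is a minimal sketch in plain Python, assuming a two-dimensional Hilbert space; the particular state vector, the phase 0.7, and all helper names are illustrative choices, not part of the article.

```python
import math

def outer(psi):
    """Return the rank-1 operator |psi><psi| as a nested list."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(x):
    """Conjugate transpose."""
    n = len(x)
    return [[x[j][i].conjugate() for j in range(n)] for i in range(n)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

def close(x, y, tol=1e-12):
    n = len(x)
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(n) for j in range(n))

def is_idempotent(rho):
    return close(matmul(rho, rho), rho)

# Illustrative normalized state vector (an assumption, not from the article).
psi = [complex(3 / 5, 0), complex(0, 4 / 5)]
rho_pure = outer(psi)

# Multiplying the vector by a phase factor leaves the density operator unchanged.
phase = complex(math.cos(0.7), math.sin(0.7))
rho_phase = outer([phase * c for c in psi])

# Equal mixture of the two basis states: Hermitian, trace 1, not idempotent.
rho_mixed = [[0.5 + 0j, 0j], [0j, 0.5 + 0j]]
```

Here `rho_pure` is idempotent, Hermitian, and of unit trace, and it coincides with `rho_phase`, which is the density-operator counterpart of the unit-ray statement in (2); `rho_mixed` fails only the idempotency check.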

(3) For a time-independent Hamilton operator ##H##, it is equivalent to define the time evolution of an isolated quantum system by $$|\psi(t)\rangle = U(t)\,|\psi(0)\rangle$$

with the unitary time evolution operator

$$U(t) = e^{-\mathrm{i}Ht/\hbar}.$$

The evolution according to (3) is therefore also referred to as **unitary evolution.**
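To see the equivalence in (3) at work, here is a minimal sketch for a two-level system, assuming the hypothetical Hamiltonian ##H = (\hbar\omega/2)\,\sigma_x## in units with ##\hbar = 1##. For this choice the exponential has the closed form ##U(t) = \cos(\omega t/2) - \mathrm{i}\sin(\omega t/2)\,\sigma_x##, because ##\sigma_x^2 = 1##. The names `HBAR`, `OMEGA`, and the helper functions are illustrative.

```python
import math

HBAR = 1.0   # units with hbar = 1 (assumption for this sketch)
OMEGA = 2.0  # hypothetical level-splitting frequency

# H = (hbar * omega / 2) * sigma_x for a two-level system.
H = [[0j, HBAR * OMEGA / 2 + 0j],
     [HBAR * OMEGA / 2 + 0j, 0j]]

def apply(op, psi):
    """Apply a 2x2 operator to a 2-component state vector."""
    return [sum(op[i][k] * psi[k] for k in range(2)) for i in range(2)]

def U(t):
    """exp(-i H t / hbar) in closed form, using sigma_x**2 = identity."""
    c = math.cos(OMEGA * t / 2)
    s = math.sin(OMEGA * t / 2)
    return [[complex(c, 0), complex(0, -s)],
            [complex(0, -s), complex(c, 0)]]

def evolve(psi0, t):
    return apply(U(t), psi0)

def norm(psi):
    return math.sqrt(sum(abs(c) ** 2 for c in psi))

def schroedinger_residual(psi0, t, dt=1e-6):
    """Largest component of i*hbar*dpsi/dt - H*psi, via a central difference."""
    plus, minus = evolve(psi0, t + dt), evolve(psi0, t - dt)
    lhs = [1j * HBAR * (p - m) / (2 * dt) for p, m in zip(plus, minus)]
    rhs = apply(H, evolve(psi0, t))
    return max(abs(l - r) for l, r in zip(lhs, rhs))

psi0 = [1 + 0j, 0j]  # start in the first basis state
```

Evaluating `norm(evolve(psi0, t))` for any `t` returns 1 up to rounding, which is the unitarity of ##U(t)##, and `schroedinger_residual(psi0, t)` is small, limited only by the finite-difference step.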

(4) Equivalently, ##A## is self-adjoint.

(6) In case of degenerate subspaces, let ##\{|a,\nu \rangle\}## be a complete set of (generalized) eigenvectors of ##A##, indexed by ##\nu##. The *probability* ##p_\psi(a)## to find the *measured value* ##a## is then given by summing (or integrating) over ##\nu##, i.e., over the entire ##a##-subspace

$$p_\psi(a) = \sum_\nu |\langle a,\nu | \psi\rangle|^2.$$
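The degenerate form of the Born rule can be illustrated with a small sketch. It assumes a hypothetical observable on a three-dimensional space whose eigenvalue 1 is doubly degenerate and whose eigenvalue 2 is simple; the state vector and all names are illustrative choices.

```python
import math

def inner(phi, psi):
    """<phi|psi>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def born_probability(eigvecs, psi):
    """Sum |<a,nu|psi>|^2 over an orthonormal basis of the a-eigenspace."""
    return sum(abs(inner(v, psi)) ** 2 for v in eigvecs)

# Hypothetical observable: eigenvalue 1.0 is doubly degenerate (nu = 1, 2),
# eigenvalue 2.0 is nondegenerate. Keys are eigenvalues, values are
# orthonormal eigenvectors spanning the corresponding eigenspace.
eigenspaces = {
    1.0: [[1 + 0j, 0j, 0j], [0j, 1 + 0j, 0j]],
    2.0: [[0j, 0j, 1 + 0j]],
}

# Illustrative normalized state vector.
psi = [0.5 + 0j, 0.5j, complex(1 / math.sqrt(2), 0)]

probabilities = {a: born_probability(vs, psi) for a, vs in eigenspaces.items()}
```

With these choices each of the two degenerate eigenvectors contributes 1/4, so both measured values occur with probability 1/2, and the probabilities sum to 1 because the eigenvectors form a complete set.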

(7) The projection postulate is valid only under the assumptions stated; examples are passing barriers with holes or slits, polarization filters, and certain other instruments that modify the state of a quantum system passing through them. This (nonunitary, dissipative) change of the state to an eigenstate in the course of a projective measurement is often referred to as “state reduction” or “collapse of the wave function” or “reduction of the wave packet”. Note that there is no direct conflict with the unitary evolution in (3) since, during a measurement, a system is never isolated.

In other cases, the prepared state may be quite different. (See the discussion in Landau and Lifshitz, Vol. III, Section 7.) The most general kind of quantum measurement and the resulting prepared state is described by so-called positive operator valued measures (POVMs).
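For the polarization-filter case mentioned under (7), the interplay of the Born rule and the projection postulate can be sketched as follows. The sketch assumes ideal linear polarizers acting on a single photon whose polarization state is a real unit vector in the horizontal/vertical basis; the function names are illustrative.

```python
import math

def polarizer_axis(theta):
    """Unit vector of the polarization state transmitted at angle theta."""
    return [math.cos(theta), math.sin(theta)]

def pass_filter(psi, theta):
    """Return (probability of passing, state prepared by passing).

    The probability follows the Born rule; the prepared state is the
    eigenvector singled out by the filter (projection postulate).
    """
    axis = polarizer_axis(theta)
    amplitude = sum(a * b for a, b in zip(axis, psi))
    return amplitude ** 2, axis

def transmission(psi, angles):
    """Probability of passing a sequence of ideal polarizers in order."""
    total = 1.0
    for theta in angles:
        p, psi = pass_filter(psi, theta)  # the passed photon is re-prepared
        total *= p
    return total

horizontal = [1.0, 0.0]
blocked = transmission(horizontal, [math.pi / 2])
with_diagonal = transmission(horizontal, [math.pi / 4, math.pi / 2])
```

A horizontally polarized photon is blocked by a vertical filter, but inserting a 45° filter in between gives a transmission probability of 1/4. The second factor of 1/2 arises only because the state after the first filter is taken to be the filter's own eigenvector.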

**Comments on the Interpretation**

Not further discussing the foundations of quantum mechanics beyond this is called **shut-up-and-calculate**. It is a mode of working sufficient for all who do not want to delve into often highly disputed foundational (and partly philosophical) problems. However, the above-mentioned rules are often considered conceptually unsatisfactory because they rely on terms that are not well defined (‘probability’, ‘measurement’, and ‘observer’), whereas in principle one expects that at least measurement and observation can be regarded as quantum mechanical processes or interactions that follow the same fundamental rules and do not play any special role. The associated issues are treated in different ways by different **interpretations of quantum mechanics**.

In the Copenhagen Interpretation (also called Standard Interpretation or Orthodox Interpretation; terminology and interpretation details vary), the above rules are simply operational rules that work in practice. A state vector is a tool that one uses to calculate the *probabilities of measurement outcomes*, and one is agnostic about whether the state vector represents any object that exists in reality. Rules (6) and (7) apply only when a measurement has occurred. Thus, unlike in classical physics, it is not enough to specify the initial conditions of the state and let the state evolve. One must also specify when a measurement has occurred: Generally, *a measurement is understood to have occurred when a definite (irreversible, i.e., nonunitary) measurement result or outcome has been obtained; e.g., the observer records a mark on a screen.* (However, passing a Stern-Gerlach magnet, which in modern terminology is a *premeasurement* only, is frequently but inaccurately considered to be a measurement, although it is described by a unitary process where even in principle no measurement result becomes available.)

A noteworthy aspect of the standard interpretation is that the state vector cannot represent the whole universe, but must exclude an observer or measuring apparatus that decides when a measurement has occurred; this is the so-called **Heisenberg cut** between the quantum and the classical world. To date, this has not been a problem in making successful experimental predictions, so practitioners are often satisfied with the quantum formalism and the standard interpretation.

However, many have suggested that there is a conceptual problem with the standard interpretation because the whole universe presumably obeys the laws of physics. So there should be laws of physics that describe the whole universe, without any need to exclude any observer or measurement apparatus from the quantitative description. Then one must be able to derive the rules (5)-(7) for measuring subsystems of the universe from the dynamics of the universe. The problem of how to do this is called the **measurement problem**. A related problem, the problem of the emergence of a classical macroscopic world from the microscopic quantum description, is often considered as essentially solved by decoherence.

To solve the measurement problem, other interpretations of the quantum formalism or theories have been proposed. These alternative interpretations or theories are based on different postulates than those of the standard interpretation, but seek to explain why the standard interpretation has been so successful (e.g., by deriving the rules of the standard interpretation from other postulates). The major alternative interpretations or theories that have been proposed include Everett’s Relative State Interpretation (“Many-Worlds”), the Ensemble Interpretation (or Minimal Statistical Interpretation), the Transactional Interpretation, and the Consistent Histories Interpretation.

Still other interpretations (e.g., Bohmian Mechanics, Ghirardi–Rimini–Weber theory, the Cellular Automaton Interpretation, and the Thermal Interpretation) modify one or more of the 7 basic rules and only strive to derive the latter in some approximation, for all practical purposes (FAPP). In particular, rule (7) cannot be fundamental if one wants to interpret the state vector ##|\psi\rangle## in an ontic way, i.e., as some direct and ‘faithful’ representation of ‘externally existing reality’ independent from any observer, observation or measurement.

None of the interpretations currently available has been able to solve the measurement problem in a way deemed satisfactory by those interested in the foundations. So there are still major open problems both with the standard interpretation of quantum mechanics and with alternative interpretations. Fortunately, none of these problems seems to be of any practical relevance.

Full Professor (Chair for Computational Mathematics) at University of Vienna

Well for that it’s best to introduce Feynman’s path integral approach from those axioms. I did it in a series of posts I made in the classical mechanics sub-forum:

https://www.physicsforums.com/threads/what-do-newtons-laws-say-when-carefully-analysed.979739/

Basically, classical mechanics is QM where you can cancel most paths and get the classical Principle of Least Action.

Thanks

Bill

Pedagogically I like Ballentine, but though many agree, not all do. And you need to work up to it – to start with I actually like Susskind’s theoretical minimum book, then Griffiths, then Sakurai, then Ballentine. But having an agreed set of axioms is good – and the ones here I like.

Thanks

Bill

What does Dirac call it? I think a complete set of commuting observables. Not that I recommend using Dirac as the book to base the axioms on. Everyone should eventually own a copy because of its historical significance, but I had the misfortune to use it as my first serious introduction to QM and now regret it. Nor do I recommend the next book I read, von Neumann’s classic, although serious students should also own a copy of that.

Thanks

Bill

Indeed. The rules in the article are excellent.

Just to elaborate on what Ballentine does. He only uses two rules:

1. The eigenvalues of Hermitian operators, O (called observables), from some vector space, are the possible outcomes of the observation represented by the operator. Or words to that effect – I can dig up my copy for the exact wording if required.

2. The average of those outcomes, E(O), is given by E(O) = Trace (OS) where S is a positive operator of unit trace called the state of the system.
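As a small check of how rule 2 relates to the Born rule of the article, the sketch below compares E(O) = Trace(OS) with the probability-weighted average of eigenvalues for a qubit. It assumes O = ##\sigma_z## and an illustrative state parametrized by an angle `theta`; none of these choices come from Ballentine's text.

```python
import math

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

def expectation(O, S):
    """E(O) = Trace(O S) for an observable O and a state operator S."""
    return trace(matmul(O, S)).real

sigma_z = [[1 + 0j, 0j], [0j, -1 + 0j]]

# Pure state: S = |psi><psi| with an illustrative angle theta.
theta = 1.1
psi = [complex(math.cos(theta / 2)), complex(math.sin(theta / 2))]
S_pure = [[a * b.conjugate() for b in psi] for a in psi]

# Average computed from Born probabilities of the eigenvalues +1 and -1.
born_average = (+1) * abs(psi[0]) ** 2 + (-1) * abs(psi[1]) ** 2

# Mixed state: equal mixture of the two eigenstates of sigma_z.
S_mixed = [[0.5 + 0j, 0j], [0j, 0.5 + 0j]]
```

For the pure state, `expectation(sigma_z, S_pure)` agrees with `born_average` (both equal cos θ), while the same trace formula applied to `S_mixed` gives 0 without any change to the rule.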

Note 2 to some extent follows from 1 by Gleason’s Theorem, but that is a whole thread in itself and hinges on non-contextuality, which even the great von Neumann got ‘wrong’; Grete Hermann was ignored when she pointed it out, which was not one of science’s finest hours.

How does he get away with 2? He is sneaky, and the rest are introduced as assumptions so reasonable you do not notice they are assumptions. E.g., his derivation of Schrödinger’s equation assumes the principle of relativity (POR) and Galilean transformations, but this is not stated explicitly: he just assumes probabilities are frame independent, which is so ‘obvious’ you do not recognise, unless you think about it, that it is invoking the POR. Elegant, but it hides important details; it’s still my favourite treatment though. Also, there is another assumption not mentioned in the above: for two systems treated as a single system you take the combination of vector spaces, i.e., the space generated from the basis vectors of both spaces. That is hardly ever mentioned, although it is an assumption. QM is a bit quirky like that: it can be presented in a way in which assumptions seem so natural you do not recognise them as assumptions. There are probably others I haven’t mentioned, and perhaps do not even realise myself.

Thanks

Bill

He gets out the numbers. But to get out their meaning as probabilities for scattering results, he needs the standard Hilbert space framework! Indeed, Zeidler starts with that…..

and he recovers only (and only an asymptotic series for) the asymptotic S-matrix, no finite time dynamics.

The problem is that you need to assume the positivity of the quantum measure. This cannot be proved for the functional integrals used in QFT – else they would produce finite results without the need for regularization.

Even in quantum mechanics, proving positivity requires somewhere a Hilbert space argument….

The usual Hilbert space formulation is primary, and the path integral formulation is secondary. The path integral formulation allows us to do quantum mechanics in the language of statistical mechanics. Not all statistical mechanics path integrals correspond to quantum theories (i.e., they may lack unitary evolution, etc.). The constraints on the path integrals that make them correspond to quantum theories come from the Hilbert space formulation, which is why the Hilbert space formulation is primary.

In the context of relativistic quantum field theory, a set of constraints on path integrals are the Osterwalder-Schrader axioms.

http://www.einstein-online.info/spotlights/path_integrals.html

https://ncatlab.org/nlab/show/Osterwalder-Schrader+theorem

How do you ensure that in the context of a path integral?

Under your assumptions you’d just have a single free particle. Nothing asymptotic here.

Once you have a Hilbert space and a (not necessarily irreducible) unitary representation of such a group, its infinitesimal generators are represented by operators. This gives operators for energy, momentum, angular momentum, and boosts (of the total system).

It is better to study the subject in some more depth than to dabble in unfounded speculations. It takes some time to become familiar with all the relevant relations between the various approaches and to see what which approach offers and misses.

Yes. The QFT path integral is derived from the QM path integral, which is derived from the Schrödinger equation. Without the latter, one would never know that the path integral formulation is a valid formulation of QM/QFT.

No. With the path integral formulation (but without the equivalent traditional formulation), you don’t even have a Hilbert space (unless you work in the closed time path setting, which is not common knowledge).

But if you consistently and exclusively do QM in the Heisenberg picture, it looks just like QFT, just with a 1D space-time in place of 4D.

Not fundamental means only effective.

Ballentine doesn’t restrict to arbitrary controlled experiments but to the much smaller class of ”filtering-type measurements” by selection, where collapse is equivalent to taking conditional expectations.

whereas Ballentine said explicitly that it is not a fundamental process.

I enjoyed something like that too from my teacher. When I first learned quantum mechanics (Xiao-Gang Wen was the lecturer), the postulates were taught very early, but not in the first lesson. If I recall correctly, the first lecture was about dimensional analysis (to introduce Planck’s constant), lecture 2 was a tour of the ultraviolet problem and old quantum physics, and the postulates were introduced in lecture 3. After that, wave mechanics was always done in the context of the postulates.

I believe @vanhees71 has advocated something like that in these forums, though I should let him speak for himself.

What I say about Ballentine in the Insight article (in the slightly polished formulation of this morning – collapse rejected as fundamental but accepted as effective) was designed to be compatible with what he says in his book. It seems to me also compatible with a suitable interpretation of what you say in this quote.

On p.236-238, Ballentine gives a long argument for his rejection of the conventional formulation of (7) = his (9.9) in the density operator version:

He accepts it only as an effective view (p.243f)

and (rightly, like Landau and Lifshitz, but unlike many other textbooks) only under special circumstances (p.247):

This is why (7) is formulated in the cautious way given in the Insight article.

The postulates don’t say much without the examples. Thus one has to introduce both in parallel, starting with things that make for an easy bridge, such as optical polarization – see my insight article on the qubit.

It maps an arbitrary pure state with state vector ##\psi## into a pure state with state vector ##\hat P\psi##. That’s enough in the present context.

Only if you want to prepare a pure state from an arbitrary mixed state then ##\hat P## must project to a 1-dimensional subspace.

Busch et al. nowhere refer to reduction. Your use of the term is nonstandard. What is termed state reduction is a process that turns pure states into pure states. It corresponds to the Lüders operator ##\rho\to P\rho P## discussed at the end of Section II.3.1 and in II.4.
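The statement that the Lüders operation ##\rho\to P\rho P## maps pure states into pure states can be illustrated numerically. This is a minimal sketch assuming a hypothetical rank-2 projector on a three-dimensional space; dividing by ##\mathrm{Tr}(P\rho P)## to restore unit trace is added here for the comparison and is not spelled out in the comment above. All names are illustrative.

```python
def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

def lueders(rho, P):
    """Return (Tr(P rho P), P rho P / Tr(P rho P)) for a projector P."""
    unnormalized = matmul(matmul(P, rho), P)
    p = trace(unnormalized).real
    if p == 0:
        raise ValueError("outcome has probability zero")
    return p, [[entry / p for entry in row] for row in unnormalized]

def purity(rho):
    """Tr(rho^2); equals 1 exactly for pure states."""
    return trace(matmul(rho, rho)).real

# Hypothetical rank-2 projector onto the first two basis vectors.
P = [[1 + 0j, 0j, 0j], [0j, 1 + 0j, 0j], [0j, 0j, 0j]]

# Pure state (1, 1, 1)/sqrt(3), written as a density operator.
rho_pure = [[complex(1 / 3)] * 3 for _ in range(3)]
p_pure, post_pure = lueders(rho_pure, P)

# Mixed state with illustrative weights on the three basis states.
rho_mixed = [[0.5 + 0j, 0j, 0j], [0j, 0.3 + 0j, 0j], [0j, 0j, 0.2 + 0j]]
p_mixed, post_mixed = lueders(rho_mixed, P)
```

The pure input stays pure (`purity(post_pure)` is 1) even though the projector is not rank 1, whereas the mixed input remains mixed; obtaining a pure state from an arbitrary mixed state would need a projector onto a 1-dimensional subspace.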

Neither of these is the case, so making conclusions based on your assumptions has nothing to do with physics.

The conditioning is always of what is measured afterwards; *simultaneous* measurement is different and unrelated to the collapse.

But this is different from collapse, which says that given the result you can simply work with the projected state, which is what is done in practice. Without projection one must always carry the complete context around (a full ancilla in an extended Hilbert space), which is awkward when making a long sequence of observations.

Although position and momentum do not commute, there are joint position and momentum measurements (e.g., from tracks in bubble chambers or wire chambers), though their accuracy is limited by Heisenberg’s uncertainty relation.

The rules are precise formulations of corresponding statements found more loosely formulated in all textbooks cited (apart from Ballentine). Rule (7) appears there usually in an unqualified (and hence incorrect) form to which your criticism may apply. But I don’t understand what you consider contentious in the actual formulation of (7). It surely applies in the cases listed under “Formal discussion” of (7): It is needed to know what is prepared after passing a barrier (e.g., singling out a ray) or polarizer (singling out a polarization state).

I slightly expanded the final version and added headings and links to make it suitable as an insight article. Maybe the participants of the discussion 20 months ago can confirm their continued support or voice disagreements with this public version.