
The 7 Basic Rules of Quantum Mechanics

For reference purposes, and to help focus discussions of interpretation questions on Physics Forums on the real issues, there is a need to fix the common ground. There is no consensus about the interpretation of quantum mechanics, and, not surprisingly, there is disagreement even among the mentors and science advisors here on Physics Forums. But the following formulation in terms of 7 basic rules of quantum mechanics was agreed upon among the science advisors of Physics Forums in a long and partially heated internal discussion on "Best Practice to Handle Interpretations in Quantum Physics", September 24 – October 29, 2017, based on a first draft by @atyy and several improved versions by @tom.stoer. Other significant contributors to the discussions included @fresh_42, @kith, @stevendaryl, and @vanhees71.

I slightly expanded the final version and added headings and links to make it suitable as an Insight article. A revised version of this article is published as Section 1.1 of my recent book


The 7 Basic Rules

The basic rules reflect what is almost universally taught as the basics in quantum physics courses around the world. Often they are stated in terms of axioms or postulates, but this is not essential for their practical validity. In some interpretations, some of these rules are not considered fundamental rules but only valid as empirical or effective rules for practical purposes.

These rules describe the basis of the quantum formalism and are found in almost all introductory quantum mechanics textbooks, among them: Basdevant 2016; Cohen-Tannoudji, Diu and Laloe 1977; Dirac 1930, 1967; Gasiorowicz 2003; Greiner 2008; Griffiths and Schroeter 2018; Landau and Lifshitz 1958, 1977; Liboff 2003; McIntyre 2012; Messiah 1961; Peebles 1992; Rae and Napolitano 2015; Sakurai 2010; Shankar 2016; Weinberg 2013. [Even Ballentine 1998, who rejects rule (7) = his process (9.9) as fundamental, derives it at the bottom of p.243 as an effective rule.] There are generalizations of these rules (e.g., Auletta, Fortunato and Parisi 2009; Busch, Grabowski and Lahti 2001; Nielsen and Chuang 2011) for degenerate eigenvalues, for mixed states, and for measurements not defined by self-adjoint operators but by POVMs. These generalizations are necessary to be able to apply quantum mechanics to all situations encountered in practice. The basic rules are carefully formulated so that they are correct as they stand and at the same time fully compatible with these generalizations.

When stating the rules, italic text corresponds to the physical system, its preparation, measurement, measured values, etc.; non-italic text corresponds to mere mathematical objects that represent the physical system, etc.

  1. A quantum system is described using a Hilbert space ##\mathcal{H}##. Often, this Hilbert space is assumed to be separable.
  2. A pure state of a quantum system is represented by a normalized vector ##|\psi \rangle## in ##\mathcal{H}##; state vectors differing only by a phase factor of absolute value 1 represent the same state. In the position representation, where the Hilbert space is the space of square-integrable functions of a position vector ##x##, ##\psi(x)## is called the wave function of the system.
  3. The time evolution of an isolated quantum system represented by the state vector ##|\psi(t)\rangle## is given by
    $$\mathrm{i} \hbar\frac{\mathrm{d}}{\mathrm{d} t} |\psi(t) \rangle = H \, |\psi(t) \rangle$$
    where ##H## is the Hamilton operator and ##\hbar## is the (reduced) Planck constant. This is the Schrödinger equation.
    This rule is valid in the formulation of quantum mechanics called the Schrödinger picture. There are other, equivalent formulations of the time evolution, especially the Heisenberg picture and the Dirac (interaction) picture, where time evolution is entirely or partially shifted from the state vector to the operators.
  4. An observable of a quantum system is represented by a Hermitian operator ##A## with real spectrum acting on a dense subspace of ##\mathcal{H}##.
  5. The possible measured values of a measurement of an observable are the spectral values of the corresponding operator ##A##. In case of a discrete spectrum, these are the eigenvalues ##a## satisfying ##A\, |a\rangle = a\, |a\rangle##.
  6. Let ##\{|a\rangle\}## be a complete set of (generalized) eigenvectors of the self-adjoint operator ##A## with spectral values ##a##. Let the quantum system be prepared in a state represented by the state vector ##|\psi\rangle##. If a measurement of the observable corresponding to ##A## is performed, the probability ##p_\psi(a)## to find the measured value ##a## is given by
    $$p_\psi(a) = |\langle a | \psi\rangle|^2$$
    This is the Born rule, in a formulation that assumes that all eigenvalues are nondegenerate.
  7. For successive, non-destructive projective measurements with discrete results, each measurement with measuring value ##a## can be regarded as preparation of a new state whose state vector is the corresponding eigenvector ##|a\rangle##, to be used for the calculation of subsequent time evolution and further measurements. This is the von Neumann projection postulate.
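Rules (2), (6), and (7) can be illustrated by a minimal sketch for a two-level system. The observable (the Pauli matrix ##\sigma_z## with eigenvalues ##\pm 1##), the angles `theta` and `phi`, and all function names are assumptions chosen for illustration; they are not part of the rules themselves.

```python
import cmath
import math

# Hypothetical example: a two-level system with observable A = sigma_z.
# Its eigenvalues are +1 and -1 with eigenvectors |+1> = (1, 0), |-1> = (0, 1).
EIGENSYSTEM = {+1: (1 + 0j, 0j), -1: (0j, 1 + 0j)}


def inner(bra, ket):
    """Inner product <bra|ket>, antilinear in the first argument."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))


def born_probabilities(psi):
    """Rule (6): p_psi(a) = |<a|psi>|^2 for each nondegenerate eigenvalue a."""
    return {a: abs(inner(vec, psi)) ** 2 for a, vec in EIGENSYSTEM.items()}


def state_after_measurement(a):
    """Rule (7): after a projective measurement with result a, use |a>."""
    return EIGENSYSTEM[a]


# Normalized state vector cos(theta/2)|+1> + e^{i phi} sin(theta/2)|-1>
theta, phi = math.pi / 3, 0.7
psi = (complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2))

probs = born_probabilities(psi)

# Rule (2): a global phase factor does not change any probability.
rotated = tuple(cmath.exp(1.234j) * c for c in psi)
probs_rotated = born_probabilities(rotated)

# Rule (7): repeating the measurement on the prepared state |+1>
# gives the result +1 again with probability 1.
repeat = born_probabilities(state_after_measurement(+1))

print(probs, probs_rotated, repeat)
```

With these assumed angles the two probabilities are ##\cos^2(\pi/6)## and ##\sin^2(\pi/6)##, and they sum to 1 because the state vector is normalized.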

Formal Comments

(2) To be precise, a pure state is not represented by a unit vector but by a unit ray, i.e. the equivalence class $$[\psi] = \{ e^{\mathrm{i} \varphi} |\psi \rangle \mid \varphi \in \mathbb{R} \}$$ with ##|\psi \rangle## being a normalized vector in ##\mathcal{H}##.

Equivalently, a pure state can be represented by a rank 1 density operator ##\rho=|\psi \rangle\langle\psi|## satisfying ##\rho^2=\rho=\rho^*## and ##\mathrm{Tr}\,\rho=1##. Mixed states are represented by more general (non-idempotent) Hermitian density operators of trace 1.
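The difference between idempotent (pure) and non-idempotent (mixed) density operators can be checked numerically. The following is a minimal sketch for a two-level system; the particular state vector and the equal-weight mixture are hypothetical choices, and the helper names are not taken from any library.

```python
def outer(psi):
    """Rank-1 operator |psi><psi| as a nested list of complex numbers."""
    return [[a * b.conjugate() for b in psi] for a in psi]


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(x):
    """Sum of the diagonal entries."""
    return sum(x[i][i] for i in range(len(x)))


s = 2 ** -0.5
# Pure state: |psi> = (|0> + i|1>) / sqrt(2)
rho_pure = outer((s + 0j, 1j * s))
# Mixed state: equal-weight mixture of |0><0| and |1><1|
rho_mixed = [[0.5 + 0j, 0j], [0j, 0.5 + 0j]]

# Tr(rho^2) equals 1 exactly for pure states and is smaller for mixed states.
purity_pure = trace(matmul(rho_pure, rho_pure)).real
purity_mixed = trace(matmul(rho_mixed, rho_mixed)).real

print(trace(rho_pure).real, purity_pure)
print(trace(rho_mixed).real, purity_mixed)
```

Both operators have trace 1, but only the rank 1 operator satisfies ##\rho^2=\rho##, which shows up as ##\mathrm{Tr}\,\rho^2=1##.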

(3) It is equivalent to define the time evolution of an isolated quantum system by $$|\psi(t)\rangle = U(t)\,|\psi(0)\rangle$$

with the unitary time evolution operator

$$U(t) = e^{-\mathrm{i}Ht/\hbar}.$$

The evolution according to (3) is therefore also referred to as unitary evolution.
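As a hedged illustration, the equivalence can be checked numerically for a two-level system. The Hamiltonian ##H = (\hbar\omega/2)\,\sigma_x##, the units with ##\hbar = 1##, and the value of `OMEGA` are assumptions made for this sketch. The exponential form of ##U(t)## presupposes a time-independent ##H##, and the closed form used in `evolve` relies on ##\sigma_x^2 = 1##.

```python
import math

HBAR = 1.0    # units with hbar = 1 (assumption for this example)
OMEGA = 2.0   # hypothetical frequency scale of the Hamiltonian


def apply_H(psi):
    """Apply H = (hbar*omega/2) * sigma_x to a 2-component state vector."""
    c = HBAR * OMEGA / 2
    return (c * psi[1], c * psi[0])


def evolve(psi0, t):
    """U(t)|psi0> with U(t) = cos(omega t/2) 1 - i sin(omega t/2) sigma_x."""
    c, s = math.cos(OMEGA * t / 2), math.sin(OMEGA * t / 2)
    return (c * psi0[0] - 1j * s * psi0[1], c * psi0[1] - 1j * s * psi0[0])


def norm_sq(psi):
    """Squared norm <psi|psi>."""
    return sum(abs(x) ** 2 for x in psi)


psi0 = (1 + 0j, 0j)
t, dt = 0.8, 1e-6
psi_t = evolve(psi0, t)

# Central finite difference approximating d|psi(t)>/dt
plus, minus = evolve(psi0, t + dt), evolve(psi0, t - dt)
lhs = tuple(1j * HBAR * (p - m) / (2 * dt) for p, m in zip(plus, minus))
rhs = apply_H(psi_t)

# Both numbers should be tiny: the Schroedinger equation holds up to
# discretization error, and the norm is preserved by unitary evolution.
residual = max(abs(l - r) for l, r in zip(lhs, rhs))
norm_drift = abs(norm_sq(psi_t) - 1.0)
print(residual, norm_drift)
```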

(4) Equivalently, ##A## is self-adjoint.

(6) In case of degenerate eigenvalues, let ##\{|a,\nu \rangle\}## be a complete set of (generalized) eigenvectors of ##A##, indexed by ##\nu##. The probability ##p_\psi(a)## to find the measured value ##a## is then given by summing (or integrating) over ##\nu##, i.e. over the entire ##a##-subspace:

$$p_\psi(a) = \sum_\nu |\langle a,\nu | \psi\rangle|^2.$$
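A minimal sketch of this sum for a discrete spectrum follows. The three-dimensional Hilbert space, the eigenvalues 1 (twofold degenerate) and 2, the orthonormal eigenvectors, and the state vector are all hypothetical choices made for illustration.

```python
# Hypothetical observable on a 3-dimensional Hilbert space:
# eigenvalue 1 is twofold degenerate, eigenvalue 2 is nondegenerate.
# The eigenvectors |a, nu> are chosen orthonormal.
EIGENVECTORS = {
    1: [(1 + 0j, 0j, 0j), (0j, 1 + 0j, 0j)],
    2: [(0j, 0j, 1 + 0j)],
}


def inner(bra, ket):
    """Inner product <bra|ket>, antilinear in the first argument."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))


def born_probability(a, psi):
    """p_psi(a) = sum over nu of |<a,nu|psi>|^2."""
    return sum(abs(inner(vec, psi)) ** 2 for vec in EIGENVECTORS[a])


# Normalized state vector: |0.5|^2 + |0.5i|^2 + |1/sqrt(2)|^2 = 1
psi = (0.5 + 0j, 0.5j, 2 ** -0.5 + 0j)

p1 = born_probability(1, psi)
p2 = born_probability(2, psi)
print(p1, p2, p1 + p2)
```

For the nondegenerate eigenvalue the sum has a single term, so the formula reduces to the one in rule (6).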

(7) The projection postulate is valid only under the assumptions stated; examples are the passage through barriers with holes or slits, polarization filters, and certain other instruments that modify the state of a quantum system passing through them. This (nonunitary, dissipative) change of the state to an eigenstate in the course of a projective measurement is often referred to as "state reduction" or "collapse of the wave function" or "reduction of the wave packet". Note that there is no direct conflict with the unitary evolution in (3) since, during a measurement, a system is never isolated.

In other cases, the prepared state may be quite different. (See the discussion in Landau and Lifshitz, Vol. III, Section 7.) The most general kind of quantum measurement and the resulting prepared state is described by so-called positive operator valued measures (POVMs).

Comments on the Interpretation

Not discussing the foundations of quantum mechanics beyond this is called shut-up-and-calculate. It is a mode of working that suffices for all who do not want to delve into often highly disputed foundational (and partly philosophical) problems. However, the above-mentioned rules are often considered conceptually unsatisfactory because they rely on the not well-defined terms ‘probability’, ‘measurement’, and ‘observer’, whereas in principle one expects that at least measurement and observation can be regarded as quantum mechanical processes or interactions that follow the same fundamental rules and play no special role. The associated issues are treated in different ways by different interpretations of quantum mechanics.

In the Copenhagen Interpretation (also called Standard Interpretation or Orthodox Interpretation; terminology and interpretation details vary), the above rules are simply operational rules that work in practice. A state vector is a tool that one uses to calculate the probabilities of measurement outcomes, and one is agnostic about whether the state vector represents any object that exists in reality. Rules (6) and (7) apply only when a measurement has occurred. Thus, unlike in classical physics, it is not enough to specify the initial conditions of the state and let the state evolve. One must also specify when a measurement has occurred: generally, a measurement is understood to have occurred when a definite (irreversible, i.e., nonunitary) measurement result or outcome has been obtained; e.g., the observer records a mark on a screen. (However, passing a Stern-Gerlach magnet, which in modern terminology is a premeasurement only, is frequently but inaccurately considered to be a measurement, although it is described by a unitary process where even in principle no measurement result becomes available.)

A noteworthy aspect of the standard interpretation is that the state vector cannot represent the whole universe, but must exclude an observer or measuring apparatus that decides when a measurement has occurred; this is the so-called Heisenberg cut between the quantum and the classical world. To date, this has not been a problem in making successful experimental predictions, so practitioners are often satisfied with quantum formalism and the standard interpretation.

However, many have suggested that there is a conceptual problem with the standard interpretation because the whole universe presumably obeys the laws of physics. So there should be laws of physics that describe the whole universe, without any need to exclude any observer or measurement apparatus from the quantitative description. Then one must be able to derive the rules (5)-(7) for measuring subsystems of the universe from the dynamics of the universe. The problem of how to do this is called the measurement problem. A related problem, the problem of the emergence of a classical macroscopic world from the microscopic quantum description, is often considered as essentially solved by decoherence.

To solve the measurement problem, other interpretations of quantum formalism or theories have been proposed. These alternative interpretations or theories are based on different postulates than those of the standard interpretation, but seek to explain why the standard interpretation has been so successful (e.g., by deriving the rules of the standard interpretation from other postulates). The major alternative interpretations or theories that have been proposed include Everett’s Relative State Interpretation (“Many-Worlds”), the Ensemble Interpretation (or Minimal Statistical Interpretation), the Transactional Interpretation, and the Consistent Histories Interpretation.

Still other interpretations (e.g., Bohmian Mechanics, Ghirardi–Rimini–Weber theory, the Cellular Automaton Interpretation, and the Thermal Interpretation) modify one or more of the 7 basic rules and only strive to derive the latter in some approximation, for all practical purposes (FAPP). In particular, rule (7) cannot be fundamental if one wants to interpret the state vector ##|\psi\rangle## in an ontic way, i.e., as some direct and ‘faithful’ representation of ‘externally existing reality’ independent from any observer, observation or measurement.

None of the interpretations currently available has been able to solve the measurement problem in a way deemed satisfactory by those interested in the foundations. So there are still major open problems both with the standard interpretation of quantum mechanics and with alternative interpretations. Fortunately, none of these problems seems to be of any practical relevance.

Comments Here

41 replies
  1. bhobba says:
    I know the insight is aimed at people learning QM and as that is correct and I have nothing to add. But I can’t resist to say that nothing of the 7 rules postulated is inherently quantum, you can formulate classical mechanics (or at least classical statistical mechanics) in a way that incorporate all of them (With a possible exception of rule 5 that might require stating that not all self-adjoint operators are observables).

    Well for that it’s best to introduce Feynman’s path integral approach from those axioms. I did it in a series of posts I made in the classical mechanics sub-forum:

    Basically, classical mechanics is QM where you can cancel most paths and get the classical Principle Of Least Action.


  2. bhobba says:
    So I really would like to hear your opinions (both professors and students) about the pedagogical aspect.
    (Sorry if I’m diverting the topic but that is an important part of it I believe…)

    Pedagogically I like Ballentine, but though many agree, not all do. And you need to work up to it – to start with I actually like Susskind’s theoretical minimum book, then Griffiths, then Sakurai, then Ballentine. But having an agreed set of axioms is good – and the ones here I like.


  3. bhobba says:
    Busch et al. nowhere refer to reduction. Your use of the term is nonstandard. What is termed state reduction is a process that turns pure states into pure states. It corresponds to the Lüders operator ##\rho\to P\rho P## discussed at the end of Section II.3.1 and in II.4.

    What does Dirac call it? I think a complete set of commuting observables. Not that I recommend using Dirac as the book to base the axioms on. Everyone should eventually own a copy because of its historical significance, but I had the misfortune to use it as my first serious introduction to QM and now regret it. Nor do I recommend the next book I read – von Neumann’s classic – although serious students should also own a copy of that.


  4. bhobba says:
    The rules are precise formulations of corresponding statements found more loosely formulated in all textbooks cited (apart from Ballentine).

    Indeed. The rules in the article are excellent.

    Just to elaborate on what Ballentine does. He only uses two rules:

    1. The eigenvalues of Hermitian operators, O (called observables), from some vector space, are the possible outcomes of the observation represented by the operator. Or words to that effect – I can dig up my copy for the exact wording if required.

    2. The average of those outcomes, E(O), is given by E(O) = Trace (OS) where S is a positive operator of unit trace called the state of the system.

    Note that 2 to some extent follows from 1 by Gleason’s Theorem, but that is a whole thread in itself and hinges on non-contextuality, which even the great von Neumann got ‘wrong’, and Grete Hermann was ignored when she pointed it out – not one of science’s finest hours.

    How does he get away with 2? He is sneaky, and the rest are introduced as assumptions so reasonable you do not notice they are assumptions. E.g., his derivation of Schrödinger’s equation assumes the principle of relativity (POR) and Galilean transformations, but this is not stated explicitly – he just assumes probabilities are frame independent, which is so ‘obvious’ you do not recognise, unless you think about it, that it’s invoking the POR. Elegant, but it hides important details – it’s still my favourite treatment though. Also there is another assumption not mentioned in the above: for two systems treated as a single system you take the combination of vector spaces, i.e. the space generated from the basis vectors of both spaces. That is hardly ever mentioned, although it is an assumption. QM is a bit quirky like that – it can be presented in a way that assumptions seem so natural you do not recognise them as assumptions. There are probably others I haven’t mentioned, and perhaps do not even realise myself.


  5. A. Neumaier says:
    I believe it is quite interesting that experimental numbers (S-matrix) can be obtained from the classical Lagrangian, without needing any kind of functional analysis, only probability theory.

    He gets out the numbers. But to get out their meaning as probabilities for scattering results, he needs the standard Hilbert space framework! Indeed, Zeidler starts with that…..

  6. A. Neumaier says:
    Zeidler bases everything on the QA "magic formula" (basically, the definition of the path integral) and the LSZ "magic formula"

    and he recovers only (and only an asymptotic series for) the asymptotic S-matrix, no finite time dynamics.

  7. A. Neumaier says:
    It seems that it is possible, in some way, to recover a Hilbert space from the path integral formulation (and in non-relativistic QM, this Hilbert space is the standard Hilbert space):

    The problem is that you need to assume the positivity of the quantum measure. This cannot be proved for the functional integrals used in QFT – else they would produce finite results without the need for regularization.

    Even in quantum mechanics, proving positivity requires somewhere a Hilbert space argument….

  8. atyy says:
    I cannot edit the message above, but the following should read as Edit 2:

    It seems that it is possible, in some way, to recover a Hilbert space from the path integral formulation (and in non-relativistic QM, this Hilbert space is the standard Hilbert space):

    The usual Hilbert space formulation is primary, and the path integral formulation is secondary. The path integral formulation allows us to do quantum mechanics in the language of statistical mechanics. Not all statistical mechanics path integrals correspond to quantum theories (i.e. they may lack unitary evolution etc.). The constraints on the path integrals that make them correspond to quantum theories come from the Hilbert space formulation, which is why the Hilbert space formulation is primary.

    In the context of relativistic quantum field theory, a set of constraints on path integrals are the Osterwalder-Schrader axioms.

  9. A. Neumaier says:
    But if we postulate that the states of that Hilbert space are irreducible representations of the Galilean group (or the Poincaré group in the relativistic case).

    How do you ensure that in the context of a path integral?

    we would have at least the asymptotic states, without needing operators.

    Under your assumptions you’d just have a single free particle. Nothing asymptotic here.

    Once you have a Hilbert space and a (not necessarily irreducible) unitary representation of such a group, its infinitesimal generators are represented by operators. This gives operators for energy, momentum, angular momentum, and boosts (of the total system).

    It is better to study the subject in some more depth than to dabble in unfounded speculations. It takes some time to become familiar with all the relevant relations between the various approaches and to see what which approach offers and misses.

  10. A. Neumaier says:
    The Schrödinger equation can be derived from the path integral. As a consequence, giving the Schrödinger equation as fundamental suggests the path integral is derived from the Schrödinger equation.

    Yes. The QFT path integral is derived from the QM path integral, which is derived from the Schrödinger equation. Without the latter, one would never know that the path integral formulation is a valid formulation of QM/QFT.

    But in fact, it is basically the opposite. Could there be a different formulation of the axioms of QM if one took the path integral as its basis?

    No. With the path integral formulation (but without the equivalent traditional formulation), you don’t even have a Hilbert space (unless you work in the closed time path setting, which is not common knowledge).

    But if you consistently and exclusively do QM in the Heisenberg picture, it looks just like QFT, just with a 1D space-time in place of 4D.

  11. A. Neumaier says:
    He accepts it only as an effective view (p.243f)

    He says: "This “reduction” of the state is not a new fundamental process,

    Not fundamental means only effective.

    Leslie Ballentine said

    the statement by Dirac (1958, p. 36) to the effect that the state immediately after an R measurement must be an eigenstate of R, seems perverse unless its application is restricted to filtering-type measurements.

    projection can be applied only under special circumstances but this is the whole purpose of doing controlled experiments.

    Ballentine doesn’t restrict to arbitrary controlled experiments but to the much smaller class of "filtering-type measurements" by selection, where collapse is equivalent to taking conditional expectations.

    I like the formulation of (7) in your Insight article, and as I see it, it is consistent with Ballentine’s filtering-type measurement as a fundamental phenomenon which needs special circumstances to be clearly observed.

    whereas Ballentine said explicitly that it is not a fundamental process.

  12. atyy says:
    Thank you for the nice article. I would like to ask your opinion on a slightly different aspect of this which is the pedagogical significance of this scheme.

    I personally believe that this is the correct way of teaching a theory to new learners. (That’s why I liked the article.) Actually, not only quantum mechanics but any physical theory should be taught by giving the postulates in the first encounter and repeating and quoting them all the way throughout the course. Or, stating it differently, we have to first teach what a theory is before the theory itself.

    (A few years ago I wrote an article on this and published it in a Turkish journal easily, but didn’t have the chance to discuss it thoroughly with any colleagues, so I am trying my chance here also to get some feedback and opinions.)

    Different texts on quantum mechanics have many different approaches, like explaining the historical development first, or developing the mathematical framework initially. I even find the method of some great masters like Feynman and Sakurai suspicious when they try to develop the concepts via some thought experiments like double slit or Stern-Gerlach. I have witnessed that the students always seem to get stuck on the "technical" details of the experiment, which are irrelevant to the core, and that diverts their already fragile attention.

    And that’s why I believe that a sound scheme of postulates should be emphasized as THE fundamental thing that matters most. This year I tried this approach in my modern physics course, and after exposing the students to the postulates I continued with the historical development, and I felt that the students were actually more engaged. (It was much easier with special relativity since the postulates can be expressed in daily language; and of course much more challenging with quantum mechanics because of the mathematical language. But I referred to their linear algebra course all the time and said it is nothing but linear algebra, eigenvalues, eigenfunctions, etc…)

    So I really would like to hear your opinions (both professors and students) about the pedagogical aspect.
    (Sorry if I’m diverting the topic but that is an important part of it I believe…)

    I enjoyed something like that too from my teacher. When I first learned quantum mechanics (Xiao-Gang Wen was the lecturer), the postulates were taught very early, but not in the first lesson. If I recall correctly, the first lecture was about dimensional analysis – to introduce Planck’s constant, the lecture 2 was a tour of the ultraviolet problem and old quantum physics, and the postulates were introduced in lecture 3. Then after that wave mechanics was always done in the context of the postulates.

    I believe @vanhees71 has advocated something like that in these forums, though I should let him speak for himself.

  13. A. Neumaier says:
    Ballentine does not reject rule (7) as formulated in your insight article. He just makes distinction between wave function collapse at the moment of detection (which he rejects) and projective measurement (which he calls a filtering-type measurement, see p.246 in his 1998 book).

    What I say about Ballentine in the Insight article (in the slightly polished formulation of this morning – collapse rejected as fundamental but accepted as effective) was designed to be compatible with what he says in his book. It seems to me also compatible with a suitable interpretation of what you say in this quote.

    On p.236-238, Ballentine gives a long argument for his rejection of the conventional formulation of (7) = his (9.9) in the density operator version:

    Leslie Ballentine said

    In order to save that interpretation, they postulate a further process that is supposed to lead from the state (9.8) to a so-called “reduced state” (9.9), which is an eigenvector of the indicator variable, with the eigenvalue being the actual observed value of the indicator position. This postulate of reduction of the state vector creates a new problem […] In all cases in which the initial state is not an eigenstate of the dynamical variable being measured, the final state must involve coherent superpositions of macroscopically distinct indicator eigenvectors. If this situation is unacceptable according to any interpretation, such as A, then that interpretation is untenable.

    He accepts it only as an effective view (p.243f)

    Leslie Ballentine said

    Thus we see that the so-called “reduced” state is physically significant in certain circumstances. But it is only a phenomenological description of an effect on the system (the neutron and spectrometer) due to its environment (the cause of the noise fluctuations), which has for convenience been left outside of the definition of the system. This “reduction” of the state is not a new fundamental process, and, contrary to the impression given in some of the older literature, it has nothing specifically to do with measurement.

    and (rightly, like Landau and Lifshitz, but unlike many other textbooks) only under special circumstances (p.247):

    Leslie Ballentine said

    This filtering process, which has the effect of removing all values of R except those for which R ∈ Δa, can be regarded as preparing a new state […] Indeed, the statement by Dirac (1958, p. 36) to the effect that the state immediately after an R measurement must be an eigenstate of R, seems perverse unless its application is restricted to filtering-type measurements.

    This is why (7) is formulated in the cautious way given in the Insight article.

  14. A. Neumaier says:
    A Lüders operation ##\hat\rho\mapsto \hat P\hat\rho\hat P## in general only projects pure states to pure states if ##\hat P## projects to a 1-dimensional subspace, right?

    It maps an arbitrary pure state with state vector ##\psi## into a pure state with state vector ##\hat P\psi##. That’s enough in the present context.

    Only if you want to prepare a pure state from an arbitrary mixed state then ##\hat P## must project to a 1-dimensional subspace.

  15. A. Neumaier says:
    The Lüders operation is what is used to represent "reduction" of the density matrix by measurement in Section II.3.2 and II.3.3 of Busch, Grabowski, and Lahti (which you mention above). On its own, we can think of it as representing a measurement of ##\hat A## without recording the result.

    Busch et al. nowhere refer to reduction. Your use of the term is nonstandard. What is termed state reduction is a process that turns pure states into pure states. It corresponds to the Lüders operator ##\rho\to P\rho P## discussed at the end of Section II.3.1 and in II.4.

  16. A. Neumaier says:
    If reality can be factored in probabilities, and in fact, existence is no more than a wave of possibilities,

    Neither of this is the case, so making conclusions based on your assumptions has nothing to do with physics.

  17. A. Neumaier says:
    one can instead think of a measurement as conditioning what other measurements can be made jointly with it.

    The conditioning is always of what is measured afterwards – simultaneous measurement is different and unrelated to the collapse.

    a Lüders operation is given by

    But this is different from collapse, which says that given the result you can simply work with the projected state – which is what is done in practice. Without projection one must always carry the complete context around (a full ancilla in an extended Hilbert space), which is awkward when making a long sequence of observations.

  18. A. Neumaier says:
    "4a: Joint observables of a quantum system are represented by mutually commutative self-adjoint operators A, B, … acting on H."

    Although position and momentum do not commute, there are joint position and momentum measurements (e.g., from tracks in bubble chambers or wire chambers), though their accuracy is limited by Heisenberg’s uncertainty relation.

    Rule 7 seems to me to be the most contentious of those listed

    The rules are precise formulations of corresponding statements found more loosely formulated in all textbooks cited (apart from Ballentine). Rule (7) appears there usually in an unqualified (and hence incorrect) form to which your criticism may apply. But I don’t understand what you consider contentious in the actual formulation of (7). It surely applies in the cases listed under "Formal discussion" of (7):
    It is needed to know what is prepared after passing a barrier (e.g., singling out a ray) or polarizer (singling out a polarization state).

  19. A. Neumaier says:
    The article is based on a first draft by @atyy and several improved versions by @tom.stoer. Other significant contributors to the discussions included @fresh_42, @kith, @stevendaryl, and @vanhees71.
    I slightly expanded the final version and added headings and links to make it suitable as an insight article. Maybe the participants of the discussion 20 months ago can confirm their continued support or voice disagreements with this public version.