## Summary:

- Compared to Born's rule in its traditional squared probability amplitude form, the POVM concept is both more general and easier to introduce at an elementary level.

## Main Question or Discussion Point

[Edit 23.12.2019: A much extended, polished version of my contributions to this thread can be found in my paper *Born's rule and measurement* (arXiv:1912.09906).]

> Well, it is simpler than to introduce in full generality Born's rule. I'd still not know, how to teach beginners in QT using the POVM concept. [...] I don't think it's possible to introduce POVMs for physicists without using the standard formulation in the usual terms of observables and states.

Everything can be motivated and introduced nicely for a qubit, using polarization of classical light, as in my Insight article on A Classical View of the Qubit. That article concentrated on preparation (i.e., the states) rather than measurement (i.e., the POVMs). One can follow it up with the following discussion of measurement.

In a first course, I'd introduce pure states later than in the Insight article, deriving initially von Neumann's dynamics for the density operator rather than the Schrödinger equation. This would emphasize the idealization involved in the latter. The Schrödinger equation is really needed only much later, as a computational tool.

Having the Hilbert space and the unnormalized density operator for sources, one introduces a detector as a collection of detector elements of which at most one responds at any given time, defining a stochastic process of events. The measurement postulate takes the following simple form:

**(DRP) (detector response principle)** A detector element ##k## responds to a stationary source in state ##\rho## with a rate ##p_k## depending linearly on the state ##\rho##.

The linearity is well motivated by beam experiments: changing the intensity amounts to multiplying the density by a scalar, and combining two sources amounts to adding their densities. Thus the linearity of typical instrument responses is easy to check by experiment, and the motivation is complete.

Postulate (DRP) is the only measurement postulate; everything else can be derived from it when the Hilbert space is finite-dimensional.

By linearity, the rates satisfy ##p_k=\sum_{i,j} P_{kji}\rho_{ij}## for suitable complex numbers ##P_{kji}##. If the Hilbert space has finite dimension ##n##, these coefficients can be found operationally by approximately measuring the rates for at least ##n^2## linearly independent states ##\rho## and solving the resulting linear least squares problem for the coefficients. This is called quantum detection tomography.
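To make the tomography step concrete, here is a minimal NumPy sketch for a single detector element of a qubit (##n=2##). Everything specific in it is an assumption made for illustration: the matrix `P_true`, the six probe states, and the noise level are invented, and a real experiment would use measured rates instead of simulated ones.

```python
import numpy as np

# Hypothetical "true" coefficients of one detector element (unknown in practice).
P_true = np.array([[0.8, 0.1 - 0.05j],
                   [0.1 + 0.05j, 0.2]])

# Six probe states (more than the n^2 = 4 needed): the polarization states
# |0>, |1>, |+>, |->, |+i>, |-i>, written as density matrices psi psi^*.
kets = [
    np.array([1, 0], dtype=complex),
    np.array([0, 1], dtype=complex),
    np.array([1, 1], dtype=complex) / np.sqrt(2),
    np.array([1, -1], dtype=complex) / np.sqrt(2),
    np.array([1, 1j], dtype=complex) / np.sqrt(2),
    np.array([1, -1j], dtype=complex) / np.sqrt(2),
]
probes = [np.outer(k, k.conj()) for k in kets]

# Simulated noisy rates p = sum_{i,j} P_ji rho_ij for each probe state.
rng = np.random.default_rng(0)
rates = np.array([np.sum(P_true.T * rho).real for rho in probes])
rates = rates + 1e-3 * rng.standard_normal(len(probes))

# Linear least squares: each row holds the entries rho_ij of one probe state,
# and the unknown vector holds the matching coefficients P_ji.
design = np.array([rho.flatten() for rho in probes])
coeffs, *_ = np.linalg.lstsq(design, rates.astype(complex), rcond=None)

P_est = coeffs.reshape(2, 2).T            # entry (j, i) multiplies rho_ij
P_est = (P_est + P_est.conj().T) / 2      # remove the non-Hermitian part due to noise
error = np.max(np.abs(P_est - P_true))
```

With these made-up numbers, `error` stays at the order of the simulated noise. Using more probe states than the minimum ##n^2## is what makes the least squares fit average out measurement errors.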

Introducing the matrices ##P_k## with ##(j,i)## entries ##P_{kji}##, this can be written as

$$p_k=Tr~\rho P_k,$$

thus providing a derivation of the POVM extension of Born's probability formula from very simple first principles. The properties of the matrices can be deduced from the fact that the ##p_k## are rates of a stationary process. Hence they are nonnegative and sum to a constant. Since ##p_k## is real for all states ##\rho##, the ##P_k## must be Hermitian. Picking arbitrary pure states ##\rho=\psi\psi^*## shows that ##P_k## is positive semidefinite. Summing the rates shows that the sum of the ##P_k## is a multiple of the identity. Requiring this multiple to be 1 is conventional and amounts to a choice of units for the rate in such a way that if the state of the source is normalized to trace 1, the ##p_k## become probabilities rather than rates. Thus the ##P_k## form a POVM and we have derived everything.
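The three defining properties are easy to check numerically. The following sketch does so for an example of my choosing, the so-called trine POVM of a qubit; the helper `is_povm`, the test state `rho`, and the tolerance are hypothetical names and values, not part of the derivation.

```python
import numpy as np

def is_povm(elements, tol=1e-10):
    """Check that the matrices are Hermitian, positive semidefinite, and sum to 1."""
    dim = elements[0].shape[0]
    hermitian = all(np.allclose(P, P.conj().T, atol=tol) for P in elements)
    positive = all(np.linalg.eigvalsh(P).min() >= -tol for P in elements)
    complete = np.allclose(sum(elements), np.eye(dim), atol=tol)
    return bool(hermitian and positive and complete)

# Trine POVM: three rank-one elements (2/3) psi psi^* with polarization
# directions 60 degrees apart. None of them is a projector.
angles = [0.0, np.pi / 3, 2 * np.pi / 3]
kets = [np.array([np.cos(t), np.sin(t)], dtype=complex) for t in angles]
trine = [(2 / 3) * np.outer(k, k.conj()) for k in kets]

# An arbitrary state normalized to trace 1, so that the p_k are probabilities.
rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)

# p_k = Tr(rho P_k), and the same numbers from the index form sum_{i,j} P_kji rho_ij.
probs = np.array([np.trace(rho @ P).real for P in trine])
index_form = np.array([np.einsum("ji,ij->", P, rho).real for P in trine])
```

Here `is_povm(trine)` holds and `probs` is a nonnegative vector summing to 1, while dropping one of the three elements violates the completeness condition.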

If there are a large number of detector elements, the detection events are usually encoded numerically. The value assigned to the ##k##th detection event is pure convention, and can be any number ##a_k##, or even a vector when the elements are arranged in a multidimensional array. It is whatever has been written on the scale the pointer points to, or whatever has been programmed to be written by an automatic digital recording device.

The state-dependent formula for the expectation of the measured observable that follows from the POVM together with the value assignment is ##\langle A\rangle=Tr~\rho A## with the operator (or operator vector) ##A=\sum a_kP_k##. We may say that the detector measures an observable represented by the operator (vector) ##A##.
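As a small check of this formula, the sketch below compares the mean of the recorded values, ##\sum_k a_kp_k##, with ##Tr~\rho A##. The POVM, the labels `values`, and the state `rho` are arbitrary choices made for the example.

```python
import numpy as np

# Example POVM (an assumption): three rank-one elements of a qubit summing to 1.
angles = [0.0, np.pi / 3, 2 * np.pi / 3]
kets = [np.array([np.cos(t), np.sin(t)], dtype=complex) for t in angles]
povm = [(2 / 3) * np.outer(k, k.conj()) for k in kets]

# Values a_k written on the scale; pure convention.
values = np.array([1.0, 0.0, -1.0])

# Operator measured by the detector: A = sum_k a_k P_k.
A = sum(a * P for a, P in zip(values, povm))

# A trace-1 state and the resulting detection probabilities p_k = Tr(rho P_k).
rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
probs = np.array([np.trace(rho @ P).real for P in povm])

mean_from_events = float(values @ probs)          # sum_k a_k p_k
mean_from_operator = float(np.trace(rho @ A).real)  # Tr(rho A)
```

The two means agree for every state, which is just linearity of the trace; nothing about projectors or eigenvalues is needed.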

Note that the same operator ##A## in the expectation can be decomposed in many ways into a linear combination of many POVM terms; thus there may be many different POVMs measuring observables corresponding to the same operator ##A##.
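A simple qubit example of this non-uniqueness, under assumptions of my own: a sharp and an unsharp polarization measurement, with suitably rescaled labels, both yield ##A=\sigma_z##. The sharpness parameter `eta` and the test state are made up.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
eta = 0.5  # hypothetical sharpness of the second detector, 0 < eta <= 1

def measured_operator(values, povm):
    """A = sum_k a_k P_k for the given labels and POVM elements."""
    return sum(a * P for a, P in zip(values, povm))

def probabilities(rho, povm):
    """p_k = Tr(rho P_k) for a trace-1 state rho."""
    return np.array([np.trace(rho @ P).real for P in povm])

# Detector 1: projective measurement with labels +1, -1.
sharp = [(I2 + sigma_z) / 2, (I2 - sigma_z) / 2]
sharp_values = [1.0, -1.0]

# Detector 2: unsharp measurement with labels stretched by 1/eta.
unsharp = [(I2 + eta * sigma_z) / 2, (I2 - eta * sigma_z) / 2]
unsharp_values = [1 / eta, -1 / eta]

A_sharp = measured_operator(sharp_values, sharp)
A_unsharp = measured_operator(unsharp_values, unsharp)

rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
p_sharp = probabilities(rho, sharp)
p_unsharp = probabilities(rho, unsharp)
```

Both detectors give the same expectation ##Tr~\rho\sigma_z## in every state, but their detection probabilities differ, so the statistics of individual events distinguish them.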

By picking the values carefully one can choose them to approximate a particular operator ##X## of interest, for example the position operator. This corresponds to the classical situation of labeling the scale of a meter to optimally match a desired observable.

If the detector can be tuned by adjusting parameters ##\theta## affecting its responses, the ##P_k=P_k(\theta)## depend on these parameters, giving ##A=\sum a_kP_k(\theta)##. Now both the labels ##a_k## and the parameters ##\theta## can be tuned to improve the accuracy with which the desired ##X## is approximated. This is the process called calibration. Constructing detector devices that allow a high-quality measurement corresponding to theoretically important operators is the challenge of high-precision experimental physics.
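A toy version of calibration can be sketched as follows. I assume a hypothetical two-element detector whose single parameter ##\theta## is an analyzer angle, take ##X=\sigma_x## as the target, fit the labels by linear least squares for each ##\theta## on a grid, and keep the best setting. The model `povm(theta)`, the sharpness `ETA`, and the grid are illustrative assumptions, and the Frobenius norm is just one possible measure of accuracy.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
ETA = 0.5  # hypothetical detector sharpness

def povm(theta):
    """Two-element POVM of an unsharp analyzer set to angle theta (toy model)."""
    direction = np.cos(theta) * SZ + np.sin(theta) * SX
    return [(I2 + ETA * direction) / 2, (I2 - ETA * direction) / 2]

def fit_labels(elements, X):
    """Real labels a_k minimizing the Frobenius norm of sum_k a_k P_k - X."""
    cols = np.array([P.flatten() for P in elements]).T
    # Stack real and imaginary parts so that the fitted labels are real.
    design = np.vstack([cols.real, cols.imag])
    target = np.concatenate([X.flatten().real, X.flatten().imag])
    labels, *_ = np.linalg.lstsq(design, target, rcond=None)
    residual = np.linalg.norm(design @ labels - target)
    return labels, residual

# Tune theta on a grid, fitting the labels at each setting.
thetas = np.linspace(0, np.pi, 181)
fits = [fit_labels(povm(t), SX) for t in thetas]
best = int(np.argmin([residual for _, residual in fits]))
best_theta = thetas[best]
best_labels, best_residual = fits[best]
```

In this toy model the search settles at ##\theta=\pi/2## with labels ##\pm 1/\eta##, where ##\sum a_kP_k(\theta)## reproduces ##\sigma_x## up to rounding. For a real detector the ##P_k(\theta)## would come from detection tomography rather than from a formula.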

The derivation just given is simple, intuitive, and complete. It tells everything needed to check and, if necessary, calibrate arbitrary detectors for their claimed measurement properties.

The derivation is far simpler, far more intuitive, and far more complete than what is needed to introduce students new to quantum physics to Born's rule, with its initially very weird formula for probabilities in a pure state.

Born's rule in expectation form is the very idealized case (realized experimentally only approximately, in very special situations) where the ##P_k## are orthogonal projectors, i.e., ##P_k^2=P_k=P_k^*## and ##P_jP_k=P_kP_j## for all ##j,k##. In this special case, and only in this case, the components of ##A## commute and have a joint discrete spectrum, given by the ##a_k##. This special case is distinguished in that by relabeling the values ##a_k## to ##f(a_k)##, the same detector also measures any function ##f(A)## of ##A##.
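The relabeling property can be illustrated numerically with ##f(x)=x^2##. In the sketch below the projective POVM is built from the eigenvectors of an arbitrarily chosen Hermitian matrix `A`, and an unsharp POVM with made-up sharpness `eta` serves as a counterexample.

```python
import numpy as np

# Projective case: spectral projectors of a Hermitian matrix (arbitrary example).
A = np.array([[2.0, 0.5], [0.5, -1.0]], dtype=complex)
eigenvalues, eigenvectors = np.linalg.eigh(A)
projectors = [np.outer(eigenvectors[:, k], eigenvectors[:, k].conj())
              for k in range(2)]

# Relabeling a_k -> a_k^2 on the same detector gives the operator A^2.
relabeled = sum(a**2 * P for a, P in zip(eigenvalues, projectors))
projective_measures_square = np.allclose(relabeled, A @ A)

# Unsharp case: labels +-1/eta on (1 +- eta sigma_z)/2 give A = sigma_z,
# but squaring the labels gives (1/eta^2) times the identity, not sigma_z^2.
I2 = np.eye(2, dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
eta = 0.5
unsharp = [(I2 + eta * sigma_z) / 2, (I2 - eta * sigma_z) / 2]
labels = [1 / eta, -1 / eta]
unsharp_relabeled = sum(b**2 * Q for b, Q in zip(labels, unsharp))
unsharp_measures_square = np.allclose(unsharp_relabeled, sigma_z @ sigma_z)
```

Only the first flag comes out true: with a non-projective POVM, relabeling the scale changes the measured operator to something other than ##f(A)##.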

To get Born's rule in its traditional textbook form, one has to further specialize the state to be a normalized pure state, ##\rho=\psi\psi^*## with ##\psi^*\psi=1##, and one finds ##p_k=\psi^*P_k\psi##.
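For rank-one projectors ##P_k=e_ke_k^*## onto an orthonormal basis, this reduces to the familiar squared amplitudes ##p_k=|e_k^*\psi|^2##. A quick numerical check, with a state vector and basis chosen only for illustration:

```python
import numpy as np

# A normalized pure state (arbitrary example) and its density matrix psi psi^*.
psi = np.array([3, 4j], dtype=complex) / 5
rho = np.outer(psi, psi.conj())

# Rank-one orthogonal projectors onto the standard basis vectors e_k.
basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
projectors = [np.outer(e, e.conj()) for e in basis]

# Three equivalent expressions for the probabilities.
p_trace = np.array([np.trace(rho @ P).real for P in projectors])       # Tr(rho P_k)
p_sandwich = np.array([(psi.conj() @ P @ psi).real for P in projectors])  # psi^* P_k psi
p_amplitude = np.array([abs(np.vdot(e, psi)) ** 2 for e in basis])      # |e_k^* psi|^2
```

All three arrays agree (here 0.36 and 0.64 up to rounding), so the textbook formula is recovered as a doubly special case of ##p_k=Tr~\rho P_k##.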
