Heisenberg-Robertson uncertainty relation vs. noise-disturbance measures

vanhees71 · Jan 18, 2013

Finally I found the time to write up my account of the interpretation of
the Heisenberg uncertainty principle, and of the question whether it can
be interpreted as Heisenberg did in his very first paper on the
subject. It is well known that this interpretation is not
compatible with quantum theory (at least in the ensemble interpretation
or in the Copenhagen flavor according to Bohr, which is quite similar
to the ensemble interpretation if you take away the collapse of the
state after a measurement). Nevertheless one still finds it in some
newer textbooks on quantum theory. Ironically some older books state it
correctly, e.g., Dirac's classic book.

Anyway, I'd like to discuss the following paper:

Experimental demonstration of a universally valid error–disturbance uncertainty relation
in spin measurements
Jacqueline Erhart, Stephan Sponar, Georg Sulyok, Gerald Badurek, Masanao Ozawa
and Yuji Hasegawa, Nat. Phys. 8, 188 (2012)
http://dx.doi.org/10.1038/NPHYS2194

I think that's a marvelous example of a real experiment that can be understood about halfway through an introductory lecture on quantum theory, and that can be used to discuss interpretational issues of quantum theory and measurement theory. I'll try to translate this paper into the normal language of physicists to make it easier to understand. Concerning my standpoint toward interpretation I should say that I'm a follower of the "Minimal Statistical Interpretation" or "Ensemble Interpretation".

The experiment is done on the spin of a single neutron, and it consists entirely of ideal von Neumann measurements of the spin in a kind of Stern-Gerlach experiment. So there is no need for the sophisticated (but sometimes necessary) analysis in terms of formal weak-measurement theory with positive operator-valued measures. I'll not go into the technical realization of the experiment.

First let's remember briefly the meaning of states in quantum theory. In the formalism a quantum state is represented by a ray in a Hilbert space appropriate to describe the quantum object under consideration. Here this object is the spin of a single neutron, and the Hilbert space of states is given by the two-dimensional Hilbert space that is represented by vectors with two complex components [itex]\mathbb{C}^2[/itex].

Observables are represented by self-adjoint operators. In the case of spin the entire space of observables is built by the three spin components, here represented by the three Pauli matrices, fulfilling the commutation relations
[tex][\hat{\sigma}_j,\hat{\sigma}_k]=2 \mathrm{i} \epsilon_{jkl} \hat{\sigma}_l.[/tex]
The actual spin components are represented by
[tex]\hat{s}_j=\frac{\hbar}{2} \hat{\sigma}_j,[/tex]
but it's more convenient to work with the [itex]\hat{\sigma}_j[/itex]'s.
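The commutation relations can be checked numerically. The following is a minimal sketch in plain Python, assuming the standard matrix representation of the Pauli matrices on [itex]\mathbb{C}^2[/itex]; the helper names (`matmul`, `commutator`, `levi_civita`, `commutation_relations_hold`) are illustrative and not taken from the paper.

```python
# Pauli matrices in the standard representation on C^2 (an assumed convention).
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
PAULI = [SX, SY, SZ]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def commutation_relations_hold(tol=1e-12):
    """True if [sigma_a, sigma_b] = 2i * sum_c eps_abc * sigma_c for all a, b."""
    for a in range(3):
        for b in range(3):
            lhs = commutator(PAULI[a], PAULI[b])
            for r in range(2):
                for s in range(2):
                    rhs = sum(2j * levi_civita(a, b, c) * PAULI[c][r][s]
                              for c in range(3))
                    if abs(lhs[r][s] - rhs) > tol:
                        return False
    return True

print(commutation_relations_hold())  # True
```

In particular `commutator(SX, SY)` reproduces [itex]2\mathrm{i}\hat{\sigma}_z[/itex].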

The possible outcomes of an ideal measurement of an observable are given by the eigenvalues of the representing operator. In our case of a single spin we have the simplest situation: for each eigenvalue [itex]\sigma_x \in \{-1,1\}[/itex] of [itex]\hat{\sigma}_x[/itex] there is one and only one eigenvector, i.e., the observable is nondegenerate. In the following we'll thus concentrate on this simplest case. The state vectors and also the eigenvectors are all considered to be normalized to 1. The eigenvectors of a self-adjoint operator form a complete orthonormal set, i.e.,
[tex]\langle \sigma_x|\sigma_x' \rangle=\delta_{\sigma_x,\sigma_x'}.[/tex]

Now we can also formulate the physical meaning of the states. If the system has been prepared in a pure state represented by the normalized vector [itex]|\psi \rangle[/itex], then the probability to find the value [itex]\sigma_x[/itex] when measuring the spin-x component is given by
[tex]P(\sigma_x|\psi)=|\langle \sigma_x|\psi \rangle|^2.[/tex]
This is Born's rule.

Now, according to the "Minimal Statistical Interpretation" applied here, the entire meaning of the states is to provide probabilities for the outcomes of measurements, given the information about the state preparation, according to Born's rule. Thus a single measurement cannot tell us anything about the spin-x component of the neutron. Instead we have to prepare many neutrons in the same spin state independently from each other (in the following called an ensemble), measure the spin-x component, and then evaluate the rates with which either [itex]\sigma_x=+1[/itex] or [itex]\sigma_x=-1[/itex] occurs. For the ideal measurements considered here the relative frequencies of these counts give, in the limit of very large ensembles, the probabilities for the occurrence of each of these possible values, which can be compared to the predictions of quantum theory. A quantum mechanical state vector thus refers not to properties of single neutron spins but to ensembles of neutrons with equally prepared spins.

If and only if we have prepared the ensemble such that it's represented by the state [itex]|\psi \rangle = |\sigma_x \rangle[/itex], then the spin-x component is determined, because we have
[itex]P(\sigma_x'|\sigma_x)=|\langle \sigma_x'|\sigma_x \rangle|^2=\delta_{\sigma_x',\sigma_x}.[/itex]
If we measure [itex]\sigma_y[/itex] instead of [itex]\sigma_x[/itex], we find
[tex]P(\sigma_y|\sigma_x)=\frac{1}{2},[/tex]
as one can easily calculate using the concrete realization of the Pauli matrices on [itex]\mathbb{C}^2[/itex]. This shows that, if the system is prepared to have a certain sigma-x component, we don't know much about the value of the sigma-y component.
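A short sketch of this calculation in plain Python, assuming the standard representation in which the eigenvectors of [itex]\hat{\sigma}_x[/itex] are [itex](1,\pm 1)/\sqrt{2}[/itex] and those of [itex]\hat{\sigma}_y[/itex] are [itex](1,\pm \mathrm{i})/\sqrt{2}[/itex]; the names `EIG_X`, `EIG_Y`, `inner` and `born_probability` are illustrative only.

```python
import math

S = 1 / math.sqrt(2)
# Normalized eigenvectors keyed by eigenvalue (standard representation assumed).
EIG_X = {+1: (S, S), -1: (S, -S)}
EIG_Y = {+1: (S, 1j * S), -1: (S, -1j * S)}

def inner(bra, ket):
    """Scalar product <bra|ket>, antilinear in the first argument."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def born_probability(outcome_vec, state):
    """Born's rule: P = |<outcome|state>|^2."""
    return abs(inner(outcome_vec, state)) ** 2

# All four combinations of preparation (sigma_x) and outcome (sigma_y).
for sx in (+1, -1):
    for sy in (+1, -1):
        p = born_probability(EIG_Y[sy], EIG_X[sx])
        print(f"P(sigma_y={sy:+d} | sigma_x={sx:+d}) = {p:.3f}")
```

Each of the four probabilities comes out as 1/2, while `born_probability(EIG_X[a], EIG_X[b])` gives 1 for equal and 0 for different eigenvalues.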

This brings us to the idea to calculate standard deviations of the observables. If the system is prepared in the state [itex]|\psi \rangle[/itex], it's easy to see from the above summarized probabilities that the standard deviation of an observable A, represented by the self-adjoint operator [itex]\hat{A}[/itex] is given by
[tex]\sigma(A|\psi)=\sqrt{\langle \psi|\hat{A}^2|\psi \rangle-\langle \psi|\hat{A}|\psi \rangle^2}.[/tex]
One can prove, making use of the positive definiteness of the scalar product in Hilbert space, that the so defined "uncertainties", i.e., the standard deviations of observables, given the preparation of the neutron's spin represented by [itex]|\psi \rangle[/itex], fulfill the Heisenberg-Robertson uncertainty relation
[tex]\sigma(A|\psi) \sigma(B|\psi) \geq \frac{1}{2} |\langle \psi|[\hat{A},\hat{B}]|\psi \rangle|.[/tex]
This immediately shows that two observables A and B can in general have determined values due to an appropriate preparation of a state only if their representing operators commute. Then we say the observables are compatible, otherwise they are incompatible.
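As a numerical illustration, the sketch below evaluates both sides of the relation in its standard form, [itex]\sigma(A|\psi)\sigma(B|\psi) \geq \frac{1}{2}|\langle \psi|[\hat{A},\hat{B}]|\psi\rangle|[/itex], for [itex]\hat{\sigma}_x[/itex] and [itex]\hat{\sigma}_y[/itex]. It is plain Python with illustrative helper names (`expval`, `std`, `robertson_sides`), and the state is assumed to be a normalized two-component vector.

```python
import math

SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expval(op, psi):
    """<psi|op|psi> for a normalized two-component state psi."""
    return sum(complex(psi[i]).conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))

def std(op, psi):
    """Standard deviation sqrt(<op^2> - <op>^2) of a self-adjoint op."""
    var = expval(matmul(op, op), psi).real - expval(op, psi).real ** 2
    return math.sqrt(max(var, 0.0))  # clip tiny negative rounding errors

def robertson_sides(a, b, psi):
    """Return (std(a) * std(b), |<psi|[a, b]|psi>| / 2)."""
    ab, ba = matmul(a, b), matmul(b, a)
    comm = [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]
    return std(a, psi) * std(b, psi), 0.5 * abs(expval(comm, psi))

# Spin-z eigenstate: both sides equal 1, so the bound is saturated.
print(robertson_sides(SX, SY, (1, 0)))
```

For a [itex]\sigma_z[/itex] eigenstate the relation holds with equality; for other normalized states the left-hand side is at least as large as the right-hand side.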

It is now important to make oneself clear the physical meaning of this Robertson uncertainty relation. Ironically Heisenberg got the interpretation wrong in his first paper on the subject, and he was corrected by Bohr afterwards. Unfortunately the wrong interpretation stuck and appears in many, even newer, textbooks on quantum mechanics. So let's first give the correct interpretation, which immediately follows from the above considerations about the meaning of states in quantum theory:

To obtain the above given standard deviations, one has to prepare ensembles of single-neutron spins that are described by the state vector [itex]|\psi \rangle[/itex]. Then one has to measure the probability distribution for the outcomes of one observable, in our case [itex]\sigma_x[/itex] (measured on one so prepared ensemble), and then do the same for the other observable, in our case [itex]\sigma_y[/itex], where this new measurement is done on a second, independently prepared ensemble! We do not perform a joint measurement of both observables on one single neutron but do statistics on independently prepared ensembles of such single neutrons, performing independent measurements of each observable separately. That's the meaning of the Robertson uncertainty relation: It describes a relation between the uncertainties of two observables (quantified as their standard deviations with the probabilities provided by Born's rule) due to the preparation of a quantum state (i.e., of the ensembles), measuring the observables separately on equally prepared ensembles.

Heisenberg now misinterpreted the relation as follows: He thought that the uncertainty relation describes the properties of measurements. His idea was that the uncertainty relation comes from the unavoidable perturbation of the system by a measurement. For our case he would have stated that the measurement of observable A (here [itex]\sigma_x[/itex]) with an "accuracy" [itex]\epsilon(A|\psi,M)[/itex] unavoidably disturbs the other observable B (here [itex]\sigma_y[/itex]) such that it gets a "disturbance" [itex]\eta(B|\psi,M)[/itex]. The meaning of my notation here is the following:

[itex]\epsilon(A|\psi,M)[/itex] is the systematic uncertainty of a (not necessarily ideal!) measurement of observable A on a quantum system prepared in the state [itex]|\psi \rangle[/itex], realized by performing an ideal measurement of another observable [itex]M[/itex]. This is a particularly simple special case of a "weak measurement" of A. Of course, if A is measured ideally, this systematic uncertainty must vanish by definition:
[tex]\epsilon(A|\psi,A)=0.[/tex]

[itex]\eta(B|\psi,M)[/itex] is a measure for the disturbance of another observable B, given that the system is prepared in the state [itex]|\psi \rangle[/itex] and that the ideal measurement of [itex]M[/itex] is performed.

It is of course correct that in general the (weak or ideal) measurement of one observable A "disturbs" the system such that the outcome of a subsequent (weak or ideal) measurement of another observable B is different from the outcome of a direct (weak or ideal) measurement of B without the measurement of A before. It is wrong, however, to interpret the above quantities as the standard deviations in the above Robertson uncertainty relation, which is due to the possible preparation of quantum systems but not due to the measurement process itself.

Ozawa now derived mathematically well-defined "noise-disturbance uncertainty relations" for very general measurements. Fortunately, to understand the spin measurements in the above quoted paper, one doesn't need this pretty complicated mathematical apparatus, because this experiment realizes (to a fantastic accuracy) "ideal von Neumann measurements" as described above.

The first step is to define the measures for the "systematic uncertainty" of A and the "disturbance" of B in a clear way. We'll do this right away for the experiment performed in the above cited paper. The authors use the spin of a single neutron, which is prepared in a certain spin-z state:
[tex]|\psi \rangle=|\sigma_z \rangle.[/tex]
The observable to be weakly measured is [itex]A=\sigma_x[/itex] and the observable whose disturbance through this weak measurement we'd like to determine is [itex]B=\sigma_y[/itex], which is incompatible with [itex]\sigma_x[/itex], because [itex][\hat{\sigma}_x,\hat{\sigma}_y]=2 \mathrm{i} \hat{\sigma}_z \neq 0[/itex]. The weak measurement of [itex]\sigma_x[/itex] is realized as an ideal measurement of the spin component
[tex]M_{\varphi}=\sigma_{\varphi}=\sigma_x \cos \varphi+\sigma_y \sin \varphi.[/tex] Note that this covers the special cases where one does an ideal measurement of [itex]\sigma_x[/itex] (for [itex]\varphi=0[/itex]) or an ideal measurement of [itex]\sigma_y[/itex] (for [itex]\varphi=\pi/2[/itex]), which shouldn't disturb the outcome of a subsequent new measurement of [itex]\sigma_y[/itex].

It is now easy to define the systematic error for the measurement of [itex]\sigma_x[/itex], given that the neutron is prepared in the state [itex]|\psi \rangle[/itex] before its weak measurement. It should reflect the systematic error made by not performing an ideal measurement of [itex]\sigma_x[/itex] but instead doing an ideal measurement of [itex]\sigma_{\varphi}[/itex]. By definition that's simply given by
[tex]\epsilon(\sigma_x|\psi,\sigma_{\varphi})=\sqrt{ \langle \psi|(\hat{\sigma}_{x}-\hat{\sigma}_{\varphi})^2| \psi \rangle}.[/tex]

To find a measure for the "disturbance" we must analyse what happens, according to standard quantum theory, if we first perform the weak measurement of [itex]\sigma_x[/itex], i.e., here the ideal measurement of [itex]\sigma_{\varphi}[/itex], and then measure [itex]\sigma_y[/itex], compared to the case that we directly measure [itex]\sigma_y[/itex] without first doing the weak measurement of [itex]\sigma_x[/itex].

According to the usual rules of quantum theory concerning ideal measurements, after the ideal measurement of [itex]\sigma_{\varphi}[/itex] but not filtering to a specific outcome of this ideal measurement (an ideal measurement always permits in principle such a filtering, because by definition it resolves precisely the possible values of the measured observable) the system is described by a mixed state, i.e., it is clear that after the measurement of [itex]\sigma_{\varphi}[/itex] one knows its value, because we have measured it, but one keeps all neutrons of the ensemble. This situation is described by the Statistical Operator
[tex]\hat{R}(\sigma_{\varphi}|\psi)=\sum_{ \sigma_{\varphi} = \pm 1} |\langle \sigma_{\varphi}|\psi \rangle|^2 |\sigma_{\varphi} \rangle \langle \sigma_{\varphi}|.[/tex]
Thus performing a measurement on any observable [itex]B[/itex] leads to the expectation value of this variable given by
[tex]E(B|\psi,\sigma_{\varphi})=\mathrm{Tr} [\hat{R}(\sigma_{\varphi}|\psi) \hat{B}] = \sum_{\sigma_{\varphi}=\pm 1} | \langle \sigma_{\varphi}|\psi \rangle|^2 \langle \sigma_{\varphi}|\hat{B}|\sigma_{\varphi} \rangle.[/tex]
Seen from the point of view of the so well-defined procedure of a joint measurement of [itex]\sigma_{\varphi}[/itex] and [itex]B[/itex], by first doing an ideal measurement of [itex]\sigma_{\varphi}[/itex] and then subsequently an ideal measurement of [itex]B[/itex], we can describe this as an ideal measurement of the observable [itex]O(B|\sigma_{\varphi})[/itex] represented by the operator
[tex]\hat{O}(B|\sigma_{\varphi})=\sum_{\sigma_{\varphi}=\pm 1} |\sigma_{\varphi} \rangle \langle \sigma_{\varphi}|\hat{B}|\sigma_{\varphi} \rangle \langle \sigma_{\varphi}|.[/tex]
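The two descriptions can be compared numerically. The sketch below (plain Python, illustrative helper names) builds the statistical operator [itex]\hat{R}(\sigma_{\varphi}|\psi)[/itex] and the operator [itex]\hat{O}(B|\sigma_{\varphi})=\sum_{\sigma_{\varphi}} |\sigma_{\varphi}\rangle \langle \sigma_{\varphi}|\hat{B}|\sigma_{\varphi}\rangle \langle \sigma_{\varphi}|[/itex] for [itex]B=\sigma_y[/itex]. It assumes the standard representation, in which the eigenvectors of [itex]\hat{\sigma}_{\varphi}[/itex] are [itex](1,\pm \mathrm{e}^{\mathrm{i}\varphi})/\sqrt{2}[/itex]; the angle and the test state are arbitrary choices.

```python
import cmath
import math

SY = [[0, -1j], [1j, 0]]

def sigma_phi_eigenvectors(phi):
    """Normalized eigenvectors of sigma_x cos(phi) + sigma_y sin(phi)."""
    s = 1 / math.sqrt(2)
    phase = cmath.exp(1j * phi)
    return {+1: (s, s * phase), -1: (s, -s * phase)}

def inner(bra, ket):
    """Scalar product <bra|ket>."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def apply(op, v):
    """Matrix-vector product op|v>."""
    return tuple(sum(op[i][j] * v[j] for j in range(2)) for i in range(2))

def mix_projectors(phi, weight):
    """sum_k weight(k) |k><k| over the two sigma_phi eigenvectors |k>."""
    total = [[0j, 0j], [0j, 0j]]
    for v in sigma_phi_eigenvectors(phi).values():
        w = weight(v)
        for i in range(2):
            for j in range(2):
                total[i][j] += w * v[i] * complex(v[j]).conjugate()
    return total

def statistical_operator(phi, psi):
    """State after a non-selective ideal sigma_phi measurement."""
    return mix_projectors(phi, lambda v: abs(inner(v, psi)) ** 2)

def sequential_operator(b, phi):
    """O(B|sigma_phi): each projector weighted by <k|b|k>."""
    return mix_projectors(phi, lambda v: inner(v, apply(b, v)))

def trace_product(a, b):
    """Tr(a b) for 2x2 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

phi = 0.4                                                       # arbitrary angle
psi = (math.cos(0.5), cmath.exp(0.3j) * math.sin(0.5))          # arbitrary state
lhs = trace_product(statistical_operator(phi, psi), SY)         # Tr(R B)
rhs = inner(psi, apply(sequential_operator(SY, phi), psi))      # <psi|O|psi>
print(abs(lhs - rhs) < 1e-12)
```

For [itex]B=\sigma_y[/itex] the operator reduces to [itex]\hat{O}=\sin \varphi \, \hat{\sigma}_{\varphi}[/itex], since [itex]\langle \sigma_{\varphi}|\hat{\sigma}_y|\sigma_{\varphi} \rangle=\sigma_{\varphi} \sin \varphi[/itex].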
Thus it is intuitive to quantify the disturbance of [itex]B=\sigma_y[/itex] by adding two contributions in quadrature: the inaccuracy caused by measuring [itex]\sigma_y[/itex] not on the system prepared in [itex]|{\psi} \rangle[/itex] but in the state after the ideal filter measurement of [itex]\sigma_{\varphi}[/itex] (which defines the weak measurement of [itex]\sigma_x[/itex]), and the statistical uncertainty of the ideal measurement of [itex]\sigma_y[/itex] given the outcome of the ideal [itex]\sigma_{\varphi}[/itex] measurement. The former inaccuracy is given by
[tex]\eta_1^2(\sigma_y|\psi,\sigma_{\varphi})=\langle \psi |(\hat{O}(\sigma_y|\sigma_{\varphi})-\hat{\sigma}_y)^2| \psi \rangle.[/tex]
The latter is the variance of [itex]\sigma_y[/itex] in each eigenstate of [itex]\hat{\sigma}_{\varphi}[/itex], averaged over the outcomes of the [itex]\sigma_{\varphi}[/itex] measurement:
[tex]\eta_2^2(\sigma_y|\psi,\sigma_{\varphi})=\sum_{\sigma_{\varphi}=\pm 1} |\langle \sigma_{\varphi}|\psi \rangle|^2 \left ( \langle \sigma_{\varphi}|\hat{\sigma}_y^2|\sigma_{\varphi} \rangle- \langle \sigma_{\varphi}|\hat{\sigma}_y| \sigma_{\varphi} \rangle^2 \right ).[/tex]
After some algebra one can prove that the sum of the two is identical with the definition in the above quoted paper, i.e.,

[tex]\eta^2(\sigma_y|\psi,\sigma_{\varphi})=\eta_1^2(\sigma_y|\psi,\sigma_{\varphi})+\eta_2^2(\sigma_y|\psi,\sigma_{\varphi})=\sum_{\sigma_{\varphi}=\pm 1} \left \langle \left [|\sigma_{\varphi} \rangle \langle \sigma_{\varphi}|,\hat{\sigma}_y\right]\psi \right | \left . \left [|\sigma_{\varphi} \rangle \langle \sigma_{\varphi}|,\hat{\sigma}_y \right]\psi \right \rangle.[/tex]
Now we can calculate easily the inaccuracy for the weakly measured value [itex]\sigma_x[/itex] and the disturbance of [itex]\sigma_y[/itex] according to these definitions by just using the concrete realization of the Pauli matrices in [itex]\mathbb{C}^2[/itex]. As given in the paper, we get

[tex]\epsilon(\sigma_x|\psi,\sigma_{\varphi})=2 \sin \left(\frac{\varphi}{2} \right)[/tex]

and [tex]\eta(\sigma_y|\psi,\sigma_{\varphi})=\sqrt{2} \cos \varphi.[/tex]
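Both closed forms can be cross-checked directly from the definitions. The following plain-Python sketch evaluates [itex]\epsilon[/itex] as the norm of [itex](\hat{\sigma}_x-\hat{\sigma}_{\varphi})|\psi \rangle[/itex] and [itex]\eta[/itex] from the commutator form, for the state [itex]|\psi \rangle=|\sigma_z=+1 \rangle[/itex]. The projectors are written as [itex](1 \pm \hat{\sigma}_{\varphi})/2[/itex], the function names are illustrative, and the closed forms are compared only for [itex]0 \leq \varphi \leq \pi/2[/itex], where sine and cosine are non-negative.

```python
import cmath
import math

SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
IDENTITY = [[1, 0], [0, 1]]
PSI = (1, 0)  # |sigma_z = +1>, the state prepared in the experiment

def sigma_phi(phi):
    """Matrix of sigma_x cos(phi) + sigma_y sin(phi)."""
    return [[0, cmath.exp(-1j * phi)], [cmath.exp(1j * phi), 0]]

def combine(a, b, cb):
    """Entrywise a + cb * b."""
    return [[a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def apply(op, v):
    """Matrix-vector product op|v>."""
    return tuple(sum(op[i][j] * v[j] for j in range(2)) for i in range(2))

def norm_sq(v):
    """Squared norm <v|v>."""
    return sum(abs(c) ** 2 for c in v)

def error(phi, psi=PSI):
    """epsilon = sqrt(<psi|(sigma_x - sigma_phi)^2|psi>)."""
    diff = combine(SX, sigma_phi(phi), -1)
    # diff is self-adjoint, so <psi|diff^2|psi> = ||diff psi||^2.
    return math.sqrt(norm_sq(apply(diff, psi)))

def disturbance(phi, psi=PSI):
    """eta = sqrt(sum_k ||[P_k, sigma_y] psi||^2) with P_k = (1 + k sigma_phi)/2."""
    total = 0.0
    for k in (+1, -1):
        proj = [[x / 2 for x in row]
                for row in combine(IDENTITY, sigma_phi(phi), k)]
        pb = apply(proj, apply(SY, psi))  # P_k sigma_y |psi>
        bp = apply(SY, apply(proj, psi))  # sigma_y P_k |psi>
        total += norm_sq([x - y for x, y in zip(pb, bp)])
    return math.sqrt(total)

for deg in (0, 30, 60, 90):
    phi = math.radians(deg)
    print(deg, round(error(phi), 4), round(disturbance(phi), 4),
          round(2 * math.sin(phi / 2), 4), round(math.sqrt(2) * math.cos(phi), 4))
```

The last two columns are the closed forms; they agree with the values computed from the definitions up to rounding.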

On the other hand the uncertainty bound in the Robertson uncertainty relation for the product of the standard deviations of [itex]\sigma_x[/itex] and [itex]\sigma_y[/itex], both measured separately (!) on an ensemble of neutrons prepared in the state [itex]|\psi \rangle=|\sigma_z=+1 \rangle[/itex], is given by

[tex]\frac{1}{2}|\langle \psi|[\hat{\sigma}_x,\hat{\sigma}_y]| \psi \rangle|=\frac{1}{2}|\langle \psi|2\mathrm{i} \hat{\sigma}_z |\psi \rangle|=1.[/tex]

It's easy to evaluate that the noise-disturbance product takes its maximum value for [itex]\varphi_{\text{max}}=\arccos(2/3) \simeq 48.2^\circ[/itex]. The value of the product reached is

[tex]\epsilon(\sigma_x|\psi,\sigma_{\varphi_{\text{max}}}) \eta(\sigma_y|\psi,\sigma_{\varphi_{\text{max}}}) \simeq 0.77,[/tex]

which is considerably smaller than the Robertson-uncertainty boundary of 1.
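The location and the value of the maximum can be confirmed with a simple grid search. This sketch takes the closed forms [itex]\epsilon=2\sin(\varphi/2)[/itex] and [itex]\eta=\sqrt{2}\cos\varphi[/itex] quoted above as given and restricts [itex]\varphi[/itex] to [itex][0,\pi/2][/itex]; the function names and the grid resolution are arbitrary choices.

```python
import math

def heisenberg_product(phi):
    """epsilon * eta = 2 sin(phi/2) * sqrt(2) cos(phi)."""
    return 2 * math.sin(phi / 2) * math.sqrt(2) * math.cos(phi)

def grid_maximum(f, lo, hi, steps=100000):
    """Return (argmax, max) of f on an equidistant grid over [lo, hi]."""
    best = max((lo + (hi - lo) * k / steps for k in range(steps + 1)), key=f)
    return best, f(best)

phi_max, product_max = grid_maximum(heisenberg_product, 0.0, math.pi / 2)
print(f"phi_max = {math.degrees(phi_max):.1f} deg, product = {product_max:.2f}")
# Analytic values for comparison: arccos(2/3) and 4 / (3 sqrt(3)).
print(f"{math.degrees(math.acos(2 / 3)):.1f} deg, {4 / (3 * math.sqrt(3)):.2f}")
```

Analytically, with [itex]s=\sin(\varphi/2)[/itex] the product is [itex]2\sqrt{2}\,(s-2s^3)[/itex], which is stationary at [itex]s^2=1/6[/itex], i.e., [itex]\cos \varphi=2/3[/itex], where it takes the value [itex]4/(3\sqrt{3}) \simeq 0.77[/itex].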

This shows that one has to distinguish carefully between the two kinds of relations. The Robertson uncertainty relation gives a lower bound for the product of the standard deviations of two observables, both measured independently from each other on ensembles of systems prepared in a definite state. The noise-disturbance product, by contrast, refers to a weak measurement of one observable followed by the weak measurement of a second observable, defining a "joint measurement" of both variables. This product depends on both the initial preparation of the system and (!) on the definite measurement procedure. In the case demonstrated here, the initial preparation and the measurement procedure have been chosen such that the noise-disturbance product (in the paper also called Heisenberg product) is always below the lower Robertson-uncertainty bound. Thus Heisenberg's original interpretation of the Robertson uncertainty relation is not generally valid, and one has to be clear about the meaning of the various "uncertainty measures".

What do you think of this example and my simplified treatment? I'd be also interested to discuss these findings in terms of other
interpretations of quantum mechanics than the ensemble interpretation. I think the ensemble interpretation gives the most clear
analysis.

Greg Bernhardt · May 7, 2019

@vanhees71 did you find any more insight on this topic? Do you want to make it an Insight?

vanhees71 · May 7, 2019

There has been a lot of insight on this topic in recent years, and I'm not a real expert in it.

At the moment I'm quite busy with teaching obligations. I hope to find the time to finish an insight on "Not against photons" this summer in the semester break, which is planned for quite some time but hasn't advanced much yet.
