Confusion about the thermal interpretation's account of measurement

nicf
TL;DR Summary
Does the account of measurement in Neumaier's thermal interpretation papers actually depend on the thermal interpretation?
I'm a mathematician with a longstanding interest in physics, and I've recently been enjoying reading and thinking about Arnold Neumaier's thermal interpretation, including some threads on this forum. There's something that's still confusing me, though, and I'm hoping someone here can clear it up. Most of the questions here come from the third paper in the series.

Consider some experiment, like measuring the spin of a suitably prepared electron, where we can get one of two outcomes. The story usually goes that, before the electron sets off the detector, the state is something like ##\left[\sqrt{\frac12}(|\uparrow_e\rangle+|\downarrow_e\rangle)\right]\otimes|\mbox{ready}\rangle##, where ##|\uparrow_e\rangle## denotes the state of an electron which has spin up around the relevant axis, and afterwards the state is something like ##\sqrt{\frac12}(|\uparrow_e\rangle\otimes |\uparrow_D\rangle+|\downarrow_e\rangle\otimes|\downarrow_D\rangle)##, where ##|\uparrow_D\rangle## denotes a state in which the (macroscopic) detector has reacted the way it would have if the electron had started in the state ##|\uparrow_e\rangle##. It's usually argued that this macroscopic superposition has to arise because the Schrödinger equation is linear. Let's call this the first story.

This description has struck many people (including me) as confusing, since it seems to contradict what I actually see when I run the experiment: if I see the "up" result on my detector, then the "down" term above doesn't seem to have anything to do with the world I see in front of me. It's always seemed to me that this apparent contradiction is the core of the "measurement problem" and, to me at least, resolving it is the central reason to care about interpretations of quantum mechanics.

Neumaier seems to say that the first story is simply incorrect. Instead he tells what I'll call the second story: because the detector sits in a hot, noisy, not-at-all-isolated environment, and I only care about a very small number of the relevant degrees of freedom, I should instead represent it by a reduced density matrix. Since I've chosen to ignore most of the physical degrees of freedom in the system, the detector's position evolves in some complicated nonlinear way, but with the two possible readings as the only (relevant) stable states of the system. Which result actually happens depends on details of the state of the detector and the environment which aren't practically knowable, but the whole process is, in principle, deterministic. The macroscopic superposition from the first story never actually obtains, or if it does, it quickly evolves into one of the two stable states.
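To make sure I have the right picture in mind, here's a purely classical toy model I cooked up for myself (my own illustration, not anything taken from the papers): a needle coordinate ##x## sitting in a double-well potential ##V(x)=(x^2-1)^2##, kicked by a little noise that stands in for all the environmental detail I've chosen to ignore. Every run ends up near one of the two stable readings ##x=\pm 1##, and which one depends sensitively on the unresolved details represented here by the noise:

```python
import numpy as np

def pointer_outcomes(n_runs=1000, noise=0.05, dt=1e-3, steps=20000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(n_runs)                     # every needle starts at the unstable point x = 0
    for _ in range(steps):
        drift = -4 * x * (x**2 - 1)          # -V'(x) for the double well V(x) = (x^2 - 1)^2
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal(n_runs)
    return np.sign(x)                        # +1 ("up") or -1 ("down") for each run

outcomes = pointer_outcomes()
print("fraction of 'up' readings:", np.mean(outcomes > 0))   # close to 0.5
```

Of course this toy has the randomness put in by hand, which is exactly the part the real story is supposed to explain, but it's the kind of bistable, detail-sensitive behavior I have in mind.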

So, finally, here's what I'd like to understand better:

(0) Did I describe the second story correctly?

(1) It seems to me that the second story could be told entirely within what Neumaier calls the "formal core" of quantum mechanics, the part that every interpretation agrees on. In his language, after my experiment, the q-probability distribution of the location of the detector needle really is supported only in the "up" region, and this follows from ordinary, uncontroversial quantum mechanics. Is this right? Does anything about the second story actually depend on the thermal interpretation?

(2) A more philosophical question: If macroscopic superpositions never actually appear, why all the fuss about interpretations? (For example, the many worlds interpretation seems to exist entirely to describe what it would mean for the universe to end up in such a macroscopic superposition.) What else even is there to worry about? If this does resolve the measurement problem, why wasn't it pointed out a long time ago?

(3) I've seen many arguments (e.g. https://plato.stanford.edu/entries/qm-decoherence/#SolMeaPro, which cites https://arxiv.org/abs/quant-ph/0112095 and https://arxiv.org/abs/quant-ph/9506020 pp. 14-15) that sound to me like they're saying the second story can't possibly work, usually with language like "decoherence cannot solve the measurement problem". Am I misunderstanding them? If not, would the counterargument just be that they're making the same linearity mistake as the first story?
 
nicf said:
(0) Did I describe the second story correctly?
Yes, if (in the case of the qubit discussed in Part IV of my series of papers) 'the system' exclusively refers to the reduced 2-state system and not to any other property of the detector.
nicf said:
(1) It seems to me that the second story could be told entirely within what Neumaier calls the "formal core" of quantum mechanics, the part that every interpretation agrees on. In his language, after my experiment, the q-probability distribution of the location of the detector needle really is supported only in the "up" region, and this follows from ordinary, uncontroversial quantum mechanics. Is this right? Does anything about the second story actually depend on the thermal interpretation?
As long as one only looks at an ensemble of similarly prepared systems, nothing depends on the thermal interpretation. But the thermal interpretation explains what happens in each individual case, and why.
nicf said:
(2) A more philosophical question: If macroscopic superpositions never actually appear, why all the fuss about interpretations? (For example, the many worlds interpretation seems to exist entirely to describe what it would mean for the universe to end up in such a macroscopic superposition.) What else even is there to worry about? If this does resolve the measurement problem, why wasn't it pointed out a long time ago?
That macroscopic systems can in principle be described by a pure state is a prerequisite of the traditional discussions, and is part of almost all interpretations in print. The thermal interpretation explicitly negates this assumption.
nicf said:
(3) I've seen many arguments (e.g. https://plato.stanford.edu/entries/qm-decoherence/#SolMeaPro, which cites https://arxiv.org/abs/quant-ph/0112095 and https://arxiv.org/abs/quant-ph/9506020 pp. 14-15) that sound to me like they're saying the second story can't possibly work, usually with language like "decoherence cannot solve the measurement problem". Am I misunderstanding them? If not, would the counterargument just be that they're making the same linearity mistake as the first story?
Decoherence cannot solve the measurement problem since it still assumes the eigenvalue link to measurement and hence has no explanation for unique outcomes. The thermal interpretation has unique outcomes built in from the outset, hence only has to explain the origin of the probabilities.
 
Thanks for taking the time to reply! I have a couple more questions, but what you've said so far is helpful.

A. Neumaier said:
Yes, if (in the case of the qubit discussed in Part IV of my series of papers) 'the system' exclusively refers to the reduced 2-state system and not to any other property of the detector.
Yes, that's what I meant --- I'm referring to the variable that encodes which of the two readings ends up being displayed on the detector, and this omits the vast majority of the physical properties of the detector and indeed the rest of the universe. I think we're on the same page.

A. Neumaier said:
As long as one only looks at an ensemble of similarly prepared systems, nothing depends on the thermal interpretation. But the thermal interpretation explains what happens in each individual case, and why.
I think I understand what you're saying here, but I'm asking because I thought that, in addition, you were actually claiming something even stronger: that the "first story" fails on its own terms. That is, I read you as saying that the problem arises from describing the detector as a pure state, which forces you into linear dynamics, which in turn forces you into the macroscopic superposition. You seem to be saying the pure-state assumption is simply a mistake no matter which interpretation you subscribe to, because the detector needle isn't isolated from its environment. Is that right?

Once one agrees that the macroscopic superposition can't happen, and that in the end the q-probability distribution of the location of the needle has almost all its mass in one of the two separated regions, it seems to me that we've already eliminated all the "mystery" that's usually associated with quantum measurements --- you now just need some way to attach a physical meaning to the mathematical objects in front of you, and I agree with you that, since the q-variance is small, it's very natural to interpret the q-expectation of the needle position variable as "where the needle is".

Part of the reason I've enjoyed reading this series of papers is that I find your explanation of measurement very attractive; it's the only story I've ever seen that I could imagine finding fully satisfying. The reason I'm confused is that I don't understand why, if the macroscopic superposition actually doesn't occur, anyone would still be proposing things like many-worlds, Bohmian mechanics, or objective collapse theories. When smart people do things that don't make sense to me, it makes me think I'm not understanding something! Are the people proposing these other interpretations just making the mistake of trying to describe the detector with a pure state?
 
nicf said:
Once one agrees that the macroscopic superposition can't happen
Macroscopically, one has density operators, and talking about their superposition is meaningless.
nicf said:
Are the people proposing these other interpretations just making the mistake of trying to describe the detector with a pure state?
In the standard interpretations, this is not a mistake but a principal feature!
 
Well, there are some macroscopic systems which show specific quantum behavior (superfluidity of liquid helium, superconductivity), but it's of course not so easy to prepare macroscopic systems in states such that quantum behavior is observable in macroscopic quantities. That's why classical physics indeed usually works so well for macroscopic matter. The standard interpretation of the QT formalism allows me to say that this is due to coarse graining, averaging over many microscopic degrees of freedom such that quantum fluctuations become irrelevant for the observed macroscopic quantities on the scales of the typical resolution of their dynamical behavior. Within the thermal interpretation I'm not allowed to say this anymore, but I don't know what I'm allowed to say.

It's new to me that detectors are described by "pure states". Usually a detector is described as a classical macroscopic device; which particular example do you have in mind?
 
vanhees71 said:
It's new to me that detectors are described by "pure states". Usually a detector is described as a classical macroscopic device; which particular example do you have in mind?
Well, surely a classical macroscopic device is also a quantum system, hence described by a quantum state. At least people who need to design detectors in silico treat the macroscopic system formed by the device as a quantum system.

Elsewhere you just said,
vanhees71 said:
Incoherent light can, e.g., be described by taking the intensity of coherent light and randomizing the phase differences. The same holds for polarization.
Thus you regard the mixed quantum state with rotation invariant density matrix as a randomized pure (polarized) quantum state. This exemplifies the rule, stated in all standard quantum mechanics books, that the density operator of any quantum system is regarded (by the orthodoxy supported, e.g., by Landau and Lifshitz) as representing an unknown pure state, randomized over the macroscopic uncertainty.
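Spelled out for the polarization example (a standard identity), the unpolarized state is

$$\rho = \tfrac12\big(|H\rangle\langle H| + |V\rangle\langle V|\big) = \tfrac12\big(|{+}45^\circ\rangle\langle {+}45^\circ| + |{-}45^\circ\rangle\langle {-}45^\circ|\big) = \tfrac12\,\mathbf 1,$$

and the same ##\rho## admits many different decompositions into pure polarization states.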
 
nicf said:
(3) I've seen many arguments (e.g. https://plato.stanford.edu/entries/qm-decoherence/#SolMeaPro, which cites https://arxiv.org/abs/quant-ph/0112095 and https://arxiv.org/abs/quant-ph/9506020 pp. 14-15) that sound to me like they're saying the second story can't possibly work, usually with language like "decoherence cannot solve the measurement problem". Am I misunderstanding them? If not, would the counterargument just be that they're making the same linearity mistake as the first story?
Decoherence does solve a big part of the measurement problem, because it shows how wavefunctions will seem to collapse upon measurement, without having to postulate that they actually do collapse. The demonstration of this is just a technical matter, independent of any interpretation. What it doesn't explain is where probabilities come from - that is another story.
 
vanhees71 said:
Well, there are some macroscopic systems which show specific quantum behavior (superfluidity of liquid helium, superconductivity), but it's of course not so easy to prepare macroscopic systems in states such that quantum behavior is observable in macroscopic quantities.
The macroscopic laws of a superfluid are as classical as the macroscopic laws of hydromechanics for water, though quantitatively slightly different. Both depend for their details (thermodynamic state functions) on quantum properties of matter. But the macroscopic limit is in both cases classical and deterministic.

vanhees71 said:
That's why classical physics indeed usually works so well for macroscopic matter. The standard interpretation of the QT formalism allows me to say that this is due to coarse graining, averaging over many microscopic degrees of freedom such that quantum fluctuations become irrelevant for the observed macroscopic quantities on the scales of the typical resolution of their dynamical behavior. Within the thermal interpretation I'm not allowed to say this anymore, but I don't know what I'm allowed to say.
The thermal interpretation also explains this by coarse-graining, but the latter is not seen as an averaging process (which it isn't in the standard formulations of coarse graining, except in limiting cases such as very dilute gases). Instead, coarse-graining is seen as an approximation process in which one restricts attention to a collection of relevant macroscopic variables and neglects small amplitude variations with high spatial or temporal frequencies.
 
A. Neumaier said:
Well, surely a classical macroscopic device is also a quantum system, hence described by a quantum state. At least people who need to design detectors in silico treat the macroscopic system formed by the device as a quantum system.

Elsewhere you just said,

Thus you regard the mixed quantum state with rotation invariant density matrix as a randomized pure (polarized) quantum state. This exemplifies the rule, stated in all standard quantum mechanics books, that the density operator of any quantum system is regarded (by the orthodoxy supported, e.g., by Landau and Lifshitz) as representing an unknown pure state, randomized over the macroscopic uncertainty.
Well, it's one way to describe it. I'd not say that's the most general case. Another important example is if you have a quantum system that may be prepared in a pure state and you want to describe a subsystem, which you describe by the (usually mixed-state) statistical operator you get from a partial trace.
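A standard minimal example, just to fix ideas: for the two-qubit Bell state

$$|\Phi\rangle=\tfrac{1}{\sqrt2}\big(|0\rangle_A|0\rangle_B+|1\rangle_A|1\rangle_B\big),\qquad \rho_A=\mathrm{Tr}_B\,|\Phi\rangle\langle\Phi|=\tfrac12\big(|0\rangle\langle 0|+|1\rangle\langle 1|\big)=\tfrac12\,\mathbf 1,$$

so the subsystem is maximally mixed even though the total system is in a pure state.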

I'd say that a measurement device is usually described by a mixed rather than a pure state.
 
  • #10
A. Neumaier said:
The macroscopic laws of a superfluid are as classical as the macroscopic laws of hydromechanics for water, though quantitatively slightly different. Both depend for their details (thermodynamic state functions) on quantum properties of matter. But the macroscopic limit is in both cases classical and deterministic. The thermal interpretation also explains this by coarse-graining, but the latter is not seen as an averaging process (which it isn't in the standard formulations of coarse graining, except in limiting cases such as very dilute gases). Instead, coarse-graining is seen as an approximation process in which one restricts attention to a collection of relevant macroscopic variables and neglects small amplitude variations with high spatial or temporal frequencies.
But that in fact IS the usual coarse-graining I'm talking about. You average over the many microscopic details to describe the average behavior of macroscopic observables. One usual formal way is the gradient expansion (which can also be formulated as a formal ##\hbar## expansion). So after all the thermal interpretation is again equivalent to the standard interpretation? Still puzzled...
 
  • #11
vanhees71 said:
I'd say that a measurement device is usually described by a mixed rather than a pure state.
I agree. But this description is usually (and in particular by Landau & Lifshitz) taken to be statistical, i.e., as a mixture indicating ignorance of the true pure state.
vanhees71 said:
Well, it's one way to describe it. I'd not say that's the most general case. Another important example is if you have a quantum system that may be prepared in a pure state and you want to describe a subsystem, which you describe the (usually mixed-state) statistical operator you get from a partial trace.
Well, you could consider the detector as being a subsystem of the lab; then the lab would be in an unknown pure state (but described by a mixture mostly in local equilibrium) and the detector would be described by a partial trace.

Within the traditional foundation you cannot escape assuming that the biggest system considered should be in a pure state if the details needed to describe this state could be gathered.
 
  • #12
vanhees71 said:
But that in fact IS the usual coarse-graining I'm talking about. You average over the many microscopic details to describe the average behavior of macroscopic observables.
I don't see any of this in the usual 2PI formalism for deriving the coarse-grained kinetic equations of Kadanoff-Baym, say.
vanhees71 said:
One usual formal way is the gradient expansion (which can also be formulated as a formal ##\hbar## expansion).
In which step, precisely, does the gradient expansion involve an average over microscopic details (rather than an ensemble average over imagined replicas of the fields)?
vanhees71 said:
So after all the thermal interpretation is again equivalent to the standard interpretation? Still puzzled...
On the level of statistical mechanics, the thermal interpretation is essentially equivalent to the standard interpretation, except for the way of talking about things. The thermal interpretation talk is adapted to the actual usage rather than to the foundational brimborium.

On this level, the thermal interpretation allows one to describe with the multicanonical ensemble a single lump of silver, whereas tradition takes the ensemble to be a literal ensemble of many identically prepared lumps of silver - even when only one of a particular form (a statue of Pallas Athene, say) has ever been prepared.
 
  • #13
In the 2PI treatment for deriving the Kadanoff-Baym equation and then doing the gradient expansion to get quantum-transport equations you work in the Wigner picture, i.e., you Fourier transform in ##(x-y)##, where ##x## and ##y## are the space-time coordinates of the two-point (contour) Green's function, and you neglect the rapid changes in the variable ##(x-y)##, which is effectively an averaging out of (quantum) fluctuations.
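For reference, the Wigner transform meant here is, schematically (suppressing internal indices and the contour structure),

$$\bar G(X,p)=\int \mathrm{d}^4 s\; e^{\mathrm{i} p\cdot s}\, G\!\Big(X+\tfrac{s}{2},\,X-\tfrac{s}{2}\Big),\qquad X=\tfrac{x+y}{2},\quad s=x-y,$$

and the gradient expansion then keeps only the lowest orders of the slow dependence on the center coordinate ##X##.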

In standard quantum-statistical mechanics one very well describes single macroscopic objects like a lump of silver. The coarse graining is over microscopically large but macroscopically small space-time cells. The Gibbs ensemble is just a tool to think statistically about this (or to program Monte Carlo simulations ;-)).

You still have not made clear to me what the Thermal Interpretation really is, if I'm not allowed to think of ##\mathrm{Tr}(\hat{\rho} \hat{O})## as an averaging procedure!
 
  • #14
vanhees71 said:
In the 2PI treatment for deriving the Kadanoff-Baym equation and then doing the gradient expansion to get quantum-transport equations you work in the Wigner picture, i.e., you Fourier transform in ##(x-y)##, where ##x## and ##y## are the space-time coordinates of the two-point (contour) Green's function, and you neglect the rapid changes in the variable ##(x-y)##, which is effectively an averaging out of (quantum) fluctuations.
It smoothes rapid spatial changes irrespective of their origin. This is of the same kind as when in classical optics one averages over fast oscillations. It has nothing to do with microscopic degrees of freedom - it is not an average over a large number of atoms or electrons!

vanhees71 said:
In standard quantum-statistical mechanics one very well describes single macroscopic objects like a lump of silver. The coarse graining is over microscopically large but macroscopically small space-time cells. The Gibbs ensemble is just a tool to think statistically about this (or to program Monte Carlo simulations ;-)).
As it is defined, the Gibbs ensemble is an ensemble of copies of the original lump of silver. This was clearly understood in Gibbs' time, when it was a major point of criticism of his method! For example, on p.226f of
  • P. Hertz, Über die mechanischen Grundlagen der Thermodynamik, Ann. Physik IV. Folge (33) 1910, 225--274.
one can read:
Paul Hertz said:
Understood in this way, the Gibbs definition seems downright absurd. How is a quantity that really belongs to the body supposed to depend not on the state it has, but on the state it might possibly have? [...] An ensemble is mathematically feigned [...] it appears difficult, if not impossible, to extract a physical meaning from the concept of the canonical ensemble.
To reinterpret it as an ensemble of space-time cells is completely changing the meaning it has by definition!
vanhees71 said:
You still have not made clear to me what the Thermal Interpretation really is, if I'm not allowed to think of ##\mathrm{Tr}(\hat{\rho} \hat{O})## as an averaging procedure!
You may think of it as a purely mathematical computation, of the same kind as many other purely mathematical computations done in the derivation of the Kadanoff-Baym equations. You may think of the result as the ''macroscopic value'' of ##O##, lying somewhere in the convex hull of the spectrum of ##O##.
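Spelled out, for an ##O## with discrete spectrum: in an orthonormal eigenbasis ##|k\rangle## of ##O## with eigenvalues ##o_k##,

$$\mathrm{Tr}(\rho O)=\sum_k \langle k|\rho|k\rangle\, o_k,\qquad \langle k|\rho|k\rangle\ge 0,\qquad \sum_k\langle k|\rho|k\rangle=1,$$

a convex combination of the eigenvalues, hence a number in the convex hull of the spectrum of ##O##.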
 
  • #15
I see. So it's just an extreme form of the shut-up-and-calculate advice: you use the established math without any heuristics, simply because it works. I find this quite nice, but it's hard to believe that, without some heuristics connecting the abstract formalism to the real world, QT would ever have been so successfully applied to the description of real-world processes.

In our interpretation of the standard derivation I think we agree, because indeed it's just the same averaging process as in classical statistics. That you may average over much more than just quantum fluctuations is also clear. That's done to the extreme when you further break the dynamics down to ideal hydro, i.e. assuming local equilibrium. From there you go in the other direction, figuring in ever more fluctuations in various ways to derive viscous hydro (Chapman-Enskog expansion, method of moments, etc.).
 
  • #16
vanhees71 said:
I see. So it's just an extreme form of the shut-up-and-calculate advice: you use the established math without any heuristics, simply because it works. I find this quite nice, but it's hard to believe that, without some heuristics connecting the abstract formalism to the real world, QT would ever have been so successfully applied to the description of real-world processes.
The heuristic of ignoring tiny high-frequency contributions in space or time - independent of any reference to the microscopic degrees of freedom - is very powerful and sufficient to motivate everything that works. For example, the gradient expansion can be motivated by single-particle quantum mechanics, where in the position representation the gradient expansion is just an expansion into low powers of momentum, i.e., a low momentum = slow change expansion. One just keeps the least changing contributions. Clearly, this is not averaging over microscopic degrees of freedom.
vanhees71 said:
In our interpretation of the standard derivation I think we agree, because indeed it's just the same averaging process as in classical statistics.
Effectively, yes, since you are employing the averaging idea for much more than only statistics.
But from a strict point of view there is a big difference, since the averaging per se has nothing to do with statistics. The thermal interpretation is about giving up statistics at a fundamental level and employing it only where it is needed to reduce the amount of noise. This makes the thermal interpretation applicable to single systems where at least the literal use of the traditional quantum postulates would require (as in Gibbs' time) the use of a fictitious ensemble of imagined copies of the whole system.
 
  • #17
I've still no clue: what's the meaning of the formula ##\langle A \rangle=\mathrm{Tr}(\hat{\rho} \hat{A})##, if it's not an averaging procedure? It doesn't need to be an ensemble average. You can also simply "coarse grain" in the sense you describe, i.e., average over "microscopically large, macroscopically small" space (or space-time) volumes. This is in fact what's effectively done in the gradient expansion.

Of course, another argument for the gradient expansion as a means to derive effective classical descriptions for macroscopic quantities is that it can as well be formalized as an expansion in powers of ##\hbar##.

I also don't see a problem with the treatment of a single system in standard quantum theory within the standard statistical interpretation of the state since, whenever the classical approximation is valid, the standard deviations from the mean values of the macroscopic observables are irrelevant (that's a tautology), and then the probabilistic nature of the quantum state is simply hard to observe and everything looks classical.

Take the famous ##\alpha##-particles in a cloud chamber as an example a la Mott. Each single particle seems to behave classically, i.e., to follow a classical (straight) trajectory, but of course that's because it's not a single-particle system at all, but a single particle interacting (practically continuously) with the vapor in the cloud chamber. The macroscopic trajectory, for which you can in principle observe position and velocity of the particle at a macroscopic level of accuracy just by observing the trails building up while the particle moves through the chamber, is due to this interaction "with the environment". For a single ##\alpha## particle in a vacuum originating from a single ##\alpha##-decaying nucleus you cannot say much indeed: you neither know when exactly it's created nor in which direction it's flying, while all this is known simply by observation of the macroscopic trails of the ##\alpha## particle in the cloud chamber.
 
  • #18
vanhees71 said:
I've still no clue: what's the meaning of the formula ##\langle A \rangle=\mathrm{Tr}(\hat{\rho} \hat{A})##, if it's not an averaging procedure?
It's a property of the system like angular momentum in Classical Mechanics.
 
  • #19
vanhees71 said:
I've still no clue: what's the meaning of the formula ##\langle A \rangle=\mathrm{Tr}(\hat{\rho} \hat{A})##, if it's not an averaging procedure? It doesn't need to be an ensemble average.
But you defined it in your lecture notes as being the average over the ensemble of all systems prepared in the state ##\rho##. Since you now claim the opposite, you should emphasize in your lecture notes that it does not need to be an ensemble average, but also applies to a single, uniquely prepared system, such as the beautifully and uniquely shaped lump of silver under discussion.

In my view, the meaning of the formula ##\langle A \rangle=\mathrm{Tr}(\rho A)## is crystal clear. It is the trace of a product of two Hermitian operators, expressing a property of the system in the quantum state ##\rho##, just like a function ##A(p,q)## of a classical Hamiltonian system expresses a property of the system in the given classical state ##(p,q)##.

Ostensibly, ##\langle A \rangle## is not an average of anything (unless you introduce additional machinery and then prove it to be such an average). If the Hilbert space is ##C^n##, it is a weighted sum of the ##n^2## matrix elements of ##A##, with in general complex weights. Nothing at all suggests this to be an average over microscopic degrees of freedom, or over short times, or over whatever else you may think of.
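For the simplest case ##n=2##, written out explicitly,

$$\mathrm{Tr}(\rho A)=\sum_{j,k=1}^{2}\rho_{jk}A_{kj},\qquad\text{e.g.}\quad \rho=\tfrac12\,\mathbf 1,\ A=\sigma_z\ \Rightarrow\ \mathrm{Tr}(\rho A)=\tfrac12\cdot 1+\tfrac12\cdot(-1)=0,$$

a weighted sum of matrix elements, which in general is not an eigenvalue of ##A##.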
 
  • #20
Of course, I've defined it in this way, because it's easiest to derive the formalism from it. Also, the math is indeed crystal clear, but to do physics you need to know how the formalism relates to real-world observables. So still, if ##\langle A \rangle## is not an average, I don't know what it is and how to apply it to the real world.

Obviously you simply don't understand the argument why the formal manipulations used to derive macroscopic (classical) behavior, be it from quantum or classical statistical physics, are always an averaging procedure over many microscopic degrees of freedom. That all started at the very beginning of statistical mechanics with Bernoulli, Maxwell, and Boltzmann. It has also been used in classical electrodynamics to describe the intensity of electromagnetic fields, particularly in optics (where the averaging is a time average), and in the derivation of electromagnetic properties of matter from classical electron theory (where the averaging is over spatial cells) by Lorentz et al. The same holds true for hydrodynamics (local thermal equilibrium and the corresponding expansions around it to yield all kinds of transport coefficients). All this is completely analogous in Q(F)T. It's the same basic understanding of the physics underlying the mathematical techniques which indeed turned out to be successful.

So it seems as if the Thermal Interpretation is just the "shut-up-and-calculate interpretation" pushed to the extreme, such that it's not useful anymore for the (phenomenological) theoretical physicist. I have some sympathy for this approach, because it avoids philosophical gibberish confusing the subject, but if you don't allow heuristic thinking (like the extremely useful idea of the Gibbs ensemble), there's no chance to apply a theory to new physical problems in the real world.
 
  • #21
vanhees71 said:
if ##\langle A \rangle## is not an average, I don't know what it is and how to apply it to the real world.
By your (i.e., the conventional minimal statistical) definition, it is an average over the ensemble of identically prepared systems, and nothing else.

How to apply it to the real world, e.g., to a single beautifully shaped lump of silver or to hydromechanics, should be a consequence of the definitions given. If you interpret it as another average, you therefore need to derive it from this original definition (which is possible only in very special model cases). Otherwise, why should one believe you?
vanhees71 said:
Obviously you simply don't understand the argument why the formal manipulations used to derive macroscopic (classical) behavior, be it from quantum or classical statistical physics, are always an averaging procedure over many microscopic degrees of freedom. That all started at the very beginning of statistical mechanics with Bernoulli, Maxwell, and Boltzmann.
Yes, I really don't understand it, since in this generality it is simply false. Your argument is valid only in the special case where you assume (as Bernoulli, Maxwell, and Boltzmann did) an ideal gas, so that you have an ensemble of independent particles and not (as in dense matter and in QFT) an ensemble of large systems.
vanhees71 said:
if you don't allow heuristic thinking (like the extremely useful idea of the Gibbs ensemble), there's no chance to apply a theory to new physical problems in the real world.
The thermal interpretation turns the heuristic thinking of Gibbs (where people complained about how a property of the actual realization could depend on a theory about all the possibilities, which is indeed not sensible) into something rational that needs no heuristics anymore. Physicists are still allowed to use, in addition to the formally correct stuff, all the heuristics they are accustomed to, as long as it leads to correct predictions, just as they use the heuristics of virtual particles popping in and out of existence, while in fact they just work with the formal rules.
 
  • #22
Sure, in practice nearly everything is mapped to an ideal gas of quasiparticles, if possible, and it's amazing how far you get with this strategy. Among other things it can describe the color of a shiny lump of silver or the hydrodynamics of a fluid.
 
  • #23
Michael Price said:
Decoherence does solve a big part of the measurement problem, because it shows how wavefunctions will seem to collapse upon measurement, without having to postulate that they actually do collapse. The demonstration of this is just a technical matter, independent of any interpretation. What it doesn't explain is where probabilities come from - that is another story.

This is a clearer way of saying exactly what I meant, thank you :). Let me use this as a jumping-off point to try to state my original question more clearly, since I think I am still confused.

The part of the measurement problem that's relevant to my question is exactly the part that decoherence doesn't try to solve: what determines which of (say) two possible measurement outcomes I actually end up seeing? The reason I'm confused is that, when I try to combine what I understand about decoherence with what I understand about the account described in the thermal interpretation, I arrive at two conclusions that don't line up with each other:

(a) The decoherence story as it's usually given explains how, using ordinary unitary quantum mechanics with no collapse, I end up in a state where neither possible outcome can "interfere" with the other (since both outcomes are entangled with the environment), thereby explaining why the wavefunction appears to collapse. But if I write down a mathematical description of the final state, there are parts of it that correspond to both of the two possibilities with no way to choose between them. This explanation comes with the additional claim that, due to the linearity of time evolution, there's no possible way that the final state could privilege one outcome over the other. (Bell is also often invoked here, although I don't know if I know how to turn that into a proof.)

(b) The description in the thermal interpretation papers seems to claim that, in fact, if I had a good enough description of the details of the initial state of the microscopic system and the measurement apparatus, I would be able to deduce which of the two possibilities "really happened", and that I could do this, again, using only ordinary unitary quantum mechanics with no collapse.

Since both stories use the same initial condition and the same rule for evolving in time, these seem to be two different claims about the exact same mathematical object --- the density matrix of the final state of the system. If that's true, then one of them ought to be wrong. The claim in (b) is much stronger than (a), and I think that if it works any reasonable person ought to regard something like (b) as a solution to the measurement problem! It would certainly be enough to satisfy me. But I've heard (a) enough times that I'm confused about (b). Is your position that the (a) story is incorrect, or am I misunderstanding something else?
 
  • #24
nicf said:
I could do this, again, using only ordinary unitary quantum mechanics with no collapse.

I'm not sure this is possible. Ordinary unitary QM with decoherence can give you two non-interfering outcomes. It can't give you a single outcome; that obviously violates linearity.

My understanding of the thermal interpretation (remember I'm not its author so my understanding might not be correct) is that the two non-interfering outcomes are actually a meta-stable state of the detector (i.e., of whatever macroscopic object is going to irreversibly record the measurement result), and that random fluctuations cause this meta-stable state to decay into just one of the two outcomes. An analogy that I have seen @A. Neumaier use is a ball on a very sharp peak between two valleys; the ball will not stay on the peak because random fluctuations will cause it to jostle one way or the other and roll down into one of the valleys.

However, the dynamics of this collapse of a meta-stable detector state into one of the two stable outcomes can't be just ordinary unitary QM, because ordinary unitary QM is linear and linear dynamics can't do that. In ordinary unitary QM, fluctuations in the detector would just become entangled with the system being measured and would preserve the multiple outcomes. There would have to be some nonlinear correction to the dynamics to collapse the state into just one outcome.
 
  • #25
PeterDonis said:
I'm not sure this is possible. Ordinary unitary QM with decoherence can give you two non-interfering outcomes. It can't give you a single outcome; that obviously violates linearity.

Exactly, that's why I'm confused! My impression is that @A. Neumaier is somehow denying this, and that somehow the refusal to describe macroscopic objects with state vectors is related to the way he gets around this linearity argument, although I don't see how.

If we're supposed to be positing nonunitary dynamics on a fundamental level, then that would obviate my whole question, but from the papers I understood @A. Neumaier to be specifically not doing that.
 
  • #26
nicf said:
(b) The description in the thermal interpretation papers seems to claim that, in fact, if I had a good enough description of the details of the initial state of the microscopic system and the measurement apparatus, I would be able to deduce which of the two possibilities "really happened", and that I could do this, again, using only ordinary unitary quantum mechanics with no collapse.
Correct.
nicf said:
Since both stories use the same initial condition and the same rule for evolving in time, these seem to be two different claims about the exact same mathematical object --- the density matrix of the final state of the system. If that's true, then one of them ought to be wrong.
No. Decoherence tells the same story but only in the statistical interpretation (using Lindblad equations rather than stochastic trajectories), where ensembles of many identically prepared systems are considered, so that only the averaged results (which must feature all possibilities) can be deduced. The thermal interpretation refines this to a different, more detailed story for each single case. Averaging the latter recovers the former.
PeterDonis said:
My understanding of the thermal interpretation (remember I'm not its author so my understanding might not be correct) is that the two non-interfering outcomes are actually a meta-stable state of the detector (i.e., of whatever macroscopic object is going to irreversibly record the measurement result), and that random fluctuations cause this meta-stable state to decay into just one of the two outcomes. An analogy that I have seen @A. Neumaier use is a ball on a very sharp peak between two valleys; the ball will not stay on the peak because random fluctuations will cause it to jostle one way or the other and roll down into one of the valleys.
Correct.
PeterDonis said:
However, the dynamics of this collapse of a meta-stable detector state into one of the two stable outcomes can't be just ordinary unitary QM, because ordinary unitary QM is linear and linear dynamics can't do that. In ordinary unitary QM, fluctuations in the detector would just become entangled with the system being measured and would preserve the multiple outcomes. There would have to be some nonlinear correction to the dynamics to collapse the state into just one outcome.
I explained how the nonlinearities naturally come about through coarse graining. An example of coarse graining is the classical limit, where nonlinear Hamiltonian dynamics arises from linear quantum dynamics for systems of sufficiently heavy balls. This special case is discussed in Section 2.1 of Part IV, and explains to some extent why heavy objects behave classically but nonlinearly.
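Schematically, the mechanism is of the Ehrenfest type (this is only a caricature of the full treatment): for a single particle of mass ##m## in a potential ##V##, the exact linear dynamics gives

$$\frac{d}{dt}\langle q\rangle=\frac{\langle p\rangle}{m},\qquad \frac{d}{dt}\langle p\rangle=-\langle V'(q)\rangle\approx -V'(\langle q\rangle),$$

and the last, approximate step - valid as long as the state remains sharply localized - is the coarse-graining that turns the linear dynamics of the q-expectations into the nonlinear classical equations of motion.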

nicf said:
Exactly, that's why I'm confused! My impression is that @A. Neumaier is somehow denying this, and that somehow the refusal to describe macroscopic objects with state vectors is related to the way he gets around this linearity argument, although I don't see how.

If we're supposed to be positing nonunitary dynamics on a fundamental level, then that would obviate my whole question, but from the papers I understood A. Neumaier to be specifically not doing that.
As you can see from the preceding, this is not necessary.
 
  • #27
PeterDonis said:
Ordinary unitary QM with decoherence can give you two non-interfering outcomes. It can't give you a single outcome; that obviously violates linearity.
This is according to the traditional interpretations, where outcome = eigenvalue, which must be statistical. But in the thermal interpretation, outcome = q-expectation, which is always single-valued. This makes a big difference in the interpretation of everything! See Chapter 4 of Part IV.
 
  • #28
Now it goes in circles again. The identification of expectation values with measurement outcomes had to be abandoned very early in the history of the development of QT. E.g., it contradicts the fact that the absorption and emission of electromagnetic radiation by charged-matter systems is in discrete "lumps of energy" ##\hbar \omega##. That's in fact how the whole quantum business started with Planck's analysis of the black-body spectrum.
 
  • #29
vanhees71 said:
Now it goes in circles again. The identification of expectation values with measurement outcomes had to be abandoned very early in the history of the development of QT.
The observable outcome is a property of the detector, e.g., a photocurrent. This is not quantized but a continuous burst, for each single observation of a detection event. The derivation of macroscopic electrodynamics from QED shows that the measured currents are q-expectations. No experiment in the history of quantum mechanics contradicts this.

The relation between observed outcome and true result is in general approximate (especially when the spectrum spacing is of the order of the observation error or larger) and depends on what one considers to be the true (unobservable) result. This is not observable, hence a matter of interpretation.

Here tradition and the thermal interpretation differ in what they postulate to be the true result, i.e., how to split the observed result (a left spot and a right spot) into a true result (eigenvalue or q-expectation) and an observational error (the difference). See Sections 4.1 and 4.2 of Part IV.

Since this doesn't change the experimental record, it cannot be contradicted by any experiment.
vanhees71 said:
E.g., it contradicts the fact that the absorption and emission of electromagnetic radiation by charged-matter systems is in discrete "lumps of energy" ##\hbar \omega##. That's in fact how the whole quantum business started with Planck's analysis of the black-body spectrum.
The black body spectrum was explained by Bose in 1924 by the canonical ensemble of a Bose-Einstein gas, although Planck had derived it from quantized lumps of energy. Only a quantized spectrum is needed, no discrete lumps of radiation energy.

Just as the photoeffect could be explained by Wentzel in 1928 with classical light (no lumps of energy), although Einstein had originally explained it in terms of quantized light. Only quantized electrons are needed, no discrete lumps of radiation energy.
 
  • #30
Well, Planck's derivation was in terms of a canonical ensemble of a Bose-Einstein gas. Of course, at that time it wasn't known as such.

The "q-expectation" value in general does not reflect what's measured by an actual device. For this you'd have to put information on the device into the description. This of course always have to be done when data from a real detector are evaluated, but it cannot be part of the general description of a system.

Nowadays it's no problem to prepare single photons, and all experiments show that an "entire photon" is registered (if it is registered at all) but not some fraction of a photon. So obviously at this (today indeed technically realized!) resolution, you measure discrete photon energies ##\hbar \omega## and not some expectation value.
 
  • #31
vanhees71 said:
Well, Planck's derivation was in terms of a canonical ensemble of a Bose-Einstein gas.
For the equilibrium thermodynamics of a Bose-Einstein gas one doesn't need anything more than the maximum entropy state corresponding to the q-expectation of a Hamiltonian with discrete eigenvalues. The interpretation in terms of discrete lumps of energy, which according to you is necessary, nowhere enters.

vanhees71 said:
The "q-expectation" value in general does not reflect what's measured by an actual device.
Any measured current is the q-expectation of a smeared version of the QED current; only the details of the smearing depend on the actual device. This is the q-expectation that is relevant for the thermal interpretation.

vanhees71 said:
Nowadays it's no problem to prepare single photons, and all experiments show that an "entire photon" is registered (if it is registered at all) but not some fraction of a photon. So obviously at this (today indeed technically realized!) resolution, you measure discrete photon energies ##\hbar \omega## and not some expectation value.
No. What is measured in each single photodetection event (called a photon by convention) is a magnified current of energy much larger than ##\hbar \omega##.
 
  • #32
Indeed, the Hamiltonian, representing the energy of the system (in this case an ensemble of non-interacting harmonic oscillators, representing the em. field), takes discrete values, which are the possible outcomes of precise measurement of energy, while the expectation values can take all continuous values ##\geq 0##.
 
  • #33
vanhees71 said:
the Hamiltonian, representing the energy of the system (in this case an ensemble of non-interacting harmonic oscillators, representing the em. field), takes discrete values, which are the possible outcomes of precise measurement of energy
Well, only energy differences are measured, and in general many at the same time (through emission or absorption spectra). These all have widths and do not give exact values, and recovering the energy levels from a spectrum is a highly nontrivial process.

Thus the connection between measured values (always on a continuous scale, a q-expectation of something macroscopic) and the theoretical true values is always somewhat indirect, and therefore the designation of something (eigenvalue or q-expectation) as the true measurement value is a matter of interpretation. Tradition and the thermal interpretation differ in the choice made here.
 
  • #34
A. Neumaier said:
in the thermal interpretation, outcome = q-expectation, which is always single-valued.

Yes, you're right, I left that part out. But for the benefit of @nicf, it might be worth spelling out how this works in the case of a simple binary measurement such as the Stern-Gerlach experiment. Say we are measuring a spin-z up electron using a Stern-Gerlach apparatus oriented in the x direction. Then we have the following account of what happens according to ordinary QM vs. the thermal interpretation:

(a) Ordinary QM: the measurement creates an entanglement between the spin of the electron and its momentum (which direction it comes out of the apparatus). When this entangled state interacts with the detector, decoherence occurs, which produces two non-interfering outcomes. How this becomes one outcome (or the appearance of one) depends on which interpretation you adopt (where "interpretation" here means basically collapse vs. no collapse, something like Copenhagen vs. something like MWI).

(b) Thermal interpretation: The q-expectation of the measurement is zero (an equal average of +1 and -1), but each individual measurement gives an inaccurate result because of the way the measurement/detector are constructed, so only the average over many results on an ensemble of identically prepared electrons will show the q-expectation. For each individual measurement, random nonlinear fluctuations inside the detector cause the result to be either +1 or -1.
 
  • #35
PeterDonis said:
(b) Thermal interpretation: The q-expectation of the measurement is zero (an equal average of +1 and -1), but each individual measurement gives an inaccurate result because of the way the measurement/detector are constructed, so only the average over many results on an ensemble of identically prepared electrons will show the q-expectation. For each individual measurement, random nonlinear fluctuations inside the detector cause the result to be either +1 or -1.
I might be misunderstanding you, but I actually think (b) is not what @A. Neumaier is saying, at least if by "random nonlinear fluctuations" you mean that there's some change to the unitary dynamics underlying quantum mechanics. Rather, he's saying that the nonlinearity comes from coarse-graining, that is, from neglecting some details of the state, which would actually evolve linearly if you could somehow add those details back in.

This was my reading from the beginning, and is the source of my question. I feel quite stupid right now and that I must be missing something obvious, but I'm going to press on anyway and try to be more specific about my confusion.

----------------------------------

Let's start with the setup in Section 3 of the fourth TI paper, where we've written the Hilbert space of the universe as ##H=H^S\otimes H^E## where ##H^S## is two-dimensional, and assume our initial state is ##\rho_0=\rho^S\otimes\rho^E##. We have some observable ##X^E## on ##H^E##, and we're thinking of this as being something like the position of the tip of a detector needle. Using thermal interpretation language, we can say that we're interested in the "q-probability distribution" of ##X^E## after running time forward from this initial state; this can be defined entirely using q-expectation values, so I think @A. Neumaier and I agree that this is a physically meaningful object to discuss. If, after some experiment, the q-probability distribution of ##X^E## has most of its mass near some particular ##x\in\mathbb{R}##, then I'm happy to say that there's no "measurement problem" with respect to that experiment.

Consider two state vectors ##\psi_1## and ##\psi_2##, and suppose they are orthogonal, and pick some initial density matrix ##\rho^E## for the environment. Suppose that:
(i) starting in the state ##\psi_1\psi_1^*\otimes\rho^E## and running time forward a while yields a q-probability distribution for ##X^E## with a single spike around some ##x\gg 0##, and
(ii) similarly with ##\psi_2## around ##-x##.

The question then is: what does the q-probability distribution of ##X^E## look like if we instead start with ##\frac12(\psi_1+\psi_2)(\psi_1+\psi_2)^*\otimes\rho^E##?

The two competing answers I'm worried about are:
(a) It will be bimodal, with a peak around ##x## and a peak around ##-x##
(b) It will be unimodal, concentrated around ##-x## or around ##x##, with the choice between the two depending in some incredibly complicated way on the exact form of ##\rho^E##. (In this story, maybe there's a choice of ##\rho^E## that will give something like (a), but it would require a ludicrous amount of symmetry and so there's no need to worry about it.)

The reason I'm confused, then, is that I thought that the decoherence story involves (among other things) deducing (a) from (i) and (ii). In particular, I thought it followed from the linearity of time evolution together with the whole business with decaying off-diagonal terms in the density matrix, but I don't understand the literature enough to be confident here.

Am I just wrong about what the decoherence story claims? Is it just that they assume enough symmetry in ##\rho^E## to get (a) to happen, but actually (b) is what happens for the majority of initial environment states? I can see that this would be a sensible thing to do if you think of ##\rho^E## as representing an ensemble of initial environments rather than the "true" initial environment.

There is also the separate claim that, if the environment starts in a pure state (a possibility which TI denies but many other interpretations don't) then the linearity alone should leave the whole universe in a superposition of "what would have happened if ##\rho^S## were ##\psi_1##" and the same with ##\psi_2##, which feels to me like it ought to imply an outcome like (a), and it seems like I could then extract (a) for an arbitrary ##\rho^E## by writing it as a convex combination of pure states. I assume this paragraph also contains a mistake, but I would be interested to know where it is.
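To spell out the linearity argument behind (a) as I understand it: expanding the initial state,

$$\rho_0=\tfrac12(\psi_1+\psi_2)(\psi_1+\psi_2)^*\otimes\rho^E=\tfrac12\big(\psi_1\psi_1^*+\psi_2\psi_2^*+\psi_1\psi_2^*+\psi_2\psi_1^*\big)\otimes\rho^E,$$

and since the evolution ##\rho_0\mapsto U(t)\rho_0 U(t)^*## is linear in ##\rho_0##, the q-probability of finding ##X^E## in a set ##U## comes out as

$$\langle\chi_U(X^E)\rangle_t=\tfrac12\big(P_1(U)+P_2(U)\big)+\mathrm{Re}\,\mathrm{Tr}\Big(U(t)\big(\psi_1\psi_2^*\otimes\rho^E\big)U(t)^*\,\big(1\otimes\chi_U(X^E)\big)\Big),$$

where ##P_1## and ##P_2## are the unimodal distributions from (i) and (ii). The decoherence claim, as I understand it, is that the cross term becomes negligible, which is how I arrive at (a); if that's wrong, I'd love to know which step fails.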
 
  • #36
nicf said:
he's saying that the nonlinearity comes from coarse-graining, that is, from neglecting some details of the state, which would actually evolve linearly if you could somehow add those details back in.

I think he's saying that (sort of--see below), but remember that he's also saying that, in the thermal interpretation, the "beable" is not the eigenvalue; it's the q-expectation. So the reason there is only one result is that there is only one beable. There aren't two decoherent possibilities that both exist; there is only one result, which is an inaccurate measurement of the single beable, the q-expectation.

In other words, he is not interpreting a wave function that has two entangled terms that decohere to what look like two measurement results, as actually describing two real possibilities. He's just interpreting them as a tool for calculating the q-expectation, which is what is real. So in order to understand the TI, you have to unlearn much of what you learned from other QM interpretations, since all of them focus on eigenvalues instead of q-expectations.

The reason I said "sort of" above is that, if the "beable" is q-expectations, not eigenvalues, then I'm not sure there is an underlying linear dynamics; the linear dynamics is the dynamics of the wave function, which gives you eigenvalues. I don't know that the dynamics of the q-expectations is always linear even for wave function dynamics that are always linear.

nicf said:
The two competing answers I'm worried about are:
(a) It will be bimodal, with a peak around ##x## and a peak around ##-x##
(b) It will be unimodal, concentrated around ##-x## or around ##x##, with the choice between the two depending in some incredibly complicated way on the exact form of ##\rho^E##. (In this story, maybe there's a choice of ##\rho^E## that will give something like (a), but it would require a ludicrous amount of symmetry and so there's no need to worry about it.)

I think neither of these are correct; I think the TI prediction is that the q-expectation will be peaked around ##0##. The ##+x## and ##-x## are eigenvalues, not q-expectations, and eigenvalues aren't "real" in the TI.
 
  • #37
PeterDonis said:
I think neither of these are correct; I think the TI prediction is that the q-expectation will be peaked around ##0##. The ##+x## and ##-x## are eigenvalues, not q-expectations, and eigenvalues aren't "real" in the TI.
This part I actually think I understand well enough to explain. The q-probability distribution isn't the same object as the q-expectation; it's a measure on ##\mathbb{R}##, whereas the q-expectation is just a number. (In fact, the q-expectation is the mean of the q-probability distribution.) You can extract the q-probability distribution from q-expectations of various observables --- the q-probability that ##X## lies in some set ##U## is ##\langle\chi_U(X)\rangle## where ##\chi_U## is the characteristic function of ##U## --- and so it's also "real" in the TI. (See (A5) on p. 5 of the first paper.) But there's no need to interpret it as a probability of anything; hence the "q". Eigenvalues don't enter into it at all.
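In case it helps to make this concrete, here's how I'd compute a q-probability numerically in a finite-dimensional toy example (my own sketch, just using the spectral decomposition of ##X##):

```python
import numpy as np

def q_probability(rho, X, in_U):
    # q-probability that X lies in the set U: <chi_U(X)> = Tr(rho chi_U(X)),
    # where chi_U(X) is built from the spectral decomposition of the Hermitian X
    vals, vecs = np.linalg.eigh(X)
    chi = vecs @ np.diag([1.0 if in_U(v) else 0.0 for v in vals]) @ vecs.conj().T
    return float(np.real(np.trace(rho @ chi)))

# toy check: maximally mixed qubit, X = sigma_z, U = the positive half-line
sigma_z = np.diag([1.0, -1.0])
rho = np.eye(2) / 2
print(q_probability(rho, sigma_z, lambda lam: lam > 0))   # 0.5
```

No probabilities of anything are being invoked here; it's just a number computed from q-expectations, as in (A5).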

The density matrix is also a completely legitimate physical object in the TI, since you can extract it if you know the q-expectations of all observables, and he's not changing the usual picture of how they evolve. (You can phrase the time evolution in terms of either the density matrix or the q-expectation, but they're mathematically equivalent. This happens on p. 6 of the second paper.) He does deny that we can always describe the universe with a pure state, that is, the density matrix doesn't have to have rank 1. But this isn't really all that big a deal for the question of the dynamics, since again it's possible to phrase ordinary unitary quantum mechanics in terms of density matrices without ever mentioning pure states.

I could, of course, be wrong about everything I just said, but I'm more confident about this part than about anything in my post two up from this one!
 
  • #38
PeterDonis said:
Yes, you're right, I left that part out. But for the benefit of @nicf, it might be worth spelling out how this works in the case of a simple binary measurement such as the Stern-Gerlach experiment. Say we are measuring a spin-z up electron using a Stern-Gerlach apparatus oriented in the x direction. Then we have the following account of what happens according to ordinary QM vs. the thermal interpretation:

(a) Ordinary QM: the measurement creates an entanglement between the spin of the electron and its momentum (which direction it comes out of the apparatus). When this entangled state interacts with the detector, decoherence occurs, which produces two non-interfering outcomes. How this becomes one outcome (or the appearance of one) depends on which interpretation you adopt (where "interpretation" here means basically collapse vs. no collapse, something like Copenhagen vs. something like MWI).

(b) Thermal interpretation: The q-expectation of the measured spin component is zero (an equal average of +1 and -1), but each individual measurement gives an inaccurate result because of the way the measuring device/detector is constructed, so only the average over many results on an ensemble of identically prepared electrons will show the q-expectation. For each individual measurement, random nonlinear fluctuations inside the detector cause the result to be either +1 or -1.
Another nice example of the reason for abandoning the identification of expectation values with values of observables and instead keeping the standard rule, namely that the possible values an observable can take are given by the spectrum of the self-adjoint operator representing it. For the SGE, "ordinary QM" thus provides the correct prediction: an accurately measured spin component of an electron can take only the two values ##\pm \hbar/2##. If the electron is "unpolarized", i.e., in the spin state ##\hat{\rho}_{\text{spin}}=\hat{1}/2##, the expectation value is ##\langle \sigma_z \rangle=0##, which is not in the spectrum of ##\hat{\sigma}_z## and thus, according to QT, not what should be found when the spin component is measured accurately.

Guess what: in all experiments measuring the spin of an electron accurately (as first done by Stern and Gerlach for the valence electron in a Ag atom), only the values ##\pm\hbar/2## have been found, never the expectation value ##0##. The expectation value can be obtained by accurately measuring the spin component on an ensemble of unpolarized electrons, which then of course gives ##0## (together with the correct statistical, and usually also systematic, error analysis).
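As a purely statistical illustration of that last sentence (a toy sketch; all numbers and names are invented): drawing ##\pm\hbar/2## with equal probability reproduces the expectation value ##0## only as an ensemble average, never as a single reading.

Python:
import numpy as np

hbar = 1.0                                   # work in units with hbar = 1
rng = np.random.default_rng(0)

# Unpolarized ensemble: each accurate S_z reading is +hbar/2 or -hbar/2 with probability 1/2.
N = 100_000
readings = rng.choice([+hbar / 2, -hbar / 2], size=N)

print(sorted(set(readings)))                 # only the two eigenvalues ever occur: [-0.5, 0.5]
mean = readings.mean()
stderr = readings.std(ddof=1) / np.sqrt(N)   # statistical error of the ensemble average
print(f"{mean:.4f} +/- {stderr:.4f}")        # compatible with the expectation value 0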
 
  • #39
PeterDonis said:
My understanding of the thermal interpretation (remember I'm not its author so my understanding might not be correct) is that the two non-interfering outcomes are actually a meta-stable state of the detector (i.e., of whatever macroscopic object is going to irreversibly record the measurement result), and that random fluctuations cause this meta-stable state to decay into just one of the two outcomes.
Does this mean that, if an EPR experiment is performed, "random fluctuations" at detector A are entangled with "random fluctuations" at detector B such that the outcomes are correlated?
 
  • #40
vanhees71 said:
Guess what: All experiments measuring the spin of an electron accurately (as was first done by Stern and Gerlach for the valence electron in a Ag atom) only the values ##\pm\hbar/2## have been found and not the expectation value ##0##.
What is measured accurately are the positions of the spots in the Stern-Gerlach experiment. That these spots mean an accurate spin measurement is already an interpretation.

It is this interpretation that the thermal interpretation calls into question. It replaces it by the claim that it is an inaccurate measurement of a continuous particle spin with an error of the order of ##O(\hbar)## (as expected for nonclassical measurements, with the correct classical limit). This error is magnified by the experimental arrangement to a macroscopic size.

This is fully compatible with the experimental record and produces precisely the same statistics.
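To put a rough number on the size of that error (a back-of-the-envelope bound, not a detailed model): for a spin prepared at angle ##\theta## to the analyzer axis, the q-expectation of the measured spin component is ##\tfrac{\hbar}{2}\cos\theta##, while the detector always records ##\pm\hbar/2##, so each single recorded value deviates from the q-expectation by at most
$$\Big|\pm\frac{\hbar}{2}-\frac{\hbar}{2}\cos\theta\Big|\le\hbar,$$
an error of order ##\hbar##, while the average of many recorded values still reproduces ##\tfrac{\hbar}{2}\cos\theta##.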
 
  • #41
nicf said:
This part I actually think I understand well enough to explain. The q-probability distribution isn't the same object as the q-expectation; it's a measure on ##\mathbb{R}##, whereas the q-expectation is just a number. (In fact, the q-expectation is the mean of the q-probability distribution.) You can extract the q-probability distribution from q-expectations of various observables --- the q-probability that ##X## lies in some set ##U## is ##\langle\chi_U(X)\rangle## where ##\chi_U## is the characteristic function of ##U## --- and so it's also "real" in the TI. (See (A5) on p. 5 of the first paper.) But there's no need to interpret it as a probability of anything; hence the "q". Eigenvalues don't enter into it at all.

The density matrix is also a completely legitimate physical object in the TI, since you can extract it if you know the q-expectations of all observables, and he's not changing the usual picture of how they evolve. (You can phrase the time evolution in terms of either the density matrix or the q-expectation, but they're mathematically equivalent. This happens on p. 6 of the second paper.) He does deny that we can always describe the universe with a pure state, that is, the density matrix doesn't have to have rank 1. But this isn't really all that big a deal for the question of the dynamics, since again it's possible to phrase ordinary unitary quantum mechanics in terms of density matrices without ever mentioning pure states.

I could, of course, be wrong about everything I just said, but I'm more confident about this part than about anything in my post two up from this one!
This is correct.
nicf said:
Let's start with the setup in Section 3 of the fourth TI paper, where we've written the Hilbert space of the universe as ##H=H^S\otimes H^E## where ##H^S## is two-dimensional, and assume our initial state is ##\rho_0=\rho^S\otimes\rho^E##. We have some observable ##X^E## on ##H^E##, and we're thinking of this as being something like the position of the tip of a detector needle.
This is correct.
nicf said:
Using thermal interpretation language, we can say that we're interested in the "q-probability distribution" of ##X^E## after running time forward from this initial state;
Here is the origin of your confusion. We are interested in the q-expectation of ##X^E##, and in its distribution when ##\rho_E## has a random part that cannot be controlled in the experiment under consideration. This distribution has a priori nothing to do with the q-probability distribution of ##X^E##. To relate the two constitutes part of the measurement problem - namely the part solved under suitable assumptions by decoherence.
nicf said:
Consider two state vectors ##\psi_1## and ##\psi_2##, and suppose they are orthogonal, and pick some initial density matrix ##\rho^E## for the environment. Suppose that:
(i) starting in the state ##\psi_1\psi_1^*\otimes\rho^E## and running time forward a while yields a q-probability distribution for ##X^E## with a single spike around some ##x\gg 0##, and
(ii) similarly with ##\psi_2## around ##-x##.

The question then is: what does the q-probability distribution of ##X^E## look like if we instead start with ##\frac12(\psi_1+\psi_2)(\psi_1+\psi_2)^*\otimes\rho^E##?
Because of the mixed terms in this expression, the q-probability distribution of ##X^E## cannot in general be deduced from the result of the two cases (i) and (ii). Thus the superposition arguments break down completely, due to the nonlinear dependence of expectations on the wave function.

To proceed and conclude anything, one therefore needs to make additional assumptions.
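To spell out where the mixed terms sit (this is nothing more than expanding the initial state):
$$\frac{(\psi_1+\psi_2)(\psi_1+\psi_2)^*}{2}\otimes\rho^E
=\frac{\psi_1\psi_1^*+\psi_2\psi_2^*}{2}\otimes\rho^E
+\frac{\psi_1\psi_2^*+\psi_2\psi_1^*}{2}\otimes\rho^E.$$
Since ##\langle X^E\rangle## at any later time is linear in the initial density matrix, the first part contributes half the sum of the results of (i) and (ii); the contribution of the mixed terms in the second part is simply not determined by (i) and (ii).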
nicf said:
The two competing answers I'm worried about are:
(a) It will be bimodal, with a peak around ##x## and a peak around ##-x##

The reason I'm confused, then, is that I thought that the decoherence story involves (among other things) deducing (a) from (i) and (ii).
In each particular experiment, the q-expectation of the pointer variable is fully determined by ##\rho_E## and will in each case be close to one of ##\pm x##. The q-probability distribution of the pointer variable is irrelevant for the TI but is centered close to the q-expectation.

But in each experiment, ##\rho_E## will be different, and the sign of the q-expectation depends chaotically on the details, hence appears random.

On the other hand, the q-probability distribution deduced by decoherence is based on assuming ##\rho_E## to be an exact equilibrium state, which is the case only in the mean over many experiments. Therefore it gives the mean of the q-probability distributions of the pointer variable in all these experiments, which is bimodal.
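A deliberately crude toy picture of this (a caricature only, not the actual detector dynamics; every name and number below is invented): treat the uncontrollable detail of ##\rho_E## as a random seed, let a bistable pointer relax nonlinearly to one of its two stable values, and compare a single run with the statistics over many runs.

Python:
import numpy as np

rng = np.random.default_rng(1)

def pointer_outcomes(env_seeds, steps=4000, dt=0.01):
    """Toy bistable pointer: q = +1 and q = -1 are the stable states of dq/dt = q - q**3.
    The uncontrollable environment detail enters only as a tiny initial offset, and the
    basin each pointer relaxes into depends sensitively on that detail."""
    q = 1e-6 * env_seeds
    for _ in range(steps):
        q = q + dt * (q - q ** 3)
    return q

# Many repetitions of the "same" experiment, each with a different uncontrollable environment detail.
outcomes = pointer_outcomes(rng.standard_normal(5000))
print(outcomes[0])                                       # a single run ends near +1 or -1, never near 0
print(np.histogram(outcomes, bins=[-2.0, 0.0, 2.0])[0])  # over many runs the counts are strongly bimodal
print(outcomes.mean())                                   # the mean over runs is close to 0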
PeterDonis said:
I think the TI prediction is that the q-expectation will be peaked around ##0##. The ##+x## and ##-x## are eigenvalues, not q-expectations, and eigenvalues aren't "real" in the TI.
No. You are probably thinking of the q-expectation of the measured variable, which in the symmetric case is zero, and for identically prepared qubits this is so in every realization. But for the interaction with the detector, the whole state of the measured system counts, not only its q-expectation; thus this q-expectation is fairly irrelevant (except when discussing errors).

Under discussion is, however, not the distribution of the q-expectation of the measured variable but the distribution of the q-expectations of the measurement results. According to the thermal interpretation, the q-expectation of the pointer variable will be essentially random (i.e., depending on the details of ##\rho_E##), with a strongly bimodal distribution reflecting the binary q-probability distribution of the variable measured.

This is indeed what decoherence claims as average result. However, decoherence cannot resolve it into single events since according to all traditional interpretations of quantum mechanics, single events (in a single world) have no theoretical representation in the generally accepted quantum formalism.

Note that this even holds for Bohmian mechanics. Here single events have a theoretical representation, but this representation is external to the quantum formalism, given by the additionally postulated position variables.

Only the thermal interpretation represents single events within the generally accepted quantum formalism, though in a not generally accepted way.
 
  • #42
A. Neumaier said:
What is measured accurately are the positions of the spots in the Stern-Gerlach experiment. That these spots mean an accurate spin measurement is already an interpretation.

It is this interpretation that the thermal interpretation calls into question. It replaces it by the claim that it is an inaccurate measurement of a continuous particle spin with an error of the order of ##O(\hbar)## (as expected for nonclassical measurements, with the correct classical limit). This error is magnified by the experimental arrangement to a macroscopic size.

This is fully compatible with the experimental record and produces precisely the same statistics.
If the expectation value were the "true spin component", as you claim, you'd expect one blurred blob around ##\sigma_z=0##, but that's not what's observed in the SGE. The observed splitting was even taken as one of the "confirmations" of "old quantum theory", though ironically only because two mistakes cancelled: a gyro factor of 1 was assumed but applied to orbital angular momentum, and one then had to argue away why one doesn't see the three lines that would be natural for ##\ell=1## but only two.

With the advent of half-integer spin and spin 1/2 with a gyro factor of 2, all these quibbles were resolved. The gyro factor of 2 was, by the way, known from refinements of the Einstein-de Haas experiment. De Haas himself had already measured a value closer to the correct value 2, but was dissuaded by Einstein from publishing it, because "of course the gyro factor must be 1" according to classical Amperian models of magnetic moments (ok, in 1915 you can't blame Einstein, but de Haas should have insisted on publishing all the experimental results, not just the ones Einstein liked better; that's another story, though).
 
  • #43
vanhees71 said:
If the expectation value was the "true spin component" as you claim, you'd expect one blurred blob around ##\sigma_z=0##
You'd expect that assuming a Gaussian error.

But there is no necessity for a Gaussian error distribution. If you measure something with true value 0.37425 with a 4-digit digital device, the error distribution will be discrete, not Gaussian.

The arrangement in a Stern-Gerlach experiment, together with simple theory, accounts for the discreteness of the response. Thus one need not attribute the discreteness to the value of the spin; one can just as well attribute it to the detection setup.
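A short numerical version of that example (the noise level is arbitrary, and rounding to four decimal places stands in for the 4-digit readout):

Python:
import numpy as np

true_value = 0.37425
rng = np.random.default_rng(2)

# Tiny, even Gaussian, underlying noise; but the device only outputs 4 digits,
# so the recorded values, and hence the error distribution, are discrete.
readings = np.round(true_value + 1e-4 * rng.standard_normal(10_000), 4)
errors = readings - true_value
print(sorted(set(np.round(errors, 5))))   # a handful of discrete error values, not a Gaussian continuum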
 
  • #44
A. Neumaier said:
Here is the origin of your confusion. We are interested in the q-expectation of ##X^E##, and in its distribution when ##\rho_E## has a random part that cannot be controlled in the experiment under consideration. This distribution has a priori nothing to do with the q-probability distribution of ##X^E##. To relate the two constitutes part of the measurement problem - namely the part solved under suitable assumptions by decoherence.

Because of the mixed terms in this expression, the q-probability distribution of ##X^E## cannot in general be deduced from the result of the two cases (i) and (ii). Thus the superposition arguments break down completely, due to the nonlinear dependence of expectations on the wave function.

To proceed and conclude anything, one therefore needs to make additional assumptions.

In each particular experiment, the q-expectation of the pointer variable is fully determined by ##\rho_E## and will in each case be close to one of ##\pm x##. The q-probability distribution of the pointer variable is irrelevant for the TI but is centered close to the q-expectation.

But in each experiment, ##\rho_E## will be different, and the sign of the q-expectation depends chaotically on the details, hence appears random.

On the other hand, the q-probability distribution deduced by decoherence is based on assuming ##\rho_E## to be an exact equilibrium state, which is the case only in the mean over many experiments. Therefore it gives the mean of the q-probability distributions of the pointer variable in all these experiments, which is bimodal.
Excellent! I think this gets at the heart of what I was asking about. Specifically, I think I was failing to appreciate both
(a) the obvious-once-you've-stated-it fact that mixed terms like ##\psi_1\psi_2^*\otimes\rho^E## prevent us from deducing the ##\frac{1}{\sqrt{2}}(\psi_1+\psi_2)## result from the ##\psi_1## and ##\psi_2## results, and
(b) the fact that the decoherence arguments depend on ##\rho^E## being an exact equilibrium state, and so in particular we shouldn't expect the "bimodal" result from a generic ##\rho^E##. That is, I thought the decoherence argument was contradicting your analysis of the particular case, whereas they are in fact merely silent about the particular case and only making a claim about the average.

With respect to q-expectations vs. q-probability distributions, I think we actually agree and I was just speaking imprecisely. I'm happy to replace every instance of "the q-probability distribution is tightly centered around ##x##" with "the q-expectation is ##x## and the q-variance is small"; I didn't mean to imply that the q-probability distribution necessarily has anything to do with the probability that the pointer needle is in a particular place. But it seems like you do want the q-variance to be small in order for ##\langle X^E\rangle## to be an informative measure of "where the pointer needle actually is". This is my reading, for example, of the discussion on p. 15 of the second paper.

----------------------------------------------

I think my question has basically been answered, but out of curiosity I'm interested in drilling deeper into the bit about the mixed terms in the q-expectation. Specifically, how important is it to your picture of measurement that we can't assign pure states to macroscopic objects? In a modified TI where the whole universe can always be assigned a pure state, does everything break in the story we're discussing? It seems to me like it might break, although my reason for thinking this could just be a repetition of the same mistake from before, so if you'll indulge me once again I want to lay out my thinking.

Suppose again that the state space of the universe is ##H^S\otimes H^E## with ##H^S## two-dimensional, but suppose now that we can represent the state as a vector ##\psi_S\otimes\psi_E##. Fix ##\psi_E## and an observable ##X^E##, and let ##U## be the unitary operator which runs time forward enough for the measurement to be complete. Suppose that ##\psi_1## and ##\psi_2## are orthogonal, and that
(i') in the state ##\phi_1:=U(\psi_1\otimes\psi_E)##, we have ##\langle X^E\rangle=x## and the q-variance is very small, and
(ii') similarly for ##\phi_2:=U(\psi_2\otimes\psi_E)## and ##-x##.

Then ##U(\frac{1}{\sqrt{2}}(\psi_1+\psi_2)\otimes\psi_E)=\frac{1}{\sqrt{2}}(\phi_1+\phi_2)##, and we can ask what ##\langle X^E\rangle## is in this state. The answer is ##\frac{1}{2}\langle\phi_1+\phi_2,X^E(\phi_1+\phi_2)\rangle##, and we again have "mixed terms" which prevent us from deducing the answer from (i') and (ii').

But in this case, I think the smallness of the q-variance might get us into trouble. In particular, in the extreme case where ##\phi_1## and ##\phi_2## are eigenvectors of ##X^E##, the orthogonality kills off the mixed terms (since ##\phi_1## and ##\phi_2## are still orthogonal), and it seems like an extension of this argument should mean that the mixed terms are very small when the q-variances are very close to 0; I think I have most of a proof in mind using the spectral theorem. If this is right, it seems like, in this state, ##\langle X^E\rangle## should be close to 0, meaning it can't be ##\pm x##.
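For the record, the estimate I have in mind goes like this (my own sketch): write ##X^E\phi_1=x\phi_1+r_1## and ##X^E\phi_2=-x\phi_2+r_2##, where ##\|r_1\|=\sigma_1## and ##\|r_2\|=\sigma_2## are the q-standard deviations. Since ##\phi_1\perp\phi_2## (unitarity preserves the orthogonality of ##\psi_1\otimes\psi_E## and ##\psi_2\otimes\psi_E##), the Cauchy-Schwarz inequality gives
$$\big|\langle X^E\rangle\big|
=\tfrac12\big|x+(-x)+\langle\phi_1,X^E\phi_2\rangle+\langle\phi_2,X^E\phi_1\rangle\big|
=\tfrac12\big|\langle\phi_1,r_2\rangle+\langle\phi_2,r_1\rangle\big|
\le\tfrac12(\sigma_1+\sigma_2),$$
so in this pure-state version the q-expectation of the pointer would indeed be pinned near ##0## whenever both q-variances are small.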

Is this right, or am I spouting nonsense again? If so, what is it about only using density matrices that "saves us" from this?
 
  • #45
Hi @A. Neumaier, @PeterDonis,
I read this post regarding the thermal interpretation and the Stern–Gerlach experiment:
PeterDonis said:
For each individual measurement, random nonlinear fluctuations inside the detector cause the result to be either +1 or -1.
Considering these words, I wonder if the magnets in the experiment are considered part of the detector, or the "detector" is only the screen, so to say.
 
  • #46
DennisN said:
I wonder if the magnets in the experiment are considered part of the detector, or the "detector" is only the screen, so to say.

It's the latter. The magnets are part of the internals of the experiment; all they do is implement a reversible unitary transformation on the quantum state of the electron. There's no decoherence in that part of the experiment. Only the screen has decoherence and random fluctuations, etc.
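Schematically (my notation for the wave packets), the magnet stage for a spin-z-up electron measured along x is just
$$U_{\text{magnet}}\big(|\uparrow_z\rangle\otimes|\psi_0\rangle\big)
=\frac{1}{\sqrt2}\Big(|\uparrow_x\rangle\otimes|\psi_+\rangle+|\downarrow_x\rangle\otimes|\psi_-\rangle\Big),$$
with ##|\psi_0\rangle## the incoming wave packet and ##|\psi_\pm\rangle## the two deflected packets. Nothing irreversible has happened at this stage; only the subsequent interaction with the screen involves decoherence.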
 
  • #47
PeterDonis said:
It's the latter. The magnets are part of the internals of the experiment; all they do is implement a reversible unitary transformation on the quantum state of the electron. There's no decoherence in that part of the experiment. Only the screen has decoherence and random fluctuations, etc.
Thanks!
 
  • #48
PeterDonis said:
It's the latter. The magnets are part of the internals of the experiment; all they do is implement a reversible unitary transformation on the quantum state of the electron. There's no decoherence in that part of the experiment. Only the screen has decoherence and random fluctuations, etc.
Hmm, that got me thinking... if I remember correctly, sequential measurements of spin in one direction yield the same result (+1 and then +1 again, or -1 and then -1 again). How does this fit in with the thermal interpretation if it is random nonlinear fluctuations in the screen that cause decoherence*?

* From the previous quote:
PeterDonis said:
For each individual measurement, random nonlinear fluctuations inside the detector cause the result to be either +1 or -1.

I'm confused. :smile:
 
  • #49
DennisN said:
sequential measurements of spin in one direction yield the same result (+1 and then +1 again, or -1 and then -1 again)

The term "sequential measurements" is a misnomer here, because in the process you're describing, there are not multiple detector screens, there's only one. The electron just passes through multiple Stern-Gerlach magnets before it gets to the one detector screen. So there is only one measurement being made in this process.

"Sequential measurements" properly interpreted would mean passing an electron through one Stern-Gerlach magnet, then have it hit a detector screen, then somehow take the same electron and pass it through another Stern-Gerlach magnet and have it hit a second detector screen. But that's not actually possible because once the electron hits the first detector screen it gets absorbed and you can't manipulate it any more.

DennisN said:
How does this fit in with the thermal interpretation if it is random nonlinear fluctuations in the screen that cause decoherence*?

Because there's only one screen. See above.
 
  • #50
DarMM said:
It's a property of the system like angular momentum in Classical Mechanics.
However, this is deduced from the calculation of a mathematical expected value!

http://www.cithep.caltech.edu/~fcp/physics/quantumMechanics/densityMatrix/densityMatrix.pdf



/Patrick
 
