# Bell vs Kolmogorov: Unravelling Probability Theory Limits

Killtech
TL;DR Summary
How do people get the idea Bell's inequalities invalidate probability theory?
For some reason, on these forums I have found some odd claims that Bell's inequalities somehow disqualify classical probability theory in general, rather than merely showing the limits of locality. I do not understand where that misunderstanding comes from, so I decided to find out.

Let's start by looking at Bell's formulation of the problem in terms of classical probability theory. While his choice of probability space is generic, his random variables are very special. The concept of locality is translated into what those random variables may or may not depend on, and this constraint is crucial in proving the classical bound. In the literature this condition is called Bell locality, or factorizability, and it is shared by all inequalities of this type. Conversely, dropping that constraint breaks the inequality, since probability theory itself knows no such limits.
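To make the factorizability constraint concrete, here is a minimal numerical sketch (my own toy model, not Bell's original construction): each outcome is a function of the local setting and a shared hidden variable only, and the CHSH combination then stays within the classical bound of 2.

```python
import numpy as np

rng = np.random.default_rng(0)

def E(a, b, n=200_000):
    """Correlation E(a,b) in a toy Bell-local model: the shared hidden
    variable lam is a random angle, and each outcome is a function of
    the *local* setting and lam only (factorizability)."""
    lam = rng.uniform(0, 2 * np.pi, n)
    A = np.sign(np.cos(a - lam))    # Alice: depends on a and lam only
    B = -np.sign(np.cos(b - lam))   # Bob:   depends on b and lam only
    return np.mean(A * B)

# CHSH combination at settings chosen to maximize it for this model
a, a2, b, b2 = 0.0, np.pi / 2, np.pi / 4, -np.pi / 4
S = E(a, b) + E(a, b2) + E(a2, b) - E(a2, b2)
print(abs(S))  # ≈ 2 (up to sampling noise): the classical bound
```

Any other choice of local response functions gives the same ceiling; only dropping the locality of `A` and `B` lets the combination grow.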

It is also very enlightening to check the counterexample that shows how QT violates the inequality, e.g. in the CHSH case. It is a very easy derivation describing how the correlations come about... so let's check what happens there from a probability-theory perspective. And indeed it's pretty standard stuff. Probabilities are calculated from an object of the generic form ##|\langle a|i\rangle|^2##, and for this to produce valid probabilities it must be part of a stochastic matrix (once we expand the ##|i\rangle## states into a basis) - one of the classics of probability theory, normally introduced for Markov chains and stochastic processes. So an implicitly defined discrete-time Markov process is used to model the measurements in CHSH, and that is all it takes to violate the inequality. The crucial difference from the classical limit is that this process isn't local: it effectively allows the settings of both detectors to communicate with the underlying state during the transition - i.e. it bluntly ignores Bell locality.
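As a quick illustration of that observation (the rotation angle here is an arbitrary choice of mine), one can check directly that the Born-rule matrix ##|\langle a|i\rangle|^2## built from a unitary is a stochastic matrix and propagates probability vectors like any Markov transition matrix:

```python
import numpy as np

# Born-rule probabilities |<a|i>|^2 for a qubit measurement whose basis
# is rotated by an arbitrary angle theta against the input basis.
theta = np.pi / 3  # illustrative choice
U = np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
              [np.sin(theta / 2),  np.cos(theta / 2)]])
T = np.abs(U) ** 2  # entries T[a, i] = |<a|i>|^2

# Since U is unitary, T is doubly stochastic: rows and columns both
# sum to 1, which is exactly a Markov transition matrix.
print(T.sum(axis=0))  # ≈ [1, 1]

# It propagates any probability vector over the input states:
p_in = np.array([0.25, 0.75])
p_out = T @ p_in
print(p_out)  # a valid probability vector (non-negative, sums to 1)
```

The doubly stochastic structure is specific to unitaries; a generic Markov chain only needs the column (or row) sums to be 1.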

It may be worth mentioning that in this case the stochastic matrix is still restricted by the rules of QT, whereas for a general Markov process any stochastic matrix is valid. So without QT, Markov's theory is not only able to violate Bell's inequality but can do so maximally, well beyond Tsirelson's bound.
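A quick sketch of the two bounds (the "PR box" is the standard textbook example of a non-quantum but perfectly valid probability model, included here purely for illustration):

```python
import numpy as np

# CHSH value S = E(a,b) + E(a,b') + E(a',b) - E(a',b') in two regimes.

# Quantum singlet correlations E(x,y) = -cos(x - y) at the optimal angles:
a, a2, b, b2 = 0.0, np.pi / 2, np.pi / 4, -np.pi / 4
E = lambda x, y: -np.cos(x - y)
S_quantum = abs(E(a, b) + E(a, b2) + E(a2, b) - E(a2, b2))
print(S_quantum)  # 2*sqrt(2) ≈ 2.828, Tsirelson's bound

# A "PR box" - a normalized, non-negative, even non-signalling
# conditional probability table that no quantum stochastic matrix
# realizes - has correlations E = +1, +1, +1, -1:
S_prbox = 1 + 1 + 1 - (-1)
print(S_prbox)  # 4, the algebraic maximum, beyond Tsirelson's bound
```

So a generic stochastic model can reach 4, while quantum stochastic matrices cap out at ##2\sqrt{2}##.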

Taken as it is, however, the matrix is a little atypical. It does not map the entire ensemble space onto itself (as one would expect for a usual Markov process) but instead takes only a discrete subset of the state space (only ensembles composed of the states |i⟩, with no superpositions thereof) and maps these onto ensembles composed of |a⟩ states. However, QT guarantees that post measurement only eigenstates of the observable operator survive, while the initial basis can be chosen at will, so we have a stochastic matrix for every such choice. Together these uniquely specify the full Markov kernel (of which each stochastic matrix is just an extract) of a process transitioning any initial ensemble onto the ensemble after measurement (the entire state space is too large to allow a matrix depiction).

Anyhow, that process is peculiar in that it reaches detailed balance (equilibrium) after a single step: a second application does not change the resulting ensemble any further - so it looks like the limit of some underlying process. Generally it is indeed very unusual to express measurement via a process as opposed to a simple random variable (like Bell did). The latter are always compatible with each other, while the former usually are not - i.e. different processes acting on the very same system and its state normally don't commute.
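The one-step equilibrium is easy to verify numerically: writing the non-selective projective measurement as a channel ##K(\rho)=\sum_a P_a\rho P_a##, a second application changes nothing. A small sketch (basis angle and input state are arbitrary choices of mine):

```python
import numpy as np

# Non-selective projective measurement as a channel on density matrices:
# K(rho) = sum_a P_a rho P_a. The claim: K(K(rho)) = K(rho).
theta = 0.7  # arbitrary measurement-basis angle
v0 = np.array([np.cos(theta / 2), np.sin(theta / 2)])
v1 = np.array([-np.sin(theta / 2), np.cos(theta / 2)])
projectors = [np.outer(v0, v0), np.outer(v1, v1)]  # rank-1, orthogonal

def measure(rho):
    """One step of the measurement process (outcome not selected)."""
    return sum(P @ rho @ P for P in projectors)

psi = np.array([0.6, 0.8])   # arbitrary normalized pure state
rho = np.outer(psi, psi)

once = measure(rho)
twice = measure(once)
print(np.allclose(once, twice))  # True: equilibrium after one step
```

The identity ##K\circ K = K## follows from ##P_a P_b = \delta_{ab}P_a##, so this holds for any projective measurement, not just this example.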

So, how do people get the idea that any of this invalidates classical probability theory?

Killtech said:
Summary:: How do people get the idea Bell's inequalities invalidate probability theory?

For some reason, on these forums I have found some odd claims that Bell's inequalities somehow disqualify classical probability theory in general, rather than merely showing the limits of locality. I do not understand where that misunderstanding comes from, so I decided to find out.

Killtech said:
...So, how do people get the idea that any of this invalidates classical probability theory?

Is there a specific position you can reference? It is a bit difficult to know what point you are referring to.

If you assume "EPR/Bell realism" and "EPR/Bell locality", then Bell's Theorem can lead to some disagreement with traditional probability concepts. For example, you *can* get to negative probabilities with these assumptions. Of course that implies to me that one or more assumptions are invalid, as I don't think there are negative probabilities.

http://www.drchinese.com/David/Bell_Theorem_Negative_Probabilities.htm

DrChinese said:
If you assume "EPR/Bell realism" and "EPR/Bell locality", then Bell's Theorem can lead to some disagreement with traditional probability concepts.
The problem is that Kolmogorov's theory assumes neither. So why would it on its own have an issue with Bell?

DrChinese said:
Is there a specific position you can reference? It is a bit difficult to know what point you are referring to.

Killtech said:
The problem is that Kolmogorov's theory assumes neither. So why would it on its own have an issue with Bell?

Well, in the example I referenced: the quantum prediction (assuming EPR/Bell realism) leads to a violation of the first and third axioms. There are 8 subcases presented, and cases [2] and [7] have a combined probability of occurring of -10.36% (violating the first axiom, that all probabilities be non-negative real numbers) - either [2] or [7] or both must be negative. The other 6 cases have a combined likelihood of occurring of 110.36% (violating the third axiom, since the probability of the full set of 8 cases must be at least that of any subset of 6 of them) - because all 8 cases must by definition total 100%.
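A simplified variant of this argument (not the exact numbers from the linked page; the angles are my own illustrative choice) is Wigner's form of the inequality: if all three spin values existed jointly with non-negative case probabilities, then P(a+, b+) ≤ P(a+, c+) + P(c+, b+) would have to hold, but the singlet statistics violate it:

```python
import numpy as np

# Singlet statistics: P(x+, y+) = (1/2) sin^2((x - y) / 2).
P = lambda x, y: 0.5 * np.sin((x - y) / 2) ** 2

# If all three spin values existed jointly with non-negative case
# probabilities (8 cases), Wigner's inequality would follow:
#   P(a+, b+) <= P(a+, c+) + P(c+, b+)
a, c, b = 0.0, np.pi / 3, 2 * np.pi / 3  # 0°, 60°, 120° (illustrative)
lhs = P(a, b)
rhs = P(a, c) + P(c, b)
print(lhs > rhs)  # True: inequality violated (lhs ≈ 0.375, rhs ≈ 0.25),
                  # so some case probability would have to be negative
```

The violation means no non-negative joint distribution over the 8 cases can reproduce these pairwise probabilities, which is exactly where "negative probabilities" sneak in once joint existence is assumed.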

But yes, you are correct: If you don't consider Bell realism (which is the same as the EPR "elements of reality"), then there is no conflict between Bell and Kolmogorov. On the other hand, that's the entire point of contention. If you think Kolmogorov is wrong to begin with, then maybe you can rescue local realism.

Or you simply reject (classical) realism, i.e. you require an observer dependent reality (a/k/a "contextuality"). The choice of measurement basis somehow influences the reality witnessed elsewhere.

EPR rejected this path, and concluded: "No reasonable definition of reality could be expected to permit this." Of course, EPR didn't know about Bell. Had they known, I believe that the EPR authors would have conceded the point: elements of reality are limited to those that can be simultaneously measured/predicted.

Killtech
Demystifier said:
Oh god, wow. I never thought there was any other option than 1.

Seeing the impact of measurement in double-slit which-way experiments, which leads to objectively very different outcomes, I would never have thought of anything other than option 1. For me a change in the experimental result is always due to a physical process of some sort. I never thought about the idea of seeing a different outcome and yet assuming no change in the system. Isn't this a bit like the often quoted phrase on the "definition of insanity", just in reverse?

Option 2 means that I favor my theory over the simple reality of the experiment and therefore have to build up a completely new idea of reality... okay. Fair enough, Kolmogorov's definition of probability is made for our reality, i.e. option 1 only, so if we opt for another it makes sense to think of something else.

But what's the benefit of option 2 anyway? I see that both approaches can potentially deal with the problem, but the second one makes everything way more complicated. I mean, you will lose so much time and effort translating everything to a different reality that you may lose sight of the right questions to ask to expand your understanding.

... and in the discussion I read a lot of misconceptions (how can I get a quote from a closed thread in here?).

Where quantum probabilities get weird is if you take the events to be of the form "The probability that the particle has spin-up in the x-direction, given that it has spin-up in the z-direction". The collapse interpretation doesn't give a meaning to such statements.
- @stevendaryl

Really? I would think the meaning is rather obvious if you can put it correctly in terms of Kolmogorov's theory.

Demystifier
DrChinese said:
But yes, you are correct: If you don't consider Bell realism (which is the same as the EPR "elements of reality"), then there is no conflict between Bell and Kolmogorov. On the other hand, that's the entire point of contention. If you think Kolmogorov is wrong to begin with, then maybe you can rescue local realism.

Or you simply reject (classical) realism, i.e. you require an observer dependent reality (a/k/a "contextuality"). The choice of measurement basis somehow influences the reality witnessed elsewhere.

EPR rejected this path, and concluded: "No reasonable definition of reality could be expected to permit this." Of course, EPR didn't know about Bell. Had they known, I believe that the EPR authors would have conceded the point: elements of reality are limited to those that can be simultaneously measured/predicted.
But why in heaven's name would I want to save Bell's definition of local realism at all costs? Fair enough, coming from relativity I see it to be well motivated. But if it fails in experiments, then it fails. The factorization is a very strict restriction after all, which I find far less attractive to hold on to than an intuitionist definition of probability.

Opting for Kolmogorov does not reject classical realism on its own. You have to overburden yourself with additional assumptions to go there. You have to assume that the physics as described by each relativistic observer is not just a convenient and correct subjective description of observations - no, you have to top it off by assuming that each such description also directly depicts objective reality (i.e. the math is not just a convenient formalism to predict reality; it actually is reality). Such assumptions are purely metaphysical, as they change nothing in terms of predictions, but indeed together with Kolmogorov they will break classical realism - though I have to wonder whether having such assumptions still allows one to call that realism classical.

But I myself am inclined toward an intuitionist approach, so I don't like to overburden myself with a lot of odd assumptions that serve no purpose in terms of making model predictions.

Killtech said:
Seeing the impact of measurement in double-slit which-way experiments, which leads to objectively very different outcomes, I would never have thought of anything other than option 1. For me a change in the experimental result is always due to a physical process of some sort. I never thought about the idea of seeing a different outcome and yet assuming no change in the system. Isn't this a bit like the often quoted phrase on the "definition of insanity", just in reverse?
I don't understand what's puzzling for you. In QT it is always important to precisely state the preparation procedure (here: a particle with a pretty precise momentum directed towards a double slit from a source sufficiently far from these slits) and the measurement made.

So how do you have to measure (register) the particle to have (a) "which-way information" or (b) observe "interference of probability waves"?

Very simple: In case (a) you have to put the screen (photoplate or more modern a CCD Cam) close enough to the slits such that the partial waves originating from the slits don't overlap. In case (b) you have to put the screen in a sufficiently large distance, i.e., far enough such that the partial waves originating from the slits do overlap.

Why is it surprising to you that the patterns are different if the experimental setups are different? This simple example also shows that you can have either which-way information (screen near the slits) or interference of the probability waves (screen far from the slits), because you can only have the screen either near the slits or far away from them, but not both at the same time.

Of course you see the distribution of particles on the screen in both cases only if you use a large ensemble of equally prepared particles in the same setup. The single particle's position (and time of registration if needed) on the screen is probabilistic with the probability distribution given by the wave function of the particle behind the slit.
Killtech said:
Option 2 means that I favor my theory over the simple reality of the experiment and therefore have to build up a completely new idea of reality... okay. Fair enough, Kolmogorov's definition of probability is made for our reality, i.e. option 1 only, so if we opt for another it makes sense to think of something else.

But what's the benefit of option 2 anyway? I see that both approaches can potentially deal with the problem, but the second one makes everything way more complicated. I mean, you will lose so much time and effort translating everything to a different reality that you may lose sight of the right questions to ask to expand your understanding.

... and in the discussion I read a lot of misconceptions (how can I get a quote from a closed thread in here?).

Where quantum probabilities get weird is if you take the events to be of the form "The probability that the particle has spin-up in the x-direction, given that it has spin-up in the z-direction". The collapse interpretation doesn't give a meaning to such statements.
- @stevendaryl
What's weird here? Quantum theory tells you precisely what happens: if you have prepared a spin-1/2 particle to have a determined spin component ##\sigma_x=+\hbar/2## and then you measure the spin component ##\sigma_z##, you get ##\sigma_z=+\hbar/2## with probability 1/2 and ##\sigma_z=-\hbar/2## with probability 1/2. There's nothing weird about this. Quantum theory just tells you how to describe the "probabilistic reality" of, well, quantum phenomena ;-).
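That calculation is a one-liner in terms of amplitudes; a minimal sketch:

```python
import numpy as np

# Prepare the sigma_x = +hbar/2 eigenstate, measure sigma_z.
x_plus = np.array([1.0, 1.0]) / np.sqrt(2)   # |x+> in the z-basis
z_plus, z_minus = np.array([1.0, 0.0]), np.array([0.0, 1.0])

p_up = abs(z_plus @ x_plus) ** 2     # |<z+|x+>|^2
p_down = abs(z_minus @ x_plus) ** 2  # |<z-|x+>|^2
print(p_up, p_down)  # ≈ 0.5 each: a perfectly ordinary distribution
```

Nothing here leaves the territory of a standard probability distribution over two outcomes.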
Killtech said:
Really? I would think the meaning is rather obvious if you can put correctly it terms of Kolmogorovs theory.
Last but not least, I don't see what the problem with Kolmogorov's axiomatic system of probability theory is. Whenever you describe an experiment feasible in the real world, the probabilities according to quantum theory obey these axioms. Of course, if you ask for probabilities which can never be measured on each single system prepared in a given (pure or mixed) state, these don't follow Kolmogorov's system, but then those fictitious probabilities are also nothing that can be tested by observations in the real world and thus are not subject to what quantum theory describes.

vanhees71 said:
Why is it surprising to you that the patterns are different if the experimental setups are different?
I never said I was surprised about that. I'm not. If the experimental setups are different, I'm perfectly fine with that, as it means the state space or the initial conditions have changed. If we go down that road we do everything classically, by the book. That's option 1.

It is however surprising to me that one would try to explain the same thing without admitting that the experimental setup has changed - i.e. this is what I understand option 2 to be.

(See the links from @Demystifier for what the options are - that is what I was referring to.)

vanhees71 said:
Last but not least, I don't see what the problem with Kolmogorov's axiomatic system of probability theory is. Whenever you describe an experiment feasible in the real world, the probabilities according to quantum theory obey these axioms. Of course, if you ask for probabilities which can never be measured on each single system prepared in a given (pure or mixed) state, these don't follow Kolmogorov's system, but then those fictitious probabilities are also nothing that can be tested by observations in the real world and thus are not subject to what quantum theory describes.
Yes! Thank you! This is precisely my point: if one just puts it correctly within the framework of Kolmogorov and Markov, QT works perfectly fine within classical probability theory. It's just a question of looking at how QT represents measurements and formulating that procedure correctly within Markov's terminology (there is no way we can omit time from the probability model, so we need Kolmogorov+Markov to handle it properly).

If we do that and understand that QT calculates measurement predictions implicitly via a process, we already obtain a model that isn't even able to express any probabilities that cannot be measured.

vanhees71
Why are you bringing Markov in? I don't understand what you mean. I know him in connection with stochastic processes, and if you describe an open quantum system, assuming it to be described by a Markov process is usually an approximation.

https://en.wikipedia.org/wiki/Markov_chain

vanhees71 said:
Why are you bringing Markov in? I don't understand what you mean. I know him in connection with stochastic processes, and if you describe an open quantum system, assuming it to be described by a Markov process is usually an approximation.

https://en.wikipedia.org/wiki/Markov_chain
Well, Markov can just as well describe deterministic processes or mixtures thereof. Almost anything time-dependent, really.

Now, it's not really me. Look at how QT calculates/demonstrates the violation of Bell's inequality in the CHSH case. That could be an exercise from a second-semester probability course where Markov chains are introduced. Fair enough, I follow my math professor's advice: "when it comes to physics, don't listen to what they say, just look at what they do". And if QT calculates it like this - which isn't the most intuitive approach to describing a measurement - but it works, then I'm fine with it. But it practically uses a transition matrix, which defines a Markov process.

Killtech said:
But it practically uses a transition matrix, which defines a Markov process.
A Markov process is a simplification, and talking about the "state" of a photon is deceptive. (This is why people have difficulties understanding what the wave function represents.) You cannot measure the polarization in an instant ("at time ##t##"), because what polarization means is the correlation between electric field vectors at two successive instants of time.

WernerQH said:
A Markov process is a simplification, and talking about the "state" of a photon is deceptive. (This is why people have difficulties understanding what the wave function represents.) You cannot measure the polarization in an instant ("at time ##t##"), because what polarization means is the correlation between electric field vectors at two successive instants of time.
Yeah, this coincides perfectly with what modelling measurement via a process means. The complete system can only pick one time evolution for its next step, so the choice bars you from doing any two measurements at the same time.

Basically, if you accept that modelling, you can never write ##A## for a random variable but have to index all of them with ##A_t##, which means there is some time evolution between different instances of ##A## (note that therefore ##\operatorname{Cor}(A_t, A_{t'})## may be anything). And as the calculus works, every measurement advances time at least by an abstract amount, so that you have ##A_{t_{\text{before }B}}## and ##A_{t_{\text{after }B}}## before and after ##B_t## was measured, while ##A_t## itself does not exist. The time evolution has to be fixed by defining which random variables are measured and when, as that actually specifies the exact experimental setup you are using.

A Markov process may not tell you what the state really is. But it tells you what information is needed to describe it in order to make predictions for any scenario, and it separates what we know about the system (reflected by ensembles and probabilities) from what information the system itself holds (described by the state space) and what experimental setup is used (described by the effective time evolution including all measurements).

Killtech said:
I follow my math professor's advice: "when it comes to physics, don't listen to what they say, just look at what they do".

(Off topic: I use that rule for politicians and athletes.)

Killtech said:
But why in heaven's name would I want to save Bell's definition of local realism at all costs?

You don't. Instead reject local realism, and more specifically, EPR/Bell realism. In essence, and despite protestations to the contrary by many: all interpretations of QM are contextual (as can be seen by spin entanglement correlation statistics, which depend on context and nothing else). The choice of what to measure steers the outcomes, and you end up with a subjective, observer dependent world.

DrChinese said:
The choice of what to measure steers the outcomes
Obviously, since the choice of what to measure changes the experimental conditions.

DrChinese said:
and you end up with a subjective, observer dependent world.
I don't think this follows. For example, if Alice and Bob each make spin measurements on one of a pair of entangled particles, while the results will obviously depend on which measurement each chooses to make, they will end up agreeing, once they are able to share information about their results, on which results each one obtained. So the results are not subjective or observer dependent. They are only "experiment dependent"--dependent on which actual measurements Alice and Bob end up making--but that does not require any subjectivity at all; the processes Alice and Bob use to choose which measurements they make are also perfectly objective--each one will agree, after the fact once they are able to share information, about how each one's choice process worked and which choice each one made.

DrChinese said:
subjective, observer dependent world.
I would say observer dependent but objective. An analogy would be stock market prices, they are observer dependent (they depend on expectations and actions of investors), but objective (the price at any time is written in a computer and this computer record is there even when the investor does not read it).

A truly subjective thing would be the value of a painting on the wall in my home. The painting has some subjective value for me, which may be very different from its objective price (e.g. the amount of money I paid at auction).

Killtech said:
A Markov process may not tell you what the state really is. But it tells you what information is needed to describe it in order to make predictions for any scenario, and it separates what we know about the system (reflected by ensembles and probabilities) from what information the system itself holds (described by the state space) and what experimental setup is used (described by the effective time evolution including all measurements).
It is not clear how to perform such a separation. I think the "state space" of a photon is problematic. In Bell-type experiments the state spaces of the two photons become strangely fused, don't they? I'm convinced that there is a realistic, statistical (stochastic) description of such experiments, but it must be non-local. It cannot involve "individual" photons, but must consider the whole experiment.

WernerQH said:
It is not clear how to perform such a separation. I think the "state space" of a photon is problematic. In Bell-type experiments the state spaces of the two photons become strangely fused, don't they? I'm convinced that there is a realistic, statistical (stochastic) description of such experiments, but it must be non-local. It cannot involve "individual" photons, but must consider the whole experiment.
But isn't this precisely the point of Markov, and doesn't QT already follow that logic? Markov knows only one state space, and that is always the state space of the entire system. And there is a great beauty in this approach: at the start we don't have to make any wild assumptions about what the state space might or might not represent. We don't have to know if we are dealing with particles or waves or whatever. We can just figure out the time evolution of quantities of interest experimentally first and then think about what it means later. QT already gives us the prior.

So if you take Markov, you work with states of the entire system but don't know anything about individual photon states. If the Markov chain, however, turns out to be reducible in certain contexts, then you can start to identify those individual components in more detail.
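A toy sketch of that reducibility idea (the 4-state chain here is entirely hypothetical): a block-structured stochastic matrix decomposes into communicating classes, which is exactly where "individual components" become identifiable:

```python
import numpy as np

# Hypothetical 4-state chain: states {0,1} and {2,3} never mix.
T = np.array([[0.9, 0.1, 0.0, 0.0],
              [0.2, 0.8, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5],
              [0.0, 0.0, 0.3, 0.7]])

# j is reachable from i iff ((I + T)^n)[i, j] > 0 for n >= #states - 1.
R = np.linalg.matrix_power(np.eye(4) + T, 4) > 0

# Communicating classes: sets of mutually reachable states.
classes = {frozenset(map(int, np.flatnonzero(row & col)))
           for row, col in zip(R, R.T)}
print(sorted(sorted(c) for c in classes))  # [[0, 1], [2, 3]]
```

Each class then evolves as an independent subsystem, so the decomposition tells you where a notion of "individual photon state" could even apply.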

QT works the same way, in that there is just one Hilbert space containing everything. Well, actually not everything: while it does contain the entire quantum system, it does not describe the experimental setup it interacts with. This is a problem, since changing the setup isn't properly reflected axiomatically. The time evolution of the system (given by the Hamiltonian) doesn't actually change when the setup changes, yet the outcome does... this wacky formulation is probably what inspired some people to go for option 2 (see the discussion above). So here is another beauty of Markov: it won't let you get away with such laziness.

As for the separation, it's not difficult in concept: you write down the time evolution of the system, not for states but for any ensemble (the Markov kernel). Everything that represents our knowledge of the system (i.e. ensemble information) can always be depicted in a way that has a linear time evolution of probabilities over the state space. Everything that can't be made linear in probabilities has a characteristic impact on outcomes (i.e. the profile of an interaction) that cannot be reproduced by any ensemble, so it has to be a state of the system / physical information about the system.

Killtech said:
Oh god, wow. I never thought there was any other option than 1.

Seeing the impact of measurement in double-slit which-way experiments, which leads to objectively very different outcomes, I would never have thought of anything other than option 1. For me a change in the experimental result is always due to a physical process of some sort. I never thought about the idea of seeing a different outcome and yet assuming no change in the system. Isn't this a bit like the often quoted phrase on the "definition of insanity", just in reverse?
You are then in disagreement with the idea that the whole universe is described by quantum theory and that time evolution is always unitary. That's a minority position in the theoretical physics community. At least, all the major contenders for a theory of everything, like string theory, loop quantum gravity and others assume a fully unitary quantum universe. It's fair to disagree with those, but it would be a minority view.
Killtech said:
But what's the benefit of option 2 anyway? I see that both approaches can potentially deal with the problem, but the second one makes everything way more complicated. I mean, you will lose so much time and effort translating everything to a different reality that you may lose sight of the right questions to ask to expand your understanding.
What different reality are you talking about? All measurement devices are made of ordinary matter that should be described by quantum theory. The split into system and measurement setup is artificial and made only for practical purposes. The benefit of option 2 is that there isn't an artificial split and no modification of quantum theory is needed.

Nullstein said:
You are then in disagreement with the idea that the whole universe is described by quantum theory and that time evolution is always unitary.
I'd call myself undecided on such issues. I don't think the theory is in a state to make such claims about the whole universe. Currently such discussions are mostly metaphysics, so you cannot find a reasonable answer. For now I'm not entirely sure whether this might be just a question of the aesthetics of the calculus or whether it has any practical, measurable meaning.

In this instance we are discussing only how the theory describes a single experiment - not the whole universe. In these types of experiments it is noticeable that the measurement devices and the overall setup (which are part of both the universe and the given system) are not part of the Hamiltonian but are described outside of it. So the situation is inherently different from your general case. Please don't mix these up.

But in general, the complete experimental system could still have a unitary evolution in an extended model - iff the Markov process describing measurement is just a stochastic simplification of an underlying deterministic, chaotic process. So basically the same situation as with a coin flip: a chaotic but ultimately deterministic process, therefore time-reversible and unitary... but we still opt for practicality and describe it by stochastics.

That makes option 1 still compatible with unitary time evolution in theory, as it means the measurement postulates are just a proxy (of the coin-flip type) for what the Hamiltonian could describe accurately.

Nullstein said:
What different reality are you talking about? All measurement devices are made of ordinary matter that should be described by quantum theory. The split into system and measurement setup is artificial and made only for practical purposes. The benefit of option 2 is that there isn't an artificial split and no modification of quantum theory is needed.
If you can express all measurement interactions correctly via the Hamiltonian, then you are still in option 1. Any change of the experimental setup then changes the Hamiltonian (or the state space). That change results in a different time evolution, so the disturbance resulting from the measurement is fully modeled and accounted for. But none of this can produce a conflict with classical probability theory, unless you overburden it with EPR-style assumptions.

Killtech said:
We don't have to know if we are dealing with particles or waves or whatever. We can just figure out the time evolution of quantities of interest experimentally first and then can think about what it means later. QT gives us already the prior.
I'm afraid I can only speculate about what you have in mind. I too believe that Q(F)T is firmly grounded in classical probability theory, and I'd love to see a simple stochastic model with all the features of a Bell-type experiment that lets us see more clearly what quantum physics is about. But it seems you don't (yet?) have one. As a pragmatist I'm content to have a successful description in terms of QED, and not particularly keen on a new formalism with unknown physical content.

Killtech said:
Well, Markov can just as well describe deterministic processes or mixtures thereof. Almost anything time-dependent, really.

Now, it's not really me. Look at how QT calculates/demonstrates the violation of Bell's inequality in the CHSH case. That could be an exercise from a second-semester probability course where Markov chains are introduced. Fair enough, I follow my math professor's advice: "when it comes to physics, don't listen to what they say, just look at what they do". And if QT calculates it like this - which isn't the most intuitive approach to describing a measurement - but it works, then I'm fine with it. But it practically uses a transition matrix, which defines a Markov process.
I still don't understand what this has to do with Markov processes. The most straightforward analysis of Bell's inequalities and their violation by quantum theory can be found in Sakurai, Modern Quantum Mechanics, Revised Edition, or, even more clearly, Weinberg, Lectures on Quantum Mechanics. The difference between the local hidden variable model and QT is that in the former you assume the values of all spin components are determined but unknown, and thus somehow statistically described by some arbitrary probability distribution of the hidden parameters.

In QT the single-particle spins are maximally indetermined, and measurements in three different directions are incompatible, i.e., they have to be performed on three different equally prepared ensembles. That's why the quantum probabilities don't fulfill Bell's inequalities and, in this sense, Kolmogorov probability theory.

PeterDonis said:
Obviously, since the choice of what to measure changes the experimental conditions. I don't think this follows. For example, if Alice and Bob each make spin measurements on one of a pair of entangled particles, while the results will obviously depend on which measurement each chooses to make, they will end up agreeing, once they are able to share information about their results, on which results each one obtained. So the results are not subjective or observer dependent. They are only "experiment dependent" - dependent on which actual measurements Alice and Bob end up making - but that does not require any subjectivity at all; the processes Alice and Bob use to choose which measurements they make are also perfectly objective. Each one will agree, after the fact, once they are able to share information, about how each one's choice process worked and which choice each one made.

Demystifier said:
I would say observer dependent but objective. An analogy would be stock market prices, they are observer dependent (they depend on expectations and actions of investors), but objective (the price at any time is written in a computer and this computer record is there even when the investor does not read it).

A truly subjective thing would be the value of a painting on the wall in my home. The painting has some subjective value for me, which may be very different from its objective price (e.g. the amount of money I paid at auction).

"Subjective", in the context of EPR/Bell, means dependent on the measurement choice the observer makes. This is in keeping with the language of EPR:

"Indeed, one would not arrive at our conclusion if one insisted that two or more physical quantities can be regarded as simultaneous elements of reality only when they can be simultaneously measured or predicted. On this point of view, since either one or the other, but not both simultaneously, of the quantities P and Q can be predicted, they are not simultaneously real. This makes the reality of P and Q depend upon the process of measurement carried out on the first system, which does not disturb the second system in any way."

The measurement results themselves are objectively real. But certainly, a measurement on one basis does not represent the "uncovering" or "discovery" of a result that is but one of many results that are simultaneously available, as would be the case with stock prices.

WernerQH said:
I'm afraid I can only speculate what you have in mind. I too believe that Q(F)T is firmly grounded in classical probability theory, and I'd love to see a simple stochastic model with all the features of a Bell-type experiment that lets us see more clearly what quantum physics is about. But it seems you don't (yet?) have one. As a pragmatist I'm content to have a successful description in terms of QED, and not particularly keen on a new formalism with unknown physical content.
Hmm, maybe let me try again, since I think QT already has it, it just fails to name it correctly.

Let's start with what a Markov process practically does: it gives the time evolution of ensembles over some underlying real process on a state space - whatever the real process may be. Notice that Markov can be applied on top of an existing deterministic process to generalize it to ensembles rather than just states. In case the state space itself is hidden (the states themselves cannot be measured), it's called a hidden Markov model. A model where all measurement is additionally invasive and modeled via interaction... well, that's QT.

Now look at this, point 6 in particular:
https://www.physicsforums.com/insights/the-7-basic-rules-of-quantum-mechanics/

Point 6 describes how any single state transitions onto an ensemble in the special scenario of measurement. The transition depends only on the state and not its history - i.e. the transition has the Markov property.

Let's look at the measurement of a single particle's spin along the z axis and extend point 6 into a full transition matrix over the states of interest (I won't bother to write them in the same basis):
##|S_z = +\rangle##, ##|S_z = -\rangle##, ##|S_{z+x} = +\rangle##, ##|S_{z+x} = -\rangle##

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ \cos^2(\pi/8) & \sin^2(\pi/8) & 0 & 0 \\ \sin^2(\pi/8) & \cos^2(\pi/8) & 0 & 0 \end{pmatrix}$$
That's a discrete-time Markov chain transitioning ensembles composed of any distribution over those 4 states, i.e. row vectors ##(p_{z+}, p_{z-}, p_{zx+}, p_{zx-})## with probabilities summing up to 1 (the ##z+x## axis lies at ##45°## to ##z##, so the Born rule gives ##\cos^2(\pi/8) \approx 0.85##, not ##0.5##, for the tilted rows).
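To sketch this as runnable code (the state labels are illustrative), the rows of the transition matrix can be generated directly from the Born rule, ##P(S_z{=}+) = \cos^2(\theta/2)## for a state tilted by ##\theta## from the ##z## axis, and checked to form a valid row-stochastic Markov kernel:

```python
import numpy as np

# Measurement of S_z as a discrete-time Markov transition.
# Row i of T holds the Born-rule probabilities |<S_z = ±|state_i>|^2;
# a state tilted by theta from the z axis gives
# P(S_z=+) = cos^2(theta/2), P(S_z=-) = sin^2(theta/2).
def born_row(theta):
    p_plus = np.cos(theta / 2) ** 2
    return [p_plus, 1 - p_plus, 0.0, 0.0]  # post-measurement: a z eigenstate

# states: |S_z=+>, |S_z=->, and two states tilted by theta, pi - theta
theta = np.pi / 4  # the "z+x" axis, 45 degrees from z
T = np.array([born_row(0.0),          # |S_z=+> stays |S_z=+>
              born_row(np.pi),        # |S_z=-> stays |S_z=->
              born_row(theta),        # tilted rows: ~0.85 / ~0.15
              born_row(np.pi - theta)])

assert np.allclose(T.sum(axis=1), 1.0)  # row-stochastic: a valid Markov kernel

p0 = np.array([0.0, 0.0, 0.5, 0.5])    # ensemble: equal mix of tilted states
p1 = p0 @ T                            # one measurement step
print(p1)                              # all weight lands on the z eigenstates
```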

After purification, this matrix also coincides with the density matrix - well, except that the probabilities of the individual ensembles are separated out into a dual vector on which this matrix acts. In Markov theory purification is always done preventively to the maximal extent, with the means to handle any ensemble, including continuously distributed cases, for which QT lacks a framework (ensembles over a continuous state space can't always be handled by matrices). In a sense the ensemble space of Markov theory is maximally purified, but because the density operator does not live there, you have to do it manually in QT.

So what about the normal time evolution outside of measurement? Well, we have a Hamiltonian that describes a deterministic time evolution. Deterministic Markov processes all look the same. I don't want to make the effort of writing the entire Markov kernel formally, so I discretize time and just look at a minimal part of the state space:
##|\Psi\rangle##, ##U(t_1)|\Psi\rangle##, ##U(t_2)|\Psi\rangle##, ... (where ##U## is the time evo op)

then for ensembles such time evolution yields the following trivial transition matrix:
$$\begin{pmatrix} 0 & 1 & 0 & ... \\ 0 & 0 & 1 & ... \\ 0 & 0 & 0 & ... \\ ... & ... & ... & ... \end{pmatrix}$$
It's still the same as the time evolution of the density matrix, but includes automatic purification (which creates this trivial structure), even though that's not needed as long as we don't intend to handle ensembles of ##|\Psi\rangle##, ##U(t_1)|\Psi\rangle##, ...
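A numerical sketch of this trivial shift structure (truncated to four states, so the last row of the matrix is empty and the chain simply steps along the discretized trajectory):

```python
import numpy as np

# Deterministic time evolution as a trivial Markov chain: the transition
# matrix is a pure shift, moving all probability from state k to state k+1
# (states |Psi>, U(t1)|Psi>, U(t2)|Psi>, ..., truncated to 4 for the sketch).
n = 4
T = np.eye(n, k=1)             # superdiagonal of ones: state k -> state k+1
p0 = np.array([1.0, 0, 0, 0])  # ensemble starts entirely in |Psi>
print(p0 @ T)                  # weight moves to U(t1)|Psi>
print(p0 @ T @ T)              # then to U(t2)|Psi>
```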

Now the full time evolution of the system is given by piecing together these two very different processes according to the experimental setup. Done - that's the classical probability space of QT. So QT differs from a hidden Markov model by this piecing together of the time evolution and by restricting access to information through measurement even further.

vanhees71 said:
I still don't understand what this has to do with Markov processes.
The post above should clear that up. If you generalize the density operator to be able to handle any ensemble (maximally purified), you end up with a classical Markov process describing the whole system.

DrChinese said:
"Subjective", in the context of EPR/Bell, means dependent on the measurement choice the observer makes.
I just think that "subjective" is not a good word for that. The only truly subjective interpretation of quantum contextuality is QBism, and I don't think that you endorse QBism.

Isn't this what's usually called "contextuality", i.e., that the values of observables are not necessarily predefined but are usually indetermined, depending on the state the system is prepared in before the measurement? All there is according to QT, given the state, are probabilities for the outcomes of measurements, in the most general sense described by POVMs, with ideal (von Neumann) "projective" measurements as a special case. The Kolmogorov axioms apply to the probabilities so predicted, referring to a measurement which can be done on a single system. They do not apply to the probabilities for the outcome of "any measurement", i.e., in the case that you cannot perform a measurement on a single system but only on single systems with different measurement setups, using ensembles of identically prepared systems for each of these setups.

E.g., you can prepare a neutron with spin component ##\sigma_z=+1/2## and then precisely measure either the spin component ##\sigma_z## or ##\sigma_x##, but not both on the same single neutron. All you can do is measure ##\sigma_z## on one ensemble (giving ##\sigma_z=+1/2## with 100% probability) or measure ##\sigma_x## on another ensemble (giving ##\sigma_x=+1/2## or ##\sigma_x=-1/2## with 50% probability each).
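This example is easy to verify numerically from the Born rule (a sketch; the vectors are the usual ##\sigma_z## and ##\sigma_x## eigenspinors):

```python
import numpy as np

# Born-rule check of the neutron example: prepare |sigma_z = +1/2>,
# then measure either sigma_z or sigma_x (necessarily on separate ensembles).
up_z = np.array([1.0, 0.0])                # prepared state |sigma_z = +>
up_x = np.array([1.0, 1.0]) / np.sqrt(2)   # |sigma_x = +>
dn_x = np.array([1.0, -1.0]) / np.sqrt(2)  # |sigma_x = ->

p_z_plus = abs(np.dot(up_z, up_z)) ** 2    # measuring sigma_z: certainty
p_x_plus = abs(np.dot(up_x, up_z)) ** 2    # measuring sigma_x: 50/50
p_x_minus = abs(np.dot(dn_x, up_z)) ** 2
print(p_z_plus, p_x_plus, p_x_minus)       # 1.0 and two values of 0.5 (up to rounding)
```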

vanhees71 said:
The Kolmogorov axioms apply to the probabilities so predicted, referring to a measurement which can be done on a single system. They do not apply to the probabilities for the outcome of "any measurement", i.e., in the case that you cannot perform a measurement on a single system but only on single systems with different measurement setups, using ensembles of identically prepared systems for each of these setups.
In other words: the Kolmogorov axioms do apply to the probabilities for all outcomes in the case that the measurement setup is explicitly added as part of the system. But that raises the question why that isn't done.

The classical way to do so would be to define the state space on ##S \times H##, where ##S## describes all measurement setups while ##H## is the Hilbert space of quantum states. This represents quite exactly what you wrote: each "single system" in your words would be distinguished by the measurement setup ##s \in S##.

However, that's not how QT gets the job done. Instead it accounts for the setup by providing a calculus for how single states evolve into ensembles under the given setup (von Neumann). While this is supposed to happen instantaneously, it follows the exact same structure as a Markov transition: transition probabilities are defined for each possible state. If we only look at a finite selection of states, we could draw this transition in the typical Markov directed graph.

Markov has a probability space of the form ##T \times H##. Normally ##T## represents just some abstract notion of time, but here it is also used to distinguish how measurement setups affect the evolution of the system... so the way QT needs it, ##T## would need a tree-like structure: whenever a measurement happens, the tree forks according to the possible settings of the measurement device the quantum state interacts with. So each setting yields a unique Markov chain. If we skip the deterministic time evolution in between measurements, the probability space becomes effectively the same as in the simple Kolmogorov case with ##T=S##.

Edit:
After looking around a bit, such a thing is called a Markov decision process - but dropping the reward function, since there is no optimization goal in QT. I think it is proper to reflect each measurement as an action at time ##t##, since in some Bell experiments the setting of the detector axis is delayed until after the emission of the photon pair.

vanhees71 said:
E.g., you can prepare a neutron with spin component ##\sigma_z=+1/2## and then precisely measure either the spin component ##\sigma_z## or ##\sigma_x##, but not both on the same single neutron. All you can do is measure ##\sigma_z## on one ensemble (giving ##\sigma_z=+1/2## with 100% probability) or measure ##\sigma_x## on another ensemble (giving ##\sigma_x=+1/2## or ##\sigma_x=-1/2## with 50% probability each).
We can easily translate that simple scenario into a Markov decision process. I will use the notation from Wikipedia.

We start with ##P(s_0=|\sigma_z=+\rangle)=1##, meaning the prepared initial ensemble is made of a single state ##\sigma_z=+1/2## that we know with certainty, as per your example. After the neutron emission we have only one decision/action to make: at ##t=1## we decide whether to measure ##\sigma_z## or ##\sigma_x##, thus ##a_1 \in \{\sigma_z, \sigma_x\}##. For each decision ##a_1## the probabilities ##P_{a_1}(s_1 = s' | s_0 = s, a_1 = a)## are calculated according to rule 6 from this article.

(in that notation of the article this would be ##p_\psi (a) = P_{d_1}(s_1 = |a,\nu\rangle \text{ } | s_0 = \psi, d_1 = A)##)

So we have:
##P_{a_1}(s_1 = |\sigma_z=+\rangle\text{ }| s_0 = |\sigma_z=+\rangle, a_1 = \sigma_z) = 1##
##P_{a_1}(s_1 = |\sigma_z=-\rangle\text{ }| s_0 = |\sigma_z=+\rangle, a_1 = \sigma_z) = 0##
##P_{a_1}(s_1 = |\sigma_x=+\rangle\text{ }| s_0 = |\sigma_z=+\rangle, a_1 = \sigma_x) = 0.5##
##P_{a_1}(s_1 = |\sigma_x=-\rangle\text{ }| s_0 = |\sigma_z=+\rangle, a_1 = \sigma_x) = 0.5##
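These four transition probabilities can be sketched as an action-indexed kernel (a toy encoding; the dictionary keys and the angle parametrization are my own, not standard MDP or QT notation):

```python
import math

# Sketch of the measurement step as a (reward-free) Markov decision process:
# the action chooses the measured axis, and the transition kernel P_a(s' | s)
# is filled in from the Born rule for the prepared state |sigma_z = +>.
def born(theta_state, theta_meas):
    """P(+ outcome) for a state along theta_state measured along theta_meas."""
    return math.cos((theta_state - theta_meas) / 2) ** 2

prepared = 0.0  # |sigma_z = +>, angle 0 from the z axis
kernel = {
    ("sigma_z", "|z+>"): born(prepared, 0.0),
    ("sigma_z", "|z->"): 1 - born(prepared, 0.0),
    ("sigma_x", "|x+>"): born(prepared, math.pi / 2),
    ("sigma_x", "|x->"): 1 - born(prepared, math.pi / 2),
}
print(kernel)  # sigma_z branch is deterministic, sigma_x branch is 50/50
```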

The Markov decision process only has to distinguish between the two ensembles of the ##\sigma_z## and ##\sigma_x## measurement setups after the measurement has happened; before that, they can be treated as representing the same thing.

Extended experimental setups with multiple measurements of the same neutron along its path would then result in consecutive decisions ##a_t## for ##t>1##, with different probabilities at each step, forking into a decision tree.

WernerQH said:
But it seems you don't (yet?) have one
Isn't the above just that?

Killtech said:
In other words: the Kolmogorov axioms do apply to the probabilities for all outcomes in the case that the measurement setup is explicitly added as part of the system. But that raises the question why that isn't done.

The classical way to do so would be to define the state space on ##S \times H##, where ##S## describes all measurement setups while ##H## is the Hilbert space of quantum states. This represents quite exactly what you wrote: each "single system" in your words would be distinguished by the measurement setup ##s \in S##.

However, that's not how QT gets the job done. Instead it accounts for the setup by providing a calculus for how single states evolve into ensembles under the given setup (von Neumann). While this is supposed to happen instantaneously, it follows the exact same structure as a Markov transition: transition probabilities are defined for each possible state. If we only look at a finite selection of states, we could draw this transition in the typical Markov directed graph.

Markov has a probability space of the form ##T \times H##. Normally ##T## represents just some abstract notion of time, but here it is also used to distinguish how measurement setups affect the evolution of the system... so the way QT needs it, ##T## would need a tree-like structure: whenever a measurement happens, the tree forks according to the possible settings of the measurement device the quantum state interacts with. So each setting yields a unique Markov chain. If we skip the deterministic time evolution in between measurements, the probability space becomes effectively the same as in the simple Kolmogorov case with ##T=S##.

Edit:
After looking around a bit, such a thing is called a Markov decision process - but dropping the reward function, since there is no optimization goal in QT. I think it is proper to reflect each measurement as an action at time ##t##, since in some Bell experiments the setting of the detector axis is delayed until after the emission of the photon pair.
Well, that's exactly how, in my opinion, the apparent measurement problem is solved by the theory of open quantum systems. Because the measurement device (and maybe also additional parts of "the environment", if they are relevant for the measurement process) is a macroscopic system, you cannot describe the composite system+measurement device (+"environment") in all microscopic detail, and that's also neither relevant nor of any use. So you describe the measurement process by treating the system as part of an open quantum system, where the measurement device+environment part is described in a much more coarse-grained way by considering only the macroscopically relevant observables, which represent the "pointer observables" letting you "read out a measurement result", and this leads to a classical description of the measurement device.

I'm not sure why you bring "Markov" in. The system's state is described by the corresponding reduced statistical operator, i.e., by tracing out the measurement device+environment part. The result is a quantum master equation, which usually is not a Markov process. Often it can be approximated by a Markov process, leading to some Lindblad equation for the time evolution of the system. At the moment I'm reading a nice book about all this:

H.-P. Breuer and F. Petruccione, The theory of open quantum systems, Oxford University Press, Oxford, New York (2002).

Another technique, I'm more familiar with is to start with the real-time Green's function technique to derive the Kadanoff Baym equations which then can be reduced to semiclassical quantum transport equations through gradient expansion:

W. Cassing, From Kadanoff-Baym dynamics to off-shell parton transport, Eur. Phys. J. ST 168, 3 (2009),
https://doi.org/10.1140/epjst.

vanhees71 said:
I'm not sure about why you bring "Markov" in. The system's state is described by the corresponding reduced statistical operator, i.e., by tracing out the measurement device+environment part. The result is a quantum master equation, which usually is not a Markov process. Often it can be approximated by a Markov process, leading to some Lindblad equation for the time evolution of the system
I know the quantum master equation, but I wonder whether, for the aim of a better understanding/interpretation, a full classical probability formulation of QT isn't at least somewhat useful. And as far as I can see, I can't find any major obstacles other than terminology.

1. Measurement
What brings me to Markov is, however, not the regular time evolution (which the quantum master equation describes) but measurement. The way the calculation of probabilities works in projective measurements has the same signature as a discrete-time Markov chain. Basically, all I am saying is that you can build a stochastic matrix out of rule 6 (as linked above) - or a Markov kernel, if that is done for the entire state space. Nothing more is needed to define a discrete-time Markov chain. But it would then describe only a single measurement in a single experimental setup and nothing else. Is there any argument about that?

2. Undisturbed time evolution
When it comes to the quantum master equation, it is a huge simplification utilizing the linear formulation of the time evolution of states - but due to Born's rule you cannot fully utilize the linearity of the state space and the linearity of ensemble probabilities at the same time.

However, the time evolution of any deterministic system can still be written in terms of a Markov process. All you need is the Markov property: that the time evolution depends only on the current state and not on the history of how it got there. That seems to be the case in QT, and the time evolution outside of measurement is actually deterministic, right?

Now let's be clear that in QT the state space ##H## is continuous and not discrete, so the master equation for such a system is usually given by the Fokker-Planck equation. The dimension of the Hilbert space is infinite, so... it's a high-dimensional problem. However, the deterministic nature means we only have a drift term to worry about, with no diffusion: ##D(a_{n,t},t)=\frac 1 2 \sigma^2 (a_{n,t},t)=0## (where ##a_{n,t}## is the amplitude of the state ##\Psi_n##). Hmm, if we write it in terms of the amplitudes in the basis of Hamiltonian eigenstates, it's actually not difficult to solve, since we have only a trivial drift ##\mu(a_{n,t}, t) = -i E_n \hbar^{-1} a_{n,t}##.

So that would be the formal master equation for a continuously distributed ensemble given by a probability density ##\rho(a_1, a_2, ...)## over the complex amplitudes of each basis state (we would have to limit it to a finite number of states/dimensions for a probability density to exist, but whatever). For that matter it works the same for electrodynamics as it would for QT - but only as long as there is no measurement to mess this up.
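A quick sketch of this drift-only evolution (assuming ##\hbar = 1## and made-up eigenenergies): each sampled amplitude vector rotates as ##a_n(t) = a_n(0)\,e^{-iE_n t}##, so every ##|a_n|^2##, and hence every Born probability, is transported unchanged - pure drift, no diffusion:

```python
import cmath
import random

# Drift-only ensemble evolution (hbar = 1, energies are illustrative):
# each ensemble member's amplitudes rotate as a_n(t) = a_n(0) * exp(-i E_n t),
# so the ensemble density is transported without diffusion.
E = [0.5, 1.3, 2.1]  # illustrative eigenenergies
random.seed(0)

def random_state():
    """A random normalized amplitude vector (one ensemble member)."""
    a = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in E]
    norm = sum(abs(x) ** 2 for x in a) ** 0.5
    return [x / norm for x in a]

ensemble = [random_state() for _ in range(100)]
t = 7.0
evolved = [[a * cmath.exp(-1j * En * t) for a, En in zip(state, E)]
           for state in ensemble]

# each amplitude's modulus (hence each Born probability) is unchanged
drift_preserves = all(abs(abs(a) - abs(b)) < 1e-12
                      for s0, s1 in zip(ensemble, evolved)
                      for a, b in zip(s0, s1))
print(drift_preserves)  # True
```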

That said, I have no idea how QT actually deals with non-trivial ensembles like the continuously distributed case above. I just don't see how a density matrix could handle such a case. Are such ensembles of no interest in QT?

You could potentially reduce this to a discrete state space of interest, enabling a simpler matrix representation in some simplified cases, and use other trickery.

3. Total time evolution
Lastly, those two very different types of processes - measurement and undisturbed time evolution - have to be properly merged to obtain the full "master equation" of QT for an experiment. To me that task sounds very much like what Markov decision processes can do.

That's it
So I hope that helped a little. Which of the 3 points is causing the biggest trouble to follow?

vanhees71 said:
H.-P. Breuer and F. Petruccione, The theory of open quantum systems, Oxford University Press, Oxford, New York (2002).

Another technique, I'm more familiar with is to start with the real-time Green's function technique to derive the Kadanoff Baym equations which then can be reduced to semiclassical quantum transport equations through gradient expansion:

W. Cassing, From Kadanoff-Baym dynamics to off-shell parton transport, Eur. Phys. J. ST 168, 3 (2009),
https://doi.org/10.1140/epjst.

Killtech said:
a full classical probability formulation of QT
Does one exist? If so, please give a reference. Personal theories/speculations are not permitted. If you can't give a reference for such a formulation, it is off topic for discussion here.

PeterDonis said:
Does one exist? If so, please give a reference. Personal theories/speculations are not permitted. If you can't give a reference for such a formulation, it is off topic for discussion here.
I think we established with @vanhees71 that in principle the theory of open quantum systems is one way to do that. References are given in his post.

Killtech said:
I think we established with @vanhees71 that in principle the theory of open quantum systems is one way to do that.
You were talking about "a full classical probability formulation of QT". That's not what the theory of open quantum systems is. As @vanhees71 said:

vanhees71 said:
this leads to a classical description of the measurement device.
The bolded phrase is the crucial one, and it does not support your claim.

PeterDonis said:
You were talking about "a full classical probability formulation of QT". That's not what the theory of open quantum systems is. As @vanhees71 said:

The bolded phrase is the crucial one, and it does not support your claim.
I'm not sure what your question here is. Why would you expect a classical probability formulation to describe the macroscopic measurement device in a non-classical fashion? That would defeat the goal. The point is that this seems to be a proper description of the QT system in question, able to fully explain all its possible outcomes. What more do you want? I don't think one can ask for more than that.
