gill1109 said:
Please tell me your two rules, Bill. I am not familiar with Ballentine. Obviously I should be ... but sorry.
No need to be sorry.
It is, however, the BEST book on QM I know. It fixes up many issues and misconceptions and is very well thought of by many who post here. Not Atty though - he has issues with it - but it's best if he explains them.
Since your background is math, and mine is as well, I will build up to the two axioms in a slightly different way than Ballentine does.
First we need to define a Positive Operator Valued Measure (POVM). A POVM is a set of positive operators Ei with ∑ Ei = I acting on, for the purposes of QM, an assumed complex vector space.
Elements of POVMs are called effects, and it's easy to see a positive operator E is an effect iff I - E is also positive, i.e. all its eigenvalues lie between 0 and 1. In particular any positive operator with Trace(E) <= 1 is an effect.
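As a quick sanity check of the definition, here is a minimal Python sketch for a two-dimensional space. The example POVM (the so-called trine measurement) and the helper names `is_positive`, `is_effect`, `is_povm` and `trine_effect` are illustrative assumptions, not anything taken from Ballentine. It tests effects with the operator condition that both E and I - E are positive.

```python
import math

I2 = [[1, 0], [0, 1]]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def is_positive(A, tol=1e-9):
    """For a 2x2 Hermitian matrix (assumed): positive iff trace >= 0 and det >= 0."""
    tr = complex(A[0][0] + A[1][1]).real
    det = complex(A[0][0] * A[1][1] - A[0][1] * A[1][0]).real
    return tr >= -tol and det >= -tol

def is_effect(E):
    """E is an effect iff both E and I - E are positive."""
    return is_positive(E) and is_positive(add(I2, scale(-1, E)))

def is_povm(effects, tol=1e-9):
    """Every element is an effect and the elements sum to the identity."""
    total = [[0, 0], [0, 0]]
    for E in effects:
        total = add(total, E)
    sums_to_identity = all(
        abs(total[i][j] - I2[i][j]) < tol for i in range(2) for j in range(2)
    )
    return sums_to_identity and all(is_effect(E) for E in effects)

def trine_effect(theta):
    """(2/3)|psi><psi| for the real unit vector psi = (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return scale(2 / 3, [[c * c, c * s], [c * s, s * s]])

# Three effects built from unit vectors 60 degrees apart.
trine = [trine_effect(k * math.pi / 3) for k in range(3)]
print(is_povm(trine))
print(is_effect(I2))
```

Each trine element has trace 2/3, while the identity has trace 2 in this space and still passes `is_effect`, since {I} is itself a POVM.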
Now we can state the single foundational axiom QM is based on, in the way I look at it. This is a bit different from Ballentine, who simply states the axioms without a discussion of why they are true. It's interesting that it can be reduced to basically just one. Of course there is more to QM than just one axiom - but the rest follows in a natural way.
An observation/measurement with possible outcomes i = 1, 2, 3 ... is described by a POVM Ei such that the probability of outcome i is determined by Ei, and only by Ei, in particular it does not depend on what POVM it is part of.
Now I will invoke a very beautiful theorem, which is a modern version of a famous theorem you may have heard of called Gleason's, and will in fact prove it.
Only by Ei means that, regardless of what POVM the Ei belongs to, the probability is the same. This is the assumption of non-contextuality and is the well known rock bottom essence of Born's rule via Gleason. The other assumption, not explicitly stated but used, is the strong law of superposition, i.e. in principle any POVM corresponds to an observation/measurement.
I will let f(Ei) be the probability of Ei. Obviously f(I) = 1 since {I} is a POVM containing only one element. Since I + 0 = I, {I, 0} is also a POVM, so f(I) + f(0) = 1 and f(0) = 0.
First additivity of the measure for effects.
Let E1 + E2 = E3 where E1, E2 and E3 are all effects. Then the effect E = I - E3 satisfies E1 + E2 + E = E3 + E = I, so both {E1, E2, E} and {E3, E} are POVMs. Thus f(E1) + f(E2) + f(E) = 1 = f(E3) + f(E). Hence f(E1) + f(E2) = f(E3).
Next linearity wrt the rationals - it's the usual standard argument from additivity in linear algebra, but I will repeat it anyway.
f(E) = f(n E/n) = f(E/n + ... + E/n) = n f(E/n), or (1/n) f(E) = f(E/n). Then f(m E/n) = f(E/n + ... + E/n) = m f(E/n), or (m/n) f(E) = f((m/n) E), where m <= n to ensure we are dealing with effects.
I will now extend the definition to any positive operator E. If E is a positive operator, a positive integer n and an effect E1 exist such that E = n E1, as is easily seen from the fact that any positive operator with trace <= 1 is an effect. f(E) is defined as n f(E1). To show this is well defined suppose n E1 = m E2. Then (n/(n+m)) E1 = (m/(n+m)) E2, so f((n/(n+m)) E1) = f((m/(n+m)) E2). By linearity wrt the rationals, (n/(n+m)) f(E1) = (m/(n+m)) f(E2), so n f(E1) = m f(E2).
From the definition it's easy to see that for any positive operators E1, E2, f(E1 + E2) = f(E1) + f(E2). Then, similar to effects, show that for any positive rational m/n, f((m/n) E) = (m/n) f(E).
Now we want to show continuity in order to show it is true for the reals.
If E1 and E2 are positive operators, define E2 < E1 to mean a positive operator E exists with E1 = E2 + E. Since f(E) >= 0, this means f(E2) <= f(E1). Let r1n be an increasing sequence of rationals whose limit is the positive irrational number c. Let r2n be a decreasing sequence of rationals whose limit is also c. If E is any positive operator, r1n E < cE < r2n E. So r1n f(E) <= f(cE) <= r2n f(E). Thus by the pinching (squeeze) theorem f(cE) = c f(E).
Extending it to any Hermitian operator H.
H can be broken down into H = E1 - E2 where E1 and E2 are positive operators by, for example, separating the positive and negative eigenvalues of H. Define f(H) = f(E1) - f(E2). To show this is well defined, if E1 - E2 = E3 - E4 then E1 + E4 = E3 + E2, so f(E1) + f(E4) = f(E3) + f(E2), and hence f(E1) - f(E2) = f(E3) - f(E4). Actually there was no need to show this, because I could have defined E1 and E2 to be the positive operators from separating the eigenvalues, but what the heck - it's not hard to show.
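The eigenvalue-splitting decomposition can be sketched in a few lines of Python for the 2x2 case, where the eigenvalues and spectral projectors have closed forms. The function name `split_hermitian` and the sample matrix are illustrative assumptions only.

```python
import math

def is_positive(A, tol=1e-9):
    """For a 2x2 Hermitian matrix (assumed): positive iff trace >= 0 and det >= 0."""
    tr = complex(A[0][0] + A[1][1]).real
    det = complex(A[0][0] * A[1][1] - A[0][1] * A[1][0]).real
    return tr >= -tol and det >= -tol

def split_hermitian(H):
    """Return positive (E1, E2) with H = E1 - E2 for a 2x2 Hermitian H."""
    a = complex(H[0][0]).real
    d = complex(H[1][1]).real
    b = complex(H[0][1])
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    lam_hi, lam_lo = mean + radius, mean - radius  # the two eigenvalues
    if radius == 0:
        # H is a multiple of the identity: one eigenspace only.
        P_hi = [[1, 0], [0, 1]]
    else:
        # Spectral projector onto the lam_hi eigenspace.
        P_hi = [[(H[i][j] - lam_lo * (i == j)) / (2 * radius) for j in range(2)]
                for i in range(2)]
    P_lo = [[(i == j) - P_hi[i][j] for j in range(2)] for i in range(2)]
    # Keep the non-negative eigenvalues in E1, the magnitudes of the negative ones in E2.
    E1 = [[max(lam_hi, 0) * P_hi[i][j] + max(lam_lo, 0) * P_lo[i][j]
           for j in range(2)] for i in range(2)]
    E2 = [[max(-lam_hi, 0) * P_hi[i][j] + max(-lam_lo, 0) * P_lo[i][j]
           for j in range(2)] for i in range(2)]
    return E1, E2

H = [[1, 2], [2, -2]]  # eigenvalues 2 and -3
E1, E2 = split_hermitian(H)
print(E1)
print(E2)
```

Here E1 carries the eigenvalue 2 and E2 carries the magnitude of the eigenvalue -3, so E1 - E2 reproduces H.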
It's easy to show linearity wrt the reals under this extended definition.
It's pretty easy to see the pattern here, but just to complete it I will extend the definition to any operator O. O can be uniquely decomposed into O = H1 + i H2 where H1 and H2 are Hermitian. Define f(O) = f(H1) + i f(H2). Again it's easy to show linearity wrt the reals under this new definition, then extend it to linearity wrt complex numbers.
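For concreteness, the Hermitian parts are H1 = (O + O†)/2 and H2 = (O - O†)/(2i), where O† is the conjugate transpose. A minimal Python sketch follows, with the helper names `dagger`, `cartesian_parts` and `is_hermitian` chosen purely for illustration:

```python
def dagger(O):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(O)
    return [[complex(O[j][i]).conjugate() for j in range(n)] for i in range(n)]

def cartesian_parts(O):
    """Return Hermitian (H1, H2) such that O = H1 + i*H2."""
    n = len(O)
    Od = dagger(O)
    H1 = [[(O[i][j] + Od[i][j]) / 2 for j in range(n)] for i in range(n)]
    H2 = [[(O[i][j] - Od[i][j]) / 2j for j in range(n)] for i in range(n)]
    return H1, H2

def is_hermitian(A, tol=1e-12):
    Ad = dagger(A)
    n = len(A)
    return all(abs(A[i][j] - Ad[i][j]) < tol for i in range(n) for j in range(n))

O = [[1, 2], [0, 1j]]  # an arbitrary non-Hermitian example
H1, H2 = cartesian_parts(O)
print(H1)
print(H2)
```

Uniqueness follows because taking the conjugate transpose of O = H1 + i H2 gives O† = H1 - i H2, and the two equations can be solved for H1 and H2.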
Now the final bit. The hard bit - namely linearity of f on any operator - has been done by extending the f defined on effects. The well known von Neumann argument can now be used to derive Born's rule, but for completeness I will spell out the detail.
First it's easy to check that, for an orthonormal basis |bi>, <bi|O|bj> = Trace (O |bj><bi|).
O = ∑ <bi|O|bj> |bi><bj| = ∑ Trace (O |bj><bi|) |bi><bj|
Now we use the linearity that the foregoing extensions of f have led to.
f(O) = ∑ Trace (O |bj><bi|) f(|bi><bj|) = Trace (O ∑ f(|bi><bj|)|bj><bi|)
Define P as ∑ f(|bi><bj|)|bj><bi| and we have f(O) = Trace (OP).
P, by definition, is called the state of the quantum system. The following are easily seen. Since f(I) = 1, Trace (P) = 1, so P has unit trace. For any unit vector |u>, f(|u><u|) >= 0 since |u><u| is an effect. Thus Trace (|u><u| P) = <u|P|u> >= 0, so P is positive.
Hence a positive operator of unit trace P, the state of the system, exists such that the probability of Ei occurring in the POVM E1, E2 ... is Trace (Ei P).
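The construction P = ∑ f(|bi><bj|)|bj><bi| can be tried out numerically. In the minimal Python sketch below, the linear functional `f` is simulated from a hypothetical hidden density matrix `RHO` (an assumption made purely so that there is something to evaluate). The point is that P is rebuilt using only the values of f on the basis operators |bi><bj|.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Hypothetical hidden state, used only to generate values of f.
RHO = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]

def f(O):
    """Stand-in for the extended linear functional f of the argument."""
    return trace(matmul(O, RHO))

def unit(i, j, n=2):
    """Matrix of |b_i><b_j| in the standard basis: a single 1 at row i, column j."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def reconstruct_state(f, n=2):
    """P = sum over i, j of f(|b_i><b_j|) |b_j><b_i|."""
    P = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            P[j][i] += f(unit(i, j, n))
    return P

P = reconstruct_state(f)
O = [[1, 2 - 1j], [0.5j, -3]]  # an arbitrary operator
print(abs(f(O) - trace(matmul(O, P))) < 1e-12)
print(abs(trace(P) - 1) < 1e-12)
```

In this simulated setting the reconstructed P coincides with `RHO`, as it must if f(O) = Trace (OP) holds for every operator O.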
To derive Ballentine's two axioms we need to define what is called a resolution of the identity, which is a POVM whose elements are disjoint, i.e. Ei Ej = 0 for i ≠ j (which makes each Ei a projection operator). Observations described by these are called von Neumann observations. We know from the spectral theorem that a Hermitian operator H can be uniquely decomposed as H = ∑ yi Ei, where the Ei form a resolution of the identity and the yi are its distinct eigenvalues. So given any observation based on a resolution of the identity Ei, we can associate a real number yi with each outcome and uniquely define a Hermitian operator O = ∑ yi Ei, called the observable of the observation.
This gives the first axiom found in Ballentine, but the wording I will use is slightly different because of the way I have presented it. For example, he doesn't point out he is talking about von Neumann measurements, whereas measurements in general are wider than that. All measurements can, however, be reduced to von Neumann measurements by considering a probe interacting with a system - but that is another story.
Axiom 1
Associated with each von Neumann measurement we can find a Hermitian operator O, called the observation's observable, such that the possible outcomes of the observation are its eigenvalues yi.
Axiom 2 - called the Born Rule
Associated with any system is a positive operator of unit trace, P, called the state of the system, such that the expected value of the outcomes of the observation is Trace (PO).
Axiom 2 is easy to see from what I wrote previously: E(O) = ∑ yi probability(Ei) = ∑ yi Trace (P Ei) = Trace (PO).
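A small worked check of this identity, as a Python sketch: the state `P`, the outcomes ±1 and the two projectors (the spectral projectors of the Pauli X matrix) are all illustrative choices, not taken from Ballentine.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Illustrative state: positive with unit trace.
P = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]

# Resolution of the identity: two orthogonal projectors summing to I.
E_plus = [[0.5, 0.5], [0.5, 0.5]]
E_minus = [[0.5, -0.5], [-0.5, 0.5]]
outcomes = [(+1, E_plus), (-1, E_minus)]

# Observable O = sum of yi * Ei (here this is the Pauli X matrix).
O = [[sum(y * E[i][j] for y, E in outcomes) for j in range(2)] for i in range(2)]

# Born rule probabilities Trace(P Ei) for each outcome.
probs = [complex(trace(matmul(P, E))).real for _, E in outcomes]

expectation_from_probs = sum(y * p for (y, _), p in zip(outcomes, probs))
expectation_from_trace = complex(trace(matmul(P, O))).real

print(probs)
print(expectation_from_probs, expectation_from_trace)
```

The two expectation values agree up to floating point rounding, and the probabilities sum to 1 because the projectors sum to the identity and P has unit trace.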
Now using these two axioms Ballentine develops all of QM.
A word of caution, however. Other axioms are introduced as you go - but they occur in a natural way. Schroedinger's equation is developed from probabilities being invariant between frames, i.e. the Principle Of Relativity. That the state after a filtering type observation is an eigenstate of the observable is a consequence of continuity.
Obviously a lot more can be said, but I will leave it for now - it's a lot to digest already.
Thanks
Bill