# Confused about wavefunctions and kets

bhobba
Mentor
I will try and get hold of the Ballentine book. But for now I've got conflicting answers. The wave function contains everything ! and it doesn't ! I have looked in several books.
The math is probably a bit beyond what you know right now, but really it's the only way to explain what's going on, so I will post the full detail of exactly what the state is, taken from what I said in another thread.

It's based on a bit of advanced math called Gleason's theorem, which is usually only discussed in advanced treatments, but for me it's the best way to understand what the state is.

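First we need to define a Positive Operator Valued Measure (POVM). A POVM is a set of positive operators Ei such that ∑ Ei = I (the identity operator), acting on, for the purposes of QM, an assumed complex vector space.

Elements of POVMs are called effects, and it's easy to see a positive operator E is an effect iff I - E is also positive, ie all its eigenvalues lie between 0 and 1. In particular, any positive operator with Trace(E) <= 1 is an effect.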

Now we can state the single foundational axiom QM is based on, in the way I look at it. This is a bit different from Ballentine, who simply states the axioms without a discussion of why they are true. It's interesting that it can be reduced to basically just one. Of course there is more to QM than just one axiom - but the rest follow in a natural way.

An observation/measurement with possible outcomes i = 1, 2, 3 ..... is described by a POVM Ei such that the probability of outcome i is determined by Ei, and only by Ei, in particular it does not depend on what POVM it is part of.

It's very strange, but still true, that this is basically all that is required for QM. The state, and what it is, follows from this.

Only by Ei means regardless of what POVM the Ei belongs to the probability is the same. This is the assumption of non contextuality and is the well known rock bottom essence of Born's rule via Gleason. The other assumption, not explicitly stated, but used, is the strong law of superposition ie in principle any POVM corresponds to an observation/measurement.

I will let f(Ei) be the probability of Ei. Obviously f(I) = 1, since I on its own is a POVM containing only one element. Since I + 0 = I is also a POVM, f(I) + f(0) = 1, so f(0) = 0.

First additivity of the measure for effects.

Let E1 + E2 = E3 where E1, E2 and E3 are all effects. Then there exists an effect E such that E1 + E2 + E = E3 + E = I. Both {E1, E2, E} and {E3, E} are POVMs, so f(E1) + f(E2) + f(E) = 1 = f(E3) + f(E). Hence f(E1) + f(E2) = f(E3).

Next linearity wrt the rationals - it's the usual standard argument from additivity in linear algebra, but I will repeat it anyway.

f(E) = f(n E/n) = f(E/n + ..... + E/n) = n f(E/n), or (1/n) f(E) = f(E/n). Similarly f(m E/n) = f(E/n + ..... + E/n) = m f(E/n), so (m/n) f(E) = f((m/n) E), provided m <= n to ensure we are dealing with effects.

Now extend the definition to any positive operator E. If E is a positive operator, an integer n and an effect E1 exist such that E = n E1, as easily seen from the fact that any positive operator with trace <= 1 is an effect (take n >= Trace(E)). f(E) is defined as n f(E1). To show this is well defined suppose n E1 = m E2. Then (n/(n+m)) E1 = (m/(n+m)) E2, so f((n/(n+m)) E1) = f((m/(n+m)) E2). Thus (n/(n+m)) f(E1) = (m/(n+m)) f(E2), so n f(E1) = m f(E2).

From the definition it's easy to see that for any positive operators E1, E2, f(E1 + E2) = f(E1) + f(E2). Then, similar to effects, show that for any positive rational m/n, f((m/n) E) = (m/n) f(E).

Now we want to show continuity, so that the result also holds for the reals.

If E1 and E2 are positive operators, define E2 < E1 to mean a positive operator E exists such that E1 = E2 + E. This means f(E2) <= f(E1), since f(E) >= 0. Let r1n be an increasing sequence of rationals whose limit is the positive irrational number c. Let r2n be a decreasing sequence of rationals whose limit is also c. If E is any positive operator then r1n E < cE < r2n E, so r1n f(E) <= f(cE) <= r2n f(E). Thus by the pinching (squeeze) theorem f(cE) = c f(E).

Extending it to any Hermitian operator H.

H can be broken down as H = E1 - E2 where E1 and E2 are positive operators, for example by separating the positive and negative eigenvalues of H. Define f(H) = f(E1) - f(E2). To show this is well defined: if E1 - E2 = E3 - E4 then E1 + E4 = E3 + E2, so f(E1) + f(E4) = f(E3) + f(E2), and hence f(E1) - f(E2) = f(E3) - f(E4). Actually there was no need to show this, because I could have defined E1 and E2 to be the positive operators obtained from separating the eigenvalues, but what the heck - it's not hard to show.

It's easy to show linearity wrt the reals under this extended definition.

It's pretty easy to see the pattern here, but just to complete it I will extend the definition to any operator O. O can be uniquely decomposed into O = H1 + i H2 where H1 and H2 are Hermitian. Define f(O) = f(H1) + i f(H2). Again it's easy to show linearity wrt the reals under this new definition, then extend it to linearity wrt complex numbers.
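For anyone who wants to see the decomposition O = H1 + i H2 concretely, here is a minimal sketch in plain Python. It assumes the standard choice H1 = (O + O†)/2 and H2 = (O - O†)/(2i); the helper names `dagger`, `hermitian_parts` and `is_hermitian`, and the example matrix, are made up for illustration.

```python
def dagger(a):
    """Conjugate transpose of a square matrix stored as nested lists."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def hermitian_parts(o):
    """Return (H1, H2) with O = H1 + i*H2, using H1 = (O + O†)/2, H2 = (O - O†)/(2i)."""
    od = dagger(o)
    n = len(o)
    h1 = [[(o[i][j] + od[i][j]) / 2 for j in range(n)] for i in range(n)]
    h2 = [[(o[i][j] - od[i][j]) / 2j for j in range(n)] for i in range(n)]
    return h1, h2

def is_hermitian(a, tol=1e-12):
    """True if the matrix equals its own conjugate transpose (within tol)."""
    ad = dagger(a)
    n = len(a)
    return all(abs(a[i][j] - ad[i][j]) < tol for i in range(n) for j in range(n))

# An arbitrary non-Hermitian operator, chosen only as an example.
O = [[1 + 2j, 3], [4j, 5 - 1j]]
H1, H2 = hermitian_parts(O)

# Recombine to confirm O = H1 + i*H2.
recombined = [[H1[i][j] + 1j * H2[i][j] for j in range(2)] for i in range(2)]
```

Both parts come out Hermitian and recombine to O, which is all the extension of f to general operators needs.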

Now the final bit. The hard bit - namely showing f is linear on all operators - has been done by extending the f defined on effects. The well known Von Neumann argument can now be used to derive Born's rule, but for completeness I will spell out the detail.

First it's easy to check <bi|O|bj> = Trace (O |bj><bi|).

O = ∑ <bi|O|bj> |bi><bj| = ∑ Trace (O |bj><bi|) |bi><bj|

Now we use the linearity that the foregoing extensions of f have led to.

f(O) = ∑ Trace (O |bj><bi|) f(|bi><bj|) = Trace (O ∑ f(|bi><bj|)|bj><bi|)

Define P as ∑ f(|bi><bj|)|bj><bi| and we have f(O) = Trace (OP).
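Here is a rough numerical check of that construction, in plain Python. It assumes a linear functional f manufactured from a hypothetical "hidden" matrix `rho` (any linear f would do), builds P from the values f(|bi><bj|) exactly as in the formula, and compares f(O) with Trace (OP). The names `trace_of_product`, `outer` and `rho` are illustrative only.

```python
def trace_of_product(a, b):
    """Return Trace(a b) for square matrices stored as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def outer(i, j, n):
    """Return the matrix unit |b_i><b_j| in the standard basis."""
    return [[1 if (r == i and c == j) else 0 for c in range(n)] for r in range(n)]

# Hypothetical "hidden" state, used only to manufacture a linear functional f.
rho = [[0.6, 0.2 - 0.1j],
       [0.2 + 0.1j, 0.4]]

def f(op):
    """A linear functional on operators; in this sketch f(O) = Trace(O rho)."""
    return trace_of_product(op, rho)

n = 2
# P = sum_ij f(|b_i><b_j|) |b_j><b_i|, so entry (j, i) of P is f(|b_i><b_j|).
P = [[f(outer(i, j, n)) for i in range(n)] for j in range(n)]

# Check f(O) = Trace(O P) on an arbitrary operator.
O = [[1.0, 2.0 + 1.0j], [0.5j, -3.0]]
lhs = f(O)
rhs = trace_of_product(O, P)
```

The point of the sketch is that P is recovered purely from the values of f on the matrix units, without looking inside f.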

P, by definition, is called the state of the quantum system. The following are easily seen. Since f(I) = 1, Trace (P) = 1. Thus P has unit trace. For any unit vector |u>, f(|u><u|) is a number >= 0 since |u><u| is an effect. Thus Trace (|u><u| P) = <u|P|u> >= 0, so P is positive.

Hence a positive operator of unit trace P, the state of the system, exists such that the probability of Ei occurring in the POVM E1, E2 ..... is Trace (Ei P).
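As a concrete illustration of the rule probability = Trace (Ei P), here is a small sketch in plain Python. The three-outcome "trine" POVM on a two dimensional space and the particular state P are assumptions picked for the example, and the helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def projector(v):
    """Return |v><v| for a vector v given as a list of complex numbers."""
    return [[vi * vj.conjugate() for vj in v] for vi in v]

# Assumed example POVM: E_k = (2/3)|v_k><v_k| for three real unit vectors
# whose Bloch vectors are 120 degrees apart.
effects = []
for k in range(3):
    theta = 2 * math.pi * k / 3
    v = [complex(math.cos(theta / 2)), complex(math.sin(theta / 2))]
    effects.append([[(2 / 3) * x for x in row] for row in projector(v)])

# The effects must sum to the identity for this to be a POVM.
effect_sum = [[sum(E[r][c] for E in effects) for c in range(2)] for r in range(2)]

# Example state P: positive with unit trace (an arbitrary choice).
P = [[0.75 + 0j, 0.25 + 0j],
     [0.25 + 0j, 0.25 + 0j]]

# Probability of outcome k is Trace(E_k P).
probs = [trace(matmul(E, P)).real for E in effects]
total = sum(probs)
```

Note the three effects are not projections and there are more outcomes than dimensions, yet the probabilities are still non-negative and sum to 1 because the effects sum to I and P has unit trace.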

To derive Ballentine's two axioms we need to define what is called a resolution of the identity, which is a POVM whose elements are disjoint, ie Ei Ej = 0 for i ≠ j. Observations described by such POVMs are called Von Neumann observations. We know from the spectral theorem that a Hermitian operator H can be uniquely decomposed in terms of a resolution of the identity, H = ∑ yi Ei. So given any observation based on a resolution of the identity Ei, we can associate a real number yi with each outcome and uniquely define a Hermitian operator O = ∑ yi Ei, called the observable of the observation.

This gives the first axiom found in Ballentine - but the wording I will use will be slightly different because of the way I have presented it which is different to Ballentine - eg he doesn't point out he is talking about Von Neumann measurements, but measurements in general are wider than that, although all measurements can be reduced to Von Neumann measurements by considering a probe interacting with a system - but that is another story.

Axiom 1
Associated with each Von Neumann measurement we can find a Hermitian operator O, called the observation's observable, such that the possible outcomes of the observation are its eigenvalues yi.

Axiom 2 - called the Born Rule
Associated with any system is a positive operator of unit trace, P, called the state of the system, such that the expected value of the outcomes of the observation is Trace (PO).

Axiom 2 is easy to see from what I wrote previously: E(O) = ∑ yi probability (Ei) = ∑ yi Trace (PEi) = Trace (PO).
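A quick numerical sanity check of that chain of equalities, as a sketch in plain Python. The observable (outcomes +1 and -1 with the two projectors shown) and the state P are assumed purely for illustration.

```python
def trace_of_product(a, b):
    """Return Trace(a b) for square matrices stored as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

# Assumed example state: positive, unit trace.
P = [[0.75, 0.25],
     [0.25, 0.25]]

# Assumed resolution of the identity: two orthogonal projectors.
E_plus = [[0.5, 0.5], [0.5, 0.5]]
E_minus = [[0.5, -0.5], [-0.5, 0.5]]
effects = [E_plus, E_minus]
outcomes = [1.0, -1.0]  # the real numbers y_i attached to each outcome

# Observable O = sum_i y_i E_i.
O = [[sum(y * E[r][c] for y, E in zip(outcomes, effects)) for c in range(2)]
     for r in range(2)]

# Probabilities Trace(P E_i), then the expected value computed two ways.
probs = [trace_of_product(P, E) for E in effects]            # 0.75 and 0.25
expectation_from_probs = sum(y * p for y, p in zip(outcomes, probs))
expectation_from_trace = trace_of_product(P, O)              # both give 0.5
```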

Now using these two axioms Ballentine develops all of QM.

A word of caution however. Other axioms are introduced as you go - but they occur in a natural way. Schroedinger's equation is developed from probabilities being invariant between frames, ie the Principle Of Relativity. That the state after a filtering type observation is an eigenstate of the observable is a consequence of continuity.

From this we see the state is simply a mathematical requirement that helps in calculating the probability of the outcomes of observations.

The state (or wavefunction, which is simply the representation of the state in the position basis) contains everything needed to calculate the probabilities of the outcomes of observations. Now for the subtle point - it's a matter of opinion and interpretation whether the outcomes of observations and their probabilities are all that is going on.

Thanks
Bill

Thanks for that. I hope to be able to understand all of it one day. I now know the difference between the wavefunction and the ket. But what is a ket ? An infinite dimensional column vector of what ? Infinite in what respect ? We can find <r|ψ> and <p|ψ> so <r| and <p| must both be infinite dimensional row vectors ?

Nugatory
Mentor
If you haven't yet done so, dig up the mathematical definition of a "vector space"

I looked it up in Shankar. My understanding is that the range of r or p is "chopped" up into n segments and then n→∞. But what are the infinite elements of the ket |ψ> ? They must be independent of any particular basis eg. position or momentum so I can't picture what they are ?

atyy
Roughly, a ket can be represented as a column vector. So let's say there are only 2 possible positions. Also let us choose to represent the ket |x=1> as the column vector [1 0]T, and the ket |x=2> as the column vector [0 1]T, ie. we choose as basis vectors states of definite position. An arbitrary ket is then |ψ>=ψ(1)|x=1>+ψ(2)|x=2>, or equivalently the column vector [ψ(1) ψ(2)]T, or equivalently the wavefunction ψ(x) where x is an index that runs from 1 to 2.

However, x actually is not discrete with only 2 values, it runs continuously. So if we use basis vectors with a definite position, then the ket |ψ> is an infinite dimensional column vector. An element of this column vector ψ(x) is the probability amplitude that a particle will be found at location x.
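To get a feel for the passage from a finite column vector to a continuous ψ(x), here is a rough sketch in plain Python. It assumes a Gaussian wavefunction on the interval [-5, 5], chops the interval into n cells, and treats the sampled amplitudes as an n-component column vector; the function name and the choice of wavefunction are illustrative only.

```python
import math

def gaussian_amplitudes(n, x_min=-5.0, x_max=5.0):
    """Sample psi(x) = pi**(-1/4) * exp(-x**2 / 2) at the centres of n cells."""
    dx = (x_max - x_min) / n
    xs = [x_min + (k + 0.5) * dx for k in range(n)]
    psi = [math.pi ** -0.25 * math.exp(-x * x / 2) for x in xs]
    return xs, psi, dx

# Total probability sum |psi(x_k)|^2 dx for coarser and finer grids.
norms = {}
for n in (4, 16, 256):
    xs, psi, dx = gaussian_amplitudes(n)
    norms[n] = sum(abs(a) ** 2 for a in psi) * dx
```

With only a few cells the total probability is visibly off; as n grows it approaches 1, and the column vector of samples looks more and more like the function ψ(x). This is only the heuristic picture; it does not address the subtleties of genuinely infinite dimensional spaces.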

The above is not rigorously correct, because there are some subtleties for infinite dimensional spaces, but the idea is roughly ok. Take a look at the explanations in http://physics.mq.edu.au/~jcresser/Phys304/Handouts/QuantumPhysicsNotes.pdf (chapters 8-10).

Thanks. You said an element of the column vector ψ(x). Did you mean an element of the ket |ψ> ? But you then relate it to location x. I thought kets are independent of basis ? So why would it be location x and not momentum p or some other basis ?

atyy
Thanks. You said an element of the column vector ψ(x). Did you mean an element of the ket |ψ> ?
Before you represent a ket as a column vector, you must always choose a basis. In the above the choice of basis means we choose |x=1> to be the column vector [1 0]T, and |x=2> to be the column vector [0 1]T.

Then the ket |ψ> will be the column vector which can be written [ψ(1) ψ(2)]T, or for short ψ(x) which is an element of the column vector [ψ(1) ψ(2)]T.

But you then relate it to location x. I thought kets are independent of basis ? So why would it be location x and not momentum p or some other basis ?
Yes, because I chose at the start a basis in which a state with a definite position |x=1> is the column vector [1 0]T, and the state with definite position |x=2> is the column vector [0 1]T. This is why the column vector [ψ(1) ψ(2)]T is also written as [ψ(x=1) ψ(x=2)]T, or for short ψ(x) is an element of that column vector.

If at the start I had chosen to represent the state of definite momentum as the basis, eg. choose |p=1> as the column vector [1 0]T, then the elements of the column vector representing the ket |ψ> would be ψ(p).
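The two-position toy example can be written out as a short sketch in plain Python. The relation between the momentum and position basis used here (a 2-point Fourier transform) is an assumption made for illustration, as are the particular amplitudes 0.6 and 0.8.

```python
import math

# Components of |psi> in the position basis: [psi(x=1), psi(x=2)].
psi_x = [0.6, 0.8]

# Assumed momentum basis vectors, written out in the position basis.
s = 1 / math.sqrt(2)
p_basis = [[s, s],     # |p=1> = (|x=1> + |x=2>)/sqrt(2)
           [s, -s]]    # |p=2> = (|x=1> - |x=2>)/sqrt(2)

def inner(bra, ket):
    """Inner product <bra|ket> of two column vectors in the same basis."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

# Components of the same ket in the momentum basis: psi(p) = <p|psi>.
psi_p = [inner(p, psi_x) for p in p_basis]

# Rebuild the ket as sum_p psi(p)|p>, expressed back in the position basis.
rebuilt = [sum(psi_p[k] * p_basis[k][i] for k in range(2)) for i in range(2)]

# Total probability is the same in either representation.
norm_x = sum(abs(a) ** 2 for a in psi_x)
norm_p = sum(abs(a) ** 2 for a in psi_p)
```

The lists `psi_x` and `psi_p` hold different numbers, but `rebuilt` matches `psi_x`: they are two descriptions of one and the same ket, which is the sense in which the ket itself is independent of basis.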

Some things are clearer now but as for the rest ; my head is spinning more and more. I just want to thank everyone who has persevered with me on this thread.