MHB Probability and Operators in Quantum Mechanics

Ackbach
Unfortunately, I can't find the thread (if someone finds it, please let me know, and I'll merge this post onto that thread), but someone asked why it is that in quantum mechanics, if you have an observable $B$, the expectation value (average value) $\langle B \rangle$ is given by
$$\langle B \rangle = \int_{\mathbb{R}}\Psi^{*} \hat{B} \, \Psi\,dx,$$
where $\hat{B}$ is the operator "associated with" the observable $B$.
If I recall correctly, the observable in question was momentum, whose associated operator is
$$\hat{p}= -i \hbar\,\frac{\partial}{ \partial x}.$$
So, why is
$$ \langle p \rangle= \int_{\mathbb{R}}\Psi^{*} \left( -i \hbar\,\frac{\partial}{ \partial x} \right)\, \Psi\,dx?$$

The following derivation will follow Griffiths' Introduction to Quantum Mechanics, 1st Ed., pages 11-16.

Let's start with $\Psi$, which is the wave function solution of the time-dependent Schrödinger equation in one dimension:
$$i \hbar \frac{ \partial \Psi}{ \partial t}=- \frac{ \hbar^{2}}{2m}
\frac{ \partial^{2} \Psi}{\partial x^{2}}+V \Psi.$$
The statistical interpretation of the wave function tells us that $|\Psi(x,t)|^{2}$ is the probability density for finding the particle at position $x$ at time $t$. So if I want the expectation value of the observable $x$, I should compute
$$\langle x \rangle=\int_{\mathbb{R}}x|\Psi(x,t)|^{2}\,dx.$$
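(A quick illustration of my own, not from Griffiths: if at some instant the probability density happens to be a normalized Gaussian centered at $x=a$, then
$$\langle x \rangle=\int_{\mathbb{R}}x\,\frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-a)^{2}/(2\sigma^{2})}\,dx=a,$$
so the expectation value sits at the center of the packet, as it should.)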
Now, if we want the observable $p$, we would like
$$p=mv=m\frac{dx}{dt}.$$
Shifting to expectation values, we would like
$$\langle p \rangle=m\frac{d \langle x \rangle}{dt}
=m\,\frac{d}{dt}\int_{\mathbb{R}}x|\Psi(x,t)|^{2}\,dx
=m\int_{\mathbb{R}}x \frac{ \partial}{ \partial t}| \Psi(x,t)|^{2}\,dx.$$
Now the Schrödinger equation tells us that
$$ \frac{ \partial \Psi}{\partial t}= \frac{i \hbar}{2m} \frac{ \partial^{2} \Psi}{ \partial x^{2}}- \frac{i}{ \hbar} V \Psi,$$
and hence
$$ \frac{ \partial \Psi^{*}}{\partial t}=- \frac{i \hbar}{2m} \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}}+ \frac{i}{ \hbar} V \Psi^{*}.$$
Thus,
$$\frac{ \partial}{ \partial t}| \Psi(x,t)|^{2}
=\frac{ \partial}{ \partial t}(\Psi^{*} \Psi)
= \Psi^{*} \frac{ \partial \Psi}{ \partial t}+ \Psi \frac{ \partial \Psi^{*}}{ \partial t}$$
$$=\Psi^{*} \left(\frac{i \hbar}{2m} \frac{ \partial^{2} \Psi}{ \partial x^{2}}- \frac{i}{ \hbar} V \Psi \right)+
\Psi \left( - \frac{i \hbar}{2m} \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}}+ \frac{i}{ \hbar} V \Psi^{*} \right)$$
$$= \frac{i \hbar}{2m} \left( \Psi^{*} \frac{ \partial^{2} \Psi}{ \partial x^{2}}- \Psi \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}} \right)$$
$$= \frac{i \hbar}{2m} \frac{ \partial}{ \partial x}\left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right).$$
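Quick check of that last step: expanding the $x$-derivative with the product rule, the cross terms cancel,
$$\frac{ \partial}{ \partial x}\left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right)= \Psi^{*} \frac{ \partial^{2} \Psi}{ \partial x^{2}}+\frac{ \partial \Psi^{*}}{ \partial x} \frac{ \partial \Psi}{ \partial x}-\frac{ \partial \Psi}{ \partial x} \frac{ \partial \Psi^{*}}{ \partial x}- \Psi \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}}= \Psi^{*} \frac{ \partial^{2} \Psi}{ \partial x^{2}}- \Psi \frac{ \partial^{2} \Psi^{*}}{ \partial x^{2}}.$$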
Plugging this into our latest expression for $\langle p \rangle$ yields
$$\langle p \rangle=m\int_{\mathbb{R}}x \frac{ \partial}{ \partial t}| \Psi(x,t)|^{2}\,dx
=m\int_{\mathbb{R}}x \left( \frac{i \hbar}{2m} \frac{ \partial}{ \partial x}\left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right) \right)\,dx$$
$$=\frac{i \hbar}{2}\int_{\mathbb{R}}x \frac{ \partial}{ \partial x}\left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right) \,dx.$$
We can integrate this by parts:
$$ \langle p \rangle=-\frac{i \hbar}{2}\int_{\mathbb{R}} \left( \Psi^{*} \frac{ \partial \Psi}{ \partial x}- \Psi \frac{ \partial \Psi^{*}}{ \partial x} \right) \,dx.$$
Here we have used the fact that $\Psi$ must go to zero as $x\to \pm \infty$ to eliminate the boundary term. Now, if we integrate the second term by parts as well, we obtain
$$ \langle p \rangle =-i \hbar \int_{ \mathbb{R}} \Psi^{*} \frac{ \partial \Psi}{ \partial x} \, dx
=\int_{\mathbb{R}}\Psi^{*} \left( -i \hbar\,\frac{\partial}{ \partial x} \right)\, \Psi\,dx.$$
As it turns out, all observables can be written in terms of position and momentum, so everywhere you see an $x$, "replace" it with multiplication by $x$, and everywhere you see a $p$, replace it with the operator $-i \hbar (\partial / \partial x)$.
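For example (my illustration, not part of the original question): the kinetic energy $T=p^{2}/(2m)$ is represented by applying the momentum operator twice, so
$$\langle T \rangle=\int_{\mathbb{R}}\Psi^{*}\left(-\frac{\hbar^{2}}{2m}\,\frac{\partial^{2}}{\partial x^{2}}\right)\Psi\,dx,$$
since $\left(-i\hbar\,\partial/\partial x\right)^{2}=-\hbar^{2}\,\partial^{2}/\partial x^{2}$.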

So, to sum up: this is the operator representation of $p$, because we want to impose the condition that $\langle p \rangle= m d \langle x \rangle/dt$. And then, because of the Schrödinger equation and the derivation I showed above, you obtain the desired representation.
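If a numerical sanity check helps, here is a small sketch of my own (not from Griffiths), assuming units with $\hbar=1$ and a Gaussian wave packet $\Psi(x)\propto e^{-x^{2}/(4\sigma^{2})}e^{ik_{0}x}$, for which the exact answer is $\langle p \rangle=\hbar k_{0}$. It evaluates the integral above on a grid:
```python
# Numerical check of <p> = ∫ Psi* (-i hbar d/dx) Psi dx for a Gaussian packet.
# Assumptions of this sketch: hbar = 1, sigma = 1, k0 = 2.5, so <p> should be ~2.5.
import numpy as np

hbar = 1.0
sigma, k0 = 1.0, 2.5
x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]

# normalized Gaussian wave packet carrying momentum hbar*k0
psi = (2*np.pi*sigma**2)**(-0.25) * np.exp(-x**2/(4*sigma**2) + 1j*k0*x)

# apply the momentum operator -i*hbar*d/dx via a centered finite difference
dpsi_dx = np.gradient(psi, dx)
p_expect = np.trapz(np.conj(psi) * (-1j*hbar) * dpsi_dx, dx=dx)

print(p_expect.real)   # ≈ 2.5, i.e. hbar*k0 (imaginary part is ≈ 0)
```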
 
Nice answer. A more complete analysis of these foundational issues in QM can be found in Ballentine:
https://www.amazon.com.au/dp/9814578584/.

In that standard textbook, you will find QM based on just two axioms. The first is that a Hermitian operator, called the observable of the observation, is associated with any observation, and its eigenvalues are the possible outcomes. The second is the so-called Born rule: given an observable O, there exists a positive operator of unit trace, P, called the system's state, such that the expectation of the outcome of the observation associated with O is E(O) = Trace(PO). It is a surprising fact that, by a famous theorem called Gleason's theorem, the second axiom actually follows from the first (there is an assumption called non-contextuality involved; its full implications are best left for another thread, but it will be pointed out when required here).
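(A quick aside of my own to connect this with the more familiar textbook formula: for a pure state P = |ψ><ψ|, expanding the trace in any orthonormal basis {|bi>} gives Trace(PO) = ∑ <bi|ψ><ψ|O|bi> = ∑ <ψ|O|bi><bi|ψ> = <ψ|O|ψ>, the usual expectation value.)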

Initially it was a difficult theorem to prove, but in modern times the proof has been greatly simplified using the concept of a POVM. Here is the proof I came up with using POVMs.

Just for completeness, let's define a POVM. A POVM is a set of positive operators Ei with ∑ Ei = I, acting on what for QM is assumed to be a complex vector space.
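As a concrete example of my own (not from Ballentine): for a qubit, the operators E1 = (2/3)|0><0|, E2 = (2/3)|1><1|, E3 = (1/3)I are all positive and satisfy E1 + E2 + E3 = I, so they form a POVM even though they are not orthogonal projectors.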

Elements of POVMs are called effects, and it's easy to see that a positive operator E is an effect if and only if E <= I (i.e. I - E is also positive), since then {E, I - E} is itself a POVM.

First, let's start with the foundational axiom the proof uses as its starting point. It is a generalisation of the first axiom of QM I gave: decompose O = ∑ λi |ui><ui|, where λi is an eigenvalue and |ui> is the corresponding eigenvector; the projectors |ui><ui| then form a POVM, and the axiom below applies to general POVMs. The last part of the axiom is the non-contextuality I mentioned before.

An observation/measurement with possible outcomes i = 1, 2, 3 ... is described by a POVM {Ei} such that the probability of outcome i is determined by Ei, and only by Ei; in particular, it does not depend on what POVM it is part of.

"Only by Ei" means that regardless of which POVM the Ei belongs to, the probability is the same. This assumption of non-contextuality is the well-known rock-bottom essence of Born's rule via Gleason. The other assumption, not explicitly stated but used, is the strong law of superposition, i.e. that in principle any POVM corresponds to an observation/measurement.

I will let f(Ei) be the probability of Ei.

First, additivity of the measure for effects.

Let E1 + E2 = E3, where E1, E2 and E3 are all effects. Then there exists an effect E with E1 + E2 + E = E3 + E = I (take E = I - E3). Both {E1, E2, E} and {E3, E} are POVMs, so f(E1) + f(E2) + f(E) = 1 = f(E3) + f(E), and hence f(E1) + f(E2) = f(E3).

f(I) = 1 from the law of total probability. Since I + 0 = I, additivity gives f(I) + f(0) = f(I), so f(0) = 0.

Next, linearity with respect to the rationals - it's the usual standard argument from additivity, but I will repeat it anyway.

f(E) = f(n E/n) = f(E/n + ... + E/n) = n f(E/n), so f(E/n) = (1/n) f(E). Similarly f((m/n) E) = f(E/n + ... + E/n) = m f(E/n) = (m/n) f(E), where m <= n to ensure we are still dealing with effects.

Now extend the definition to any positive operator E. If E is a positive operator, there exist an n and an effect E1 with E = n E1 (just choose n large enough that E/n <= I). Define f(E) = n f(E1). To show this is well defined, suppose n E1 = m E2. Then (n/(n+m)) E1 = (m/(n+m)) E2, so f((n/(n+m)) E1) = f((m/(n+m)) E2), i.e. (n/(n+m)) f(E1) = (m/(n+m)) f(E2), and hence n f(E1) = m f(E2).

From the definition it's easy to see that for any positive operators E1, E2 we have f(E1 + E2) = f(E1) + f(E2). Then, just as for effects, one shows f((m/n) E) = (m/n) f(E) for any rational m/n.

Now we want to show continuity, so that the result extends to the reals.

If E1 and E2 are positive operators, define E2 <= E1 to mean that a positive operator E exists with E1 = E2 + E. This implies f(E2) <= f(E1). Let r1n be an increasing sequence of rationals whose limit is the irrational number c, and let r2n be a decreasing sequence of rationals whose limit is also c. If E is any positive operator then r1n E <= c E <= r2n E, so r1n f(E) <= f(cE) <= r2n f(E). Thus, by the pinching (squeeze) theorem, f(cE) = c f(E).

Extending it to any Hermitian operator H.

H can be broken down as H = E1 - E2, where E1 and E2 are positive operators, by, for example, separating the positive and negative eigenvalues of H. Define f(H) = f(E1) - f(E2). To show this is well defined: if E1 - E2 = E3 - E4 then E1 + E4 = E3 + E2, so f(E1) + f(E4) = f(E3) + f(E2), and hence f(E1) - f(E2) = f(E3) - f(E4). Actually, there was no need to show uniqueness, because I could have defined E1 and E2 as the positive operators obtained by separating the eigenvalues, but what the heck - it's not hard to show.

It's easy to show linearity with respect to the reals under this extended definition.

It's pretty easy to see the pattern here, but just to complete it we extend the definition to any operator O. O can be uniquely decomposed as O = H1 + i H2, where H1 = (O + O†)/2 and H2 = (O - O†)/2i are Hermitian. Define f(O) = f(H1) + i f(H2). Again it's easy to show linearity with respect to the reals under this new definition, and then to extend it to linearity with respect to complex numbers.

Now the final bit. The hard bit - namely linearity with respect to any operator - has been done by extending the f defined on effects. The well-known von Neumann argument can now be used to derive Born's rule, but for completeness I will spell out the details.

First, it's easy to check that <bi|O|bj> = Trace(O |bj><bi|), where {|bi>} is an orthonormal basis.
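(Quick check, expanding the trace in the same basis: Trace(O |bj><bi|) = ∑k <bk|O|bj><bi|bk> = <bi|O|bj>, since <bi|bk> = δik.)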

O = ∑ <bi|O|bj> |bi><bj| = ∑ Trace(O |bj><bi|) |bi><bj|, where the sums run over both i and j.

Now we use the linearity that the foregoing extensions of f have led to:

f(O) = ∑ Trace (O |bj><bi|) f(|bi><bj|) = Trace (O ∑ f(|bi><bj|)|bj><bi|)

Define P as ∑ f(|bi><bj|)|bj><bi| and we have f(O) = Trace (OP).

P, by definition, is called the state of the quantum system. The following are easily seen. Since f(I) = 1, Trace(P) = 1, so P has unit trace. For any normalized |u>, |u><u| is an effect, so f(|u><u|) >= 0; thus Trace(|u><u| P) = <u|P|u> >= 0, and P is positive.

Hence a positive operator of unit trace P exists such that the probability of Ei occurring in the POVM E1, E2 ... is Trace (Ei P).
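For anyone who wants to see that conclusion in action, here is a small numerical sketch of my own (it uses the qubit POVM from the example earlier and an arbitrary, assumed density matrix; it is an illustration, not part of the proof):
```python
# Check that for a density matrix P and a POVM {E1, E2, E3} the numbers
# Trace(Ei P) behave like probabilities: each >= 0 and they sum to 1.
import numpy as np

ket0 = np.array([[1.0], [0.0]])
ket1 = np.array([[0.0], [1.0]])

# the qubit POVM from the earlier example: (2/3)|0><0|, (2/3)|1><1|, (1/3)I
E = [2/3 * ket0 @ ket0.T, 2/3 * ket1 @ ket1.T, 1/3 * np.eye(2)]
assert np.allclose(sum(E), np.eye(2))   # the effects sum to the identity

# an arbitrary density matrix: positive with unit trace (assumed for the demo)
A = np.array([[1.0, 0.3j], [0.2, 0.8]])
P = A @ A.conj().T
P = P / np.trace(P)

probs = [np.trace(Ei @ P).real for Ei in E]
print(probs, sum(probs))                # each entry >= 0, total = 1.0
```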

Whew. Glad that's over with.

So, at rock bottom, QM is modelling the outcomes of observations by Hermitian operators. Why we would want to do that is a deep issue at the foundations of QM - possibly the deep issue:
https://www.physicsforums.com/insig...ciple-at-the-foundation-of-quantum-mechanics/

Thanks
Bill
 