# I Delta Function Ordering

1. Dec 20, 2016

### jim burns

I want to calculate $$\langle x|XP|y \rangle$$ where X is the position operator and P the momentum operator, and the states are position eigenstates. But I get two different answers depending on whether I insert a complete set of states.

First way:

$$\langle x|XP|y \rangle=x\langle x|P|y \rangle=-ix \partial_x \delta(x-y)$$

Second way:

$$\langle x|XP|y \rangle=\int dz \langle x| z\rangle \langle z|XP|y \rangle= \int dz \delta(x-z) [-iz\partial_z\delta(z-y)]$$

Now to complete the second way, since there is $$\delta(x-z)$$, we can set z=x to get:

$$\langle x|XP|y \rangle=\int dz \langle x| z\rangle \langle z|XP|y \rangle= \int dz \delta(x-z) [-iz\partial_z\delta(z-y)]=[-ix\partial_x\delta(x-y)]$$

However, if we use the formula $$\int dz f(z)\partial_z \delta(z-y)=-\partial_y f(y)$$ we get:

$$\langle x|XP|y \rangle=\int dz \langle x| z\rangle \langle z|XP|y \rangle= \int dz \delta(x-z) [-iz\partial_z\delta(z-y)]=\partial_y [+iy\delta(x-y)]=\\i\delta(x-y)+iy\partial_y\delta(x-y)= i\delta(x-y)-ix\partial_x\delta(x-y)$$

There seems to be an extra term $$i\delta(x-y)$$. Why doesn't this second way work?

2. Dec 20, 2016

### Staff: Mentor

Your notation is confusing you. Try it with a notation that clearly separates the position variable $x$ from the eigenvalues associated with particular eigenstates.

Call the position variable $x$ and its eigenstates $\vert a \rangle$ and $\vert b \rangle$. Then the first way gives:

$$\langle a \vert XP \vert b \rangle = a \langle a \vert P \vert b \rangle = - i a \partial_x \delta(a - b)$$

Notice that the partial derivative in the $P$ operator is taken with respect to $x$, the variable, not $a$, the value.

Now for the second way:

$$\langle a \vert XP \vert b \rangle = \int dz \langle a \vert z \rangle \langle z \vert XP \vert b \rangle = \int dz \delta(a - z) \left[ - i z \partial_x \delta(z - b) \right]$$

The integrand is zero unless $z = a$, so this gives the same answer as the first way above. Note carefully how the clearer notation avoids the problem: $z$ is not the position variable, it is an eigenvalue--i.e., the integral is over all possible position eigenstates and their eigenvalues, and has nothing whatever to do with the partial derivative with respect to the position variable $x$.

3. Dec 21, 2016

### vanhees71

Imho this doesn't make sense either. How can you take the derivative with respect to $x$ if nothing depends on $x$, and still get something different from 0?

My take is the following:
$$\langle x|\hat{x} \hat{p}| y \rangle=\langle \hat{x} x|\hat{p} y \rangle=x \langle x |\hat{p} y \rangle=-\mathrm{i} x \partial_x \langle x| y \rangle=-\mathrm{i} x \partial_x \delta(x-y).$$
So the first way looks correct. I don't understand what you try to achieve with the 2nd way. Of course, you can put a "decomposition of the identity" as follows:
$$\langle x|\hat{x} \hat{p}| y \rangle=\int_{\mathbb{R}} \mathrm{d} z \langle x|\hat{x} z \rangle \langle z|\hat{p} y \rangle=\int_{\mathbb{R}} \mathrm{d} z x \delta(x-z) (-\mathrm{i} \partial_z) \delta(z-y)=-\mathrm{i} x \partial_x \delta(x-y).$$
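A sketch of where the extra term $i\delta(x-y)$ in the opening post can be traced to, assuming the standard distributional identity $(x-y)\,\delta'(x-y)=-\delta(x-y)$. Here $\delta'$ denotes the derivative of $\delta$ with respect to its argument (a notation introduced only for this sketch), so that $\partial_x\delta(x-y)=\delta'(x-y)$ and $\partial_y\delta(x-y)=-\delta'(x-y)$. The result of the integration by parts is
$$i\delta(x-y)+iy\,\partial_y\delta(x-y)=i\delta(x-y)-iy\,\delta'(x-y).$$
The prefactor $y$ cannot simply be renamed $x$ in front of a derivative of the delta distribution; using the identity above,
$$y\,\delta'(x-y)=x\,\delta'(x-y)-(x-y)\,\delta'(x-y)=x\,\delta'(x-y)+\delta(x-y),$$
and therefore
$$i\delta(x-y)-iy\,\delta'(x-y)=i\delta(x-y)-i\delta(x-y)-ix\,\delta'(x-y)=-ix\,\partial_x\delta(x-y).$$
Under this reading, replacing $iy\,\partial_y\delta(x-y)$ by $-ix\,\partial_x\delta(x-y)$ drops a $\delta(x-y)$ term; once it is restored, the route via integration by parts agrees with the first way.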

4. Dec 21, 2016

### Staff: Mentor

Who said nothing depends on $x$? The wave function for a position eigenstate certainly is not independent of $x$. So applying the $P$ operator to such an eigenstate should not give zero. See further comments below.

Your notation has the same ambiguity as the OP's. You are using $x$ to denote both the position variable and a particular position eigenstate. So the $x$ in your $\delta(x - y)$ is not the same as the $x$ in your $\partial_x$ operator. So on its face, you still are taking the derivative with respect to $x$ of an expression that doesn't depend on $x$, since both the $x$ and the $y$ in your $\delta(x - y)$ are particular values of the position variable $x$ (the values that I labeled $a$ and $b$).

This again has the same notational ambiguity as the OP.

The real question here is, given a position eigenstate $\vert b \rangle$, what is the result of applying the momentum operator $P$ to it? Well, we can write the position eigenstate as $\delta(x - b)$, and the momentum operator is $- i \partial_x$, so we have

$$P \vert b \rangle = - i \partial_x \delta(x - b)$$

If we now form the inner product of $P \vert b \rangle$ with $\langle a \vert X$, we end up just multiplying the above by $a \delta(x - a)$. So we have

$$\langle a \vert X P \vert b \rangle = - i a \delta(x - a) \partial_x \delta(x - b)$$

If you want to say that this is a clearer way of writing the result than $- i a \partial_x \delta(a - b)$, I would agree. But it still keeps a clear distinction between the position variable $x$ and the eigenvalues $a$ and $b$.

5. Dec 21, 2016

### vanhees71

No, your notation is the mess, not mine! You mix up the Dirac notation (basis/representation independent) and the wave-function notation. The consistent notation in the latter representation is indeed more straightforward and goes as follows for an arbitrary wave function:
$$\hat{x}\hat{p} \psi(x)=-\mathrm{i}x \partial_x\psi(x).$$
Now set
$$\psi(x)=u_y(x)=\delta(x-y).$$
Plugging this into the general formula, you'll again get the result from the previous posting.
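The wave-function form lends itself to a quick numerical sanity check. The sketch below is only an illustration under stated assumptions: the Gaussian `psi`, the finite-difference helper `derivative`, and the three function names are hypothetical choices, not anything from the thread. It applies the kernels from the opening post to a smooth test function: the first-way kernel, the kernel obtained by integration by parts (kept with its prefactor $y$), and the kernel in which $y$ was renamed $x$.

```python
import math

def psi(y):
    """Hypothetical smooth test function (a Gaussian); any smooth, decaying choice would do."""
    return math.exp(-y * y)

def derivative(f, x, h=1e-5):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def first_way(x):
    # Kernel -i x d/dx delta(x - y) acting on psi gives -i x psi'(x).
    return -1j * x * derivative(psi, x)

def second_way(x):
    # Kernel i delta(x - y) + i y d/dy delta(x - y) acting on psi:
    # the delta term gives i psi(x); integrating the derivative term by parts
    # gives -i d/dy [y psi(y)] evaluated at y = x.
    return 1j * psi(x) - 1j * derivative(lambda y: y * psi(y), x)

def mistaken_way(x):
    # Kernel i delta(x - y) - i x d/dx delta(x - y), i.e. with y renamed to x
    # in front of the derivative; this keeps a spurious i psi(x).
    return 1j * psi(x) - 1j * x * derivative(psi, x)

for x in (-1.0, 0.3, 2.0):
    agree = abs(first_way(x) - second_way(x))
    spurious = abs(first_way(x) - mistaken_way(x))
    print(f"x={x}: |first - second| = {agree:.2e}, |first - mistaken| = {spurious:.6f}, psi(x) = {psi(x):.6f}")
```

Up to finite-difference error, the first two results coincide, while the third differs from them by $\mathrm{i}\psi(x)$, which is the smeared version of the extra $\mathrm{i}\delta(x-y)$.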

6. Dec 21, 2016

### Staff: Mentor

Ok, that gives me the position eigenstate that was labeled with the ket $\vert y \rangle$. Fine, no problem. Now tell me how I should write, in wave function form, the position eigenstate that was labeled with the bra $\langle x \vert$. By analogy with the above, it would be $\psi(x) = u_x(x) = \delta(x - x)$. Does that make sense to you? Because it doesn't to me.

7. Dec 21, 2016

### Staff: Mentor

[Edit: removed mistaken post.]

8. Dec 22, 2016

### PeroK

In general, $x$ is overloaded here. If you take the original expression and assume we are talking about specific position eigenstates, we could label them $x_0, y_0$ instead of $x, y$.

In any case, we ought to avoid using the same label, in this case $x$, that is used in partial derivative.

The key thing to note is what is meant by the partial derivative representation of $P$. I would explain it like this:

$f(x,y) = \langle x|P|y \rangle$ is a function of two variables. And we have:

$f(x,y) = -i \partial_x \langle x|y \rangle$

For two particular eigenstates $x_0, y_0$:

$f(x_0, y_0) = \langle x_0|P|y_0 \rangle = -i \partial_x \langle x_0|y_0 \rangle$

Where we take this last expression to be the partial derivative (which is a well-defined function) evaluated at $(x_0, y_0)$.

Note also that this representation for $P$ leads to the partial derivative in the first variable, which is why we can choose $(x, y)$ and not worry about notation, but if we reversed $x$ and $y$ in the original expression we would need to be careful about the partial derivative notation we use.

On the last point, we would need to distinguish between the eigenbra $\langle x_0|$ and the free variable in the x-representation wave function:

$\psi(x) = u_{x_0}(x) = \delta(x - x_0)$

9. Dec 22, 2016

### vanhees71

Well, of course you must be careful with distributions. The position eigenstate is not a vector in the Hilbert space and thus, in position representation, not a square-integrable function but a distribution. Formally it belongs to the dual space of the domain (which is also the co-domain) of the self-adjoint position operator (see, e.g., Ballentine's textbook for a gentle introduction into the modern formulation as "rigged Hilbert space"; there are also good introductions online, e.g. https://arxiv.org/pdf/quant-ph/0502053 ). Thus it doesn't make sense to set $x=y$ here. The generalized eigenfunction of $\hat{x}$ with the eigenvalue $y$ is
$$u_y(x)=\langle x|y \rangle=\delta(x-y),$$
and that's what I used in my previous posting as well as the momentum operator in position representation. To be as clear as possible I write $\hat{p}$ for the representation-free abstract operator in the Dirac bra-ket formalism and $\tilde{p}$ in the position representation, where it is acting on wave functions in its domain (which is a proper dense subspace of the full Hilbert space, in the position representation realized as $\mathrm{L}^2(\mathbb{R})$):
$$\tilde{p} \psi(x)=\langle x|\hat{p} \psi \rangle=-\mathrm{i} \partial_x \psi(x).$$
All this can be derived from the Heisenberg algebra of non-relativistic qm. The only non-trivial commutator relation in this case is
$$[\hat{x},\hat{p}]=\mathrm{i} \hat{1}.$$
In position representation you have
$$\tilde{x} \psi(x)=\langle x|\hat{x} \psi \rangle=\langle \hat{x} x|\psi \rangle=x \langle x|\psi \rangle=x \psi(x),$$
and thus also in the position representation you find
$$[\tilde{x},\tilde{p}]=\mathrm{i},$$
i.e., indeed it provides a representation of the Heisenberg algebra.
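Spelled out on a differentiable test function $\psi$ (assumed to lie in the common domain of both operators), the check needs only the product rule:
$$[\tilde{x},\tilde{p}]\psi(x)=x\,(-\mathrm{i}\partial_x)\psi(x)-(-\mathrm{i}\partial_x)\big(x\psi(x)\big)=-\mathrm{i}x\psi'(x)+\mathrm{i}\psi(x)+\mathrm{i}x\psi'(x)=\mathrm{i}\psi(x).$$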

The Heisenberg algebra itself follows either from "canonical quantization", which is a handwaving argument, only working by chance in Cartesian coordinates, or from the symmetry of the one-dimensional configuration space, i.e., homogeneity (translation invariance), where the momentum is defined as the generator for spatial translations (which is true all over physics thanks to Noether, i.e., it's the rule to define the momentum observable from classical mechanics, classical field theory to QM and relativistic QFT).

10. Dec 22, 2016

### Staff: Mentor

Exactly. And doing this resolves the question posed in the OP; you can insert a "decomposition of the identity" in terms of a complete set of states and still get the same answer.

Exactly.

All of this is perfectly true, but it's probably over the OP's head and as far as I can see it doesn't address the OP's question the way PeroK's remark, quoted above, does. The point is that the notation $x$ should not be overloaded; that's all I was saying. Fixing that resolves the OP's question.

11. Dec 22, 2016

### vanhees71

Well, I don't understand what PeroK does differently than I did in his first part. The second part I don't understand again. If you derive a function $f(x_0,y_0)$ partially wrt. $x$ you get 0, because $f(x_0,y_0)$ doesn't depend on $x$. Obviously I don't understand your notation, nor the problem the OP has. Sorry for that.

12. Dec 22, 2016

### Staff: Mentor

You used notation in which $x$ refers both to the position variable (what $\psi(x)$ is a function of) and a particular position eigenvalue. PeroK used $x_0$ instead of $x$ to refer to the eigenvalue; I used $a$. The OP's problem was that, since he was using notation like yours, he confused the position variable with the eigenvalue, and that is what led him to think that the second way of doing his derivation got a different answer.

That's not what is being done. What is being done is to evaluate the function $- i \partial_x \delta(x - y)$ (which actually is a distribution, as you point out, because of the $\delta$, but issues like that are above the level of this thread) at a particular value for $x$, the one PeroK called $x_0$ and I called $a$. (And since he is considering it as a function of two variables, he also has to specify the value of $y$; he called it $y_0$ and I called it $b$. I didn't go into that aspect of it since I was considering $b$ to be fixed and the function to be a function of one variable, $x$, only, but either way works.)

13. Dec 22, 2016

### vanhees71

Perhaps the problem is still that the connection between the representation-free Dirac notation and the position representation is not clarified enough. Let $|\psi (t)\rangle$ be a normalized Hilbert-space vector and $|x \rangle$ the generalized eigenvector of the position operator of the generalized eigenvalue $x \in \mathbb{R}$. Then the wave function of the particle is given by
$$\psi(t,x)=\langle x|\psi(t) \rangle.$$
This means that indeed the eigenvalue $x$ is the argument of the wave function, because it's just the position representation, i.e., you choose to work with the components wrt. the generalized position-operator eigenbasis.

Now in the OP a somewhat complicated situation is discussed, namely the eigenvector of the position operator of eigenvalue $y$. This means you formally set $|\psi(t) \rangle=|y \rangle$, and this gives the generalized wavefunction
$$u_y(x)=\langle x|y \rangle=\delta(x-y).$$

14. Dec 22, 2016

### Staff: Mentor

I still don't understand your notation. The argument $x$ of the wave function is a variable. The eigenvalue $x_0$ (or $a$, or $y$, or whatever symbol other than just $x$ you want to use) is a constant. The quantity $\langle x \vert \psi(t) \rangle$, assuming you mean $\langle x \vert$ and $\vert \psi(t) \rangle$ to denote single vectors (i.e., we have chosen a particular eigenvector of position and a particular label $t$ that identifies a single Hilbert space vector), is a number, not a function; it's the inner product of two vectors. So $\psi(t, x)$, as you've written it, is the value of the wave function for a particular choice of arguments, not the wave function itself.

In the latter part of your post, you write $u_y(x)$, which looks like a function, but you equate it to $\langle x \vert y \rangle$, which is again a number, an inner product of two vectors. Unless you mean $\langle x \vert$ to denote a set of vectors, the set of all eigenvectors of position; that would make $u_y(x)$ a function that maps eigenvectors of position to numbers (the number for each eigenvector being its inner product with the specific eigenvector $\vert y \rangle$). But in that case, once again, the argument $x$ of $u_y(x)$ is a variable; it's not any single eigenvalue, it ranges over all possible eigenvalues.

15. Dec 23, 2016

### PeroK

This makes perfect sense. One step that could be explained is why you can change the variable in the partial derivative from $\partial_z$ to $\partial_x$. One way to see this is to use a different notation for the partial derivative, for example $\partial_1$ to indicate the derivative with respect to the first argument of a function. In which case you have:
$$\langle x|\hat{x} \hat{p}| y \rangle=\int_{\mathbb{R}} \mathrm{d} z \langle x|\hat{x} z \rangle \langle z|\hat{p} y \rangle=\int_{\mathbb{R}} \mathrm{d} z x \delta(x-z) (-\mathrm{i} \partial_1) \delta(z-y)=-\mathrm{i} x \partial_1 \delta(x-y).$$
Until now you can think of $x, y$ as fixed, but now you have:
$$\forall x, y: \ \langle x|\hat{x} \hat{p}| y \rangle=-\mathrm{i} x \partial_1 \delta(x-y).$$
In which case, $x$ is now the first variable, so we can change the notation to the more usual $\partial_1 \delta(x-y) = \partial_x \delta(x-y)$

An interesting analogy is in the single variable calculus, where we can use $f'$ as the variable-free notation for the derivative and this makes it easier to see something like:

If $x = z$ then $f'(x) = f'(z)$

Using $\partial_1$ etc. is a way of expressing partial derivatives as functions free of any specific variable.
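The slot-based notation can be mimicked in code. This is only a sketch under assumptions: `partial_1` is a hypothetical finite-difference helper and `g` is a smooth Gaussian stand-in for $\langle x|y \rangle$ (a true delta distribution cannot be sampled pointwise). It shows that the derivative with respect to the first slot is itself a function of two arguments, which is then evaluated at particular values, whatever name the first argument carries.

```python
import math

def partial_1(f, h=1e-6):
    """Return (a, b) -> derivative of f with respect to its first slot, by central differences."""
    return lambda a, b: (f(a + h, b) - f(a - h, b)) / (2 * h)

def g(first, second):
    """Hypothetical smooth stand-in for <x|y>: a Gaussian in (first - second)."""
    return math.exp(-(first - second) ** 2)

# The slot derivative is a two-argument function, free of any variable name.
d1g = partial_1(g)

# Evaluate it at particular values, as in f(x_0, y_0).
x0, y0 = 0.7, 0.2
value_at_point = d1g(x0, y0)

# Feeding the same number through a differently named variable changes nothing.
z = x0
same_value = d1g(z, y0)

# Analytic derivative in the first slot, for comparison.
exact = -2 * (x0 - y0) * math.exp(-(x0 - y0) ** 2)
print(value_at_point, same_value, exact)
```

The point of the design is that `partial_1(g)` is built before any argument values are chosen, so "differentiate, then evaluate at $(x_0, y_0)$" is unambiguous.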

I don't know whether that clears up the misunderstanding, but that's my somewhat laboured justification of why you can change the variable in the partial derivative here.

16. Dec 23, 2016

### vanhees71

I'm a bit confused about this debate, because I use standard Dirac notation all the way. I think, we have to really define everything clearly to make progress. I work in the Schrödinger picture of time evolution which is usually introduced first in the introductory QM lecture (which at least in my case at the TU Darmstadt in the mid-1990s started with the representation-free Dirac approach right away, but we had also a more conventional treatment in terms of wave mechanics in the experimental-physics course lecture before). So let's start with the formalism (the "kinematical part only"):

(a) A (pure) state of a quantum system is described by a ray in Hilbert space, i.e., by a normalized vector $|\psi(t) \rangle \in \mathcal{H}$ modulo a phase factor (where $\mathcal{H}$ is the up to equivalence unique separable Hilbert space, i.e., it can be represented by, e.g., the Hilbert space of square-summable complex sequences, $\ell^2$ or the Hilbert space of square-integrable functions $\mathrm{L}^2$, corresponding to the representation of the abstract formalism in terms of "matrix mechanics" or "wave mechanics" respectively). Equivalently, and sometimes more conveniently, the pure state is represented by the corresponding Statistical Operator $\hat{\rho}_{\psi}(t)=|\psi(t) \rangle \langle \psi(t) |$.

(b) The observables $A,B,C,\ldots$ of the quantum system are described by essentially self-adjoint operators $\hat{A}$, $\hat{B}$, $\hat{C},\ldots$. These operators have a complete set of orthonormal (generalized) eigenvectors with real (generalized) eigenvalues (a complete orthonormalized system, CONS, or generalized basis).

(c) Given the system is prepared in the pure state $\hat{\rho}_{\psi}(t)$ and if $|a,\beta \rangle$ are a CONS of eigenvectors of $\hat{A}$ (where $\beta$ is a label to count the degeneracy of the eigenvalue $a$ of $\hat{A}$), then the probability to find the value $a$, when the observable $A$ is measured is given by Born's rule:
$$P_{\psi}(a)=\sum_{\beta} \langle a,\beta|\hat{\rho}_{\psi}(t) a,\beta \rangle=\sum_{\beta} |\langle a,\beta |\psi(t) \rangle|^2.$$
Here I assumed that $\beta$ runs over a discrete set. If there are continuous parts or if the range of $\beta$ is entirely continuous, then one has to integrate instead.

Now since the set $|a,\beta \rangle$ is a CONS, you can uniquely represent everything in the "$A$ representation", i.e., by the vector components
$$\psi(t,a,\beta)=\langle a,\beta |\psi(t) \rangle.$$
One should note that in the Dirac notation you often introduce also co-vectors ("bras") in addition to the vectors ("kets"). In the context of CONSs related to generalized eigenvectors of the self-adjoint operators these live in a larger space than the dual space of $\mathcal{H}$ (note that $\mathcal{H}^*=\mathcal{H}$ due to the scalar product), i.e., in the dual space of the domain (which is the same as the co-domain) of the operator, i.e., $\langle a,\beta|$ is in general a distribution-valued linear form on $\mathcal{H}$.

Now let's clarify this with the example we are discussing here, namely the motion of a scalar non-relativistic particle restricted to one direction in space. As for any real-world physical system, you have to make a model to construct the Hilbert space, and the most straightforward one is to use "canonical quantization". In the classical case the phase space is given by $x$ (position) and $p$ (momentum) of the particle. Now these observables become self-adjoint operators on a Hilbert space, $\hat{x}$ and $\hat{p}$, and the entire algebra of observables is defined by the Lie-algebra of commutators. The only non-trivial commutator is
$$[\hat{x},\hat{p}]=\mathrm{i} \hat{1}.$$
The most difficult part of constructing the Hilbert space such that you can calculate real things is to figure out the spectrum of the operators and how these operators can be realized. Here, this works as follows. The heuristic is that the momentum generates spatial translations in the classical case, and that's also the case in quantum theory, as is demonstrated as follows. Consider the operator-valued function
$$\hat{X}(\xi)=\exp(\mathrm{i} \xi \hat{p}) \hat{x} \exp(-\mathrm{i} \xi \hat{p}), \quad \xi \in \mathbb{R}.$$
Then
$$\mathrm{d}_{\xi} \hat{X} = \mathrm{i} \exp(\mathrm{i} \xi \hat{p}) [\hat{p},\hat{x}] \exp(-\mathrm{i} \xi \hat{p})=\hat{1}.$$
In the last step we've used the commutator relation. Since further obviously $\hat{X}(0)=\hat{x}$ we thus find
$$\hat{X}(\xi)=\hat{x}+\xi \hat{1}.$$
Now since $\hat{x}$ is self-adjoint it has a (generalized) eigenvector $|x_0 \rangle$ with a real eigenvalue $x_0$. Then let's consider $\exp(-\mathrm{i} \xi \hat{p}) |x_0 \rangle$. From the above it's suggestive to think that this is also an eigenvector of the position operator, thus we calculate
$$\hat{x} \exp(-\mathrm{i} \xi \hat{p}) |x_0 \rangle=\exp(-\mathrm{i} \xi \hat{p}) \hat{X}(\xi) |x_0 \rangle= \exp(-\mathrm{i} \xi \hat{p}) (\hat{x}+\xi \hat{1}) |x_0 \rangle = (x_0+\xi) \exp(-\mathrm{i} \xi \hat{p})|x_0 \rangle.$$
This means that $\exp(-\mathrm{i} \xi \hat{p}) |x_0 \rangle$ is a position eigenvector with eigenvalue $x_0+\xi$, which implies that the spectrum of $\hat{x}$ is the entire real line $\mathbb{R}$. Since we further assume that the representation of the Heisenberg commutator algebra is irreducible (since there's no other independent observable in the classical system that "commutes" with the position observable) the CONS of generalized eigenvectors of the position observable are given by
$$|x \rangle=\exp(-\mathrm{i} x \hat{p}) |0 \rangle, \quad x \in \mathbb{R}.$$
The generalized eigenvectors are normalized "to a $\delta$ distribution",
$$\langle x'| x \rangle=\delta(x-x')$$
for convenience. Further the completeness relation is assumed to hold:
$$\int_{\mathbb{R}} \mathrm{d} x \, |x \rangle \langle x|=\hat{1}.$$
Now we are ready to work in position representation (aka "wave mechanics"). The pure states are represented in the above specified sense by the "components"
$$\psi(t,x)=\langle x|\psi(t) \rangle,$$
and the state ket itself is given due to the completeness relation by
$$|\psi(t) \rangle=\int_{\mathbb{R}} \mathrm{d} x |x \rangle \psi(t,x).$$
The scalar product is realized by integrals, again due to the completeness relation, by
$$\langle \psi_1|\psi_2 \rangle=\int_{\mathbb{R}} \mathrm{d} x \langle \psi_1|x \rangle \langle x|\psi_2 \rangle=\int_{\mathbb{R}} \mathrm{d} x \psi_1^*(x) \psi_2(x),$$
i.e., we have a map between the abstract Hilbert space $\mathcal{H}$ and the Hilbert space $\mathrm{L}^2(\mathbb{R},\mathbb{C})$ of complex-valued square-integrable functions (modulo equivalence of functions that differ only on a set of real numbers of Lebesgue measure 0).

Finally we need the momentum operator. It's uniquely defined when we know how it acts on the generalized position eigenvectors, but that's easy since in our choice of the basis we have
$$|x \rangle = \exp(-\mathrm{i} x \hat{p}) | 0 \rangle \; \Rightarrow \; \hat{p} |x \rangle=\mathrm{i} \mathrm{d}_x |x \rangle.$$
Taking the adjoint of this relation we find
$$\langle x| \hat{p}=-\mathrm{i} \mathrm{d}_x \langle x|.$$
Multiplying with $|\psi \rangle$ we find that
$$\tilde{p} \psi(x)=\langle x|\hat{p} \psi \rangle=-\mathrm{i} \mathrm{d}_x \langle x|\psi \rangle=-\mathrm{i} \psi'(x).$$
Now you can easily find the momentum eigenvectors in position representation by solving the eigenvalue equation
$$\tilde{p} u_p(x)=p u_p (x), \quad \text{where} \quad u_p(x)=\langle x|p \rangle.$$
Of course $u_p$ is a function of position with a real parameter $p$ (or, if you wish a function of two independent variables $x$ and $p$). This gives (including the "normalization to a $\delta$ distribution"),
$$u_p(x)=\frac{1}{\sqrt{2 \pi}} \exp(\mathrm{i} x p),$$
and you can easily transform from the position to the momentum representation, via
$$\tilde{\psi}(p)=\langle p|\psi \rangle=\int_{\mathbb{R}} \mathrm{d} x \langle p|x \rangle \langle x|\psi \rangle=\int_{\mathbb{R}} \mathrm{d} x \frac{1}{\sqrt{2 \pi}} \exp(-\mathrm{i} p x) \psi(x),$$
i.e., you get the momentum-space wave function as Fourier transform of the position-space wave function and vice versa,
$$\psi(x)=\int_{\mathbb{R}} \mathrm{d} p \frac{1}{\sqrt{2 \pi}} \exp(+\mathrm{i} p x) \tilde{\psi}(p).$$
I hope this clarifies the standard notation of the Dirac formalism sufficiently.
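The Fourier convention above, with the symmetric factor $1/\sqrt{2\pi}$ and the sign $\exp(-\mathrm{i} p x)$ in the transform to momentum space, can be checked numerically. The following is a minimal sketch under assumptions: the normalized Gaussian `psi`, the integration window, and the trapezoid-rule helper `momentum_wavefunction` are illustrative choices, not part of the derivation. For this particular Gaussian the transform is known in closed form, which gives something to compare against.

```python
import cmath
import math

def psi(x):
    """Hypothetical normalized Gaussian wave function, chosen only for illustration."""
    return math.pi ** -0.25 * math.exp(-x * x / 2)

def momentum_wavefunction(p, lo=-12.0, hi=12.0, n=4000):
    """Trapezoid-rule approximation of (2 pi)^(-1/2) * integral dx exp(-i p x) psi(x)."""
    dx = (hi - lo) / n
    total = 0j
    for k in range(n + 1):
        x = lo + k * dx
        weight = 0.5 if k in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * cmath.exp(-1j * p * x) * psi(x)
    return total * dx / math.sqrt(2 * math.pi)

for p in (0.0, 0.5, 1.5):
    numeric = momentum_wavefunction(p)
    # Closed-form transform of this Gaussian under the convention above.
    exact = math.pi ** -0.25 * math.exp(-p * p / 2)
    print(p, numeric, exact)
```

With this convention the Gaussian maps to a Gaussian of the same width and the same normalization, so the numerical and closed-form values should agree to within the quadrature error.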