# Momentum operator in curvilinear coordinates


#### ShayanJ

This paper is about the momentum operator in curvilinear coordinates. The author says that using $\vec p=\frac{\hbar}{i} \vec \nabla$ is wrong and that this form is valid only in Cartesian coordinates. He then tries to find expressions for the momentum operator in curvilinear coordinates. His starting point is the uncertainty principle in curvilinear coordinates, $[q_i,p_j]=i\hbar \delta_{ij}$, and it becomes obvious that by $q_i$ he means the coordinates themselves, e.g. $r, \theta, \varphi$ in spherical coordinates. But both intuition and dimensional analysis tell us that the $q$s should be (e.g. for spherical coordinates) $r, r\theta, r\sin\theta\, \varphi$. So I think that, because of this wrong starting point, the paper goes wrong all the way to the end and its initial claim is wrong. I want to know others' ideas. Any comment is welcome.

The conjugate momentum for angular variables will itself have units of angular momentum as opposed to just momentum, so the units of the commutator are fine.

But even if we take that into account, we should note that angular variables don't represent a constant spread in length at all distances from the origin. For example, for a constant $\Delta \varphi$, if $R_1 > R_2$ then $R_1 \Delta \varphi>R_2 \Delta \varphi$, so $\Delta \varphi$ alone can't represent the uncertainty in position! But by using the angle operator alone in the commutation relation, the corresponding uncertainty relation will be in terms of the spread in angle only.

I don't find anything in that article related to uncertainty relations of any kind.

+1 on that.

@Shyan: the paper is talking about generalized commutation relations. Given these, one derives uncertainty (aka indeterminacy) relations. Cf. Ballentine section 8.4.

That's exactly what I mean. If we say e.g. $[\varphi,m r^2 \dot \varphi]=i\hbar$, then we'll have an uncertainty relation $\Delta \varphi \Delta (mr^2 \dot\varphi) \geq \frac \hbar 2$. What I'm saying is that a particular value of $\Delta\varphi$ alone can't give us enough information about the uncertainty in position, because a given angle uncertainty corresponds to a larger length uncertainty the further we are from the origin. So if we have points 1 and 2 such that $r_1>r_2$ and we somehow have a constant $\Delta\varphi$, the particle will have a larger uncertainty in its position at point 1 than at point 2, because $r_1 \Delta \varphi> r_2 \Delta \varphi$. I think this should somehow be taken into account in the commutation relations, so that the resulting uncertainty relations take it into account too.
Anyway, despite my argument, the response of you science advisors makes me think I'm missing something here. So I'm ready to hear!

The transformation of the momentum operator to other coordinate systems has been studied intensively since the very beginning of QM and every detail of it has been made watertight by mathematical physicists. I don't understand how you can write an article on it citing only some elementary textbooks on QM nor why one should read this article.

I came to this paper by accident. Because it seemed to me that $\vec p=\frac \hbar i \vec \nabla$ is well established, and because I saw the problem with the paper mentioned above, I came here to ask others' ideas about it. Actually I didn't read the paper, I just glanced at the equations.
The main point is, the guy is either right or wrong. If he's wrong, then there should be some reason, and I want to find it out because it seems to me it doesn't need a lot of time and effort, so it's OK. But if we can find no reason to say he's wrong, then maybe he has a point!
Can you mention some papers or books that actually prove $\vec p=\frac \hbar i \vec \nabla$ is OK in general? I always thought it's somehow trivial, but now I think it would be good to see at least a hand-wavy proof.

The paper is complete scrap. Already the starting point, Eq. 4, is wrong, as q is not a well-defined operator in the case of angular momentum, i.e. an angle operator does not exist.

an angle operator does not exist
And yet, you can measure the angle by measuring the position, which, in turn, is an operator. Or is it? How would you interpret the idea that in quantum mechanics you can measure something which is not described by an operator?

Now that I think about it, it's pretty straightforward. We should just consider a wave function in the position representation written in curvilinear coordinates. Then we should translate it by an infinitesimal amount and subtract it from the original wave function. An operator then simply pops out, which we define to be the momentum operator, and this is something you can't argue with. So I wonder why the authors doubt it!
The important point here is that, in spherical coordinates, such a translation only involves the r coordinate, and in cylindrical coordinates it only involves the $\rho$ and z coordinates. So this way it seems that we get no canonical momenta conjugate to the angle coordinates, and so they aren't on an equal footing with the ones conjugate to the distance coordinates.

And yet, you can measure the angle by measuring the position, which, in turn, is an operator. Or is it? How would you interpret the idea that in quantum mechanics you can measure something which is not described by an operator?
Ok, so let us introduce an angle operator as ##\phi=\mathrm{atan2}(x,y)##. This operator will jump from ##\pi## to ##-\pi## when x becomes negative for negative y. We already know the correct angular momentum operator to be ##-i (x\partial/\partial y -y \partial/\partial x)##, so this will introduce a delta function into the commutation relations, which are therefore no longer of the canonical form we were assuming.
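To make the delta function explicit, here is a minimal sketch under two assumptions that are not stated in the post: the angle is represented by the ##2\pi##-periodic sawtooth ##\Phi(\phi)## with values in ##(-\pi,\pi]##, and ##L_z=-i\hbar\,\partial/\partial\phi## acts on ##2\pi##-periodic wave functions ##\psi##. The sawtooth has slope 1 everywhere except for a drop of ##2\pi## at every odd multiple of ##\pi##, so
$$[L_z,\hat\Phi]\,\psi=-i\hbar\,\frac{d\Phi}{d\phi}\,\psi=-i\hbar\Big(1-2\pi\sum_{n\in\mathbb{Z}}\delta\big(\phi-(2n+1)\pi\big)\Big)\psi .$$
Away from the jump this looks canonical; the extra delta term is what spoils ##[L_z,\hat\Phi]=-i\hbar## as an operator identity.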

In "An introduction to quantum theory" by F.S. Levin, section 10.4, it's explained that the operator $\hat \Phi |\phi\rangle=\phi |\phi\rangle$ is ambiguous and that both $\hat \Phi |\phi\rangle=\phi |\phi\rangle$ and $[\hat L_z,\hat \Phi]=-i\hbar$ lead to inconsistencies. But he mentions that this doesn't mean an angle operator does not exist; in fact there are several possibilities, among which $e^{i\hat \Phi}$, $\cos\hat\Phi$ and $\sin\hat\Phi$ are examples. He then examines $\hat\Upsilon=e^{i\hat \Phi}$. He says:
Levin said:
The fact that ## \hat\Upsilon ## is unitary rather than Hermitian is not a problem, since the angle is not an observable.
He then calculates $[\hat L_z,\hat \Upsilon]=\hbar \hat \Upsilon$ and $\hat \Upsilon$'s matrix elements which makes clear that it is actually a raising operator, i.e. $\hat \Upsilon=C_+^{-1} \hat L_+$ with $C_+=\sqrt{(l-m_l)(l+m_l+1)}$.

Interesting, we had a similar discussion only a week ago:
If we take the angle operator to be a generator of shifts of the angular momentum, we have to take into account that angular momentum is quantized. However, this is good, as ##\exp (im\phi)## will only transform a continuous function into a continuous function if m is an integer.
So I think we can write some Weyl commutation relations ##UV=VU\exp(i\alpha m)## where ##U=\exp(i\alpha L_z)##, ##V=\exp(i m \phi)## with alpha real and m integer. For non-integer values of m, the Weyl commutation relations won't hold.
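A short check of this relation, as a sketch assuming ##\hbar=1## and ##L_z=-i\,\partial/\partial\phi## acting on ##2\pi##-periodic functions, so that ##U=\exp(\alpha\,\partial/\partial\phi)## shifts the argument by ##\alpha##:
$$(UV\psi)(\phi)=e^{im(\phi+\alpha)}\,\psi(\phi+\alpha)=e^{i\alpha m}\,e^{im\phi}\,(U\psi)(\phi)=e^{i\alpha m}\,(VU\psi)(\phi).$$
For non-integer m, ##e^{im\phi}\psi(\phi)## is no longer ##2\pi##-periodic, so V does not map the space of periodic functions into itself.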

He then calculates $[\hat L_z,\hat \Upsilon]=\hbar \hat \Upsilon$ and $\hat \Upsilon$'s matrix elements which makes clear that it is actually a raising operator, i.e. $\hat \Upsilon=C_+^{-1} \hat L_+$ with $C_+=\sqrt{(l-m_l)(l+m_l+1)}$.
So one could define ##\Upsilon## in terms of ##L^2##, ##L_z##, and ##L_+##, although the representation is somewhat nasty.

Ok, so let us introduce an angle operator as ##\phi=\mathrm{atan2}(x,y)##. This operator will jump from ##\pi## to ##-\pi## when x becomes negative for negative y. We already know the correct angular momentum operator to be ##-i (x\partial/\partial y -y \partial/\partial x)##, so this will introduce a delta function into the commutation relations, which are therefore no longer of the canonical form we were assuming.
So there is angle operator, but it does not satisfy the canonical commutation relation. Is that right?

As I said in my last post, $\Upsilon$ is clearly unitary. So the fact that it's proportional to $L_+$ means that $L_+ L_+^\dagger \propto I$. But when I calculate $L_+ L_+^\dagger$ using the definitions given here, I find that $L_+ L_+^\dagger=\hbar^2 \left\{ \frac{\partial^2}{\partial \theta^2}+i\frac{\partial}{\partial \varphi}+\cot\theta(\frac{\partial}{\partial \theta }+\cot\theta \frac{\partial^2}{\partial \varphi^2})\right\}$, which doesn't seem to be proportional to the identity! What's wrong here?

Why should it be proportional to the identity? However, it should commute with L^2 and L_z.

Because $\Upsilon$ is unitary, we have $\Upsilon \Upsilon^\dagger=I \Rightarrow C_+^{-2} L_+ L_+^\dagger= I \Rightarrow L_+ L_+^\dagger\propto I$.

Ok, but this ##C_+## is an operator, too, which can be expressed in terms of L^2 and L_z.

I haven't read the paper nor the postings so far, but from the abstract of the paper, I can only conclude that it's utter nonsense. Momentum is defined as the generator for translations in Euclidean space (which is the space for inertial observers both in non-relativistic as well as special-relativistic physics, including quantum theory). Even in position representation, where
$$\hat{\vec{p}}=-\mathrm{i} \vec{\nabla},$$
you have a coordinate-independent description. You can express the nabla operator in any curvilinear coordinates you like. It's a vector operator and thus independent of the choice of coordinates!
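As an illustration (a standard vector-calculus identity, written with ##\hbar=1## as in the formula above and with orthonormal unit vectors ##\vec e_r, \vec e_\theta, \vec e_\varphi##), the same operator in spherical coordinates reads
$$\hat{\vec p}=-\mathrm{i}\left(\vec e_r\,\frac{\partial}{\partial r}+\vec e_\theta\,\frac{1}{r}\frac{\partial}{\partial\theta}+\vec e_\varphi\,\frac{1}{r\sin\theta}\frac{\partial}{\partial\varphi}\right).$$
The factors ##1/r## and ##1/(r\sin\theta)## are what give every component the dimension of momentum, while ##-\mathrm{i}\,\partial/\partial\varphi## on its own is an angular momentum.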

So there is angle operator, but it does not satisfy the canonical commutation relation. Is that right?
Yes, you were right about that.

Ok, but this ##C_+## is an operator, too, which can be expressed in terms of L^2 and L_z.
I don't think that's true! If ##C_+## were an operator, then Levin couldn't say ## \Upsilon ## is a raising operator, because ## C_+ ## could do things which make ## \Upsilon ## very different from ## L_+ ##!
Also ## C_+=\sqrt{(l-m_l)(l+m_l+1)} ##, and ##l## and ##m_l## are numbers!
I don't see how ## C_+ ## can be an operator!

My present understanding is that ##\Phi## is a well-defined self-adjoint operator which is even bounded. Hence ##\Upsilon=e^{i\Phi}## must be unitary. On the other hand, the shift operators aren't unitary. We can determine ##A:=C_+^{-1}## from
##1=\Upsilon^\dagger \Upsilon=L_-A^\dagger A L_+=AL_+L_-A^\dagger =\Upsilon \Upsilon^\dagger=1##.
Now ##L_+L_-=L_x^2-i(L_xL_y -L_yL_x)+L_y^2=L^2-L_z^2+L_z## and ##L_-L_+=L^2-L_z^2-L_z##. Now multiply from the right with A and from the left with ##A^\dagger##: ##A^\dagger A=A^\dagger AL_+L_-A^\dagger A##, so we get ##A^\dagger A=(L_+L_-)^{-1}##.
More later.

Ok, last step: In a basis of eigenstates of ##L^2## and ##L_z##, both ##\Upsilon## and ##L_+## make a transition from ##m_z## to ##m_z +1## for fixed l. Hence A must be diagonal and is therefore a hermitian operator ##A=A^\dagger =(L_+L_-)^{-1/2}##.
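The algebra above can be checked numerically. The following is a minimal sketch, not taken from any of the posts: it assumes ##\hbar=1## and integer l, uses the standard matrix elements ##L_+|l,m\rangle=\sqrt{(l-m)(l+m+1)}\,|l,m+1\rangle##, and the helper names (`ladder_matrices`, `max_deviation`, `a_eigenvalues`) are made up for illustration. It verifies ##L_+L_-=L^2-L_z^2+L_z## and lists the diagonal entries of ##A=(L_+L_-)^{-1/2}## on the states that ##L_+## can reach.

```python
import math


def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def ladder_matrices(l):
    """Return (ms, Lz, L+, L-) in the basis m = l, l-1, ..., -l (hbar = 1, integer l)."""
    ms = [l - k for k in range(2 * l + 1)]
    n = len(ms)
    lz = [[float(ms[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]
    lp = [[0.0] * n for _ in range(n)]
    for j, m in enumerate(ms):
        if m < l:
            # L+ |l,m> = sqrt((l-m)(l+m+1)) |l,m+1>, and |l,m+1> has index j-1
            lp[j - 1][j] = math.sqrt((l - m) * (l + m + 1))
    # L+ is real in this basis, so its adjoint L- is just the transpose
    lm = [[lp[j][i] for j in range(n)] for i in range(n)]
    return ms, lz, lp, lm


def max_deviation(l):
    """Largest entry of |L+ L- - (L^2 - Lz^2 + Lz)|, with L^2 = l(l+1) times identity."""
    ms, lz, lp, lm = ladder_matrices(l)
    n = len(ms)
    lhs = matmul(lp, lm)
    lz2 = matmul(lz, lz)
    dev = 0.0
    for i in range(n):
        for j in range(n):
            l2 = l * (l + 1) if i == j else 0.0
            dev = max(dev, abs(lhs[i][j] - (l2 - lz2[i][j] + lz[i][j])))
    return dev


def a_eigenvalues(l):
    """Diagonal of A = (L+ L-)^(-1/2) on the states m' = -l+1, ..., l reachable by L+."""
    ms, _, lp, lm = ladder_matrices(l)
    prod = matmul(lp, lm)
    return {m: 1.0 / math.sqrt(prod[i][i]) for i, m in enumerate(ms) if m > -l}


if __name__ == "__main__":
    print(max_deviation(2))
    print(a_eigenvalues(2))
```

On the state ##|l,m+1\rangle## the entry of A equals ##1/\sqrt{(l-m)(l+m+1)}##, i.e. the number ##C_+^{-1}## for the starting value m. The state ##m'=-l## is left out because ##L_+L_-## vanishes there, so A is only defined on the range of ##L_+##.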

I'm confused because there are several things I don't understand:

1) Here you're assuming that ## \Upsilon## makes a transition from ## m_z ## to ##m_z+1##, like ## L_+ ##. But if we say that ## \Upsilon =A L_+ ##, where A is an unknown operator we're trying to find, then we can't be sure of the first sentence unless we assume A is a number, not an operator! (Or an operator that commutes with ##L_+##?) So I should ask: how else can you know that ## \Upsilon ## acts like ## L_+ ##? Are you assuming

2) I understand the calculation in your last post and it seems A is actually an operator but how can you reconcile it with the fact that ## A=\left[(l-m_l)(l+m_l+1)\right]^{-\frac 1 2}## ?

3) I somehow know the answer to the first question, but the problem is that it's in contradiction with A being an operator. The point is, as Levin explains, the commutation relation ## [L_z,\Upsilon]=\hbar \Upsilon ## implies that the matrix elements ## \Upsilon_{m_l' m_l} ## are non-zero only when ## m_l'=m_l+1 ##. This way Levin concludes that ## \Upsilon ## is a raising operator, but this can only mean that ## \Upsilon=D L_+ ## where D is a constant number! (Or any operator that commutes with ## L_+ ##, but Levin seems to accept that D is a constant number!)

1. is a consequence of the line of thought in post #14, i.e. the commutation relations which only hold for integer values of m.
2. These are exactly the eigenvalues of the operator A I introduced, in the basis ##|l,m_z\rangle##. In my last post, I showed that A is diagonal in that basis.
3. I can't follow your conclusion, and I think you yourself can't either. I suppose you can work out yourself what B must look like if ##AL_+=L_+B##.

OK, let's get back to the beginning. You said that because ## \Upsilon ## is a raising operator, it should be equal to ## A L_+ ## where A is diagonal. (Is that what you said?) But I'm saying this is not the only possibility, because we may have ## \Upsilon=BL_+ ## where ## [B,L_+]=0 ##. This way ##\Upsilon## will surely be a raising operator too. How can you exclude this case?

As I said in my last post, $\Upsilon$ is clearly unitary. So the fact that its proportional to $L_+$, means that $L_+ L_+^\dagger \propto I$. But when I calculate $L_+ L_+^\dagger$ using the definitions given here, I find out that $L_+ L_+^\dagger=\hbar^2 \left\{ \frac{\partial^2}{\partial \theta^2}+i\frac{\partial}{\partial \varphi}+\cot\theta(\frac{\partial}{\partial \theta }+\cot\theta \frac{\partial^2}{\partial \varphi^2})\right\}$ which doesn't seem to be proportional to identity! What's wrong here?
Since you're necessarily working in a particular Hilbert space here, I suspect it's only reasonable to expect the identity when sandwiched between the eigenstates. Cf. Levin's eq. (10.99), i.e.,
Levin said:
##\langle n,\ell,m_\ell| \hat L_x |n,\ell,m_\ell\rangle
~=~ \langle n,\ell,m_\ell| \hat L_y |n,\ell,m_\ell\rangle ~=~ 0 ~. ~~~~~~~~ (10.99)##
This holds on this Hilbert space, but clearly ##\hat L_x \ne 0## if considered as an operator in isolation. I.e., one must beware of the distinction between "strong" and "weak" properties of operators -- "weak properties" are those which hold with respect to matrix elements on a specific Hilbert space. (This is actually a very important distinction in advanced QM and QFT.)

To amplify this point, note that Levin's relation ##\hat\Upsilon = C^{-1}_+ \hat L_+## was inferred by looking at the matrix-element equation (10.115).

Dirac used a different equality sign to denote weak equality (though that was in the context of constrained classical dynamics). It would make some parts of quantum theory clearer if that same distinction were adopted to express strong and weak equality of operators.

To check all this, you could try evaluating your expression for ##L_+ L_+^\dagger## between such eigenstates. Of course, it would be messy and error-prone to work with all those spherical harmonic functions, etc.

BTW, this highlights the problems that can arise when one assumes that the CCRs can be represented on a finite-dimensional Hilbert space. Sometimes the affine commutation relations (Klauder's term -- see his paper) are easier to work with. I've also seen them called "exponential commutation relations" (which is what you get when using the phase operator ##e^{i\phi}## instead of an angle operator).

All this has its beginnings back in classical Hamiltonian dynamics: for an integrable system, it's often useful to express the system in terms of (so-called) generalized action-angle variables, which satisfy an affine-like Poisson bracket relation instead of the canonical Poisson bracket.

It's interesting that some things become easier if we attempt to quantize such classical systems by reference to their affine Poisson brackets, instead of the canonical brackets. In the present case, it allows us to bypass the multi-valuedness issues that an angle variable like ##\phi## introduces. Use of ##e^{i\phi}## locks such inconveniences inside the exponential, and we can get on with our business...

BTW, it seems a bit harsh to be throwing strong criticism at these authors if one has not studied the paper properly.

The authors apparently intend more general cases of canonical variable pairs. See, e.g., their brief remark about the EM minimal coupling case. Although I didn't get much out of this particular paper for myself, there have certainly been examples in the past where I thought a paper was total rubbish, but later realized this merely reflected my own embarrassing lack of understanding.

That's of course always possible, but it's well known that "canonical quantization" only works for position and momentum in the whole of ##\mathbb{R}^3##. The operator algebra has to be derived from group-theoretical considerations, and even then it's a hypothesis whether this procedure really describes the system at hand correctly. The only way to figure this out is to do experiments to validate or invalidate the model in question. Physics is an empirical science!

The paper is completely scrap. Already the starting point Eq. 4 is wrong as q is not a well defined operator in the case of angular momentum, i.e. an angle operator does not exist.

I think that conclusion is a little strong, but I don't want to argue about it. However, I remember from studying quantum mechanics with periodic boundary conditions that if $x$ and $x+L$ represent the same physical location, then we replaced the usual commutation relation:

$[p, x] = -i \hbar$

by one that's appropriate for the ambiguity in $x$:

$[p, e^{\frac{2 \pi i x}{L}}] = \frac{2 \pi \hbar}{L} e^{\frac{2 \pi i x}{L}}$

The function $e^{\frac{2 \pi i x}{L}}$ is single-valued, even though $x$ is not.
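As a quick check of the modified relation (a sketch assuming the usual position representation ##p=-i\hbar\,d/dx## acting on a test function ##\psi##):
$$[p, e^{\frac{2 \pi i x}{L}}]\,\psi=-i\hbar\,\frac{d}{dx}\!\left(e^{\frac{2 \pi i x}{L}}\right)\psi=-i\hbar\cdot\frac{2\pi i}{L}\,e^{\frac{2 \pi i x}{L}}\,\psi=\frac{2 \pi \hbar}{L}\,e^{\frac{2 \pi i x}{L}}\,\psi .$$
This is the same product-rule step that gives $[p, x] = -i \hbar$, applied to a function of $x$ that respects the periodicity.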

I haven't read the paper nor the postings so far, but from the abstract of the paper, I can only conclude that it's utter nonsense. Momentum is defined as the generator for translations in Euclidean space (which is the space for inertial observers both in non-relativistic as well as special-relativistic physics, including quantum theory). Even in position representation, where
$$\hat{\vec{p}}=-\mathrm{i} \vec{\nabla},$$
you have a coordinate-independent description. You can express the nabla operator in any curvilinear coordinates you like. It's a vector operator and thus independent of the choice of coordinates!

I'm not qualified to argue, but I had heard the claim years ago, in studying quantum mechanics in spherical coordinates, that it was convenient to define:

$p_r = -i \hbar (\frac{\partial}{\partial r} + \frac{1}{r})$

rather than simply

$p_r = -i \hbar \frac{\partial}{\partial r}$

I do not remember the reason for this choice, but it agrees with the result in the paper.
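One commonly cited reason, given here as a sketch rather than as the argument of the paper, is that with the extra $\frac{1}{r}$ term the square of $p_r$ reproduces the radial part of the Laplacian. Acting on a test function $f(r)$:
$$\left(\frac{\partial}{\partial r}+\frac{1}{r}\right)^{2} f=f''+\frac{f'}{r}-\frac{f}{r^{2}}+\frac{f'}{r}+\frac{f}{r^{2}}=f''+\frac{2}{r}f' \quad\Rightarrow\quad p_r^{2}=-\hbar^{2}\left(\frac{\partial^{2}}{\partial r^{2}}+\frac{2}{r}\frac{\partial}{\partial r}\right).$$
The same operator can be written as $p_r=-i\hbar\,\frac{1}{r}\frac{\partial}{\partial r}\,r$, and it still satisfies $[r,p_r]=i\hbar$ because the $\frac{1}{r}$ term commutes with $r$.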

For example, in this paper, equation 3.2.4:
http://users.ece.gatech.edu/~alan/ECE6451/Lectures/StudentLectures/Brown_3p2_HydrogenAtom.pdf

You can do so, and it may be useful, especially in compound expressions, however, ##p_r## itself is not self-adjoint.
