# Commutators and traces

I read the following as a model solution to a question, but I don't understand it: "There is no possible finite-dimensional representation of the operators x and p that can reproduce the commutator [x,p] = iħ (identity matrix), since the LHS has zero trace and the RHS has finite trace." My questions are: what has the trace got to do with it? Also, an infinite-dimensional identity matrix will not have zero trace either, so how is the equation satisfied in that case?
Thanks

vanhees71
Gold Member
In finite-dimensional vector spaces the trace of an operator always exists, and for any two operators ##\mathrm{Tr}(A B)=\mathrm{Tr}(B A)## or, equivalently, ##\mathrm{Tr}([A,B])=0##. This immediately leads to a contradiction when applied to the Heisenberg algebra, ##[x,p]=\mathrm{i} \mathbb{1}##, if you assume that it is realized on a finite-dimensional Hilbert space: the left-hand side has trace 0, while the right-hand side has trace ##\mathrm{i} n \neq 0## in ##n## dimensions.

In the separable Hilbert space of infinite dimension, there's no such contradiction, because the unit operator has no trace.
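
As a quick numerical illustration of the finite-dimensional statement (a sketch only; the dimension `n` and the random matrices are arbitrary choices made here, not part of the argument above), one can check that the trace of a commutator vanishes up to rounding, while the trace of ##\mathrm{i} \mathbb{1}## is ##\mathrm{i} n##:

```python
import numpy as np

def commutator(a, b):
    """Return the matrix commutator [a, b] = ab - ba."""
    return a @ b - b @ a

n = 5  # arbitrary finite dimension, chosen for illustration
rng = np.random.default_rng(0)

# Two arbitrary complex n x n matrices
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

trace_of_commutator = np.trace(commutator(A, B))  # zero up to rounding error
trace_of_i_identity = np.trace(1j * np.eye(n))    # equals i * n, never zero

print(abs(trace_of_commutator), trace_of_i_identity)
```

Whatever matrices are substituted for `A` and `B`, the first number stays at rounding level, so no finite-dimensional pair can reproduce the right-hand side.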

So for the equation to be valid, the trace must be the same on both sides? The commutator has zero trace, but wouldn't an infinite-dimensional identity matrix have an infinite trace?

vanhees71
Gold Member
That's the point. The trace operation has a limited domain, and if the commutator is not in that domain, it's simply not defined. In infinite-dim. vector spaces you thus can not prove that the trace of any commutator necessarily vanishes. That's precisely what happens in the case of the Heisenberg commutation relation. Note that also position and momentum operators have a limited domain, i.e., they are only defined on a dense subspace and not on the entire Hilbert space. However, you can extend the definition of the pre-unitary operators induced by the self-adjoint operators (e.g., for momentum you get a spatial translation operator) to unitary operators on the entire Hilbert space.

In infinite-dim space, doesn't the commutator still have zero trace, as Tr(AB)=Tr(BA)? I thought the model solution implied that the HUP is not valid for finite-dim spaces but valid for infinite-dim spaces?

vanhees71
Gold Member
The trace of the unit operator doesn't exist in infinite-dimensional Hilbert space. The trace of the commutator ##[x,p]## is not defined at all and that's why it cannot be 0.

I'm getting confused now. Is the x,p commutator relation satisfied anywhere ? It seems it isn't satisfied with finite dim or infinite dim ?

vanhees71
Gold Member
Well, look at the concrete realization of ##x## and ##p## on the Hilbert space ##L^2(\mathbb{R},\mathbb{C})## ("wave mechanics"). You have
$$\hat{x} \psi(x)=x \psi(x), \quad \hat{p} \psi(x)=-\mathrm{i} \partial_x \psi(x).$$
Both operators are self-adjoint with a domain that is certainly not the entire Hilbert space but a dense subspace (like the Schwartz space of rapidly decreasing ##C^{\infty}## functions). Their spectrum is entirely continuous and is the whole real line ##\mathbb{R}##. The "eigenvectors" lie in the dual of their domain, which is much larger than ##L^2##. They are distributions rather than ordinary functions,
$$u_{x_0}(x)=\delta(x-x_0), \quad u_p(x)=\frac{1}{\sqrt{2 \pi}} \exp(\mathrm{i} p x).$$
The commutator is
##[\hat{x},\hat{p}]=\mathrm{i} \mathbb{1},##
and none of the traces of ##\hat{x}##, ##\hat{p}##, ##\hat{x} \hat{p}##, ##\hat{p} \hat{x}##, or ##[\hat{x},\hat{p}]## exists.

For a more mathematically rigorous treatment, see, e.g., the books by Galindo and Pascual.

Some of this is going over my head. I know that the commutation relation is not satisfied for finite-dim spaces, as I did an example which showed that, and I know that in finite dim the commutator has zero trace and the identity matrix has finite trace. I know x and p are infinite-dim in reality, and in infinite dim traces do not exist. Am I right so far? If yes, then is the commutator satisfied exactly in infinite dim?

dextercioby
Homework Helper
It is not exactly that the traces don't exist: they are either undefined or their value depends on the basis chosen, so that is all that can be said about their commutator. The first option is useless, and QM is a pragmatic theory.

strangerep
I'm getting confused now. Is the x,p commutator relation satisfied anywhere ? It seems it isn't satisfied with finite dim or infinite dim ?
You're confronting (one of) the problems that make a framework larger than ordinary Hilbert space desirable. That framework is called "Rigged Hilbert Space".

Ch1 of Ballentine gives a gentle introduction.

I came across the following question, which requires a true or false answer: "For any integer N ≥ 2 there exist N x N matrices X and P such that [X, P] = iħ (identity matrix)."
From what has been said above, I hope the correct answer is FALSE. Can somebody confirm for me that I am correct? Thanks

samalkhaiat
For finite N, it is false. See below.
The set of all $n \times n$ matrices with entries in $\mathbb{C}$ forms a (finite-dimensional) vector space, $M_{n} ( \mathbb{C} )$, of dimension $n^{2}$. The trace operation is a functional over $M_{n} ( \mathbb{C} )$ which satisfies:
i) Linearity: for all $A , B \in M_{n} ( \mathbb{C} )$ and $\alpha , \beta \in \mathbb{C}$, $$\mbox{Tr} ( \alpha A + \beta B ) = \alpha \mbox{Tr} A + \beta \mbox{Tr} B ,$$
ii) Cyclicity: for all $A , B , C \in M_{n} ( \mathbb{C} )$, $$\mbox{Tr} (A B C ) = \mbox{Tr} ( C A B ) = \mbox{Tr} ( B C A ) .$$
Now, let $A_{1}$ and $A_{2}$ be any matrices such that $A_{1} A_{2} \in M_{n} ( \mathbb{C} )$ and $A_{2} A_{1} \in M_{n} ( \mathbb{C} )$; then (i) and (ii) imply that $$\mbox{Tr} ( A_{1} A_{2} - A_{2} A_{1} ) = \mbox{Tr} ( A_{1} A_{2} ) - \mbox{Tr} ( A_{2} A_{1} ) = 0 .$$ This means that the equation $$A B - B A = \mathbb{I}_{n}$$ has no solution: $$\mbox{Tr} ([ A , B ]) = 0 < \mbox{Tr} ( \mathbb{I}_{n} ) = n .$$
Now, let us consider infinite-dimensional matrices, i.e. matrices with an infinite but countable number of rows and columns. Let us also assume that multiplication of such matrices is sensible, i.e. for any two such matrices $(A , B)$, the series $(A B)_{i j} = \sum_{k = 0}^{\infty} A_{i k} B_{k j}$ and $(B A)_{i j} = \sum_{k = 0}^{\infty} B_{i k} A_{k j}$ converge for all indices $i , j$. In this sense we can speak of the infinite-dimensional space $M_{\infty} ( \mathbb{C} )$. Let us now ask a similar question as in the finite-dimensional case, namely, is the equation $$A B - B A = \mathbb{I}_{\infty} , \ \ \ \ \ \ (1)$$ solvable in $M_{\infty} (\mathbb{C})$? The answer is yes. As you can easily check, the following (infinite-dimensional) matrices do satisfy Eq(1) (rows and columns are labelled by $m , n = 0 , 1 , 2 , \dots$):
$$A_{m n} = \sqrt{n} \ \delta_{m , n - 1} , \qquad A = \begin{pmatrix} 0 & \sqrt{1} & 0 & 0 & \cdots \\ 0 & 0 & \sqrt{2} & 0 & \cdots \\ 0 & 0 & 0 & \sqrt{3} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix} ,$$
$$B_{m n} = ( A^{\dagger} )_{m n} = \sqrt{n + 1} \ \delta_{m , n + 1} , \qquad B = A^{\dagger} = \begin{pmatrix} 0 & 0 & 0 & \cdots \\ \sqrt{1} & 0 & 0 & \cdots \\ 0 & \sqrt{2} & 0 & \cdots \\ 0 & 0 & \sqrt{3} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} .$$
These, as you might know, are an infinite-dimensional matrix representation of the operator algebra $[ a , a^{\dagger} ] = 1$ of the SHO in the energy eigenstates. Furthermore, you can also check that the following Hermitian combinations $$X = \frac{1}{\sqrt{2}} ( A^{\dagger} + A ) , \ \ \ P = \frac{i}{\sqrt{2}} ( A^{\dagger} - A ) ,$$ satisfy $$[X , P] = i [A , A^{\dagger}] = i \mathbb{I}_{\infty} .$$ These are the well-known matrix representations of the position and momentum operators in the energy eigenstates of the simple harmonic oscillator.
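
The finite-dimensional obstruction can also be seen directly on these matrices (a minimal sketch; the truncation size `N` and the helper `lowering_matrix` are illustrative names introduced here). Cutting $A$ down to its top-left $N \times N$ block gives a commutator that agrees with the identity everywhere except in the last diagonal entry, which is $-(N-1)$, so the trace is still zero:

```python
import numpy as np

def lowering_matrix(size):
    """size x size truncation of A, with A[m, m+1] = sqrt(m+1), indices from 0."""
    return np.diag(np.sqrt(np.arange(1, size)), k=1)

N = 6  # hypothetical truncation size
A = lowering_matrix(N)
A_dag = A.conj().T

comm = A @ A_dag - A_dag @ A   # truncated version of [A, A^dagger]
diagonal = np.diag(comm)       # approximately [1, 1, ..., 1, -(N-1)]

print(diagonal, np.trace(comm))
```

Only in the limit $N \to \infty$ does the compensating entry get pushed away, which is why Eq(1) can hold for the infinite matrices but for no finite truncation.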
The definition of the trace operation in $M_{\infty} ( \mathbb{C} )$ is a bit technical. However, if the infinite sum of “diagonal elements” converges, we can define the trace by $\mbox{Tr} ( M ) = \sum_{i = 0}^{\infty} M_{i i}$. Now, we can make the following observation: the matrix $A^{\dagger} A$, which is in $M_{\infty} ( \mathbb{C} )$, is non-negative and Hermitian. Therefore, using the expressions above for $(A , A^{\dagger})$ and our definition for the trace, we find (according to Euler) $$\mbox{Tr} ( A^{\dagger} A) = \mbox{Tr} ( A A^{\dagger} ) = \sum_{k = 1}^{\infty} k = - \frac{1}{12}.$$
Thus, on one hand $$\mbox{Tr} ( A A^{\dagger} ) - \mbox{Tr} ( A^{\dagger} A ) = 0 ,$$ on the other hand $$\mbox{Tr} ( A A^{\dagger} - A^{\dagger} A ) = \mbox{Tr} ( \mathbb{I}_{\infty}) = 1 + 1 + 1 + \cdots .$$
Since everybody (including Euler) agrees that the infinite sum $( 1 + 1 + 1 + \cdots )$ diverges, we conclude that $$\mbox{Tr} ( A A^{\dagger} - A^{\dagger} A ) \neq \mbox{Tr}( A A^{\dagger} ) - \mbox{Tr}( A^{\dagger} A ) .$$ This means that “our definition of the trace” cannot be a linear functional on $M_{\infty}( \mathbb{C} )$, which seems very bizarre.
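
One way to see where the mismatch enters (a sketch that uses only the explicit diagonals of the matrices above, with rows and columns counted from 0; the cutoff $N$ is introduced here for illustration) is to stop every diagonal sum after the first $N$ entries. Since $(A A^{\dagger})_{k k} = k + 1$ and $(A^{\dagger} A)_{k k} = k$,
$$\sum_{k = 0}^{N - 1} ( A A^{\dagger} )_{k k} - \sum_{k = 0}^{N - 1} ( A^{\dagger} A )_{k k} = \frac{N ( N + 1 )}{2} - \frac{N ( N - 1 )}{2} = N = \sum_{k = 0}^{N - 1} ( \mathbb{I}_{\infty} )_{k k} .$$
So linearity holds for every finite cutoff, and both sides grow together as $N \to \infty$. The conflict appears only once a finite value is assigned to each divergent sum separately.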

Sam

Thanks for confirming that I am correct. As for infinite dim spaces I give up !

vanhees71
Gold Member
Of course, you should explain how you arrive at the formula
$$\sum_{k=1}^{\infty} k=-\frac{1}{12}.$$
As it stands, it is of course wrong. The usual definition of the sum of a series is the limit of the sequence of partial sums, and that gives ##\infty##.

As for infinite dim spaces I give up !
Do not. That's the meaningful bit.

Hopefully Sam will answer but in the meantime this is my understanding.
You have to use analytic continuation, that is, impose dependence on the Riemann zeta function, in this particular case for s=1, and the particular series 1+2+3+...; in other bases you get other values. This comes at the cost of losing basis-independence of the trace, i.e., incompatibility with linearity, which Sam found bizarre above.

Thanks. I won't give up, but at this moment in time I know nothing about Riemann zeta functions and the like, so I will have to come back to it when I know more. But even some of you advanced guys seem to have different interpretations of it.

Riemann zeta function, in this particular case for s=1
Actually that was meant to be -1, 1 is the pole of the function.

Thanks. I won't give up, but at this moment in time I know nothing about Riemann zeta functions and the like, so I will have to come back to it when I know more. But even some of you advanced guys seem to have different interpretations of it.
I'm sure the advanced guys(I'm certainly not included in that group) will agree on the math.

strangerep
If one is going to start taking traces of operators, those operators must first be of trace class. But ##\mathbb{I}_\infty## is not trace class, so one's attempted manipulations disintegrate right there. No point proceeding further.

Another complication is that expressions like ##AB-BA## only make sense if ##A## and ##B## have compatible domains and ranges. Again, that's why one must move to the larger framework of Rigged Hilbert Spaces.

There's also the subtle issue of: "which topology are you working in?". The problems found so far are partly because one is implicitly working in a strong topology. But consider what happens if we move to weak topology:

Let's adopt the usual representations ##X=x##, ##P= -i\partial_x##, operating on Schwartz space (i.e., functions of rapid decrease). Let ##f,g## be any 2 such Schwartz functions. Then
$$\int\! dx\, g^* \Big( -ix\partial_x f + i\partial_x (x f) \Big) = \int\! dx\, g^* \Big( -ix\partial_x f + ix\partial_x f + i (\partial_x x) f\Big) = \int\! dx\, g^* ( i f ) ~.$$
Since ##f,g## are arbitrary Schwartz functions, we have thus established the CCRs, i.e., ##[X,P]=i##, in weak topology on Schwartz space.
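
The pointwise identity behind this integral, ##(XP - PX) f = i f##, can be checked numerically on a sample Schwartz function. This is only a rough sketch: the Gaussian test function, the grid, and the finite-difference approximation of ##\partial_x## are choices made here for illustration.

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 4001)   # grid with spacing 0.004
f = np.exp(-x**2 / 2)              # sample Schwartz function (assumed test case)

def P(psi):
    """Momentum operator -i d/dx, approximated by finite differences."""
    return -1j * np.gradient(psi, x)

def X(psi):
    """Position operator: multiplication by x."""
    return x * psi

commutator_on_f = X(P(f)) - P(X(f))                    # should approximate i * f
max_error = np.max(np.abs(commutator_on_f - 1j * f))   # discretization error only

print(max_error)
```

The residual shrinks with the grid spacing, as expected if the only error comes from discretizing the derivative.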

-> dyn: As others have said, don't give up. Can you access Ballentine, as I suggested earlier?

vanhees71
Gold Member
Well, ok, but this is a statement that depends on this specific regularization. So one has to argue why one should use this ##\zeta##-function regularization in this case. Indeed there is a similar technique used to renormalize Feynman diagrams in perturbative relativistic QFTs, the heat-kernel (or the closely related Schwinger proper-time) method. There you can justify these methods by analytical arguments about the corresponding Green's functions, which make the renormalized versions uniquely defined, given a renormalization scheme. One can also show that physically relevant (observable) quantities like S-matrix elements are renormalization-scheme independent, which leads to the renormalization-group equations; these again have a clear physical meaning according to K. Wilson's work.

AFAIK the result does not depend on this particular Riemann zeta regularization (one can use cutoff regularization, Ramanujan summation, ...; see http://en.wikipedia.org/wiki/1_+_2_+_3_+_4_+_⋯); it is just formally the most adequate when dealing with complex matrices (analytic continuation is granted).
In fact, one can use the same Riemann zeta regularization to arrive at a finite value for the trace of ##i\mathbb{I}_{\infty}##, since ##\zeta(0)=-1/2## (http://en.wikipedia.org/wiki/1_+_1_+_1_+_1_+_⋯), rendering a contradiction for the commutation equation also in the infinite-dimensional case.

If one is going to start taking traces of operators, those operators must first be of trace class. But ##\mathbb{I}_\infty## is not trace class, so one's attempted manipulations disintegrate right there. No point proceeding further.
The root of the problem is IMO that the operators ##X## and ##P## are indeed not trace-class operators. That does not exactly imply that a trace cannot be defined on them, but more specifically that one cannot define a trace with the additional features of being finite and independent of the choice of basis. If one takes as valid (which is debatable) the analytic continuation discussed above, a trace that is finite, but certainly not independent of the choice of basis, is defined for ##X## and ##P##.
The case of ##i\mathbb{I}_{\infty}## seems different: again, if we take as valid the convergence to -1/2, it seems to be independent of the choice of basis, and it would then still be a trace-class operator.
It seems it all rests upon the validity of the sums obtained by analytic continuation. I surely don't know if they are valid, but similar manipulations are routinely done in QFT; then again, there the physical justification lies in the great results obtained with such manipulations when compared with experiment.

I fail to see the direct relation of anything said in this thread to rigged HS. The discussion didn't enter into the link between distributions and square-integrability; it is at a stage well before that issue arises, if only because constructing a continuous dual space requires guaranteeing the existence of linear functionals from the space into the complex field.

vanhees71
Gold Member
Well, that may all well be true, but in the case discussed here it's totally unclear to me why it should be a "natural thing" to use zeta-function regularization or a cutoff regularization or whatever to define these traces. Take, e.g., a simple regularization
$$f(z)=\sum_{n=0}^{\infty} n \exp(-z n),$$
which converges obviously for ##\mathrm{Re} z>0##. It's easy to give the analytic expression by introducing the generating function
$$g(z)=\sum_{n=0}^{\infty} \exp(-z n)=\frac{1}{1-\exp(-z)}.$$
Obviously we have
$$f(z)=-g'(z)=\frac{\exp(-z)}{[1-\exp(-z)]^2}.$$
Now you can Laurent expand this around ##z=0##, leading to
$$f(z)=\frac{1}{z^2}-\frac{1}{12} + \mathcal{O}(z^2).$$
To give the trace, discussed in Posting #14 a meaning, you have to subtract the singular term ##1/z^2## and then take ##z \rightarrow 0##, but why should this "renormalization" be justified from any physics considerations in this case? At least, I can't make sense of such a subtraction.
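
The expansion can be checked numerically. The following is a small sketch in which the value of `z`, the number of terms, and the helper names `f_closed` and `f_series` are all illustrative choices made here. It compares the regulated series with its closed form and then subtracts the singular term ##1/z^2## by hand:

```python
import math

def f_closed(z):
    """Closed form exp(-z) / (1 - exp(-z))**2 of the regulated sum."""
    return math.exp(-z) / math.expm1(-z) ** 2

def f_series(z, terms):
    """Partial sum of n * exp(-z * n) for n = 0 .. terms - 1."""
    return sum(n * math.exp(-z * n) for n in range(terms))

z = 0.01
closed = f_closed(z)
series = f_series(z, 10_000)        # the neglected tail is negligible here
finite_part = closed - 1.0 / z**2   # what remains after the subtraction

print(series - closed, finite_part, -1 / 12)
```

The remainder approaches ##-1/12## only because the ##1/z^2## term was removed explicitly, which is exactly the step whose justification is being questioned.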

In the field-theory cases where a similar series occurs, e.g., in the usual simplest treatment of the Casimir effect for two infinitely large metallic plates, the necessary subtraction makes perfect physical sense, because you have to subtract the vacuum fluctuations in free space, which are also divergent integrals. Using the same (cutoff or zeta-function or heat-kernel or whatever other) regularization, you find a unique answer and, in particular, a reason why the divergent pieces should be subtracted in whatever regularization you use.

As far as I can see, there's no physical meaning to define the trace of the unit operator or that of non-trace-class operators like ##\hat{x}##, ##\hat{p}##, ##\hat{x} \hat{p}##, etc. using whatever regularization technique to "sum" the divergent series.

What is, however, very clear is that the Heisenberg algebra cannot be realized exactly in any finite-dimensional Hilbert space, and the trace argument is very valid, leading to a contradiction, as discussed in this thread.

I share your concerns wrt the physical justification of those traces, other than that in physics it is in general preferable to have finite rather than infinite results. Of course, in this particular case defining these finite traces for infinite-dimensional HS has important consequences, which is probably the reason why, even though these methods of obtaining finite traces have been known for years, they are not applied to the commutation equation; but this is probably outside the scope of this forum.

Mathematically, the manipulation here is on par with those done in many different calculations in QFT, from the Casimir effect to vacuum energy density to Hawking radiation in curved spacetime, or anything for which you might use dimensional regularization (which, according to Wikipedia, is equivalent to RZ regularization).
I think the zeta regularization is cleaner than your example and doesn't involve that subtraction (it has no counterterms), but I'm surely no expert. I know zeta function regularization is routinely used to define traces and determinants of self-adjoint operators.
It is important to make clear that the traces defined this way are not the usual sense traces from linear algebra, they are highly nonlinear as all these regularizations are clearly nonlinear.
