Commutators, Traces & [x,p] = I(hbar) Explained

In summary: the thread discusses the impossibility of finding a finite-dimensional representation of the operators x and p that reproduces the commutator [x,p] = iℏ·I (identity matrix). Because the trace is a linear functional with Tr(AB) = Tr(BA), the trace of any commutator of finite matrices is zero, whereas Tr(I_n) = n for the n × n identity matrix, so the right-hand side has nonzero trace. In infinite dimensions the contradiction disappears because the identity operator is not of trace class.
  • #1
dyn
I read the following as a model solution to a question but I don't understand it: "There is no possible finite-dimensional representation of the operators x and p that can reproduce the commutator [x,p] = iℏ·I (identity matrix), since the LHS has zero trace and the RHS has nonzero trace." My questions are: what has the trace got to do with it? Also, an infinite-dimensional identity matrix will not have zero trace either, so how is the equation satisfied in that case?
Thanks
 
  • #2
In finite-dimensional vector spaces the trace of an operator always exists, and for any two operators ##\mathrm{Tr} (A B)=\mathrm{Tr} (B A)## or, equivalently, ##\mathrm{Tr}([A,B])=0##. This immediately leads to a contradiction when applied to the Heisenberg algebra, ##[x,p]=\mathrm{i} \mathbb{1}##, if you assume that it is realized on a finite-dimensional Hilbert space.

In the separable Hilbert space of infinite dimension, there's no such contradiction, because the unit operator has no trace.
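
The following is a small numerical illustration of this trace argument (my own sketch, not part of the post; numpy and the matrix size n = 5 are arbitrary choices, with ##\hbar=1##):

```python
# For any two finite square matrices, Tr(AB) = Tr(BA), so every commutator
# is traceless, while Tr(i*I_n) = i*n != 0.  Hence [X, P] = i*I_n has no
# finite-dimensional solution.
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

commutator = A @ B - B @ A
print(np.trace(commutator))       # ~0 up to rounding error
print(np.trace(1j * np.eye(n)))   # i*n, which can never match
```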
 
  • #3
So, for the equation to be valid, the trace must be the same on both sides? The commutator has zero trace, but wouldn't an infinite-dimensional identity matrix have an infinite trace?
 
  • #4
That's the point. The trace operation has a limited domain, and if the commutator is not in that domain, it's simply not defined. In infinite-dimensional vector spaces you thus cannot prove that the trace of any commutator necessarily vanishes. That's precisely what happens in the case of the Heisenberg commutation relation. Note that the position and momentum operators also have a limited domain, i.e., they are only defined on a dense subspace and not on the entire Hilbert space. However, you can extend the definition of the pre-unitary operators induced by the self-adjoint operators (e.g., for momentum you get a spatial translation operator) to unitary operators on the entire Hilbert space.
 
  • #5
In an infinite-dimensional space doesn't the commutator still have zero trace, since Tr(AB)=Tr(BA)? I thought the model solution implied that the HUP is not valid for finite-dimensional spaces but valid for infinite-dimensional spaces?
 
  • #6
The trace of the unit operator doesn't exist in infinite-dimensional Hilbert space. The trace of the commutator ##[x,p]## is not defined at all and that's why it cannot be 0.
 
  • #7
I'm getting confused now. Is the x,p commutator relation satisfied anywhere ? It seems it isn't satisfied with finite dim or infinite dim ?
 
  • #8
Well, look for the concrete realization of ##x## and ##p## on the Hilbert space ##L^2(\mathbb{R},\mathbb{C})## ("wave mechanics"). You have
$$\hat{x} \psi(x)=x \psi(x), \quad \hat{p} \psi(x)=-\mathrm{i} \partial_x \psi(x).$$
Both operators are self-adjoint with a domain that is for sure not the entire Hilbert space but a dense subspace (like Schwartz's space of rapidly falling ##C^{\infty}## functions). Their spectrum is entirely continuous and covers the whole real line ##\mathbb{R}##. The "eigenvectors" live in the dual of their domain, which is much larger than ##L^2##. They are distributions rather than proper functions,
$$u_{x_0}(x)=\delta(x-x_0), \quad u_p(x)=\frac{1}{\sqrt{2 \pi}} \exp(\mathrm{i} p x).$$
The commutator is
##[\hat{x},\hat{p}]=\mathrm{i} \mathbb{1},##
and none of the traces of ##\hat{x}##, ##\hat{p}##, ## \hat{x} \hat{p}##, ##\hat{p} \hat{x}##, or ##[\hat{x},\hat{p}]## exists.

For a more mathematically rigorous treatment, see, e.g., the books by Galindo and Pascual.
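
As a quick symbolic check of this realization (my own illustrative sketch, not from the post, with ##\hbar=1##), one can verify ##[\hat{x},\hat{p}]\psi = \mathrm{i}\psi## on a generic smooth function:

```python
# Acting with x*(-i d/dx) - (-i d/dx)*x on an arbitrary function psi(x)
# returns i*psi(x), i.e. [x, p] = i on this domain (hbar = 1).
import sympy as sp

x = sp.symbols('x', real=True)
psi = sp.Function('psi')(x)

x_hat = lambda f: x * f                   # position operator
p_hat = lambda f: -sp.I * sp.diff(f, x)   # momentum operator

commutator = x_hat(p_hat(psi)) - p_hat(x_hat(psi))
print(sp.simplify(commutator))            # -> I*psi(x)
```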
 
  • #10
Some of this is going over my head. I know that the commutation relation is not satisfied for finite-dimensional spaces, as I worked through an example which showed that, and I know that in finite dimensions the commutator has zero trace while the identity matrix has nonzero trace. I know x and p are infinite-dimensional in reality, and in infinite dimensions these traces do not exist. Am I right so far? If yes, then is the commutator satisfied exactly in infinite dimensions?
 
  • #11
dyn said:
Some of this is going over my head. I know that the commutation relation is not satisfied for finite-dimensional spaces, as I worked through an example which showed that, and I know that in finite dimensions the commutator has zero trace while the identity matrix has nonzero trace. I know x and p are infinite-dimensional in reality, and in infinite dimensions these traces do not exist. Am I right so far? If yes, then is the commutator satisfied exactly in infinite dimensions?
It is not exactly that the traces don't exist; they are either undefined or their value depends on the basis chosen, so that's all that can be said of their commutator. The first option is useless, and QM is a pragmatic theory.
 
  • #12
dyn said:
I'm getting confused now. Is the x,p commutator relation satisfied anywhere ? It seems it isn't satisfied with finite dim or infinite dim ?
You're confronting (one of) the problems that make a framework larger than ordinary Hilbert space desirable. That framework is called "Rigged Hilbert Space".

Ch1 of Ballentine gives a gentle introduction.
 
  • #13
I came across the following question which requires a true or false answer: "For any integer N ≥ 2 there exist N × N matrices X and P such that [X, P] = iℏ·I (identity matrix)."
From what has been said above I hope the correct answer is FALSE. Can somebody confirm for me that I am correct? Thanks
 
  • #14
dyn said:
I came across the following question which requires a true or false answer: "For any integer N ≥ 2 there exist N × N matrices X and P such that [X, P] = iℏ·I (identity matrix)."
From what has been said above I hope the correct answer is FALSE. Can somebody confirm for me that I am correct? Thanks
For finite N, it is false. See below.
The set of all [itex]n \times n[/itex] matrices with entries in [itex]\mathbb{C}[/itex] forms a (finite-dimensional) vector space, [itex]M_{n} ( \mathbb{C} )[/itex], of dimension [itex]n^{2}[/itex]. The trace operation is a functional over [itex]M_{n} ( \mathbb{C} )[/itex] which satisfies:
i) Linearity: for all [itex](A , B , C) \in M_{n} ( \mathbb{C} )[/itex] and [itex]( \alpha , \beta ) \in \mathbb{C}[/itex], [tex]\mbox{Tr} ( \alpha A + \beta B ) = \alpha \mbox{Tr} A + \beta \mbox{Tr} B ,[/tex]
ii) Cyclicity: [tex]\mbox{Tr} (A B C ) = \mbox{Tr} ( C A B ) = \mbox{Tr} ( B C A ) .[/tex]
Now, let [itex]A_{1}[/itex] and [itex]A_{2}[/itex] be any matrices such that [itex]A_{1} A_{2} \in M_{n} ( \mathbb{C} )[/itex] and [itex]A_{2} A_{1} \in M_{n} ( \mathbb{C} )[/itex]; then (i) and (ii) imply that [tex]\mbox{Tr} ( A_{1} A_{2} - A_{2} A_{1} ) = \mbox{Tr} ( A_{1} A_{2} ) - \mbox{Tr} ( A_{2} A_{1} ) = 0 .[/tex] This means that the equation [tex]A B - B A = \mathbb{I}_{n}[/tex] has no solution: [tex]\mbox{Tr} ([ A , B ]) = 0 < \mbox{Tr} ( \mathbb{I}_{n} ) = n .[/tex]

Now, let us consider infinite-dimensional matrices, i.e. matrices with an infinite but countable number of rows and columns. Let us also assume that multiplication of such matrices is sensible, i.e. for any two such matrices [itex](A , B)[/itex], the series [itex](A B)_{i j} = \sum_{k}^{\infty} A_{i k} B_{k j}[/itex] and [itex](B A)_{i j} = \sum_{k}^{\infty} B_{i k} A_{k j}[/itex] converge for all indices [itex](i,j)[/itex]. In this sense we can speak of the infinite-dimensional space [itex]M_{\infty} ( \mathbb{C} )[/itex]. Let us now ask a question similar to the finite-dimensional one, namely, is the equation [tex]A B - B A = \mathbb{I}_{\infty} \ \ \ \ \ \ (1)[/tex] solvable in [itex]M_{\infty} (\mathbb{C})[/itex]? The answer is yes. As you can easily check, the following (infinite-dimensional) matrices do satisfy Eq. (1), with matrix indices [itex]m , n = 0, 1, 2, \cdots[/itex]:
[tex]
A_{m n} = \sqrt{n} \ \delta_{m , n - 1} , \qquad A = \begin{pmatrix}
0 & \sqrt{1} & 0 & 0 & \cdots \\
0 & 0 & \sqrt{2} & 0 & \cdots \\
0 & 0 & 0 & \sqrt{3} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix} ,
[/tex]
[tex]
B_{m n} = (A^{\dagger})_{m n} = \sqrt{n + 1} \ \delta_{m , n + 1} , \qquad B = A^{\dagger} = \begin{pmatrix}
0 & 0 & 0 & \cdots \\
\sqrt{1} & 0 & 0 & \cdots \\
0 & \sqrt{2} & 0 & \cdots \\
0 & 0 & \sqrt{3} & \cdots \\
\vdots & \vdots & \vdots & \ddots \end{pmatrix} .
[/tex] These, as you might know, are the infinite-dimensional matrix representation of the operator algebra [itex][ a , a^{\dagger} ] = 1[/itex] of the SHO in the energy eigenstates. Furthermore, you can also check that the following Hermitian combinations [tex]X = \frac{1}{\sqrt{2}} ( A^{\dagger} + A ) , \ \ \ P = \frac{i}{\sqrt{2}} ( A^{\dagger} - A ) ,[/tex] satisfy [tex][X , P] = i [A , A^{\dagger}] = i \mathbb{I}_{\infty} .[/tex] These are the well-known matrix representations of the position and momentum operators in the energy eigenstates of the simple harmonic oscillator.
The definition of the trace operation in [itex]M_{\infty} ( \mathbb{C} )[/itex] is a bit technical. However, if the infinite sum of “diagonal elements” converges, we can define the trace by [itex]\mbox{Tr} ( M ) = \sum^{\infty} M_{i i}[/itex]. Now, we can make the following observation: the matrix [itex]A^{\dagger} A[/itex], which is in [itex]M_{\infty} ( \mathbb{C} )[/itex], is non-negative and Hermitian. Therefore, using the expressions above for [itex](A , A^{\dagger})[/itex] and our definition for the trace, we find (according to Euler) [tex]\mbox{Tr} ( A^{\dagger} A) = \mbox{Tr} ( A A^{\dagger} ) = \sum_{k = 1}^{\infty} k = - \frac{1}{12}.[/tex] Thus, on one hand [tex]\mbox{Tr} ( A A^{\dagger} ) - \mbox{Tr} ( A^{\dagger} A ) = 0 ,[/tex] while on the other hand [tex]\mbox{Tr} ( A A^{\dagger} - A^{\dagger} A ) = \mbox{Tr} ( \mathbb{I}_{\infty}) = 1 + 1 + 1 + \cdots .[/tex] Since everybody (including Euler) agrees that the infinite sum [itex]( 1 + 1 + 1 + \cdots )[/itex] diverges, we conclude that [tex]\mbox{Tr} ( A A^{\dagger} - A^{\dagger} A ) \neq \mbox{Tr}( A A^{\dagger} ) - \mbox{Tr}( A^{\dagger} A ) .[/tex] This means that “our definition of the trace” cannot be a linear functional on [itex]M_{\infty}( \mathbb{C} )[/itex], which seems very bizarre.

Sam
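
A simple way to see the finite-dimensional no-go at work is to truncate these ladder matrices (my own numerical sketch, not part of the post; numpy and the cutoff N = 6 are arbitrary choices):

```python
# The N x N truncation of the ladder matrices above: [A, A^dagger] equals
# the identity everywhere except the last diagonal entry, which is -(N-1),
# so the trace is exactly 0 rather than N.
import numpy as np

N = 6
A = np.diag(np.sqrt(np.arange(1, N)), k=1)   # truncated annihilation operator
Adag = A.conj().T                            # truncated creation operator

comm = A @ Adag - Adag @ A
print(np.round(comm, 10))    # diag(1, 1, ..., 1, -(N-1))
print(np.trace(comm))        # 0.0
```

The defect sits in the bottom-right corner and grows with N; it never disappears for any finite truncation, which is just the trace argument in concrete form.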
 
  • #15
Thanks for confirming I was correct. As for infinite-dimensional spaces, I give up!
 
  • #16
Of course, you should explain how you came to the formula
$$\sum_{k=1}^{\infty} k=-\frac{1}{12}.$$
As it stands, it's of course wrong. The usual definition of the sum of a series is the limit of the sequence of partial sums, and that gives ##\infty##.
 
  • #17
dyn said:
As for infinite-dimensional spaces, I give up!
Do not. That's the meaningful bit.
 
  • #18
vanhees71 said:
Of course, you should explain how you came to the formula
$$\sum_{k=1}^{\infty} k=-\frac{1}{12}.$$
As it stands, it's of course wrong. The usual definition of the sum of a series is the limit of the sequence of partial sums, and that gives ##\infty##.
Hopefully Sam will answer but in the meantime this is my understanding.
You have to use analytic continuation, that is, relate the sum to the Riemann zeta function, in this particular case for s=1 and the particular series 1+2+3+...; with other bases you get other regularized values. This comes at the cost of losing basis-independence of the trace, i.e. the incompatibility with linearity that Sam found bizarre above.
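
For what it's worth, here is a small check of the two statements involved (my own illustration; the mpmath library is an arbitrary choice):

```python
# The partial sums of 1 + 2 + 3 + ... diverge, while the analytic
# continuation of the Riemann zeta function assigns the series the
# value zeta(-1) = -1/12.
from mpmath import zeta

print(sum(range(1, 10_001)))   # 50005000 -- the partial sums blow up
print(zeta(-1))                # -0.0833333... = -1/12
```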
 
  • #19
Thanks. I won't give up, but at this moment in time I know nothing about Riemann zeta functions and the like, so I will have to come back to it when I know more; but even some of you advanced guys seem to have different interpretations of it.
 
  • #20
TrickyDicky said:
Riemann zeta function, in this particular case for s=1
Actually that was meant to be s = -1; s = 1 is the pole of the function.

dyn said:
Thanks. I won't give up, but at this moment in time I know nothing about Riemann zeta functions and the like, so I will have to come back to it when I know more; but even some of you advanced guys seem to have different interpretations of it.
I'm sure the advanced guys (I'm certainly not included in that group) will agree on the math.
 
  • #21
If one is going to start taking traces of operators, those operators must first be of trace class. But ##\mathbb{I}_\infty## is not trace class, so one's attempted manipulations disintegrate right there. No point proceeding further.

Another complication is that expressions like ##AB-BA## only make sense if ##A## and ##B## have compatible domains and ranges. Again, that's why one must move to the larger framework of Rigged Hilbert Spaces.

There's also the subtle issue of: "which topology are you working in?". The problems found so far are partly because one is implicitly working in a strong topology. But consider what happens if we move to weak topology:

Let's adopt the usual representations ##X=x##, ##P= -i\partial_x##, operating on Schwartz space (i.e., functions of rapid decrease). Let ##f,g## be any 2 such Schwartz functions. Then
$$ \int\! dx\, g^* \Big( -ix\partial_x f + i\partial_x x f \Big)
= \int\! dx\, g^* \Big( -ix\partial_x f + ix\partial_x f + i (\partial_x x) f\Big)
= \int\! dx\, g^* ( i f ) ~.
$$Since ##f,g## are arbitrary Schwartz functions, we have thus established the CCRs, i.e., ##[X,P]=i##, in weak topology on Schwartz space.

-> dyn: As others have said, don't give up. Can you access Ballentine, as I suggested earlier?
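
As a concrete version of this matrix-element computation (my own sketch; the two Gaussians are arbitrary Schwartz functions and ##\hbar=1##):

```python
# Verify <g | [X, P] f> = i <g | f> for two explicit (real) Schwartz
# functions by direct integration.
import sympy as sp

x = sp.symbols('x', real=True)
f = sp.exp(-x**2)              # a Schwartz function
g = sp.exp(-(x - 1)**2)        # another (real) Schwartz function, so g* = g

Pf = -sp.I * sp.diff(f, x)             # P f
PXf = -sp.I * sp.diff(x * f, x)        # P (X f)
comm_f = x * Pf - PXf                  # [X, P] f

lhs = sp.integrate(g * comm_f, (x, -sp.oo, sp.oo))    # <g | [X, P] f>
rhs = sp.I * sp.integrate(g * f, (x, -sp.oo, sp.oo))  # i <g | f>
print(sp.simplify(lhs - rhs))                         # -> 0
```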
 
  • #22
Well, ok, but this is a statement that depends on this specific regularization. So one has to argue why one should use this ##\zeta##-function regularization in this case. Indeed there's a similar technique used to renormalize Feynman diagrams in perturbative relativistic QFTs, the heat-kernel (or the closely related Schwinger proper-time) method. There you can justify these methods by using analytical arguments about the corresponding Green's functions, which make the renormalized versions uniquely defined, given a renormalization scheme, and one can show that physically relevant (observable) quantities like S-matrix elements are renormalization-scheme independent, leading to the renormalization-group equations, which again have a clear physical meaning according to K. Wilson's work.
 
  • #23
vanhees71 said:
Well, ok, but this is a statement that depends on this specific regularization. So one has to argue why one should use this ##\zeta##-function regularization in this case. Indeed there's a similar technique used to renormalize Feynman diagrams in perturbative relativistic QFTs, the heat-kernel (or the closely related Schwinger proper-time) method. There you can justify these methods by using analytical arguments about the corresponding Green's functions, which make the renormalized versions uniquely defined, given a renormalization scheme, and one can show that physically relevant (observable) quantities like S-matrix elements are renormalization-scheme independent, leading to the renormalization-group equations, which again have a clear physical meaning according to K. Wilson's work.
AFAIK the result does not depend on this particular Riemann zeta regularization (one can use cutoff regularization, Ramanujan summation, ...; see http://en.wikipedia.org/wiki/1_+_2_+_3_+_4_+_⋯); it is just formally the most adequate when dealing with complex matrices (analytic continuation is granted).
In fact one can use the same Riemann zeta regularization to assign a finite value to the trace of ##i\mathbb{I}_{\infty}##, using ##\zeta(0)=-1/2## (http://en.wikipedia.org/wiki/1_+_1_+_1_+_1_+_⋯), rendering the commutation equation contradictory also in the infinite-dimensional case.

strangerep said:
If one is going to start taking traces of operators, those operators must first be of trace class. But ##\mathbb{I}_\infty## is not trace class, so one's attempted manipulations disintegrate right there. No point proceeding further.
The root of the problem is IMO that the operators ##X## and ##P## are indeed not trace-class operators, but that does not exactly imply that a trace cannot be defined on them; more specifically, it rules out one with the additional features that the trace be finite and independent of the choice of basis. If one takes as valid (which is debatable) the analytic continuation discussed above, a finite trace, but certainly not one independent of the choice of basis, is defined for ##X## and ##P##.
The case of ##i\mathbb{I}_{\infty}## seems different; again, if we take as valid the convergence to -1/2, it seems to be independent of the choice of basis, and it would then still be a trace-class operator.
It seems it all rests upon the validity of the sums obtained by analytic continuation. I surely don't know if they are valid, but similar manipulations are routinely done in QFT; then again, there the physical justification lies in the great results obtained with such manipulations when compared with experiment.

strangerep said:
Another complication is that expressions like ##AB-BA## only make sense if ##A## and ##B## have compatible domains and ranges. Again, that's why one must move to the larger framework of Rigged Hilbert Spaces.

There's also the subtle issue of: "which topology are you working in?". The problems found so far are partly because one is implicitly working in a strong topology. But consider what happens if we move to weak topology:

Let's adopt the usual representations ##X=x##, ##P= -i\partial_x##, operating on Schwartz space (i.e., functions of rapid decrease). Let ##f,g## be any 2 such Schwartz functions. Then
$$ \int\! dx\, g^* \Big( -ix\partial_x f + i\partial_x x f \Big)
= \int\! dx\, g^* \Big( -ix\partial_x f + ix\partial_x f + i (\partial_x x) f\Big)
= \int\! dx\, g^* ( i f ) ~.
$$Since ##f,g## are arbitrary Schwartz functions, we have thus established the CCRs, i.e., ##[X,P]=i##, in weak topology on Schwartz space.

-> dyn: As others have said, don't give up. Can you access Ballentine, as I suggested earlier?
I fail to see the direct relation of anything said in this thread to rigged Hilbert spaces. The discussion didn't enter into the link between distributions and square-integrability; it is at a stage quite prior to reaching that issue, if only because, to construct a continuous dual space, it is first necessary to guarantee the existence of linear functionals from the space into the complex field.
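
The ##\zeta(0)## value cited above is easy to check (my own illustration; mpmath is an arbitrary choice):

```python
# The analytic continuation of the zeta function assigns 1 + 1 + 1 + ...
# the value zeta(0) = -1/2, even though the partial sums diverge.
from mpmath import zeta

print(zeta(0))   # -0.5
```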
 
  • #24
Well, that may all well be true, but in the case discussed here it's totally unclear to me why it should be a "natural thing" to use zeta-function regularization or a cutoff regularization or whatever to define these traces. Take, e.g., a simple regularization
$$f(z)=\sum_{n=0}^{\infty} n \exp(-z n),$$
which converges obviously for ##\mathrm{Re} z>0##. It's easy to give the analytic expression by introducing the generating function
$$g(z)=\sum_{n=0}^{\infty} \exp(-z n)=\frac{1}{1-\exp(-z)}.$$
Obviously we have
$$f(z)=g'(z)=\frac{\exp(-z)}{[1-\exp(-z)]^2}.$$
Now you can Laurent expand this around ##z=0##, leading to
$$f(z)=\frac{1}{z^2}-\frac{1}{12} + \mathcal{O}(z^2).$$
To give the trace discussed in post #14 a meaning, you have to subtract the singular term ##1/z^2## and then take ##z \rightarrow 0##, but why should this "renormalization" be justified by any physics considerations in this case? At least, I can't make sense of such a subtraction.

In the field-theory cases, where a similar series occurs, e.g., in the usual most simple treatment of the Casimir effect for two infinitely large metallic plates, the necessary subtraction makes perfect physical sense, because you have to subtract the vacuum fluctuations in free space, which are also divergent integrals. Using the same (cutoff or zeta-function or heat-kernel or whatever other) regularization you find a unique answer and, in particular, a reason to subtract the divergent pieces in whatever regularization you use.

As far as I can see, there's no physical meaning to define the trace of the unit operator or that of non-trace-class operators like ##\hat{x}##, ##\hat{p}##, ##\hat{x} \hat{p}##, etc. using whatever regularization technique to "sum" the divergent series.

What is, however, very clear is that the Heisenberg algebra cannot be realized exactly in any finite-dimensional Hilbert space, and the trace argument is very valid, leading to a contradiction, as discussed in this thread.
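
The quoted Laurent expansion is straightforward to reproduce symbolically (my own check; sympy is an arbitrary choice):

```python
# f(z) = sum_n n*exp(-z*n) = exp(-z)/(1 - exp(-z))^2 has the expansion
# 1/z**2 - 1/12 + ... near z = 0, so the "finite part" is -1/12.
import sympy as sp

z = sp.symbols('z')
f = sp.exp(-z) / (1 - sp.exp(-z))**2
print(sp.series(f, z, 0, 3))   # leading terms: 1/z**2 - 1/12 + ...
```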
 
  • #25
vanhees71 said:
Well, that may all well be true, but in the case discussed here it's totally unclear to me why it should be a "natural thing" to use zeta-function regularization or a cutoff regularization or whatever to define these traces. Take, e.g., a simple regularization
$$f(z)=\sum_{n=0}^{\infty} n \exp(-z n),$$
which converges obviously for ##\mathrm{Re} z>0##. It's easy to give the analytic expression by introducing the generating function
$$g(z)=\sum_{n=0}^{\infty} \exp(-z n)=\frac{1}{1-\exp(-z)}.$$
Obviously we have
$$f(z)=g'(z)=\frac{\exp(-z)}{[1-\exp(-z)]^2}.$$
Now you can Laurent expand this around ##z=0##, leading to
$$f(z)=\frac{1}{z^2}-\frac{1}{12} + \mathcal{O}(z^2).$$
To give the trace discussed in post #14 a meaning, you have to subtract the singular term ##1/z^2## and then take ##z \rightarrow 0##, but why should this "renormalization" be justified by any physics considerations in this case? At least, I can't make sense of such a subtraction.

In the field-theory cases, where a similar series occurs, e.g., in the usual most simple treatment of the Casimir effect for two infinitely large metallic plates, the necessary subtraction makes perfect physical sense, because you have to subtract the vacuum fluctuations in free space, which are also divergent integrals. Using the same (cutoff or zeta-function or heat-kernel or whatever other) regularization you find a unique answer and, in particular, a reason to subtract the divergent pieces in whatever regularization you use.

As far as I can see, there's no physical meaning to define the trace of the unit operator or that of non-trace-class operators like ##\hat{x}##, ##\hat{p}##, ##\hat{x} \hat{p}##, etc. using whatever regularization technique to "sum" the divergent series.

What is, however, very clear is that the Heisenberg algebra cannot be realized exactly in any finite-dimensional Hilbert space, and the trace argument is very valid, leading to a contradiction, as discussed in this thread.
I share your concerns about the physical justification of those traces, other than that in physics it is in general preferable to have finite rather than infinite results. Of course, in this particular case defining these finite traces for infinite-dimensional Hilbert spaces has important consequences, which is probably the reason why, even though these methods of obtaining finite traces have been known for years, they are not applied to the commutation equation; but this is probably outside the scope of this forum.

Mathematically the manipulation here is on a par with all those done for many different calculations in QFT, from the Casimir effect to the vacuum energy density, to Hawking radiation in curved spacetime, or anything for which you might use dimensional regularization (which according to Wikipedia is equivalent to Riemann zeta regularization).
I think the zeta regularization is cleaner than your example and doesn't involve that subtraction (it has no counterterms), but I'm surely no expert. I know zeta-function regularization is routinely used to define traces and determinants of self-adjoint operators.
It is important to make clear that the traces defined this way are not traces in the usual linear-algebra sense; they are highly nonlinear, as all these regularizations clearly are.
 
  • #26
TrickyDicky said:
Hopefully Sam will answer but in the meantime this is my understanding.
You have to use analytic continuation, that is, relate the sum to the Riemann zeta function, in this particular case for s=1 and the particular series 1+2+3+...; with other bases you get other regularized values. This comes at the cost of losing basis-independence of the trace, i.e. the incompatibility with linearity that Sam found bizarre above.

I see no question to answer, so I just make the following remarks for added clarification:
1) Divergent series are not “the invention of the devil”. Some divergent series can be summed rigorously. For example, the “usual sum” of [itex]1 - 2 + 3 - 4 + \cdots[/itex] diverges, but we can rigorously show that its Abel sum exists and is equal to [itex]1/4[/itex]. Other divergent series need to be regularized: by introducing a suitable cutoff function, one can justify applying the Euler-Maclaurin formula to them.
2) The Euler sum [itex]\sum_{1}^{\infty} k = - 1 / 12[/itex]:
A) In physics, it has passed the test:
(i) The whole derivation of the Casimir force can be summarized by the replacement [itex]\sum^{\infty} k \to ( - 1 / 12 )[/itex].
(ii) In bosonic string theory, the mass of particles in the [itex]J[/itex] excitation is given by [tex]\alpha^{ ’ } m^{2} = J + \frac{D - 2}{2} \sum_{n = 1}^{\infty} n .[/tex] It is also known that the [itex]J = 1[/itex] excitations are massless. Thus, by using the Euler sum in the relation above, we conclude that bosonic string theory is consistent only in space-time of dimension [itex]D = 26[/itex].
B) In mathematics, “all roads take you to Rome”:
(i) Zeta function regularization leads to [tex]\zeta ( - 1 ) = - \frac{1}{12} .[/tex]
(ii) The Heat Kernel Regularization gives you [tex]\lim_{r \to 0} \left( \frac{e^{r}}{( e^{r} - 1)^{2}} - \frac{1}{r^{2}} \right) = \lim_{r \to 0} \left( - \frac{1}{12} + \mathcal{O} ( r^{2} ) \right) = - 1 / 12 .[/tex]
(iii) Ramanujan’s summation method gives us (in terms of Bernoulli’s numbers) [tex]C = - \frac{B_{1}}{1!} f ( 0 ) - \frac{B_{2}}{2!} f^{ ’ } ( 0 ), [/tex] which for our case, [itex]f(x) = x[/itex], leads to [itex]( - 1 /12)[/itex].
3) There is no such thing as “free lunch” : all regularization methods are either not linear or not stable.
4) The main point of my previous post is to highlight point (3): The trace of infinite-dimensional matrices is neither rigorous nor nice: (i) the very notion of diagonal elements loses its meaning. This is why I wrote “diagonal elements”. (ii) the defining properties of the trace in finite dimensions, i.e. linearity and invariance (or cyclicity), cannot both be maintained in infinite dimensions. This is what happened when we “traced” the number operator [itex]A^{\dagger} A[/itex]: the trace was invariant (cyclic) but not linear.

Sam
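
Two of the summation methods listed above are easy to play with numerically (my own sketch; the values of r are arbitrary):

```python
# Abel sum of 1 - 2 + 3 - 4 + ...: evaluate sum (-1)^(n+1) n r^n for r
# just below 1; the limit r -> 1 gives the Abel value 1/4.
# Heat-kernel regularization of 1 + 2 + 3 + ...: the finite part of
# e^r/(e^r - 1)^2 - 1/r^2 as r -> 0 is -1/12.
import math

r = 0.99
print(sum((-1)**(n + 1) * n * r**n for n in range(1, 10_000)))   # ~0.25

r = 1e-3
print(math.exp(r) / (math.exp(r) - 1)**2 - 1 / r**2)             # ~-0.08333 = -1/12
```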
 
  • #27
TrickyDicky said:
Mathematically the manipulation here is on a par with all those done for many different calculations in QFT, from the Casimir effect to the vacuum energy density, to Hawking radiation in curved spacetime, or anything for which you might use dimensional regularization (which according to Wikipedia is equivalent to Riemann zeta regularization).
I think the zeta regularization is cleaner than your example and doesn't involve that subtraction (it has no counterterms), but I'm surely no expert. I know zeta-function regularization is routinely used to define traces and determinants of self-adjoint operators.
It is important to make clear that the traces defined this way are not traces in the usual linear-algebra sense; they are highly nonlinear, as all these regularizations clearly are.
Of course, you can use the ##\zeta##-function regularization as well. Then you simply use the definition of the ##\zeta## function
$$\zeta(z)=\sum_{n=1}^{\infty} n^{-z}$$
and then analytically continue it. This immediately gives the ##\zeta##-regularized sum
$$\sum_{n=1}^{\infty} n \simeq \zeta(-1)=-\frac{1}{12}.$$
My point, however, is that this cannot be interpreted as a meaningful result for the application to the trace discussed here.

In contrast to that, in the QFT examples (any Feynman diagram at zero or finite temperature, closed diagrams giving contributions to the thermodynamic potential, vacuum energies for evaluating the Casimir effect, etc.) you have clearly defined renormalization conditions that make physical sense. Even if some regularization scheme (as the ##\zeta## regularization does for the sum of all integers) gives a finite result, you still have to make the appropriate subtractions, which are well defined by your renormalization scheme, and one can show that these renormalized results within a given scheme are independent of the regularization scheme used. Often you can forget about regularization altogether and just subtract directly the integrands given by your Feynman diagrams and evaluate the then finite integral (BPHZ renormalization).

Now, in the case of the position-momentum commutator there is no such renormalization prescription which makes physical sense of this trace.
 
  • #28
vanhees71 said:
Of course, you can use the ##\zeta##-function regularization as well. Then you simply use the definition of the ##\zeta## function
$$\zeta(z)=\sum_{n=1}^{\infty} n^{-z}$$
and then analytically continue it. This immediately gives the ##\zeta##-regularized sum
$$\sum_{n=1}^{\infty} n \simeq \zeta(-1)=-\frac{1}{12}.$$
My point, however, is that this cannot be interpreted as a meaningful result for the application to the trace discussed here.

In contrast to that, in the QFT examples (any Feynman diagram at zero or finite temperature, closed diagrams giving contributions to the thermodynamic potential, vacuum energies for evaluating the Casimir effect, etc.) you have clearly defined renormalization conditions that make physical sense. Even if some regularization scheme (as the ##\zeta## regularization does for the sum of all integers) gives a finite result, you still have to make the appropriate subtractions, which are well defined by your renormalization scheme, and one can show that these renormalized results within a given scheme are independent of the regularization scheme used. Often you can forget about regularization altogether and just subtract directly the integrands given by your Feynman diagrams and evaluate the then finite integral (BPHZ renormalization).

Now, in the case of the position-momentum commutator there is no such renormalization prescription which makes physical sense of this trace.

Renormalization prescriptions originally had no physical justification other than getting rid of nonsensical results of the original theory; only three decades later was a heuristic justification based on the concept of "effective theories" put forth. In our case what one "subtracts", if you insist, are the divergent traces. The independence of the results from the regularization scheme has already been commented on.
We have agreed that there is no finite-dimensional representation of the canonical commutation relation, using the trace properties. It is perhaps sensible to try and find a trace that makes sense of the canonical commutation formula for linear operators in the infinite-dimensional case, for instance if one decides that the mere absence of contradiction in the undefined-trace case is not formally enough to justify the commutation relation (the OP, for one, insistently asked for a formal justification beyond the absence of contradiction).
Apparently the regularized trace also gives a contradictory position-momentum commutator equation; whether this result is physically meaningful or pure nonsense is not for me to decide.
 
  • #29
The point about the divergences in perturbative relativistic QFTs is that they are an abuse of mathematics rather than a really physically relevant "bug". The reason for the divergences is that one multiplies distribution-valued field operators at the same space-time point and takes expectation values of such undefined products. If you formulate everything in terms of "smeared" field operators, no such divergence problems exist. This is put forward nicely in the textbook by G. Scharf, Finite QED.

Nevertheless, even with perfectly finite results you have to renormalize, because what you do in terms of perturbation theory is also unphysical. You start with, say, non-interacting electrons, neglecting the electromagnetic field produced by them. Then you add the interaction in a perturbative way, but this also brings in the self-fields of the particles, which add to the energy (and mass) of the described particle-like excitation (usually called a "particle" for convenience). The observed mass is of course that of a full electron, including the em. field around it, and this mass is measured in scattering experiments at a certain energy level. The same holds true for the other parameters appearing in the theory, like coupling constants (here the em. fine-structure constant), the wave-function normalization, etc. These free parameters must be defined via a given renormalization scheme. Changing the renormalization scheme doesn't change the physics, which leads to the renormalization-group equations.

The only difference between a Dyson-renormalizable and an "effective" field theory is that in the former case you need to define only a finite number of parameters to fix the renormalization scheme, while you need infinitely many for effective field theories. The latter class of theories is justified by symmetry considerations, which give the only constraints on the infinitely many parameters and thus provide some predictive power (e.g., chiral perturbation theory as an effective theory for hadrons, based on the (accidental) symmetries of QCD in the light-quark sector).
 
  • #30
I tend to agree that the parallelism with renormalization is a red herring. The OP was about the validity of the position-momentum commutator equation. If we decide the trace cannot help us here because it is undefined, the problem remains of how to make sense of the commutator if the product of the infinite-dimensional position and momentum operators is not well defined either (or is it?). I'd like to hear the experts' wisdom on this.
 
  • #31
Why shouldn't the product of operators be well-defined? Of course it is on their domain, which is a proper dense subset of the separable Hilbert space. It's simplest to see in the position representation, where these operators are defined as
$$\hat{x} \psi(x)=x \psi(x), \quad \hat{p} \psi(x)=-\mathrm{i} \mathrm{d}_x \psi(x).$$
These operators are well-defined on, e.g., the Schwartz space of rapidly falling smooth functions, which is dense in the realization of the separable Hilbert space in terms of square-integrable functions, ##L^2(\mathbb{R},\mathbb{C})##, and they are essentially self-adjoint. So they are properly defined as representatives of position and momentum in non-relativistic single-particle quantum theory.
 
  • #32
vanhees71 said:
Why shouldn't the product of operators be well-defined? Of course it is on their domain, which is a proper dense subset of the separable Hilbert space. It's simplest to see in the position representation, where these operators are defined as
$$\hat{x} \psi(x)=x \psi(x), \quad \hat{p} \psi(x)=-\mathrm{i} \mathrm{d}_x \psi(x).$$
These operators are well-defined on, e.g., the Schwartz space of rapidly falling smooth functions, which is dense in the realization of the separable Hilbert space in terms of square-integrable functions, ##L^2(\mathbb{R},\mathbb{C})##, and they are essentially self-adjoint. So they are properly defined as representatives of position and momentum in non-relativistic single-particle quantum theory.
They are well-defined that way by postulate, given that the commutator equation is taken as an axiom of the theory. The whole thread is about trying to specify how the axiom fits into functional analysis before taking it as an axiom; that's the hard part. The textbook answer you give obviously starts from the axiom.
When Born and Jordan first came to derive it in 1925 from Heisenberg's breakthrough that summer, they did it using semiclassical arguments that were later found not to be general, and mathematically it was based on infinite matrices (so the derivation was representation-dependent). That was even before the concept of a linear operator applied to QM had appeared.
 
  • #33
That's why nowadays we prefer to teach quantum theory in the modern way, starting right away with Dirac's formulation.
 
  • #34
Sure, that's quite practical when encountering QM for the first time, but let's say a fellow physicist researching foundations asked about this.
 
  • #35
TrickyDicky said:
Sure, that's quite practical when encountering QM for the first time, but let's say a fellow physicist researching foundations asked about this.

Well, this fellow would do something along the following lines: we look for a Hilbert space H equipped with the CCR. We can see this statement as some abstract "equation" for which we seek solutions. The "solutions" are pairs composed of: i) a concrete Hilbert space; ii) a concrete realization of the CCR in terms of operators on this space. The problem will be solved once we find all the possible concrete solutions. The equivalence of solutions is given by unitary equivalence of the pairs. By inspection, we can see that no such solution exists if the Hilbert space is taken to be finite dimensional (because of the trace argument mentioned earlier). On the other hand, the solution offered in post #31 is a valid solution, so at least we know that solutions exist. A much more serious issue is raised by the fact that one can show that at least one of the operators in the (abstract) CCR cannot be bounded. This forces us to take the domain of the operators into consideration, since two pairs can be made inequivalent just by changing the domain of the operators (i.e., the formal expression of the operators is still the same for both, we just change the domain). A possible solution to this problem is to restrict the kind of solutions we look for, i.e., rather than considering some general solution of the CCR, we look for a solution that arises as the infinitesimal generators of the "exponentiated CCR", i.e., the Weyl *-algebra (based on a finite-dimensional symplectic vector space, since we are dealing with ordinary QM rather than bosonic QFT). There's actually a physical justification for this restriction, since the Weyl relations are equivalent to the Imprimitivity Condition, which captures the notion of the homogeneity of (physical) space.

So, we have posed the problem in a way that takes into consideration the domain difficulties. What's next? we could keep trying to find solutions by inspection or guessing. But that's useless, since there could be an infinite number of inequivalent solutions!

The best way to attack the problem is to formulate it in such a way so that we can apply some powerful theorem that will give us automatically all of the possible solutions of the problem at once (note that, since this process will give us all the solutions, it already has to contain the previous result we got by inspection, namely, that there's no solution in finite dimensions).

There are two important ways to do this. The first is group theoretical. One notices that asking for a realization of the Weyl relations is equivalent to asking for a (unitary, etc.) representation of a certain Lie group (the Heisenberg group, which is the unique connected, simply connected Lie group with the CCR as Lie algebra) that satisfies a certain condition. So, the problem has now been reduced to the mathematical problem of finding all of the possible representations of the Heisenberg group that satisfy a certain condition. The Heisenberg group is a non-compact group, so we cannot use the Peter-Weyl theorem here (and that's good, otherwise we would get that the irreducible representations are necessarily finite dimensional, in contradiction with the previous result by inspection). But the Heisenberg group is a semidirect product of a normal, abelian subgroup and a closed subgroup. And there's a very powerful theorem, called the Mackey Machine, that deals with the representation theory of these types of groups. The theorem states that there's a family of irreducible and inequivalent representations (called induced representations) and that every other abstract or generic irreducible representation is equivalent to some representation in this family. Thus, the problem is solved if we can build this family of induced representations. Fortunately, there's an algorithmic recipe for doing this. In the case of the Heisenberg group, the result is the following: there's only one non-trivial induced representation (given by the exponentiation of the representation of the CCR given in post #31, sometimes called the "Schrödinger representation"). So, up to unitary equivalence of solutions, there's essentially only one solution to the problem (as stated in terms of the Weyl relations). This result is called the Stone-von Neumann theorem. Evidently, since this solution is over an infinite-dimensional space and every other solution is equivalent to this one, no solution exists over a finite-dimensional space. Also, since the solution is over a separable space and every other solution is equivalent to this one, no solution exists over a non-separable space.

The other method relies on the algebraic formulation of QM. By putting a norm on it, the Weyl *-algebra is turned into a C*-algebra. The GNS construction theorem states that any representation of a C*-algebra that satisfies ##\langle f|R(a)f\rangle=\omega(a)## (where ##|f\rangle## is a cyclic vector, ##R## is the representation of the algebra, and ##\omega## is an algebraic state) is equivalent to the GNS construction based on the state ##\omega##. Now, for the case of the finite-dimensional Weyl C*-algebra, one can show that for any abstract (i.e., generic) realization ##R## of the algebra on a Hilbert space, there always exists a cyclic vector ##|f\rangle## such that ##\langle f|R(a)f\rangle=\omega(a)##, where ##\omega## is an algebraic state that does not depend on the particular realization, i.e., it's always the same. So, any realization is equivalent to the GNS construction based on this ##\omega##. It can be shown, of course, that the previous Schrödinger representation satisfies this, and in this way both methods simply prove the same result.
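
A tiny symbolic illustration of the Weyl (exponentiated) form of the CCR used above (my own sketch, with ##\hbar=1##; ##U(s)## and ##V(t)## below are the standard multiplication and translation operators of the Schrödinger representation):

```python
# With (U(s)psi)(x) = exp(i*s*x)*psi(x) and (V(t)psi)(x) = psi(x + t),
# check the Weyl relation V(t) U(s) = exp(i*s*t) U(s) V(t), which is the
# exponentiated version of [X, P] = i.
import sympy as sp

x, s, t = sp.symbols('x s t', real=True)
psi = sp.Function('psi')

U = lambda s, f: sp.exp(sp.I * s * x) * f    # generated by X (multiplication)
V = lambda t, f: f.subs(x, x + t)            # generated by P (translation)

lhs = V(t, U(s, psi(x)))                            # V(t) U(s) psi
rhs = sp.exp(sp.I * s * t) * U(s, V(t, psi(x)))     # e^{ist} U(s) V(t) psi
print(sp.simplify(lhs - rhs))                       # -> 0
```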
 
