What are the key theorems on the mathematics of symmetries in quantum mechanics?

Fredrik · Jun 12, 2011

I would like to study the mathematics of symmetries in QM rigorously. Any recommendations?

Let H be a Hilbert space, U the group of unitary operators on H, L the lattice of closed subspaces of H, and G a symmetry group. I'm particularly interested in theorems about the relationship between group homomorphisms from G into Aut(L) and unitary representations (i.e. group homomorphisms from G into U). I would prefer a proof that's general enough to handle both of the most interesting cases at once. That would be G=the Poincaré group and G=the Galilei group. For the moment, I only care about single-particle theories. So I'm not looking for rigorous theorems about QFTs with interactions.

Most, if not all, of the relevant results are proved in Varadarajan (Geometry of quantum theory). I'm wondering if there's another option, since that book is quite hard to read.

samalkhaiat · Jun 12, 2011

Personally, I believe the following are the best ever

C. A. Orzalesi; Charges and Generators of Symmetry Transformations, Rev. Mod. Phys. 42, 381 (1970).

D. Kastler, D. W. Robinson and A. Swieca; Conserved Currents and Associated Symmetries, Comm. Math. Phys. 2, 108 (1966).

sam

dextercioby · Jun 13, 2011

I think Varadarajan's book should be used as a reference and not as a learning tool. The implementation of symmetries in quantum mechanics is discussed in other places as well, and you may benefit more from reading those.

I believe there's a shortage of sources where authors discuss the subject without having a specific example in mind. Most choose the (restricted) Poincare group for its importance in building a quantum theory of fields. Others choose the Galilei group simply because it's more mathematically interesting and the amount of <literature> is very small compared to the Poincare case.

So I can refer you to the standard (minimal) sources such as Barut & Raczka for a general overview of group theory, then D.J. Simms for the symmetry problem in the case of the Poincare group (a better overview with more functional analysis and less differential geometry is in the second chapter of Bernd Thaller's "The Dirac Equation"), and Cassinelli et al. in the case of the Galilei group.

Of course, an alternative to it all is the second chapter in Weinberg's book (1st volume), which I'm sure you have already seen.

Fredrik · Jun 13, 2011

Thanks guys. I think all of those tips will be useful at some point, but I think they're not quite what I'm looking for right now. (I haven't checked out Simms yet, and only read the abstracts of the two articles Sam referenced. The books that discuss how to find irreducible representations will be useful soon, but not today.)

The theorem that's the most interesting for me right now is the one that describes the relationship between projective representations of a group and unitary representations of its universal covering group (with all the technical assumptions included in the statement of the theorem, and then used in its proof). As I recall, Weinberg's presentation is a good overview, but it's lacking a rigorous statement and proof. Varadarajan's theorem 7.40 looks good, but it's hard to tell how much of the three chapters "Measure theory on G-spaces", "Systems of imprimitivity" and "Multipliers" I need to study before I can understand the theorem and its proof.

strangerep · Jun 14, 2011

Fredrik said:

The theorem that's the most interesting for me right now is the one that describes the relationship between projective representations of a group and unitary representations of its universal covering group (with all the technical assumptions included in the statement of the theorem, and then used in its proof). As I recall, Weinberg's presentation is a good overview, but it's lacking a rigorous statement and proof.

I guess you're referring to Weinberg v1, sect 2.7 (theorem statement starts at
bottom of p83), and his more extensive proof in appendix B to ch2?

Since I haven't studied Weinberg's appendix B in detail, it occurred to me that if you
can't find another reference more to your liking then it might be "fun" to try and
identify the insufficiently-rigorous parts of Weinberg's proof and tighten them up,
(i.e., as an exercise here on PF).

Fredrik · Jun 14, 2011

strangerep said:

I guess you're referring to Weinberg v1, sect 2.7 (theorem statement starts at
bottom of p83), and his more extensive proof in appendix B to ch2?

Since I haven't studied Weinberg's appendix B in detail, it occurred to me that if you
can't find another reference more to your liking then it might be "fun" to try and
identify the insufficiently-rigorous parts of Weinberg's proof and tighten them up,
(i.e., as an exercise here on PF).

That would be fun. I guess I should start by reading appendix B. (I only had a quick look at it years ago.) Even if I find it unsatisfactory, it might make it easier to understand the rigorous proofs later. I'm a bit concerned by the fact that I don't see the words "locally compact", "analytic", "Borel" or "central extension" anywhere in there, while they appear all over the place in Varadarajan. But it's possible that some of those things are only relevant to Varadarajan because he wants to prove theorems that are more general than what we need. For example, when he writes U(xy)=m(x,y)U(x)U(y), the codomain of the function m is an arbitrary locally compact and simply connected abelian group K, not specifically the group of complex numbers with absolute value 1.

vanhees71 · Jun 14, 2011

Bargmann's original paper is also good to read, and for my poor physics purposes it sounds quite strict. Whether pure mathematicians are satisfied with its rigor, I can't judge:

Bargmann, V.: Note on Wigner's Theorem on Symmetry Operations, Journ. Math. Phys. 5, 862, 1964

http://www.staff.science.uu.nl/~henri105/Teaching/CFTclass-Bar64.pdf

dextercioby · Jun 14, 2011

See posts 71-74 from here https://www.physicsforums.com/showthread.php?t=304711&page=5&highlight=Unbounded+operators

samalkhaiat · Jun 14, 2011

Fredrik said:

The theorem that's the most interesting for me right now is the one that describes the relationship between projective representations of a group and unitary representations of its universal covering group

This part of the story had already been told by Wigner and Bargmann in their classic papers:

E. Wigner; On Unitary Representations of the Inhomogeneous Lorentz Group, Ann. Math. 40, 149 (1939).
V. Bargmann; On Unitary Ray Representations of Continuous Groups, Ann. Math. 59, 1 (1954).

(with all the technical assumptions included in the statement of the theorem, and then used in its proof)

For rigorous but readable treatment of cohomological problems in physics, you may look at the following monograph;

Lie groups, Lie algebras, cohomology and some applications in physics
Jose A. De Azcarraga and Jose M. Izquierdo. Camb. Uni. Press (1998).

sam

Fredrik · Jun 14, 2011

vanhees71 said:

Bargmann's original paper is also good to read, and for my poor physics purposes it sounds quite strict. Whether pure mathematicians are satisfied with its rigor, I can't judge:

Bargmann, V.: Note on Wigner's Theorem on Symmetry Operations, Journ. Math. Phys. 5, 862, 1964

http://www.staff.science.uu.nl/~henri105/Teaching/CFTclass-Bar64.pdf

Thanks for the link. That looks quite readable. But I see that this particular paper only covers a part of the material in section IV.3 of Varadarajan. Since that's one of the most readable sections, I think I will study it first.

dextercioby said:

See posts 71-74 from here https://www.physicsforums.com/showthread.php?t=304711&page=5&highlight=Unbounded+operators

I remember seeing that when you posted it. Thanks for the reminder. You are of course right that Wigner's theorem about how orthogonality-preserving permutations of the set of rays correspond to unitary or antiunitary operators (theorem 4.29 in Varadarajan) doesn't have anything to do with groups or their topological properties. I guess I've been confusing a few different issues with each other.

samalkhaiat said:

This part of the story had already been told by Wigner and Bargmann in their classic papers:

E. Wigner; On Unitary Representations of the Inhomogeneous Lorentz Group, Ann. Math. 40, 149 (1939).
V. Bargmann; On Unitary Ray Representations of Continuous Groups, Ann. Math. 59, 1 (1954).

For rigorous but readable treatment of cohomological problems in physics, you may look at the following monograph;

Lie groups, Lie algebras, cohomology and some applications in physics
Jose A. De Azcarraga and Jose M. Izquierdo. Camb. Uni. Press (1998).

Thanks for the tips. I've been thinking that it's usually easier to learn from books than from articles, but this might be a situation where I really should be looking at the original articles. I will have a look as soon as I can.

Fredrik · Jun 15, 2011

I haven't yet read the article by Bargmann that Sam referenced, but I think it will solve most of my problems. Right now I don't have an easy way to access it. I might have to go to a library. If someone wouldn't mind downloading it for me, I'd appreciate it. http://www.jstor.org/pss/1969831. Let me know if you have downloaded it, and I'll PM you my email address.

I should probably explain what I'm trying to do, in case someone knows a particularly nice proof of one of the missing pieces.

The concept of "state" can be defined in a theory-independent way: States are equivalence classes of objects that can be "measured" by measuring devices. If I have understood this approach correctly, there's a lattice (bounded, σ-complete and orthocomplemented) associated with each theory in a class of theories that's large enough to include all the classical theories and all the quantum theories. Let S be the set of states, and let P be the set of probability measures on the lattice. When we're dealing with a theory from this class, there's a function that takes [n members of S, and n numbers in the interval [0,1] that add up to 1] to a member of S. The result can be thought of as a convex combination of states.

P is a convex subset of a vector space. By an "automorphism of P", I mean a bijection from P onto itself that preserves convex combinations. By an "automorphism of S", I mean a bijection from S onto itself that preserves those things that we can think of as convex combinations of states. There's a bijection B from S onto P that "preserves convex combinations" in the sense that it takes the state formed using the numbers [itex]c_1,\dots,c_n[/itex] and states [itex]s_1,...,s_n[/itex] to the convex combination [itex]\sum_i c_iB(s_i)[/itex].

What I've said so far shouldn't be thought of as theorems, but rather as a definition of the class of theories we're dealing with. From this point of view, the most natural way to incorporate an assumption about the properties of spacetime into any theory in this class is to translate it to a statement about automorphisms of the set of states. For example, "space is isotropic" translates to "there's a homomorphism from SO(3) into Aut(S)". Since S is isomorphic to P, that translates to "there's a homomorphism from SO(3) into Aut(P)".

In a quantum theory, the lattice is the set of closed subspaces of a Hilbert space, partially ordered by inclusion (i.e. by [itex]\subset[/itex]). So P is the set of probability measures on that particular lattice. Gleason's theorem says that P is isomorphic to the set of state operators (right?). I will denote that set by N. So we can also translate "space is isotropic" to "there's a homomorphism from SO(3) into Aut(N)".
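The easy direction of that correspondence can be checked numerically: a state operator ρ assigns the number tr(ρP_M) to the closed subspace M with projector P_M, and this assignment is additive over orthogonal subspaces. (Gleason's theorem is the converse statement, and it needs dimension at least 3.) The following is only a finite-dimensional sketch; the function names, the particular ρ and the vectors are illustrative assumptions, not anything taken from Varadarajan.

```python
import numpy as np

def projector(vectors):
    """Orthogonal projector onto the span of the given (linearly independent) vectors."""
    q, _ = np.linalg.qr(np.column_stack(vectors))  # columns of q: orthonormal basis of the span
    return q @ q.conj().T

def measure(rho, proj):
    """Probability assigned to a subspace: mu_rho(M) = tr(rho P_M)."""
    return float(np.trace(rho @ proj).real)

# A hypothetical state operator on C^3 (positive, trace one).
rho = np.array([[0.5, 0.1, 0.0],
                [0.1, 0.3, 0.0],
                [0.0, 0.0, 0.2]])

# Two orthogonal vectors, chosen only for illustration.
v1 = np.array([1.0, 1.0, 1.0])
v2 = np.array([1.0, -1.0, 0.0])

p1 = measure(rho, projector([v1]))
p2 = measure(rho, projector([v2]))
p12 = measure(rho, projector([v1, v2]))  # subspace spanned by both
print(p1, p2, p12)
```

The value for the two-dimensional subspace equals the sum of the values for the two orthogonal rays, and the whole space gets probability 1.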

If I understand theorem IV.33 (page 111) in Varadarajan correctly, each member of Aut(N) is of the form [itex]\rho\mapsto T\rho T^{-1}[/itex], where T is a symmetry operator (i.e. a unitary or antiunitary operator) that's determined up to a phase factor by this relationship.

Let Z be the set of symmetry operators, and instead of SO(3), let's consider an arbitrary group G that satisfies the appropriate technical requirements, whatever they are. What does the existence of a homomorphism [itex]\phi:G\rightarrow\operatorname{Aut}(N)[/itex] imply? Does it imply that there's a projective representation [itex]T:G\rightarrow Z[/itex]? If it does, does that imply that there's a unitary representation [itex]U:G^*\rightarrow Z[/itex], where G* is the universal covering group of G?

What I need are the theorems that complete this picture. What is the relationship between homomorphisms [itex]\phi:G\rightarrow\operatorname{Aut}(N)[/itex] and projective representations [itex]T:G\rightarrow Z[/itex]? What is the relationship between such projective representations of G and unitary representations of its universal covering group? What additional assumption do I need to make about the homomorphism from G into Aut(S) (or Aut(P)) to end up with an irreducible representation of the covering group in the final step?
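A standard toy case of the second question is spin 1/2, restricted here to rotations about a single axis. The sketch below assumes the usual operators U(θ) = exp(-iθσ_z/2) and labels rotations by θ in [0, 2π); the function names are hypothetical. It shows that U(2π) = -1 acts trivially on state operators, and that the operators compose only up to a sign. For this one-parameter subgroup alone the sign could be removed by rephasing; for all of SO(3) it cannot, which is why one passes to the covering group SU(2).

```python
import numpy as np

def rotation_operator(theta):
    """U(theta) = exp(-i theta sigma_z / 2), explicit because sigma_z is diagonal."""
    return np.diag([np.exp(-0.5j * theta), np.exp(0.5j * theta)])

def induced_automorphism(u, rho):
    """The map rho -> U rho U^dagger on state operators."""
    return u @ rho @ u.conj().T

def multiplier(a, b):
    """Phase w such that U(a) U(b) = w * U((a + b) mod 2 pi), for a, b in [0, 2 pi)."""
    lhs = rotation_operator(a) @ rotation_operator(b)
    rhs = rotation_operator((a + b) % (2 * np.pi))
    return (lhs @ np.linalg.inv(rhs))[0, 0]

# An arbitrary state operator, chosen only for illustration.
rho = np.array([[0.7, 0.2 + 0.1j],
                [0.2 - 0.1j, 0.3]])

full_turn = rotation_operator(2 * np.pi)            # equals -1 on the Hilbert space
same_state = induced_automorphism(full_turn, rho)   # but leaves rho unchanged
print(multiplier(1.0, 2.0), multiplier(4.0, 4.0))
```

The multiplier is +1 when a + b stays below 2π and -1 when it wraps around.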

I suspect that the Bargmann article has most of the answers, but if any of you have any insights to share, feel free to do so.

element4 · Jun 15, 2011

You might find http://www.staff.science.uu.nl/~ban00101/lecnotes/repq.pdf by Erik van den Ban useful. I had a course on these topics some years ago where these notes were used. We also used some very old notes by DJ Simms which were very good, but I don't think they contained the proof you are interested in.

Fredrik · Jun 15, 2011

I've been doing some more reading, and it's getting a bit clearer. If S={states}, P={probability measures on the lattice of closed subspaces}, N={state operators}, Z={symmetry operators}, and E={operators of the form c1, where 1 is the identity operator and c is a complex number with |c|=1}, then we have the following isomorphisms: [tex]\operatorname{Aut}(S)\simeq \operatorname{Aut}(P)\simeq \operatorname{Aut}(N)\simeq Z/E[/tex] The first one should be thought of as a part of the definition of "quantum mechanics" (the framework in which quantum theories are defined). The others are all proved in Varadarajan. So the assumption that there exists a homomorphism from G into Aut(S) is equivalent to the assumption that there exists a homomorphism [tex]\phi:G\rightarrow Z/E.[/tex] This [itex]\phi[/itex] is by definition a projective representation (if we also assume that [itex]\phi[/itex] satisfies an additional technical requirement). Assuming that I can understand the details of Varadarajan's proofs, all that remains is a theorem about how homomorphisms of the kind just mentioned correspond to homomorphisms [tex]\hat\phi:\hat G\rightarrow U,[/tex] where [itex]\hat G[/itex] is the universal covering group of G and U is the group of unitary operators. Everyone says that a theorem like that is stated and proved in Bargmann's article, so I will definitely check it out tomorrow.
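The reason the quotient by E appears can be seen in a small numerical sketch: T and cT with |c|=1 induce the same map on state operators, and that map preserves convex combinations. This is only an illustration with randomly generated matrices; the helper names, the seed, and the restriction to the unitary case (antiunitary operators are not covered) are assumptions made here.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, used only for reproducibility

def random_unitary(dim):
    """A unitary matrix obtained from the QR decomposition of a random complex matrix."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(a)
    return q

def random_state_operator(dim):
    """A positive, trace-one matrix rho = A A^dagger / tr(A A^dagger)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def conjugate(t, rho):
    """rho -> T rho T^{-1}, with T^{-1} = T^dagger for unitary T."""
    return t @ rho @ t.conj().T

u = random_unitary(3)
rho1 = random_state_operator(3)
rho2 = random_state_operator(3)
phase = np.exp(0.73j)  # an arbitrary element of E

same_map = np.allclose(conjugate(u, rho1), conjugate(phase * u, rho1))
mix = 0.25 * rho1 + 0.75 * rho2
convex = np.allclose(conjugate(u, mix),
                     0.25 * conjugate(u, rho1) + 0.75 * conjugate(u, rho2))
print(same_map, convex)
```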

element4 said:

You might find http://www.staff.science.uu.nl/~ban00101/lecnotes/repq.pdf by Erik van den Ban useful. I had a course on these topics some years ago where these notes were used. We also used some very old notes by DJ Simms which were very good, but I don't think they contained the proof you are interested in.

Thanks. I haven't had time to really check it out yet, but I have added it to my collection of possibly useful stuff and will examine it more closely in a few days.

jambaugh · Jun 15, 2011

Fredrik said:

I've been doing some more reading, and it's getting a bit clearer. If S={states}, P={probability measures on the lattice of closed subspaces}, N={state operators}, Z={symmetry operators}, and E={operators of the form c1, where 1 is the identity operator and c is a complex number with |c|=1}, then we have the following isomorphisms: [tex]\operatorname{Aut}(S)\simeq \operatorname{Aut}(P)\simeq \operatorname{Aut}(N)\simeq Z/E[/tex] The first one should be thought of as a part of the definition of "quantum mechanics" (the framework in which quantum theories are defined). The others are all proved in Varadarajan. [...]

There's a bit of difficulty with that first one. The {states} object will, as a set, have a much larger automorphism group. (The automorphisms of a set form the group of permutations of its elements.) If you're considering instead the way "states" are represented (I'd rather use the term "sharp modes"), e.g. the one-dimensional subspaces of a complex linear space, you then have a projective sphere (a topological object), and the automorphisms are the diffeomorphisms of that sphere, again a much bigger group. It is when you impose the metric structure on the space, making it a Hilbert space, that you get the same automorphism group. You then have a complex projective sphere with a metric structure (distance between points = angle between rays).

Note that one can derive and prove the Born probability formula from the metric structure (with some interpretation of 0 and 1 norm amplitudes as forbidden and certain transitions) by considering the limit (as the number of systems goes to infinity) of large independent ensembles as a single composite system.

Another slight issue here. Be careful with the word "measure" when speaking of the probability structure on the lattice of subspaces. It doesn't have all the properties of a true measure. (This is why quantum isn't simply classical + uncertainty.) Most especially, for e.g. a finite-dimensional system you have a continuum of "states" (modes), but you do not define a probability density over this manifold; you define finite probabilities for each point.

However once you choose an orthogonal basis and consider only the sub-lattice of subspaces spanned by sets of basis elements you get a (classical) discrete lattice and the probabilities will then form a measure on that basis set. This object (sub-lattice) has again an automorphism group equal to the group of permutations (of the basis).

haushofer · Jun 16, 2011

Hi Fredrik,

As far as I know, Ballentine's book is the only book on QM in which the Bargmann algebra and its relevance for QM are nicely treated.

Fredrik · Jun 16, 2011

jambaugh said:

There's a bit of difficulty with that first one. The {states} object will, as a set, have a much larger automorphism group. (The automorphisms of a set form the group of permutations of its elements.)

As I said in #11, the term "state" can be defined in a theory independent way, as an equivalence class of objects that can be "measured" by measuring devices. For each [itex]a\in\mathcal L[/itex], where [itex]\mathcal L[/itex] is the lattice associated with the theory, there's a function [itex]P^a:S\rightarrow[0,1][/itex]. We can write [itex]P^a(s)=P_s(a)[/itex] and take this as the definition of another function [itex]P_s:\mathcal L\rightarrow[0,1][/itex]. These comments apply to a very large class of theories, large enough to include all classical theories, all quantum theories, and more. One of the assumptions that go into the definition of this very large class of theories is that if [itex]s_1,\dots,s_n\in S[/itex] and [itex]c_1,\dots,c_n\in[0,1][/itex] are such that [itex]\sum_{k=1}^n c_k =1[/itex], then there's a state s such that [itex]P^a(s)=\sum_{k=1}^n c_k P^a(s_k)[/itex]. It makes sense to think of s as a convex combination of [itex]s_1,\dots,s_n[/itex]. By an "automorphism" of S, I mean a permutation that preserves these convex combinations.

The assumptions that define QM ensure that [itex]\mathcal L[/itex] is the lattice of closed subspaces of a Hilbert space, and that [itex]s\mapsto P_s[/itex] is an isomorphism from S onto P (that "preserves convex combinations"...if I'm allowed to use that phrase a bit loosely).

I have always been interested in theory independent considerations, but it was only recently that I began to understand these things. If it hadn't been for that, I probably wouldn't have mentioned the set S at all. If all we're interested in is QM, then we might as well skip the set S completely and define "state" as "probability measure on the lattice".

jambaugh said:

Note that one can derive and prove the Born probability formula from the metric structure (with some interpretation of 0 and 1 norm amplitudes as forbidden and certain transitions) by considering the limit (as the number of systems goes to infinity) of large independent ensembles as a single composite system.

If you're referring to a 1968 article by James Hartle, I have read it several times, but it just looks more wrong to me each time. I think his argument is deeply flawed, and I reject his conclusion.

dextercioby · Jun 18, 2011

Big thanks to the user <element4> for the reference to the freely available notes. Just skimmed through them and they seem indeed an easier read than Varadarajan's book and somewhat at the level of Cassinelli (but with different focus in terms of examples).

I wish to bring up one more time the discussion in the second chapter of B. Thaller's <Dirac Equation>. It is so intelligible that one could use all of it to build lecture notes for students (an advanced course on quantum mechanics based on functional analysis). I truly found it illuminating.

haushofer · Jun 20, 2011

I have a naive question concerning this Bargmann algebra. As I understand it, in QM the Galilei algebra as a symmetry algebra defines a projective representation on the states in your Hilbert space. Physically I would say that's good enough, because a state is represented by a ray in your Hilbert space, right? So is the only reason for considering the Bargmann algebra in QM that the central extension plays the role of mass (i.e. there are no Casimirs in the Galilei algebra playing the role of mass)?

Second, if I consider classical field theory, what exactly is the meaning of this central extension M at the level of the group? I know that Poisson brackets and the Lagrangian of a free point particle reveal the role of the central extension to be a mass, but what does it mean for the group action on irreps?

I.e., if I have a boost generator G_i and a translation generator P_j, then the Baker-Campbell-Hausdorff formula tells me that in the Galilei group

[tex]
e^{t G_i} e^{s P_j} = e^{tG_i + s P_j} = e^{s P_j}e^{t G_i}
[/tex]

So boosts and translations commute, which is already clear from the commutator [G_i,P_j]=0. But for the Bargmann algebra, where [P_j,G_i]=δ_ij M and M is a central extension (so it commutes with all the elements of the algebra),

[tex]
e^{t G_i} e^{s P_j} = e^{tG_i + s P_j - \frac{1}{2}st \delta_{ij}M}
[/tex]

while

[tex]
e^{s P_j} e^{t G_i} = e^{tG_i + s P_j + \frac{1}{2}st \delta_{ij}M}
[/tex]

So

[tex]
e^{s P_j} e^{t G_i} = e^{t G_i} e^{s P_j} e^{st \delta_{ij} M} \,,
[/tex]

and thus boosts and translations applied in the same direction don't commute anymore, but one has an extra phase factor. Does this physically make sense at the classical level?
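The algebra behind the last three equations can be checked in a small matrix model. The sketch below is an assumption made for illustration: it takes a single direction (i = j), realizes G and P as nilpotent 3x3 matrices with central commutator M = [P, G], and uses the fact that the exponential series then stops after the quadratic term. These matrices are real and not anti-Hermitian, so exp(stM) is not literally a phase here; the model only tests the Baker-Campbell-Hausdorff bookkeeping.

```python
import numpy as np

def unit(i, j, n=3):
    """Matrix unit with a single 1 in row i, column j."""
    e = np.zeros((n, n))
    e[i, j] = 1.0
    return e

# Hypothetical 3x3 realization of one boost generator and one translation generator.
G = unit(0, 1)
P = unit(1, 2)
M = P @ G - G @ P  # [P, G] = M, central in the algebra spanned by G, P, M

def expm_nilpotent(x):
    """exp(x) for strictly upper-triangular 3x3 x: the series ends at x^2 since x^3 = 0."""
    return np.eye(3) + x + 0.5 * (x @ x)

t, s = 0.7, -1.3  # arbitrary parameters

boost_then_shift = expm_nilpotent(t * G) @ expm_nilpotent(s * P)
shift_then_boost = expm_nilpotent(s * P) @ expm_nilpotent(t * G)

# e^{tG} e^{sP} = e^{tG + sP - st M / 2}
bch_ok = np.allclose(boost_then_shift, expm_nilpotent(t * G + s * P - 0.5 * s * t * M))
# e^{sP} e^{tG} = e^{tG} e^{sP} e^{st M}
swap_ok = np.allclose(shift_then_boost, boost_then_shift @ expm_nilpotent(s * t * M))
print(bch_ok, swap_ok)
```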

jambaugh · Jun 30, 2011

Fredrik said:

...
I have always been interested in theory independent considerations, but it was only recently that I began to understand these things. If it hadn't been for that, I probably wouldn't have mentioned the set S at all. If all we're interested in is QM, then we might as well skip the set S completely and define "state" as "probability measure on the lattice".

Pardon the long pause...
The point I was making was that you weren't being careful with the specific categories of the mathematical objects upon which you were defining your automorphism groups. E.g. the sets qua sets have much more "symmetry" than the intended objects. To restrict to the appropriate symmetry one is appending auxiliary structure. The devil is in the details of that structure (typically in the form of implicit assumptions.)

If you're referring to a 1968 article by James Hartle, I have read it several times, but it just looks more wrong to me each time. I think his argument is deeply flawed, and I reject his conclusion.

I've reworked the derivation myself and there's been quite a bit of work on foundations e.g. projective QM where one relaxes the "assured transition" part of the base assumptions. I believe the argument is sound, if not in the '68 paper, in followup research. (I suggest as a reference: Quantum Relativity by David Finkelstein.)

The point about this derivation (and why I think it valid) is the matching up of the automorphism groups, comparing that for the probability structure (conditioned on the mentioned derivation and relaxing the condition that it is a distribution in the classical sense) vs working with the automorphisms of the Hilbert space (which implicitly includes the metric structure) and for the operator algebra (including the adjoint structure). One gets the same non-trivial automorphisms (for the Hilbert space one of course has a central extension which manifests trivially on the prob. and algebra.)

Note that it is enlightening to go beyond symmetries (automorphism group) to consider the monadic structure (endomorphisms) and then to the full blown category structure (homomorphisms).

Back to your definition of "states". I think if you are careful with the categories you'll find that there is an issue with your definition which relates to the error of thinking of the probability structure as a "distribution over states" (I need to parse your definition in more detail to be sure I'm correct in this thinking.)

Specifically the lattice structure is not categorical (not enough identities) for quantum "states".

I'm going to spend an hour or two parsing your definitions to see if I can back up these heuristic criticisms with some specifics.

On a general note, one can embed classical theory within a quantum theory by restricting the available observables, this severely restricts the automorphisms and one gets the equivalent of the unitary embedding of the permutation groups in the embedding of a classical set of states within the quantum mode space.

(Here considering only the automorphisms of the logic, not of the dynamics or even the kinematics.)

Likewise one can always (in a highly artificial fashion) embed arbitrary group representations within a very large permutation group (effectively group actions on a linear space are permutations of elements or rays of that space). And I see this as paralleling the (in my opinion artificial) ontological (re)interpretations of QM e.g. EMW.

(Which then brings up severe problems when one considers kinematics and dynamics.)

jambaugh · Jun 30, 2011

Having looked at it further I believe much of my criticism was misplaced. However I suggest that a more operational format would be to replace your "states" with boolean measurements.

My reasoning is this. You are e.g. defining the lattice structure in terms of the particular mathematical objects one uses in a specific formulation. If you instead begin with an equivalence class of boolean measurements, B with equivalence defined by:
[tex] B_1 \simeq B_2[/tex]
iff sequences of measurement are always exactly correlated in either order.

To be a boolean measurement it must of course have a boolean outcome and must also be idempotent, in that repeated immediate applications of the same measurement yield correlated outcomes.

Define the ordered composite measurement [tex]B_3 = B_2 B_1[/tex] to be the

Then you have the lattice structure defined by:
[tex] B_1 \subseteq B_2 \equiv B_1 \circ B_2 = B_2 \circ B_1 = B_1[/tex]
where the composition of actions here is not necessarily in B in general but must be in this case.

This gives a nice empirical basis to the lattice. This doesn't affect anything in previous posts, it's just advice.

What I would rather suggest is that you begin with ALL observables then define your "states" (what I would call modes of system preparation) to be identified with the mapping from the observables to their expectation values subject to certain regularity conditions, most especially the identity observable (tautological boolean observable) must have expectation value 1... but one may view states projectively relaxing this condition and defining the states as mappings to relative expectation values.

Call them partition functions:
[tex] Z=\{\zeta: X \to \mathbb{R}\} | (\zeta_1 + \zeta_2)[x] = \zeta_1[x]+\zeta_2[x][/tex]
with X the set of observables.

The reason for this is to give operational meaning to the addition and scalar multiplication operations. What does it mean to take linear combinations of states? We can speak of linear combinations of (relative) expectation value mappings. If you are using the additive structure of the probabilities to define addition of states then you should (IMNSHO) just stick with the entities doing the addition.

Theory specific structure is then:
--1: What are the actualizable observables?
--2: Which of the partition mappings are actualizable as modes of system preparation?
--3: What partition mappings are equivalent to a given measurement after a given partition mapping. In effect what are the conditional expectation values post measurement.

This last will define the logical and transition probability structure of the theory.

Once --1 is established you have aut(X) = Perm(X) and aut(Z) = GL(Z)/GL(1).
Given --2 you should have a restricted space of partition mappings Z'.

--3: May give a representation of X in the endomorphism structure of Z' or you can consider more peculiar theories where the elements of X do not act linearly.

Given Z' is a space and given X acts linearly on Z' via conditional expectation values, you can then extend X under addition and ask the question: Does X close under addition (yes in QM, no in CM) or is there a larger space X' of which X is a subset?

Note that in QM, where X forms a linear space, Z is the dual space and Z' is an exponential mapping of Z:
[tex] Z' = \{ e^\zeta: \zeta\in Z\} = \{ \exp[\lambda^k x^*_k]:\lambda^k \in \mathbb{R}\}[/tex]
where the [itex]x^*_k[/itex] are the dual basis of a basis [itex]\{x^k\}[/itex] of X, (and I typically define [itex]x_0=1[/itex] the tautological boolean observable.)

OK, so I got a bit more detailed than I had intended... and there are some regularity questions I may have glossed over. But consider the intent of this procedure... to keep things as operational as possible. Well, on second thought you can interpret the convex combinations of "states" as probability weighted random selections of modes of system preparation. My "hidden agenda" intent is to make apparent the difference between classical and quantum theories, which mostly lies in the closure of the set of observables to form a linear space. The other is the Noetherian identification of observables with generators of the dynamic group for the system. This implies that X transforms under the adjoint representation of the dynamic group and thence that we can map X to its Lie algebra.

In the end the logic via boolean observables must extend to the expectation values in the above fashion. That extension imposes regularity conditions which are --I think-- more apparent when considering automorphisms and symmetries, by beginning with the full class of observables. One can even utilize the topology implicit in the conditional probability structure.

Something to think about?

Fredrik · Jul 1, 2011

jambaugh said:

The point I was making was that you weren't being careful with the specific categories of the mathematical objects upon which you were defining your automorphism groups.

Perhaps I wasn't careful enough with my explanation, but I'm pretty sure I understand the issues. You seem to have come to that conclusion too.

If you find this approach interesting, there are additional details in posts 4-5 in this thread (starting at "think of it this way"). The basic idea is this: We can discover most of this stuff just by thinking hard about what we really mean by a "theory". (A theory assigns probabilities to verifiable statements. Probabilities are numbers assigned by probability measures. Verifiable statements are...well, there's no need to repeat everything I said in the other thread. My point is, there's a lot of information contained in the idea that theories are probability assignments).

Once we have discovered the stuff that all theories have in common, we can write down a few more assumptions that define a smaller class of theories. Some of those assumptions will be assumptions about what reality is like. Some will just be mathematical idealizations. One such collection of additional assumptions defines a class of theories that is large enough to include all the classical theories and all the quantum theories.

jambaugh said:

I believe the argument is sound, if not in the '68 paper, in followup research.

Are we talking about the same article? The one I meant has the title "Quantum mechanics of individual systems".

Hartle writes down an eigenvalue equation [itex]A|i\rangle=a_i|i\rangle[/itex]. He then defines an operator [itex]f_N{}^k[/itex] on the set of N-particle states in a way that's equivalent to saying that its eigenvectors are of the form [tex]|i_1,1\rangle\otimes\cdots\otimes|i_N,N\rangle[/tex] and the eigenvalue is the fraction of the [itex]i_j[/itex] in the eigenvector that are equal to [itex]k[/itex].

The main result is roughly that for large N, [tex]f_N{}^k|s^N\rangle\approx |\langle k|s\rangle|^2|s^N\rangle,[/tex] where [itex]|s^N\rangle=|s\rangle\otimes\cdots\otimes|s\rangle[/itex]. He also takes the limit [itex]N\rightarrow\infty[/itex]. I've seen that there are published articles that have issues with how he does that, but I'm not concerned about the technicalities. My problem is that I can't see a good reason to interpret the result as a derivation of the Born rule. I can't even see a good reason to interpret [itex]f_N{}^k[/itex] as a frequency operator when it acts on something other than one of its eigenstates.
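To make the approximate eigenvalue statement concrete, here is a minimal Python sketch. It is not taken from Hartle's paper: it assumes a finite-dimensional single-system space, builds the frequency operator as a diagonal operator in the product basis, and uses made-up function names and a made-up state. It computes the squared norm of (f_N^k - |<k|s>|^2)|s^N> by brute force and prints it next to p(1-p)/N, where p = |<k|s>|^2.

```python
import itertools
import math

def product_state_amplitudes(s, N):
    """Amplitudes of |s> (x) ... (x) |s> (N copies) in the product basis.

    `s` is a list of amplitudes <i|s>; keys are label tuples (i_1, ..., i_N).
    """
    amps = {}
    for labels in itertools.product(range(len(s)), repeat=N):
        a = 1.0
        for i in labels:
            a *= s[i]
        amps[labels] = a
    return amps

def residual_norm_squared(s, k, N):
    """|| (f_N^k - |<k|s>|^2) |s^N> ||^2, with f_N^k diagonal in the product basis."""
    p = abs(s[k]) ** 2
    total = 0.0
    for labels, a in product_state_amplitudes(s, N).items():
        freq = labels.count(k) / N   # eigenvalue of f_N^k on this basis vector
        total += abs((freq - p) * a) ** 2
    return total

# Hypothetical two-level state with |<0|s>|^2 = 0.3 (illustration only).
s = [math.sqrt(0.3), math.sqrt(0.7)]
for N in (2, 4, 8, 12):
    print(N, residual_norm_squared(s, 0, N), 0.3 * 0.7 / N)
```

The two printed columns agree, so the residual shrinks like 1/N. That is all the sketch shows; it says nothing about how the approximate eigenvalue should be interpreted.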

jambaugh said:

[tex]B_3 = B_2 B_1[/tex]

What do you mean by stuff like this? I'm not sure which ones of the sets I defined your Bs belong to. Are they observables? Are they what I called propositions? That would mean that they are equivalence classes of pairs (A,E) where A is an observable and E a Borel set. The pair (A,E) can be interpreted as a yes-no experiment: measure A and interpret a result in E as a "yes" result.

strangerep · Jul 2, 2011

Regarding

J. B. Hartle, "Quantum mechanics of individual systems",
Am. J. Phys., vol. 36, no. 8, 1968, p. 704

Fredrik said:

Hartle writes down an eigenvalue equation [itex]A|i\rangle=a_i|i\rangle[/itex]. He then defines an operator [itex]f_N{}^k[/itex] on the set of N-particle states in a way that's equivalent to saying that its eigenvectors are of the form [tex]|i_1,1\rangle\otimes\cdots\otimes|i_N,N\rangle[/tex] and the eigenvalue is the fraction of the [itex]i_j[/itex] in the eigenvector that are equal to [itex]k[/itex].

The main result is roughly that for large N, [tex]f_N{}^k|s^N\rangle\approx |\langle k|s\rangle|^2|s^N\rangle,[/tex] where [itex]|s^N\rangle=|s\rangle\otimes\cdots\otimes|s\rangle[/itex]. He also takes the limit [itex]N\rightarrow\infty[/itex]. I've seen that there are published articles that have issues with how he does that,

Which articles?

but I'm not concerned about the technicalities. My problem is that I can't see a good reason to interpret the result as a derivation of the Born rule. I can't even see a good reason to interpret [itex]f_N{}^k[/itex] as a frequency operator when it acts on something other than one of its eigenstates.

I don't understand why you say that.

Hartle says that [itex]|s\rangle[/itex] is a general (i.e., arbitrary) pure state, and [itex]|s^N\rangle[/itex] is short for a tensor product of N copies of that state. The reason to interpret [itex]f_N{}^k[/itex] as a frequency operator is embodied in its definition, i.e., Hartle's eq(5). I don't see anything wrong with that.

As for the Born rule, I would have interpreted it in this case to mean that the (frequentist) probability of measuring eigenvalue [itex]k[/itex] associated with the observable A in the (normalized, but otherwise arbitrary) state [itex]|s\rangle[/itex] is [itex]|\langle k|s\rangle|^2[/itex].
What's wrong with that?

Fredrik · Jul 2, 2011

strangerep said:

Which articles?

I don't remember, and I couldn't find them in the ten minutes I was willing to spend on it. I think I've come across two such articles, at least one of them published (probably both), but I didn't read either of them. So I certainly can't vouch for their correctness.

strangerep said:

I don't understand why you say that.

Hartle says that [itex]|s\rangle[/itex] is a general (i.e., arbitrary) pure state, and [itex]|s^N\rangle[/itex] is short for a tensor product of N copies of that state. The reason to interpret [itex]f_N{}^k[/itex] as a frequency operator is embodied in its definition, i.e., Hartle's eq(5).

[itex]f_N{}^k[/itex] can certainly tell us the frequency of |k> in the N-particle state [itex]|i,1\rangle\otimes\cdots\otimes|i,N\rangle[/itex], because that's what it's designed to do. But why would it have anything to do with the frequency of |k> in [itex]|s\rangle\otimes\cdots\otimes|s\rangle[/itex]? That concept doesn't even make sense.

strangerep said:

As for the Born rule, I would have interpreted it in this case to mean that the (frequentist) probability of measuring eigenvalue [itex]k[/itex] associated with the observable A in the (normalized, but otherwise arbitrary) state [itex]|s\rangle[/itex] is [itex]|\langle k|s\rangle|^2[/itex].

Your statement here is the Born rule. But what is it about Hartle's calculations that makes people claim that they prove the Born rule? Do you think that they do? If yes, why? All I see is the result that for large N, [tex]f_N{}^k|s^N\rangle\approx |\langle k|s\rangle|^2|s^N\rangle.[/tex] What does this have to do with the Born rule? Is the Born rule supposed to follow from the fact that [itex]\langle s^N|f_N{}^k|s^N\rangle \approx |\langle k|s\rangle|^2[/itex]? To interpret the left-hand side as an expectation value, we have to use the Born rule. And even if we ignore that and pretend that we have obtained the result "the average value of [itex]f_N{}^k[/itex] in a long series of measurements on the state |s> will be [itex]|\langle k|s\rangle|^2[/itex]", that still doesn't imply the Born rule. Maybe it would if we believe that |s> is a classical superposition of |i> states and that "the value of [itex]f_N{}^k[/itex]" is the frequency of |k> in the classical ensemble. But the idea that |s> represents a classical ensemble is proved false by Bell inequality violations.

strangerep · Jul 2, 2011

Fredrik said:

[itex]f_N{}^k[/itex] can certainly tell us the frequency of |k> in the N-particle state [itex]|i,1\rangle\otimes\cdots\otimes|i,N\rangle[/itex], because that's what it's designed to do. But why would it have anything to do with the frequency of |k> in [itex]|s\rangle\otimes\cdots\otimes|s\rangle[/itex]? That concept doesn't even make sense.

Sure it does: [itex]|s\rangle[/itex] can be decomposed as a linear combination of eigenstates of A. So [itex]|s\rangle\otimes\cdots\otimes|s\rangle[/itex] is just a particular state in an N-fold tensor product space. (Calling it "N-particle" can be misleading here, unless one keeps in mind that it means "N independent copies".)

The important point here about tensor products is that the component spaces are essentially independent of each other, (provided one doesn't introduce new operators that mix them up). And that's what we want in an ensemble of identically-prepared states -- all the copies must be mutually independent.

Your statement here is the Born rule. But what is it about Hartle's calculations that makes people claim that they prove the Born rule? Do you think that they do? If yes, why? All I see is the result that for large N,
[tex]f_N{}^k|s^N\rangle\approx |\langle k|s\rangle|^2|s^N\rangle.[/tex]
What does this have to do with the Born rule? [...]

Certainly he's using another axiom of QM -- the one about any measured value being one of the eigenvalues of the operator. In the limit of infinite N and fixed k, the operator [itex]f_N{}^k[/itex] (acting on the infinite tensor product state above) gives the frequency that the value k will be found.

So I suppose one could read Hartle's paper as: "given one axiom of QM applying to states which are dispersion-free wrt the observable A, we can derive the Born rule applying to more general superpositions of those (eigen)states -- in the infinite-ensemble limit."

[...] if we [...] pretend that we have obtained the result "the average value of [itex]f_N{}^k[/itex] in a long series of measurements on the state |s> will be [itex]|\langle k|s\rangle|^2[/itex]", that still doesn't imply the Born rule.

Hartle's result is stronger than that: the value for the observable [itex]f_\infty^k[/itex] in an infinite collection of independent measurements on identically-prepared copies of the state |s> becomes definite: [itex]|\langle k|s\rangle|^2[/itex]. (I.e., no dispersion.)

Fredrik · Jul 3, 2011

strangerep said:

Hartle's result is stronger than that: the value for the observable [itex]f_\infty^k[/itex] in an infinite collection of independent measurements on identically-prepared copies of the state |s> becomes definite: [itex]|\langle k|s\rangle|^2[/itex]. (I.e., no dispersion.)

That may be true (and it may not be), but I won't have a reason to look at the N→∞ limit until I've seen a reason to interpret the approximate result as an approximate derivation of the Born rule.

strangerep said:

Sure it does: [itex]|s\rangle[/itex] can be decomposed as a linear combination of eigenstates of A.

So? That's only useful when we want to use the Born rule to calculate the frequency of |k> results in a long series of measurements of A. But if the interpretation of [itex]f_N{}^k[/itex] is the reason to think of the result [itex]f_N{}^k|s^N\rangle\approx |\langle k|s\rangle|^2|s^N\rangle[/itex] as an approximate derivation of the Born rule, we certainly can't use the Born rule to justify the interpretation of [itex]f_N{}^k[/itex]. So, without using the Born rule, what would you say is the frequency of [itex]|\left\uparrow\right\rangle[/itex] in the spin-1/2 state [itex]|s\rangle=\frac{1}{\sqrt 2}\big(|\left\uparrow\right\rangle +|\left\downarrow\right\rangle\big)[/itex]? I would say that the question doesn't make sense, but doesn't the derivation rely on this making sense?

strangerep said:

Certainly he's using another axiom of QM -- the one about any measured value being one of the eigenvalues of the operator.

I read section II again today, hoping to understand what he's really assuming. These are my conclusions:

His starting point is that a state vector identifies all the true propositions about a single system. In this context, a proposition is a statement of the form "if we do a measurement using measuring device A, the result will be in the set E". So it can be uniquely identified by the pair (A,E). He's assuming that propositions have three possible truth values: true, false and indeterminate. What it means to say that a proposition is true/false/indeterminate is given by the following definition:

The proposition (A,E) is said to be true when we know for sure that the result of an A measurement will be in E, is said to be false when we know for sure that it won't be in E, and is said to be indeterminate when we can't know if the result will be in E or not.

He assumes that measuring devices are represented by self-adjoint operators. He assumes that the proposition (A,E) is true if the system's state vector is an eigenvector of A with eigenvalue in E, false if the system's state vector is an eigenvector of A with an eigenvalue not in E, and indeterminate if the system's state vector isn't an eigenvector of A.
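The three-valued assignment just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the operator is a finite matrix given as a list of rows, the Borel set E is replaced by a predicate on real numbers, and the name `truth_value` and the tolerance are made up here.

```python
import math

def truth_value(A, v, E, tol=1e-9):
    """Classify the proposition (A, E) for the state vector v.

    A : square matrix (list of rows) standing in for a self-adjoint operator
    v : nonzero state vector (list of numbers)
    E : predicate on real numbers, standing in for a Borel set
    """
    Av = [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]
    norm2 = sum(abs(x) ** 2 for x in v)
    # Candidate eigenvalue: the Rayleigh quotient <v|A|v> / <v|v>.
    lam = sum(x.conjugate() * y for x, y in zip(v, Av)) / norm2
    # v is an eigenvector exactly when A v - lam v vanishes.
    residual = sum(abs(y - lam * x) ** 2 for x, y in zip(v, Av))
    if residual > tol * norm2:
        return "indeterminate"
    return "true" if E(lam.real) else "false"

sigma_z = [[1, 0], [0, -1]]
positive = lambda x: x > 0                      # E = (0, infinity)
up, down = [1, 0], [0, 1]
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]     # equal superposition

print(truth_value(sigma_z, up, positive))
print(truth_value(sigma_z, down, positive))
print(truth_value(sigma_z, plus, positive))
```

For the superposition the function returns "indeterminate" and nothing more, which is why these assumptions alone assign no number to that case.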

This doesn't help me at all. At the beginning of section III, he states "in this section we show how the probability interpretation of the wave function results from an application of the previous discussion to ensembles of identical systems themselves considered as individual systems". In other words, "we will show how the Born rule follows from what I just said". Then he derives the result [itex]f_N{}^k|s^N\rangle\approx |\langle k|s\rangle|^2|s^N\rangle[/itex], which seems to have nothing to do with the Born rule, takes the limit [itex]N\rightarrow\infty[/itex], and claims victory. I feel like I'm looking at the underpants gnomes' 3-step plan:

1. Collect underpants.
3. Profit.

Something huge appears to be missing.

If we make the absurd assumption that every system that's described as being in state |s> is actually in an eigenstate, then I can see how probabilities arise from this approach. The probability of |k> could be defined as "number of copies that are actually in the state |k>" / "total number of copies". If we don't make that absurd assumption at the start, I don't see why the "approximate eigenvalue" of [itex]f_N{}^k[/itex] would have anything to do with the "probability of |k>".

jambaugh · Jul 3, 2011

Fredrik said:

Perhaps I wasn't careful enough with my explanation, but I'm pretty sure I understand the issues. You seem to have come to that conclusion too.

Yes, I was too quick to critique and should have read farther up the thread.

If you find this approach interesting, there are additional details in posts 4-5 in this thread (starting at "think of it this way").

Thanks, I'll take a look.

The basic idea is this: We can discover most of this stuff just by thinking hard about what we really mean by a "theory". (A theory assigns probabilities to verifiable statements. Probabilities are numbers assigned by probability measures. Verifiable statements are...well, there's no need to repeat everything I said in the other thread. My point is, there's a lot of information contained in the idea that theories are probability assignments).

I have spent some years on and off considering the very same stuff. My recent conclusion is that expectation values (averages of measurements) are a better starting point than probabilities as such. It is a more general language in that one can consider probabilities as expectation values for boolean observables, and by calling a probability an expectation value one can relax assumptions about the spectrum of the observational act (specifically that it has {0,1} spectrum).

For example you may have a device you call a "particle detector" with raw output an oscilloscope on which peaks are interpreted as "yes a particle was present". But you can have on rare occasions a double peak, indicating two particles. You either implicitly reject that as an invalid experiment and don't count it in your probability verification, or you can include it and realize it as a q=2 measurement relaxing your a priori assumption that the spectrum of your device is {0,1}.
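A toy Python sketch of the two bookkeeping choices, with entirely hypothetical tallies (the counts and function names below are invented for illustration, not data from any experiment):

```python
from collections import Counter

# Hypothetical tallies: number of peaks seen per run (illustrative only).
counts = Counter({0: 620, 1: 370, 2: 10})

def expectation(counts):
    """Mean number of peaks per run, with no assumption on the spectrum."""
    total = sum(counts.values())
    return sum(q * n for q, n in counts.items()) / total

def boolean_probability(counts):
    """Probability of 'yes' after discarding runs outside the spectrum {0, 1}."""
    kept = {q: n for q, n in counts.items() if q in (0, 1)}
    return kept[1] / sum(kept.values())

print(expectation(counts))          # double peaks counted as q = 2
print(boolean_probability(counts))  # double peaks rejected as invalid runs
```

The two numbers differ only because of how the double peaks are handled, which is the spectrum assumption in question.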

Are we talking about the same article? The one I meant has the title "Quantum mechanics of individual systems".

I don't have a copy of the article handy, nor am I sure if it is the one I've read. What I recall is working through a similar argument in our research seminar back in grad school with the Hartle article referenced.

At the time we were considering how much of the quantum theory remains if you remove the metric structure. Your outline of Hartle sounds familiar.

My problem is that I can't see a good reason to interpret the result as a derivation of the Born rule. I can't even see a good reason to interpret [itex]f_N{}^k[/itex] as a frequency operator when it acts on something other than one of its eigenstates.

The way we were doing it, rather than considering the frequency operator as such, we looked at the projection operators onto subspaces of the big product space with a set number of factor modes parallel to the target mode and the other factors orthogonal to it (mode = "state"). In short, the projectors onto the eigen-spaces of the frequency operator.

In considering a source to target transition experiment, the limit as N->infinity of the N copies becomes an eigen-mode (eigen-value 1) of the projector onto the subspace corresponding to a frequency of N times the square magnitude of the transition amplitude. It likewise becomes an eigen-mode (eigen-value 0) of the projection operators onto other subspaces with different frequencies. The a priori assumptions are that zero transition amplitudes mean forbidden transitions, and norm 1 transition amplitudes mean assured transitions.

This implies that in the limit the transition frequency will correspond to N times Born's probability formula.
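A rough numerical check of this picture, as a Python sketch under stated assumptions: p stands for the squared magnitude of the transition amplitude, the eigenspaces of the frequency operator are labelled by the count m, and the squared norm of the component of the N-copy mode in the count-m eigenspace is the binomial weight C(N,m) p^m (1-p)^(N-m). The window width eps and the function name are choices made here, not part of the argument above.

```python
import math

def weight_near(p, N, eps):
    """Squared norm of the part of the N-copy mode lying in eigenspaces of the
    frequency operator whose relative frequency m/N is within eps of p."""
    total = 0.0
    for m in range(N + 1):
        if abs(m / N - p) <= eps:
            total += math.comb(N, m) * p**m * (1 - p)**(N - m)
    return total

p = 0.25   # stands in for the squared transition amplitude
for N in (10, 100, 1000):
    print(N, weight_near(p, N, eps=0.05))
```

The weight inside the window approaches 1 as N grows. A window is used because the weight in the single eigenspace with m = pN shrinks with N, so the limiting statement has to be read in terms of relative frequency rather than an exact count.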

What do you mean by stuff like this? I'm not sure which ones of the sets I defined your Bs belong to. Are they observables? Are they what I called propositions? That would mean that they are equivalence classes of pairs (A,E) where A is an observable and E a Borel set. The pair (A,E) can be interpreted as a yes-no experiment: measure A and interpret a result in E as a "yes" result.

Yes, except I don't see a reason to place propositions in a separate category from observables. My B's were what is represented by projection operators in a quantum theory.

As I have been playing with the ideas, however, I think it is better not to distinguish propositions, since in defining an observable to be "boolean" one is basically making implicit assumptions about its spectrum. Possibly that should be empirically determined.

BTW, thanks for the discussion, as it has gotten me back into this subject on foundations and I've made a bit of progress on old thoughts on the matter. My starting point is to assume what I call "Noetherian Theories", which are theories assuming that the space of observables corresponds one to one with a Lie algebra of kinematic transformations. (Specifically they have the same (adjoint) representation.)

I hope to define modes in terms of partition functionals and jump straight to the thermodynamics. Then consider sharp modes in terms of zero entropy, be they classical or quantum. I'm still trying to work out the structure of the modes but I think they reside in the coalgebra of the universal enveloping algebra.

jambaugh · Jul 3, 2011

jambaugh said:

Define the ordered composite measurement [tex]B_3 = B_2 B_1[/tex] to be the

Oops! No wonder there was confusion, I jumped away (realizing I was about to make a mistake) without finishing the thought. It should have read
"...to be the boolean measurement resulting from performing the sequence of measurements and "and'ing" the results."
But of course that is not correct.

What I realized while typing this was that the composition shouldn't be assumed to be again a measurement (as in the case where B1 and B2 do not commute in QM). I think my intent when editing was to delete this line entirely. I recall typing a lengthy qualifier and then realizing it was running too far afield and deleting it.
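A small Python check of the non-commuting case, using two spin-1/2 projectors as an assumed example (the choice of B1 and B2 and the helper names are made up for illustration):

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_projector(P, tol=1e-12):
    """True if the real matrix P is symmetric and idempotent (P P == P)."""
    n = len(P)
    PP = matmul(P, P)
    symmetric = all(abs(P[i][j] - P[j][i]) < tol
                    for i in range(n) for j in range(n))
    idempotent = all(abs(PP[i][j] - P[i][j]) < tol
                     for i in range(n) for j in range(n))
    return symmetric and idempotent

B1 = [[1.0, 0.0], [0.0, 0.0]]   # projector onto spin up along z
B2 = [[0.5, 0.5], [0.5, 0.5]]   # projector onto spin up along x
B3 = matmul(B2, B1)             # candidate composite "B2 after B1"

print(is_projector(B1), is_projector(B2), is_projector(B3))
print(matmul(B1, B2) == matmul(B2, B1))
```

Here B1 and B2 pass the projector test but their product does not, and the two orderings of the product differ, so the composite cannot simply be treated as another boolean observable.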

strangerep · Jul 4, 2011

Fredrik said:

I feel like I'm looking at the underpants gnomes' 3-step plan:

1. Collect underpants.
3. Profit.

Something huge appears to be missing.

I attempt to supply the missing piece in this separate thread:

https://www.physicsforums.com/showthread.php?p=3388955
