# Tensor products and subsystems

1. Dec 3, 2009

### Fredrik

Staff Emeritus
I'm not sure what forum to put this in. It's a math question, but it's only of interest to physics people.

Given two Hilbert spaces $H_1$ and $H_2$, we can construct their tensor product $H=H_1\otimes H_2$. This is another Hilbert space.

What I'm wondering is if there are any theorems about what sort of decompositions we can make if we're given a Hilbert space $H$, and want to express it as a tensor product of two "smaller" Hilbert spaces. Can we pick an arbitrary subspace and call it $H_1$, and then construct $H_2$ from $H$ and $H_1$?

I'm interested in subsystems in QM, and not just ensembles of identically prepared systems. I want to know e.g. if we can always decompose the Hilbert space of the universe (in the many-worlds interpretation) into "this guy" $\otimes$ "everything else".

Maybe I'm phrasing the question wrong. Maybe I should focus on the observables instead of the states. I don't know. If you do, let me know.

2. Dec 3, 2009

### meopemuk

Fredrik,

there is a theorem in quantum logic that justifies the use of the tensor product of Hilbert spaces to describe a compound system:

T. Matolcsi, "Tensor product of Hilbert lattices and free orthodistributive product of orthomodular lattices", Acta Sci. Math. (Szeged) 37 (1975), 263.

D. Aerts and I. Daubechies, "Physical justification for using the tensor product to describe two quantum systems as one joint system", Helv. Phys. Acta, 51 (1978), 661.

This theorem does not answer your question exactly, but possibly it can give you a clue.

3. Dec 3, 2009

### strangerep

One needs to think in terms of correlations between subsystems.
Two theorems may be relevant here:

A) The SSC Theorem: "Subsystem correlations determine the state".
Roughly stated: [...] the density matrix of a composite system determines all the correlations
among the subsystems that make it up and, conversely, the correlations among all the
subsystems completely determine the density matrix for the composite system they make up.

B) "The external correlations of a system are necessarily trivial if and only
if its state is pure".

The above are taken from Mermin's paper quant-ph/9801057 (see the two appendices).

That paper may also provide food for thought in other ways. :-)

4. Dec 3, 2009

### Fredrik

Staff Emeritus
Thanks guys. I'm going to sleep now, but I'll take a look at the Mermin paper tomorrow. (I don't have an easy way to access the articles Meopemuk referenced, and I suspect they'll just be more elaborate versions of the argument that we use tensor products because that's what the Born rule needs).

5. Dec 3, 2009

### hamster143

Generally speaking, no. An n-dimensional Hilbert space, where n is prime, can't be decomposed into a tensor product of two smaller spaces at all (the dimensions multiply, so one factor would have to be 1-dimensional). If n=ab, then there is an enormous number of possible decompositions into an a-dimensional and a b-dimensional space. You just pick an arbitrary basis in the original space and form 'a' groups of 'b' basis vectors.

With regard to physical systems, you need to think about lattice discretizations. Simplest case: consider a set of n nodes, each of which may or may not be occupied by a single particle. The Hilbert space is 2^n dimensional. You can split the set into two subsets of a and b nodes (a+b=n), with 2^a and 2^b dimensional Hilbert spaces whose tensor product gives the original space.
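The grouping of basis vectors described above can be sketched numerically. This is only an illustration, assuming NumPy; the dimensions `a = 2`, `b = 3` and the helper `product_basis_vector` are hypothetical choices, not anything from the thread.

```python
import numpy as np

a, b = 2, 3          # hypothetical factor dimensions, so n = a*b = 6
n = a * b

def product_basis_vector(i, j):
    """Return e_i (x) f_j as a vector in the n-dimensional space."""
    e = np.zeros(a); e[i] = 1.0
    f = np.zeros(b); f[j] = 1.0
    return np.kron(e, f)

# Grouping: basis vector number k = i*b + j sits in group i, at position j.
for i in range(a):
    for j in range(b):
        k = i * b + j
        expected = np.zeros(n); expected[k] = 1.0
        assert np.array_equal(product_basis_vector(i, j), expected)

# Starting from a different orthonormal basis gives a different decomposition:
# a product vector of the rotated decomposition is, in general, not a product
# vector of the original one.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(n, n)))      # random orthogonal matrix
v = U @ product_basis_vector(0, 0)
# Schmidt rank of v relative to the original grouping (1 would mean "product").
schmidt_rank = np.linalg.matrix_rank(v.reshape(a, b))
print(schmidt_rank)
```

For a randomly chosen basis the Schmidt rank is typically larger than 1, which is one way of seeing that the choice of decomposition is far from unique.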

6. Dec 4, 2009

### Fredrik

Staff Emeritus
I haven't read the whole article yet, but I'm going to. It's a pretty good article. Mermin is a good enough writer to make sure that even some of the things he's wrong about are worth reading.

My first impression was that everything he's saying about his "Ithaca" interpretation is exactly the view of the many-worlds interpretation that I've arrived at through my discussions here. A few pages later he made it clear that what distinguishes "Ithaca" from MWI is that he assumes that QM just can't be applied to consciousness! I find that very strange, but I'll keep reading. If I just ignore the stuff about consciousness, the rest of it seems to be the sort of things I've been wanting to read about the MWI for a while now.

7. Dec 17, 2009

### Demystifier

Hm, my impression is that the Mermin interpretation is a variant of the information interpretation, not a variant of MWI.

8. Dec 17, 2009

### Fredrik

Staff Emeritus
I'm not familiar with the information interpretation. (Not even sure if I've heard of it). Mermin seems to agree that we need an additional assumption on top of QM to avoid having to deal with many worlds, and his assumption is that consciousness is weird enough to invalidate the conclusion that realism implies many worlds.

9. Dec 17, 2009

### Fredrik

Staff Emeritus
I didn't read the appendices until today. They're not quite what I asked for, but they matched my needs pretty well anyway. I think the answer to my question is actually quite obvious. (I don't know why I didn't get this right away). QM doesn't answer questions like that, but each specific theory of matter in the framework of QM does.

10. Dec 17, 2009

### Fredrik

Staff Emeritus
I have a question about appendix A in Mermin's article, but someone who knows this stuff well might be able to answer without reading it. Why do expressions of the form $\mbox{Tr }A\otimes B$ represent correlations between subsystems?

What Mermin does in appendix A is to prove that any matrix element of the state operator (I don't want to call it "density matrix" when it's an operator) can be expressed as a sum of terms of the form $\mbox{Tr }A\otimes B$. He claims that the interpretation of this is that the correlations are completely determined by the state operator and that the state operator is completely determined by the correlations.

11. Dec 17, 2009

### strangerep

Think about an ordinary Hilbert space for a moment. Suppose you have two state
operators $\rho_1, \rho_2$. How do you compute the correlation between these states?
$Tr(\rho_1 \rho_2)$, right?
(If that's not clear, think about the case when both are pure...)

The same principle applies in the tensor product Hilbert space (operators therein are in
general a sum of the tensor products of operators in the component subspaces).

"Density matrix" and "state operator" are pretty much interchangeable terms
(though I prefer the latter because it's less confusing in infinite dimensions).

12. Dec 18, 2009

### Fredrik

Staff Emeritus
I understand what you're saying about pure states, but nothing beyond that. When they're both pure, we have

$$\mbox{Tr}(|\alpha\rangle\langle\alpha|\beta\rangle\langle\beta|)=\sum_n\langle n|\alpha\rangle\langle\alpha|\beta\rangle\langle\beta|n\rangle=\langle\beta|\Big(\sum_n|n\rangle\langle n|\Big)|\alpha\rangle\langle\alpha|\beta\rangle=|\langle\alpha|\beta\rangle|^2$$

This is 1 when they're the same, and 0 when they're orthogonal, so it's a measure of how similar they are. The corresponding result when they're not pure is

$$\sum_i\sum_j a_i b_j |\langle\alpha_i|\beta_j\rangle|^2$$

I don't see how this measures anything significant. If we e.g. take the alphas to be orthogonal to each other, and the two state operators to be the same, we get

$$\sum_i|a_i|^2$$

So we don't even get the same value each time we consider two identical ensembles.
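A quick numerical check of the last sum, as a sketch assuming NumPy. The weights `0.5, 0.3, 0.2` are arbitrary illustrative values and `state_operator` is a hypothetical helper.

```python
import numpy as np

def state_operator(weights, vectors):
    """Sum_i w_i |v_i><v_i|, where the columns of `vectors` are orthonormal."""
    return sum(w * np.outer(v, v.conj()) for w, v in zip(weights, vectors.T))

basis = np.eye(3)                       # orthonormal |alpha_i>
weights = np.array([0.5, 0.3, 0.2])     # hypothetical ensemble weights a_i
rho = state_operator(weights, basis)

# Tr(rho rho) for two identical mixed ensembles: sum_i a_i^2, which is below 1.
overlap_mixed = np.trace(rho @ rho).real

# The same quantity for a pure state is exactly 1.
pure = state_operator(np.array([1.0]), basis[:, :1])
overlap_pure = np.trace(pure @ pure).real

print(overlap_mixed, overlap_pure)
```

With these weights the mixed-state value is 0.25 + 0.09 + 0.04 = 0.38, while the pure-state value is 1.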

The type of correlation I think Mermin is talking about is the kind we encounter in a Stern-Gerlach experiment. When the silver atom has passed through the magnetic field, its state vector is |↑>|left>+|↓>|right>. This is a correlation between position states and spin states. If the state vector had contained the terms |↑>|right> and |↓>|left> with coefficients of the same magnitude as the ones in front of the other two terms, the states of the subsystems (spin-z and position-x) would have been uncorrelated. A good measure of the degree of correlation should be large when the coefficients in front of the first two terms are large and the other two coefficients are small, or vice versa.

What makes this even weirder is that if A and B are observables, $\mbox{Tr}(A\otimes B)$ is independent of the state and therefore can't be a measure of correlation, and that if A and B are state operators, the result is always 1, because $\mbox{Tr}(A\otimes B)=\mbox{Tr }A\cdot \mbox{Tr }B$.
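The trace identity used here is easy to confirm numerically. The following is a minimal sketch assuming NumPy, with randomly generated matrices standing in for the observables and state operators; `random_hermitian` and `random_state` are hypothetical helpers.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_hermitian(d):
    """A random d-by-d hermitian matrix (stand-in for an observable)."""
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (m + m.conj().T) / 2

def random_state(d):
    """A random d-by-d state operator: positive semidefinite with trace 1."""
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = m @ m.conj().T
    return rho / np.trace(rho).real

# Observables: Tr(A (x) B) = Tr(A) Tr(B), with no state appearing anywhere.
A, B = random_hermitian(2), random_hermitian(3)
trace_of_product = np.trace(np.kron(A, B))
product_of_traces = np.trace(A) * np.trace(B)

# State operators: both traces are 1, so the result is always 1.
rho_a, rho_b = random_state(2), random_state(3)
trace_of_states = np.trace(np.kron(rho_a, rho_b)).real

print(np.isclose(trace_of_product, product_of_traces), trace_of_states)
```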

Perhaps I'm a bit slower than usual. I'm writing this one just before going to bed. Same thing with the question in my previous post.

Edit: I didn't mean to suggest that we're looking for a single number that tells us how correlated the subsystems are. We're looking for a set of numbers that in some way represent correlations of the type I have described, and together represent all of the information that's contained in the state operator.

"State operator" feels like a natural thing to call it, but I honestly don't know if I've ever seen anyone use that term. Ballentine calls it "statistical operator". Sakurai calls it "density.." uhh I don't remember if he calls it "density operator" or "density matrix". Wikipedia calls it "density operator", but the article is titled "density matrix".

Last edited: Dec 18, 2009
13. Dec 19, 2009

### Fredrik

Staff Emeritus
D'oh. I felt that something was very wrong when I wrote my previous post, but I didn't know what. Now I see that Mermin says that the correlations are represented by quantities of the form $Tr(\rho A\otimes B)$, not $Tr(A\otimes B)$. I haven't thought about what that means yet. I'm just posting to say that I've spotted this mistake.

14. Dec 19, 2009

### strangerep

I'm sorry. My previous post was unhelpful. (I foolishly thought I could dash off an

Oh well... this is why I enjoy talking to you. You have a way of making me realize when
I don't understand things thoroughly enough. :-)

I'll try to post something more helpful later when I've re-studied some things properly.

In the meantime, you might also like to read Mermin's earlier paper quant-ph/9609013.
I get the feeling his later paper is partly an attempted response to questions about
this earlier paper about things that weren't clear to others. But he's partly only
succeeded in saying a lot more words without a proportionate increase in clarity (imho).
All this stuff about subsystem correlations is deceptively difficult. It looks easy enough
on a casual reading, but it's really quite subtle.

I suspect I'm yet to emerge from the "mist of incomprehension" that Herbut mentions
in his paper (0811.3674 [quant-ph]) in which he builds on Mermin's interpretation.
Actually, you might want to take a look at Herbut's paper as well. Among other things,
it reveals indirectly how hard it is to understand the depth of Mermin's ideas properly.

15. Dec 20, 2009

### Fredrik

Staff Emeritus
The calculation in appendix A is pretty straightforward. If $\{|\psi_\mu\rangle\}$ is a basis for a Hilbert space, then $\{|\psi_\mu\rangle\langle\psi_\nu|\}$ is a basis for the algebra of operators. This follows immediately from

$$X|\alpha\rangle=\sum_{\mu,\nu}|\psi_\mu\rangle\langle\psi_\mu|X|\psi_\nu\rangle\langle\psi_\nu|\alpha\rangle$$

The identity

$$(|\alpha\rangle\langle\beta|)^\dagger=|\beta\rangle\langle\alpha|$$

implies that the $|\psi_\mu\rangle\langle\psi_\nu|$ operators aren't hermitian in general. Mermin wants to work with hermitian operators, so he uses a standard trick to express the basis operators in terms of hermitian operators. Define $K_{\mu\nu}=|\psi_\mu\rangle\langle\psi_\nu|$ and write

$$K_{\mu\nu}=\frac{K_{\mu\nu}+K_{\mu\nu}^\dagger}{2}+\frac{K_{\mu\nu}-K_{\mu\nu}^\dagger}{2}=\frac{K_{\mu\nu}+K_{\nu\mu}}{2}+i\frac{K_{\mu\nu}-K_{\nu\mu}}{2i}=M^r_{\mu\nu}+iM^i_{\mu\nu}$$

Mermin considers a system that consists of two subsystems. He writes the basis vectors as $|\psi_\mu,\phi_\alpha\rangle=|\psi_\mu\rangle\otimes|\phi_\alpha\rangle$. He calls the state operator W, and uses the above to express an arbitrary matrix element of W in terms of the hermitian M operators for the two subsystems. (He uses the letter N for the "M operators" of the second subsystem).

$$\langle\psi_\nu,\phi_\beta|W|\psi_\mu,\phi_\alpha\rangle=\sum_{\rho,\gamma}\langle\psi_\nu,\phi_\beta|\psi_\rho,\phi_\gamma\rangle\langle\psi_\rho,\phi_\gamma|W|\psi_\mu,\phi_\alpha\rangle=\sum_{\rho,\gamma}\langle\psi_\rho,\phi_\gamma|W|\psi_\mu,\phi_\alpha\rangle\langle\psi_\nu,\phi_\beta|\psi_\rho,\phi_\gamma\rangle$$

$$=\mbox{Tr}\big(W|\psi_\mu,\phi_\alpha\rangle\langle\psi_\nu,\phi_\beta|\big)=\mbox{Tr}\Big(W\Big(|\psi_\mu\rangle\langle\psi_\nu|\otimes|\phi_\alpha\rangle\langle\phi_\beta|\Big)\Big)$$

$$=\mbox{Tr}\Big(W\Big((M^r_{\mu\nu}+iM^i_{\mu\nu})\otimes(N^r_{\alpha\beta}+iN^i_{\alpha\beta})\Big)\Big)$$

$$=\mbox{Tr}\big(W\big(M^r_{\mu\nu}\otimes N^r_{\alpha\beta}\big)\big) +i\mbox{Tr}\big(W\big(M^r_{\mu\nu}\otimes N^i_{\alpha\beta}\big)\big) +i\mbox{Tr}\big(W\big(M^i_{\mu\nu}\otimes N^r_{\alpha\beta}\big)\big) -\mbox{Tr}\big(W\big(M^i_{\mu\nu}\otimes N^i_{\alpha\beta}\big)\big)$$

So the calculation isn't too hard. Now I just need to understand why we're doing it.
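The calculation above can also be run in reverse as a numerical sketch: compute only the real numbers $\mbox{Tr}(W(M\otimes N))$ and rebuild every matrix element of W from them. This assumes NumPy, uses the standard basis for both subsystems, and picks hypothetical dimensions 2 and 3; the function names are illustrative, not Mermin's.

```python
import numpy as np

def K(d, mu, nu):
    """|psi_mu><psi_nu| in the standard basis of a d-dimensional space."""
    k = np.zeros((d, d), dtype=complex)
    k[mu, nu] = 1.0
    return k

def hermitian_parts(d, mu, nu):
    """Return (M^r, M^i), both hermitian, with K = M^r + i M^i."""
    k = K(d, mu, nu)
    return (k + k.conj().T) / 2, (k - k.conj().T) / (2j)

def corr(W, A, B):
    """Tr(W (A (x) B)); real whenever W, A and B are hermitian."""
    return np.trace(W @ np.kron(A, B)).real

d1, d2 = 2, 3                                     # hypothetical subsystem dimensions
rng = np.random.default_rng(2)
m = rng.normal(size=(d1 * d2, d1 * d2)) + 1j * rng.normal(size=(d1 * d2, d1 * d2))
W = m @ m.conj().T
W /= np.trace(W).real                             # random state operator

W_rebuilt = np.zeros_like(W)
for mu in range(d1):
    for nu in range(d1):
        Mr, Mi = hermitian_parts(d1, mu, nu)
        for al in range(d2):
            for be in range(d2):
                Nr, Ni = hermitian_parts(d2, al, be)
                # The four-term expansion from the last line of the calculation.
                element = (corr(W, Mr, Nr) - corr(W, Mi, Ni)
                           + 1j * (corr(W, Mr, Ni) + corr(W, Mi, Nr)))
                # This is <psi_nu, phi_beta| W |psi_mu, phi_alpha>.
                W_rebuilt[nu * d2 + be, mu * d2 + al] = element

print(np.allclose(W, W_rebuilt))
```

Only real-valued traces against products of hermitian operators enter the loop, yet they fix W completely.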

I'm including links to the three papers you've mentioned, just to make it easier to access them from here:

http://arxiv.org/abs/quant-ph/9801057
http://arxiv.org/abs/quant-ph/9609013
http://arxiv.org/abs/0811.3674

I've read the easy parts of the original Ithaca paper now, but I haven't tried to understand the details of the proofs. Herbut's paper looks much more difficult.

Last edited: Dec 20, 2009
16. Dec 20, 2009

### yossell

Not sure what your current worry is. What this proof shows or the meaning of $Tr(\rho A\otimes B)$?

17. Dec 20, 2009

### Fredrik

Staff Emeritus
Aren't those two options the same? I'd like to know why the terms in the final result represent correlations between subsystems.

18. Dec 20, 2009

### strangerep

OK, I'll try to say something more helpful this time.

First, a quick review...

The "correlation" $\rho_{X,Y}$ between two observables X,Y in a given state is defined as
$$\rho_{X,Y} ~:=~ \frac{cov(X,Y)}{\sigma_X ~ \sigma_Y} ~,$$
where the terms in the denominator are the standard deviations
of the respective observables, and the numerator is their covariance:

$$cov(X,Y)_\Psi ~:=~ \langle \; (X - \bar{X}_\Psi) \; (Y - \bar{Y}_\Psi) \; \rangle_\Psi ~,$$

where the barred terms denote the expectations of the respective observables
in the state $\Psi$.

The "correlation" defined above can be viewed as a "normalized" covariance.

Since the barred terms are just scalars, a few lines of linear algebra show that

$$cov(X,Y)_\Psi ~=~ \langle X Y \rangle_\Psi ~-~ \bar{X}_\Psi \, \bar{Y}_\Psi ~.$$

So far, this is just equivalent to the well-known result that the covariance
between two observables is zero if and only if their expectation factorizes
as shown above.

Now we apply this to the case of a tensor product system, where X is an
operator acting nontrivially only on the first subsystem, and Y acts
nontrivially only on the second. Let W be the state of the tensor product system.

$\bar{X}_W$ clearly involves only the first subsystem, so
$(X - \bar{X}_W)$ is also an observable acting nontrivially
only on the first subsystem. Similarly for Y and the second subsystem.

Likewise, the standard deviation $(\sigma_X)_W$ only involves
quantities from the first subsystem nontrivially, so we can divide
$(X - \bar{X}_W)$ by $(\sigma_X)_W$ and get yet another
observable that acts nontrivially only on the first subsystem.
(And similarly for Y and the second subsystem.)

So let's consider the observable

$$A ~:=~ \frac{X - \bar{X}}{\sigma_X}$$

and similarly B in terms of Y.

(Of course, I'm ignoring the special case where the standard
deviation is zero, but in that case we'd just use covariances alone.)

We can now see that correlations between A and B can be
expressed simply as

$$\langle A B \rangle_W$$

provided we note the abuse of notation by which "A" and "B" have been
extended to mean

$$A ~\equiv~ A \otimes 1 ~~;~~~~~ B ~\equiv~ 1 \otimes B$$

To tie this in more closely to Mermin's SSC theorem, we note that the
theorem says "...values of $Tr(W\, A\otimes B)$ for an appropriate set of observable pairs A,B ...." (my italics).

Is this adequate to explain how/why the far RHS of Mermin's eq(30) represents "the values of the subsystem correlations between all the A's and B's", and hence that they "are enough to determine all the matrix elements of W in a complete set of states for the total system [...]" ?

19. Dec 21, 2009

### Fredrik

Staff Emeritus
Thank you strangerep. That was helpful. I didn't know those definitions. I see that the definition

$$A=\frac{X-\langle X\rangle}{\Delta X}$$

implies

$$\langle A\rangle=\frac{\langle X\rangle-\langle X\rangle}{\Delta X}=0$$

$$\langle A^2\rangle=\left\langle\left(\frac{X-\langle X\rangle}{\Delta X}\right)^2\right\rangle=\left\langle\frac{X^2+\langle X\rangle^2-2X\langle X\rangle}{(\Delta X)^2}\right\rangle=\frac{\langle X^2\rangle-\langle X\rangle^2}{(\Delta X)^2}=1$$

and therefore

$$(\Delta A)^2=\langle A^2\rangle-\langle A\rangle^2=1$$

and similarly for B. So the correlation of A and B is equal to the covariance of A and B, which is equal to

$$\langle AB\rangle-\langle A\rangle\langle B\rangle=\langle AB\rangle =\mbox{Tr}(WAB)=\mbox{Tr}(W(A\otimes I)(I\otimes B))=\mbox{Tr}(W(A\otimes B))$$

where W is the state operator.
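As a sanity check of this chain of equalities, here is a small numerical sketch assuming NumPy. The state W and the local observables are random, the helpers `expect` and `standardize` are hypothetical, and X and Y are already extended to the full space as $X\otimes I$ and $I\otimes Y$.

```python
import numpy as np

rng = np.random.default_rng(3)
d1, d2 = 2, 2                                   # hypothetical subsystem dimensions

def random_hermitian(d):
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (m + m.conj().T) / 2

m = rng.normal(size=(d1 * d2, d1 * d2)) + 1j * rng.normal(size=(d1 * d2, d1 * d2))
W = m @ m.conj().T
W /= np.trace(W).real                           # random state operator

def expect(W, op):
    """Expectation value Tr(W op) of a hermitian operator."""
    return np.trace(W @ op).real

def standardize(W, op):
    """(op - <op>) / Delta op, evaluated in the state W."""
    mean = expect(W, op)
    var = expect(W, op @ op) - mean ** 2
    return (op - mean * np.eye(op.shape[0])) / np.sqrt(var)

X = np.kron(random_hermitian(d1), np.eye(d2))   # acts only on subsystem 1
Y = np.kron(np.eye(d1), random_hermitian(d2))   # acts only on subsystem 2
A, B = standardize(W, X), standardize(W, Y)

# Correlation computed directly from its definition cov / (sigma_X sigma_Y).
cov = expect(W, X @ Y) - expect(W, X) * expect(W, Y)
sigma_x = np.sqrt(expect(W, X @ X) - expect(W, X) ** 2)
sigma_y = np.sqrt(expect(W, Y @ Y) - expect(W, Y) ** 2)
correlation = cov / (sigma_x * sigma_y)

# The same number obtained as <AB> = Tr(W A B).
correlation_via_trace = expect(W, A @ B)
print(np.isclose(correlation, correlation_via_trace))
```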

I also see that this can be related to what I said before about the state |↑>|left>+|↓>|right>. Consider the pure state

a|↑>|left> + b|↓>|right> + c|↑>|right> + d|↓>|left>

What I said there can be translated to "the states of the subsystems are correlated when the magnitudes of a and b are large and the magnitudes of c and d are small". Your definitions agree with that pretty well. When the state is the normalized $|\psi\rangle=\frac{1}{\sqrt 2}\big(|\uparrow\rangle|L\rangle+|\downarrow\rangle|R\rangle\big)$, we have

$$\langle\psi|S_z|\psi\rangle\langle\psi|X|\psi\rangle=\frac 1 4\Big(\langle\uparrow|S_z|\uparrow\rangle+\langle\downarrow|S_z|\downarrow\rangle\Big)\Big(\langle L|X|L\rangle+\langle R|X|R\rangle\Big)$$

$$=\frac 1 4\Big(\langle\uparrow|S_z|\uparrow\rangle\langle L|X|L\rangle+\langle\uparrow|S_z|\uparrow\rangle\langle R|X|R\rangle+\langle\downarrow|S_z|\downarrow\rangle\langle L|X|L\rangle+\langle\downarrow|S_z|\downarrow\rangle\langle R|X|R\rangle\Big)$$

where I have abbreviated "left" and "right" to L and R, and X is the "position" operator that only has two eigenstates, "left" and "right". Your definition compares this to

$$\langle\psi|S_zX|\psi\rangle=\frac 1 2\Big(\langle\uparrow|S_z|\uparrow\rangle\langle L|X|L\rangle+\langle\downarrow|S_z|\downarrow\rangle\langle R|X|R\rangle\Big)$$

Your definition says that the covariance of $S_z$ and $X$ in the state $|\psi\rangle$ is just the difference between these two quantities, which is

$$\langle\psi|S_zX|\psi\rangle-\langle\psi|S_z|\psi\rangle\langle\psi|X|\psi\rangle=\frac 1 4\Big(\langle\uparrow|S_z|\uparrow\rangle\langle L|X|L\rangle+\langle\downarrow|S_z|\downarrow\rangle\langle R|X|R\rangle-\langle\uparrow|S_z|\uparrow\rangle\langle R|X|R\rangle-\langle\downarrow|S_z|\downarrow\rangle\langle L|X|L\rangle\Big)=\frac 1 2\Big(\langle\psi|S_zX|\psi\rangle-\langle\phi|S_zX|\phi\rangle\Big)$$

where $|\phi\rangle=\frac{1}{\sqrt 2}\big(|\uparrow\rangle\otimes|R\rangle+|\downarrow\rangle\otimes |L\rangle\big)$. So the covariance is (at least in this case) proportional to the difference between two expressions of the form $\langle\chi|S_zX|\chi\rangle$, where $|\chi\rangle$ is first chosen to consist of the two "desirable" terms from a|↑>|left> + b|↓>|right> + c|↑>|right> + d|↓>|left> and then of the two "undesirable" terms.
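The same comparison can be made numerically for any coefficients a, b, c, d, with the state normalized before the expectation values are taken. This is a sketch assuming NumPy; the eigenvalues chosen for $S_z$ (±1/2, i.e. ħ = 1) and for the two-valued position operator (+1 for left, −1 for right) are hypothetical labels, and `covariance` is an illustrative helper.

```python
import numpy as np

Sz = np.diag([0.5, -0.5])      # basis order: up, down (hbar = 1, an assumption)
Xpos = np.diag([1.0, -1.0])    # basis order: left, right (hypothetical eigenvalues)

def covariance(a, b, c, d):
    """cov(S_z, X) in the normalized state a|up,L> + b|down,R> + c|up,R> + d|down,L>."""
    # np.kron ordering of the product basis: up-L, up-R, down-L, down-R.
    psi = np.array([a, c, d, b], dtype=complex)
    psi /= np.linalg.norm(psi)
    SzI = np.kron(Sz, np.eye(2))       # S_z (x) I
    IX = np.kron(np.eye(2), Xpos)      # I (x) X
    ev = lambda op: (psi.conj() @ op @ psi).real
    return ev(SzI @ IX) - ev(SzI) * ev(IX)

print(covariance(1, 1, 0, 0))   # only the "desirable" terms
print(covariance(0, 0, 1, 1))   # only the "undesirable" terms
print(covariance(1, 1, 1, 1))   # all four terms with equal coefficients
```

With these conventions the three calls give 0.5, −0.5 and 0: maximal covariance of either sign when only one pair of terms is present, and none when all four coefficients are equal, in which case the state is a product state.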

I have to leave the computer now...will add a few more things later.

Last edited: Dec 21, 2009
20. Dec 21, 2009

### Demystifier

The main difference between MWI and the information interpretation (II) is that MWI claims that the wave function is an objectively real entity, while II claims that the wave function is not an objectively real entity, but a mathematical function that describes our information about reality. An extreme II (which Mermin seems to adopt in his "correlations without correlata" mantra) asserts that objective reality does not even exist, i.e., that there is only information and nothing else. Another prominent supporter of such an extreme II is Zeilinger.