# Entanglement Entropy – Part 1: Quantum Mechanics

~~Quantum Mechanics(QM) is one of the greatest intellectual achievements in human history. Not only because it describes the world at the microscopic level and in turn provides us with the technological advancement that we enjoy today, but also because it shows us how little we know about the world we’ve been living in for so long. It shows us the strength of the human mind and its power to use logical arguments and mathematical reasoning to the point that it can surpass not only what we have imagined, but also what we are able to imagine. It shows us the limitations of our common sense that is the result of being compiled by observing only the phenomena that are common around us. In short, it has provided us with an enhanced view of ourselves and the world we live in.~~

In this insight, I’m going to explain a little bit about entanglement in QM, which is one of the interesting features of this amazing theory. But before delving into entanglement, it’s helpful to have some discussion about the notion of a quantum state.

**Quantum State**

Look around yourself. Surely there is some stuff around you that makes up your environment. If I ask you to describe it, to specify its state, a typical answer would be something like “There is a desk near the door and a dresser under the window. My bed is next to the dresser and my dog is running around in the middle of the room.” You just list the things that exist around you and give their positions and velocities. Your mind just assumes that these things objectively possess those properties, and so it seems intuitive to use those same properties to specify the state. This is our intuitive notion of the state of a classical system, and Classical Mechanics (CM) uses this same intuitive notion to describe the state of a classical system.

In CM, the state of a classical system is described by a point in a space called the phase space, and its evolution is just the motion of this point through this space. It sometimes happens that the constituents of the classical system are kinematically connected, which means there is a connection between some of the properties of different constituents. I call it a kinematical connection as opposed to a dynamical connection, which is an interaction between different constituents. The only effect of a kinematical connection is to restrict the portion of the phase space in which the classical system is allowed to be. It doesn’t fundamentally change anything about the way we talk about different constituents of the classical system; it only forces the system to move in a subspace of the phase space. But as you may have noticed, I said that for a classical system a kinematical connection affects the motion of the system through phase space, and so it’s not about the description of the system at a certain time, i.e. its state. So it seems that for a classical system, any connection between its constituents only affects the time evolution of the system and not our description of it at a particular time.

But as I mentioned earlier, QM makes us realize that our universe is something more than the things that have formed our common sense, and that extra stuff is so different that even our way of describing the state of a system doesn’t apply to it. In QM, the state of a quantum system is actually a bunch of probability distributions for all of the properties that the quantum system possesses. This means that a kinematical connection between two constituents implies that you can’t have separate probability distributions for each particle for at least one of those properties. In other words, the information that the quantum state gives about only one of the constituents is affected in a non-trivial way by the information about the other constituents, and this effect is truly kinematical: it is about the state of that constituent at a particular time.

Let’s consider an example for some clarification. Consider a particle at rest that disintegrates into two different particles. Because the initial state had zero momentum, the state after the disintegration should have zero momentum too, and so we should have ## p_1+p_2=0 ##. In CM, this means that the system evolves in the ## p_1+p_2=0 ## subspace of the phase space. Consider a particular point in this subspace, e.g. ## (x_1=1,x_2=4,p_1=5,p_2=-5) ##. This state by itself doesn’t tell you anything about the connection between the two particles, because two unrelated particles can be in this same state too.

But for the corresponding example in QM, the state of the system would be ##\displaystyle\int_0^{\infty} dp \phi(p) \left( e^{-ipx_1} e^{ipx_2}+e^{ipx_1}e^{-ipx_2}\right) ##. As you can see, the relation between the particles is obvious in the state itself and not in the way they’re going to evolve. So the connection between the particles is actually encoded in their state and actually affects the information that the state can give about each of the particles.

**Entanglement**

As I explained, in QM it is possible to have systems that are connected in such a way that in general there is no way to assign a state to each individual one of them, because the state of the system that is composed of those systems encodes something more than the information about each individual system: the connection between those systems. But anyone familiar with QM knows that there is still some notion of state that can be assigned to those individual systems (mixed states), so am I wrong in my last sentence? In a way, no!

Let’s go back to our CM example. When we say the two particles are in the state ## (x_1=1,x_2=4,p_1=5,p_2=-5) ##, we are giving the maximum amount of information that we can have about the system. This seems to be a reasonable notion of state because it doesn’t make sense to voluntarily give up information unless it’s needed, which in the case of two particles it obviously is not. As I said, this state contains no more information than the state of each of the particles, and so we can easily give each of the particles its own state, with the definition just given. So in CM, it’s always possible to assign a state to a system, at least in principle.

But let’s see how that definition of state works out in QM. The quantum state is something more than just the state of each of the constituents of the system, so it’s obvious that if we assign any kind of “state” to each of the constituents, in general we’re going to lose some information. So it seems that by the definition given, a system in an entangled state is a system that is composed of individual subsystems that cannot possess a state. But as I said, we can still assign some kind of a “state” to those subsystems, at the price of losing, in general, some information.

**Entanglement Entropy**

So we learned that a number of individual systems can be part of a larger system, and in general this affects the amount of information we can have about each of them. But entanglement is not an on-off concept; there can be different degrees of it, and so it’s useful to have a quantity that indicates the amount of entanglement. The quantity that is used is called the von Neumann entropy. Let’s see some mathematics:

###### If you’re familiar with state vectors, inner products, etc. you can skip this part between the lines.

As I explained, a quantum state is a bunch of probability distributions for the properties of the system. But what kind of a monster can contain that much information? To find that monster, we should at first take a look at the notion of a probability distribution.

When we talk about probabilities, the situation that comes to mind is that we have a variable that randomly takes values from a set, either continuous or discrete. A probability distribution associates a probability to each of the values in the set.

So the monster that we are looking for should have the ability to indicate different values of each of the properties and give each of them a probability. But in the case of quantum mechanics the familiar probability theory with real numbers doesn’t work, because different possibilities for a property of a quantum system have the ability to interfere with each other, and our notion of state should be able to indicate that interference too. So instead of using probability distributions directly, we use probability amplitudes whose modulus squared gives a probability distribution. We also allow these probability amplitudes to have complex values.

But different properties of a quantum system take values from different kinds of sets, and so a quantum state should have separate states for different properties that are somehow connected to each other to give a quantum state for the system. Such a mathematical notion was developed after QM was born; it’s called a Complex Hilbert Space, and a member of such a space is called a state vector. A Hilbert Space is very much like the good old Euclidean space; actually, a Euclidean space is a Real Hilbert Space with a Euclidean inner product.

So, a Complex Hilbert Space also needs a basis and that basis is given by different possibilities for the value of the property for which we want to make a Hilbert Space. In the following, I’m going to take the spin component of the state vector of a spin-1/2 particle as an example.

We know from QM that the spin of a spin-1/2 particle can be either up or down along any arbitrary direction. So we should just choose a direction, usually the z-direction, and consider up and down along that direction as the basis of the Hilbert Space (we then say that those basis vectors span the Hilbert Space), and a state for the spin of the said particle should give a probability amplitude to each of the possibilities. The state is represented by ## |\psi\rangle ## and the basis by ## \{|\uparrow\rangle,|\downarrow\rangle\} ##. It should be noted that the symbols inside ## | \rangle ## are just labels, so they don’t have to be numbers or letters of any particular alphabet, just whatever you like.

Like the Euclidean space in which we can expand any vector in terms of the basis ## \hat x, \hat y, \hat z ##, here we can write ## |\psi\rangle=a|\uparrow\rangle+b|\downarrow\rangle ##. a and b are the probability amplitudes for the spin to be up or down, respectively. So ## P(\uparrow)=\frac{|a|^2}{|a|^2+|b|^2} ## and ## P(\downarrow)=\frac{|b|^2}{|a|^2+|b|^2} ##. I divided by ##{|a|^2+|b|^2}## because probabilities should add up to one.
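As a small illustration of the normalization step above, here is a minimal Python sketch. The function name `spin_probabilities` and the sample amplitudes are made up for the example; only the formulas for ## P(\uparrow) ## and ## P(\downarrow) ## come from the text.

```python
def spin_probabilities(a, b):
    """Return (P_up, P_down) for the state a|up> + b|down>.

    a and b are complex probability amplitudes; they need not be normalized,
    because we divide by |a|^2 + |b|^2 exactly as in the text.
    """
    norm = abs(a) ** 2 + abs(b) ** 2
    return abs(a) ** 2 / norm, abs(b) ** 2 / norm


# Sample (unnormalized) amplitudes, chosen only for illustration.
p_up, p_down = spin_probabilities(1 + 1j, 2)
print(p_up, p_down, p_up + p_down)
```

With these sample amplitudes ## |a|^2=2 ## and ## |b|^2=4 ##, so the two probabilities are 1/3 and 2/3 and they add up to one.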

But what if we want to know the probability that the spin is up or down along a direction different from the z-direction? This is where the notion of an inner product comes into play. An inner product (indicated by ## \langle \phi|\psi\rangle ##) is just a way of associating a complex number to two vectors, and all we have to do to define it is to fix the inner products of the basis vectors with each other. So we should just give values to ## \langle \uparrow|\uparrow\rangle, \langle \uparrow|\downarrow\rangle, \langle \downarrow|\uparrow\rangle ## and ##\langle \downarrow|\downarrow\rangle##. Of course these values are arbitrary only from a mathematical perspective; they should make sense from a physical point of view. If the spin is definitely up, then it’s definitely not down and vice versa, so it makes sense to assume ## \langle \uparrow|\downarrow\rangle=\langle \downarrow|\uparrow\rangle=0 ##. But if the spin is definitely up (down)…well…it’s definitely up (down). So we should have ## \langle \uparrow|\uparrow\rangle=\langle \downarrow|\downarrow\rangle=1 ##.

Now we should just encode the direction we want (## \hat n=(\theta,\phi) ##) in a state ## |n\rangle=\cos(\frac \theta 2) |\uparrow\rangle+e^{i\phi}\sin(\frac \theta 2)|\downarrow\rangle ## and evaluate:

## \langle n|\psi\rangle=\left( \cos(\frac \theta 2)\langle \uparrow|+e^{-i\phi}\sin(\frac \theta 2) \langle \downarrow| \right)\left(a|\uparrow\rangle+b|\downarrow\rangle\right)=\\ a\cos(\frac \theta 2)\langle \uparrow|\uparrow\rangle+b\cos(\frac \theta 2)\langle \uparrow|\downarrow\rangle+ae^{-i\phi}\sin(\frac \theta 2) \langle \downarrow|\uparrow\rangle+be^{-i\phi}\sin(\frac \theta 2) \langle \downarrow|\downarrow\rangle=\\ a\cos(\frac \theta 2)+be^{-i\phi}\sin(\frac \theta 2) ##

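So for a spin in the state ## |\psi\rangle=a|\uparrow\rangle+b|\downarrow\rangle ##, the probability amplitude to be up along the direction ## \hat n ## is equal to ## a\cos(\frac \theta 2)+be^{-i\phi}\sin(\frac \theta 2) ##. The corresponding probability is the modulus squared of this amplitude divided by ## |a|^2+|b|^2 ##, i.e. ## P(\hat n)=\frac{|a\cos(\frac \theta 2)+be^{-i\phi}\sin(\frac \theta 2)|^2}{|a|^2+|b|^2} ##.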
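The inner product ## \langle n|\psi\rangle ## can also be evaluated numerically. The sketch below assumes NumPy and represents ## |\uparrow\rangle ## and ## |\downarrow\rangle ## as the column vectors (1, 0) and (0, 1); the helper names `direction_state` and `prob_along` are hypothetical. The probability is taken as the squared modulus of the amplitude, divided by the norm of the state.

```python
import numpy as np


def direction_state(theta, phi):
    """|n> = cos(theta/2)|up> + exp(i*phi) sin(theta/2)|down> in the z-basis."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])


def prob_along(psi, theta, phi):
    """Probability that the spin in state psi is found up along n = (theta, phi)."""
    psi = np.asarray(psi, dtype=complex)
    # np.vdot conjugates its first argument, so this is <n|psi>.
    amplitude = np.vdot(direction_state(theta, phi), psi)
    return abs(amplitude) ** 2 / np.vdot(psi, psi).real


# A spin that is up along z, asked about the x-direction (theta = pi/2, phi = 0).
print(prob_along([1, 0], np.pi / 2, 0.0))
# The state (|up> + |down>)/sqrt(2) is "up along x".
print(prob_along([1, 1], np.pi / 2, 0.0))
```

The first call gives one half (a z-up spin is equally likely to be found up or down along x), and the second gives one, up to floating-point rounding.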

A density operator is another way to associate a state to a quantum system. It’s defined by ## \rho=|\psi\rangle\langle\psi| ## and for the above state (assumed normalized, ## |a|^2+|b|^2=1 ##) becomes:

##\rho=\left(a|\uparrow\rangle+b|\downarrow\rangle\right)\left( a^*\langle \uparrow|+b^*\langle \downarrow| \right)=|a|^2|\uparrow\rangle\langle \uparrow|+ab^*|\uparrow\rangle\langle\downarrow|+ba^*|\downarrow\rangle\langle\uparrow|+|b|^2|\downarrow\rangle\langle\downarrow| ##

Then the probability for the spin to be up is given by ## P(\uparrow)=Tr(\rho |\uparrow\rangle\langle \uparrow|) ##. The Tr symbol means the trace of an operator (say, ##\sigma##), which is calculated like ## Tr(\sigma)=\langle \uparrow|\sigma|\uparrow\rangle+\langle\downarrow|\sigma|\downarrow\rangle ##. So we have:

## P(\uparrow)=Tr(\rho |\uparrow\rangle\langle \uparrow|)=\langle \uparrow |\rho|\uparrow\rangle\underbrace{\langle\uparrow|\uparrow\rangle}_{1}+\langle \downarrow|\rho|\uparrow\rangle\underbrace{\langle \uparrow|\downarrow\rangle}_{0}=\\ \langle \uparrow |\left(|a|^2|\uparrow\rangle\langle \uparrow|+ab^*|\uparrow\rangle\langle\downarrow|+ba^*|\downarrow\rangle\langle\uparrow|+|b|^2|\downarrow\rangle\langle\downarrow|\right)|\uparrow\rangle=\\|a|^2 \underbrace{\langle \uparrow|\uparrow\rangle}_1\underbrace{\langle\uparrow|\uparrow\rangle}_1+ab^*\underbrace{\langle \uparrow|\uparrow\rangle}_1\underbrace{\langle\downarrow|\uparrow\rangle}_0+ba^*\underbrace{\langle\uparrow|\downarrow\rangle}_0\underbrace{\langle\uparrow|\uparrow\rangle}_1+|b|^2\underbrace{\langle\uparrow|\downarrow\rangle}_0\underbrace{\langle\downarrow|\uparrow\rangle}_0=|a|^2 ##
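The same trace can be checked with matrices. This is a minimal sketch assuming NumPy and the z-basis representation ## |\uparrow\rangle=(1,0) ##, ## |\downarrow\rangle=(0,1) ##; the amplitudes 0.6 and 0.8i are sample values chosen so that the state is normalized.

```python
import numpy as np

# Sample normalized amplitudes (|a|^2 + |b|^2 = 0.36 + 0.64 = 1).
a, b = 0.6, 0.8j
psi = np.array([a, b], dtype=complex)

# rho = |psi><psi| as a 2x2 matrix: rho[j, k] = psi[j] * conj(psi[k]).
rho = np.outer(psi, psi.conj())

# Projector |up><up| in the same basis.
proj_up = np.array([[1, 0], [0, 0]], dtype=complex)

# P(up) = Tr(rho |up><up|), which should equal |a|^2.
p_up = np.trace(rho @ proj_up).real
print(rho)
print(p_up)
```

The diagonal entries of `rho` are ## |a|^2 ## and ## |b|^2 ##, the off-diagonal ones are ## ab^* ## and ## ba^* ##, matching the expansion of ## \rho ## above.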

The quantum state of a system is given by a state vector, ## |\psi\rangle ##. We can also associate a density operator to the system, ## \rho=|\psi\rangle \langle \psi| ##. Now if the quantum system consists of two parts and the Hilbert spaces of these two parts are spanned by ## \{ |\phi^a_n\rangle \} ## and ## \{ |\phi^b_n\rangle \} ##, the “state” that we can associate to subsystem a is computed by the formula ## \rho_a=\sum_n \langle \phi^b_n|\rho|\phi^b_n \rangle ##. (For a spin-1/2, this becomes ## \rho_a=\langle \uparrow^b |\rho|\uparrow^b\rangle+\langle \downarrow^b|\rho|\downarrow^b\rangle ##, where the superscript b means that these basis vectors should only be multiplied by the basis vectors of the b part of the state.) Then the von Neumann entropy is calculated by the formula ## S_a=-Tr(\rho_a \ln \rho_a) ##.
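A rough numerical version of these two formulas, assuming NumPy and the convention that the composite basis is ordered as `np.kron(a_vector, b_vector)`. The function names `partial_trace_b` and `von_neumann_entropy` are hypothetical helpers, and the entropy is computed from the eigenvalues of ## \rho_a ## (zero eigenvalues are dropped, using the convention ## 0\ln 0=0 ##).

```python
import numpy as np


def partial_trace_b(rho, dim_a, dim_b):
    """rho_a = sum_n <phi^b_n| rho |phi^b_n>, for rho acting on a (x) b."""
    # Index layout after reshape: (a, b, a', b'); tracing sets b = b' and sums.
    rho = np.asarray(rho).reshape(dim_a, dim_b, dim_a, dim_b)
    return np.trace(rho, axis1=1, axis2=3)


def von_neumann_entropy(rho):
    """S = -Tr(rho ln rho), evaluated on the eigenvalues of rho (in nats)."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # drop zeros: 0 * ln(0) is taken to be 0
    return float(-np.sum(evals * np.log(evals)))


up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# A product (unentangled) state |up>|down> as a first check.
psi = np.kron(up, down)
rho = np.outer(psi, psi.conj())
rho_a = partial_trace_b(rho, 2, 2)
print(rho_a)
print(von_neumann_entropy(rho_a))
```

For this product state ## \rho_a=|\uparrow\rangle\langle\uparrow| ## and the entropy vanishes, as expected for a state with no entanglement.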

Let’s consider a system consisting of two spins in the state ## \sin\theta |\uparrow\rangle| \downarrow\rangle+\cos\theta |\downarrow\rangle| \uparrow\rangle ##. The density operator associated to this system is:

## \rho= \left[ \sin\theta |\uparrow\rangle| \downarrow\rangle+\cos\theta |\downarrow\rangle| \uparrow\rangle \right] \left[\sin\theta \langle\uparrow|\langle\downarrow|+\cos\theta \langle\downarrow|\langle \uparrow|\right]= \\ \sin^2\theta |\uparrow\rangle| \downarrow\rangle \langle\uparrow|\langle\downarrow|+ \sin\theta \cos\theta \left [|\uparrow\rangle| \downarrow\rangle \langle\downarrow|\langle \uparrow|+|\downarrow\rangle| \uparrow\rangle \langle\uparrow|\langle\downarrow| \right]+\cos^2\theta |\downarrow\rangle| \uparrow\rangle \langle\downarrow|\langle \uparrow| ##

The reduced density operator associated to the first spin can easily be calculated to be ## \rho_i =\sin^2\theta |\uparrow\rangle\langle\uparrow|+\cos^2\theta |\downarrow\rangle\langle\downarrow| ## (for the second spin the roles of ## \sin^2\theta ## and ## \cos^2\theta ## are exchanged), and so the von Neumann entropy of either spin is equal to ## -\sin^2\theta \ln \sin^2\theta-\cos^2\theta \ln\cos^2\theta ##.

As you can see, the state given above is parameterized by ## \theta ##, so let’s see what the situation is for different values of the parameter. For ## \theta=0 ## and ## \theta=\frac \pi 2 ##, the state is, respectively, ## |\downarrow \rangle|\uparrow\rangle## and ## |\uparrow\rangle |\downarrow \rangle##. These are factored states, like classical states. For these states, complete knowledge about the whole system gives us complete knowledge about its parts. And you can see that for these states, the von Neumann entropy is equal to zero. But for ## \theta=\frac \pi 4 ##, the state is ## \frac 1 {\sqrt 2} \left(|\uparrow\rangle |\downarrow \rangle+|\downarrow \rangle|\uparrow\rangle\right) ##. This state is clearly entangled and each of the two possibilities is equally likely. The von Neumann entropy for this state is equal to ## \ln 2 ##.
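These three values of ## \theta ## can be checked numerically. The following is a self-contained sketch assuming NumPy, with the composite basis ordered as `np.kron(first_spin, second_spin)`; the function name `entanglement_entropy` is made up for the example, and the entropy is in natural-log units as in the text.

```python
import numpy as np

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])


def entanglement_entropy(theta):
    """Entropy of the first spin for sin(theta)|up,down> + cos(theta)|down,up>."""
    psi = np.sin(theta) * np.kron(up, down) + np.cos(theta) * np.kron(down, up)
    # Density operator with indices (a, b, a', b'), then trace out spin b.
    rho = np.outer(psi, psi.conj()).reshape(2, 2, 2, 2)
    rho_a = np.trace(rho, axis1=1, axis2=3)
    evals = np.linalg.eigvalsh(rho_a)
    evals = evals[evals > 1e-12]  # convention: 0 * ln(0) = 0
    return float(-np.sum(evals * np.log(evals)))


for theta in (0.0, np.pi / 4, np.pi / 2):
    print(f"theta = {theta:.4f}  S = {entanglement_entropy(theta):.4f}")
```

The entropy vanishes at both ends of the range and reaches ## \ln 2\approx 0.693 ## at ## \theta=\frac \pi 4 ##, in agreement with the closed-form expression ## -\sin^2\theta \ln \sin^2\theta-\cos^2\theta \ln\cos^2\theta ##.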

**Why is it an Entropy?**

Good question. As I explained, entanglement means that there is some information about the subsystems in the state of the whole system. So if someone is only allowed to examine a subsystem, they are going to miss some information and because that information is going to affect the subsystem at hand, it means that there is actually some information about the subsystem under study that is lost to the observer. The more the entanglement, the more the information the observer misses about the subsystem. So it seems that any measure of entanglement, also measures the amount of information about a subsystem that the observer is going to lose if they’re only allowed to examine that subsystem. This really feels like some kind of entropy!

Now that we have some understanding of the notion of Entanglement Entropy in QM, I’m going to delve into the same notion for a Quantum Field Theory in the next part. Coming soon!

Very nice. I think this is a really important topic, when it comes to understanding quantum mechanics. I'm beginning to think that entanglement entropy is not just analogous to the entropy that almost always increases in classical thermodynamics, but that they might in some sense be the same thing. That is, I wonder whether the second law of thermodynamics can be understood in terms of entanglement?

Isn't the author kind of suggesting a hidden variable theory? He says that if an observer only sees a part of the system he is losing information that is contained in the state of the whole system. (I hope I did not mess that up) Here are his own words

Let's look at a particular example, namely EPR. It's known ahead of time that Alice's particle and Bob's particle are correlated: If Alice measures spin-up along some axis [itex]\vec{a}[/itex], then Bob will definitely measure spin-down along that axis.

But if you try to give separate states for Alice's particle and Bob's particle, you would have to say:

But those two statements involve throwing away the information that Alice's and Bob's particles are correlated. Whether you consider a correlation to be a "hidden variable" or not is a matter of terminology, but it's definitely not a local hidden variable.

I think you could:

The second laws of quantum thermodynamics: https://arxiv.org/abs/1305.5278

Instead of just the von Neumann entropy, their result utilizes an infinite family of the Renyi entropies, which apparently are also entanglement measures. (Classically, the Renyi entropies are generalizations of the Shannon entropy and can be derived by postulating some reasonable axioms that the Shannon entropy satisfies.) There may be other approaches, but I know about this one because it is very much quantum-information-theoretic. I don't know it in details but if people are interested, we could dive into it (maybe in another thread).

Well, it's an appealing line of thinking and there are some similarities. But it's not that straightforward. The general belief is that entanglement entropy contains thermal entropy for a thermal state. And it's important to mention that entanglement entropy is not extensive! In the next part, I'll calculate entanglement entropy for a thermal state of a field theory and you'll see that although it's not extensive, its high temperature limit is!

Also people have explored laws about entanglement entropy that resemble the laws of thermodynamics. But because the methods used in QFT are too hard, people usually use holographic arguments, e.g. this paper.

My goal is to have a series that starts with Entanglement Entropy, then goes to Holography and then to Holographic methods to calculate the Entanglement Entropy of Quantum Field Theories. But I'm just a master's student working on this, so it's only going to be an introduction.

That's a very nice Insight article. The only thing I'd make more clear is the definition of the reduced operator. It's a partial trace. Using your notation, the Hilbert space of the total system is spanned by the Kronecker-product basis

$$|\Phi(n_a,n_b) \rangle=|\phi_{n_a}^a \rangle \otimes |\phi_{n_b}^b \rangle.$$

Then the reduced statistical operator for subsystem ##a## is given by tracing out subsystem ##b##, i.e.,

$$\hat{\rho}_a=\mathrm{Tr}_{b} \hat{\rho} := \sum_{n_a,n_a',n_b} |\Phi(n_a,n_b)\rangle \langle \Phi(n_a,n_b)|\hat{\rho}|\Phi(n_a',n_b) \rangle \langle \Phi(n_a',n_b)|.$$

I've gotten some feedback requesting a greater explanation of the math and terminology of the entropy section.

At the macroscopic level, one can identify entropy with a kind of degeneracy in the system. If there are multiple microstates consistent with the total energy and other conserved quantities, then S = k ln(number of degenerate microstates). What you seem to be saying about QM statistical mechanics is that when the one particle decays into 2, the process generates a 3rd state, the entangled one. Since the component states and the entangled one are degenerate (each is reachable from the others, with some probability), the entropy increases from the single particle state to the 2 produced particle states and still further when the entangled state is taken into account. Is this a reasonable interpretation of what you said?

I added some extra explanation.

No, it's not! I wasn't talking about classical or quantum statistical mechanics. It's just quantum mechanics.

In statistical mechanics, entropy is a measure of the information lost to us as the result of ignoring the exact state of the system. Entanglement Entropy is a measure of the amount of information contained in the state of the whole that is not contained in the "state" of each of the constituents. So it's a measure of the information lost as a result of not being able to examine the whole system. It doesn't have anything to do with the size of the system, whereas statistical entropy is about the size of the system, since we ignore the exact state of the system exactly because it's a lot of information. The information is there, we just ignore it! But in the case of entanglement entropy, we're actually measuring how much information the whole system can give us about a subsystem that the subsystem itself is fundamentally unable to give.

It's the information-theoretical approach, which made me understand what's entropy in the first place. A very good book on the information-theoretical approach to both classical and quantum statistical physics is

A. Katz, Principles of Statistical Mechanics, W. H. Freeman and Company, San Francisco and London, 1967.

Nice article, Shayan.

I've pondered the use of entropy in QM for many years and still don't feel I've really got the hang of it too well. I've always found Shannon's formulation and von Neumann's quantum generalization of it to be rather elegant and fundamental. For me, though, it isn't the entanglement entropy, per se, that's important but rather the *mutual* information; of course for pure states of bipartite systems the entanglement entropy and mutual information are proportional to one another.

If we have 2 quantum systems ##A## and ##B## with total entropy ##S## and reduced entropies ##S_A## and ##S_B## then they are related by the Araki-Lieb inequality $$ \left| S_A - S_B \right| \leq S \leq S_A + S_B $$ The RHS of this inequality is of course that of classical systems: the entropy of the whole must be less than or equal to the sum of the entropies of its constituents. The LHS is where the quantum magic comes in. For pure states of the combined ##AB## system the von Neumann entropy is zero so that in any (combined) pure state of 2 systems the quantum entropies of the 2 component pieces are equal, that is, ##S_A = S_B##.

The mutual information is a measure of the 'information content' of the correlation. In other words, if we only did measurements on the 2 systems alone it is the amount of information we would miss by not considering joint properties. With the mutual information defined as ##I = S_A + S_B - S## then using the AL inequality it's easy to show that $$ I \leq 2 \, \text{inf} \left\{ S_A , S_B \right\} $$ The maximum is obtained when the smaller component system is 'maximally' mixed. The classical version of the AL inequality would be $$ {\rm sup} \left\{ S_A , S_B \right\} \leq S \leq S_A + S_B $$ so that the total entropy, classically, can't be less than the entropy of either of its constituents.
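As a quick numerical sanity check of these inequalities, here is a sketch for one specific case, the maximally entangled two-spin state ## \frac 1 {\sqrt 2} \left(|\uparrow\rangle |\downarrow \rangle+|\downarrow \rangle|\uparrow\rangle\right) ##. It assumes NumPy; the helper names `entropy` and `reduced_states` are hypothetical, and entropies are in natural-log units.

```python
import numpy as np


def entropy(rho):
    """Von Neumann entropy -Tr(rho ln rho), from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # convention: 0 * ln(0) = 0
    return float(-np.sum(evals * np.log(evals)))


def reduced_states(rho, dim_a, dim_b):
    """Return (rho_A, rho_B) by tracing out the other subsystem."""
    r = rho.reshape(dim_a, dim_b, dim_a, dim_b)  # indices (a, b, a', b')
    return np.trace(r, axis1=1, axis2=3), np.trace(r, axis1=0, axis2=2)


# Components in the basis |up,up>, |up,down>, |down,up>, |down,down>.
psi = np.array([0.0, 1.0, 1.0, 0.0]) / np.sqrt(2)
rho = np.outer(psi, psi.conj())
rho_a, rho_b = reduced_states(rho, 2, 2)

S, S_A, S_B = entropy(rho), entropy(rho_a), entropy(rho_b)
I = S_A + S_B - S  # mutual information

tol = 1e-9
print("S, S_A, S_B, I =", S, S_A, S_B, I)
print("Araki-Lieb holds:", abs(S_A - S_B) <= S + tol and S <= S_A + S_B + tol)
print("I <= 2 min(S_A, S_B):", I <= 2 * min(S_A, S_B) + tol)
```

For this pure state ##S## is zero, ##S_A = S_B = \ln 2##, and the bound on ##I## is saturated at ##2\ln 2##.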

For EM fields, the 2-mode squeezed state can be considered to be a purification of the single mode thermal state – and the mutual information formalism tells us that the two-mode squeezed state is the most strongly correlated state of 2 modes of the EM field, subject to a mean energy constraint.

If we generalize this measure of correlation to multipartite quantum systems (e.g., ## I = S_A + S_B + S_C - S##) then some nice general properties can be derived for the evolution of correlations under unitary evolutions using only very elementary methods. The nice thing about this generalization to multipartite systems is that the mutual information is the only sensible measure that satisfies some reasonable properties. For example, if we had 2 uncorrelated (unentangled) systems ##A## and ##B##, each comprised of component parts, then any measure of correlation of the combined ##AB## system should just give the sum of the amount of correlation *within* each of ##A## and ##B##.

I still think there's more insight to be gained from the use of entropy/information in QM, but it will take a more talented person than me to figure it out.

I just wanted to expand on what @Simon Phoenix said about mutual information. I'm not exactly sure how to understand the meaning of Von Neumann entropy in the weird cases where it's not a positive number (classically, entropy is always positive, and represents roughly the number of bits of information to completely describe the complete situation). But it is a very stark way to show the difference between classical probability and quantum probability.

The idea of mutual information is shown in figure 1, below. If you have a composite system (and I've shown it as composed of a system each for experimenters Alice and Bob), then the information [itex]S[/itex] for the composite system can be broken down into three parts:

If Alice ignores Bob, then her subsystem is characterized by [itex]S(A) = S(A|B) + S(A:B)[/itex]. Similarly, Bob's subsystem ignoring Alice is characterized by [itex]S(B) = S(B|A) + S(A:B)[/itex]. The various pieces of information fit together into a Venn diagram, where [itex]S(A:B)[/itex] is the overlap.

View attachment 195048

**Coin flips**

The simplest example is a coin flip: Alice and Bob each flip a coin, and get "heads" or "tails". Alice's result says nothing about Bob's, and vice-versa, so the mutual information is 0. Each coin separately has an entropy of 1 bit. So the total entropy is 2 bits. This situation is shown in Figure 2.

**A pair of shoes**

Another classical example is a pair of shoes. Imagine that you have a pair of shoes and randomly pick one to send to Alice and send the other to Bob. If Alice ignores Bob, then this seems just like a coin flip: she gets one of two possibilities, that are equally likely. So [itex]S(A) = 1[/itex]. Similarly, [itex]S(B) = 1[/itex]. But in this case, Alice's result (left or right) tells her exactly what Bob's result will be (the opposite). There is only mutual information, and no information for Alice independent of Bob, or for Bob independent of Alice. So [itex]S = S(A:B) = 1[/itex].
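The two classical examples can be reproduced with a few lines of Python. This is a sketch assuming NumPy; the joint probability tables and the helper names `shannon_bits` and `summarize` are written for this illustration, with rows standing for Alice's outcome and columns for Bob's.

```python
import numpy as np


def shannon_bits(p):
    """Shannon entropy -sum p log2 p, in bits (zero-probability entries are skipped)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))


def summarize(joint):
    """Entropies of a joint distribution P(Alice's outcome, Bob's outcome)."""
    joint = np.asarray(joint, dtype=float)
    s_a = shannon_bits(joint.sum(axis=1))  # Alice's marginal
    s_b = shannon_bits(joint.sum(axis=0))  # Bob's marginal
    s = shannon_bits(joint)                # total entropy
    return {"S(A)": s_a, "S(B)": s_b, "S": s, "S(A:B)": s_a + s_b - s}


# Two independent fair coins: all four joint outcomes equally likely.
coins = summarize([[0.25, 0.25], [0.25, 0.25]])
# A pair of shoes: Alice gets left and Bob right, or the other way round.
shoes = summarize([[0.0, 0.5], [0.5, 0.0]])

print("coins:", coins)
print("shoes:", shoes)
```

The coins give a total entropy of 2 bits with no mutual information, while the shoes give a total entropy of 1 bit, all of which is mutual information.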

**EPR**

This is the essentially quantum situation. Alice and Bob each measure the spin of one of a pair of correlated particles along some axis. If Alice ignores Bob, then again her measurement seems like a coin flip: she gets one of two possible results, with equal probability. Similarly for Bob. So again [itex]S(A) = S(B) = 1[/itex]. But in this case, the total information is 0! There is only one possibility for a pair of entangled particles, so the composite information [itex]S[/itex] is zero. (Which is what the Von Neumann entropy gives for the composite two-particle state). But these two facts, plus the definitions of mutual information and conditional information lead us to absolutely weird conclusions:
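The arithmetic can be spelled out as follows, taking the mutual information to be [itex]S(A:B) = S(A) + S(B) - S[/itex] and the conditional information to be [itex]S(A|B) = S - S(B)[/itex] (these are the standard definitions, and they are consistent with the decomposition [itex]S(A) = S(A|B) + S(A:B)[/itex] used above):

$$ S(A:B) = S(A) + S(B) - S = 1 + 1 - 0 = 2, \qquad S(A|B) = S - S(B) = 0 - 1 = -1 $$

As a check, [itex]S(A|B) + S(A:B) = -1 + 2 = 1 = S(A)[/itex], so the pieces still add up to Alice's 1 bit.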

Neither of these make any sense, classically. It says that of the information associated with Alice's measurement (1 bit), 2 bits of it is associated with the mutual information shared by Alice and Bob, and -1 bit is private to Alice. How can information be negative? How can the information for a subsystem be more than the information for the composite system? The mathematics seems to work, but it's hard to understand it, intuitively.

Yes and the issue with the negativity of the conditional entropy is one of the reasons (amongst many) that I struggle with the interpretation of quantum entropies. Classically we have $$ I(A;B) = S(A) + S(B) - S = S(A) - S(A|B)$$ and one can *define* a quantum conditional entropy via $$ S(A|B) = S - S(B) $$ but this no longer has the intuitive classical meaning along the lines of "the uncertainty in ##A## given knowledge of ##B##" precisely because this 'conditional' entropy can be negative. The von Neumann entropy ## - \mathbf{Tr} \left( \rho \, {\rm ln} \rho \right)## is, of course, always greater than or equal to zero. The very appealing classical interpretation of information as a difference of uncertainties is not quite so straightforward in the quantum generalization.

Great Insight! Looking forward to Part 2!

I have difficulties to understand the details of the topic "between the lines". Please can you give a simple example – maybe using a graphical representation – illustrating the bra-ket mathematics in this section.

rhkail

It works better if you ask specific questions. At which line did it stop making sense to you?

I cannot figure out how the spins are encoded in a preferential direction (vector n). An example or a graphical representation in the complex plane could be helpful.

Moreover, why is the density operator rho not described by a matrix?

rhkail

"The arrow of time is an arrow of increasing correlations (with the surroundings)", Seth Lloyd said. So the von Neumann entropy can be considered as being consistent with the entanglement arrow of time, as Sandu Popescu et al showed 2009.

rhkail

As with any operator, choosing a basis allows you to describe ##\hat{\rho}## by an (in general infinite-dimensional) matrix,

$$\rho_{jk}=\langle j|\hat{\rho}|k \rangle.$$
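For the spin-1/2 example of the article this matrix is just 2x2. The sketch below, assuming NumPy, builds it element by element from the definition of the matrix elements, with the z-basis vectors taken as (1, 0) and (0, 1); the amplitudes and the helper name `apply_rho` are illustrative only.

```python
import numpy as np

# z-basis: basis[0] = |up>, basis[1] = |down>.
basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]

# Sample normalized state a|up> + b|down>.
a, b = 0.6, 0.8j
psi = a * basis[0] + b * basis[1]


def apply_rho(vec):
    """Action of rho = |psi><psi| on a vector: |psi> <psi|vec>."""
    return psi * np.vdot(psi, vec)  # np.vdot conjugates its first argument


# rho_jk = <j| rho |k>
rho = np.array([[np.vdot(basis[j], apply_rho(basis[k])) for k in range(2)]
                for j in range(2)])
print(rho)
```

The result is the same array as `np.outer(psi, psi.conj())`, i.e. the matrix with ## |a|^2 ##, ## ab^* ##, ## ba^* ## and ## |b|^2 ## as its entries.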