# Entanglement Entropy – Part 1: Quantum Mechanics

~~Quantum Mechanics(QM) is one of the greatest intellectual achievements in human history. Not only because it describes the world at the microscopic level and in turn provides us with the technological advancement that we enjoy today, but also because it shows us how little we know about the world we’ve been living in for so long. It shows us the strength of the human mind and its power to use logical arguments and mathematical reasoning to the point that it can surpass not only what we have imagined, but also what we are able to imagine. It shows us the limitations of our common sense that is the result of being compiled by observing only the phenomena that are common around us. In short, it has provided us with an enhanced view of ourselves and the world we live in.~~

In this insight, I’m going to explain a little bit about entanglement in QM, which is one of the interesting features of this amazing theory. But before delving into entanglement, it’s helpful to have some discussion about the notion of a quantum state.

**Quantum State**

Look around you. There are surely some things around you that make up your environment. If I ask you to describe it, to specify its state, a typical answer would be something like “There is a desk near the door and a dresser under the window. My bed is next to the dresser and my dog is running around in the middle of the room.” You just list the things that exist around you and give their positions and velocities. Your mind simply assumes that these things objectively possess those properties, so it seems intuitive to use those same properties to specify the state. This is our intuitive notion of the state of a classical system, and Classical Mechanics (CM) uses this same notion to describe the state of a classical system.

In CM, the state of a classical system is described by a point in a space called the phase space, and its evolution is just the motion of this point through this space. It sometimes happens that the constituents of the classical system are kinematically connected, which means there is a connection between some of the properties of different constituents. I call it a kinematical connection as opposed to a dynamical connection, which is an interaction between different constituents. The only effect of a kinematical connection is to restrict the portion of the phase space in which the classical system is allowed to be. It doesn’t fundamentally change anything about the way we talk about the different constituents of the classical system; it only forces the system to move in a subspace of the phase space. But notice what I just said: for a classical system, a kinematical connection affects the motion of the system through phase space, so it’s not about the description of the system at a certain time, i.e. its state. So it seems that for a classical system, any connection between its constituents only affects the time evolution of the system and not our description of it at a particular time.

But as I mentioned earlier, QM makes us realize that our universe is something more than the things that have formed our common sense, and those extra things are so different that even our way of describing the state of a system doesn’t apply to them. In QM, the state of a quantum system is actually a bunch of probability distributions for all of the properties that the quantum system possesses. This means that a kinematical connection between two constituents implies that, for at least one of those properties, you can’t have separate probability distributions for each constituent. As a result, the information that the quantum state gives about one of the constituents is affected in a non-trivial way by the information about the other constituents, and this effect is truly kinematical: it is about the state of that constituent at a particular time.

Let’s consider an example for some clarification. Consider a particle at rest that disintegrates into two different particles. Because the initial state had zero momentum, the state after the disintegration should have zero momentum too, and so we should have ## p_1+p_2=0 ##. In CM, this means that the system evolves in the ## p_1+p_2=0 ## subspace of the phase space. Consider a particular point in this subspace, e.g. ## (x_1=1,x_2=4,p_1=5,p_2=-5) ##. This state by itself doesn’t tell you anything about the connection between the two particles, because two unrelated particles can be in this same state as well.

But for the corresponding example in QM, the state of the system would be ##\displaystyle\int_0^{\infty} dp \phi(p) \left( e^{-ipx_1} e^{ipx_2}+e^{ipx_1}e^{-ipx_2}\right) ##. As you can see, the relation between the particles is obvious in the state itself and not in the way they’re going to evolve. So the connection between the particles is actually encoded in their state and actually affects the information that the state can give about each of the particles.

**Entanglement**

As I explained, in QM it is possible to have systems that are connected in such a way that, in general, there is no way to assign a state to each individual one of them, because the state of the system that is composed of those systems encodes something more than the information about each individual system: the connection between those systems. But anyone familiar with QM knows that there is still some notion of state that can be assigned to those individual systems (mixed states), so am I wrong in my last sentence? In a way, no!

Let’s go back to our CM example. When we say the two particles are in the state ## (x_1=1,x_2=4,p_1=5,p_2=-5) ##, we are giving the maximum amount of information that we can have about the system. This seems to be a reasonable notion of state, because it doesn’t make sense to voluntarily give up information unless it’s needed, which in the case of two particles it obviously is not. As I said, this state contains no information beyond the state of each of the particles, and so we can easily give each of the particles its own state, with the definition just given. So in CM, it’s always possible to assign a state to each constituent of a system, at least in principle.

But let’s see how that definition of state works out in QM. The quantum state is something more than just the state of each of the constituents of the system, so it’s obvious that if we assign any kind of “state” to each of the constituents, in general we’re going to lose some information. So it seems that, by the definition given, a system in an entangled state is a system that is composed of individual subsystems that cannot possess a state. But as I said, we can still assign some kind of “state” to those subsystems, at the price of losing some information in general.

**Entanglement Entropy**

So we learned that a number of individual systems can be part of a larger system and that, in general, this affects the amount of information we can have about each of them. But entanglement is not an on-off concept: there can be different degrees of it, so it’s useful to have a quantity that indicates the amount of entanglement. The quantity that is used is called the von Neumann entropy. Let’s see some mathematics:

###### If you’re familiar with state vectors, inner products, etc. you can skip this part between the lines.

As I explained, a quantum state is a bunch of probability distributions for the properties of the system. But what kind of a monster can contain that much information? To find that monster, we should at first take a look at the notion of a probability distribution.

When we talk about probabilities, the situation that comes to mind is that we have a variable that randomly takes values from a set, either continuous or discrete. A probability distribution associates a probability to each of the values in the set.

So the monster that we are looking for should have the ability to indicate different values of each of the properties and give each of them a probability. But in the case of quantum mechanics, the familiar probability theory with real numbers doesn’t work, because different possibilities for a property of a quantum system have the ability to interfere with each other, and our notion of state should be able to indicate that interference too. So instead of using probability distributions directly, we use probability amplitudes whose modulus squared gives a probability distribution. We also allow these probability amplitudes to have complex values.

But different properties of a quantum system take values from different kinds of sets, and so a quantum state should have separate states for different properties that are somehow connected to each other to give a quantum state for the system. Such a mathematical notion was developed after QM was born, and it’s called a Complex Hilbert Space; a member of such a space is called a state vector. A Hilbert Space is very much like the good old Euclidean space. In fact, a Euclidean space is a Real Hilbert Space with a Euclidean inner product.

So, a Complex Hilbert Space also needs a basis and that basis is given by different possibilities for the value of the property for which we want to make a Hilbert Space. In the following, I’m going to take the spin component of the state vector of a spin-1/2 particle as an example.

We know from QM that the spin of a spin-1/2 particle can be either up or down along any arbitrary direction. So we should just choose a direction, usually the z-direction, and consider up or down along that direction as the basis of the Hilbert Space (then we say that those basis vectors span the Hilbert Space), and a state for the spin of the said particle should give a probability amplitude to each of the possibilities. The state is represented by ## |\psi\rangle ## and the basis by ## \{|\uparrow\rangle,|\downarrow\rangle\} ##. It should be noted that the symbols inside ## | \rangle ## are just labels, so they don’t have to be numbers or letters of any particular language, just whatever you like.

Like the Euclidean space in which we can expand any vector in terms of the basis ## \hat x, \hat y, \hat z ##, here we can write ## |\psi\rangle=a|\uparrow\rangle+b|\downarrow\rangle ##. Here ##a## and ##b## are the probability amplitudes for the spin to be up or down, respectively. So ## P(\uparrow)=\frac{|a|^2}{|a|^2+|b|^2} ## and ## P(\downarrow)=\frac{|b|^2}{|a|^2+|b|^2} ##. I divided by ##{|a|^2+|b|^2}## because probabilities should add up to one.
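The normalization above can be checked numerically. The following is a minimal sketch in plain Python; the function name `spin_probabilities` and the sample amplitudes are assumptions chosen only for illustration.

```python
def spin_probabilities(a, b):
    """Return (P_up, P_down) for the unnormalized state a|up> + b|down>."""
    norm_sq = abs(a) ** 2 + abs(b) ** 2
    if norm_sq == 0:
        raise ValueError("the zero vector does not describe a state")
    # Dividing by the squared norm makes the two probabilities add up to one.
    return abs(a) ** 2 / norm_sq, abs(b) ** 2 / norm_sq


# Hypothetical amplitudes: a = 1 + i, b = 2, so |a|^2 = 2 and |b|^2 = 4.
p_up, p_down = spin_probabilities(1 + 1j, 2)
print(p_up, p_down, p_up + p_down)
```

With these amplitudes the probabilities are close to 1/3 and 2/3, up to floating-point rounding.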

But what if we want to know the probability that the spin is up or down along a direction different from the z-direction? This is where the notion of an inner product comes into play. An inner product (indicated by ## \langle \phi|\psi\rangle ##) is just a way of associating a complex number to two vectors, and all we have to do to define it is to axiomatize the inner products of the basis vectors with each other. So we should just give values to ## \langle \uparrow|\uparrow\rangle, \langle \uparrow|\downarrow\rangle, \langle \downarrow|\uparrow\rangle ## and ##\langle \downarrow|\downarrow\rangle##. Of course these values are arbitrary only from a mathematical perspective; they should make sense from a physical point of view. If the spin is definitely up, then it’s definitely not down and vice versa, so it makes sense to assume ## \langle \uparrow|\downarrow\rangle=\langle \downarrow|\uparrow\rangle=0 ##. But if the spin is definitely up (down)…well…it’s definitely up (down). So we should have ## \langle \uparrow|\uparrow\rangle=\langle \downarrow|\downarrow\rangle=1 ##.

Now we should just encode the direction we want (## \hat n=(\theta,\phi) ##) in a state ## |n\rangle=\cos(\frac \theta 2) |\uparrow\rangle+e^{i\phi}\sin(\frac \theta 2)|\downarrow\rangle ## and evaluate:

## \langle n|\psi\rangle=\left( \cos(\frac \theta 2)\langle \uparrow|+e^{-i\phi}\sin(\frac \theta 2) \langle \downarrow| \right)\left(a|\uparrow\rangle+b|\downarrow\rangle\right)=\\ a\cos(\frac \theta 2)\langle \uparrow|\uparrow\rangle+b\cos(\frac \theta 2)\langle \uparrow|\downarrow\rangle+ae^{-i\phi}\sin(\frac \theta 2) \langle \downarrow|\uparrow\rangle+be^{-i\phi}\sin(\frac \theta 2) \langle \downarrow|\downarrow\rangle=\\ a\cos(\frac \theta 2)+be^{-i\phi}\sin(\frac \theta 2) ##

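So for a spin in the state ## |\psi\rangle=a|\uparrow\rangle+b|\downarrow\rangle ##, the probability amplitude to be up along the direction ## \hat n ## is ## a\cos(\frac \theta 2)+be^{-i\phi}\sin(\frac \theta 2) ##, and the corresponding probability is its modulus squared divided by the squared norm of the state: ## P(\hat n)=\frac{\left|a\cos(\frac \theta 2)+be^{-i\phi}\sin(\frac \theta 2)\right|^2}{|a|^2+|b|^2} ##.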
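Since an amplitude turns into a probability only after taking its modulus squared (and dividing by ## |a|^2+|b|^2 ## if the state is not normalized), a small numerical sketch may help. The function name `prob_along` and the test states are assumptions for illustration only.

```python
import cmath
import math


def prob_along(a, b, theta, phi):
    """Probability of spin up along n = (theta, phi) for a|up> + b|down>."""
    # Components of |n> in the z basis, as given in the text.
    n_up = math.cos(theta / 2)
    n_down = cmath.exp(1j * phi) * math.sin(theta / 2)
    # <n|psi>: the bra components are the complex conjugates of the ket ones.
    amplitude = n_up * a + n_down.conjugate() * b
    norm_sq = abs(a) ** 2 + abs(b) ** 2
    return abs(amplitude) ** 2 / norm_sq


# Spin up along z, measured along x (theta = pi/2, phi = 0).
print(prob_along(1, 0, math.pi / 2, 0.0))
# The state |up> + |down> points along +x, so the same measurement is certain.
print(prob_along(1, 1, math.pi / 2, 0.0))
```

The first case gives a probability of about 1/2 and the second about 1, which matches the usual picture of a spin along z being undetermined along x.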

A density operator is another way to associate a state to a quantum system. For a normalized state vector (## |a|^2+|b|^2=1 ##), it’s defined by ## \rho=|\psi\rangle\langle\psi| ## and for the above state becomes:

##\rho=\left(a|\uparrow\rangle+b|\downarrow\rangle\right)\left( a^*\langle \uparrow|+b^*\langle \downarrow| \right)=|a|^2|\uparrow\rangle\langle \uparrow|+ab^*|\uparrow\rangle\langle\downarrow|+ba^*|\downarrow\rangle\langle\uparrow|+|b|^2|\downarrow\rangle\langle\downarrow| ##

Then the probability for the spin to be up is given by ## P(\uparrow)=Tr(\rho |\uparrow\rangle\langle \uparrow|) ##. The Tr symbol means the trace of an operator (say, ##\sigma##), which is calculated like ## Tr(\sigma)=\langle \uparrow|\sigma|\uparrow\rangle+\langle\downarrow|\sigma|\downarrow\rangle ##. So we have:

## P(\uparrow)=Tr(\rho |\uparrow\rangle\langle \uparrow|)=\langle \uparrow |\rho|\uparrow\rangle\underbrace{\langle\uparrow|\uparrow\rangle}_{1}+\langle \downarrow|\rho|\uparrow\rangle\underbrace{\langle \uparrow|\downarrow\rangle}_{0}=\\ \langle \uparrow |\left(|a|^2|\uparrow\rangle\langle \uparrow|+ab^*|\uparrow\rangle\langle\downarrow|+ba^*|\downarrow\rangle\langle\uparrow|+|b|^2|\downarrow\rangle\langle\downarrow|\right)|\uparrow\rangle=\\|a|^2 \underbrace{\langle \uparrow|\uparrow\rangle}_1\underbrace{\langle\uparrow|\uparrow\rangle}_1+ab^*\underbrace{\langle \uparrow|\uparrow\rangle}_1\underbrace{\langle\downarrow|\uparrow\rangle}_0+ba^*\underbrace{\langle\uparrow|\downarrow\rangle}_0\underbrace{\langle\uparrow|\uparrow\rangle}_1+|b|^2\underbrace{\langle\uparrow|\downarrow\rangle}_0\underbrace{\langle\downarrow|\uparrow\rangle}_0=|a|^2 ##
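The same trace can be done with 2×2 matrices in the ## \{|\uparrow\rangle,|\downarrow\rangle\} ## basis. This is a minimal sketch in plain Python; the helper names (`density_matrix`, `matmul`, `trace`) and the normalized amplitudes ## a=0.6 ##, ## b=0.8i ## are assumptions for illustration.

```python
def density_matrix(a, b):
    """Matrix of rho = |psi><psi| in the (up, down) basis, for normalized psi."""
    psi = [complex(a), complex(b)]
    # Entry (i, j) is psi_i * conj(psi_j), matching the expansion in the text.
    return [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]


def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(m):
    """Tr(m) = <up|m|up> + <down|m|down>."""
    return m[0][0] + m[1][1]


rho = density_matrix(0.6, 0.8j)
proj_up = [[1, 0], [0, 0]]  # matrix of |up><up|
p_up = trace(matmul(rho, proj_up)).real
print(p_up)
```

The result agrees with ## |a|^2 ## for these amplitudes, up to floating-point rounding, and the off-diagonal entries of `rho` hold the ## ab^* ## and ## ba^* ## terms.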

The quantum state of a system is given by a state vector, ## |\psi\rangle ##. We can also associate a density operator to the system, ## \rho=|\psi\rangle \langle \psi| ##. Now if the quantum system consists of two parts and the Hilbert spaces of these two parts are spanned by ## \{ |\phi^a_n\rangle \} ## and ## \{ |\phi^b_n\rangle \} ##, the “state” that we can associate to subsystem a is computed by the formula ## \rho_a=\sum_n \langle \phi^b_n|\rho|\phi^b_n \rangle ##. (For a spin-1/2, this becomes ## \rho_a=\langle \uparrow^b |\rho|\uparrow^b\rangle+\langle \downarrow^b|\rho|\downarrow^b\rangle ##, where the superscript b means that these basis vectors should only be multiplied by the basis vectors of the b part of the state.) Then the von Neumann entropy is calculated by the formula ## S_a=-Tr(\rho_a \ln \rho_a) ##.
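For a pure state of the whole system, the formula for ## \rho_a ## amounts to summing over the index of subsystem b. Below is a minimal sketch, assuming the state is stored as a table of amplitudes `c[i][j]` for ## |\phi^a_i\rangle|\phi^b_j\rangle ##; this storage convention and the function name are assumptions, not something fixed by the text.

```python
import math


def reduced_density_matrix_a(c):
    """rho_a for the normalized pure state sum_ij c[i][j] |i>_a |j>_b.

    Tracing out b means summing over the b index j:
    rho_a[i][k] = sum_j c[i][j] * conj(c[k][j]).
    """
    dim_a, dim_b = len(c), len(c[0])
    return [[sum(complex(c[i][j]) * complex(c[k][j]).conjugate()
                 for j in range(dim_b))
             for k in range(dim_a)]
            for i in range(dim_a)]


s = 1 / math.sqrt(2)
# Index 0 is "up", index 1 is "down".
product = [[s, 0], [s, 0]]  # (|up> + |down>)/sqrt(2) for a, times |up> for b
bell = [[0, s], [s, 0]]     # (|up>|down> + |down>|up>)/sqrt(2)

rho_product = reduced_density_matrix_a(product)
rho_bell = reduced_density_matrix_a(bell)
print(rho_product)
print(rho_bell)
```

For the product state the off-diagonal entries of `rho_product` survive, so subsystem a still has a state vector of its own. For the entangled state `rho_bell` is half the identity matrix, which carries no phase information at all.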

Let’s consider a system consisting of two spins in the state ## \sin\theta |\uparrow\rangle| \downarrow\rangle+\cos\theta |\downarrow\rangle| \uparrow\rangle ##. The density operator associated to this system is:

## \rho= \left[ \sin\theta |\uparrow\rangle| \downarrow\rangle+\cos\theta |\downarrow\rangle| \uparrow\rangle \right] \left[\sin\theta \langle\uparrow|\langle\downarrow|+\cos\theta \langle\downarrow|\langle \uparrow|\right]= \\ \sin^2\theta |\uparrow\rangle| \downarrow\rangle \langle\uparrow|\langle\downarrow|+ \sin\theta \cos\theta \left [|\uparrow\rangle| \downarrow\rangle \langle\downarrow|\langle \uparrow|+|\downarrow\rangle| \uparrow\rangle \langle\uparrow|\langle\downarrow| \right]+\cos^2\theta |\downarrow\rangle| \uparrow\rangle \langle\downarrow|\langle \uparrow| ##

The reduced density operator of the first spin can easily be calculated to be ## \rho_1 =\sin^2\theta |\uparrow\rangle\langle\uparrow|+\cos^2\theta |\downarrow\rangle\langle\downarrow| ## (for the second spin, ## \uparrow ## and ## \downarrow ## are exchanged), and so the von Neumann entropy of either spin is equal to ## -\sin^2\theta \ln \sin^2\theta-\cos^2\theta \ln\cos^2\theta ##.
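To spell out that step, here is a sketch of the partial trace over the second spin, using ## \rho_a=\langle \uparrow^b |\rho|\uparrow^b\rangle+\langle \downarrow^b|\rho|\downarrow^b\rangle ## and the orthonormality of the basis. The label ## \rho_1 ## for the first spin is introduced here only for illustration.

## \langle \uparrow^b|\rho|\uparrow^b\rangle=\cos^2\theta\, |\downarrow\rangle\langle\downarrow|, \qquad \langle \downarrow^b|\rho|\downarrow^b\rangle=\sin^2\theta\, |\uparrow\rangle\langle\uparrow| ##

The two cross terms proportional to ## \sin\theta\cos\theta ## drop out, because for the second spin each of them contains ## |\downarrow\rangle\langle\uparrow| ## or ## |\uparrow\rangle\langle\downarrow| ##, and sandwiching these between the same basis vector always produces a factor ## \langle \uparrow|\downarrow\rangle=0 ## or ## \langle \downarrow|\uparrow\rangle=0 ##. Adding the two surviving pieces gives

## \rho_1=\sin^2\theta\, |\uparrow\rangle\langle\uparrow|+\cos^2\theta\, |\downarrow\rangle\langle\downarrow| ##

Because ## \rho_1 ## is diagonal in this basis, ## \ln\rho_1 ## is diagonal too, and the trace reduces to a sum over the diagonal entries:

## S_1=-Tr(\rho_1\ln\rho_1)=-\sin^2\theta \ln \sin^2\theta-\cos^2\theta \ln\cos^2\theta ##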

As you can see, the state given above is parameterized by ## \theta ##, so let’s see what the situation is for different values of the parameter. For ## \theta=0 ## and ## \theta=\frac \pi 2 ##, the state is, respectively, ## |\downarrow \rangle|\uparrow\rangle## and ## |\uparrow\rangle |\downarrow \rangle##. These are factored states, like classical states. For these states, complete knowledge about the whole system gives us complete knowledge about its parts. And you can see that for these states, the von Neumann entropy is equal to zero (using the convention ## 0\ln 0=0 ##). But for ## \theta=\frac \pi 4 ##, the state is ## \frac 1 {\sqrt 2} \left(|\uparrow\rangle |\downarrow \rangle+|\downarrow \rangle|\uparrow\rangle\right) ##. This state is clearly entangled and each of the two possibilities is equally likely. The von Neumann entropy for this state is equal to ## \ln 2 ##.
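These values can be reproduced with a few lines of Python. This is a minimal sketch that relies on the reduced density operator being diagonal with entries ## \sin^2\theta ## and ## \cos^2\theta ##; the function name `entanglement_entropy` is an assumption for illustration.

```python
import math


def entanglement_entropy(theta):
    """von Neumann entropy of either spin for sin(theta)|ud> + cos(theta)|du>."""
    probs = (math.sin(theta) ** 2, math.cos(theta) ** 2)
    # For a diagonal rho, S = -sum p ln p, with the convention 0 ln 0 = 0.
    return sum(-p * math.log(p) for p in probs if p > 0)


for theta in (0.0, math.pi / 8, math.pi / 4, math.pi / 2):
    print(theta, entanglement_entropy(theta))
```

The entropy vanishes at the two factored states and reaches its largest value, ## \ln 2 ##, at ## \theta=\frac \pi 4 ##; intermediate angles give intermediate amounts of entanglement.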

**Why is it an Entropy?**

Good question. As I explained, entanglement means that there is some information about the subsystems in the state of the whole system. So if someone is only allowed to examine a subsystem, they are going to miss some information and because that information is going to affect the subsystem at hand, it means that there is actually some information about the subsystem under study that is lost to the observer. The more the entanglement, the more the information the observer misses about the subsystem. So it seems that any measure of entanglement, also measures the amount of information about a subsystem that the observer is going to lose if they’re only allowed to examine that subsystem. This really feels like some kind of entropy!

Now that we have some understanding of the notion of Entanglement Entropy in QM, I’m going to delve into the same notion for a Quantum Field Theory in the next part. Coming soon!

Very nice. I think this is a really important topic, when it comes to understanding quantum mechanics. I'm beginning to think that entanglement entropy is not just analogous to the entropy that almost always increases in classical thermodynamics, but that they might in some sense be the same thing. That is, I wonder whether the second law of thermodynamics can be understood in terms of entanglement?

Isn't the author kind of suggesting a hidden variable theory? He says that if an observer only sees a part of the system he is losing information that is contained in the state of the whole system. (I hope I did not mess that up) Here are his own words

Let's look at a particular example, namely EPR. It's known ahead of time that Alice's particle and Bob's particle are correlated: If Alice measures spin-up along some axis [itex]\vec{a}[/itex], then Bob will definitely measure spin-down along that axis.

But if you try to give separate states for Alice's particle and Bob's particle, you would have to say:

But those two statements involve throwing away the information that Alice's and Bob's particles are correlated. Whether you consider a correlation to be a "hidden variable" or not is a matter of terminology, but it's definitely not a local hidden variable.

I think you could:

The second laws of quantum thermodynamics: https://arxiv.org/abs/1305.5278

Instead of just the von Neumann entropy, their result utilizes an infinite family of the Renyi entropies, which apparently are also entanglement measures. (Classically, the Renyi entropies are generalizations of the Shannon entropy and can be derived by postulating some reasonable axioms that the Shannon entropy satisfies.) There may be other approaches, but I know about this one because it is very much quantum-information-theoretic. I don't know it in detail, but if people are interested, we could dive into it (maybe in another thread).

Well, it's an appealing line of thinking and there are some similarities. But it's not that straightforward. The general belief is that entanglement entropy contains thermal entropy for a thermal state. And it's important to mention that entanglement entropy is not extensive! In the next part, I'll calculate entanglement entropy for a thermal state of a field theory and you'll see that although it's not extensive, its high temperature limit is!

Also people have explored laws about entanglement entropy that resemble the laws of thermodynamics. But because the methods used in QFT are too hard, people usually use holographic arguments, e.g. this paper.

My goal is to have a series that starts with Entanglement Entropy, then goes to Holography and then to Holographic methods to calculate the Entanglement Entropy of Quantum Field Theories. But I'm just a master's student working on this, so it's only going to be an introduction.

That's a very nice Insight article. The only thing I'd make more clear is the definition of the reduced operator. It's a partial trace. Using your notation, the Hilbert space of the total system is spanned by the Kronecker-product basis

$$|\Phi(n_a,n_b) \rangle=|\phi_{n_a}^a \rangle \otimes |\phi_{n_b}^b \rangle.$$

Then the reduced statistical operator for subsystem ##a## is given by tracing out subsystem ##b##, i.e.,

$$\hat{\rho}_a=\mathrm{Tr}_{b} \hat{\rho} := \sum_{n_a,n_a',n_b} |\Phi(n_a,n_b)\rangle \langle \Phi(n_a,n_b)|\hat{\rho}|\Phi(n_a',n_b) \rangle \langle \Phi(n_a',n_b)|.$$

I've gotten some feedback requesting a greater explanation of the math and terminology of the entropy section.

At the macroscopic level, one can identify entropy with a kind of degeneracy in the system. If there are multiple microstates consistent with the total energy and other conserved quantities, then S = k ln(number of degenerate microstates). What you seem to be saying about QM statistical mechanics is that when the one particle decays into 2, the process generates a 3rd state, the entangled one. Since the component states and the entangled one are degenerate (each is reachable from the others, with some probability), the entropy increases from the single particle state to the 2 produced particle states and still further when the entangled state is taken into account. Is this a reasonable interpretation of what you said?

I added some extra explanation.

No, it's not! I wasn't talking about classical or quantum statistical mechanics. It's just quantum mechanics.

In statistical mechanics, entropy is a measure of the information lost to us as a result of ignoring the exact state of the system. Entanglement Entropy is a measure of the amount of information contained in the state of the whole that is not contained in the "state" of each of the constituents. So it's a measure of the information lost as a result of not being able to examine the whole system. It doesn't have anything to do with the size of the system, whereas statistical entropy is about the size of the system, since we ignore the exact state of the system precisely because it's a lot of information. The information is there, we just ignore it! But in the case of entanglement entropy, we're actually measuring how much information the whole system can give us about a subsystem that the subsystem itself is fundamentally unable to give.