# Memory, Entropy and the Arrow of Time

1. Sep 29, 2014

Sean Carroll has stated several times that the reason we can remember the past and not the future is because entropy is increasing, i.e. because there is an arrow of time. Is this statement justifiable?

Remember that life and its processes, including memory, require negentropy. In other words, memory as a process involves a net decrease of entropy in the brain. While it is true that this decrease in entropy must be offset by an increase somewhere else (e.g. in the sun), I still find it quite misleading to claim that memory is possible only due to an increase in entropy. The important point is that memories are formed via a decrease in entropy within the brain rather than an increase.

What do you think? Did Sean Carroll not think this one through, or am I missing something?

2. Sep 29, 2014

### Staff: Mentor

No, it doesn't. The entropy increase involved in the chemical reactions that allow the brain to store memories more than compensates for the entropy decrease of the information storage. Similar arguments apply to, say, a computer storing information in its RAM, or writing information to a storage device.

3. Sep 29, 2014

### Chronos

I think there is frequent confusion about entropy. The short version is that entropy measures the unavailability of energy to do work, and doing work is any process that reduces the ability to do more work. Complexity requires work and so decreases the ability to do more work. Burning hydrogen produces water. Water does not burn, so you have decreased the ability of the hydrogen and oxygen present in the water to do more work. On the other hand, water molecules are more complex than either hydrogen or oxygen molecules. Does the relative complexity of a water molecule imply a decrease in entropy? Of course not. Similarly, does memory decrease entropy in the brain? The brain requires energy to create a memory, and it would be easy to argue a memory is merely the waste product of the memory process. The complexity of a memory no more reflects a decrease in entropy than that of a water molecule does.

4. Sep 30, 2014

PeterDonis - of course I wasn't claiming that entropy decreases globally. I stated that the entropy increase in the sun provides the excess needed to allow living processes to occur. If you are correct that memory storage involves a net increase in entropy from chemical reactions in the brain, it doesn't make any difference to what I was saying. The point is that the memory engram as an isolated system involves a decrease in entropy, which is offset by an increase elsewhere. This almost implies that we can remember the past but not the future despite the arrow of time, rather than because of it.

Chronos - surely a water molecule does have a lower entropy compared to individual atoms? It's just that it was created in a process which involved a net increase in entropy?

5. Sep 30, 2014

### Staff: Mentor

Yes, it does. The chemical reactions aren't taking place in the Sun; they're taking place in your brain, at the same place where the information is being stored. Entropy increase is local, not global.

No, this is not correct; storing the memory involves a pair of locally coupled processes (memory storage plus the chemical reactions that provide the energy) which together produce a net increase in entropy, locally.

6. Sep 30, 2014

Right, but the actual memory engram which allows you to remember something is a lower entropy state than the state without the engram. While it is true that there is a net increase in entropy involved in generating that engram, it is specifically the lower entropy of the engram which allows the memory to be recalled.

If you disagree with this, could you offer an explanation as to why an increase in entropy would allow someone to remember the past and not the future?

7. Sep 30, 2014

### Staff: Mentor

Is it? In order to call that isolated system a "memory" of anything, it must be correlated with other systems; more precisely, it must be correlated with the past states of other systems. For example, if I remember what I had for lunch yesterday, it's because some part of my current brain state is correlated with some part of yesterday's state (the part that specifies what I had for lunch then). The process of establishing that correlation increases entropy, because it's irreversible: in order to correlate some part of my brain state with what I had for lunch, whatever was previously stored in that part of my brain state has to be erased, so that information about it is lost.

(Note that what was previously "stored" might not have been a memory--it might have just been random data in a "memory cell" in my brain that hadn't yet been allocated, like free space on a hard drive. But that random data was still a *particular* piece of random data, different from all the other possible pieces of random data that could have occupied that memory cell, and the information about which particular piece of data was there is lost when the memory of what I had for lunch gets written into that cell.)

8. Sep 30, 2014

I don't want to derail into neuroscience in the cosmology forum, but while I think you're roughly correct that brains have a limited storage capacity and that new memories somehow erase existing information, I think this is a side issue. This is because we are considering the case where the brain is at the limits of its engram storage, meaning that it has had to go through many memory storage events, in each of which the entropy of the engrams decreased, until it was not possible to fit any more in. This is not the case while the brain is still developing, and, hypothetically, if we had bigger skulls and no energy constraints, we could go on forming new memories forever. It seems that in general the entropy of the engram should decrease when a memory is stored, and in the limit of full capacity it should tend to stay constant.

Again, to avoid derailing, can you offer me some intuition as to why someone might think that the second law of thermodynamics explains why we can remember the past and not the future, as Sean Carroll says?

9. Sep 30, 2014

### Staff: Mentor

That wasn't my point. Did you read the paragraph in parentheses at the end of my last post? Even if the memory is being stored in a storage unit that has not previously been used to store any memory (as in the cases you mention), the process of storing the memory still destroys information about the previous state of the storage unit. The fact that that previous state had no "meaning" as far as storing a memory is irrelevant; the storage unit still had a previous state, and information about what state that was is destroyed in the process of storing the memory.

Because the process of storing memories has to increase entropy, for the reasons I've given. That means our memories have an "arrow of time" built into them. We use the term "past" to denote the "tail" of the arrow, so to speak, and the term "future" to denote the "head" of the arrow, so we say we remember the past and not the future.

Last edited: Sep 30, 2014
10. Sep 30, 2014

"Even if the memory is being stored in a storage unit that has not previously been used to store any memory (as in the cases you mention), the process of storing the memory still destroys information about the previous state of the storage unit. The fact that that previous state had no "meaning" as far as storing a memory is irrelevant; the storage unit still had a previous state, and information about what state that was is destroyed in the process of storing the memory."

I think it's pretty clear that there is more order in the pattern of synaptic connections which encode a memory than the random pattern before the memory. Encoding a memory can be roughly thought of as embedding an energy landscape with an attractor in it into the pattern of connections in a Hopfield network http://en.wikipedia.org/wiki/Hopfield_network#Energy. There's less entropy in the network after an attractor has been embedded than before, when the connection weights were all random.
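To make this concrete, here is a minimal toy sketch (my own illustrative example, not from the linked page: the pattern, network size and update schedule are arbitrary choices) of embedding a single pattern in a Hopfield network via the Hebbian rule, then checking that a corrupted state falls back into the stored attractor by descending the energy landscape:

```python
# Toy Hopfield network: store one pattern with the Hebbian rule, then
# recall it from a corrupted starting state by descending the energy.

def train(patterns, n):
    # Hebbian weights: w[i][j] = average over patterns of x_i * x_j, zero diagonal
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w

def energy(w, s):
    # Hopfield energy E = -1/2 * sum_ij w_ij s_i s_j
    return -0.5 * sum(w[i][j] * s[i] * s[j]
                      for i in range(len(s)) for j in range(len(s)))

def recall(w, s, sweeps=20):
    # Asynchronous updates: flip each unit to align with its local field
    s = list(s)
    for _ in range(sweeps):
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if field >= 0 else -1
    return s

pattern = [1, -1, 1, -1, 1, -1]     # the stored "memory"
w = train([pattern], len(pattern))
noisy = [1, -1, 1, -1, 1, 1]        # one unit corrupted
recalled = recall(w, noisy)
print(recalled == pattern)                       # True: fell into the attractor
print(energy(w, recalled) < energy(w, noisy))    # True: energy decreased
```

With more stored patterns the same recall procedure keeps working until the network approaches its capacity (roughly 0.14 N patterns for Hebbian learning).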

"Because the process of storing memories has to increase entropy, for the reasons I've given. That means our memories have an "arrow of time" built into them. We use the term "past" to denote the "tail" of the arrow, so to speak, and the term "future" to denote the "head" of the arrow, so we say we remember the past and not the future."

Again, the network with a memory is a lower entropy configuration than the network without the memory, so that the opposite of what you're saying should be true.

11. Sep 30, 2014

### Staff: Mentor

The page you linked to doesn't say that. It says there is less energy in the network after an attractor has been embedded than before. Less energy does not mean less entropy. If it did, a stone rolling downhill and stopping at the bottom due to friction would violate the second law.

12. Oct 1, 2014

### Chronos

You still have the whole entropy thing bassakwards, madness. I tried to explain it in terms of energy, but, that is obviously beyond your grasp.

13. Oct 1, 2014

### Torbjorn_L

Q: Is Carroll justified in claiming that "the reason we can remember the past and not the future is because entropy is increasing, i.e. because there is an arrow of time"?

A: I haven't checked whether that is what Carroll says, but the first part is correct. The second part is unconstrained, because there can be several arrows of time. But there is a global cosmological arrow of time, set up by the expansion of the universe, that allows entropy to increase. Note that we can't take it further than observing that the increase is allowed; the entropy of the universe isn't well defined.

Biological processes align with that:

"The change in entropy as a function of time of the bounded system is thus due to two contributions: entropy carried by the flow of material and/or energy across the system's boundary (an incremental amount of which is conventionally labeled $d_eS$), and the changes in entropy of the material/energy within the bounded system due to the irreversible processes taking place within it (labeled $d_iS$). That is, as is drawn in Fig. 1, the total incremental change in the entropy of the system is:

$$dS = d_eS + d_iS$$

where $d_eS$ and $dS$ can be of either sign but by the 2nd law we must have $d_iS \geq 0$ (explanation: $d_eS$ has no effect on the entropy of the universe since it is just due to moving energy/material from one place to another; therefore it is only the irreversible processes taking place within the system that affect the entropy of the universe; that is: $dS_{universe} = d_iS$. But by the 2nd law, $dS_{universe}$ must be non-negative) [1] and [6]."

[ http://www.sciencedirect.com/science/article/pii/S0005272812010420 ; "Turnstiles and bifurcators: The disequilibrium converting engines that put metabolism on the road", Elbert Branscomb & Michael J. Russell, Biochimica et Biophysica Acta (BBA) - Bioenergetics, 2013]

More specifically, we have both exergonic (entropy-increasing) and endergonic (entropy-decreasing) processes; the condition is that the sum is exergonic:

"Although multiplying the quantities ΔeS and ΔiS by temperature—thus clothing them in the units of energy—recovers the classical Gibb's free energy equation and mollifies both history and convention, it arguably obscures the physics. In particular, the above discussion makes it clear that in the Gibbs relationship there are not three different types of physical quantities: free energy, enthalpy, and entropy; but just one: entropy, and the relationship is, in physical content, ‘really’ just the simple entropy budget equation of NET given above. Note however that the NET [Non-Equilibrium Thermodynamics] entropy budget relationship is more general in two fundamental respects; in applying to open systems (where both energy and matter can flow between the system and its environment), and in applying to ongoing processes taking place at finite velocities, not just to difference between the (equilibrium) end states of those processes (or to processes that are obliged to proceed “quasi statically”)."

I put in bold the description of free energy as it applies to NET, as I see there is some confusion in the thread.

Metabolism is of course both catabolic (converting free energy) and anabolic (building biochemicals), which allows the processes we associate with memory: nerve impulses, synaptic and receptor action including hormones, and the growth and pruning of synapses and nerve cells.

I have no idea what an "engram" is supposed to be; it doesn't sound like a biological description. But I note that memory as a brain function is likely very dependent on plasticity, including pruning the system down in size. Any idea of a static "recording" of memory is wrong; biological systems are dynamical. If clusters of nerve cells work anything like the way they cluster with their sister cells, muscle cells (a common cell-lineage descendant), in the nerve cord and skeletal muscle systems, they use pattern generation to identify and play out actions. (And I believe that is what neuroscientists research.)

The young brain grows to a maximum size, and the number of synapses goes down by a factor of 10 during adult life. I mention this for those who may erroneously believe that memory must mean the system grows more complex (no, see above) or that complexity is a simple function of entropy (no, increasing entropy in sufficiently constrained systems can confer order).

Last edited: Oct 1, 2014
14. Oct 1, 2014

"The page you linked to doesn't say that. It says there is less energy in the network after an attractor has been embedded than before. Less energy does not mean less entropy. If it did, a stone rolling downhill and stopping at the bottom due to friction would violate the second law."

I wasn't referring to the energy. The Hopfield network before the memory is embedded has random connections, and after the memory is embedded it has a specific pattern of connections which generate attractor dynamics within the network. This is what constitutes a decrease in entropy.

If you still don't believe me, read this paper titled "Self-organization and entropy decreasing in neural networks" http://ptp.oxfordjournals.org/content/92/5/927.full.pdf and this paper, titled "Pattern recognition minimizes entropy production in a neural network of electrical oscillators" http://www.sciencedirect.com/science/article/pii/S0375960113007305.

"You still have the whole entropy thing bassakwards, madness. I tried to explain it in terms of energy, but, that is obviously beyond your grasp."

Lol. Glad to see you're taking the moral high ground here Chronos ;).

"I have no idea what an "engram " is supposed to be, it doesn't sound like a biological description."

Then why not do a google search? http://en.wikipedia.org/wiki/Engram_(neuropsychology) http://www.sciencemag.org/content/341/6144/387.

"Any idea of a static "recording" of memory is wrong, biological systems are dynamical."

Come on, this is simply not true. It was proven long ago by Eric Kandel that memories are stored in the structural connections between cells rather than the ongoing reverberatory dynamics.

"The young brain grows to a maximum size and the number of synapses goes down a factor 10 during adult life."

This is actually a perfect example of entropy decreasing and complexity increasing. In the young brain, all cells are basically connected to all other cells. With learning, cells which do not participate in synchronous patterns of activity have their connections removed until a highly specific set of connections remains.

Last edited: Oct 1, 2014
15. Oct 1, 2014

### Staff: Mentor

But the page you linked to was; it didn't say anything about entropy decrease.

It looks like entropy here is being defined in the information theoretic sense, i.e., the entropy of a state with probability $P$ is $- P \ln P$, and you just sum over all the possible states to get the total entropy. There is a lot of contention about whether, and under what conditions, this definition of entropy correlates with the usual thermodynamic definition. To the extent the two definitions don't correlate, we may be talking past each other, since you appear to be talking about information theoretic entropy and I am talking about thermodynamic entropy.

One of the key reasons why the two senses of entropy may not correlate is that, if you consider a particular information system in isolation, its information theoretic entropy can indeed decrease, because, for example, if you take a memory cell whose state you don't know, and put it in a state you do know (for example by storing a zero bit there), the information theoretic entropy of the memory cell after the operation is zero, where before the operation it was some positive number (for a single bit it would be $\ln 2$, because you have two states each with probability 1/2). However, thermodynamically speaking, writing a value into a memory cell increases entropy, because you destroy the information about the cell's previous state--for example, for a single bit, there are two possible "before" states, but only one possible "after" state (because you forced the bit to a known state), so the time evolution is irreversible and thermodynamic entropy increases.
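The two bookkeepings can be made concrete with a throwaway one-bit calculation (my own toy numbers, measured in nats):

```python
import math

def shannon_entropy(probs):
    # H = -sum p ln p, in nats; terms with p == 0 contribute nothing
    return -sum(p * math.log(p) for p in probs if p > 0)

# Before the write: the cell's bit is unknown, p = 1/2 for each state.
before = shannon_entropy([0.5, 0.5])   # ln 2 ≈ 0.693 nats

# After storing a known 0: the cell's state is certain.
after = shannon_entropy([1.0])         # 0 nats

print(before, after)
# The cell's information entropy dropped by ln 2 -- but by Landauer's
# principle, irreversibly erasing the unknown bit must dissipate at least
# k_B * T * ln 2 of heat, so thermodynamic entropy elsewhere rises by >= ln 2.
```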

The paper is behind a paywall so I can only read the abstract; it talks about minimizing "entropy production", which doesn't sound to me like decreasing entropy, just minimizing the increase in entropy. But I can't tell for sure since I can't read the actual paper.

16. Oct 1, 2014

"It looks like entropy here is being defined in the information theoretic sense, i.e., the entropy of a state with probability P is −PlnP, and you just sum over all the possible states to get the total entropy. There is a lot of contention about whether, and under what conditions, this definition of entropy correlates with the usual thermodynamic definition. To the extent the two definitions don't correlate, we may be talking past each other, since you appear to be talking about information theoretic entropy and I am talking about thermodynamic entropy."

Given that a Hopfield network is equivalent to an Ising model, and that the energy and entropy are exactly the same in these two models, the definition of entropy in that paper is obviously the same as the usual Gibbs entropy for an Ising model. Moreover, Shannon entropy and Gibbs entropy are completely equivalent when working with the probability distribution of states in a system, as we are here (http://en.wikipedia.org/wiki/Entrop...d_information_theory#Theoretical_relationship).

"One of the key reasons why the two senses of entropy may not correlate is that, if you consider a particular information system in isolation, its information theoretic entropy can indeed decrease, because, for example, if you take a memory cell whose state you don't know, and put it in a state you do know (for example by storing a zero bit there), the information theoretic entropy of the memory cell after the operation is zero, where before the operation it was some positive number (for a single bit it would be ln2, because you have two states each with probability 1/2). However, thermodynamically speaking, writing a value into a memory cell increases entropy, because you destroy the information about the cell's previous state--for example, for a single bit, there are two possible "before" states, but only one possible "after" state (because you forced the bit to a known state), so the time evolution is irreversible and thermodynamic entropy increases."

The information theoretic and thermodynamic versions of entropy are literally identical in this case, other than the Boltzmann constant and the base of the logarithm. There is absolutely no difference. I think you are getting confused because you are not considering that the entropy in the paper is determined by the probability that the network is in a particular activation pattern at a given time, and has nothing to do with overwriting or losing information. Networks without stored memories will randomly visit all possible patterns, having a more or less uniform probability distribution over the activation patterns and therefore high entropy, whereas networks with stored memories will converge to attractor states, so that there is a high probability of only a few patterns and therefore a low entropy.
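As an illustration of this point (a made-up toy distribution, not taken from the papers), compare the entropy of a uniform distribution over activation patterns with one concentrated on an attractor:

```python
import math

def entropy(probs):
    # H = -sum p ln p over the distribution of activation patterns
    return -sum(p * math.log(p) for p in probs if p > 0)

n_patterns = 16   # e.g. all activation patterns of a 4-unit network

# Network without stored memories: wanders uniformly over all patterns.
uniform = [1.0 / n_patterns] * n_patterns

# Network with one stored memory: spends most of its time in one attractor.
peaked = [0.9] + [0.1 / (n_patterns - 1)] * (n_patterns - 1)

print(entropy(uniform))   # ln 16 ≈ 2.77 nats
print(entropy(peaked))    # much lower
```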

Last edited: Oct 1, 2014
17. Oct 1, 2014

### Staff: Mentor

Hm. Maybe I've misstated the way in which we are talking past each other. Let me try again with a simple one-bit model.

Suppose we have a one-bit memory storage cell and we don't know what state it's in. The entropy is $\ln 2$. Now we store a 0 bit in the cell. The entropy of the cell is now 0.

But we left out something in the above: how did the 0 bit get stored? In the absence of any external interaction, the cell's state will never change (we're idealizing it as perfectly stationary, whereas of course real memory cells are not, but if it's going to work as a memory cell we want it to keep the same value once we store one), and its entropy will never change either. So we had to interact with the cell somehow in order to force it into the 0 bit state.

That means that, if we have the ability to change the cell's state, we can't consider it as an isolated system; we have to take the interaction that changes the state into account in any correct analysis of how entropy changes with the change in state. For example, suppose we have a simple "bit swap" interaction: we take a cell that stores a known 0 bit and move it close to the memory cell we want to store a 0 bit to. The interaction between them swaps the two bits: the known 0 bit goes to the memory cell, and the unknown bit in the memory cell goes to our cell. In this interaction, the total entropy change is zero: we've just swapped $\ln 2$ entropy from one cell to the other. Note that this particular interaction is reversible (we can just swap the bits back again), which is why it has zero entropy change.

But this really just pushes the problem back a step: how did we get a known 0 bit into the other cell in the first place? Sooner or later, as you trace the chain of bits back, you are going to come to a point where an irreversible interaction happened: a known 0 bit got into some memory cell at the expense of more than $\ln 2$ entropy increase in whatever system interacted with that cell to store the 0 bit in it--in other words, the total interaction at that point increased entropy instead of keeping it constant.

Or we can look at it another way. The "bit swap" interaction has a drawback, if we're trying to view it as "storing a memory": it doesn't correlate the 0 bit we stored with anything that it wasn't correlated with already. It just swaps the bit from one memory cell to another. But if that 0 bit is supposed to be a "memory" of something, then at some point some bit has to get set to 0 as part of an interaction that correlates it with something else that it wasn't correlated with before.

For example, suppose we want to store a 0 bit because some pixel in a video frame is 0 (this is a very crude video frame with only one bit per pixel). The interaction can't change the pixel itself, because it's being used for other things (like being seen). So we have to take a memory cell with an unknown bit value (and therefore a bit value that is uncorrelated with the pixel) and turn it into a memory cell with a 0 bit stored (and therefore perfect correlation with the pixel). What sort of interaction could do that? Well, we could have a store of known bits, some 0, some 1, and then we could measure the pixel bit and pick which known bit to swap into the memory cell based on whether the pixel bit was 0 or 1. The bit swap part is fine--no entropy increase there, as we saw above. But what about the measuring and picking part? It doesn't seem to me that there's any way to do that without at some point having an entropy increase.
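A toy way to see the reversible/irreversible distinction in the above (my own illustrative example, not a physical simulation): represent the joint (memory bit, reservoir bit) state, and compare the bit-swap map with a map that forces the memory bit to zero:

```python
# A bit swap is a permutation of joint states (invertible, entropy-preserving),
# while forcing a bit to 0 is many-to-one (two "before" states, one "after").

def swap(state):
    # (memory_bit, reservoir_bit) -> (reservoir_bit, memory_bit)
    m, r = state
    return (r, m)

def force_zero(state):
    # overwrite the memory bit with 0; reservoir bit untouched
    m, r = state
    return (0, r)

joint_states = [(m, r) for m in (0, 1) for r in (0, 1)]

# Swap is reversible: no two inputs merge, and swap(swap(s)) == s.
print(len({swap(s) for s in joint_states}))           # 4: no states merged
print(all(swap(swap(s)) == s for s in joint_states))  # True

# Forcing to zero merges states: information about the old bit is destroyed,
# which is why it must be paid for with an entropy increase elsewhere.
print(len({force_zero(s) for s in joint_states}))     # 2: states merged
```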

Your networks are just more complicated versions of the same thing. You say the network with memories stored has a non-random pattern of connections that creates attractors in its state space: but how did the connections get changed to that nonrandom pattern? By some interaction with something external to the network. You have to count that interaction when you're figuring the entropy change.

In short, when you include the interactions of any memory storage device, the interactions that are necessary in order for it to actually store memories, you find that there always has to be some entropy increase involved in storing the memories. You can only ignore this by artificially looking only at the storage device itself as an isolated system, even though it's impossible for an isolated storage device to change state, and therefore it's impossible for an isolated storage device to store memories.

18. Oct 2, 2014

"In short, when you include the interactions of any memory storage device, the interactions that are necessary in order for it to actually store memories, you find that there always has to be some entropy increase involved in storing the memories. You can only ignore this by artificially looking only at the storage device itself as an isolated system, even though it's impossible for an isolated storage device to change state, and therefore it's impossible for an isolated storage device to store memories."

Great, so now we've agreed that what I said in my second post of the thread is true.

Who cares if it's impossible for an isolated device to store memories? As I said at the beginning of the thread, the entropy decrease within the memory device is offset by an increase elsewhere, but the crucial fact is that it is only possible to store memories by decreasing entropy, against the global arrow of time. It is therefore nonsense to say that it is this arrow of time that allows us to remember the past and not the future - it's the fact that our brains are able to decrease the entropy of certain circuits that allows us to remember the past.

19. Oct 2, 2014

### Staff: Mentor

No, not "elsewhere". Both state changes--memory device and "elsewhere"--are intrinsically part of the same interaction. There's no way to separate them.

No. It's only possible to store memories through an interaction that increases entropy. If the entire universe were already in a state of maximum entropy, there would be no way to store memories. That's why the global arrow of time has to be there, and why the "past" direction of time--the direction in which memories "point"--has to be the "tail" direction of the global arrow of time.

20. Oct 2, 2014

### Staff: Mentor

Let me try to restate this in a way that might be a little less contentious.

In order to store memories, there has to be a store of negentropy in the universe. So the OP is correct on that point.

The process of storing memories can be viewed as transferring negentropy from somewhere else into the memory storage cell. However, this transfer is never 100% efficient; the interaction that transfers the negentropy always expends some in the process, so some of the negentropy that was taken from the universal store does not make it into the memory cell; it gets wasted.

The above has two implications: (1) memories can't be stored if there is no negentropy left in the universe; (2) there is less negentropy in the universe after a given memory is stored than before. These facts are what link the thermodynamic arrow of time with the direction of memories.