As this is in the math section, forgive my ignorance if I misunderstand the OP's motives for the question, but here is my understanding of this from the perspective of the foundations of physical law.
The second law and the principle of least action are both, in a way, special cases of the same underlying idea: to maximize the a priori probability, whether a probability of a state or a transition probability. The only thing making "entropy" interesting is that it is a LOG measure of probability, meaning that multiplicative constructions become additive. Other than this, I see no magic in "entropy" vs just "a priori probability".
atyy said:
The thinking in physics is that entropy increase has little to do with quantum mechanics. Instead it is due to limitations on what we are able to observe.
...
The intuition is provided
by Villani (slide #79): "Information still present, but not observable (goes away in very fast velocity oscillations)". What he means is that if we can only measure things up to a certain frequency, then although information is not truly lost, for the observer with finite observation resolution, information is irreversibly lost, since the information is carried to higher and higher frequencies.
Reasoning along these lines is IMO fruitful, can be generalized a lot, and can probably play a part in the reconstruction of QM required for progress.
I would like to highlight this:
If you can make the observer side (whether you call it "background" or matter) reactive, then it should be crystal clear how the mechanism of information being lost to an "observer with finite resolution" MUST stabilize the chaos, and probably EXPLAIN the emergence of stable rules, in spite of the "in principle" lack of detailed knowledge. I see this as one KEY to TAMING the madness you get when removing the classical background.
Ultimately, all P-measures, and consequently entropies, are fundamentally attached to an observer. And it is precisely in this that the explanatory power of a reconstruction lies.
So what I would suggest is to forget about the old Shannon entropy, at least in these contexts.
An IMO simpler derivation is to consider the multinomial distribution. Ponder the "probability" of drawing a given finite sequence of dice throws (approximate a discretized frequency distribution from it, and consider this a "probability" p) for a future sequence, based on a prior. Then what pops out of log P(p|p_prior) is not the Shannon entropy but the relative (Kullback-Leibler) entropy, i.e. log P = a * S_KL + b. But if you assume an equiprobable prior, being independent of the outcomes, then that part can be absorbed into the constant term, and instead the Shannon entropy pops out. The coefficient a seems to be the number of samples defining the discretized version of the drawn p.
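For the curious, here is a minimal numerical sketch of that claim (my own toy illustration, not from the thread; the counts and the fair-die prior q are arbitrary choices): the exact multinomial log-probability of an observed frequency vector p matches -N * D_KL(p||q), up to Stirling corrections of order log N.

```python
import numpy as np
from scipy.stats import multinomial
from scipy.special import rel_entr

N = 10_000
q = np.full(6, 1/6)                                    # prior: a fair die
counts = np.array([3000, 2500, 1500, 1000, 1000, 1000])
p = counts / N                                         # discretized "p"

exact = multinomial.logpmf(counts, n=N, p=q)           # exact log P
leading = -N * rel_entr(p, q).sum()                    # a * S_KL with a = N

print(exact, leading)  # differ only by the O(log N) correction terms
```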
Interestingly, if one expands the Kullback-Leibler entropy differentially (to second order in the perturbation), one gets the Fisher information Riemannian metric.
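Again a quick toy check (my own illustration; p and the perturbation dp are arbitrary, with dp summing to zero so p + dp stays normalized): to second order, D_KL(p || p + dp) equals (1/2) * sum(dp_i^2 / p_i), the quadratic form of the Fisher information metric on the probability simplex.

```python
import numpy as np
from scipy.special import rel_entr

p = np.array([0.2, 0.3, 0.5])
dp = np.array([1e-3, -4e-4, -6e-4])   # small perturbation, sums to zero

kl = rel_entr(p, p + dp).sum()        # exact KL divergence
fisher = 0.5 * (dp**2 / p).sum()      # Fisher metric quadratic form

print(kl, fisher)  # agree to leading (second) order in dp
```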
So the principle of least action can be understood as a principle of minimum information gain.
The interesting thing, though, is whether one can couple this to a reconstruction of QM.
I noted an Einstein quote from the Cohen paper atyy posted:
"Usually W equals the number of complexions. In order to compute W [however] one needs a complete (molecular-mechanical) theory of the system. Therefore it is dubious that the Boltzmann principle has any meaning without a complete molecular-mechanical theory or some other theory which describes the elementary [dynamical] processes [of the system]. In equilibrium, the expression S = k log W + c seems [therefore] devoid of [any] content from a phenomenological point of view, without giving in addition such an elementary theory."
Indeed, in the light of what I tried to outline here, this translates into the need for a description of the microstructure of the observer side. This is part of the task of reconstructing QM. This is also why it is instructive to take the explicit example of the multinomial distribution, where one can define states as a sequence of dice draws, binned up as a relative frequency distribution; then the derivation also becomes explicit. It is possibly also the "simplest of simple cases" (the reason why I looked at it), as one can consider toy models for Markov processes with any memory you want.

I think in the reconstruction of QM we need this: an observer may in principle have "infinite memory", but for the same reasons as above, an observer with limited capacity will necessarily lose some information and be required to recode. So, in an extremely speculative outlook, any given observer possibly corresponds to something like a Markov chain whose memory depends both on its choice of internal structure (i.e. coding) AND its physical mass, assuming the mass constrains what is possible. Then one would need to model "interacting Markov chains" whose memory and internal state machines are in constant evolution.
A computer simulation of this would probably amount to having two algorithms "interact" and looking for what they negotiate upon; a toy sketch follows below. This is quite different from an initial value problem with boundary constraints subject to differential equations.
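To make the speculation concrete, here is a minimal sketch of what such a simulation could look like (entirely my own construction under the assumptions above; the symbol alphabet, memory sizes and Laplace-smoothed sampling are arbitrary choices). Two agents with bounded memory windows each observe only the other's output, so anything beyond their capacity is irreversibly lost to that observer, and one can inspect the empirical distributions they settle on, i.e. what the two algorithms "negotiated upon".

```python
import random
from collections import Counter, deque

SYMBOLS = (0, 1, 2)

class Agent:
    def __init__(self, memory):
        self.window = deque(maxlen=memory)   # finite capacity: old data is dropped

    def observe(self, symbol):
        self.window.append(symbol)

    def emit(self):
        counts = Counter(self.window)        # discretized frequency model
        weights = [counts[s] + 1 for s in SYMBOLS]   # +1: Laplace smoothing
        return random.choices(SYMBOLS, weights=weights)[0]

a, b = Agent(memory=50), Agent(memory=500)   # asymmetric capacities
for _ in range(10_000):
    sa, sb = a.emit(), b.emit()
    a.observe(sb)                            # each agent sees only the other
    b.observe(sa)

print(Counter(a.window), Counter(b.window))  # the "negotiated" statistics
```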
If the objective was proofs of certain mathematical theorems, then ignore this.
/Fredrik