# Derivation of Statistical Mechanics

#### Nullstein

Moderator's note: Spin-off from previous thread due to topic change.

OK, so why do you not agree that Valentini's version fixes this?
Because it doesn't work. Bohmian time evolution doesn't involve the coarse graining steps that are used in his calculation. A delta distribution remains a delta distribution at all times and does not decay into ##|\Psi|^2##.

Because it doesn't work. Bohmian time evolution doesn't involve the coarse graining steps that are used in his calculation. A delta distribution remains a delta distribution at all times and does not decay into ##|\Psi|^2##.
If your argument is correct, then an analogous argument should apply to classical statistical mechanics: The Hamiltonian evolution doesn't involve the coarse graining steps that are used in the Boltzmann H-theorem. A delta distribution in phase space remains a delta distribution at all times and does not decay into a thermal equilibrium. Would you then conclude that thermal equilibrium in classical statistical mechanics also requires fine tuning?

If your argument is correct, then an analogous argument should apply to classical statistical mechanics: The Hamiltonian evolution doesn't involve the coarse graining steps that are used in the Boltzmann H-theorem. A delta distribution in phase space remains a delta distribution at all times and does not decay into a thermal equilibrium. Would you then conclude that thermal equilibrium in classical statistical mechanics also requires fine tuning?
No, because this is not Boltzmann's H-theorem. Valentini appeals to Gibbs' H-theorem, which is well-understood not to work without a physical mechanism for the coarse-graining.

No, because this is not Boltzmann's H-theorem. Valentini appeals to Gibbs' H-theorem, which is well-understood not to work without a physical mechanism for the coarse-graining.
By Gibbs H-theorem, I guess you mean H-theorem based on Gibbs entropy, rather than Boltzmann entropy. But I didn't know that Gibbs H-theorem in classical statistical mechanics does not work for the reasons you indicated. Can you give a reference?

By Gibbs H-theorem, I guess you mean H-theorem based on Gibbs entropy, rather than Boltzmann entropy.
No, they are completely different derivations. Boltzmann's H-theorem is based on the Stosszahlansatz. Gibbs' H-theorem is based on coarse graining in phase space.
But I never heard before that Gibbs H-theorem in classical statistical mechanics does not work for the reasons you indicated. Can you give a reference?
The theorem works in classical statistical physics if you can supply a reason why this coarse graining happens, i.e. there must be a physical process. The Liouville equation alone is reversible and therefore can't increase entropy. Similarly, the Bohmian equations of motion are reversible and can't increase entropy.
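The reversibility claim can be checked in a few lines. A minimal numerical sketch (my own toy example, assuming a unit-mass harmonic oscillator, whose Hamiltonian flow is an exact rotation in phase space): a "delta distribution", i.e. a single exact phase-space point, evolves into another single point, and the phase-space volume of a small cloud of states is exactly preserved.

```python
import numpy as np

def evolve(q, p, t, omega=1.0):
    """Exact Hamiltonian flow of a unit-mass harmonic oscillator:
    a rigid rotation in phase space, i.e. pure Liouville evolution."""
    c, s = np.cos(omega * t), np.sin(omega * t)
    return c * q + (s / omega) * p, -omega * s * q + c * p

def triangle_area(pts):
    """Shoelace area of a triangle of phase-space points."""
    (x1, y1), (x2, y2), (x3, y3) = pts
    return 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))

# A "delta distribution" is one exact phase-space point; the flow maps it
# to another exact point -- no spreading, no approach to equilibrium.
q1, p1 = evolve(1.0, 0.0, t=7.3)

# A small cloud of states: reversible evolution preserves its volume.
cloud = [(1.0, 0.0), (1.1, 0.0), (1.0, 0.1)]
evolved = [evolve(qi, pi, t=7.3) for qi, pi in cloud]
area_before = triangle_area(cloud)   # 0.005
area_after = triangle_area(evolved)  # the same, up to float rounding
```

The same volume preservation holds for any Hamiltonian flow (Liouville's theorem); the oscillator just makes it exactly computable.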

Here's what Wikipedia says about it:
As to why this blurring should occur in reality, there are a variety of suggested mechanisms. For example, one suggested mechanism is that the phase space is coarse-grained for some reason (analogous to the pixelization in the simulation of phase space shown in the figure). For any required finite degree of fineness the ensemble becomes "sensibly uniform" after a finite time. Or, if the system experiences a tiny uncontrolled interaction with its environment, the sharp coherence of the ensemble will be lost. Edwin Thompson Jaynes argued that the blurring is subjective in nature, simply corresponding to a loss of knowledge about the state of the system. In any case, however it occurs, the Gibbs entropy increase is irreversible provided the blurring cannot be reversed.

No, they are completely different derivations. Boltzmann's H-theorem is based on the Stosszahlansatz. Gibbs' H-theorem is based on coarse graining in phase space.
I see, thanks!

The theorem works in classical statistical physics if you can supply a reason why this coarse graining happens, i.e. there must be a physical process. The Liouville equation alone is reversible and therefore can't increase entropy. Similarly, the Bohmian equations of motion are reversible and can't increase entropy.
It seems that we disagree on what coarse graining is, so let me explain how I see it, which indeed agrees with all textbooks I am aware of, as well as with the view of Jaynes mentioned in your Wikipedia quote. Coarse graining is not something that happens; it's not a process. It's just a fact that, in practice, even in classical physics we cannot measure position and momentum with perfect precision. The coarse graining is the reason why we use statistical physics in the first place. Hence there is always some practical uncertainty in the phase space. The picture shows how an initial small uncertainty evolves into an uncertainty that, upon coarse graining, looks like a larger uncertainty.
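The stretching-and-folding picture can be mimicked with a toy model. A minimal sketch (the baker's map is my own stand-in for mixing dynamics, not the specific figure discussed in the thread): points drawn from a tiny initial cell, evolved by an area-preserving chaotic map, look uniform once binned into coarse cells, even though the fine-grained set still occupies only 1% of the square.

```python
import numpy as np

rng = np.random.default_rng(0)

def baker(x, y):
    """One step of the baker's map: an area-preserving chaotic map on the
    unit square, a standard toy model of stretching-and-folding dynamics."""
    x_new = (2 * x) % 1.0
    y_new = np.where(x < 0.5, y / 2, (y + 1) / 2)
    return x_new, y_new

# Fine-grained state: many sample points from a tiny initial cell.
n = 20_000
x = rng.uniform(0.0, 0.1, n)
y = rng.uniform(0.0, 0.1, n)
for _ in range(12):
    x, y = baker(x, y)

# Coarse graining: only record which of 4x4 cells each point falls in.
hist, _, _ = np.histogram2d(x, y, bins=4, range=[[0, 1], [0, 1]])
fractions = hist / hist.sum()
# The coarse-grained distribution is now nearly uniform (each cell ~ 1/16),
# even though the fine-grained set still occupies only area 0.01.
```

Nothing here modifies the dynamics: the binning happens only in the description, which is exactly the point under dispute.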

It seems that we disagree on what coarse graining is, so let me explain how I see it, which indeed agrees with all textbooks I am aware of.
What you describe here is not coarse graining. It's just ergodicity. Gibbs' H-theorem requires a specific way of coarse graining given by an irreversible physical process. If the system is fully governed by Liouville's equation or Bohmian equations of motion, then Gibbs' H-theorem is just inapplicable. Ergodic theory works just fine for deriving statistical mechanics, but this is not what Valentini is using. Valentini tries to argue like in Gibbs' H-theorem, so he must explain how the coarse graining is realized as a physical process.
Coarse graining is not something that happens, it's not a process. It's just a fact that, in practice, even in classical physics we cannot measure position and momentum with perfect precision.
But Gibbs' H-theorem requires coarse graining to happen as a physical process. Otherwise, the theorem does not help explain why systems are driven to equilibrium.
The coarse graining is the reason why we use statistical physics in the first place. Hence there is always some practical uncertainty in the phase space, which in sufficiently complex systems typically evolves as shown in this picture.
What you are presenting is not coarse graining. This is the argument based on ergodic theory and it works fine. But this is not what is required in Gibbs' H-theorem, which specifically requires an actual coarse-graining, a deviation from reversible equations of motion (like Liouville's equation or Bohmian EOMs).

What you are presenting is not coarse graining. This is the argument based on ergodic theory and it works fine.
No, that's coarse graining, not ergodicity. Ergodicity involves an average over a long period of time, while the fourth picture shows filling the whole phase-space volume at one time. And it fills the whole space only in the coarse grained sense.

No, that's coarse graining, not ergodicity. Ergodicity involves an average over a long period of time, while the fourth picture shows filling the whole phase-space volume at one time. And it fills the whole space only in the coarse grained sense.
There is no coarse graining in your picture. It just shows the trajectory eventually filling the whole phase space. Ergodicity says that ensemble averages are equal to time averages, so if the trajectory fills the whole phase space, then using ensemble averages is justified by ergodicity. That's a standard argument, but irrelevant to Gibbs' H-theorem. Gibbs' H-theorem requires coarse graining as a physical process, not just as a limit on our measurement precision or anything like you mentioned.
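For concreteness, the ergodicity statement (time averages equal ensemble averages) can be sketched in its simplest form. This is my own toy choice, not an example from the thread: an irrational rotation of the circle, a standard ergodic system.

```python
import math

# Time average along ONE trajectory vs. the ensemble (phase-space) average.
# Toy ergodic system: rotation of the circle x -> x + alpha (mod 1),
# which is ergodic whenever alpha is irrational.
alpha = math.sqrt(2) - 1      # an irrational rotation number
x = 0.1                       # arbitrary initial condition
steps = 200_000
total = 0.0
for _ in range(steps):
    total += math.sin(2 * math.pi * x)   # observable f(x) = sin(2*pi*x)
    x = (x + alpha) % 1.0
time_average = total / steps

ensemble_average = 0.0   # integral of sin(2*pi*x) over one period
```

The time average converges to the ensemble average with no coarse graining anywhere, which illustrates why the two arguments are logically distinct.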

Except that it doesn't.
https://www.jstor.org/stable/1215826?seq=1#page_scan_tab_contents
https://arxiv.org/abs/1103.4003
https://arxiv.org/pdf/cond-mat/0105242v1.pdf
If a system is ergodic, then the ergodic theorem justifies the ensemble averages. Not all systems are ergodic, and when they are, it is often hard to prove, but it's a valid route to statistical equilibrium. Anyway, if you don't like the ergodic argument, why do you bring it up then? I'm only telling you that you argue based on ergodicity here.

It just shows the trajectory eventually filling the whole phase space.
No it doesn't. The final pink region has the same area as the initial one, it covers only a fraction of the whole gray disc. It's only if you look at the picture with blurring glasses (which is coarse graining) that it looks as if the whole disc is covered.

Anyway, if you don't like the ergodic argument, why do you bring it up then? I'm only telling you that you argue based on ergodicity here.
You mentioned ergodicity first. We obviously disagree on the foundations of classical statistical mechanics, even on the meaning of words such as "ergodicity" and "coarse graining", so I think it's important to clear these things up.

No it doesn't. The final pink region has the same area as the initial one, it covers only a fraction of the whole gray disc. It's only if you look at the picture with blurring glasses (which is coarse graining) that it looks as if the whole disc is covered.
Okay, I was too quick and thought the picture shows the trajectory, my bad. But still, there is no coarse graining in the picture, it's just regular Liouville time evolution, so it is not relevant to Gibbs' H-theorem.
You mentioned ergodicity first. We obviously disagree on the foundations of classical statistical mechanics, even on the meaning of words such as "ergodicity" and "coarse graining", so I think it's important to clear these things up.
I mentioned ergodicity, because I thought you were arguing about trajectories spreading across phase space. I'm not sure we disagree on foundations of classical statistical mechanics then, but this doesn't invalidate the fact that Gibbs' H-theorem requires a true physical coarse graining.

Basically, there are two ways to arrive at statistical mechanics:

1. Boltzmann's view: The system is always in one and only one pure state, i.e. there is no distribution in the ##N##-particle phase space (or one might say there is always a delta distribution); every particle always has a definite position. However, after exact time evolution by Hamilton's equations of motion, the particles will have evolved chaotically. You can then perform statistics on these ##N## particles and you will find, by Boltzmann's H-theorem (if the Stosszahlansatz is valid), that the one-particle distribution approaches the Boltzmann distribution. You can also study subensembles with ##M\ll N## particles. Those will also approach the canonical ensemble. But the full system will remain forever in a pure state.

2. Gibbs' view: We study distributions in the phase space of the full system. But then we have to argue in a different way how the equilibrium distribution arises, because Liouville time evolution just won't produce the equilibrium distribution from a sharp delta state, which must undeniably be taken as the initial state. In this case, arguments like ergodicity are needed, and they have limited applicability. You could also modify Liouville's equation to account for interactions with the environment (this is a possible way to make Gibbs' H-theorem work).

What does not work is the following: Work in the full space and claim that the equilibrium distribution arises from the unmodified Liouville time evolution. It's just a provable fact that a delta distribution will evolve into another delta distribution.
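The "provable fact" can be spelled out (standard method-of-characteristics reasoning, not specific to any one textbook): the Liouville equation transports the density along the Hamiltonian flow ##\Phi_t##,
$$\partial_t\rho + \{\rho, H\} = 0 \quad\Longrightarrow\quad \rho(z,t) = \rho_0\big(\Phi_{-t}(z)\big),$$
and since the Jacobian of ##\Phi_t## is ##1## by Liouville's theorem, an initial ##\rho_0(z)=\delta(z-z_0)## evolves to ##\rho(z,t)=\delta\big(z-\Phi_t(z_0)\big)##: a delta remains a delta.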

Gibbs' H-theorem requires intermediate coarse graining as a physical process, i.e. it must be in the equations of motion. It can't be a subjective thing and it is unrelated to limitations of our measurement precision or anything like it.

The picture shows how an initial small uncertainty evolves to an uncertainty that, upon coarse graining, looks like a larger uncertainty.
I think this description is a little off. What your series of pictures shows is a series of "snapshots", at single instants of time, of one "cell" of a coarse graining of the phase space (i.e., all of the phase space points in the cell have the same values for macroscopic variables like temperature at that instant of time). At ##t = 0## the cell looks nice and neat and easy to distinguish from the rest of the phase space even with measurements of only finite precision (the exact location of the boundary of the cell will be uncertain, but the boundary is simple and that uncertainty doesn't have much practical effect). As time evolution proceeds, however, ergodicity (I think that's the right term) causes the shape of the cell in phase space to become more and more convoluted and makes it harder and harder to distinguish, by measurements with only finite precision, what part of the phase space is in the cell and what part is not.

There is no coarse graining in your picture.
Yes, there is. The blue region in his picture is not a single trajectory. It's a set of phase space points that correspond to one "cell" of a coarse graining of the phase space at a single instant of time. See above.

Gibbs' H-theorem requires intermediate coarse graining as a physical process, i.e. it must be in the equations of motion.
I don't see how this can be true, since the classical equations of motion are fully deterministic. A trajectory in phase space is a 1-dimensional curve (what you describe as "delta functions evolving to delta functions"); it does not start out as a 1-dimensional curve and then somehow turn into a 2-dimensional area.

Basically, there are two ways to arrive at statistical mechanics
Neither of these looks right to me.

It is true that "the system is always and only in one pure state". And if we could measure with exact precision which state it was in, at any instant, according to classical physics, we would know its state for all time, since the dynamics are fully deterministic.

However, we can't measure the system's state with exact precision. In fact, we can't measure its microscopic state (the individual positions and velocities of all the particles) at all. We can only measure macroscopic variables like temperature, pressure, and volume. So in order to make predictions about what the system will do, we have to coarse grain the phase space into "cells", where each cell represents a set of phase space points that all have the same values for the macroscopic variables we are measuring. Then, roughly speaking, we build theoretical models of the system, for the purpose of making predictions, using these coarse grained cells instead of individual phase space points: we basically assume that, at an instant of time where the macroscopic variables have particular values, the system's exact microscopic state is equally likely to be any of the phase space points inside the cell that corresponds to those values for the macroscopic variables. That gives us a distribution and enables us to do statistics.
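A toy version of this cell construction (particles in two halves of a box; my own illustration, far simpler than a real phase space): the macrostate is the count on the left, its "cell" is the set of compatible microstates, and equal weighting within cells already gives the familiar ##S = \log W## picture.

```python
import itertools
import math

N = 10  # particles, each either in the left (0) or right (1) half of a box

# Coarse graining into cells: the macrostate is n_left, the number of
# particles in the left half; its "cell" is the set of all microstates
# (bit strings) compatible with that value.
cells = {}
for micro in itertools.product((0, 1), repeat=N):
    n_left = micro.count(0)
    cells.setdefault(n_left, []).append(micro)

# Equal a priori probability within each cell gives Boltzmann's
# S = log W (with k = 1): equilibrium is simply the largest cell.
entropy = {n: math.log(len(c)) for n, c in cells.items()}
most_likely = max(entropy, key=entropy.get)  # the equilibrium macrostate
```

Here the "distribution" is uniform over the 252 microstates of the half-and-half cell, exactly the equally-likely assumption described above, and it exists only in the description, not in the dynamics.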

Yes, there is. The blue region in his picture is not a single trajectory. It's a set of phase space points that correspond to one "cell" of a coarse graining of the phase space at a single instant of time. See above.
But that is not the kind of coarse graining that is required by Gibbs' H-theorem. Gibbs requires dynamic coarse graining by the laws of motion.

I don't see how this can be true, since the classical equations of motion are fully deterministic. A trajectory in phase space is a 1-dimensional curve (what you describe as "delta functions evolving to delta functions"); it does not start out as a 1-dimensional curve and then somehow turn into a 2-dimensional area.
Yes, this is why Gibbs' H-theorem doesn't work as a derivation of statistical mechanics. The Liouville equation evolves deltas into deltas. However, if you were allowed to modify the Liouville equation to include coarse graining as a physical phenomenon, then the argument would work. One can understand such a modification as an effective theory arising from a bigger system that also includes the environment, with the environment integrated out, i.e. there is some sort of heat bath.
However, we can't measure the system's state with exact precision. In fact, we can't measure its microscopic state (the individual positions and velocities of all the particles) at all. We can only measure macroscopic variables like temperature, pressure, and volume. So in order to make predictions about what the system will do, we have to coarse grain the phase space into "cells", where each cell represents a set of phase space points that all have the same values for the macroscopic variables we are measuring.
But this only happens on paper, not in the real world. There is no physical coarse graining step involved in this scenario.
Then, roughly speaking, we build theoretical models of the system, for the purpose of making predictions, using these coarse grained cells instead of individual phase space points: we basically assume that, at an instant of time where the macroscopic variables have particular values, the system's exact microscopic state is equally likely to be any of the phase space points inside the cell that corresponds to those values for the macroscopic variables. That gives us a distribution and enables us to do statistics.
Nature doesn't care whether we artificially introduced coarse graining in our description of the system. The actual system will always remain in a delta state. I agree that if we evolve the system, then coarse grain, then evolve, then coarse grain (and so on), the state we have written down on our piece of paper evolves into an equilibrium state. That's Gibbs' H-theorem. But the state in the physical world does not evolve into equilibrium this way, because the coarse graining steps don't happen in the real world. This can only mean that the state on our paper is wrong, because the state in the real world cannot be wrong. If the coarse graining is not actually a physical process, or at least an effective physical process in a subsystem, this way of deriving the equilibrium doesn't work.

Gibbs requires dynamic coarse graining by the laws of motion.
this is why Gibbs' H-theorem doesn't work as a derivation of statistical mechanics
I don't think these claims are true. From posts others have made in this thread, I don't think I'm the only one with that opinion.

What reference are you using for your understanding of Gibbs' derivation of statistical mechanics? (And for that matter, Boltzmann's?)

I don't think these claims are true. From posts others have made in this thread, I don't think I'm the only one with that opinion.

What reference are you using for your understanding of Gibbs' derivation of statistical mechanics? (And for that matter, Boltzmann's?)
Just look at Wikipedia:
The critical point of the theorem is thus: If the fine structure in the stirred-up ensemble is very slightly blurred, for any reason, then the Gibbs entropy increases, and the ensemble becomes an equilibrium ensemble. As to why this blurring should occur in reality, there are a variety of suggested mechanisms. For example, one suggested mechanism is that the phase space is coarse-grained for some reason (analogous to the pixelization in the simulation of phase space shown in the figure). For any required finite degree of fineness the ensemble becomes "sensibly uniform" after a finite time. Or, if the system experiences a tiny uncontrolled interaction with its environment, the sharp coherence of the ensemble will be lost. Edwin Thompson Jaynes argued that the blurring is subjective in nature, simply corresponding to a loss of knowledge about the state of the system. In any case, however it occurs, the Gibbs entropy increase is irreversible provided the blurring cannot be reversed.
For Boltzmann's view, I recommend the books by Ruelle and Khintchin.

Just look at Wikipedia:
Wikipedia is not a valid reference. You need to reference a textbook or peer-reviewed paper. (You do that for Boltzmann so that part is fine, although I don't have those books so I can't personally check the references.)

Gibbs' H-theorem requires a true physical coarse graining
It seems that this is the only part on which you and I disagree, or perhaps don't understand each other. I think a "subjective" coarse graining in the sense of Jaynes is enough.

Nature doesn't care about whether we artificially introduced coarse graining in our description of the system.
Sure, but the way we see nature depends on it a lot.

The actual system will always remain in a delta state.
Sure, but in practice we can't measure a delta state. If we could, we would not need statistical mechanics in the first place.

I agree that if we evolve the system, then coarse grain, then evolve, then coarse grain (and so on), the state we have written down on our piece of paper evolves into an equilibrium state. That's Gibbs' H-theorem. But the state in the physical world does not evolve into equilibrium this way, because the coarse graining steps don't happen in the real world. This can only mean that the state on our paper is wrong, because the state in the real world cannot be wrong.
No, no, no, that's not how coarse graining works. Let me explain how it works on a computer. Suppose that you have a very powerful computer, but connected to a screen with a very low resolution. The computer solves the dynamical equations for N particles with very good precision; the numerical results it gives are essentially "delta functions". The computer shows the results as N dots moving on the screen. However, since the screen resolution is low, the dots on the screen do not give an accurate representation of the numerical results. The screen performs a coarse graining. A scientist who only watches the screen and does not have access to the accurate numbers produced by the computer will see a coarse grained evolution, not the true evolution in the computer.
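The screen analogy can be made literal in a few lines (the 8×8 "screen" size is my own arbitrary choice): exact coordinates go in, only pixel indices come out, and the viewer's best reconstruction is off by up to half a pixel.

```python
import numpy as np

rng = np.random.default_rng(1)

# "The computer": particle positions known to full float precision
# (essentially "delta functions").
positions = rng.uniform(0.0, 1.0, size=(5, 2))

# "The screen": a low-resolution display that only reports which pixel of
# a PIXELS x PIXELS grid each particle sits in -- the coarse graining step.
PIXELS = 8
on_screen = np.floor(positions * PIXELS).astype(int)

# The scientist watching the screen cannot recover the exact positions:
# every position inside a pixel produces the same image.  Their best
# reconstruction (pixel centers) is off by up to half a pixel width.
recovered = (on_screen + 0.5) / PIXELS
max_error = float(np.max(np.abs(recovered - positions)))
```

Note that the computer's internal state is untouched by the display; the coarse graining lives entirely on the observer's side.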

Sure, but the way we see nature depends on it a lot.
If predictions about nature depend on how we see it, then we have made a mistake.
No, no, no, that's not how coarse graining works. Let me explain how it works on a computer. Suppose that you have a very powerful computer, but connected to a screen with a very low resolution. The computer solves the dynamical equations for N particles with very good precision; the numerical results it gives are essentially "delta functions". The computer shows the results as N dots moving on the screen. However, since the screen resolution is low, the dots on the screen do not give an accurate representation of the numerical results. The screen performs a coarse graining. A scientist who only watches the screen and does not have access to the accurate numbers produced by the computer will see a coarse grained evolution, not the true evolution in the computer.
The point is: Sure you can compute a coarse grained state, but it will evolve differently than the fine grained state and it may also produce different predictions than the fine grained state. Gibbs' H-theorem does not prove that the fine grained state converges to an equilibrium. It just shows that a certain coarse grained state does. Sure, maybe some expectation values will be the same in both the fine grained and the coarse grained distributions, but in order to prove that, we have to resort to a proper derivation anyway.

What does that mean for Valentini's derivation of the equilibrium in Bohmian mechanics? It means that the fine grained state does not approach quantum equilibrium. In Bohmian mechanics, this is even more disastrous than in statistical mechanics, because while in statistical mechanics, at least the dynamics of the particles does not depend on the fine grained distribution (it doesn't appear in the Hamiltonian), in Bohmian mechanics, the dynamics of the hidden positions do depend crucially on the fine grained distribution and not the coarse grained one. It is crucial for the reproduction of QM predictions that the fine grained distribution is given by ##|\Psi|^2##. It appears in the quantum potential and also the continuity equation would be violated if the fine grained distribution wasn't given by ##|\Psi|^2##.

If predictions about nature depend on how we see it, then we have made a mistake.
Of course they depend on it, but it's not a mistake. We are doing our best, but we are not perfect.

Of course they depend on it, but it's not a mistake. We are doing our best, but we are not perfect.
Well, if we see it differently than it actually is, then the predictions will be wrong at some point. If we have two competing theories, we must make sure that they either make the same predictions (up to the measurement precision) or we have to reject one of them.

In the case of Bohmian mechanics, it's really crucial that the fine grained distribution is given by ##|\Psi|^2##, because otherwise the continuity equation would be violated. We can't just replace it by a coarse grained distribution, because that would introduce a contradiction into the theory.

Well, if we see it differently than it actually is, then the predictions will be wrong at some point.
And they are wrong at some point. For example, the second law of thermodynamics is wrong at some point, especially for systems with a small number of particles.

In the case of Bohmian mechanics, it's really crucial that the fine grained distribution is given by ##|\Psi|^2##, because otherwise the continuity equation would be violated. We can't just replace it by a coarse grained distribution, because that would introduce a contradiction into the theory.
I don't see any internal contradiction in the theory in which Bohmian particles are not distributed according to ##|\Psi|^2##. Perhaps one could talk about external contradiction, namely a contradiction between theory and observations, but Valentini and his coworkers have demonstrated in many numerical simulations that deviations from ##|\Psi|^2## are negligibly small.

in Bohmian mechanics, the dynamics of the hidden positions do depend crucially on the fine grained distribution and not the coarse grained one.
That's nonsense. The Bohmian trajectories do not depend on the distribution at all, either fine or coarse grained. A Bohmian trajectory depends on ##\Psi##, but a priori ##\Psi## in Bohmian mechanics has nothing to do with the distribution. They are related only in the quantum equilibrium, but a priori the system does not need to be in equilibrium.

I don't see any internal contradiction in the theory in which Bohmian particles are not distributed according to ##|\Psi|^2##. Perhaps one could talk about external contradiction, namely a contradiction between theory and observations, but Valentini and his coworkers have demonstrated in many numerical simulations that deviations from ##|\Psi|^2## are negligibly small.
The internal contradiction in BM is the following: The fine grained probability density ##\rho(x)## satisfies the continuity equation if and only if ##\rho(x) = |\Psi(x)|^2##. Otherwise, probabilities won't add up exactly to ##1## and can't be interpreted as probabilities anymore. So if Valentini shows that the coarse grained distribution ##\left<\rho\right>## quickly evolves to ##\left<|\Psi|^2\right>## even if the fine grained distribution didn't start out in equilibrium, then he has just shown that the coarse grained distribution will quickly deviate from the fine grained distribution, generating the contradiction.

As far as I can tell, the numerics he did only verifies his statement about the coarse grained distribution, which is irrelevant, because we already knew it was true. The fine grained distribution will still deviate from the equilibrium distribution and will cease to be a probability distribution, because it doesn't satisfy the continuity equation. This just shows that this coarse graining procedure is unphysical and Gibbs' H-theorem can't be applied.
That's nonsense. The Bohmian trajectories do not depend on the distribution at all, either fine or coarse grained. A Bohmian trajectory depends on ##\Psi##, but a priori ##\Psi## in Bohmian mechanics has nothing to do with the distribution. They are related only in the quantum equilibrium, but a priori the system does not need to be in equilibrium.
But it is essential for BM to be an explanation of QM that the fine grained distribution closely resembles ##|\Psi|^2##. However, if it isn't exactly equal to it, then it will quickly cease to be a probability distribution due to the violation of the continuity equation.

The fine grained probability density ##\rho(x)## satisfies the continuity equation if and only if ##\rho(x) = |\Psi(x)|^2##.
That's nonsense. As long as particles are not created or destroyed (which they are not in BM), the continuity equation is satisfied for any ##\rho(x)##. More precisely, for any initial ##\rho(x,0)## there is ##\rho(x,t)## such that ##\rho(x,t)## satisfies a continuity equation.

The fine grained probability density ##\rho(x)## satisfies the continuity equation if and only if ##\rho(x) = |\Psi(x)|^2##.
That's nonsense. As long as particles are not created or destroyed (which they are not in BM), the continuity equation is satisfied for any ##\rho(x)##.
But the current in such a continuity equation is not given as a function of the wavefunction and its derivatives.
More precisely, for any initial ##\rho(x,0)## there is ##\rho(x,t)## such that ##\rho(x,t)## satisfies a continuity equation.
That was a later addition in an edit, which intentionally wrote "a continuity equation" instead of "the continuity equation". Indeed, I agree that you are right in your defense of coarse graining for BM. But since Nullstein is nicely focused in his points, I think I am right too in making the concrete difference between "a continuity equation" and "the continuity equation" more explicit.

But the current in such a continuity equation is not given as a function of the wavefunction and its derivatives.

That was a later addition in an edit, which intentionally wrote "a continuity equation" instead of "the continuity equation". Indeed, I agree that you are right in your defense of coarse graining for BM. But since Nullstein is nicely focused in his points, I think I am right too in making the concrete difference between "a continuity equation" and "the continuity equation" more explicit.
Well, if Nullstein meant this by the continuity equation, then he is even more wrong. He said that "Otherwise, probabilities won't add up exactly to 1 and can't be interpreted as probabilities anymore.". But probabilities add up to 1 for any ##\rho(x,t)## that satisfies (i) the initial condition ##\int d^3x \, \rho(x,0)=1## and (ii) continuity equation for all times with this ##\rho(x,t)##.
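Spelled out, the normalization statement follows from the divergence theorem alone (assuming ##\rho\,\vec v## falls off at infinity): for any density advected by the Bohmian velocity field,
$$\frac{d}{dt}\int d^3x\,\rho(x,t) = -\int d^3x\,\nabla\cdot\big(\rho(x,t)\,\vec v(x,t)\big) = 0,$$
so the total probability stays ##1## whether or not ##\rho = |\Psi|^2##.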

That's nonsense. As long as particles are not created or destroyed (which they are not in BM), the continuity equation is satisfied for any ##\rho(x)##. More precisely, for any initial ##\rho(x,0)## there is ##\rho(x,t)## such that ##\rho(x,t)## satisfies a continuity equation.
You shouldn't call something nonsense just because you don't understand it. In QM, the continuity equation is given by ##\partial_t|\Psi|^2 +\nabla\cdot\vec j = 0##, where ##\vec j## is the probability current derived from the Schrödinger equation. In BM, one defines the velocity of the hidden variables to be ##\vec v = \frac{\vec j}{|\Psi|^2}## and you can insert it into the continuity equation to arrive at ##\partial_t|\Psi|^2 +\nabla\cdot(|\Psi|^2\vec v)=0##. That's fine, because ##|\Psi|^2\vec v = \vec j## and the equation still holds, because it's just a reformulation. But if you now replace ##|\Psi|^2## by some different function ##\rho## in that equation, the equation will no longer be satisfied.

It can never be a mistake to just work with the fine grained distribution. If there is a coarse grained distribution that makes the same predictions, this is fine, but if it makes different predictions than the fine grained version, then using the coarse grained version is just a mistake. In this case, the above discussion proves that the fine grained distribution will not satisfy the continuity equation if it is not equal to ##|\Psi|^2##, which leads to probabilities that don't add up to ##1##, so the fine grained distribution will cease to be a probability distribution.

Well, if Nullstein meant this by the continuity equation, then he is even more wrong. He said that "Otherwise, probabilities won't add up exactly to 1 and can't be interpreted as probabilities anymore.". But probabilities add up to 1 for any ##\rho(x,t)## that satisfies (i) the initial condition ##\int d^3x \, \rho(x,0)=1## and (ii) continuity equation for all times with this ##\rho(x,t)##.
You can't just use any random continuity equation. There is exactly one fine grained distribution, which satisfies one specific continuity equation that follows from the Schrödinger equation. You are free to use a coarse grained version of the distribution if its predictions agree with the fine grained version (up to some precision). But in this case, it is even provable that the fine grained distribution won't even remain a probability distribution after some time. Using the fine grained distribution can never be wrong. Using the coarse grained version can only be convenience. If the fine grained version tells you that probabilities don't add up to ##1##, then you have a serious problem.

You can't just use any random continuity equation. ... But in this case, it is even provable that the fine grained distribution won't even remain a probability distribution after some time.
This seems wrong to me. For the fine grained distribution to remain a probability distribution, it is enough that it remains non-negative and satisfies a continuity equation.
