# Definition of entropy for indistinguishable and distinguishable particles

In summary, according to R. Becker, entropy in statistical physics is only defined up to an additive function of N. Dividing Ω by N! ensures that the probability for each accessible microstate is comparable between distinguishable and indistinguishable particles.

#### Philip Koeck

I have a rather general question about the definition of entropy used in most textbooks:
S = k ln Ω, where Ω is the number of available microstates.

Boltzmann wrote W rather than Ω, and I believe this stood for probability (Wahrscheinlichkeit).
Obviously this is not a number between 0 and 1, so it's more like something proportional to probability.

Probability would be the number of accessible microstates in a given macrostate divided by the total number of microstates (including those that are not accessible).
Now for distinguishable particles both these numbers are bigger than for indistinguishable particles, by a factor N!, where N is the number of particles, in the case of low occupancy.

Would it make sense therefore to use the following definition of entropy for distinguishable particles to make sure that this "probability" W is calculated correctly?
S = k ln (Ω / N!) for distinguishable particles at low occupancy.

This would make S extensive even for distinguishable particles.
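A minimal numerical sketch of the extensivity claim, under a loudly stated assumption: the toy count Ω = V^N (positional multiplicity only, with V measured in units of a cell volume) stands in for the count of distinguishable particles at low occupancy. The function names are made up for this illustration. Doubling both N and V should double an extensive entropy, so the quantity [S(2N, 2V) − 2 S(N, V)] / (2N k) should vanish.

```python
import math

def entropy_over_k(n, v, divide_by_factorial):
    """S/k for the assumed toy count Omega = v**n (not a full ideal-gas count)."""
    ln_omega = n * math.log(v)
    if divide_by_factorial:
        ln_omega -= math.lgamma(n + 1)  # lgamma(n + 1) = ln N!
    return ln_omega

def extensivity_defect_per_particle(n, v, divide_by_factorial):
    """[S(2N, 2V) - 2 S(N, V)] / (2 N k); zero for an extensive entropy."""
    doubled = entropy_over_k(2 * n, 2 * v, divide_by_factorial)
    single = entropy_over_k(n, v, divide_by_factorial)
    return (doubled - 2 * single) / (2 * n)

n, v = 10**6, 10**9
without = extensivity_defect_per_particle(n, v, False)
with_nfact = extensivity_defect_per_particle(n, v, True)
print(without, with_nfact)
```

Without the N! the defect is ln 2 per particle, independent of N. With the N! it is only of order ln(N)/N and disappears in the thermodynamic limit.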

If you do so you miss the mixing entropy for distinguishable particles. All these obstacles of counting go away when using quantum field theory, because there the appropriate commutator (bosons) or anticommutator (fermions) relations of the field operators take care of indistinguishability of particles and provide the right counting.

vanhees71 said:
If you do so you miss the mixing entropy for distinguishable particles. All these obstacles of counting go away when using quantum field theory, because there the appropriate commutator (bosons) or anticommutator (fermions) relations of the field operators take care of indistinguishability of particles and provide the right counting.
The problem I see is that QFT doesn't deal with distinguishable particles at all. Is that correct?
On the other hand there is extensive literature on statistical physics for systems of distinguishable particles, such as colloids, see for example Robert Swendsen.
I believe in this literature it's claimed that S is only defined up to an additive function of N, so the division of Ω by N! is okay according to that.

Missing the exchange entropy is sort of the point of this division by N! because the factor N! applies to both the accessible and the total number of microstates. The division by N! makes sure that the probability for each accessible microstate is comparable between distinguishable and indistinguishable particles.

I do, however, find it somewhat messy to have 2 different definitions of S.

Of course QFT describes distinguishable particles. Already QED has electrons, positrons, and photons. Particles are distinguishable by intrinsic quantum numbers such as spin, mass, and charge(s).

vanhees71 said:
Of course QFT describes distinguishable particles. Already QED has electrons, positrons, and photons. Particles are distinguishable by intrinsic quantum numbers such as spin, mass, and charge(s).
I meant identical distinguishable particles. For example one could imagine an aerosol where all colloidal particles are identical. Obviously that's not realistic, but one could, in principle, produce an aerosol with a mass distribution that is so narrow that the particles might as well be identical.
If one is only interested in the distribution of kinetic energies in this aerosol, for example, all detailed structural differences are irrelevant.
How would one define S for such a system? Would the division by N! make sense?

I believe such an aerosol is one of the systems Swendsen studies using statistical physics.

Identical particles are indistinguishable in quantum theory (i.e., either fermions or bosons).

vanhees71 said:
Identical particles are indistinguishable in quantum theory (i.e., either fermions or bosons).
I think people like Swendsen simply don't use QT to describe systems of almost identical, macroscopic objects such as colloids. So they have to deal with identical distinguishable particles. They argue that S for such systems should still be extensive. One way of making sure it is would be to divide by N!
I'm just wondering what people think about having different definitions of S for macroscopic, identical, distinguishable particles and for microscopic, indistinguishable particles.

Sure, you cannot describe matter consistently with classical physics. That's why you have to borrow a bit from quantum many-body theory. Nevertheless, if the particles are distinguishable you have mixing entropy. There was a long debate about this in the forums some time ago. The clearest discussion I'm aware of is in

R. Becker, Theory of Heat, Springer (1967)

That's anyway a very good textbook on both phenomenological thermodynamics and statistical physics.

vanhees71 said:
Sure, you cannot describe matter consistently with classical physics. That's why you have to borrow a bit from quantum many-body theory. Nevertheless, if the particles are distinguishable you have mixing entropy. There was a long debate about this in the forums some time ago. The clearest discussion I'm aware of is in

R. Becker, Theory of Heat, Springer (1967)

That's anyway a very good textbook on both phenomenological thermodynamics and statistical physics.
So would you say that S = k ln Ω (where Ω is the number of accessible microstates) is the only sensible definition of entropy in statistical physics?
Would you exclude anything like S = k ln Ω + f(N) ?

I'll have to look at that book. I've heard it mentioned a lot.

I'd always use the von Neumann-Shannon-Jaynes definition,
$$S=-k \mathrm{Tr} (\hat{\rho} \ln \hat{\rho}).$$

vanhees71 said:
I'd always use the von Neumann-Shannon-Jaynes definition,
$$S=-k \mathrm{Tr} (\hat{\rho} \ln \hat{\rho}).$$
In Blundell's book they show that the information-definition is equivalent to k ln Ω, but I can't make sense of what they write. Do you know a good derivation? (Is it equivalent?)

But back to S = k ln Ω:

I realize now what you mean by mixing entropy. When 2 volumes containing different gases at the same P and T mix, S increases. Is that what you meant?

This is easily shown using the Sackur-Tetrode equation (STE) for two ideal gases of indistinguishable particles, for example He and Ne atoms.

Now I've shown in my own derivation that by defining S = k ln (Ω / N!) one also gets the STE for an ideal gas of distinguishable particles, whatever that might be. (Think of trillions of identical steel balls in a very large container in zero gravity if you want to.)
See: https://www.researchgate.net/publication/330675047_An_introduction_to_the_statistical_physics_of_distinguishable_and_indistinguishable_particles_in_an_ideal_gas

That would mean that the STE also predicts an entropy increase upon mixing of two volumes of gas consisting of distinguishable particles if entropy is defined as S = k ln (Ω / N!).
The only difference is that the two gases don't need to be different.

So I don't think one misses mixing entropy with this definition of S.

Philip Koeck said:
In Blundell's book they show that the information-definition is equivalent to k ln Ω, but I can't make sense of what they write. Do you know a good derivation? (Is it equivalent?)
In the microcanonical ensemble you fix the (relevant) additive conserved quantities. For a simple ideal gas that's the energy and the particle number. Given the values of these quantities in equilibrium each micro state is equally probable, i.e., ##P=1/\Omega=\text{const}## and then
$$S=-k \sum P \ln P=-k \,\Omega \cdot \frac{1}{\Omega} \ln\left(\frac{1}{\Omega}\right)=k \ln \Omega.$$
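A small numerical check of this step, as a sketch only: the helper name `shannon_entropy_over_k` and the value Ω = 1000 are arbitrary choices for illustration. For a uniform distribution over Ω microstates the Shannon sum reproduces ln Ω, and any non-uniform distribution over the same microstates gives a smaller value.

```python
import math

def shannon_entropy_over_k(probabilities):
    """S/k = -sum p ln p, skipping zero-probability states."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)

omega = 1000

# Microcanonical case: every accessible microstate has P = 1/Omega.
uniform = [1.0 / omega] * omega
s_uniform = shannon_entropy_over_k(uniform)

# A skewed distribution over the same microstates, for comparison.
skewed = [0.5] + [0.5 / (omega - 1)] * (omega - 1)
s_skewed = shannon_entropy_over_k(skewed)

print(s_uniform, math.log(omega), s_skewed)
```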
Philip Koeck said:
But back to S = k ln Ω:

I realize now what you mean by mixing entropy. When 2 volumes containing different gases at the same P and T mix, S increases. Is that what you meant?
That's the standard example. If you have a gas, say helium, in a box of volume ##V## divided into volumes ##V_1## and ##V_2## by some semipermeable membrane, with only ##^3\text{He}## atoms in one part and only ##^4\text{He}## atoms in the other (the entire volume at the same temperature and pressure), and you then wait until the atoms have diffused through the membrane for long enough (or you simply remove the wall adiabatically and wait until the gases have diffused), the entropy increases by this mixing entropy,
$$S_{\text{mix}}=k (N \ln N-N_1 \ln N_1-N_2 \ln N_2)=k [N_1 \ln (N/N_1)+N_2 \ln(N/N_2)]=k [N_1 \ln(V/V_1) + N_2 \ln(V/V_2)].$$
If you ignore that the gas is in fact a mixture of atoms with nuclei of different isotopes, as is usually done in chemistry, this mixing entropy doesn't play a role in the entropy differences in the usual chemical processes and thus can be ignored.
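The three forms of the mixing entropy above can be compared numerically. This is only a sketch: the function name and the particle numbers and volumes are made up for illustration, and the third form agrees with the first two only under the stated condition of equal temperature and pressure, which for ideal gases means N_1/V_1 = N_2/V_2.

```python
import math

def mixing_entropy_forms(n1, n2, v1, v2):
    """Return the three expressions for S_mix / k as a tuple."""
    n, v = n1 + n2, v1 + v2
    form_a = n * math.log(n) - n1 * math.log(n1) - n2 * math.log(n2)
    form_b = n1 * math.log(n / n1) + n2 * math.log(n / n2)
    form_c = n1 * math.log(v / v1) + n2 * math.log(v / v2)
    return form_a, form_b, form_c

# Same density on both sides (equal T and P for ideal gases).
a, b, c = mixing_entropy_forms(3.0e22, 1.0e22, 3.0e-3, 1.0e-3)

# Equal halves: the mixing entropy is N ln 2 in units of k.
equal_split = mixing_entropy_forms(1.0e22, 1.0e22, 1.0e-3, 1.0e-3)[1]
print(a, b, c, equal_split)
```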

Only if you do the gedanken experiment with two indistinguishable gases, i.e., if initially in both parts is only one isotope of helium, then the mixing entropy is a paradox, which is resolved by introducing the famous factor ##N!## for indistinguishable particles, getting the correct Sackur-Tetrode formula for an additive entropy for the ideal gas.
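A sketch of this paradox and its resolution, assuming the textbook monatomic ideal-gas result written in terms of the thermal wavelength λ = h/√(2π m k T): S/k = N[ln(V/(Nλ³)) + 5/2] with the 1/N! factor (Sackur-Tetrode, using Stirling's approximation) and S/k = N[ln(V/λ³) + 3/2] without it. The function names and the numerical values (helium-4 at 300 K, 10²² atoms per litre on each side) are illustrative choices, not taken from the thread.

```python
import math

H = 6.62607015e-34                      # Planck constant in J s
K_B = 1.380649e-23                      # Boltzmann constant in J/K
M_HE4 = 4.002602 * 1.66053906660e-27    # mass of a helium-4 atom in kg

def ideal_gas_entropy_over_k(n, volume, temperature, mass, divide_by_n_factorial):
    """S/k of a monatomic ideal gas (Stirling's approximation for ln N!)."""
    thermal_wavelength = H / math.sqrt(2.0 * math.pi * mass * K_B * temperature)
    ln_cells = math.log(volume / thermal_wavelength**3)
    if divide_by_n_factorial:
        # Sackur-Tetrode: N [ln(V / (N lambda^3)) + 5/2]
        return n * (ln_cells - math.log(n) + 2.5)
    # Same count without the 1/N! factor: N [ln(V / lambda^3) + 3/2]
    return n * (ln_cells + 1.5)

def entropy_change_same_gas(n1, v1, n2, v2, temperature, mass, divide_by_n_factorial):
    """Delta S / k when the wall between two samples of the SAME gas is removed."""
    before = (ideal_gas_entropy_over_k(n1, v1, temperature, mass, divide_by_n_factorial)
              + ideal_gas_entropy_over_k(n2, v2, temperature, mass, divide_by_n_factorial))
    after = ideal_gas_entropy_over_k(n1 + n2, v1 + v2, temperature, mass,
                                     divide_by_n_factorial)
    return after - before

n_half, v_half, temp = 1.0e22, 1.0e-3, 300.0
delta_with = entropy_change_same_gas(n_half, v_half, n_half, v_half, temp, M_HE4, True)
delta_without = entropy_change_same_gas(n_half, v_half, n_half, v_half, temp, M_HE4, False)
print(delta_with, delta_without)
```

Without the N! the removal of the wall between two identical samples appears to produce N ln 2 of entropy; with it the change vanishes up to rounding.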
Philip Koeck said:
This is easily shown using the Sackur-Tetrode equation (STE) for two ideal gases of indistinguishable particles, for example He and Ne atoms.

Now I've shown in my own derivation that by defining S = k ln (Ω / N!) one also gets the STE for an ideal gas of distinguishable particles, whatever that might be. (Think of trillions of identical steel balls in a very large container in zero gravity if you want to.)
See: https://www.researchgate.net/publication/330675047_An_introduction_to_the_statistical_physics_of_distinguishable_and_indistinguishable_particles_in_an_ideal_gas

That would mean that the STE also predicts an entropy increase upon mixing of two volumes of gas consisting of distinguishable particles if entropy is defined as S = k ln (Ω / N!).
The only difference is that the two gases don't need to be different.

So I don't think one misses mixing entropy with this definition of S.
I don't understand the part about distinguishable particles. If you have distinguishable particles you must count the different components of the gas separately, and doing so automatically leads to the correct mixing entropy. As I said, the simplest way is to first do the calculation with QFT for a mixture of different gases (each bosons or fermions) and then take the classical limit of low occupancies, which leads to the correct Boltzmann distributions.

As an example, I like to consider random DNA molecules. If the molecules are long enough, no two molecules will have the same sequence. Macroscopically, the DNA will behave as a pure uniform substance. You can't devise a macroscopic process which would allow you to generate work from mixing two such DNA samples. Nevertheless, with more refined techniques, such a process is possible, as you can read out a DNA sequence when it diffuses through a pore. This is exactly the situation of a Maxwellian demon, i.e. you are acquiring large amounts of information whose deletion will generate entropy, so that in the end, nothing is gained in terms of entropy change, whether you simply mix the DNA samples or whether you try to do so in a reversible manner, where you have to delete information at the end.

Sure, if you cannot distinguish the distinguishable substances, then it behaves as one substance, but in this case the mixing entropy is simply irrelevant and it doesn't make a difference in your description in this "coarse grained way".

This becomes conceptually very clear when remembering the information-theoretical meaning of entropy as a measure for the missing information.

vanhees71 said:
In the microcanonical ensemble you fix the (relevant) additive conserved quantities. For a simple ideal gas that's the energy and the particle number. Given the values of these quantities in equilibrium each micro state is equally probable, i.e., ##P=1/\Omega=\text{const}## and then
$$S=-k \sum P \ln P=-k \,\Omega \cdot \frac{1}{\Omega} \ln\left(\frac{1}{\Omega}\right)=k \ln \Omega.$$
Blundell's book does something very strange when it introduces the Shannon formula. To me it sounds like they introduce another level of micro states, so they get macro, micro and even more micro.
I can share the passages if someone wants to look. If someone has the book look at the beginning of section 14.8 and compare that to what they say after equation (14.36).

Anyway it got me thinking and my questions are getting more and more basic as a consequence, I'm afraid.

If I consider a very strange isolated system, which I call A and which has a total of 2 possible macro states called 1 and 2 with Ω micro states in each of these macro states, then the entropy of the system, when it's in macro state 1 would be given by S = k ln Ω.

Now let's say I have another system, B, that has 4 such macro states, also with Ω micro states in each macro state, and Ω is the same number as in A.
If this system is in state 1 it also has the entropy S = k ln Ω.

How can the entropy be the same for these two systems?
To me it seems that I've gained more information by knowing that system B is in macro state 1 than by knowing that system A is in macro state 1.

vanhees71 said:
Sure, if you cannot distinguish the distinguishable substances, then it behaves as one substance, but in this case the mixing entropy is simply irrelevant and it doesn't make a difference in your description in this "coarse grained way".

This becomes conceptually very clear when remembering the information-theoretical meaning of entropy as a measure for the missing information.
Yes, this is the point I wanted to make. Even if particles are distinguishable at a microscopic level (as in the DNA example or in case of isotopic substitution pattern), the information contained in these particle "labels" corresponds to a huge information entropy, which has to be taken into account properly.

Philip Koeck said:
Blundell's book does something very strange when it introduces the Shannon formula. To me it sounds like they introduce another level of micro states, so they get macro, micro and even more micro.
I can share the passages if someone wants to look. If someone has the book look at the beginning of section 14.8 and compare that to what they say after equation (14.36).

Anyway it got me thinking and my questions are getting more and more basic as a consequence, I'm afraid.

If I consider a very strange isolated system, which I call A and which has a total of 2 possible macro states called 1 and 2 with Ω micro states in each of these macro states, then the entropy of the system, when it's in macro state 1 would be given by S = k ln Ω.

Now let's say I have another system, B, that has 4 such macro states, also with Ω micro states in each macro state, and Ω is the same number as in A.
If this system is in state 1 it also has the entropy S = k ln Ω.

How can the entropy be the same for these two systems?
To me it seems that I've gained more information by knowing that system B is in macro state 1 than by knowing that system A is in macro state 1.
But it's right that you get the same entropy, because if you say the system is in a given macro state (in equilibrium with precisely given values for all (relevant) additive conserved quantities of the system) and the number of available microstates is the same, you get the same entropy.

I remember one author of a textbook (unfortunately I don't remember which one) said that entropy in the information-theoretical sense is a measure for the "surprise" if you look at a definite outcome, given the objective knowledge you have about the system.

If you make the macrostate less precise, i.e., if you make the knowledge coarser, more microstates are compatible with the macrostate you have knowledge of and thus the entropy gets larger, i.e., the surprise for getting a certain outcome, when observing some definite microstate is larger than before.
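The bookkeeping behind this answer can be sketched for the two hypothetical systems A (2 macrostates) and B (4 macrostates) from the question. The sketch makes one explicit assumption that is not stated in the thread: before the macrostate is known, all microstates of all macrostates are taken as equally probable. The helper names and Ω = 500 are arbitrary.

```python
import math

def shannon_over_k(probabilities):
    """Missing information S/k = -sum p ln p."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)

def macrostate_bookkeeping(num_macrostates, omega):
    """Missing information before and after learning the macrostate, and the gain."""
    total = num_macrostates * omega
    before = shannon_over_k([1.0 / total] * total)   # macrostate unknown (assumed uniform)
    after = shannon_over_k([1.0 / omega] * omega)    # macrostate 1 known
    return before, after, before - after

omega = 500
before_a, after_a, gained_a = macrostate_bookkeeping(2, omega)
before_b, after_b, gained_b = macrostate_bookkeeping(4, omega)
print(after_a, after_b, gained_a, gained_b)
```

Under this assumption, learning the macrostate of B does remove more missing information (ln 4 instead of ln 2), but what remains afterwards, k ln Ω, is the same for both systems, and that remainder is the entropy of the macrostate.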

vanhees71 said:
I don't understand the part about distinguishable particles. If you have distinguishable particles you must count the different components of the gas separately, and doing so automatically leads to the correct mixing entropy. As I said, the simplest way is to first do the calculation with QFT for a mixture of different gases (each bosons or fermions) and then take the classical limit of low occupancies, which leads to the correct Boltzmann distributions.
I wouldn't speak of a gas with different components. In my treatment of distinguishable particles in the second half of the text (https://www.researchgate.net/publication/330675047_An_introduction_to_the_statistical_physics_of_distinguishable_and_indistinguishable_particles_in_an_ideal_gas) every particle can be distinguished from every other particle. I take that into account when I write the expression for W.
I can get all the usual results for an ideal gas of distinguishable particles if I assume that S = k ln (W/N!), whereas for an ideal gas of indistinguishable particles I get the usual results for S = k ln W.

I'm mainly wondering whether it's reasonable to use two different definitions of S.

I'm a bit puzzled by the fact that your entropy is smaller for distinguishable particles than for indistinguishable ones. Shouldn't it be precisely the other way around? If you have distinguishable particles, the exchange of any two particles occupying different microstates has to be counted as a different state, while for indistinguishable particles both arrangements are counted as only one state.

vanhees71 said:
I'm a bit puzzled by the fact that your entropy is smaller for distinguishable particles than for indistinguishable ones. Shouldn't it be precisely the other way around? If you have distinguishable particles, the exchange of any two particles occupying different microstates has to be counted as a different state, while for indistinguishable particles both arrangements are counted as only one state.
I would say it's the same.
Don't forget that W is bigger for distinguishable than for indistinguishable particles, given the same structure of energy levels and single particle states within these levels and the same number of particles in the system.
The division by N! only makes sense if the number of available single particle states is much larger than the number of particles, so that one can be certain that there is hardly ever more than one particle in a state.
So, at low occupancy, W for distinguishable particles is larger by a factor N! than W for indistinguishable particles.
You actually see that directly if you look at the expressions for W of indistinguishable particles at low occupancy on pages 6 and 7 and compare to the expressions for W of distinguishable particles at low occupancy on pages 15 and 23.
Remark: W for distinguishable, boson-like particles doesn't change when going to low occupancy, all the other Ws do.
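The factor N! at low occupancy can be checked by brute counting in a toy model. This is a sketch under stated assumptions, not the counting used in the linked text: N particles are placed in a single level with g single-particle states, distinguishable particles are counted as g^N (no occupancy limit), bosons as C(g+N−1, N) and fermions as C(g, N). The function names are made up for the illustration.

```python
import math

def count_distinguishable(n, g):
    """Ways to put n labelled particles into g states, no occupancy limit."""
    return g**n

def count_bosons(n, g):
    """Ways to put n indistinguishable bosons into g states."""
    return math.comb(g + n - 1, n)

def count_fermions(n, g):
    """Ways to put n indistinguishable fermions into g states (at most one each)."""
    return math.comb(g, n)

def ratios_to_n_factorial(n, g):
    """W_dist / (N! W_indist) for bosons and fermions; 1 means an exact factor N!."""
    w_dist = count_distinguishable(n, g)
    n_fact = math.factorial(n)
    return (w_dist / (n_fact * count_bosons(n, g)),
            w_dist / (n_fact * count_fermions(n, g)))

low_occupancy = ratios_to_n_factorial(5, 10**6)   # g >> N
high_occupancy = ratios_to_n_factorial(5, 5)      # g comparable to N
print(low_occupancy, high_occupancy)
```

For g ≫ N both ratios are very close to 1, so W for distinguishable particles exceeds W for indistinguishable ones by N!. When g is comparable to N the ratios move far away from 1, which is why the division by N! only makes sense at low occupancy.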

Philip Koeck said:
I meant identical distinguishable particles. For example one could imagine an aerosol where all colloidal particles are identical. Obviously that's not realistic, but one could, in principle, produce an aerosol with a mass distribution that is so narrow that the particles might as well be identical.
If one is only interested in the distribution of kinetic energies in this aerosol, for example, all detailed structural differences are irrelevant.
How would one define S for such a system? Would the division by N! make sense?

I believe such an aerosol is one of the systems Swendsen studies using statistical physics.
You can describe identical distinguishable particles in QFT using paraboson or parafermion creation and annihilation operators. This also allows one to describe Boltzmann statistics as a limit in QFT.

What are "identical distinguishable particles"? That's an oxymoron for me. Either the particles are distinguishable, i.e., having different intrinsic quantum numbers (spins, various "charges" of the standard model) or they are indistinguishable, having all the same intrinsic quantum numbers.

In the usual case of 3 space dimensions indistinguishable particles are described as either fermions or bosons related to their spin being half-integer or integer valued, respectively. The resulting equilibrium phase-space distributions are thus Fermi-Dirac or Bose-Einstein distributions. In the limit of small phase-space densities both distributions are well approximated by the Maxwell-Boltzmann distribution.
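The low-density limit mentioned here can be illustrated with the mean occupation numbers per single-particle state, written in terms of x = (ε − μ)/(kT). The sketch below uses the standard expressions 1/(eˣ + 1), 1/(eˣ − 1) and e⁻ˣ; the function name and the two sample values of x are arbitrary choices.

```python
import math

def mean_occupancy(x, statistics):
    """Mean occupation of a state with x = (eps - mu) / (k T)."""
    if statistics == "FD":
        return 1.0 / (math.exp(x) + 1.0)
    if statistics == "BE":
        if x <= 0.0:
            raise ValueError("Bose-Einstein occupancy requires eps > mu")
        return 1.0 / (math.exp(x) - 1.0)
    if statistics == "MB":
        return math.exp(-x)
    raise ValueError("statistics must be 'FD', 'BE' or 'MB'")

# Dilute regime: occupancy much smaller than 1.
dilute = {s: mean_occupancy(10.0, s) for s in ("FD", "BE", "MB")}

# Dense regime: occupancy of order 1, where the three statistics differ.
dense = {s: mean_occupancy(0.5, s) for s in ("FD", "BE", "MB")}
print(dilute, dense)
```

At x = 10 the three occupancies agree to better than one part in 10⁴, while at x = 0.5 the Bose-Einstein value is more than twice the Maxwell-Boltzmann one and the Fermi-Dirac value lies well below it.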

vanhees71 said:
What are "identical distinguishable particles"? That's an oxymoron for me. Either the particles are distinguishable, i.e., having different intrinsic quantum numbers (spins, various "charges" of the standard model) or they are indistinguishable, having all the same intrinsic quantum numbers.

In the usual case of 3 space dimensions indistinguishable particles are described as either fermions or bosons related to their spin being half-integer or integer valued, respectively. The resulting equilibrium phase-space distributions are thus Fermi-Dirac or Bose-Einstein distributions. In the limit of small phase-space densities both distributions are well approximated by the Maxwell-Boltzmann distribution.
Fully agree with the second part of the above.

About systems of identical distinguishable particles:
One can certainly simulate such systems in the computer and it can be interesting to calculate S for such systems.
One could also think of very unlikely model systems such as a large container in zero gravity with billions of tiny steel balls flying around in it. Clearly they are not really identical, but they can be so similar that the differences are irrelevant, for example if I'm only interested in the distribution of kinetic energies and an expression for S related to this distribution.
They are also distinguishable despite being "identical" simply because it's possible to keep track of them with an imaging system without disturbing them noticeably.
A more realistic system could be an aerosol or even colloidal particles in solution.
Again, the particles could be so similar that the small differences are irrelevant for the analysis.
That sort of system is what Swendsen, for example, studies. He also wrote a textbook on statistical physics.
