# Why is the partition function for fermions a sum of Boltzmann factors?


## Main Question or Discussion Point

The partition function should essentially be the sum of probabilities of being in various states, I believe. Why is it then the sum of Boltzmann factors even for fermions and bosons? I've never seen a good motivation for this in literature.

## Answers and Replies

I'll reformulate my question: To derive for example the FD distribution textbooks usually start with the partition function for the Boltzmann distribution with constant particle number, which is simply a sum of Boltzmann factors. Then the chemical potential is plugged in to account for non-constant particle number. From this the FD-distribution is derived as the expectation value for the number of particles on each energy level.
How is this procedure motivated? Why would one even think of trying to derive the FD distribution from the Boltzmann distribution?

Demystifier
In general, let $|k\rangle$ be the complete set of Hamiltonian eigenstates, $H|k\rangle=E_k|k\rangle$. The partition function is then
$$Z=\sum_k e^{-\beta E_k}$$
Now let us specify this for free particles. A single free particle with momentum ${\bf p}$ has energy $E({\bf p})$. The basis of Hamiltonian eigenstates are all states of the form
$$|k\rangle=|n_1,{\bf p}_1;n_2,{\bf p}_2;\ldots\rangle$$
which denotes a state where $n_1$ particles have momentum ${\bf p}_1$, $n_2$ particles have momentum ${\bf p}_2$, etc. The energy of this state is
$$E_k=n_1E({\bf p}_1)+n_2E({\bf p}_2)+\ldots$$
so
$$e^{-\beta E_k}=e^{-\beta n_1E({\bf p}_1)} e^{-\beta n_2E({\bf p}_2)} \cdots$$
Hence the partition function is
$$Z=\sum_{n_1,n_2,\ldots} e^{-\beta n_1E({\bf p}_1)} e^{-\beta n_2E({\bf p}_2)} \cdots = \sum_{n_1}e^{-\beta n_1E({\bf p}_1)} \sum_{n_2}e^{-\beta n_2E({\bf p}_2)} \cdots= \prod_{{\bf p}} z({\bf p})$$
where
$$z({\bf p})=\sum_{n({\bf p})} e^{-\beta n({\bf p}) E({\bf p})}=\sum_{n} e^{-\beta nE({\bf p})}$$
Hence
$${\rm ln}Z=\sum_{{\bf p}}{\rm ln}\,z({\bf p})=V\int \frac{d^3p}{(2\pi)^3} {\rm ln}\,z({\bf p})$$
It remains to compute $z({\bf p})$, which depends on whether particles are fermions or bosons. For fermions $n=0,1$ so
$$z_{\rm fer}({\bf p})=1+e^{-\beta E({\bf p})}$$
For bosons $n=0,1,\ldots,\infty$, so
$$z_{\rm bos}({\bf p})=\sum_{n=0}^{\infty}(e^{-\beta E({\bf p})})^n=\frac{1}{1-e^{-\beta E({\bf p})}}$$
As you can see, in the first equation one has a Boltzmann-like expression which looks similar to that in classical physics. The Bose-Einstein or Fermi-Dirac factor emerges at the end of the calculation, from the sum over $n$. I hope this answers your original question. The chemical potential can be added in a similar way, by making the replacement $E_k\rightarrow E_k-\mu N_k$ in the first equation.
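The factorization above can be checked numerically. Here is a short Python sketch that sums $e^{-\beta E_k}$ over occupation-number states by brute force and compares the result with the product of single-mode sums $z_{\rm fer}({\bf p})$ and $z_{\rm bos}({\bf p})$; the mode energies, $\beta$, and the bosonic cutoff are made-up illustrative values, not anything from the post.

```python
import math
from itertools import product

# Toy spectrum: three momentum modes with these single-particle energies
# (arbitrary units); beta is the inverse temperature. Illustrative values.
energies = [0.5, 1.0, 1.7]
beta = 0.8

def Z_brute(max_n):
    """Sum e^{-beta E_k} over all occupation-number states (n1, n2, n3),
    each n ranging over 0..max_n (max_n=1 for fermions)."""
    total = 0.0
    for ns in product(range(max_n + 1), repeat=len(energies)):
        E_k = sum(n * E for n, E in zip(ns, energies))
        total += math.exp(-beta * E_k)
    return total

# Fermions: n = 0, 1 per mode, so Z factorizes into prod(1 + e^{-beta E}).
Z_fer = math.prod(1 + math.exp(-beta * E) for E in energies)
assert abs(Z_brute(1) - Z_fer) < 1e-12

# Bosons: n = 0, 1, 2, ...; truncate the brute-force sum at a large n and
# compare with the closed geometric-series form 1/(1 - e^{-beta E}).
Z_bos = math.prod(1 / (1 - math.exp(-beta * E)) for E in energies)
assert abs(Z_brute(60) - Z_bos) < 1e-9
print("factorization checks pass")
```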

stevendaryl
Staff Emeritus
Here's a heuristic argument for why the factor $e^{-\beta E}$ comes into play:

Suppose that you have a system with discrete energy levels $E_i$. It is kept at a constant temperature by keeping it in thermal contact with a "heat bath". A heat bath is a second system that is much larger than the system of interest, and has a much higher energy. (Imagine keeping the first system in a large reservoir of warm water, for example)

Because the first system is in thermal contact with the heat bath, it does not have a constant amount of energy, because it can exchange energy with the heat bath. So the energy of the first system becomes probabilistic. There is a certain probability $P(E_i)$ that the first system will be found to have energy $E_i$. The question is, how to compute $P(E_i)$.

A way to reason is this: Let $W(E)$ be the number of states of the heat bath that have energy $E$. When the first system is placed into the heat bath, the total energy $E$ is divided between the small system and the heat bath, with $\mathcal{E}$ going to the small system and $E-\mathcal{E}$ going to the heat bath. Then $W(E - \mathcal{E})$ counts the bath states compatible with that split, i.e., the number of ways to realize it. We assume that the probability of a particular split is proportional to the number of ways to make that split. That means mathematically that:

$P(\mathcal{E}) = K W(E - \mathcal{E})$

We compute the constant $K$ by making the probabilities add up to 1: $\sum_i P(E_i) = 1$. So how do we compute $W(E-\mathcal{E})$?

We use Boltzmann's definition of the entropy of the heat bath: $S = k \log W$, or in exponential form, $W = e^{S/k}$.

Then $W(E-\mathcal{E}) = e^{S(E -\mathcal{E})/k}$

Now, assuming that $\mathcal{E} \ll E$, we can expand $S$ in a Taylor series and keep only the first couple of terms:

$S(E - \mathcal{E}) \approx S(E) - \frac{\partial S}{\partial E} \mathcal{E}$

A definition of temperature is that $\frac{1}{T} \equiv \frac{\partial S}{\partial E}$. So we have:

$S(E - \mathcal{E}) \approx S(E) - \mathcal{E}/T$

Plugging this back into the formula for $P(\mathcal{E})$, we get:

$P(\mathcal{E}) = K e^{(S(E) - \mathcal{E}/T)/k} = K e^{S/k - \beta \mathcal{E}}$ where $\beta = 1/(kT)$.

The factor $e^{S(E)/k}$ is a constant, which can be absorbed into $K$. So we have:

$P(\mathcal{E}) = K e^{-\beta \mathcal{E}}$

The constant $K$ is determined via: $\sum_j e^{-\beta E_j} = \frac{1}{K}$. That's the choice that makes the probabilities add up to 1. If we define that sum to be $Z(\beta)$, then we have:

$P(E_i) = Z^{-1} e^{-\beta E_i}$
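The heuristic can be checked numerically. The sketch below assumes a toy bath with $W_B(E) = E^M$, so that $S/k = M \ln E$ and $\beta = (\partial S/\partial E)/k = M/E$; the bath size, total energy, and level energies are illustrative assumptions, not from the post.

```python
import math

# Toy heat bath: W_B(E) = E**M with M "degrees of freedom" (an assumption
# for illustration; any rapidly growing W_B works the same way).
M = 10**6           # bath much larger than the system
E_total = 1.0e6     # total energy, arbitrary units
beta = M / E_total  # since S/k = M ln E, beta = (dS/dE)/k = M/E

levels = [0.0, 1.0, 2.5]   # hypothetical energy levels of the small system

# Exact probabilities: P(eps) proportional to W_B(E_total - eps).
# Work with logs to avoid overflow in E**M.
log_w = [M * math.log(E_total - eps) for eps in levels]
norm = max(log_w)
w = [math.exp(lw - norm) for lw in log_w]
P_exact = [x / sum(w) for x in w]

# Boltzmann approximation: P(eps) proportional to exp(-beta * eps).
b = [math.exp(-beta * eps) for eps in levels]
P_boltz = [x / sum(b) for x in b]

for p, q in zip(P_exact, P_boltz):
    assert abs(p - q) < 1e-5
print("exact bath counting matches the Boltzmann factor")
```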

Demystifier
In quantum physics of bosons, both Boltzmann and Bose-Einstein distributions are correct, because those two distributions talk about two different things. The Boltzmann factor $e^{-\beta E_k}$ is (proportional to) the probability of the quantum state $|k\rangle$. The Bose-Einstein factor $1/(e^{\beta E({\bf p})}-1)$ is (proportional to) the average number of particles having the momentum ${\bf p}$. Note that $E_k$ and $E({\bf p})$ are different quantities. Their relation is given in my previous post above.

Here's a heuristic argument for why the factor $e^{-\beta E}$ comes into play:

Suppose that you have a system with discrete energy levels $E_i$. It is kept at a constant temperature by keeping it in thermal contact with a "heat bath". A heat bath is a second system that is much larger than the system of interest, and has a much higher energy. (Imagine keeping the first system in a large reservoir of warm water, for example)
[...]
If we define that sum to be $Z(\beta)$, then we have:

$P(E_i) = Z^{-1} e^{-\beta E_i}$
So the exponential term simply emerges from the assumption that a small system exchanges energy with a much larger heat bath, no matter what kind of statistics the small system obeys. Is that right?

stevendaryl
So the exponential term simply emerges from the assumption that a small system exchanges energy with a much larger heat bath, no matter what kind of statistics the small system obeys. Is that right?
Yes, the statistics Fermi versus Bose only comes into play in computing how many different states there are.

If you have one-particle states with energies $E_1, E_2, ...$, and you have $N$ particles (assumed non-interacting), then the possible many-particle states include

• (for Bosons): For each $j$, $N_j = 0, 1, 2, ...$ (where $N_j$ is the number of particles with energy $E_j$)
• (for Fermions): For each $j$, $N_j = 0, 1$
That's what @Demystifier was explaining.

stevendaryl
There is a similar "heat-bath" type understanding of the chemical potential.

Instead of the system only exchanging energy with the environment, you can consider a situation in which it exchanges both energy and particles. In that case, the state of the system is characterized by a vector of numbers $\vec{N}$, where $N_i$ is the number of particles in state $i$.

If the total number of particles (system + environment) is $N$, and the total energy is $E$, then the energy and number of particles available to the environment are: $E_{env} = E -\sum_i E_i N_i$ and $N_{env} = N - \sum_i N_i$. If you let $W = e^{S/k}$ be the number of possible states of the environment consistent with that energy and number of particles, then we can again do a Taylor series to get: $W \approx e^{S(E,N)/k - \frac{1}{k}\frac{\partial S}{\partial E} \sum_i E_i N_i - \frac{1}{k}\frac{\partial S}{\partial N} \sum_i N_i}$. By definition, $\frac{\partial S}{\partial E} = \frac{1}{T}$ and $\frac{\partial S}{\partial N} = -\frac{\mu}{T}$, where $\mu$ is the chemical potential. So we have:

$W = e^{S/k - \sum_i \beta E_i N_i + \sum_i \beta \mu N_i}$

(where $\beta = \frac{1}{kT}$)

We can write this as: $W = e^{S/k} \Pi_i (e^{-\beta (E_i - \mu)})^{N_i}$

The probability of having state $\vec{N}$ is then given by:

$P(\vec{N}) = (Z(\beta, \mu))^{-1} \Pi_i (z_i)^{N_i}$

where $z_i = e^{-\beta (E_i -\mu)}$ and where

$Z(\beta, \mu) = \sum_{N_1, N_2, ...} \Pi_i (z_i)^{N_i}$

which can be rearranged into

$Z(\beta, \mu) = \Pi_i (\sum_{N_i} (z_i)^{N_i})$

$\mu$ accounts for fluctuating numbers of particles in the same way that $\beta = \frac{1}{kT}$ accounts for fluctuating amounts of energy.
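The rearrangement of $Z(\beta,\mu)$ into a product over levels can be verified by brute force. Here is a Python sketch for the fermionic case ($N_i \in \{0,1\}$); the level energies, $\beta$, and $\mu$ are arbitrary illustrative values.

```python
import math
from itertools import product

# Hypothetical single-particle levels and parameters (illustrative values).
E = [0.3, 0.9, 1.4]
beta, mu = 1.2, 0.5
z = [math.exp(-beta * (e - mu)) for e in E]

# Fermions: each occupation N_i is 0 or 1. Brute-force sum over all
# occupation vectors (N_1, N_2, N_3) of prod_i z_i**N_i ...
Z_brute = sum(
    math.prod(zi**Ni for zi, Ni in zip(z, Ns))
    for Ns in product((0, 1), repeat=len(E))
)

# ... equals the rearranged product over levels, prod_i (sum over N_i).
Z_prod = math.prod(1 + zi for zi in z)
assert abs(Z_brute - Z_prod) < 1e-12
print("grand partition factorizes:", round(Z_prod, 6))
```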

Sorry for taking up this thread again after such a long time.
I still have conceptual problems with the canonical distribution (Boltzmann factor).
I'm thinking of a large system, such as an ideal gas, that is isolated from the surroundings.
The number of particles is fixed and the total energy is also constant.
The energy levels could be continuous like in an ideal gas or one could imagine a discrete set of available energies.
If I now pick out one particle as a very small system and treat the rest of the gas (or whatever it is) as a heat bath just like in the posts by @stevendaryl, my impression is that I'm making two implicit assumptions:
1. I can distinguish the particles.
2. The particle I designated as the small system can populate any energy level, even one that is already occupied by a particle belonging to the heat bath.
The second assumption is only a problem for systems with discrete energies, as I see it.

Anyway, it seems to me that the canonical ensemble implicitly assumes distinguishable bosons.
To my way of thinking this is problematic if the canonical distribution is then used to construct a partition function that is used to derive state functions and distributions for indistinguishable particles, both bosonic and fermionic.
It looks to me like the initial assumptions are being ignored.

stevendaryl
Anyway, it seems to me that the canonical ensemble implicitly assumes distinguishable bosons.
To my way of thinking this is problematic if the canonical distribution is then used to construct a partition function that is used to derive state functions and distributions for indistinguishable particles, both bosonic and fermionic.
It looks to me like the initial assumptions are being ignored.
Just for clarification, the discussion that I gave about "heat baths" is just a way to motivate the Boltzmann distribution. It's not intended to be rigorous.

Second, in the discussion, it's not the particles that are distinguished, but entire subsystems. If you have a closed container filled with water, and that container is sitting in a large pool of water, you can distinguish between the two quantities of water, even though you can't distinguish two water molecules.

If you have a system with many indistinguishable particles, then as a first approximation, people often ignore the interaction between the particles. In a certain sense, this is inconsistent, because noninteracting particles will not reach thermal equilibrium (since equilibrium is reached by exchanging energy, which means interacting).

So once again, you have two subsystems, the system of interest and the heat bath. The only thing that is important about the heat bath is that it has a huge number of states associated with each energy (which is necessary for it to have a well-defined temperature).

Just for clarification, the discussion that I gave about "heat baths" is just a way to motivate the Boltzmann distribution. It's not intended to be rigorous.

Second, in the discussion, it's not the particles that are distinguished, but entire subsystems. If you have a closed container filled with water, and that container is sitting in a large pool of water, you can distinguish between the two quantities of water, even though you can't distinguish two water molecules.

If you have a system with many indistinguishable particles, then as a first approximation, people often ignore the interaction between the particles. In a certain sense, this is inconsistent, because noninteracting particles will not reach thermal equilibrium (since equilibrium is reached by exchanging energy, which means interacting).

So once again, you have two subsystems, the system of interest and the heat bath. The only thing that is important about the heat bath is that it has a huge number of states associated with each energy (which is necessary for it to have a well-defined temperature).
I'm cautiously trying to come to grips with these concepts. Thanks for the help.
So the small system should never be viewed as a single atom or molecule.

stevendaryl
I'm cautiously trying to come to grips with these concepts. Thanks for the help.
So the small system should never be viewed as a single atom or molecule.
Well, there are the two different types of ensemble, one of which allows the interchange of energy between the system and the "heat bath", and the other of which allows the interchange of both energy and particles. The second type doesn't make sense for a single particle, but the first does. But the point is that the total state of the entire system can be factored into the state of the "heat bath" and the state of the system of interest. If the single atom is somehow isolated from other similar atoms, but is able to emit and absorb energy, then I think you can treat it thermodynamically.

Demystifier
If I now pick out one particle as a very small system and treat the rest of the gas (or whatever it is) as a heat bath just like in the posts by @stevendaryl, my impression is that I'm making two implicit assumptions:
1. I can distinguish the particles.
2. The particle I designated as the small system can populate any energy level, even one that is already occupied by a particle belonging to the heat bath.
I would suggest to first consider something simpler. Imagine that you have two hydrogen atoms in the ground state, one atom on Earth and one atom on the Moon.
1. Can you distinguish those two atoms? What is the wave function for that system of two atoms?
2. Since both atoms are in the ground state, do they populate the same energy level?

The only thing that is important about the heat bath is that it has a huge number of states associated with each energy (which is necessary for it to have a well-defined temperature).
Are you saying that the heat bath has low occupancy?
If the small system is just a part of the whole large system, then I would expect it to also have low occupancy.
That would mean the Boltzmann distribution is only valid in the low occupancy case.
Then it's still strange that the partition function constructed from Boltzmann factors can be used to derive the FD and BE distribution.

vanhees71
Gold Member
2019 Award
Let's take the grandcanonical statistical ensemble. Information theory tells us to find the state of maximum entropy under the constraints, i.e., (a) normalization, (b) value of the mean total energy and (c) value of the mean total particle number, leading to
$$\hat{\rho}=\frac{1}{Z} \exp(-\beta \hat{H}+\beta \mu \hat{N}).$$
The partition sum is
$$Z=\mathrm{Tr}\, \exp(-\beta \hat{H}+\beta \mu \hat{N}).$$
Now take a finite cubic volume of length $L$ with periodic boundary conditions (to have a properly defined single-particle momentum) and a free gas. The Hamiltonian reads
$$\hat{H}=-\frac{1}{2m} \int \mathrm{d}^3 x\, \hat{\psi}^{\dagger} \Delta \hat{\psi} = \sum_{\vec{p}} \frac{\vec{p}^2}{2m} \hat{a}^{\dagger} (\vec{p}) \hat{a}(\vec{p}),$$
where the $\hat{a}$ are fermionic annihilation operators for momentum eigenstates (for convenience I leave out spin, which can easily be added later). The occupation numbers $\hat{N}(\vec{p})=\hat{a}^{\dagger}(\vec{p}) \hat{a}(\vec{p})$ form a complete set of observables, and their eigenvalues are $N(\vec{p}) \in \{0,1 \}$ for each $\vec{p}$. The partition sum thus is
$$Z=\prod_{\vec{p}} \sum_{N(\vec{p})=0}^{1} \exp[-\beta N(\vec{p}) (\vec{p}^2/(2m)-\mu)] = \prod_{\vec{p}} [1+\exp[-\beta (\vec{p}^2/(2m)-\mu)]].$$
Now you can make $\alpha=\mu \beta$ formally dependent on $\vec{p}$, and the mean number of particles with momentum $\vec{p}$ is then given by
$$\langle N(\vec{p}) \rangle=\frac{\partial \ln Z}{\partial \alpha(\vec{p})} = \frac{\exp[-\beta(\vec{p}^2/(2m)-\mu)]}{1+\exp[-\beta (\vec{p}^2/(2m)-\mu)]}=\frac{1}{\exp[\beta(\vec{p}^2/(2m)-\mu)]+1},$$
which is the Fermi-Dirac distribution, as it should be.

For bosons everything goes through precisely the same. The only difference is that you have to sum for each $N(\vec{p})$ over all $\mathbb{N}_0$. Of course this results in the Bose-Einstein distribution function.
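The derivative step $\langle N \rangle = \partial \ln z / \partial \alpha$ can be checked numerically for a single fermionic mode. In the Python sketch below, $\epsilon = \vec{p}^2/(2m)$, $\beta$, and $\mu$ are arbitrary illustrative numbers.

```python
import math

# Check that d ln z / d alpha reproduces the Fermi-Dirac occupation,
# for one momentum mode with eps = p^2/(2m). Illustrative numbers.
eps, beta, mu = 0.7, 2.0, 0.2

def ln_z(alpha):
    # Single-mode fermionic partition sum: ln(1 + exp(alpha - beta*eps))
    return math.log(1 + math.exp(alpha - beta * eps))

alpha = beta * mu
h = 1e-6
n_numeric = (ln_z(alpha + h) - ln_z(alpha - h)) / (2 * h)  # central difference
n_fd = 1 / (math.exp(beta * (eps - mu)) + 1)               # Fermi-Dirac form
assert abs(n_numeric - n_fd) < 1e-8
print("d ln z / d alpha equals the Fermi-Dirac occupation:", round(n_fd, 6))
```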

stevendaryl
Are you saying that the heat bath has low occupancy?
If the small system is just a part of the whole large system, then I would expect it to also have low occupancy.
There is no need to assume anything about the small system along those lines.

That would mean the Boltzmann distribution is only valid in the low occupancy case.
The Boltzmann distribution describes the environment (heat bath), not the small system.

Then it's still strange that the partition function constructed from Boltzmann factors can be used to derive the FD and BE distribution.
The reasoning doesn't distinguish between bosons, fermions, or distinguishable particles. Those distinctions go into computing $W(N,E)$, the number of states with a given energy and number of particles. For distinguishable particles, swapping two particles results in a different state, while for bosons and fermions, it doesn't.

The following argument is semiclassical, in that it uses discrete energy levels and indistinguishability, but does not use operators as in @vanhees71's posts.

What you do need is for the number of states for two subsystems to be multiplicative. Assume that you have one large system, the heat bath, with $W_B(N,E)$ giving the number of states for a given number of particles and total energy.

You have another system, the system of interest, with its own function $W_1(N_1, \varepsilon)$. Then the assumption is that the number of states $W(N,E)$ for the combined system is related to these two in the following way:

$W(N,E) = \sum_\varepsilon \sum_{N_1} W_1(N_1,\varepsilon) W_B(N-N_1, E - \varepsilon)$

Now, you assume that $W_B$ is huge compared with $W_1$, so most of the particles and most of the energy will be with the heat bath, rather than the system of interest. Under this assumption, we can approximate: $W_B(N-N_1, E - \varepsilon) \approx W_B(N,E) e^{-\beta (\varepsilon - \mu N_1)}$, where $\beta = \frac{\partial S_B}{\partial E}/k$ and $\beta \mu = \frac{\partial S_B}{\partial N}/k$.

Then we have:

$W(N,E) = W_B(N,E) \sum_\varepsilon \sum_{N_1} e^{- \beta(\varepsilon - \mu N_1)} W_1(N_1,\varepsilon)$

At this point, no assumption whatsoever is being made about the nature of the small system, other than the fact that it is much smaller than the heat bath. To go further and see where the statistics come into play, let's assume that the small system contains a number of identical particles that will be approximated as non-interacting. Then the many-particle states can be obtained from the single-particle states. Assume that a single particle has a discrete set of states, with energies $e_1, e_2, ...$. Then the multiparticle state is specified by simply giving $n_1, n_2, ...$ where $n_j$ is the number of particles in state $j$. The indistinguishability comes into play by the fact that we don't consider two states to be different if you swap two particles, because that leaves the numbers $n_j$ unchanged. For Fermions, each $n_j$ can have value 0 or 1. For Bosons, $n_j$ can be any nonnegative integer.

Given these assumptions, we can compute $W_1(N_1, \varepsilon)$ as follows:

$W_1(N_1, \varepsilon) = \sum_{n_1, n_2, n_3, ...} \Delta(\varepsilon - n_1 e_1 - n_2 e_2 - ...) \Delta(N_1 - n_1 - n_2 - ...)$

where $\Delta(X)$ is a function that is equal to 1 if $X=0$ and equal to 0 otherwise. (That's actually the Kronecker delta, but I don't want to get the notation mixed up with the Dirac delta function.) In other words, $W_1(N_1, \varepsilon)$ is just counting the number of states with the correct number of particles and total energy. In terms of this expression, we can now rewrite our general expression above:

$W(N,E) = W_B(N,E) \sum_\varepsilon \sum_{N_1} \sum_{n_1, n_2, ...} e^{- \beta(\varepsilon - \mu N_1)} \Delta(\varepsilon - n_1 e_1 - n_2 e_2 - ...) \Delta(N_1 - n_1 - n_2 - ...)$

Because of the presence of the $\Delta$s, the quantity being summed is zero unless $\varepsilon = n_1 e_1 + n_2 e_2 + ...$ and $N_1 = n_1 + n_2 + ...$. So we can replace $\varepsilon$ and $N_1$ in the exponential by those sums without changing the value of the whole expression:

$W(N,E) = W_B(N,E) \sum_\varepsilon \sum_{N_1} \sum_{n_1, n_2, ...} e^{- \beta(n_1 e_1 + n_2 e_2 + ... - \mu n_1 - \mu n_2 - ...)} \Delta(\varepsilon - n_1 e_1 - n_2 e_2 - ...) \Delta(N_1 - n_1 - n_2 - ...)$

Now, since the exponential no longer involves $\varepsilon$ or $N_1$, we can reorder the sums (assuming everything is absolutely convergent, anyway):

$W(N,E) = W_B(N,E) \sum_{n_1, n_2, ...} e^{- \beta(n_1 e_1 + n_2 e_2 + ... - \mu n_1 - \mu n_2 - ...)} \sum_\varepsilon \sum_{N_1} \Delta(\varepsilon - n_1 e_1 - n_2 e_2 - ...) \Delta(N_1 - n_1 - n_2 - ...)$

The innermost sum is just $\sum_\varepsilon \sum_{N_1} \Delta(\varepsilon - n_1 e_1 - n_2 e_2 - ...) \Delta(N_1 - n_1 - n_2 - ...)$. The only nonzero term is the one where $N_1 = n_1 + n_2 + ...$ and $\varepsilon = n_1 e_1 + n_2 e_2 + ...$; there is exactly one such pair of values, so the sum just yields 1. So we get a huge simplification:

$W(N,E) = W_B(N,E) \sum_{n_1, n_2, ...} e^{- \beta(n_1 e_1 + n_2 e_2 + ... - \mu n_1 - \mu n_2 - ...)}$

We can write this in a more tidy way:

$\sum_{n_1, n_2, ...} e^{-\beta(n_1 e_1 + n_2 e_2 + ... - \mu n_1 - \mu n_2 - ...)} = \Pi_j \ [ \sum_{n} e^{-\beta n (e_j - \mu)}]$

So we just get: $W(N,E) = W_B(N,E) \Pi_j Z_j(\beta,\mu)$

where $Z_j(\beta, \mu) = \sum_n e^{-\beta n (e_j - \mu)}$. As mentioned earlier, for Fermions, $n = 0$ or $n=1$. So $Z_j(\beta,\mu) = 1 + e^{-\beta(e_j - \mu)}$. For Bosons, $n= 0, 1,2,...$ so the sum gives $Z_j(\beta,\mu) = 1 + e^{-\beta(e_j - \mu)} + e^{-2\beta (e_j - \mu)} + ... = \frac{1}{1-e^{-\beta(e_j - \mu)}}$

Taking the natural log of both sides gives:

$S(N, E) = S_B(N,E) + \sum_j log(Z_j(\beta, \mu))$

This shows an amazing, or maybe not so amazing, result, which is that if the particles are non-interacting, then each single-particle state $j$ makes its own independent contribution to the total entropy, $log(Z_j(\beta, \mu))$.
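The whole chain above (counting states with Kronecker deltas, then collapsing the sum to a product over single-particle levels) can be verified by brute force for a small fermionic example. In the Python sketch below, the level energies, $\beta$, and $\mu$ are made-up illustrative values.

```python
import math
from itertools import product

# Mirror the counting argument: enumerate fermionic occupation vectors,
# bin them into W_1(N_1, eps), then check that
#   sum_{eps, N_1} e^{-beta(eps - mu N_1)} W_1(N_1, eps)
# equals the product form prod_j Z_j. Illustrative parameters.
e = [0.2, 0.8, 1.1, 1.9]     # single-particle energies
beta, mu = 1.5, 0.4

# W_1: number of many-particle states with given (particle number, energy).
W1 = {}
for ns in product((0, 1), repeat=len(e)):   # fermions: n_j in {0, 1}
    key = (sum(ns), round(sum(n * ej for n, ej in zip(ns, e)), 10))
    W1[key] = W1.get(key, 0) + 1

lhs = sum(
    math.exp(-beta * (eps - mu * N1)) * w
    for (N1, eps), w in W1.items()
)
rhs = math.prod(1 + math.exp(-beta * (ej - mu)) for ej in e)
assert abs(lhs - rhs) < 1e-9
print("delta-function counting reproduces the product over levels")
```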

So once again, you have two subsystems, the system of interest and the heat bath. The only thing that is important about the heat bath is that it has a huge number of states associated with each energy (which is necessary for it to have a well-defined temperature).
I really liked the derivation in your latest post. (I also like @vanhees71 posts but I need some more background in QM to follow them completely, I'm afraid.)
There's one thing that still doesn't quite click. I believe you say that the heat bath has low occupancy in the quote above. Somewhere else you also say that the Boltzmann distribution applies to the heat bath (it's proportional to the probability distribution for the heat bath). At some point, however, we end up with a probability distribution for the small system, which doesn't necessarily have low occupancy. Where does the derivation switch from talking about the heat bath to talking about the system?

vanhees71
The Boltzmann distribution applies to bosons as well as to fermions, if the "typical states" are populated by less than 1 particle on average, i.e., if
$$\frac{1}{\exp[\beta(E-\mu)] \pm 1} \simeq \exp[-\beta(E-\mu)] \quad \text{i.e., if} \quad \exp[\beta(E-\mu)] \gg 1, \quad E-\mu \gg k_{\text{B}} T.$$
This is plausible, because if there is less than one particle per phase-space cell of size $(2 \pi \hbar)^3$, then the indistinguishability of quantum physics can be neglected, i.e., Pauli blocking and Bose enhancement are unimportant.
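Numerically, the agreement in the low-occupancy limit is easy to see. The Python sketch below uses the arbitrary illustrative choice $E-\mu = 10\,k_{\text{B}}T$; the FD and BE occupations then match the Boltzmann form to a relative accuracy of order $e^{-\beta(E-\mu)}$.

```python
import math

# When exp(beta*(E-mu)) >> 1, the FD and BE occupations both reduce to the
# Boltzmann form exp(-beta*(E-mu)). Illustrative numbers: E - mu = 10 kT.
beta, E, mu = 1.0, 10.0, 0.0
x = math.exp(-beta * (E - mu))           # Boltzmann occupancy
n_fd = 1 / (math.exp(beta * (E - mu)) + 1)
n_be = 1 / (math.exp(beta * (E - mu)) - 1)

# Relative deviation from the Boltzmann form is itself of order x.
assert abs(n_fd - x) / x < 2 * x
assert abs(n_be - x) / x < 2 * x
print("low occupancy: FD and BE occupations approach the Boltzmann form")
```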

stevendaryl
I really liked the derivation in your latest post. (I also like @vanhees71 posts but I need some more background in QM to follow them completely, I'm afraid.)
There's one thing that still doesn't quite click. I believe you say that the heat bath has low occupancy in the quote above. Somewhere else you also say that the Boltzmann distribution applies to the heat bath (it's proportional to the probability distribution for the heat bath). At some point, however, we end up with a probability distribution for the small system, which doesn't necessarily have low occupancy. Where does the derivation switch from talking about the heat bath to talking about the system?
So, again, the number of states for the composite system is:

$\sum_{\varepsilon, N_1} W_B(E-\varepsilon, N-N_1) W(\varepsilon, N_1)$

where $W_B$ refers to the heat bath, and $W$ is the small system. So out of the total energy $E$, some small amount $\varepsilon$ goes to the small system, and the rest stays with the heat bath. Out of the total number of particles $N$, some small number $N_1$ goes to the small system. To get all possibilities, you sum over all possible values of $\varepsilon$ and $N_1$. (Strictly speaking, you should just consider $N_1 < N$ and $\varepsilon < E$, but that doesn't make much difference.)

My point about the heat bath is that the factor $e^{-\beta(\varepsilon - \mu N_1)}$ comes from approximating $W_B$, not from any properties of the small system. Defining $S_B = k \log W_B$, we have $S_B(E - \varepsilon, N - N_1) \approx S_B(E, N) - \varepsilon \frac{\partial S_B}{\partial E} - N_1 \frac{\partial S_B}{\partial N}$. By definition, $\frac{\partial S_B}{\partial E} = \frac{1}{T}$. That's the thermodynamic definition of temperature. By definition, $\frac{\partial S_B}{\partial N} = - \frac{\mu}{T}$. That's the definition of the chemical potential $\mu$ (I'm not sure why it wasn't defined to be just $\frac{\partial S}{\partial N}$, but presumably it's more convenient to have the $T$ and minus sign in there). $\beta = \frac{1}{kT}$ is more convenient to use than $T$, so we can write $S_B(E-\varepsilon, N- N_1) \approx S_B(E,N) - k \beta \varepsilon + k N_1 \beta \mu$

So given the temperature and chemical potential of the heat bath, we can approximate $W_B(E-\varepsilon, N-N_1)$ as:

$W_B(E-\varepsilon, N-N_1) = e^{S_B(E-\varepsilon, N-N_1)/k} \approx e^{S_B(E,N)/k - \beta (\varepsilon - \mu N_1)}$

This is a fact solely about the heat bath. The properties of the small system are irrelevant. So we have the approximation (where $W_C$ means composite number of states)

$W_C(E,N) \approx e^{S_B(E,N)/k} \sum_{\varepsilon, N_1} e^{-\beta (\varepsilon - \mu N_1)} W(\varepsilon, N_1)$

So in this expression, $W(\varepsilon, N_1)$ is the number of states of the smaller system as a function of energy and number of particles. We haven't made any assumption at all (yet) about this function. We haven't said that it's bosons or fermions or low or high occupancy. We haven't made any assumptions at all. Except that we're assuming that typically $W \ll W_B$, that the heat bath has lots more states, compared with the small system.

So the factor $e^{-\beta (\varepsilon - \mu N_1)}$ does not come from any properties of the small system. It comes from approximating $W_B$. So $\beta$ and $\mu$ are properties of the large system.

So the factor $e^{-\beta (\varepsilon - \mu N_1)}$ does not come from any properties of the small system. It comes from approximating $W_B$. So $\beta$ and $\mu$ are properties of the large system.
Does that mean that temperature and chemical potential are only defined for a large system with low occupancy?
So for a system that obeys FD or BE statistics the chemical potential and temperature in the distribution function have nothing to do with the system. They only refer to the surrounding heat bath.

vanhees71
Of course, temperature and chemical potential have the same meaning in classical and quantum statistics. They come into the game as Lagrange multipliers to impose the constraints of given average total center-momentum energy and particle number (or more generally conserved charges) in the maximum-entropy principle.

stevendaryl
Does that mean that temperature and chemical potential are only defined for a large system with low occupancy?
Yes, I would say so. You can't talk about the temperature of a single atom. But if that atom is interacting with a "bath" of photons, the bath can have a temperature, and that temperature affects the probability that the atom will be in this state versus that state.

So for a system that obeys FD or BE statistics the chemical potential and temperature in the distribution function have nothing to do with the system. They only refer to the surrounding heat bath.
Yes. Although if you have two (or more) large systems that are interacting, at equilibrium, their temperatures and chemical potentials will be equal.

stevendaryl
Of course, temperature and chemical potential have the same meaning in classical and quantum statistics. They come into the game as Lagrange multipliers to impose the constraints of given average total center-momentum energy and particle number (or more generally conserved charges) in the maximum-entropy principle.
It's interesting. You get the same answer (the Boltzmann distribution) in two ways:
1. You maximize the entropy of the small system subject to the constraint that the average energy and average number of particles are fixed values $\langle \varepsilon \rangle$ and $\langle N_1 \rangle$. Then the temperature and chemical potential arise as Lagrange multipliers.
2. You imagine the small system to be sharing energy and particles with a much larger environment. Then the temperature and chemical potential arise as thermodynamic properties of the environment.
You get the same probability distribution in the two cases, but conceptually, they seem very different.
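Route 1 can be checked directly for a small example. In the Python sketch below, with three made-up energy levels, the Boltzmann distribution has strictly higher entropy than any nearby distribution with the same normalization and mean energy; the perturbation direction $v$ is chosen (by hand, for this illustration) to preserve both constraints.

```python
import math

# Levels and inverse temperature (illustrative); Boltzmann distribution.
E = [0.0, 1.0, 2.0]
beta = 1.0
w = [math.exp(-beta * e) for e in E]
Z = sum(w)
p = [x / Z for x in w]

def entropy(q):
    # Shannon/Gibbs entropy (in units of k).
    return -sum(x * math.log(x) for x in q if x > 0)

# Direction v satisfies sum(v) = 0 and sum(v_i * E_i) = 0, so p + t*v has
# the same normalization and mean energy as p for any small t.
v = [-1.0, 2.0, -1.0]
S0 = entropy(p)
for t in (-0.05, -0.01, 0.01, 0.05):
    q = [pi + t * vi for pi, vi in zip(p, v)]
    assert all(x > 0 for x in q)
    assert entropy(q) < S0   # any feasible perturbation lowers the entropy
print("Boltzmann distribution maximizes entropy at fixed mean energy")
```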

It's interesting. You get the same answer (the Boltzmann distribution) in two ways:
1. You maximize the entropy of the small system subject to the constraint that the average energy and average number of particles are fixed values $\langle \varepsilon \rangle$ and $\langle N_1 \rangle$. Then the temperature and chemical potential arise as Lagrange multipliers.
2. You imagine the small system to be sharing energy and particles with a much larger environment. Then the temperature and chemical potential arise as thermodynamic properties of the environment.
You get the same probability distribution in the two cases, but conceptually, they seem very different.
Here are derivations (appended as a pdf, I hope) I mentioned to @stevendaryl in a message. I don't know how to append to a message so I'm sending the file here in the thread. The approach is quite different from the canonical one and the chemical potential makes the particle number constant, not variable.

#### Attachments

• PDF, 646.4 KB
vanhees71
It's interesting. You get the same answer (the Boltzmann distribution) in two ways:
1. You maximize the entropy of the small system subject to the constraint that the average energy and average number of particles are fixed values $\langle \varepsilon \rangle$ and $\langle N_1 \rangle$. Then the temperature and chemical potential arise as Lagrange multipliers.
2. You imagine the small system to be sharing energy and particles with a much larger environment. Then the temperature and chemical potential arise as thermodynamic properties of the environment.
You get the same probability distribution in the two cases, but conceptually, they seem very different.
In fact they are not different, because both lead to the maximum-entropy principle. The only difference between the classical distribution and the quantum distributions (either bosons or fermions in $\geq 3$ space dimensions, or possibly anyons in 2 space dimensions) comes from the counting of microstates and thus the particular expression for the entropy. Of course, nobody has a fully classical answer, since the classical treatment has to be repaired by hand by (a) introducing an arbitrary unit for phase-space volumes, which in QT turns out to be $h^{3N}=(2 \pi \hbar)^{3N}$ for $N$ particles (modulo degeneracy factors from internal degrees of freedom like spin, flavor, color, etc.), and (b) invoking the indistinguishability of particles to repair the Gibbs paradox.

For a treatment, not using QFT (which is of course the most straight-forward way, given you have already learnt QFT in the first place), but simple stochastics arguments, see Landau&Lifshitz vol. V or my transport lectures:

https://th.physik.uni-frankfurt.de/~hees/publ/kolkata.pdf

A marvelous undergraduate-level text, which uses your arguments and also makes the connection to entropy (in the traditional, non-information-theoretical approach), is the Statistical Physics volume of the Berkeley Physics Course, written by F. Reif.

An excellent, more modern text, using information theory (imho the only approach that leads to a real understanding of the meaning of entropy, as shown by recent experimental realizations in nano-physics, where among other things the old resolutions of the Maxwell demon paradox by Szilard and Landauer have been empirically verified on the quantum level), is

J. Rau, Statistical Physics and Thermodynamics, Oxford University Press (2017)
