Boltzmann with degenerate levels

jostpuur · Feb 1, 2017

Suppose we have some model for some system, and that model has given us a sequence [itex]\mathcal{E}_1,\mathcal{E}_2,\mathcal{E}_3,\ldots[/itex], whose values are interpreted as the energy levels of the system. Writing the energy levels in a slightly redundant way, to prepare for a modification below, we state that the energy levels are

[itex] E_1=\mathcal{E}_1[/itex]
[itex] E_2=\mathcal{E}_2[/itex]
[itex] E_3=\mathcal{E}_3[/itex]
[itex]\vdots[/itex]

The probabilities defined by the Boltzmann distribution under a temperature [itex]T[/itex] will be

[itex] p(1) = \frac{1}{Z(T)} e^{-\frac{\mathcal{E}_1}{T}}[/itex]
[itex] p(2) = \frac{1}{Z(T)} e^{-\frac{\mathcal{E}_2}{T}}[/itex]
[itex] p(3) = \frac{1}{Z(T)} e^{-\frac{\mathcal{E}_3}{T}}[/itex]
[itex] \vdots[/itex]

where the partition function is
[itex] Z(T) = e^{-\frac{\mathcal{E}_1}{T}} + e^{-\frac{\mathcal{E}_2}{T}} + e^{-\frac{\mathcal{E}_3}{T}}+ \cdots[/itex]

Suppose we find out that the model was only an approximation of a more accurate model, and according to the new more accurate model the energy values are going to be [itex]\mathcal{E}_n[/itex] and [itex]\mathcal{E}_{2n}+\epsilon[/itex] with some small positive epsilon. Now the energy levels are

[itex] E_1 = \mathcal{E}_1[/itex]
[itex] E_2 = \mathcal{E}_2[/itex]
[itex] E_3 = \mathcal{E}_2+ \epsilon[/itex]
[itex] E_4 = \mathcal{E}_3[/itex]
[itex] E_5 = \mathcal{E}_4[/itex]
[itex] E_6 = \mathcal{E}_4 + \epsilon[/itex]
[itex] E_7 = \mathcal{E}_5[/itex]
[itex] \vdots[/itex]

Now the probabilities defined by

[itex] p(n) = \frac{1}{Z(T)}e^{-\frac{E_n}{T}}[/itex]

turn out to be

[itex] p(1) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_1}{T}}[/itex]
[itex] p(2) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_2}{T}}[/itex]
[itex] p(3) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_2+\epsilon}{T}}[/itex]
[itex] p(4) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_3}{T}}[/itex]
[itex] p(5) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_4}{T}}[/itex]
[itex] p(6) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_4+\epsilon}{T}}[/itex]
[itex] p(7) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_5}{T}}[/itex]
[itex] \vdots[/itex]

where the partition function is

[itex] Z(T) = e^{-\frac{\mathcal{E}_1}{T}} + e^{-\frac{\mathcal{E}_2}{T}} + e^{-\frac{\mathcal{E}_2 + \epsilon}{T}} + e^{-\frac{\mathcal{E}_3}{T}} + e^{-\frac{\mathcal{E}_4}{T}} + e^{-\frac{\mathcal{E}_4 + \epsilon}{T}} + e^{-\frac{\mathcal{E}_5}{T}} + \cdots[/itex]

Suppose we decide that the epsilon is so small that it has not much significance, and we might as well simplify the formulas by taking the limit [itex]\epsilon\to 0[/itex]. This limit is going to give us a new probability distribution

[itex] p(1) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_1}{T}}[/itex]
[itex] p(2) = \frac{2}{Z(T)}e^{-\frac{\mathcal{E}_2}{T}}[/itex]
[itex] p(3) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_3}{T}}[/itex]
[itex] p(4) = \frac{2}{Z(T)}e^{-\frac{\mathcal{E}_4}{T}}[/itex]
[itex] p(5) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_5}{T}}[/itex]
[itex] \vdots[/itex]

where the partition function is

[itex] Z(T) = e^{-\frac{\mathcal{E}_1}{T}} + 2e^{-\frac{\mathcal{E}_2}{T}} + e^{-\frac{\mathcal{E}_3}{T}} + 2e^{-\frac{\mathcal{E}_4}{T}} + e^{-\frac{\mathcal{E}_5}{T}} + \cdots[/itex]

Now we have two different probability distributions for the case [itex]\epsilon = 0[/itex]. Is one of them right, and the other one wrong? Which way around would be the right answer?

stevendaryl · Feb 2, 2017

You have to be careful here. The partition function is defined to be:

[itex]Z = \sum_i e^{\frac{-E_i}{kT}}[/itex]

where [itex]i[/itex] ranges over all states. It's not a sum over energy eigenvalues, it's a sum over states. Each state makes a contribution, not just states with distinguishable energy levels.

If you want to sum over energy values instead, then you have to include a degeneracy factor: [itex]Z = \sum_i g_i e^{\frac{-E_i}{kT}}[/itex], where now the sum is over energy levels, and [itex]g_i[/itex] is the number of states with energy level [itex]E_i[/itex].
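
A small numerical sketch may make the equivalence of the two sums concrete. The energies, the temperature, and the function names below are illustrative assumptions (units with k = 1), not values from the thread. It compares a sum over four states, two of which are split by a tiny epsilon, with a sum over three levels where the middle one has degeneracy 2.

```python
import math

def boltzmann_over_states(state_energies, T):
    """p_i = exp(-E_i / T) / Z, with Z summed over individual states."""
    weights = [math.exp(-E / T) for E in state_energies]
    Z = sum(weights)
    return [w / Z for w in weights]

def boltzmann_over_levels(levels, degeneracies, T):
    """p_n = g_n exp(-E_n / T) / Z, with Z summed over distinct energy levels."""
    weights = [g * math.exp(-E / T) for E, g in zip(levels, degeneracies)]
    Z = sum(weights)
    return [w / Z for w in weights]

T = 1.0       # illustrative temperature, units with k = 1
eps = 1e-9    # illustrative small splitting

# Four states: the second level is split into E_2 and E_2 + eps.
p_states = boltzmann_over_states([0.0, 1.0, 1.0 + eps, 2.0], T)

# Three levels, the second one doubly degenerate (the eps -> 0 picture).
p_levels = boltzmann_over_levels([0.0, 1.0, 2.0], [1, 2, 1], T)

# The two nearly degenerate states together carry the weight of the g = 2 level.
p_pair = p_states[1] + p_states[2]
print(p_pair, p_levels[1])
```

As epsilon shrinks, the combined probability of the two split states approaches the probability of the doubly degenerate level, which is the content of the degeneracy factor.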

jostpuur · Feb 2, 2017

I understand the claim, but I don't believe it. Why would the probability distribution have such uniform background measure over all states?

For example, suppose you have a lot of holes on some special table, and suppose little balls are being thrown at that table so that the balls eventually fall through the small holes. The events where the balls hit the holes are going to be random events. If the holes are not uniformly distributed, and if some of the holes are extremely close to each other, they are going to be competing for the same random events, hence reducing their individual chances of getting a hit.

Perhaps the probability distribution I wrote down above for the case [itex]\epsilon > 0[/itex] was wrong, because perhaps the probabilities should have been
[itex] p(1) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_1}{T}}[/itex]
[itex] p(2) = \frac{1}{2Z(T)}e^{-\frac{\mathcal{E}_2}{T}}[/itex]
[itex] p(3) = \frac{1}{2Z(T)}e^{-\frac{\mathcal{E}_2 + \epsilon}{T}}[/itex]
[itex] p(4) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_3}{T}}[/itex]
[itex] p(5) = \frac{1}{2Z(T)}e^{-\frac{\mathcal{E}_4}{T}}[/itex]
[itex] p(6) = \frac{1}{2Z(T)}e^{-\frac{\mathcal{E}_4 + \epsilon}{T}}[/itex]
[itex] p(7) = \frac{1}{Z(T)}e^{-\frac{\mathcal{E}_5}{T}}[/itex]
[itex] \vdots[/itex]
where the partition function would be
[itex] Z(T) = e^{-\frac{\mathcal{E}_1}{T}} + \frac{1}{2}e^{-\frac{\mathcal{E}_2}{T}} + \frac{1}{2}e^{-\frac{\mathcal{E}_2+\epsilon}{T}} + e^{-\frac{\mathcal{E}_3}{T}} + \frac{1}{2}e^{-\frac{\mathcal{E}_4}{T}} + \frac{1}{2}e^{-\frac{\mathcal{E}_4+\epsilon}{T}} + e^{-\frac{\mathcal{E}_5}{T}} + \cdots[/itex]

Perhaps there should have been factors [itex]\frac{1}{2}[/itex] like this because some of the states are so similar that they are competing for the same random events? Why not like this?

Isn't this what being careful looks like, by the way?

stevendaryl · Feb 2, 2017

Well, you can take it as the definition of a system being at equilibrium that if you fix all the macroscopic conserved quantities--total energy, total momentum, total angular momentum, total charge, total number of particles of each type, etc.--then every microscopic state consistent with those macroscopic properties is equally likely. I don't know if there is a justification for that assumption, other than the principle of indifference: if you don't have any other way of distinguishing states, then they have to be equally likely.

Your example is not relevant to this assumption, because there is no notion of "competition" for states. When you compute the partition function it's for the entire system, not for a single particle. So for your example, specifying the state means specifying the location and momentum of each ball. The constraint that two balls can't occupy the same location could either be imposed by simply declaring some states as not allowed, or it could be done in a "soft" way by putting in a short-range repulsive force between balls, so that the energy of the system as a whole shoots up if two balls get too close together.

I'll have to think about your example some more to see if I can model it using statistical mechanics.

rubi · Feb 2, 2017

The (unnormalized) density matrix of the canonical ensemble is given by ##\rho=e^{-\beta\hat H}##, where ##\hat H## is the Hamiltonian operator. The partition function is given by ##Z=\mathrm{Tr}(\rho)##. Assume that ##\hat H## has discrete, but possibly degenerate spectrum. Then it can be diagonalized as ##\hat H=\sum_n\sum_{k=1}^{g(n)} E_n\left|\Psi_{nk}\right>\left<\Psi_{nk}\right|##, where ##\hat H\left|\Psi_{nk}\right>=E_n\left|\Psi_{nk}\right>##, ##k## labels the degeneracy of ##E_n## (i.e. ##1\leq k \leq g(n)=\mathrm{dim}(\mathrm{Eig}(\hat H,E_n))##) and the ##\left|\Psi_{nk}\right>## can be chosen to be orthonormal. We then find ##e^{-\beta\hat H}=\sum_n \sum_{k=1}^{g(n)} e^{-\beta E_n} \left|\Psi_{nk}\right>\left<\Psi_{nk}\right|## (by the spectral theorem) and ##Z=\sum_n\sum_{k=1}^{g(n)} e^{-\beta E_n}##. The term ##e^{-\beta E_n}## appears ##g(n)## times in this sum, i.e. ##Z=\sum_n g(n) e^{-\beta E_n}##, so each term ##e^{-\beta E_n}## must be weighted by its degeneracy.
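
The trace argument can be checked on a toy example. The sketch below is only an illustration under stated assumptions: the 3x3 matrix, the parameter values, and the helper names `mat_mul` and `mat_exp` are all made up here. The Hamiltonian is E0 times the identity plus c times the all-ones matrix, which is not diagonal in the chosen basis and has eigenvalue E0 twice and E0 + 3c once.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=40):
    """Matrix exponential by truncated Taylor series (fine for small matrices of small norm)."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes A^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

beta, E0, c = 1.0, 1.0, 0.5   # illustrative values

# H = E0 * identity + c * (all-ones matrix); spectrum: E0 (g = 2), E0 + 3c (g = 1).
H = [[(E0 if i == j else 0.0) + c for j in range(3)] for i in range(3)]

rho = mat_exp([[-beta * x for x in row] for row in H])
Z_trace = sum(rho[i][i] for i in range(3))

# Sum over energy levels with explicit degeneracy factors.
Z_levels = 2 * math.exp(-beta * E0) + 1 * math.exp(-beta * (E0 + 3 * c))

print(Z_trace, Z_levels)
```

The trace never asks which basis was used or how the eigenvalues are grouped, so the factor of 2 for the degenerate level appears without being put in by hand.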

DrClaude · Feb 2, 2017

jostpuur said:

I understand the claim, but I don't believe it. Why would the probability distribution have such uniform background measure over all states?

It is a fundamental assumption of equilibrium statistical physics: all accessible microstates are equally probable.

jostpuur · Feb 2, 2017

DrClaude said:

It is a fundamental assumption of equilibrium statistical physics: all accessible microstates are equally probable.

In classical models (not quantum) that "fundamental assumption" is contradictory and leads to paradoxes, because the microstates often form some kind of continuum, and the only way to apply the Boltzmann distribution is to first discretize the model somehow. However, there are always multiple ways of discretizing the model, and the different discretizations can lead to different probability distributions for the original continuous model. For this reason I think it is obvious that in general the probability distribution has to be allowed to be proportional to some function [itex]f(n)e^{-\frac{E_n}{T}}[/itex], where [itex]f(n)[/itex] is some "background measure". To me it seems that people have not understood the need for this background measure, because in many examples it is something very uniform and often only a constant. Anyway, if you insist that the probability distribution will be proportional to precisely [itex]e^{-\frac{E_n}{T}}[/itex] and nothing else, it will lead to paradoxes.

mfb · Feb 2, 2017

In classical mechanics the definition of states can look a bit arbitrary, and you won't get the correct results, e.g. for blackbody radiation. That should not be surprising - blackbody radiation was one of the key observations that led to the discovery that we do not live in a classical world. With knowledge about quantum mechanics - the more fundamental theory - we also got a deeper motivation for the states in classical physics.

jostpuur · Feb 2, 2017

Can the (often axiomatic) assumption that all accessible (under some energy constraint) microstates are equally probable be derived from the Schrödinger equation (as some accurate approximation)?

Wouldn't the derivation need some model of the form [itex]H=H_0+\epsilon I[/itex], where the eigenstates (eigenvectors) of [itex]H_0[/itex] would be considered as the microstates (of statistical physics), and the term [itex]I[/itex] would be something that somehow mixes the wave function in a statistical way over macroscopic time intervals?

mfb · Feb 3, 2017

Objects don't have to be in thermal equilibrium. You cannot force them to be - the Schrödinger equation doesn't tell you if something is in thermal equilibrium.

jostpuur · Feb 3, 2017

If you assume that the true time evolution comes from Schrödinger's equation and also assume something else, something reasonable that would be related to statistics, the assumptions together might imply Boltzmann's distribution as an accurate approximation. A proper derivation of Boltzmann's distribution should look like something along those lines, since it is Schrödinger's equation that ultimately produces the true time evolution. I already knew that Schrödinger's equation alone is not going to imply Boltzmann's distribution.

When the Boltzmann's distribution is derived without any use of Schrödinger's equation, some things in the derivation have to be working by accident and good luck.

Related to this topic, I would like to remind you that the assumption that all microstates (accessible under some energy constraint) are equally probable, when the microstates have been identified with the energy eigenstates, contains an implicit assumption which severely contradicts quantum mechanics, because according to quantum mechanics the state of a system does not need to be any energy eigenstate. The states can be linear combinations of energy eigenstates. For this reason it is obvious that the use of microstates is supposed to be some kind of approximate model that merely has similar statistical behavior as some more accurate quantum model with proper wave functions. A proper derivation of Boltzmann's distribution should take into account the nature of this approximation.

stevendaryl · Feb 3, 2017

jostpuur said:

In classical models (not quantum) that "fundamental assumption" is contradictory and leads to paradoxes, because the microstates often form some kind of continuum, and the only way to apply the Boltzmann distribution is to first discretize the model somehow. However, there are always multiple ways of discretizing the model, and the different discretizations can lead to different probability distributions for the original continuous model.

Yes, you're right. The Boltzmann prescription, that at equilibrium, all states with the same energy are equally likely, is strictly speaking only applicable to a system with a finite number of states. To do classical statistical mechanics, people divide phase space into little cells, and use the volume of the cells as the measure of likelihood. The volume in phase space is a particular way of giving a particularly simple "background measure". You could certainly use other measures, and I don't know whether that has been explored, or not.

I'm assuming that when you talk about paradoxes, you're just saying that you can get different results depending on how you divide the continuum into "states" and take the limit? I don't think that there is anything paradoxical about the usual approach of using phase space volume. At least not for nonrelativistic physics. Famously, Planck's attempt to apply statistical mechanics to electromagnetic radiation led to infinities when he tried to take the continuum limit, but he got reasonable results by using a discrete number of states (leading to QM). Maybe there is some sense in which Boltzmann's rule is nonsensical in the continuum limit, which is a hint that the world isn't classical.

Anyway, getting back to the original post, I think it has been answered. Whether you use one "background measure" or another, it will not be the case that [itex]Z[/itex] is computed by summing over energy levels; you have to include a measure [itex]g(E)[/itex] giving the degeneracy (in the case of discrete states) or the measure (in the case of a continuum of states).

stevendaryl · Feb 3, 2017

jostpuur said:

Related to this topic, I would like to remind you that the assumption that all microstates (accessible under some energy constraint) are equally probable, when the microstates have been identified with the energy eigenstates, contains an implicit assumption which severely contradicts quantum mechanics, because according to quantum mechanics the state of a system does not need to be any energy eigenstate. The states can be linear combinations of energy eigenstates. For this reason it is obvious that the use of microstates is supposed to be some kind of approximate model that merely has similar statistical behavior as some more accurate quantum model with proper wave functions. A proper derivation of Boltzmann's distribution should take into account the nature of this approximation.

Yeah, you can't treat every linear combination of eigenstates as different states for statistical purposes, because they are overlapping. The Boltzmann rule requires the notion of "state" to be exclusive: you can't be in two different states simultaneously. It works to just pick a complete orthonormal set of states and do statistics on those, but there might be a more sophisticated treatment that doesn't require first coming up with a basis. It's getting beyond my knowledge at this point.

jostpuur · Feb 3, 2017

stevendaryl said:

Anyway, getting back to the original post, I think it has been answered.

I see that there is a standard answer available to my question.

My original question contained the slight ambiguity that I didn't specify whether it was supposed to be in a classical or a quantum setting. It might affect the answers. For example the quantum mechanical energy states (eigenvectors) will always be orthogonal, no matter how close the energy levels (eigenvalues) are, so the states are perhaps never going to be so similar that they would "compete for the same random events".

stevendaryl · Feb 3, 2017

jostpuur said:

I see that there is a standard answer available to my question.

My original question contained the slight ambiguity that I didn't specify whether it was supposed to be in a classical or a quantum setting. It might affect the answers. For example the quantum mechanical energy states (eigenvectors) will always be orthogonal, no matter how close the energy levels (eigenvalues) are, so the states are perhaps never going to be so similar that they would "compete for the same random events".

I don't think that "competing for the same events" really makes sense as a concept. Or maybe I just don't understand what you mean.

jostpuur · Feb 3, 2017

What I meant becomes evident with the following modification to the derivation of Maxwell's speed distribution in a classical setting.

The possible states of gas particles can be parametrized by the momentum vector [itex]\vec{p}[/itex] whose allowed values can be anything from [itex]\mathbb{R}^3[/itex]. The continuum brings serious problems, and the allowed values must be discretized. The most obvious choice is to choose some small [itex]\Delta p>0[/itex], and then decide that the allowed values are from [itex]\Delta p\;\mathbb{Z}^3[/itex]. If you then decide that probabilities must be proportional to [itex]e^{-\frac{E_{\vec{p}}}{T}}[/itex], and at the end take the limit [itex]\Delta p\to 0[/itex], you get the right Maxwell's speed distribution.

Suppose that for some reason you are given a different discretization: Some discrete set [itex]\Lambda\subset\mathbb{R}^3[/itex], where for example the points are denser close to the origin, and sparser far away from the origin. If you then decide that the probabilities again must be proportional to [itex]e^{-\frac{E_{\vec{p}}}{T}}[/itex] for all [itex]\vec{p}\in\Lambda[/itex], you are going to get a wrong result -- a distorted version of the Maxwell's speed distribution.

This does not necessarily mean that the discretization [itex]\Lambda[/itex] would be wrong. All you have to do is to find nice weights [itex]f(\vec{p})[/itex] which are suitably smaller close to the origin, and larger far away from the origin, and then you have to postulate that the probabilities are going to be proportional to [itex]f(\vec{p})e^{-\frac{E_{\vec{p}}}{T}}[/itex]. If the weights are right, you get the correct Maxwell's speed distribution again.

In this case I would say that the points of [itex]\Lambda[/itex], which were denser close to the origin, were "competing for the same random events".
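
The effect can be seen numerically in a simplified one-dimensional sketch. Everything specific here is an assumption made for illustration: a single momentum component instead of [itex]\mathbb{R}^3[/itex], units with k = 1, the values of T and m, the particular non-uniform set (p = s|s| on a uniform s-grid), and the function name `mean_p_squared`. In the continuum limit the exact value of [itex]\langle p^2\rangle[/itex] is mT.

```python
import math

T, m = 1.0, 1.0  # illustrative values, units with k = 1

def mean_p_squared(points, weights):
    """<p^2> when P(p_i) is proportional to weights[i] * exp(-p_i^2 / (2 m T))."""
    boltz = [f * math.exp(-p * p / (2.0 * m * T)) for p, f in zip(points, weights)]
    return sum(b * p * p for b, p in zip(boltz, points)) / sum(boltz)

# Uniform grid: p = k * dp, every point weighted equally.
dp = 0.01
uniform = [k * dp for k in range(-1000, 1001)]
mean_uniform = mean_p_squared(uniform, [1.0] * len(uniform))

# Hypothetical non-uniform set Lambda: p = s * |s| on a uniform s-grid,
# so the points crowd together near p = 0 and thin out far away.
ds = 0.01
s_grid = [k * ds for k in range(-400, 401)]
lam = [s * abs(s) for s in s_grid]

# Bare Boltzmann factor on Lambda with no weights: distorted result.
mean_unweighted = mean_p_squared(lam, [1.0] * len(lam))

# Weights proportional to the local spacing dp/ds = 2|s| undo the crowding.
mean_weighted = mean_p_squared(lam, [2.0 * abs(s) for s in s_grid])

print(mean_uniform, mean_unweighted, mean_weighted)
```

The uniform grid and the weighted non-uniform grid should both come out close to mT, while the unweighted non-uniform grid gives roughly half of that, because the crowded points near the origin are overcounted.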

Due to this example, I'm not convinced that the question of finding the "right discretization" would necessarily be the correct question. Equivalently we might think that almost any discretization will be fine, while the real task is going to be finding a way to find the right weights.

mfb · Feb 3, 2017

jostpuur said:

Due to this example, I'm not convinced that the question of finding the "right discretization" would necessarily be the correct question. Equivalently we might think that almost any discretization will be fine, while the real task is going to be finding a way to find the right weights.

In classical mechanics, that leads to the same answers, but it ruins the feature that classical mechanics is a special case of quantum mechanics. In quantum mechanics it does not work at all.
I don't see any advantage of a system that is more complicated, less motivated by physics and has a smaller range of applications.

jostpuur · Feb 3, 2017

The quantum mechanical derivation of Maxwell's speed distribution is symbolic garbage that produces the right result by accident and good luck. Or at least it looks like it.

Suppose you have a cube of macroscopic size 10 m[itex]\times[/itex]10 m[itex]\times[/itex]10 m, and a lot of gas particles in it. The gas particles are going to obey the Maxwell's speed distribution, and you can derive it by assuming that the gas particles would be occupying the quantum energy eigenstates, which will be spatially spread over the entire macroscopic cube, and which can be written in terms of trigonometric functions. Seriously speaking, the gas particles are not going to be spatially spread like that though, because the gas would not behave like a macroscopic gas that way. The gas particles are probably in states which can be described by spatially localized wave packets with some sensible momentum expectation values [itex]\langle \vec{p}\rangle[/itex], which will behave as their classical momenta.

Is there any serious reason to believe that quantum mechanics would have anything to do with the Maxwell's speed distribution?

PeterDonis · Feb 3, 2017

jostpuur said:

Suppose we find out that the model was only an approximation of a more accurate model, and according to the new more accurate model the energy values are going to be ##\mathcal{E}_n## and ##\mathcal{E}_{2n}+\epsilon## with some small positive epsilon. Now the energy levels are

Is your intent to add new energy levels, or to shift some of the energy levels without adding any? The actual modified levels you posted indicate the former, but your verbal description seems to indicate the latter. Also, your derivation of the different distributions for ##\epsilon = 0## requires the former, but I don't see why different distributions would be a problem unless you are assuming the latter.

jostpuur · Feb 3, 2017

PeterDonis said:

Is your intent to add new energy levels, or to shift some of the energy levels without adding any?

The original question contained some ambiguities because it was only brainstorming.

mfb · Feb 3, 2017

jostpuur said:

The quantum mechanical derivation of Maxwell's speed distribution is symbolic garbage that produces the right result by accident and good luck. Or at least it looks like it.

Maybe to you, I can't judge that. It is perfectly fine and the right thing to do.

Nothing in the derivation assumes that the particles are perfectly in energy eigenstates. The classical case is the limit of the quantum case for [itex]h\to 0[/itex]. For observables that do not depend on [itex]h[/itex], this limit is trivial to calculate.

PeterDonis · Feb 3, 2017

jostpuur said:

The original question contained some ambiguities because it was only brainstorming.

But your question can't be answered unless the ambiguity is resolved. Can you resolve it?

jostpuur · Feb 3, 2017

I think that my original question has been answered in the sense that the relevant ambiguities have gotten pinpointed.

PeterDonis said:

Is your intent to add new energy levels, or to shift some of the energy levels without adding any?

Originally I did not have an answer ready for this. I can see that the standard answer to my original question will depend on which way the clarification will be made.

PeterDonis · Feb 3, 2017

jostpuur said:

The quantum mechanical derivation of Maxwell's speed distribution

Please give a specific reference.

jostpuur said:

The gas particles are going to obey the Maxwell's speed distribution, and you can derive it by assuming that the gas particles would be occupying the quantum energy eigenstates

No, you can't, because when you take quantum statistics into account, the correct distribution is not Maxwell-Boltzmann, it's either Bose-Einstein or Fermi-Dirac.

jostpuur · Feb 6, 2017

I'm only responding to the request.

Introductory Statistical Mechanics (second edition) by Bowley and Sanchez has Chapter 7 with title "Maxwell distribution of molecular speeds". Section 7.1 has title "The probability that a particle is in a quantum state", and it starts on page 144 with this type of content:

For a particle in a box, the eigenfunction describing standing waves is
[tex] \phi_i(x,y,z) = A\sin\Big(\frac{n_1\pi x}{L_x}\Big)\sin\Big(\frac{n_2\pi y}{L_y}\Big)\sin\Big(\frac{n_3\pi z}{L_z}\Big)[/tex]

The corresponding energy eigenvalue is
[tex] \epsilon_i = \frac{\hbar^2\pi^2}{2m}\Big(\frac{n_1^2}{L_x^2}+\frac{n_2^2}{L_y^2}+\frac{n_3^2}{L_z^2}\Big)[/tex]

According to Boltzmann, the probability of finding a given particle in a particular single-particle state of energy [itex]\epsilon_i[/itex] is
[tex] p_i = \frac{e^{-\epsilon_i/k_{\textrm{B}}T}}{Z}[/tex]

Then the book goes on about issues with densities of states in three dimensions, and eventually on page 152 they get to this

Let there be [itex]n(u)du[/itex] particles with speeds between [itex]u[/itex] and [itex]u+du[/itex]. For a gas in three dimensions we get
[tex] n(u)du = \Big(\frac{N\lambda_D^3m^3}{2\pi^2\hbar^3}\Big)u^2e^{-mu^2/2k_{\textrm{B}}T}du[/tex]

This is called the Maxwell distribution of molecular speeds.

Nowhere in between did they speak about an artificial discretization of classical velocities or momenta, so the reader is left with the impression that this result comes naturally from Schrödinger's equation.

The formula appears to contain [itex]\hbar[/itex], but actually the factor [itex]\lambda_D[/itex] was defined in such way that the Planck's constants cancel.
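
The cancellation can be made explicit. The definition of [itex]\lambda_D[/itex] is not quoted above, so the following assumes it is the thermal de Broglie wavelength in its usual form:

[tex] \lambda_D = \Big(\frac{2\pi\hbar^2}{mk_{\textrm{B}}T}\Big)^{1/2}[/tex]

Under that assumption the prefactor becomes

[tex] \frac{N\lambda_D^3m^3}{2\pi^2\hbar^3} = \frac{Nm^3}{2\pi^2\hbar^3}\Big(\frac{2\pi\hbar^2}{mk_{\textrm{B}}T}\Big)^{3/2} = 4\pi N\Big(\frac{m}{2\pi k_{\textrm{B}}T}\Big)^{3/2}[/tex]

so that

[tex] n(u)du = 4\pi N\Big(\frac{m}{2\pi k_{\textrm{B}}T}\Big)^{3/2}u^2e^{-mu^2/2k_{\textrm{B}}T}du[/tex]

which contains no [itex]\hbar[/itex] and integrates to [itex]N[/itex] over all speeds.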

Later in Chapter 10 they speak about Fermi and Bose particles.

DrClaude · Feb 6, 2017

Let me quote Callen, Thermodynamics and an Introduction to Thermostatistics, 2nd ed., sec. 16-9:

Callen said:

[...] the partition function becomes
$$
z = \frac{1}{h^3} \int e^{-\beta \mathcal{H}} dx \, dy \, dz \, dp_x \, dp_y \, dp_z \quad \quad (16.68)
$$
Except for the appearance of the classically inexplicable prefactor (##1/h^3##), this representation of the partition sum (per mode) is fully classical. It was in this form that statistical mechanics was devised by Josiah Willard Gibbs in a series of papers in the Journal of the Connecticut Academy between 1875 and 1878. Gibbs' postulate of equation 16.68 (with the introduction of the quantity ##h##, for which there was no a priori classical justification) must stand as one of the most inspired insights in the history of physics. To Gibbs, the numerical value of ##h## was simply to be determined by comparison with empirical thermophysical data.

jambaugh · Feb 7, 2017

@jostpuur,
Pardon me for chiming in so late, but I would also point out that you get the same derivation from a fundamental assumption that at equilibrium the entropy is maximized. This works for both classical and quantum settings: solving the constrained optimization problem via Lagrange multipliers gives you the partition function from the probability normalization. The equi-partition principle is built into the definition of entropy, because entropy is calculated as a sum over states (or a trace over dimensions in the density operator formulation for the quantum case). The sum [itex]-\sum_k p_k \log p_k[/itex] defining entropy is always larger when the p's are uniformly equal across equivalent states... (variations in probabilities between states manifest when you optimize subject to constraints, e.g. that <E> is some specific value.)

Work through the Lagrange multiplier optimization problem yourself and you'll begin to see that it really can't be any other way than this. I was quite inspired to see the definition of temperature emerge as the (reciprocal) Lagrange multiplier for the fixed expected energy constraint, and the chemical potential similarly emerge from the expected particle number constraint. And you can further invoke other constraints by prescribing a fixed expected value for any system observable (mean magnetization, charge polarization, angular momentum, ...).

jostpuur · Feb 13, 2017

Are you speaking about the calculation where we wish to maximize the function

[tex] f:[0,\infty[^N\to\mathbb{R},\quad f(p_1,p_2,\ldots, p_N) = -\sum_{n=1}^N p_n\log(p_n)[/tex]

under the constraint

[tex] 0 = \phi(p_1,p_2,\ldots, p_N) = p_1 + p_2 + \cdots + p_N - 1[/tex]

The equation [itex]\nabla f = \lambda \nabla\phi[/itex] then implies that all [itex]p_n[/itex] have to be equal.

The problems in classical statistical physics start from the fact that there is no right way to discretize the continuous parameters, and this entropy calculation assumes that the discrete set [itex]\{1,2,\ldots, N\}[/itex] has already been fixed and given some interpretation right from the start, so this calculation does not help much with those issues.

If you assume that a wave function of a very large dimensional quantum system obeys the Schrödinger's equation, and also assume something else concerning statistics, can you prove that the quantity

[tex] S(t) = -\sum_{n=1}^N |\psi_n(t)|^2 \log(|\psi_n(t)|^2)[/tex]

has a habit of growing upwards?

jambaugh · Feb 14, 2017

What you say about all the probabilities being equal is true only if you do not impose further constraints. You can constrain the range of probabilities so that the expected value of the energy is a(n arbitrary) fixed value <E> = e. This gives you the classic (or quantum) distribution and the partition function emerges as the probability normalizing Lagrange multiplier. You can work with classical probability densities over phase space and the entropy defined as proportional to [itex]-\kappa \int_S \rho \ln(\rho) dxdp[/itex] or the quantum case either the discrete trace or integral trace. There's no difficulty with needing discrete states in the classical continuum case, you just need to pick an arbitrary scale for the entropy.
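
In outline, for the discrete case the constrained maximization runs as follows. The multiplier names [itex]\alpha[/itex] and [itex]\beta[/itex] are labels chosen for this sketch, and the sum runs over states, so a degenerate level contributes one term per state. Maximize

[tex] \mathcal{L} = -\sum_{n=1}^N p_n\log(p_n) - \alpha\Big(\sum_{n=1}^N p_n - 1\Big) - \beta\Big(\sum_{n=1}^N p_nE_n - e\Big)[/tex]

Setting the derivative with respect to each [itex]p_n[/itex] to zero gives

[tex] -\log(p_n) - 1 - \alpha - \beta E_n = 0 \quad\Longrightarrow\quad p_n = e^{-1-\alpha}e^{-\beta E_n}[/tex]

The normalization constraint then fixes [itex]e^{1+\alpha}[/itex]:

[tex] e^{1+\alpha} = \sum_{n=1}^N e^{-\beta E_n} = Z(\beta),\qquad p_n = \frac{e^{-\beta E_n}}{Z(\beta)}[/tex]

and [itex]\beta[/itex] is fixed by the energy constraint, [itex]-\partial\log Z/\partial\beta = e[/itex]. Identifying [itex]\beta[/itex] with the inverse temperature reproduces the Boltzmann form. Without the energy constraint the term [itex]\beta E_n[/itex] is absent and all [itex]p_n[/itex] come out equal.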

And as I mentioned, you can further constrain the system to have an arbitrary fixed expected particle number <n> , or arbitrary fixed expected *insert observable here*. These constraints further affect the state probabilities. Shall I work out the details here?

jostpuur · Feb 14, 2017

How do you justify that the formula [itex]\int p\log(p)dx[/itex] is a good formula for entropy? All the mathematical sources I have seen only give it as an axiomatic definition. In the context of Boltzmann's distribution the formula [itex]\log(W)[/itex] is justified via the need for the formula [itex]\log(W_1W_2)=\log(W_1)+\log(W_2)[/itex] to hold, and only the logarithm has this property.
