# Derivation of the partition function

Starting from the definition of energy levels ##e_n## and occupations ##a_n##, and the conditions ##\sum_n a_n = N## (2.2) and ##\sum_n a_n e_n = E## (2.3), where ##N## and ##E## are fixed, I'm trying to find the distribution which extremizes the Shannon entropy.

Using the frequencies ##f_n=a_n/N## and ##S=\sum_n f_n\log(f_n)## (the negative of the Shannon entropy, which has the same stationary points), I need to solve the variation (##\lambda## and ##\mu## are Lagrange multipliers) ##\delta\left(S+\lambda\left(\sum_n f_n-1\right) + \mu\left(\sum_n e_n f_n -E/N\right)\right)=0##.
\begin{align*} \delta(S) &= \sum \delta f_n\log(f_n) + \sum f_n \delta(\log(f_n) )\\ &\sum f_n \delta(\log(f_n) ) = \sum f_n (\delta f_n)/f_n=\sum \delta f_n \end{align*}
and the total variation is
\begin{align*} \delta\left(S+\lambda\left(\sum_n f_n-1\right)+\mu\left(\sum_n e_n f_n-E/N\right)\right) &= (1+\lambda)\sum \delta f_n + \sum \delta f_n\log(f_n) + \mu\sum e_n\delta f_n \end{align*}
Setting ##\lambda=-1## we are left with ##\sum \delta f_n\log(f_n)=- \mu\sum e_n\delta f_n ## from which follows ##a_n=Ne^{-\mu e_n}##.

This is the first time I've attempted a constrained variation, so I'm not entirely confident that this is right. I hope it is, because it is more satisfactory than methods** that use Stirling's formula. Also, ##\lambda## can be trivially eliminated.

The interpretation can now be made that the state derived has the greatest entropy/disorder of all possible states that satisfy the constraints. There is a physical connection which is absent if a probability is extremised, because probability has no physical analog, unlike entropy.

** This method extremises the log of ##P=N!/\prod_n a_n!##
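To see how the two routes connect, here is a rough numerical sketch (the occupation numbers are made up for illustration) comparing the exact ##\log P## with the Stirling estimate ##-N\sum_n f_n\log(f_n)##. The function names are hypothetical, and `math.lgamma(a + 1)` is used for ##\log(a!)##.

```python
import math

def log_multiplicity(occupations):
    """Exact log of P = N! / prod(a_n!), using lgamma(a + 1) = log(a!)."""
    n_total = sum(occupations)
    return math.lgamma(n_total + 1) - sum(math.lgamma(a + 1) for a in occupations)

def entropy_estimate(occupations):
    """Stirling estimate of log P: -N * sum(f_n * log(f_n)) with f_n = a_n / N."""
    n_total = sum(occupations)
    return -n_total * sum(
        (a / n_total) * math.log(a / n_total) for a in occupations if a > 0
    )

# Illustrative occupations only; any large N shows the same behaviour.
occupations = [500_000, 300_000, 200_000]
exact = log_multiplicity(occupations)
approx = entropy_estimate(occupations)
ratio = exact / approx
print(exact, approx, ratio)
```

For large ##N## the ratio should be very close to 1, which suggests that extremising ##\log P## and extremising ##\sum_n f_n\log(f_n)## lead to the same distribution.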


stevendaryl
Staff Emeritus

Looks good to me, except for a couple of points. Since you define ##f_n## as ##a_n/N##, then
• ##\sum_n f_n = 1## (not ##N##)
• ##\sum_n e_n f_n = E/N##
Also, why can you set ##\lambda = -1##? What you actually need is ##\sum_n e^{-\lambda - \mu e_n} = 1##. So ##e^\lambda = \sum_n e^{-\mu e_n}##

Mentz114
Thanks for pointing that out. I have corrected the constraints. I switched from ##a_n/N## to ##f_n## for brevity and failed to update those expressions.

OK, the total variation ##(1+\lambda)\sum \delta f_n + \sum \delta f_n\log(f_n) + \mu\sum e_n\delta f_n=0## gives ##f_n = e^{-1-\lambda-\mu e_n}##, i.e. ##a_n = Ne^{-1-\lambda-\mu e_n}##, which is not what I concluded. Does this mean that the state which extremises the Shannon entropy is not the one given by the partition function? Clearly I need help, or to read some more. Any suggestions?
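A sketch of the step from the total variation to ##f_n##, written out term by term (a worked reading of the expressions in this thread, assuming the ##\delta f_n## can be varied independently once the multipliers are in place):
\begin{align*} \sum_n \delta f_n\left(1+\lambda+\log(f_n)+\mu e_n\right) &= 0 \\ 1+\lambda+\log(f_n)+\mu e_n &= 0 \quad \text{for each } n \\ f_n &= e^{-1-\lambda-\mu e_n} \end{align*}
On this reading ##\lambda## is not chosen by hand; it is fixed afterwards by the normalisation ##\sum_n f_n=1##.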

stevendaryl
Staff Emeritus

I think it's correct. Your parameter ##\mu## should be equal to the inverse temperature ##\frac{1}{kT}##. Your parameter ##(-1-\lambda)## will be equal to ##\frac{\mu_{cp}}{kT}##, where ##\mu_{cp}## is the chemical potential. So your ##f_n## would then be ##e^{-\frac{e_n - \mu_{cp}}{kT}}##

Yes, that can be deduced from mixing and additivity once the variation is solved.

The variation I wrote seems to be correct, because (the original) ##\sum e^{-\lambda - \mu e_n}=N## and (for the frequencies) ##\sum e^{-1 -\lambda - \mu e_n}=1## are equivalent: in either case we can solve for ##e^{-\lambda}## or ##e^{-1-\lambda}##, and on substituting into the second constraint that equation is satisfied.

I now understand how ##\lambda## was eliminated in the derivation I'm reading**; it is straightforward.

Thanks for pointing me in the right direction.

From ##f_n = e^{-1-\lambda-\mu e_n}## using the first constraint ##\sum_n f_n=\sum_n e^{-1-\lambda-\mu e_n}=1## we find the value of ##e^{-1-\lambda}## to be ##1/\sum e^{-\mu e_n}##. Substituting into the second constraint gives ##\sum e_n e^{-\mu e_n}/\sum e^{-\mu e_n}## which is the expected value of the energy ##E/N##. Both constraints are thus satisfied.
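As a quick numerical check of this result, here is a minimal sketch, assuming a made-up set of four equally spaced levels and a target mean energy ##E/N=1##. It solves for ##\mu## by bisection (the mean energy decreases monotonically as ##\mu## increases), builds ##f_n=e^{-\mu e_n}/\sum_m e^{-\mu e_m}##, and compares its entropy with a nearby distribution that satisfies the same two constraints. All function names and numbers are illustrative.

```python
import math

def boltzmann_frequencies(levels, mu):
    """f_n = exp(-mu * e_n) / Z, with Z = sum of exp(-mu * e_m)."""
    weights = [math.exp(-mu * e) for e in levels]
    z = sum(weights)
    return [w / z for w in weights]

def mean_energy(levels, freqs):
    return sum(e * f for e, f in zip(levels, freqs))

def shannon_entropy(freqs):
    return -sum(f * math.log(f) for f in freqs if f > 0)

def solve_mu(levels, target, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection on mu; the mean energy is a decreasing function of mu."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(levels, boltzmann_frequencies(levels, mid)) > target:
            lo = mid  # mean too high, so mu must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

levels = [0.0, 1.0, 2.0, 3.0]  # hypothetical energy levels e_n
target = 1.0                   # hypothetical value of E/N
mu = solve_mu(levels, target)
f = boltzmann_frequencies(levels, mu)

# The shift (1, -2, 1, 0) changes neither sum(f_n) nor sum(e_n * f_n)
# for these levels, so the perturbed distribution obeys both constraints.
eps = 0.01
perturbed = [f[0] + eps, f[1] - 2 * eps, f[2] + eps, f[3]]

print(mu, f)
print(shannon_entropy(f), shannon_entropy(perturbed))
```

Under these assumptions the exponential distribution reproduces both constraints, and the perturbed distribution has the lower entropy of the two.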

** Erwin Schrödinger, Statistical Thermodynamics, Dover (1952)
