Derivation of the partition function

Mentz114 · Jul 7, 2018

Starting from the definition of energy levels ##e_n## and occupations ##a_n##, and the conditions ##\sum_n a_n = N## (2.2) and ##\sum_n a_n e_n = E## (2.3), where ##N## and ##E## are fixed, I'm trying to find the distribution which extremizes the Shannon entropy.

Using the frequency ##f_n=a_n/N## and ##S=\sum_n f_n\log(f_n)## (the Shannon entropy up to sign, which does not affect the extremum), I need to solve the variation (##\lambda## and ##\mu## are Lagrange multipliers) ##\delta\left(S+\lambda\left(\sum_n f_n-1\right) + \mu\left(\sum_n e_n f_n -E/N\right)\right)=0##.
[tex]
\begin{align*}
\delta(S) &= \sum \delta f_n\log(f_n) + \sum f_n \delta(\log(f_n) )\\
&\sum f_n \delta(\log(f_n) ) = \sum f_n (\delta f_n)/f_n=\sum \delta f_n
\end{align*}
[/tex]
and the total variation is
[tex]
\begin{align*}
\delta &= (1+\lambda)\sum \delta f_n + \sum \delta f_n\log(f_n) + \mu\sum e_n\delta f_n
\end{align*}
[/tex]
Setting ##\lambda=-1## we are left with ##\sum \delta f_n\log(f_n)=- \mu\sum e_n\delta f_n ## from which follows ##a_n=Ne^{-\mu e_n}##.

This is the first time I've attempted a constrained variation, so I'm not entirely confident that this is right. I hope it is, because it is more satisfactory than methods** that use Stirling's formula. Also ##\lambda## can be trivially eliminated.

The interpretation can now be made that the state derived has the greatest entropy/disorder of all possible states that satisfy the constraints. There is a physical connection which is absent if a probability is extremised, because probability has no physical analog, unlike entropy.

** This method extremises the log of ##P=N!/\prod_n a_n!##
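The link between the two approaches can be checked numerically. The sketch below compares the exact log of the multiplicity ##P## with the Stirling estimate, which in the sign convention used here reads ##\log P \approx -N\sum_n f_n\log f_n##. The occupation numbers are made up for illustration, and the function names are hypothetical.

```python
import math

def log_multiplicity(occupations):
    """Exact log of P = N! / prod_n a_n!, computed via log-gamma."""
    n_total = sum(occupations)
    return math.lgamma(n_total + 1) - sum(math.lgamma(a + 1) for a in occupations)

def shannon_sum(occupations):
    """S = sum_n f_n log f_n with f_n = a_n / N (sign convention of the post)."""
    n_total = sum(occupations)
    return sum((a / n_total) * math.log(a / n_total) for a in occupations if a > 0)

# Hypothetical occupations, chosen only for illustration.
occupations = [5000, 3000, 1500, 500]
N = sum(occupations)

per_particle = log_multiplicity(occupations) / N
print("log(P)/N        :", per_particle)
print("-sum f_n log f_n:", -shannon_sum(occupations))
```

For occupations of this size the two numbers should agree closely, and the gap shrinks as ##N## grows, so extremising ##\log P## and extremising ##S## single out the same distribution.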

stevendaryl · Jul 8, 2018

Mentz114 said:

Starting from the definition of energy levels ##e_n## and occupations ##a_n##, and the conditions ##\sum_n a_n = N## (2.2) and ##\sum_n a_n e_n = E## (2.3), where ##N## and ##E## are fixed, I'm trying to find the distribution which extremizes the Shannon entropy.

Using the frequency ##f_n=a_n/N## and ##S=\sum_n f_n\log(f_n)## I need to solve the variation (##\lambda## and ##\mu## are Lagrange multipliers) ##\delta\left(S+\lambda\left(\sum_n f_n-N\right) + \mu\left(\sum_n e_n f_n -E\right)\right)=0##.

Looks good to me, except for a couple of points. Since you define ##f_n## as ##a_n/N##, then

##\sum_n f_n = 1## (not ##N##)
##\sum_n e_n f_n = E/N##

Also, why can you set ##\lambda = -1##? What you actually need is ##\sum_n e^{-\lambda - \mu e_n} = 1##, so ##e^\lambda = \sum_n e^{-\mu e_n}##.
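
The step from the total variation to the distribution can be spelled out as follows. This is a sketch in the thread's conventions (##S=\sum_n f_n\log f_n##, multipliers added with a plus sign). The symbol ##Z## is shorthand introduced here, not notation from the posts above. Because the multipliers take care of the constraints, the ##\delta f_n## can be treated as independent, so the coefficient of each ##\delta f_n## has to vanish separately:
[tex]
\begin{align*}
\log f_n + 1 + \lambda + \mu e_n = 0 \quad&\Rightarrow\quad f_n = e^{-1-\lambda-\mu e_n}\\
\sum_n f_n = 1 \quad&\Rightarrow\quad e^{-1-\lambda} = \frac{1}{Z},\qquad Z=\sum_n e^{-\mu e_n}\\
&\Rightarrow\quad f_n = \frac{e^{-\mu e_n}}{Z}
\end{align*}
[/tex]
So ##\lambda## is fixed by normalisation rather than chosen freely. The condition ##\sum_n e^{-\lambda - \mu e_n} = 1## is the same statement with the constant 1 absorbed into ##\lambda##.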

Mentz114 · Jul 8, 2018

stevendaryl said:

Looks good to me, except for a couple of points. Since you define ##f_n## as ##a_n/N##, then

##\sum_n f_n = 1## (not ##N##)

##\sum_n e_n f_n = E/N##

Thanks for pointing that out. I have corrected the constraints. I switched from ##a_n/N## to ##f_n## for brevity and failed to update those expressions.

Also, why can you set ##\lambda = -1##? What you actually need is ##\sum_n e^{-\lambda - \mu e_n} = 1##, so ##e^\lambda = \sum_n e^{-\mu e_n}##.

I'm thinking about this. I thought I could solve for ##\lambda## and ##\mu## separately.

Mentz114 · Jul 8, 2018

OK, the total variation ##(1+\lambda)\sum \delta f_n + \sum \delta f_n\log(f_n) + \mu\sum e_n\delta f_n=0## gives

##f_n = e^{-1-\lambda-\mu e_n}##, i.e. ##a_n = Ne^{-1-\lambda-\mu e_n}##, which is not what I concluded. Does this mean that the state which extremises the Shannon entropy is not the one given by the partition function? Clearly I need help, or to read some more. Any suggestions?

stevendaryl · Jul 8, 2018

I think it's correct. Your parameter ##\mu## should be equal to the inverse temperature ##\frac{1}{kT}##. Your parameter ##(-1-\lambda)## will be equal to ##\frac{\mu_{cp}}{kT}##, where ##\mu_{cp}## is the chemical potential. So your ##f## would then be ##e^{-\frac{e_n - \mu_{cp}}{kT}}##.

Mentz114 · Jul 8, 2018

Yes, that can be deduced from mixing and additivity once the variation is solved.

The variation I wrote seems to be correct, because (the original) ##\sum e^{-\lambda - \mu e_n}=N## and (for the frequencies) ##\sum e^{-1 -\lambda - \mu e_n}=1## are equivalent: in either case we can solve for ##e^{-\lambda}## or ##e^{-1-\lambda}##, and on substituting into the second constraint the equation is satisfied.

I now understand how ##\lambda## was eliminated in the derivation I'm reading**; this is straightforward.

Thanks for pointing me in the right direction.

[edit]
From ##f_n = e^{-1-\lambda-\mu e_n}## using the first constraint ##\sum_n f_n=\sum_n e^{-1-\lambda-\mu e_n}=1## we find the value of ##e^{-1-\lambda}## to be ##1/\sum_n e^{-\mu e_n}##. Substituting into the second constraint gives ##\sum_n e_n e^{-\mu e_n}/\sum_n e^{-\mu e_n}## which is the expected value of the energy ##E/N##. Both constraints are thus satisfied.

** Erwin Schrödinger, Statistical Thermodynamics, Dover (1952)
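
The check in the edit above can also be done numerically. The following is a minimal sketch, assuming four made-up energy levels and an arbitrary value of ##\mu##; all function and variable names are hypothetical. It builds ##f_n = e^{-\mu e_n}/\sum_m e^{-\mu e_m}##, then perturbs it along a direction that leaves both constraints unchanged and looks at how ##S=\sum_n f_n\log f_n## responds.

```python
import math

def boltzmann_frequencies(levels, mu):
    """f_n = exp(-mu * e_n) / sum_m exp(-mu * e_m)."""
    weights = [math.exp(-mu * e) for e in levels]
    z = sum(weights)  # normalising sum, i.e. the partition function
    return [w / z for w in weights]

def s_functional(f):
    """S = sum_n f_n log f_n (sign convention of the thread)."""
    return sum(x * math.log(x) for x in f)

def constraint_preserving_direction(levels):
    """Perturbation d with sum_n d_n = 0 and sum_n e_n d_n = 0.

    Built from the first three levels, so at least three levels are needed.
    """
    e0, e1, e2 = levels[:3]
    d = [0.0] * len(levels)
    d[0], d[1], d[2] = e1 - e2, e2 - e0, e0 - e1
    return d

# Hypothetical inputs, chosen only for illustration.
levels = [0.0, 1.0, 2.0, 3.0]
mu = 0.7

f = boltzmann_frequencies(levels, mu)
mean_energy = sum(e * x for e, x in zip(levels, f))  # plays the role of E/N
d = constraint_preserving_direction(levels)

s0 = s_functional(f)
deltas = []
for eps in (1e-2, 1e-3):
    g = [x + eps * dn for x, dn in zip(f, d)]  # still satisfies both constraints
    deltas.append((eps, s_functional(g) - s0))

print("sum f_n     =", sum(f))
print("mean energy =", mean_energy)
for eps, delta in deltas:
    print("eps =", eps, " change in S =", delta)
```

If the distribution is a constrained extremum, the change in ##S## should have no part linear in `eps`, so it should shrink by roughly a factor of 100 when `eps` shrinks by 10. With this sign convention the change should be positive, meaning ##\sum_n f_n\log f_n## is minimised and the entropy ##-\sum_n f_n\log f_n## is maximised.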
