Derivation of the partition function

Discussion Overview

The discussion revolves around the derivation of the partition function using the principles of statistical mechanics, specifically focusing on the extremization of Shannon entropy under certain constraints related to energy levels and occupations. Participants explore the mathematical formulation and implications of the derivation, including the use of Lagrange multipliers and the interpretation of the results.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant outlines the derivation process starting from the definitions of energy levels and occupations, aiming to find a distribution that maximizes Shannon entropy.
  • Another participant points out corrections regarding the definitions of frequencies and constraints, emphasizing that the sum of frequencies should equal 1, not N.
  • There is a discussion on the setting of the Lagrange multiplier λ, with some participants questioning the validity of setting it to -1 and suggesting that it should instead be fixed by normalization, as a sum of Boltzmann factors.
  • One participant expresses uncertainty about whether the derived state is consistent with the partition function, indicating a need for further clarification or reading.
  • Another participant suggests that the parameter μ should correspond to the inverse temperature 1/kT and discusses the implications for the derived frequencies.
  • There is a mention of the relationship between the derived expressions and the expected values of energy, with a participant confirming that both constraints are satisfied in their derivation.

Areas of Agreement / Disagreement

Participants express both agreement and disagreement on various aspects of the derivation. While some corrections are acknowledged, there remains uncertainty about the correct interpretation of parameters and the consistency of the derived state with the partition function. The discussion does not reach a consensus on these points.

Contextual Notes

Participants note limitations in their understanding of the derivation process, particularly regarding the treatment of Lagrange multipliers and the implications of the constraints. There is also an acknowledgment of the need for further reading to clarify these concepts.

Mentz114
Starting from the definition of energy levels ##e_n## and occupations ##a_n## and the conditions ##\sum_n a_n = N## (2.2) and ##\sum_n a_n e_n = E## (2.3), where ##N## and ##E## are fixed, I'm trying to find the distribution which extremizes the Shannon entropy.

Using the frequency ##f_n=a_n/N## and ##S=\sum_n f_n\log(f_n)## (the negative of the Shannon entropy, which has the same stationary points), I need to solve the variation (##\lambda## and ##\mu## are Lagrange multipliers) ##\delta\left(S+\lambda\left(\sum_n f_n-1\right) + \mu\left(\sum_n e_n f_n -E/N\right)\right)=0##.
\begin{align*}
\delta(S) &= \sum \delta f_n\log(f_n) + \sum f_n \delta(\log(f_n) )\\
\sum f_n \delta(\log(f_n) ) &= \sum f_n (\delta f_n)/f_n=\sum \delta f_n
\end{align*}
and the total variation is
\begin{align*}
\delta &= (1+\lambda)\sum \delta f_n + \sum \delta f_n\log(f_n) + \mu\sum e_n\delta f_n
\end{align*}
Setting ##\lambda=-1## we are left with ##\sum \delta f_n\log(f_n)=- \mu\sum e_n\delta f_n ## from which follows ##a_n=Ne^{-\mu e_n}##.

This is the first time I've attempted a constrained variation, so I'm not entirely confident that this is right. I hope it is, because it is more satisfactory than methods** that use Stirling's formula. Also, ##\lambda## can be trivially eliminated.

The interpretation can now be made that the state derived has the greatest entropy/disorder of all possible states that satisfy the constraints. There is a physical connection which is absent if a probability is extremised, because probability has no physical analog, unlike entropy.
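As a rough numerical illustration of this interpretation (a sketch only, with hypothetical energy levels and a hypothetical value of ##\mu##, and using the conventional sign ##-\sum_n f_n\log f_n## for the entropy), one can perturb an exponential distribution along a direction that preserves both constraints and check that the entropy goes down:

```python
import math

def shannon_entropy(frequencies):
    """Shannon entropy -sum f log f (natural log), skipping zero entries."""
    return -sum(f * math.log(f) for f in frequencies if f > 0)

# Hypothetical three-level system; levels and mu are illustrative values only.
levels = [0.0, 1.0, 2.0]
mu = 0.7

weights = [math.exp(-mu * e) for e in levels]
Z = sum(weights)  # normalising sum, introduced here for illustration
boltzmann = [w / Z for w in weights]

# A displacement that preserves both constraints for levels 0, 1, 2:
# sum(d_n) = 0 and sum(e_n * d_n) = 0.
direction = [1.0, -2.0, 1.0]

def perturbed(eps):
    """Frequencies shifted by eps along the constraint-preserving direction."""
    return [f + eps * d for f, d in zip(boltzmann, direction)]

s0 = shannon_entropy(boltzmann)
for eps in (-0.05, -0.01, 0.01, 0.05):
    # Entropy change relative to the exponential distribution.
    print(eps, shannon_entropy(perturbed(eps)) - s0)
```

With three levels and two constraints the allowed states form a one-parameter family, so this single direction covers every admissible perturbation in this toy case.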

** This method extremises the log of ##P=N!/\prod_n a_n!##
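The two approaches are closely related: for large ##N##, Stirling's formula gives ##\frac{1}{N}\log P \approx -\sum_n f_n\log f_n##. A minimal numerical sketch of that limit, using hypothetical frequencies and Python's `math.lgamma` to evaluate the factorials without approximation:

```python
import math

def log_multiplicity_per_particle(occupations):
    """(1/N) * log(N! / prod_n a_n!), computed with lgamma(x + 1) = log(x!)."""
    n_total = sum(occupations)
    log_p = math.lgamma(n_total + 1) - sum(math.lgamma(a + 1) for a in occupations)
    return log_p / n_total

def shannon_entropy(frequencies):
    """Shannon entropy -sum f log f (natural log)."""
    return -sum(f * math.log(f) for f in frequencies if f > 0)

# Hypothetical frequencies, for illustration only.
frequencies = [0.5, 0.3, 0.2]
target = shannon_entropy(frequencies)

for n_total in (10, 1000, 100000):
    occupations = [round(f * n_total) for f in frequencies]
    # The first number approaches the second as n_total grows.
    print(n_total, log_multiplicity_per_particle(occupations), target)
```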
 
Mentz114 said:
Starting from the definition of energy levels ##e_n## and occupations ##a_n## and the conditions ##\sum_n a_n = N## (2.2) and ##\sum_n a_n e_n = E## (2.3), where ##N## and ##E## are fixed, I'm trying to find the distribution which extremizes the Shannon entropy.

Using the frequency ##f_n=a_n/N## and ##S=\sum f_n\log(f_n)## I need to solve the variation (##\lambda## and ##\mu## are Lagrange multipliers) ##\delta\left(S+\lambda\left(\sum_n f_n-N\right) + \mu\left(\sum_n e_n f_n -E\right)\right)=0##.
\begin{align*}
\delta(S) &= \sum \delta f_n\log(f_n) + \sum f_n \delta(\log(f_n) )\\
\sum f_n \delta(\log(f_n) ) &= \sum f_n (\delta f_n)/f_n=\sum \delta f_n
\end{align*}
and the total variation is
\begin{align*}
\delta &= (1+\lambda)\sum \delta f_n + \sum \delta f_n\log(f_n) + \mu\sum e_n\delta f_n
\end{align*}
Setting ##\lambda=-1## we are left with ##\sum \delta f_n\log(f_n)=- \mu\sum e_n\delta f_n ## from which follows ##a_n=Ne^{-\mu e_n}##.

This is the first time I've attempted a constrained variation, so I'm not entirely confident that this is right. I hope it is, because it is more satisfactory than methods** that use Stirling's formula. Also, ##\lambda## can be trivially eliminated.

The interpretation can now be made that the state derived has the greatest entropy/disorder of all possible states that satisfy the constraints. There is a physical connection which is absent if a probability is extremised, because probability has no physical analog, unlike entropy.

** This method extremises the log of ##P=N!/\prod_n a_n!##

Looks good to me, except for a couple of points. Since you define ##f_n## as ##a_n/N##, then
  • ##\sum_n f_n = 1## (not ##N##)
  • ##\sum_n e_n f_n = E/N##
Also, why can you set ##\lambda = -1##? What you actually need is ##\sum_n e^{-\lambda - \mu e_n} = 1##. So ##e^\lambda = \sum_n e^{-\mu e_n}##.
 
stevendaryl said:
Looks good to me, except for a couple of points. Since you define ##f_n## as ##a_n/N##, then
  • ##\sum_n f_n = 1## (not ##N##)
  • ##\sum_n e_n f_n = E/N##
Thanks for pointing that out. I have corrected the constraints. I switched from ##a_n/N## to ##f_n## for brevity and failed to update those expressions.
Also, why can you set ##\lambda = -1##? What you actually need is ##\sum_n e^{-\lambda - \mu e_n} = 1##. So ##e^\lambda = \sum_n e^{-\mu e_n}##.
I'm thinking about this. I thought I could solve for ##\lambda## and ##\mu## separately.
 
stevendaryl said:
Looks good to me, except for a couple of points. Since you define ##f_n## as ##a_n/N##, then
  • ##\sum_n f_n = 1## (not ##N##)
  • ##\sum_n e_n f_n = E/N##
Also, why can you set ##\lambda = -1##? What you actually need is ##\sum_n e^{-\lambda - \mu e_n} = 1##. So ##e^\lambda = \sum_n e^{-\mu e_n}##.
OK, the total variation ##(1+\lambda)\sum \delta f_n + \sum \delta f_n\log(f_n) + \mu\sum e_n\delta f_n=0## gives

##f_n = e^{-1-\lambda-\mu e_n}##, which is not what I concluded. Does this mean that the state which extremises the Shannon entropy is not that given by the partition function? Clearly I need help or to read some more. Any suggestions?
 
Mentz114 said:
OK, the total variation ##(1+\lambda)\sum \delta f_n + \sum \delta f_n\log(f_n) + \mu\sum e_n\delta f_n=0## gives

##f_n = e^{-1-\lambda-\mu e_n}##, which is not what I concluded. Does this mean that the state which extremises the Shannon entropy is not that given by the partition function? Clearly I need help or to read some more. Any suggestions?

I think it's correct. Your parameter ##\mu## should be equal to the inverse temperature ##\frac{1}{kT}##. Your parameter ##(-1-\lambda)## will be equal to ##\frac{\mu_{cp}}{kT}## where ##\mu_{cp}## is the chemical potential. So your ##f## would then be ##e^{-\frac{e_n - \mu_{cp}}{kT}}##.
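One way to see the connection with the partition function, sketched under the thread's sign convention ##S=\sum_n f_n\log f_n## (the symbol ##Z## is introduced here for illustration and is not used in the posts above): once the multipliers are in place the ##\delta f_n## can be treated as independent, so the coefficient of each ##\delta f_n## in the total variation has to vanish separately.

\begin{align*}
\log f_n + 1 + \lambda + \mu e_n = 0 \quad&\Rightarrow\quad f_n = e^{-1-\lambda}\,e^{-\mu e_n}\\
\sum_n f_n = 1 \quad&\Rightarrow\quad e^{1+\lambda} = \sum_n e^{-\mu e_n} \equiv Z\\
&\Rightarrow\quad f_n = \frac{e^{-\mu e_n}}{Z}
\end{align*}

Read this way, ##\lambda## only carries the normalization, and with ##\mu = 1/kT## the extremal frequencies take the usual canonical form.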
 
stevendaryl said:
I think it's correct. Your parameter ##\mu## should be equal to the inverse temperature ##\frac{1}{kT}##. Your parameter ##(-1-\lambda)## will be equal to ##\frac{\mu_{cp}}{kT}## where ##\mu_{cp}## is the chemical potential. So your ##f## would then be ##e^{-\frac{e_n - \mu_{cp}}{kT}}##.
Yes, that can be deduced from mixing and additivity once the variation is solved.

The variation I wrote seems to be correct because (the original) ##\sum e^{-\lambda - \mu e_n}=N## and (for the frequencies) ##\sum e^{-1 -\lambda - \mu e_n}=1## are the same in that we can solve for ##e^{-\lambda}## or ##e^{-1-\lambda}## and on substituting into the second constraint the equation is satisfied.

I now understand how ##\lambda## was eliminated in the derivation I'm reading**; this is straightforward.

Thanks for pointing me in the right direction.

[edit]
From ##f_n = e^{-1-\lambda-\mu e_n}## using the first constraint ##\sum_n f_n=\sum_n e^{-1-\lambda-\mu e_n}=1## we find the value of ##e^{-1-\lambda}## to be ##1/\sum_n e^{-\mu e_n}##. Substituting into the second constraint gives ##\sum_n e_n e^{-\mu e_n}/\sum_n e^{-\mu e_n}## which is the expected value of the energy ##E/N##. Both constraints are thus satisfied.

** Erwin Schrödinger, Statistical Thermodynamics, Dover (1952)
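The second constraint determines ##\mu## only implicitly, so in practice it has to be solved for numerically. A minimal sketch, assuming hypothetical energy levels and a hypothetical value of ##E/N## (the function names and the bisection approach are illustrative choices, not taken from the reference):

```python
import math

def mean_energy(mu, levels):
    """sum_n e_n exp(-mu e_n) / sum_n exp(-mu e_n)."""
    weights = [math.exp(-mu * e) for e in levels]
    return sum(e * w for e, w in zip(levels, weights)) / sum(weights)

def solve_mu(levels, target, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection on mu; the mean energy decreases as mu increases."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(mid, levels) > target:
            lo = mid  # mean too high, so mu must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical inputs: four levels and a target E/N between the lowest
# and highest level.
levels = [0.0, 1.0, 2.0, 3.0]
energy_per_particle = 1.2

mu = solve_mu(levels, energy_per_particle)
weights = [math.exp(-mu * e) for e in levels]
Z = sum(weights)  # plays the role of 1 / e^{-1-lambda}
frequencies = [w / Z for w in weights]

# Both constraints: normalisation and mean energy.
print(mu, sum(frequencies), sum(e * f for e, f in zip(levels, frequencies)))
```

Bisection is used here only because the mean energy is monotonic in ##\mu##; any one-dimensional root finder would serve.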
 