Mathematical detail regarding Boltzmann's H theorem

  1. Oct 3, 2006 #1


    In the "proof" of the theorem, my course notes define [itex]P_r(t)[/itex] as the probability of finding the system in state r at time t, and they define H as the mean value of [itex]\ln P_r[/itex] over all accessible states:

    [tex]H \equiv \sum_r P_r\ln P_r[/tex]

    Is it right to call the above sum the "mean value of ln P_r"? Given a quantity u, the mean value of f(u) is defined as

    [tex]\sum_i P(u_i)f(u_i)[/tex]

    So the mean value of [itex]\ln P_r[/itex] should be

    [tex]\sum_r P(P_r)\ln P_r[/tex]

    But P(P_r) does not make sense.

    I confessed my confusion to the professor in vaguer terms (at the time, I only thought the equation looked suspicious), but he said there was nothing wrong with it. I say H could at best be called "some kind" of mean value of ln(P_r).
    Last edited: Oct 3, 2006
  2. Oct 4, 2006 #2


    The formula you are asking about
    H=SUM(P_r * ln(P_r))
    is, apart from an overall minus sign, Shannon's definition of information entropy; it is related back to statistical mechanics by multiplying by -k, where k is Boltzmann's constant, to give Gibbs' entropy
    S=-k*SUM(P_r * ln(P_r)).
    H is correctly called the mean of ln(P_r). Don't get confused by your notation: you must sum over states, not over values of the variable. P(u_i) is the probability of finding the system in the ith state, where u takes the value u_i, so your first and second equations are actually the same.

    You may prefer a less confusing way of writing the mean or expectation
    E[f(x)] = SUM(P_r * f(x_r)),
    as used in Jaynes, Phys. Rev. 106:620-630 (1957) (who discusses the connection between information H and stat mech S)
    or Chandler, Intro to Modern Stat Mech, ch. 3 (1987).
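
    To spell out the step above, here is a short sketch (my own notation, not taken from the course notes or the references): treat the quantity being averaged as a function f of the state label r, and choose f(r) = ln P_r. The probability weight attached to each term is then just P_r, the probability of the state, so no "P(P_r)" is ever needed:

    [tex]\langle f \rangle = \sum_r P_r\, f(r), \qquad f(r) = \ln P_r \quad\Longrightarrow\quad \langle \ln P \rangle = \sum_r P_r \ln P_r = H[/tex]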

    Having defended the correctness of the definition, I have to add that it isn't a very useful way of thinking of H. Take a very simple case, that of an ideal gas, as an example. One sums r over W equally probable microstates (P_r = 1/W), so the sum for a macrostate of the gas reduces to
    H = -lnW;
    multiplying by the constant -k gives exactly Boltzmann's entropy S=k*lnW. But how is thinking of H as the mean value of the log of the probability helpful or insightful? Instead, entropy reflects the number of possible ways W that a macrostate can be realized, and the second law ensures that the macrostate adopted in equilibrium is the one that can be realized in the greatest number of ways. To tie it to information theory, this is the maximum entropy state.
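
    If it helps, the equal-probability case can be checked numerically. The following is only a minimal sketch: the function name h_function, the choice W = 8, and the skewed distribution are made up for illustration. It evaluates H = SUM(P_r * ln(P_r)) directly, compares it with -ln W, and converts to S = -k*H.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K (exact SI value)


def h_function(probs):
    """Return H = sum_r P_r ln P_r; states with P_r = 0 contribute nothing."""
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")
    return sum(p * math.log(p) for p in probs if p > 0)


# W equally probable microstates (W = 8 is an arbitrary illustrative choice)
W = 8
uniform = [1.0 / W] * W
H_uniform = h_function(uniform)

# A non-uniform distribution over the same 8 states (hypothetical numbers)
skewed = [0.5, 0.2, 0.1, 0.05, 0.05, 0.05, 0.03, 0.02]
H_skewed = h_function(skewed)

# Entropy from S = -k * H; for the uniform case this equals k * ln W
S_uniform = -K_B * H_uniform

print("H (uniform):", H_uniform, " -ln W:", -math.log(W))
print("H (skewed): ", H_skewed)
print("S (uniform):", S_uniform, " k ln W:", K_B * math.log(W))
```

    For a fixed number of states the uniform distribution gives the smallest H, which is the largest entropy S, so the skewed case should come out with a larger (less negative) H.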

    Hope this helps.