I realize the Insight is intended as an overview of statistical mechanics as opposed to an introduction to it. However, I'll comment on the logical structure, since I've never found an introduction where it is presented clearly, and perhaps someday you'll write a textbook.
NFuller said:
My point was that since the subsystems are identical, if you relabel the points, one could not tell which points have been relabeled or which points have been moved. This allows for the a priori probability distribution ##p_{i}=\omega_{i}/\Omega_E##.
I don't see that the subsystems are "identical".
My interpretation: apparently the underlying probability space we are talking about has outcomes of the form "pick a subsystem". If we assume each subsystem has an equal probability of being selected, then it follows that the probability of the event "we select a subsystem in state ##i##" is ##\omega_i/\Omega_E##.
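As a sanity check on that reading, here is a minimal sketch with made-up occupation numbers (nothing here comes from the Insight): if each subsystem really is equally likely to be picked, the empirical frequency of "the selected subsystem is in state ##i##" settles near ##\omega_i/\Omega_E##.

```python
import random
from collections import Counter

# Hypothetical occupation numbers: omega[i] subsystems are in state i (toy values).
omega = [3, 5, 2]
Omega_E = sum(omega)  # total number of subsystems

# Label each subsystem by the index of the state it occupies.
subsystems = [i for i, w in enumerate(omega) for _ in range(w)]

# Outcome of the underlying probability space: "pick a subsystem" uniformly at random.
trials = 100_000
counts = Counter(random.choice(subsystems) for _ in range(trials))

for i, w in enumerate(omega):
    print(f"state {i}: empirical {counts[i] / trials:.3f}   omega_i/Omega_E = {w / Omega_E:.3f}")
```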
The conceptual difficulty in applying this to a particular physical system (e.g. gases) is that we are dealing with a finite set of points in phase space, and nothing is said about how these points are selected. If the goal is to model what happens in experiments, then we should imitate how an experiment randomly encounters a subsystem. For example, if we are modeling the event "a randomly selected person's last name begins with 'X'" by randomly selecting a person from a set of people, then we need a set of people whose last names are representative of the entire population.
Are we assuming that any "bias" in the finite set of subsystems is supposed to be overcome by the fact that we attain a representative sample as the number of subsystems under consideration approaches infinity? For a finite population, we would eventually achieve a representative sample because the sample would be the whole population. However, I don't see why taking more and more points in a (continuous) phase space necessarily guarantees a representative sample.
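To illustrate why more points alone may not settle the issue, here is a toy sketch (the setup is entirely my own assumption, not something in the Insight): the "true" phase space is ##[0,1)## with uniform measure, but the points are generated by a biased procedure, and the fraction of them landing in ##[0,0.5)## converges to the wrong value no matter how many points we take.

```python
import random

def biased_point():
    # A biased way of generating phase-space points on [0, 1):
    # the square root of a uniform variable has density 2x, so it over-samples the right half.
    return random.random() ** 0.5

# Under the uniform measure, the region [0, 0.5) has probability 0.5.
for n in (10**3, 10**5, 10**6):
    frac = sum(biased_point() < 0.5 for _ in range(n)) / n
    print(f"n = {n:>8}: fraction in [0, 0.5) = {frac:.3f}   (uniform measure gives 0.500)")
```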
NFuller said:
I defined equilibrium as a maximum entropy state.
Did you define equilibrium?
You wrote:
At equilibrium, the probability to be in a given state is expected to be constant in time.
Do we take that as the definition of equilibrium?
Justifying that the maximum entropy distribution gives the probabilities that would occur in a random measurement of a system at equilibrium seems to require an argument involving the limiting probabilities of a random process.
NFuller said:
The system may fluctuate between different microstates but these microstates must be consistent with a fixed macroscopic variables, i.e. pressure, volume, energy, etc. There is nothing which would lead us to believe that one microstate is favored over another.
Suppose Bob partitions ##\Omega_E## into 10 states ##b_1,b_2,...b_{10}## and Alice partitions ##\Omega_E## into 10 states ##a_1,a_2,...a_{10}## in a different manner. Setting ##p_{b_1} = \omega_{b_1}/\Omega_E = 1/10 ## might give a different model for a random measurement than setting ## p_{a_1} = \omega_{a_1}/\Omega_E = 1/10##. For example, Alice might have chosen to make her ##a_1## a proper subset of Bob's ##b_1##.
So something about the condition "these microstates must be consistent with a fixed macroscopic variables" needs to be invoked to prevent ambiguity. I don't see how this is done.
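To make the ambiguity in the Bob/Alice example concrete: since Alice's ##a_1## is a proper subset of Bob's ##b_1##, additivity forces ##P(b_1) = P(a_1) + P(b_1 \setminus a_1)## in any single probability model, so Bob's assignment ##P(b_1) = 1/10## and Alice's assignment ##P(a_1) = 1/10## can coexist only if the nonempty region ##b_1 \setminus a_1## carries zero probability. Nothing in the counting rule, as stated, tells us whether that is so.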
-----
Using Stirling's approximation, this can be written as ##S=k\left[\Omega_E\ln\Omega_E-\Omega_E-\sum_{i=1}^{l}\omega_i\ln\omega_i+\sum_{i=1}^{l}\omega_i\right]##
It's worth reminding readers that the error in Stirling's approximation for ##N!## approaches infinity as ##N## approaches infinity. So finding the limit of a function of factorials cannot, in general, be done by replacing each factorial with its Stirling approximation. For the specific case of multinomial coefficients, the replacement works. (If we encounter a situation where we are letting the number of factorials in the denominator of a multinomial coefficient approach infinity, I'm not sure what happens. For example, if we need to take a limit as we partition ##\Omega_E## into more and more ##\omega_i##, it might take some fancy analysis.)
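As a quick numerical check of both points, here is a sketch using made-up occupation fractions (the numbers are mine, not from the Insight): the absolute error of ##\ln N! \approx N\ln N - N## keeps growing with ##N##, yet the relative error in the Stirling-based expression for the log of a multinomial coefficient shrinks as ##\Omega_E## grows with the fractions ##\omega_i/\Omega_E## held fixed.

```python
import math

def ln_factorial(n):
    """Exact ln(n!) via the log-gamma function."""
    return math.lgamma(n + 1)

def stirling(n):
    """Crude Stirling form ln(n!) ~ n*ln(n) - n, as used in the Insight."""
    return n * math.log(n) - n if n > 0 else 0.0

# Assumed toy fractions for a 4-state partition; omega_i = fraction * Omega_E.
fractions = [0.1, 0.2, 0.3, 0.4]

for Omega_E in (10**2, 10**4, 10**6):
    omegas = [round(f * Omega_E) for f in fractions]
    # ln W for the multinomial coefficient W = Omega_E! / (omega_1! * ... * omega_l!)
    exact = ln_factorial(Omega_E) - sum(ln_factorial(w) for w in omegas)
    approx = stirling(Omega_E) - sum(stirling(w) for w in omegas)
    print(f"Omega_E = {Omega_E:>9}:  "
          f"abs. error in ln(Omega_E!) = {ln_factorial(Omega_E) - stirling(Omega_E):6.2f},  "
          f"rel. error in ln W = {(exact - approx) / exact:.2e}")
```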