Understanding the Uniform Probability Distribution in Statistical Ensembles

A. Neumaier · Apr 22, 2016

stevendaryl said:

you would need a nonsubjective notion of probability to make such a judgement.

Of course. But physics is based on an objective notion of probability defined as expected relative frequency - with expectations checkable by experiment within the standard statistical limits.

stevendaryl said:

Physics alone can't tell you anything about probabilities unless you know the initial conditions exactly.

This is simply false.

We never know the initial conditions exactly and nevertheless make very useful predictions using the physical laws and reliably collected data.

We know the probability for decay of all familiar radioactive substances objectively to a fairly high accuracy. We predict probabilities for the daily weather, and companies that depend on the weather pay a lot for accurate prognosis. We can calculate predictions for probabilities of quantum optics experiments to the point that we can reliably refute the Bell inequalities. And so on. All this is done using physics and slightly inaccurate knowledge to get objective (though a little approximate) probabilities.

Nowhere is the slightest use made of subjective probabilities.

Subjective judgments (and in particular subjective probabilities) have no place at all in physics. Their reasonable place is confined to making value judgments about the relevance or success likelihood of what we do, priority judgments about what we should do, choices about which physical system to study in which detail, which part of a scientific study to make public, etc. Every other use of subjectivity is - from the scientific point of view - a blunder.

bhobba · Apr 22, 2016

A. Neumaier said:

Subjective judgments (and in particular subjective probabilities) have no place at all in physics.

Hmmmmm. A Copenhagenist might argue that one.

I think Jaynes was a physicist.
http://bayes.wustl.edu/etj/prob/book.pdf

My view is it's malleable - chosen purely for utility.

Thanks
Bill

stevendaryl · Apr 22, 2016

A. Neumaier said:

This is simply false

No, it's simply true.

We never know the initial conditions exactly and nevertheless make very useful predictions using the physical laws and reliably collected data.

Subjective probability is used all the time to make useful and accurate predictions.

We know the probability for decay of all familiar radioactive substances objectively to a fairly high accuracy.

Once again, what I said was that to make objective probabilistic predictions in physics, you have to know the initial states. We don't know exactly the initial states of atoms. We make a guess, and that guess is good enough for most purposes.

We predict probabilities for the daily weather, and companies that depend on the weather pay a lot for accurate prognosis.

Subjective does not mean useless. Subjective probabilities can be used for useful and accurate predictions.

A. Neumaier · Apr 22, 2016

stevendaryl said:

Subjective probability is used all the time to make useful and accurate predictions.

Probabilities that lead to accurate predictions are objective, not subjective. For objectivity is what agrees with Nature.

With your use of the notion ''subjective'', everything physicists do, and all of science, is subjective, and the term (and its opposite ''objective'') loses its traditional meaning.

bhobba · Apr 22, 2016

A. Neumaier said:

Probabilities that lead to accurate predictions are objective, not subjective.

Bayesian inference - how does that fit? It can be done in a frequentist way but it's not natural.

Thanks
Bill

stevendaryl · Apr 22, 2016

A. Neumaier said:

Subjective judgments (and in particular subjective probabilities) have no place at all in physics

That's completely false. We can't make any predictions at all without making assumptions that are subjective. You have to assume that your theory is correct, in the first place. You have to assume that your measurement devices worked correctly. You have to assume that you've accounted for all the relevant causal effects. You have to assume that records of past measurements were accurately recorded. There are countless assumptions that everyone must make in order to do the simplest sort of reasoning in physics. Most of those assumptions are completely subjective. You can certainly try to check your assumptions by repeating your measurements, and double-checking everything, but it's subjective whether you've repeated things enough times, whether you've double-checked enough times.

It is impossible to get along in the world without subjective judgments.

A. Neumaier · Apr 22, 2016

stevendaryl said:

to make objective probabilistic predictions in physics, you have to know the initial states.

The S-matrix gives objective probabilities for the outcomes given the input. The input is very accurately known in collision experiments - so accurate that they can check whether the scattering predictions come true or would represent violations of the standard model.

stevendaryl · Apr 22, 2016

A. Neumaier said:

Probabilities that lead to accurate predictions are objective, not subjective.

Whether a prediction is "accurate" or not is subjective. You predict that a coin toss has a 50% chance of resulting in heads. You toss 100 coins, and get 53 heads. Was that an accurate prediction, or not? It's not 50%. At some point, you're going to make a subjective decision that your statistics agree close enough with your predictions, and then you'll declare the predictions accurate.
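The 53-out-of-100 example can be put in numbers. The following is a minimal sketch, assuming a fair-coin hypothesis and independent flips; the function names are hypothetical and chosen only for illustration. It computes how many standard deviations 53 heads lies from the expected 50, and the exact probability that a fair coin lands at least that far from 50. Where to draw the line for "close enough" is exactly the judgment the post refers to; the code only supplies the number being judged.

```python
from math import comb, sqrt


def two_sided_binomial_p_value(heads: int, flips: int, p: float = 0.5) -> float:
    """Probability, under head probability p, of a head count at least as far
    from the expected count as the observed one (exact binomial sum)."""
    expected = flips * p
    observed_gap = abs(heads - expected)
    total = 0.0
    for k in range(flips + 1):
        if abs(k - expected) >= observed_gap:
            total += comb(flips, k) * p**k * (1 - p) ** (flips - k)
    return total


def z_score(heads: int, flips: int, p: float = 0.5) -> float:
    """Deviation from the expected count in units of the binomial standard deviation."""
    return (heads - flips * p) / sqrt(flips * p * (1 - p))


z = z_score(53, 100)
p_value = two_sided_binomial_p_value(53, 100)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

With 100 flips the standard deviation is 5 heads, so 53 heads is a deviation of 0.6 standard deviations, well inside what a fair coin typically produces.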

stevendaryl · Apr 22, 2016

A. Neumaier said:

The S-matrix gives objective probabilities for the outcomes given the input.

The S-matrix makes asymptotic predictions: Some number of particles come in from infinity, where it's assumed that there are no interactions, collide and then the product particles go out to infinity. In the real world, we don't have particles coming in from infinity, and particles are always interacting. So to compare the S-matrix to actual experiments requires judgment. I claim that there is a subjective element to that judgment, inevitably.

stevendaryl · Apr 22, 2016

A. Neumaier said:

With your use of the notion ''subjective'', everything physicists do, and all of science, is subjective, and the term (and its opposite ''objective'') loses its traditional meaning.

It's a subjective judgment to call something objective. I know that's unsatisfying, but that's the way it is.

stevendaryl · Apr 22, 2016

stevendaryl said:

It's a subjective judgment to call something objective. I know that's unsatisfying, but that's the way it is.

I can see that this has gotten into a philosophical discussion about the meaning of probability and objectivity, and that's probably off-topic. So I will refrain from further replies on this topic.

A. Neumaier · Apr 22, 2016

bhobba said:

Bayesian inference - how does that fit? It can be done in a frequentist way but it's not natural.

Bayesian inference, if done in an objective manner, means accounting for prior information in the likelihood function in a roundabout way. One adds extra prior terms that reflect (in a frequentist interpretation) what would have been obtained from data equivalent to the assumed knowledge. If the assumed knowledge (i.e., the prior) is true knowledge, the resulting Bayesian prediction is more accurate than without the prior; if the prior represents prejudice only, the resulting Bayesian prediction is heavily biased towards the prejudice unless a huge amount of data is present to cancel it.

For example, the Kalman filter for updating a Gaussian probability model is Bayesian in form as the current model is updated each time an additional data set comes in. However, if one considers the whole data stream as the data, it can be seen (when started with an improper prior at time zero) to be an optimal model according to the purely frequentist Gauss-Markov theorem for the estimation of linear models. The same holds for REML (restricted maximum likelihood), which is in spirit Bayesian but can be fully treated in a purely frequentist framework.
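The simplest instance of this equivalence can be sketched in a few lines. The example below is an illustration under strong assumptions (a constant scalar state, independent measurement noise of equal variance, and hypothetical names and data), not the general filter. Starting the recursion from the first measurement is the limit of an infinitely wide (improper) prior, and the recursive estimate then coincides with the sample mean, which is the best linear unbiased estimator for this model in the Gauss-Markov sense.

```python
def recursive_mean_estimate(measurements, noise_var=1.0):
    """Scalar Kalman filter for a constant state x observed as z = x + noise.

    Initializing from the first measurement is the limit of an improper
    (infinitely wide) prior: the first gain is 1, so the prior is forgotten.
    """
    it = iter(measurements)
    estimate = next(it)   # posterior mean after the first update
    variance = noise_var  # posterior variance after the first update
    for z in it:
        gain = variance / (variance + noise_var)
        estimate = estimate + gain * (z - estimate)  # Bayesian-style update
        variance = (1.0 - gain) * variance
    return estimate, variance


data = [2.1, 1.9, 2.4, 2.0, 1.6]  # made-up measurements for illustration
kalman_estimate, kalman_variance = recursive_mean_estimate(data)
sample_mean = sum(data) / len(data)  # batch least-squares estimate
print(kalman_estimate, sample_mean, kalman_variance)
```

The update has the Bayesian form "old estimate plus gain times innovation", yet the final numbers are those of the batch least-squares fit to the whole data stream.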

Thus it is only a matter of presentation and subjective preference whether to take a Bayesian or a frequentist view. Bayesian statistics is not intrinsically related to a subjective view of probability. It is a mathematical technique that is used in statistical practice in a shut-up-and-calculate way like quantum mechanics in physical practice.

In case you think I might not understand what I am talking about: As part of my work at the University of Vienna, I regularly give courses on statistical data analysis. I have written a big survey article about regularization (the abstract version of Bayesian inference in linear models) in SIAM Review 40 (1998), 636-666. I have worked on the Bayesian (REML) estimation of large animal breeding models; algorithms based upon my work are used all over the world to decide on animal breeding.

bhobba · Apr 22, 2016

A. Neumaier said:

In case you think I might not understand what I am talking about:

You obviously do. The initial probability, how is that arrived at in a frequentist view?

Take for example a coin. You start with it at 50-50 then flip the coin to update. In a frequentist view why would you start at 50-50?

Thanks
Bill

A. Neumaier · Apr 22, 2016

stevendaryl said:

It's a subjective judgment to call something objective.

As everything is subjective according to your usage of the word, it is meaningless to apply the adjective to anything, as it has no discriminative value. Your usage is far from how everyone else uses the word.

Is there anything that, according to you, fully deserves being called objective?
If not, why do you think the language contains such a term?
Why is science generally considered to collect objective knowledge?

stevendaryl said:

the meaning of probability and objectivity, and that's probably off-topic. So I will refrain from further replies on this topic.

The topic is ''what is an ensemble?'' and this is essentially synonymous with ''what is probability?'' It has a large physical (objective) aspect and a small philosophical (subjective) aspect. You are pulling the weight fully to the subjective side, but this is your subjective bias.

Mentz114 · Apr 22, 2016

stevendaryl said:

But you're making the assumption that equal volumes in phase space are equally likely. I guess you could say that that's the way you're defining "likelihood", but why phase space? For a single particle, you could characterize the particle's state (in one-dimension, for simplicity) by the pair p, x, where p is the momentum. Or you could characterize it by the pair v, x, where v is the velocity. If you include relativistic effects, v is not linearly proportional to p, so equal volumes in p,x space don't correspond to equal volumes in v,x. So why should one be the definition of "equally likely" rather than the other?

Because physics is about phase and configuration space. Most of what you've been saying is off topic. You're moving the goalposts around wildly so I don't know what you are trying to say.

Have a look at this
https://en.wikipedia.org/wiki/Phase_space_formulation
and this
https://web.stanford.edu/~peastman/statmech/phasespace.html
and
http://arxiv.org/abs/1003.0772
and
http://www.springer.com/us/book/9780792337942
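The change-of-variables point in the quoted passage above can be made explicit. As a sketch for a single relativistic particle of mass ##m## in one dimension (the symbols ##\gamma## and ##c## are notation assumed here, not used in the posts):

$$p = m\,\gamma(v)\,v,\qquad \gamma(v) = \left(1-\frac{v^2}{c^2}\right)^{-1/2},\qquad \frac{dp}{dv} = m\,\gamma(v)^3 ,$$

so that

$$dp\,dx = m\,\gamma(v)^3\,dv\,dx .$$

A density that is uniform in ##(p,x)## therefore has weight proportional to ##\gamma(v)^3## in ##(v,x)##. In the nonrelativistic limit ##\gamma\to 1## the two volume elements differ only by the constant factor ##m##, and the two notions of "equal volume" agree.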

A. Neumaier · Apr 22, 2016

bhobba said:

The initial probability, how is that arrived at in a frequentist view?

Take for example a coin. You start with it at 50-50 then flip the coin to update. In a frequentist view why would you start at 50-50?

You wouldn't unless you have good reasons to assume that the coin is almost fair.

In both the frequentist and the Bayesian case one starts with a prior count ##H_0## of heads and ##T_0## of tails. Then you flip a number of times and find ##H## heads and ##T## tails. You update the frequencies and get ##H'=H_0+H## and ##T'=T_0+T##. Then you estimate the probability for heads as ##P_H:=H'/(H'+T')##.

Suppose one initially knows nothing at all, while in fact, unknown to everybody, someone prepared the coin so that both sides show heads, and the experimenters see only the result, not the act of falling. Then the Bayesian starts with the unwarranted assumption (using an allegedly ''uninformative prior'', but still a prejudice) that ##H_0=T_0>0## (with a value that depends on how strongly the prior is believed to be true), while the frequentist correctly puts ##H_0=T_0=0##. It takes the Bayesian estimate a long time to realize that the coin was forged, while the frequentist gets the answer correct from the start. This shows the bad influence of a prejudice. (A real person would soon be suspicious about the coin, but a true Bayesian, following objective shut-up-and-calculate techniques rather than being subjective, will be unable to do that.)

On the other hand, if the coin is known to be almost fair (because it looks like many other coins that have been tried before), both Bayesian and frequentist will assign ##H_0=T_0>0##. The frequentist does so by making a (somewhat subjective) estimate of how many equivalent coin flips the prior knowledge is worth, and checks during the computation whether the assumed estimate has a large effect on the result. (In technical terms, this is a regularization parameter. There are a number of ways this parameter can be objectively chosen under appropriate assumptions.) I have no idea how a true Bayesian would assign the actual value of ##H_0=T_0>0## since probability theory gives no hints. In practice, there is no difference between the two; it is shut-up-and-calculate according to recipes taken from the literature.

If there are enough data and the prior is not weighted too much, the result is indifferent to the value of the prior.
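The update rule ##P_H = H'/(H'+T')## described above is easy to try out. The following is a minimal sketch; the function name and the choice of ##H_0=T_0=5## for the prior counts are assumptions made only for illustration. It applies the rule to the double-headed coin, once without prior counts and once with them, using exact fractions.

```python
from fractions import Fraction


def estimate_heads_probability(heads, tails, prior_heads=0, prior_tails=0):
    """Return P_H = H' / (H' + T') with H' = H_0 + H and T' = T_0 + T."""
    total_heads = prior_heads + heads
    total_tails = prior_tails + tails
    if total_heads + total_tails == 0:
        raise ValueError("no data and no prior counts: the estimate is undefined")
    return Fraction(total_heads, total_heads + total_tails)


# Double-headed coin: every flip shows heads.
no_prior = estimate_heads_probability(10, 0)                 # H_0 = T_0 = 0
with_prior = estimate_heads_probability(10, 0, 5, 5)         # H_0 = T_0 = 5
with_prior_many = estimate_heads_probability(1000, 0, 5, 5)  # same prior, more data

print(no_prior, with_prior, with_prior_many)
```

After ten flips the estimate without prior counts is already 1, while the one with ##H_0=T_0=5## is 3/4; after a thousand flips the latter has moved to 201/202, which illustrates how enough data swamps a prior that is not weighted too heavily.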

A. Neumaier · Apr 22, 2016

bhobba said:

Jaynes was a physicist.

But he was mistaken about his subjective interpretation of physics. His interpretation only works because he knew already (from half a century of prior objective physics) which subjective assumptions he had to make to get it objectively correct. If he had assumed, in place of the subjective knowledge of ##\langle H\rangle## (which Nature happens to make use of), the subjective knowledge of ##\langle H^2\rangle## (which Nature abhors), he would have obtained in place of the canonical ensemble a ridiculously wrong ensemble. And even with the canonical ensemble, if he subjectively knew the wrong value of ##\langle H\rangle## (which is very well possible, since in a subjective, stevendaryl-type of physics no one specifies objectively what it means to have knowledge), then Jaynes would assign an equally wrong value for the temperature.

This proves that even in the context of the maximum entropy principle, only knowledge of the objectively correct information produces a reliable physical model and enables reliable physical predictions. Again, there is nothing subjective in the physics. Subjective deviations from the objective reality lead here (as always) to inaccurate or even grossly wrong predictions.
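The dependence of the resulting ensemble on the chosen constraint can be seen from the standard constrained-maximization argument. What follows is a sketch in the classical setting; the density ##\rho##, the phase space element ##d\Gamma##, the constraint values ##U## and ##W##, and the multipliers ##\alpha,\beta,\lambda## are notation assumed here. Maximizing the entropy subject to normalization and a prescribed ##\langle H\rangle##,

$$S[\rho] = -\int \rho\ln\rho\,d\Gamma,\qquad \int\rho\,d\Gamma = 1,\qquad \int\rho\,H\,d\Gamma = U,$$

gives the stationarity condition

$$-\ln\rho - 1 - \alpha - \beta H = 0 \quad\Longrightarrow\quad \rho = \frac{e^{-\beta H}}{Z},\qquad Z = \int e^{-\beta H}\,d\Gamma ,$$

which is the canonical ensemble. Prescribing ##\langle H^2\rangle## instead, ##\int\rho\,H^2\,d\Gamma = W##, the same steps give

$$\rho = \frac{e^{-\lambda H^2}}{Z'},\qquad Z' = \int e^{-\lambda H^2}\,d\Gamma ,$$

which is not of canonical form. The maximum entropy procedure itself does not say which of the two constraints to impose.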

Mentz114 · Apr 22, 2016

bhobba said:

The difference between probability and likelihood is exactly what? Please be precise. I think you will find it's very, very slippery, just as it is rather slippery to pin down exactly what a point is. That's why the axiomatic method was developed - it wasn't just so pure mathematicians could while away their time.

Thanks
Bill

bhobba,

I mean as in the likelihood function defined here.

https://en.wikipedia.org/wiki/Maximum_likelihood

A. Neumaier · Apr 22, 2016

Mentz114 said:

I mean as in the likelihood function defined here.

https://en.wikipedia.org/wiki/Maximum_likelihood

Then it is the logarithm of the probability density with respect to a prior measure. This is surely less fundamental than the notion of probability, which is independent of a prior measure.

stevendaryl · Apr 22, 2016

A. Neumaier said:

As everything is subjective according to your usage of the word, it is meaningless to apply the adjective to anything, as it has no discriminative value. Your usage is far from how everyone else uses the word.

Is there anything that, according to you, fully deserves being called objective?

No. I don't. I think that it's a short-cut in reasoning. To take into account all the ways that our judgments are influenced by unproved assumptions is intractable and inconvenient. So it's useful to be able to have cut-offs, where you treat sufficiently unlikely possibilities as if they were impossibilities. So the kind of reasoning that people typically do is a rule of thumb. It's subjective, but it's not consciously subjective.

Mentz114 · Apr 22, 2016

A. Neumaier said:

Then it is the logarithm of the probability density with respect to a prior measure. This is surely less fundamental than the notion of probability, which is independent of a prior measure.

I cannot (literally) argue against that. I was struck by a similarity to the path integral but that's probably spurious.

Ordinary folk, interestingly, have no idea of probability. One person I knew, after hearing there was a 40% chance of rain, asked '40% of what?'
What people experience is 'confidence', and they can express it as likelihood ratios or 'odds'.

stevendaryl · Apr 22, 2016

Mentz114 said:

Because physics is about phase and configuration space. Most of what you've been saying is off topic. You're moving the goalposts around wildly so I don't know what you are trying to say.

I'm sorry you feel that way. I'm just saying that volume in phase space is not the definition of likelihood. In certain circumstances, it's reasonable to assume that equal volumes in phase space imply equal likelihood, but that's an assumption - it's not the definition of likelihood.

Have a look at this
https://en.wikipedia.org/wiki/Phase_space_formulation
and this
https://web.stanford.edu/~peastman/statmech/phasespace.html
and
http://arxiv.org/abs/1003.0772
and
http://www.springer.com/us/book/9780792337942

I know what phase space is.

rubi · Apr 22, 2016

The reason for using the phase space probability density is ergodicity. Ergodicity is supposed to single out the microcanonical ensemble and the other ensembles can be derived from it. Unfortunately, it's too hard to prove ergodicity for even the simplest physical systems. Nevertheless, it's a reasonable assumption in most situations. So at least for ergodic systems, the microcanonical ensemble is a hard, objective prediction of the theory.

stevendaryl · Apr 22, 2016

rubi said:

The reason for using the phase space probability density is ergodicity. Ergodicity is supposed to single out the microcanonical ensemble and the other ensembles can be derived from it. Unfortunately, it's too hard to prove ergodicity for even the simplest physical systems. Nevertheless, it's a reasonable assumption in most situations. So at least for ergodic systems, the microcanonical ensemble is a hard, objective prediction of the theory.

Related to the ergodicity assumption is the assumption that ensemble average of a quantity is equal to the time average.

rubi · Apr 22, 2016

stevendaryl said:

Related to the ergodicity assumption is the assumption that ensemble average of a quantity is equal to the time average.

Right, this is the more physical way of stating the ergodic hypothesis. In modern mathematical language, one usually defines ergodicity as a requirement on the probability measure. The equality of time averages and ensemble averages then follows from the so-called ergodic theorems, for instance the Birkhoff ergodic theorem.
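A toy illustration of time average versus ensemble average: the sketch below uses the irrational rotation ##x \mapsto x + \alpha \bmod 1## on the unit interval, a textbook example of a map that is ergodic with respect to the uniform measure. It is not a physical system, and the observable, starting point, and names are chosen here only for illustration. For ##f(x)=\cos^2(2\pi x)## the ensemble average over the uniform measure is 1/2, and the time average along a single orbit approaches that value.

```python
from math import cos, pi, sqrt


def time_average(f, x0, alpha, steps):
    """Average f along the orbit x_n = (x0 + n * alpha) mod 1, n = 0..steps-1."""
    total = 0.0
    for n in range(steps):
        total += f((x0 + n * alpha) % 1.0)
    return total / steps


def observable(x):
    """f(x) = cos^2(2 pi x); its average over the uniform measure on [0, 1) is 1/2."""
    return cos(2 * pi * x) ** 2


alpha = (sqrt(5) - 1) / 2  # irrational rotation number (golden ratio conjugate)
orbit_average = time_average(observable, 0.1, alpha, 100_000)
ensemble_average = 0.5

print(orbit_average, ensemble_average)
```

For a rational rotation number the orbit is periodic and the time average generally depends on the starting point, which is the kind of failure of ergodicity alluded to later in the thread.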

A. Neumaier · Apr 22, 2016

rubi said:

Ergodicity is [...] a reasonable assumption in most situations.

... though it is in fact known to be wrong in many physically relevant cases. It thus only has heuristic value.

rubi · Apr 22, 2016

A. Neumaier said:

... though it is in fact known to be wrong in many physically relevant cases. It thus only has heuristic value.

Well, I agree that this issue hasn't been addressed in a fully satisfactory way yet. But at least ergodic theory gives some confidence in the validity of the microcanonical ensemble.

(Here's a side question that interests me: Do you know whether such systems that are known not to be ergodic are usually well described by the microcanonical ensemble in experiments nevertheless?)

A. Neumaier · Apr 22, 2016

rubi said:

Do you know whether such systems that are known not to be ergodic are usually well described by the microcanonical ensemble in experiments nevertheless?

Probably yes (if they are large and simple enough), since in the thermodynamic limit the ensemble is equivalent to the grand canonical ensemble. Working with the latter is much simpler, closer to the formulas used in the applications, needs much weaker assumptions, and works identically in the classical and in the quantum case.

N88 · Apr 22, 2016

Demystifier said:

Then let me use an example. Suppose that you flip a coin, but only ONCE. How would you justify that the probability of getting heads is ##p=1/2##? Would you use an ensemble for that?

Edited 'ones' to 'ONCE'.

Interesting question! In the context of your opening reply to the OP, see this interesting answer: http://arnold-neumaier.at/physfaq/topics/singleEvents

bhobba · Apr 22, 2016

Mentz114 said:

https://en.wikipedia.org/wiki/Maximum_likelihood

Got it.

However, probability concepts such as the maximum likelihood estimator are used throughout that link. I still suspect the whole thing is circular.

Thanks
Bill
