Demystifier said:
But what is probability then about? About anything that satisfies the axioms of probability? My view is that, if a set of axioms does not really capture the concept that people originally had in mind before proposing the axioms, then it is the axioms, not the concept, that need to be changed.
It's fair to say that the concept of probability that people originally had in mind involves a situation where there are several "possible" outcomes of some physical phenomenon, but only one of the "possible" outcomes "actually" occurs. The concept of probability associated with such a situation involves a "tendency" for certain outcomes to actually happen, a tendency that can be measured by a number, but with no absolute guarantee that this number will correspond to the observed frequencies of the outcomes that actually do happen. This is still how many people who apply probability theory think of probability.
However, such thoughts involve the complicated metaphysical concepts of "possible" as distinct from "actual". There is not yet any (well-known) system of mathematics that formalizes these metaphysical concepts and also provides anything useful for applications that the Kolmogorov approach doesn't already supply.
The Kolmogorov approach (measure theory) provides a reliable basis for proving theorems about probabilities. The price of this approach is that probability theory is essentially circular. We have theorems that say if certain probabilities are such-and-such then the probabilities of other things are so-and-so. Any interpretation of probability theory as a guarantee of what will actually happen is outside this theory. It falls under whatever field of science deals with the problem to which the theory is applied.
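To make the "circular" point concrete, here is a sketch of the standard Kolmogorov axioms and two theorems that follow from them. The symbols (a sample space \Omega, a collection of events \mathcal{F}, and a measure P) are the usual textbook notation, not something introduced earlier in this thread.

```latex
% Axioms for a probability measure P on events in \mathcal{F} over \Omega
\begin{align*}
  &\text{(1)}\quad P(A) \ge 0 \quad \text{for every } A \in \mathcal{F}, \\
  &\text{(2)}\quad P(\Omega) = 1, \\
  &\text{(3)}\quad P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} P(A_i)
      \quad \text{for pairwise disjoint } A_1, A_2, \ldots \in \mathcal{F}.
\end{align*}

% Typical consequences: probabilities in, probabilities out
\begin{align*}
  P(A^{c}) &= 1 - P(A), \\
  P(A \cup B) &= P(A) + P(B) - P(A \cap B).
\end{align*}
```

Both consequences have the form "if these probabilities are given, then those probabilities are determined". Nothing in them says which outcome will actually occur.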
It seems to me that in physics there is a long tradition of attempts to formulate theories of probability on the basis of actual frequencies of outcomes. For example, if we consider tossing a fair coin as a physical event, then such a theory would tell us to consider the "ensemble" of tossed coins. The ensemble must be an actual thing. It may involve all fair coins that have been tossed in the past and all that will be tossed in the future, and coins tossed on other planets etc. In this actual ensemble of fair coins there is an actual frequency of coins that have landed (or will land) heads. So this frequency is a specific number if the ensemble is finite. (If the ensemble isn't finite, we have more conceptual work to do.)
These ensemble theories do not explain taking independent samples from the ensemble unless we add further structure to the theory. (For example, why won't the sub-ensemble corresponding to one experimenter's tosses all come out heads?) So we need the ensemble to be distributed in space and time (e.g. among various labs and among various times-of-day) in some way that mimics the appearance of independent trials.
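As a rough illustration of the sub-ensemble problem, here is a toy Python sketch. Everything in it is an assumption made for the example: the ensemble size of 1000, the labels "H" and "T", the idea that one experimenter's tosses are a contiguous slice of the ensemble, and the helper names `frequency` and `sub_ensemble`. It only shows that the same finite ensemble, with the same overall frequency of heads, can be arranged so that one experimenter sees nothing but heads.

```python
import random

def frequency(outcomes):
    """Fraction of outcomes in the list that are heads ('H')."""
    return sum(1 for o in outcomes if o == "H") / len(outcomes)

def sub_ensemble(ensemble, start, size):
    """The tosses belonging to one (hypothetical) experimenter."""
    return ensemble[start:start + size]

# Hypothetical finite ensemble: 1000 tosses, exactly 500 of them heads.
N = 1000
ensemble_sorted = ["H"] * (N // 2) + ["T"] * (N // 2)

# The same tosses, arranged differently "in space and time".
rng = random.Random(0)  # fixed seed so the run is repeatable
ensemble_mixed = ensemble_sorted[:]
rng.shuffle(ensemble_mixed)

lab_sorted = sub_ensemble(ensemble_sorted, 0, 10)
lab_mixed = sub_ensemble(ensemble_mixed, 0, 10)

# The ensemble frequency is the same specific number in both arrangements.
print("ensemble frequency:", frequency(ensemble_sorted), frequency(ensemble_mixed))
# In the sorted arrangement the first experimenter's ten tosses are all heads.
print("first lab, sorted arrangement:", frequency(lab_sorted))
# In the mixed arrangement the first lab's frequency depends on the shuffle.
print("first lab, mixed arrangement:", frequency(lab_mixed))
```

The ensemble frequency alone does not distinguish the two arrangements. The extra structure, meaning how the outcomes are spread among experimenters, is what has to be added to the theory.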