# Bayesian Priors and relation with ignorance

Hi everyone. I am reading through these very interesting (in terms of topics) notes:
https://arxiv.org/abs/1807.05996
And so far I am at Section 5. The author does not seem to shy away from saying what is Bayesian and what is frequentist, which makes the distinction in applications quite clear. Section 5, however, brought up quite a few things I had never questioned before, because they felt intuitively correct to me.
For example, I always thought that a uniform prior $p(\mu)$ can reflect complete ignorance (I think I've also read it described as the maximum-entropy choice), since any interval $[\mu,\mu+d\mu]$ has the same probability as any other. Now, however, I am starting to doubt how sure one can be in making such a claim, because it is parametrization-dependent. I.e. if we say we are ignorant about $\mu^2$ by choosing a uniform prior in $\mu^2$, the prior is no longer uniform in $\mu$ (Sec 5.3, 2nd paragraph's last sentence). I think this happens because of the Jacobian that shows up when we transform variables. I find that quite unintuitive: "I have no idea what your temperature is, it can be anything between 33-46C. But that is because I chose to take your temperature as my parameter. I have prior preferences of what your temperature squared might be instead!".
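(A quick numerical sketch of the Jacobian effect described above; the 33-46C range is taken from the example, and the code itself is illustrative, not from the notes.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Uniform "ignorance" prior on temperature mu in [33, 46] C.
mu = rng.uniform(33.0, 46.0, size=1_000_000)

# Implied prior on nu = mu^2: the change of variables gives
# p(nu) = p(mu) * |d mu / d nu| = (1/13) * 1/(2*sqrt(nu)),
# which DECREASES with nu -- it is not uniform.
nu = mu**2

# Monte Carlo check: compare the mass in two equal-width nu bins.
lo, hi = 33.0**2, 46.0**2
mid = 0.5 * (lo + hi)
frac_low = np.mean((nu >= lo) & (nu < mid))
frac_high = np.mean(nu >= mid)

print(frac_low, frac_high)  # the lower bin carries more mass
```

So "flat in $\mu$" already expresses a definite (decreasing) preference over $\mu^2$, which is exactly the puzzle.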
Does anyone have a good explanation for that?
I think this is also somewhat related to Sec 5.5, although that one is specifically about "non-subjective" priors (such as the Jeffreys prior), which is also weird (I have to admit I didn't look through the references). "Weird" because we constructed the non-subjective priors to extract the least information (maximize our ignorance) from a specific measurement.


Orodruin
Staff Emeritus
Homework Helper
Gold Member
2021 Award
Indeed, a prior that is uniform in one set of parameters need not be uniform in some other set of parameters. In many cases you have to make a choice of what you consider uninformative based on some physical expectation. In some cases, there really are uniform priors that can be objectively taken to be more uninformative than others, such as that given by the Haar measure on a compact group.

Either way, a flat prior is still a prior; you can only try to make it as uninformative as possible. Likewise, the Fisher information matrix is essentially a metric on parameter space, and if you are using the Jeffreys prior you are essentially using the corresponding volume element to define the prior.
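(A small sketch of the invariance being alluded to here, using the Bernoulli model as a standard textbook example; the example itself is not from the thread.)

```python
import numpy as np

# Jeffreys prior for a Bernoulli parameter theta:
# Fisher information I(theta) = 1 / (theta * (1 - theta)),
# so p(theta) is proportional to sqrt(I(theta)).
def jeffreys_theta(theta):
    return np.sqrt(1.0 / (theta * (1.0 - theta)))

# Reparametrize phi = theta^2. Fisher information transforms as
# I(phi) = I(theta) * (d theta / d phi)^2, with d theta/d phi = 1/(2*sqrt(phi)).
def jeffreys_phi(phi):
    theta = np.sqrt(phi)
    dtheta_dphi = 1.0 / (2.0 * np.sqrt(phi))
    return jeffreys_theta(theta) * dtheta_dphi

# Pushing the phi-prior back to theta with the Jacobian |d phi / d theta| = 2*theta
# recovers the original Jeffreys prior exactly: the construction is
# parametrization-invariant, unlike "uniform".
theta = np.linspace(0.05, 0.95, 19)
back = jeffreys_phi(theta**2) * 2.0 * theta
print(np.allclose(back, jeffreys_theta(theta)))  # True
```

This is the sense in which the Jeffreys "volume element" sidesteps the Jacobian puzzle: whichever coordinates you start in, you end up with the same prior measure.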

Stephen Tashi
"I have no idea what your temperature is, it can be anything between 33-46C. But that is because I chose to take your temperature as my parameter. I have prior preferences of what your temperature squared might be instead!".
Does anyone have a good explanation for that?
Rigorous approaches to establishing a relationship between specific probability distributions and "rational" beliefs set down assumptions about rational behavior. Some approaches model rational behavior in gambling. For example, one may imagine a game where a player guesses the temperature of something and loses an amount proportional to the absolute difference between his guess and the actual temperature. This is a different game from one in which a player loses an amount proportional to the absolute difference between the square of his guess and the square of the actual temperature. So it isn't surprising that associating belief or ignorance about temperature (or its square) with some probability distribution can depend on a specific context.
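(To make the "different games, different answers" point concrete, here is a sketch using squared rather than absolute penalties; this variant is chosen only because its optimal guesses have simple closed forms, and the 33-46C belief is taken from the earlier example.)

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.uniform(33.0, 46.0, size=1_000_000)  # "ignorance" belief over temperature

# Two games with squared penalties:
#   minimize E[(g - T)^2]      -> optimal g = E[T]
#   minimize E[(g^2 - T^2)^2]  -> optimal g^2 = E[T^2], i.e. g = sqrt(E[T^2])
g_temp = T.mean()                    # about 39.5
g_temp_sq = np.sqrt((T**2).mean())   # about 39.68 -- a different guess

print(g_temp, g_temp_sq)
```

The same belief distribution prescribes different "best guesses" depending on which quantity the penalty is computed on, which is the behavioral face of the parametrization dependence.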

If you take the outlook that "ignorance" has a definition that is independent of any ideas about the consequences of one's actions then there is indeed the problem you mention. If we take "complete ignorance" to be merely an emotional state with no specific consequences then such ignorance does not imply a unique probability distribution to model it.

Dale
Mentor
2021 Award
My personal opinion is that uninformative priors are a little silly anyway. We rarely encounter a scenario where we truly have no information, and our prior needs to reflect the information available. I think that people use an uninformative prior because it is easier than actually forming a good informed prior.

My preference would be for the statistical community to spend more effort making recommendations on forming a good informed prior rather than forming an uninformative prior. With an informed prior, the choice of representation is less of an issue since, for example, it is understood that prior information on temperature will have a different distribution when transformed to temperature squared.
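(A sketch of that last point; the Normal(37, 0.5) "body temperature" prior is a hypothetical informed prior invented for illustration, not anything from the thread.)

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical informed prior on body temperature: Normal(37, 0.5) C.
mu, sigma = 37.0, 0.5
T = rng.normal(mu, sigma, size=1_000_000)

# The SAME information expressed on s = T^2 via change of variables:
# p(s) = p_T(sqrt(s)) * 1/(2*sqrt(s))   (T > 0 here, so one branch suffices)
def p_s(s):
    t = np.sqrt(s)
    gauss = np.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return gauss / (2.0 * t)

# Consistency check: Monte Carlo mass of T^2 in a window vs the
# midpoint-rule integral of the transformed density over the same window.
lo, hi = 36.0**2, 37.0**2
mc_mass = np.mean((T**2 >= lo) & (T**2 <= hi))
edges = np.linspace(lo, hi, 10_001)
mid_pts = 0.5 * (edges[:-1] + edges[1:])
quad_mass = np.sum(p_s(mid_pts)) * (edges[1] - edges[0])

print(mc_mass, quad_mass)  # agree closely
```

With an informed prior the Jacobian is not a puzzle at all: it is just bookkeeping that keeps the same information consistent across representations.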

stevendaryl
Staff Emeritus