Bayesian priors and their relation to ignorance

In summary, the conversation discusses the concept of uninformative priors in Bayesian statistics and their implications. The speakers bring up the issue of choosing a prior that reflects complete ignorance and how it can depend on the context and specific consequences of one's actions. They also mention the idea of forming informed priors based on available information rather than relying on uninformative priors. The conversation concludes with a discussion on the limitations of using a flat distribution for variables with infinite range and the concept of unbiased distribution in probability.
  • #1
ChrisVer
Hi everyone. I am reading through these very interesting (in terms of topics) notes:
https://arxiv.org/abs/1807.05996
And so far I am at Section 5. The author gives me the impression that they don't shy away from labelling what is Bayesian and what is frequentist, making the distinction in applications quite clear. Section 5, however, raised quite a few points I had never questioned, because they felt intuitively correct to me.
For example, I always thought that a uniform prior [itex]p(\mu)[/itex] can reflect complete ignorance (I think I've also read it described as the maximum-entropy choice), since any interval [itex][\mu,\mu+d\mu][/itex] has the same probability as any other. However, I am now starting to doubt how sure one can be in making such a claim, as it is parametrization-dependent. I.e. if we say we are ignorant about [itex]\mu^2[/itex] by choosing a uniform prior in [itex]\mu^2[/itex], this is no longer true for [itex]\mu[/itex] (Sec 5.3, 2nd paragraph's last sentence). I think this happens because of the Jacobian that shows up once we make a transformation of variables. I find that quite unintuitive: "I have no idea what your temperature is, it can be anything between 33-46C. But that is because I chose to take your temperature as my parameter. I have prior preferences about what your temperature squared might be instead!".
Does anyone have a good explanation for that?
I think this is also somewhat related to Sec 5.5, although that one is specific to the "non-subjective" priors (such as the Jeffreys prior), which is also weird (I have to admit I didn't look through the references). "Weird" because we constructed the non-subjective priors to obtain the least information (maximize our ignorance) out of a specific measurement.
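The parametrization dependence can be seen in a short simulation (a sketch using NumPy; the temperature range is replaced by the unit interval for simplicity). If we claim ignorance about [itex]\mu^2[/itex] by drawing it uniformly, the implied distribution of [itex]\mu[/itex] is [itex]p(\mu) = 2\mu[/itex] from the Jacobian, so it is not uniform:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Complete ignorance" about mu^2: draw mu^2 uniformly on (0, 1).
mu_sq = rng.uniform(0.0, 1.0, size=100_000)
mu = np.sqrt(mu_sq)

# If mu itself were uniform on (0, 1), its mean would be 1/2.
# Instead the Jacobian |d(mu^2)/d(mu)| = 2*mu gives p(mu) = 2*mu,
# whose mean is the integral of mu * 2*mu over (0, 1), i.e. 2/3.
print(round(mu.mean(), 2))   # ~0.67, not 0.5
```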
 
  • #2
Indeed, a prior that is uniform in one set of parameters need not be uniform in some other set of parameters. In many cases you have to make a choice of what you consider uninformative based on some physical expectation. In some cases, there really are uniform priors that can be objectively taken to be more uninformative than others, such as that given by the Haar measure on a compact group.

Either way, a flat prior is still a prior, you can just try to make it as uninformative as possible. Likewise, the Fisher information matrix is essentially a metric on parameter space and if you are using the Jeffreys prior you are essentially using the corresponding volume element to define the prior.
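As a concrete instance of the Fisher-information construction (a sketch, not taken from the notes): for a single Bernoulli(p) observation the Fisher information is ##I(p) = 1/(p(1-p))##, so the Jeffreys prior is proportional to ##1/\sqrt{p(1-p)}##, which is the Beta(1/2, 1/2) density with normalising constant ##\pi##:

```python
import numpy as np

# Jeffreys prior for a single Bernoulli(p) observation:
# Fisher information I(p) = 1/(p(1-p)); prior density ~ sqrt(I(p)).
p = np.linspace(1e-6, 1.0 - 1e-6, 2_000_001)
density = np.sqrt(1.0 / (p * (1.0 - p)))

# Trapezoidal rule: the normalising constant comes out near pi,
# identifying the Jeffreys prior with a Beta(1/2, 1/2) distribution.
norm = np.sum(0.5 * (density[1:] + density[:-1]) * np.diff(p))
print(round(norm, 2))   # ~3.14
```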
 
  • #3
ChrisVer said:
"I have no idea what your temperature is, it can be anything between 33-46C. But that is because I chose to take your temperature as my parameter. I have prior preferences of what your temperature squared might be instead!".
Does anyone have a good explanation for that?
Rigorous approaches to establishing a relationship between specific probability distributions and "rational" beliefs set down assumptions about rational behavior. Some approaches model rational behavior in gambling. For example, one may imagine a game where a player guesses the temperature of something and loses an amount proportional to the absolute difference between his guess and the actual temperature. This is a different game than one in which a player loses an amount proportional to the absolute difference between the square of his guess and the square of the actual temperature. So it isn't surprising that associating belief or ignorance about temperature (or its square) with some probability distribution can depend on a specific context.

If you take the outlook that "ignorance" has a definition that is independent of any ideas about the consequences of one's actions then there is indeed the problem you mention. If we take "complete ignorance" to be merely an emotional state with no specific consequences then such ignorance does not imply a unique probability distribution to model it.
 
  • #4
My personal opinion is that uninformative priors are a little silly anyway. We rarely encounter a scenario where we truly have no information, and our prior needs to reflect the information available. I think that people use an uninformative prior because it is easier than actually forming a good informed prior.

My preference would be for the statistical community to spend more effort making recommendations on forming a good informed prior rather than forming an uninformative prior. With an informed prior, the choice of representation is less of an issue since, for example, it is understood that prior information on temperature will have a different distribution when transformed to temperature squared.
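One standard way to encode an informed prior is a conjugate family; here is a minimal sketch with made-up numbers, using a Beta prior on a success rate updated with binomial data:

```python
# Informed Beta(a, b) prior on a success rate (hypothetical numbers):
# the prior already leans toward a high rate.
a, b = 8.0, 2.0
successes, failures = 3, 7   # new data pull the other way

# Beta-Binomial conjugacy: posterior is Beta(a + successes, b + failures).
a_post, b_post = a + successes, b + failures

prior_mean = a / (a + b)                 # 8/10  = 0.8
post_mean = a_post / (a_post + b_post)   # 11/20 = 0.55
print(prior_mean, post_mean)
```

The posterior mean lands between the prior belief and the raw data rate, which is exactly the compromise an informed prior is supposed to express.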
 
  • #5
When it comes to a variable with infinite range, it's clear that there is no way to be unbiased in a probability distribution. You can't allow every natural number to be equally likely, for example. But any continuous variable with a finite range can be mapped to a variable with an infinite range. (Or the other way around---any continuous variable with an infinite range can be mapped to a finite range). So I think that the idea that a flat distribution on the range ##0 \leq x \leq 1## is unbiased is a little illusory.
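To illustrate the mapping point (a sketch I am adding, not from the post): send a flat variable on ##(0, 1)## to the whole real line via the logit; the induced density is the logistic distribution, peaked at 0 rather than flat:

```python
import math

# Map x uniform on (0, 1) to the real line via y = log(x / (1 - x)).
# The induced density on y is logistic: p(y) = e^y / (1 + e^y)^2,
# peaked at 0 and decaying in the tails -- so "flat on (0, 1)" does
# not translate into flatness on the infinite range.
def logistic_pdf(y):
    return math.exp(y) / (1.0 + math.exp(y)) ** 2

print(logistic_pdf(0.0))   # 0.25 at the centre
print(logistic_pdf(4.0))   # ~0.018 in the tail: clearly not uniform
```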

It seems that you don't have that problem with a discrete, finite set of possibilities: you can just assign each the probability ##1/N##, where ##N## is the number of possibilities. But maybe it's illusory to call even that unbiased, for the following reason:

When it comes to applications of probability, it's often (most of the time?) the case that you have a fixed finite number of possibilities only because you're lumping distinct possibilities into the same bin. A different way of lumping would have resulted in a different number of possibilities, and so the idea of "unbiased" distribution (all possibilities equally likely) would give rise to different probabilities.

For a sort of silly example: if I randomly pick a person out of a population, I could choose two bins: bald versus not-bald. The unbiased weighting would say that there is a 50% chance of picking someone who is bald. But suppose instead I chose the following bins: bald vs. red-haired vs. black-haired vs. blond vs. brown-haired vs. gray-haired. With this collection of bins, it seems that the unbiased weighting gives only a 16.7% chance of picking a bald person.

So I think that the impossibility of getting an objectively unbiased distribution on a continuous variable is not anomalous---in practice, there's no way of getting an unbiased distribution on a finite set of possibilities, either.
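The binning argument above can be written out directly (exact fractions, same hypothetical bins as in the example):

```python
from fractions import Fraction

# "Unbiased" = equal weight per bin, so the implied probability of
# "bald" depends entirely on how many bins were drawn.
bins_coarse = ["bald", "not bald"]
bins_fine = ["bald", "red", "black", "blond", "brown", "gray"]

p_bald_coarse = Fraction(1, len(bins_coarse))   # 1/2
p_bald_fine = Fraction(1, len(bins_fine))       # 1/6, about 16.7%
print(p_bald_coarse, p_bald_fine)
```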
 

1. What are Bayesian priors?

Bayesian priors are a fundamental concept in Bayesian statistics. They represent our beliefs or assumptions about a particular parameter or variable before we collect any data. They are used to update our knowledge and make predictions based on new evidence.

2. How are Bayesian priors related to ignorance?

Bayesian priors are closely related to ignorance because they represent our lack of knowledge about a particular parameter or variable. They are a way to incorporate our uncertainty into the statistical analysis and make more informed decisions.

3. Are Bayesian priors always subjective?

No, Bayesian priors can be either subjective or objective. Subjective priors encode an individual's beliefs or expert knowledge about the parameter, while objective (or "non-subjective") priors are derived from formal rules, such as the Jeffreys prior or maximum-entropy constructions, rather than from personal judgment.

4. How do Bayesian priors affect the outcome of a Bayesian analysis?

Bayesian priors play a crucial role in Bayesian analysis as they influence the posterior distribution, which is the updated belief about the parameter or variable after considering both the prior knowledge and the new data. The choice of priors can significantly impact the results and conclusions of a Bayesian analysis.

5. Can Bayesian priors be updated?

Yes, Bayesian priors can be updated with new data using Bayes' theorem. This allows us to continuously refine our beliefs and make more accurate predictions as we gather more information.
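A minimal sketch of this updating (hypothetical counts, conjugate Beta-Binomial model): the posterior after one batch of data serves as the prior for the next, and the result matches updating on all the data at once:

```python
# Sequential Bayesian updating with a conjugate Beta prior.
a, b = 1.0, 1.0                # start from a flat Beta(1, 1) prior
batches = [(4, 6), (7, 3)]     # (successes, failures) per batch

for s, f in batches:
    a, b = a + s, b + f        # each posterior becomes the next prior

# Updating batch by batch agrees with one update on the pooled data.
all_s = sum(s for s, _ in batches)
all_f = sum(f for _, f in batches)
assert (a, b) == (1.0 + all_s, 1.0 + all_f)
print(a, b)   # Beta(12, 10)
```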
