Understand the Importance of Reference Priors for Signal Search


Discussion Overview

The discussion centers on the concept of reference priors in Bayesian analysis, particularly in the context of signal searches. Participants explore the role and importance of reference priors, the implications of different prior distributions for the posterior, and the nuances of Bayesian inference.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant seeks an intuitive understanding of reference priors, questioning their importance in updating beliefs based on observations.
  • Another participant critiques the formulation of Bayes' theorem presented, emphasizing the distinction between probability densities and probabilities.
  • There is a suggestion to explore simple finite cases to illustrate how different priors can affect the posterior, particularly when data is limited.
  • One participant notes that the choice of prior can significantly impact results when observations are scarce, while abundant data may mitigate the influence of the prior.
  • There is a discussion about the conventions in notation, with emphasis on the distinction between cumulative probabilities and probability densities.

Areas of Agreement / Disagreement

Participants express differing views on the formulation of Bayes' theorem and the implications of prior choices. The discussion remains unresolved regarding the best understanding and application of reference priors.

Contextual Notes

Participants highlight potential confusion in terminology and notation, particularly regarding the use of probability and density functions. There are also unresolved questions about the impact of prior distributions on posterior outcomes.

ChrisVer
Science Advisor
Hi, a very basic question: what is a good intuitive way to understand the importance of a reference prior, in the context of a signal search?
Below, I also try to give the way I understand the approach in a Bayesian analysis (roughly):
1. You have your likelihood model ##L = p(x_{obs} \mid \lambda, b + \mu s)##, with the expected background and signal events ##b, s##, and several priors ##\pi## (for your parameter of interest ##\mu## and for nuisance parameters ##\lambda## such as the uncertainties).
2. The posterior pdf is what you need in order to study the parameter of interest ##\mu##. By Bayes' theorem, this is:
$$p(\mu, \lambda \mid x_{obs}) = \frac{L \, \pi(\mu, \lambda)}{p(x_{obs})}$$
The denominator can often be dropped, since it only ensures that the pdf is correctly normalized. I also take ##\pi(\mu,\lambda) = \pi(\mu)\,\pi(\lambda)##, which says that the two parameters are independent a priori.
3. You run several experiments, and from the outcome of each experiment you "update" your knowledge of the parameter of interest. That is, in the end you build up the posterior probability, once you integrate out the extra dimensions from the nuisance parameters:
$$p(\mu \mid x_{obs}) \propto \int d\lambda \; p(\mu \mid x_{obs}, \lambda)\, \pi(\lambda)$$
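As a sanity check, steps 1-3 can be run numerically for a toy counting experiment (all numbers below are hypothetical, not from the thread): a Poisson likelihood with expected count ##b + \mu s##, a Gaussian prior on the background nuisance parameter, a flat prior on ##\mu##, and grid marginalization.

```python
import numpy as np
from math import factorial

# Hypothetical toy numbers: expected signal s, nominal background b0 with
# uncertainty sigma_b (the nuisance parameter), and one observed count.
s, b0, sigma_b = 3.0, 5.0, 1.0
n_obs = 8

mu_grid = np.linspace(0.0, 5.0, 201)                  # parameter of interest
b_grid = np.linspace(b0 - 4 * sigma_b, b0 + 4 * sigma_b, 201)

# Priors: flat pi(mu) (constant, drops out of the ratio), Gaussian pi(lambda)
# for the background; an unnormalized prior is fine since we normalize later.
pi_b = np.exp(-0.5 * ((b_grid - b0) / sigma_b) ** 2)

# Likelihood L = p(n_obs | lambda, b + mu*s): Poisson pmf on the (mu, b) grid.
lam = mu_grid[:, None] * s + b_grid[None, :]
L = np.exp(-lam) * lam ** n_obs / factorial(n_obs)

# Marginalize the nuisance parameter, then normalize -> posterior pdf for mu.
post = (L * pi_b[None, :]).sum(axis=1)
post /= post.sum() * (mu_grid[1] - mu_grid[0])

print("posterior mode:", mu_grid[np.argmax(post)])    # near (n_obs - b0) / s
```

The posterior mode lands near ##(n_{obs} - b_0)/s = 1##, as the counting intuition suggests.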

So far I think I understand everything, though perhaps with some misconceptions that could be pointed out. When one starts speaking about reference priors, however, I am somewhat lost. Based on a few searches, I think the main goal of the reference prior is to maximize the expected difference between the posterior and the prior, i.e. to make the prior as uninformative as possible. However I don't quite understand why that is important, as:
"I can put in any prior I like (with reasonable limitations), and it's up to the observation/experiment to tell me how it evolves with the extra information. By reasonable I mean, for example, that it can't be zero in regions where the posterior is non-zero (as Bayes' theorem would then give zero for the posterior)."
How could different choices of prior end up updated to different distributions for the posterior?
 
ChrisVer said:
Hi, a very basic question: what is a good intuitive way to understand the importance of a reference prior, in the context of a signal search?
Below, I also try to give the way I understand the approach in a Bayesian analysis (roughly):
1. You have your likelihood model ##L = p(x_{obs} \mid \lambda, b + \mu s)##, with the expected background and signal events ##b, s##, and several priors ##\pi## (for your parameter of interest ##\mu## and for nuisance parameters ##\lambda## such as the uncertainties).
2. The posterior pdf is what you need in order to study the parameter of interest ##\mu##. By Bayes' theorem, this is:
$$p(\mu, \lambda \mid x_{obs}) = \frac{L \, \pi(\mu, \lambda)}{p(x_{obs})}$$
The denominator can often be dropped, since it only ensures that the pdf is correctly normalized. I also take ##\pi(\mu,\lambda) = \pi(\mu)\,\pi(\lambda)##, which says that the two parameters are independent a priori.

Your number 2 is wrong by any standard I'm aware of. If you are using a probability density function, then it should be something like ##f_{\mu\vert X}(\cdot \vert x_{obs}) ##. Crucially this is a probability density, not a probability. (Look to the CDF for the probability.)
ChrisVer said:
...
How could different choices of prior end up updated to different distributions for the posterior?

There are a lot of different issues here. In some sense, this is the whole point of Bayesian Inference.

Have you tried working out some very simple finite cases? I would almost always start with finite, then consider countably infinite, and after all that maybe consider the continuous / uncountable case.

E.g. suppose you have a coin that is either 50:50, 70:30, or 90:10 heads:tails. Now suppose you run 5 trials and the results are ___.
Now suppose instead you run 50,000 trials and the results are ___. Try working this through with a uniform prior versus something heavily skewed toward the 50:50 case. You should be able to clearly see a big impact from the choice of prior in the former case.
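A minimal sketch of that exercise, filling in hypothetical trial outcomes since the blanks above are left open: three hypotheses for P(heads), two priors, and Bayes' theorem over the finite hypothesis space, computed in log-space so 50,000 trials don't underflow.

```python
import math

# Three hypotheses for P(heads) and two priors: uniform, and heavily
# skewed toward the fair coin. (Trial outcomes below are hypothetical.)
hypotheses = [0.5, 0.7, 0.9]
priors = {"uniform": [1 / 3, 1 / 3, 1 / 3], "skewed": [0.98, 0.01, 0.01]}

def posterior(prior, heads, tails):
    # Bayes' theorem over a finite hypothesis space, in log-space so that
    # likelihoods for tens of thousands of flips don't underflow.
    logw = [math.log(p) + heads * math.log(h) + tails * math.log(1 - h)
            for p, h in zip(prior, hypotheses)]
    m = max(logw)
    w = [math.exp(lw - m) for lw in logw]
    total = sum(w)
    return [x / total for x in w]

# 5 trials (say 4 heads): the two priors give very different posteriors.
for name, prior in priors.items():
    print(name, [round(x, 3) for x in posterior(prior, 4, 1)])

# 50,000 trials from a 70:30 coin: the data overwhelm both priors.
for name, prior in priors.items():
    print(name, [round(x, 3) for x in posterior(prior, 35_000, 15_000)])
```

With 5 flips the skewed prior still dominates (it keeps most of its mass on the fair coin), while after 50,000 flips both priors end up concentrated on the 70:30 hypothesis.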

Loosely speaking: the prior has a big impact if you don't have many observations. If you have a lot of data, then the data can 'overwhelm the prior', and the choice of prior has minimal impact on the posterior -- except that if you zero something out as impossible in the prior, there is no opportunity to overwhelm it. (There are a lot of subtleties in the continuous case, though. A lot of people equate zero probability with impossibility -- and that is in general wrong.)
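The zero-prior caveat can be seen in the same finite setting (hypothetical numbers again): a hypothesis given prior weight exactly zero stays at zero no matter what the data say.

```python
# Finite coin example again, with a prior that rules out the 70:30 coin.
hypotheses = [0.5, 0.7, 0.9]
prior = [0.5, 0.0, 0.5]

def update(prior, heads, tails):
    # Plain Bayes update; zero prior weight times any likelihood is zero.
    w = [p * h ** heads * (1 - h) ** tails for p, h in zip(prior, hypotheses)]
    total = sum(w)
    return [x / total for x in w]

# Data actually drawn from a 70:30 coin (70 heads in 100 flips) cannot
# revive the zeroed-out hypothesis; the posterior piles onto the
# least-bad surviving hypothesis instead.
print(update(prior, 70, 30))
```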
 
StoneTemplePython said:
Your number 2 is wrong by any standard I'm aware of.
How so? I mean it's Bayes' theorem and I called them pdfs not probabilities (?)
 
ChrisVer said:
How so? I mean it's Bayes' theorem and I called them pdfs not probabilities (?)

This may be a superficial issue. ##P## seems to always refer to cumulative probability (read: from a CDF) and ##p## to probability at a point (read: from a PMF, a transition matrix, and such).

Put differently: by any standard I'm aware of, ##p## (and ##\mathrm{pr}##) is reserved for probabilities, not densities. This may just be a convention, but I also see a lot of people confuse densities with probabilities early on, so I'm fairly convinced that the convention is useful.
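A quick illustration of why the distinction matters, using a hypothetical uniform distribution: a density value can exceed 1, so it cannot itself be a probability; probabilities come from integrating the density, i.e. from the CDF.

```python
# Uniform distribution on [0, 0.1]: the density is 10 on its support.
def pdf(x, a=0.0, b=0.1):
    return 1.0 / (b - a) if a <= x <= b else 0.0

def cdf(x, a=0.0, b=0.1):
    # Cumulative probability P(X <= x), clipped to [0, 1].
    return min(max((x - a) / (b - a), 0.0), 1.0)

print(pdf(0.05))               # 10.0 -- a density value, not a probability
print(cdf(0.05) - cdf(0.0))    # 0.5  -- an actual probability, P(0 < X <= 0.05)
```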
 
