  • #1
Confessions of a moderate Bayesian part 3
Read part 1: How to Get Started with Bayesian Statistics
Read part 2: Frequentist Probability vs Bayesian Probability
Bayesian statistics by and for non-statisticians
Background
One of the things that I like about Bayesian statistics is that it rather closely matches the way that I think about science, although of course in a more formal manner. Also, many of the more important concepts of scientific philosophy arise naturally and are enforced naturally when using Bayesian methods. I have a decidedly anti-philosophy philosophy of science, so I like being able to reduce scientific philosophy to features of a...

Continue reading...
 
Likes WWGD, weirdoguy, madness and 2 others
  • #2
I think this may be my last one for a while. I have a little idea about one more that would focus on using Bayesian and frequentist concepts and methods together. But currently the only example I can think of is justifying the usual misinterpretation of a confidence interval using both Bayesian and frequentist probabilities together.

@Greg Bernhardt it looks like you are still listed as the author here in the forums even though it is fixed on the Insights page
 
Likes Greg Bernhardt
  • #3
Dale said:
@Greg Bernhardt it looks like you are still listed as the author here in the forums even though it is fixed on the Insights page
Yeah, there was a glitch during publishing, and now Cloudflare is blocking my access to the database I need in order to fix it. I'll work on it later today.
 
Likes Dale
  • #5
I think another problem with non-informative priors is that they are not well-defined for continuous variables, since a smooth change of variables may take a "non-informative" prior into an "informative" one.
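A concrete sketch of that (my own illustration, not from the Insight): take ##\theta\sim\text{Uniform}(0,1)## as a "non-informative" prior and reparameterize with the smooth map ##\phi=\theta^2##. The induced density is
$$p(\phi)=\left|\frac{d\theta}{d\phi}\right|=\frac{1}{2\sqrt{\phi}},\qquad \phi\in(0,1),$$
which is far from flat, so the prior that was "non-informative" for ##\theta## is quite informative about ##\phi##.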
 
  • #6
atyy said:
I think another problem with non-informative priors is that they are not well-defined for continuous variables, since a smooth change of variables may take a "non-informative" prior into an "informative" one.
Yes, that was a problem, but there has been some well-recognized work on that by Harold Jeffreys. He came up with a non-informative prior that is invariant under coordinate transformations. But even with Jeffreys' prior you can still have cases where it cannot be normalized.

I think his approach is well received at this point, but my opinion is still that non-informative priors should generally be avoided.
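For reference, the standard sketch of why Jeffreys' prior is reparameterization-invariant: it is defined as ##\pi(\theta)\propto\sqrt{I(\theta)}##, with ##I(\theta)## the Fisher information. Under a smooth reparameterization ##\phi=h(\theta)## the Fisher information transforms as
$$I(\phi)=I(\theta)\left(\frac{d\theta}{d\phi}\right)^2,$$
so ##\sqrt{I(\phi)}=\sqrt{I(\theta)}\left|\frac{d\theta}{d\phi}\right|##, which is exactly the ordinary change-of-variables rule for a probability density. Both parameterizations therefore describe the same prior.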
 
Likes atyy
  • #7
atyy said:
I think another problem with non-informative priors is that they are not well-defined for continuous variables, since a smooth change of variables may take a "non-informative" prior into an "informative" one.

Under a smooth change of variables in frequentist statistics, important things may not go smoothly. For example, if ##f(x_1,x_2,\dots,x_n)## is a formula for an unbiased estimate of the variance of a random variable, then ##\sqrt{f(x_1,x_2,\dots,x_n)}## need not be an unbiased estimator for the standard deviation of the same random variable. In picking techniques for estimation, it matters whether you want a "good" estimate of a parameter versus a "good" estimate of some function of that parameter. So it isn't surprising that a measure of information applied to a prior distribution for a parameter may give a different value when applied to a prior distribution for some function of that parameter.
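A minimal numerical sketch of this (my own example; the normal distribution, sample size, and constants are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0              # true standard deviation
n, trials = 5, 200_000   # small n makes the bias visible

# n normal samples per trial
x = rng.normal(0.0, sigma, size=(trials, n))

# ddof=1 gives the unbiased 1/(n-1) variance estimator
s2 = x.var(axis=1, ddof=1)

print("mean of s^2:", s2.mean())           # ~= sigma^2 = 4.0 (unbiased)
print("mean of s:  ", np.sqrt(s2).mean())  # < sigma = 2.0 (biased low)
```

The square root of an unbiased variance estimator systematically underestimates ##\sigma## because the square root is concave (Jensen's inequality).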
 
Likes atyy and Dale
  • #8
Stephen Tashi said:
Under a smooth change of variables in frequentist statistics, important things may not go smoothly.
One of the nice features of Bayesian methods that I did not address is related to this.
Stephen Tashi said:
if ##f(x_1,x_2,\dots,x_n)## is a formula for an unbiased estimate of the variance of a random variable, then ##\sqrt{f(x_1,x_2,\dots,x_n)}## need not be an unbiased estimator for the standard deviation of the same random variable.
If you have random samples of the posterior distribution of the variance, then the square roots of those samples are random samples of the posterior distribution of the standard deviation. The same holds for any function of the posterior samples.
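A sketch of what that looks like in practice, using a conjugate normal model with known mean and made-up hyperparameters (none of this is from the Insight):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: normal with known mean 0 and unknown variance
data = rng.normal(0.0, 2.0, size=30)

# Inverse-gamma prior on the variance (illustrative hyperparameters)
a0, b0 = 2.0, 2.0

# Conjugate update for a normal likelihood with known mean:
a_n = a0 + len(data) / 2
b_n = b0 + np.sum(data**2) / 2

# Posterior samples of the variance: inverse-gamma via reciprocal of a gamma
var_samples = 1.0 / rng.gamma(shape=a_n, scale=1.0 / b_n, size=100_000)

# Posterior samples of the standard deviation are just the square roots
sd_samples = np.sqrt(var_samples)

print("posterior mean of sigma^2:", var_samples.mean())
print("posterior mean of sigma:  ", sd_samples.mean())
```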
 
  • #9
This paper uses both Frequentist and Bayesian error analysis - IIRC they use 2 different priors: https://arxiv.org/abs/astro-ph/9812133

I like the huge amount of philosophy in these posts despite @Dale's aversion to interpretive issues :wink:
 
Likes Dale
  • #10
atyy said:
This paper uses both Frequentist and Bayesian error analysis - IIRC they use 2 different priors: https://arxiv.org/abs/astro-ph/9812133
That seems like a good use of the technique. Most of my papers with a Bayesian analysis also had a frequentist analysis. But recently I have had a couple that just went pure Bayesian.

atyy said:
I like the huge amount of philosophy in these posts despite @Dale's aversion to interpretive issues :wink:
I must admit that I felt a little embarrassed to write so much philosophy. But in the end I went ahead and did it anyway.

I know that there is an interpretation of QM based on Bayesian probability, but I honestly don’t have an informed opinion on it. So I won’t have a post about that for the foreseeable future.
 
Likes atyy
  • #11
This video might be interesting to people who read your Bayesian insights.
It would be great if you could take a look at it and share your thoughts:

"The medical test paradox: Can redesigning Bayes rule help?"

 
Likes Dale
  • #12
Swamp Thing said:
It would be great if you could take a look at it and share your thoughts:
Yes, I thought it was a very good video. Well done and largely accurate.

I mentioned this form of Bayes' theorem in the section on the strength of evidence. I didn't go into detail, but I actually prefer the odds form of Bayes' theorem for doing simple "by hand" computations. It can be made even simpler by using log(odds), though developing an intuitive feel for logarithmic scales is challenging.
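For readers who haven't seen it, the odds form is
$$O(H\mid E)=O(H)\times\frac{P(E\mid H)}{P(E\mid \neg H)},$$
posterior odds equal prior odds times the Bayes factor. With illustrative numbers of the kind used in the video (1-in-1000 prevalence, 90% sensitivity, 9% false-positive rate): prior odds ##1:999##, Bayes factor ##0.90/0.09=10##, posterior odds ##10:999\approx 1:100##, so a positive test still leaves only about a 1% probability of disease.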

Another place that this form can be valuable is in assessing legal evidence. But some courts have ruled against it for rather uninformed reasons. Being a legal expert doesn’t make you a statistical expert.
 
  • #13
I tried to post a comment on the article, but it said "You must be logged in to post a comment". I was already logged in, but I clicked on the link anyway, and it just took me back to the article.
So, here's my comment:

Dale said:
"we can calculate ##P(X\in[0.49,0.51]\mid E)=0.969##. Meaning that even though the coin is probably not exactly fair it is also probably close enough to fair to be considered equivalent for all practical purposes."

I'm not following the argument here. We could calculate ##P(X\in[0,1]\mid E)=1## and argue any observed frequency is fair. Don't we need to penalise the expansion of the range to include 0.5? I.e. increase p?
 
  • #14
haruspex said:
We could calculate ##P(X\in[0,1]\mid E)=1## and argue any observed frequency is fair.
This is not a pure calculation. Earlier I had defined ##X\in[0.49,0.51]## as being practically equivalent to a fair coin. This is a concept known as a ROPE (region of practical equivalence). So in this case a coin that is unfair by less than 0.01 is practically fair. That is a judgement call.

A ROPE is not something that is derived mathematically. It is something based on practical knowledge, or in the medical context on clinical judgement. I doubt that anyone would seriously consider a coin with ##X=0.99## to be practically fair. So you could calculate ##P(X\in[0,1]\mid E)## but I don't think many people would agree that it is a ROPE for ##X=0.5##.

A ROPE allows one to provide evidence in support of a null hypothesis. With standard significance testing you can only reject a null hypothesis, never accept it. But if your posterior distribution is almost entirely inside a ROPE around the null hypothesis then you can accept the null hypothesis.

A similar concept is used in the testing of equivalence or non-inferiority with frequentist methods.
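As a sketch of how such a ROPE probability can be computed with a conjugate beta-binomial model (the flip counts here are made up for illustration, not the numbers behind the 0.969 in the Insight):

```python
from scipy.stats import beta

# Hypothetical coin-flip data: h heads in n flips, uniform Beta(1, 1) prior
h, n = 5000, 10000

# Beta posterior parameters for the heads probability X
a, b = 1 + h, 1 + (n - h)

# Posterior probability that X lies inside the ROPE [0.49, 0.51]
rope_mass = beta.cdf(0.51, a, b) - beta.cdf(0.49, a, b)
print(f"P(X in [0.49, 0.51] | E) = {rope_mass:.3f}")
```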
 
