• #1
Dale
Mentor
Insights Author
2020 Award
Confessions of a moderate Bayesian part 3
Read part 1: How to Get Started with Bayesian Statistics
Read part 2: Frequentist Probability vs Bayesian Probability
Bayesian statistics by and for non-statisticians
Background
One of the things that I like about Bayesian statistics is that it rather closely matches the way that I think about science, although of course in a more formal manner. Also, many of the more important concepts of scientific philosophy arise naturally and are enforced naturally when using Bayesian methods. I have a decidedly anti-philosophy philosophy of science, so I like being able to reduce scientific philosophy to features of a...
Continue reading...
 
Likes Choppy and Greg Bernhardt

Answers and Replies

  • #1
Dale
Mentor
Insights Author
2020 Award
I think this may be my last one for a while. I have a little idea about one more that would focus on using Bayesian and frequentist concepts and methods together, but currently the only example I can think of is justifying the usual misinterpretation of a confidence interval by using Bayesian and frequentist probabilities together.

@Greg Bernhardt it looks like you are still listed as the author here in the forums even though it is fixed on the Insights page
 
Likes Greg Bernhardt


  • #3
Greg Bernhardt
@Greg Bernhardt it looks like you are still listed as the author here in the forums even though it is fixed on the Insights page
Yeah, there was a glitch during publishing, and now Cloudflare is blocking my access to the database needed to fix it. I'll work on it later today.
 
Likes Dale
  • #5
atyy
Science Advisor
I think another problem with non-informative priors is that they are not well defined for continuous parameters, since a smooth change of variables can turn a "non-informative" prior into an "informative" one.
 
  • #6
Dale
Mentor
Insights Author
2020 Award
I think another problem with non-informative priors is that they are not well defined for continuous parameters, since a smooth change of variables can turn a "non-informative" prior into an "informative" one.
Yes, that was a problem, but there has been some well-recognized work on it by Harold Jeffreys. He came up with a non-informative prior that is invariant under a change of variables. But even with Jeffreys’ prior there are still cases where the prior cannot be normalized.

I think that his approach is well accepted at this point, but my opinion is still that non-informative priors are best avoided entirely.
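
To make the reparameterization issue concrete, here is a minimal numerical sketch (my own illustration, not from the Insight article): a flat prior on a Bernoulli probability ##p## stops looking "non-informative" once it is expressed in log-odds, which is exactly the kind of coordinate dependence Jeffreys’ prior is built to avoid.

```python
import numpy as np

rng = np.random.default_rng(0)

# A flat "non-informative" prior on a Bernoulli probability p.
p = rng.uniform(0.0, 1.0, size=1_000_000)

# The same prior pushed through the smooth reparameterization
# theta = log(p / (1 - p)) (the log-odds).
theta = np.log(p / (1 - p))

# The implied density on theta is strongly peaked near 0, i.e. "informative":
density, _ = np.histogram(theta, bins=[-6, -3, -1, 1, 3, 6], density=True)
print(np.round(density, 3))  # bin densities are far from equal, so not flat

# Jeffreys' prior for the Bernoulli model is Beta(1/2, 1/2); its construction
# gives the same answer whichever parameterization you start from.
```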
 
Likes atyy
  • #7
Stephen Tashi
Science Advisor
I think another problem with non-informative priors is that they are not well defined for continuous parameters, since a smooth change of variables can turn a "non-informative" prior into an "informative" one.
Under a smooth change of variables in frequentist statistics, important things may not go smoothly. For example, if ##f(x_1,x_2,\dots,x_n)## is a formula for an unbiased estimate of the variance of a random variable, then ##\sqrt{f(x_1,x_2,\dots,x_n)}## need not be an unbiased estimator of the standard deviation of the same random variable. In picking techniques for estimation, it matters whether you want a "good" estimate of a parameter or a "good" estimate of some function of that parameter. So it isn't surprising that a measure of information applied to a prior distribution for a parameter may give a different value when applied to a prior distribution for some function of that parameter.
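
A quick Monte Carlo sketch of this point (my own illustration with made-up numbers, not part of the original post): the usual unbiased estimator of the variance stays unbiased for ##\sigma^2##, but its square root systematically underestimates ##\sigma##.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0              # true standard deviation
n, reps = 5, 200_000     # small samples make the bias easy to see

x = rng.normal(0.0, sigma, size=(reps, n))
s2 = x.var(axis=1, ddof=1)   # unbiased estimator of the variance
s = np.sqrt(s2)              # square root of that same estimator

print(round(s2.mean(), 3))   # close to 4.0: unbiased for sigma^2
print(round(s.mean(), 3))    # noticeably below 2.0: biased low for sigma
```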
 
Last edited:
Likes atyy and Dale
  • #8
Dale
Mentor
Insights Author
2020 Award
Under a smooth change of variables in frequentist statistics, important things may not go smoothly.
One of the nice features of Bayesian methods that I did not address is related to this.
if ##f(x_1,x_2,\dots,x_n)## is a formula for an unbiased estimate of the variance of a random variable, then ##\sqrt{f(x_1,x_2,\dots,x_n)}## need not be an unbiased estimator of the standard deviation of the same random variable.
If you have random samples from the posterior distribution of the variance, then the square roots of those samples are random samples from the posterior distribution of the standard deviation. The same goes for any function of the posterior samples.
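
A small sketch of that workflow (the posterior draws below are simulated from an inverse-gamma purely for illustration; in a real analysis they would come from your sampler):

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in for MCMC draws from the posterior of a variance parameter
# (simulated here from an inverse-gamma just to have something concrete).
var_samples = 1.0 / rng.gamma(shape=10.0, scale=1.0 / 40.0, size=100_000)

# Posterior draws for the standard deviation: simply transform the samples.
sd_samples = np.sqrt(var_samples)

# Summaries of both posteriors come from the same set of draws.
print(var_samples.mean(), np.percentile(var_samples, [2.5, 97.5]))
print(sd_samples.mean(), np.percentile(sd_samples, [2.5, 97.5]))
```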
 
  • #9
atyy
Science Advisor
This paper uses both Frequentist and Bayesian error analysis - IIRC they use 2 different priors: https://arxiv.org/abs/astro-ph/9812133

I like the huge amount of philosophy in these posts despite @Dale's aversion to interpretive issues :wink:
 
Likes Dale
  • #10
Dale
Mentor
Insights Author
2020 Award
This paper uses both Frequentist and Bayesian error analysis - IIRC they use 2 different priors: https://arxiv.org/abs/astro-ph/9812133
That seems like a good use of the technique. Most of my papers with a Bayesian analysis also had a frequentist analysis. But recently I have had a couple that just went pure Bayesian.

I like the huge amount of philosophy in these posts despite @Dale's aversion to interpretive issues :wink:
I must admit that I felt a little embarrassed to write so much philosophy. But in the end I went ahead and did it anyway.

I know that there is an interpretation of QM based on Bayesian probability, but I honestly don’t have an informed opinion on it. So I won’t have a post about that for the foreseeable future.
 
Likes atyy
  • #11
This video might be interesting to people who read your Bayesian insights.
It would be great if you could take a look at it and share your thoughts:

"The medical test paradox: Can redesigning Bayes rule help?"

 
Likes Dale
  • #12
Dale
Mentor
Insights Author
2020 Award
It would be great if you could take a look at it and share your thoughts:
Yes, I thought it was a very good video. Well done and largely accurate.

I mentioned this form of Bayes’ theorem in the section on the strength of evidence. I didn’t go into detail, but I actually prefer the odds form of Bayes’ theorem for simple “by hand” computations. It can be made even simpler by using log(odds), but getting an intuitive feel for logarithmic scales is challenging.

Another place that this form can be valuable is in assessing legal evidence. But some courts have ruled against it for rather uninformed reasons. Being a legal expert doesn’t make you a statistical expert.
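
To illustrate the odds form of Bayes’ theorem for a diagnostic test like the one in the video, here is a minimal sketch (the prevalence, sensitivity, and false-positive rate are made-up illustrative numbers, not taken from the video):

```python
# Odds form of Bayes' theorem: posterior odds = prior odds * likelihood ratio.
# Illustrative numbers only: 1% prevalence, 90% sensitivity, 9% false positives.
prevalence = 0.01
sensitivity = 0.90
false_positive_rate = 0.09

prior_odds = prevalence / (1 - prevalence)            # about 1 to 99
likelihood_ratio = sensitivity / false_positive_rate  # 10: the Bayes factor of a positive test
posterior_odds = prior_odds * likelihood_ratio        # about 10 to 99

posterior_prob = posterior_odds / (1 + posterior_odds)
print(round(posterior_prob, 3))  # roughly 0.092, despite the "90% accurate" test
```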
 
  • #13
haruspex
Science Advisor
Homework Helper
Insights Author
Gold Member
2020 Award
I tried to post a comment on the article, but it said "You must be logged in to post a comment". I was already logged in, but I clicked on the link anyway, and it just took me back to the article.
So, here's my comment:

"we can calculate P(X∈[0.49,0.51]|E)=0.969. Meaning that even though the coin is probably not exactly fair it is also probably close enough to fair to be considered equivalent for all practical purposes."

I'm not following the argument here. We could calculate P(X∈[0,1]|E)=1 and argue any observed frequency is fair. Don't we need to penalise the expansion of the range to include 0.5? I.e. increase p?
 
  • #14
Dale
Mentor
Insights Author
2020 Award
We could calculate ##P(X\in[0,1]|E)=1## and argue any observed frequency is fair.
This is not a pure calculation. Earlier I had defined ##X\in[0.49,0.51]## as being practically equivalent to a fair coin. This is a concept known as a ROPE (region of practical equivalence). So in this case a coin that is unfair by less than 0.01 is practically fair. That is a judgement call.

A ROPE is not something that is derived mathematically. It is something based on practical knowledge, or in the medical context on clinical judgement. I doubt that anyone would seriously consider a coin with ##X=0.99## to be practically fair. So you could calculate ##P(X\in [0,1]|E)## but I don’t think many people would agree that it is a ROPE for ##X=0.5##.

A ROPE allows one to provide evidence in support of a null hypothesis. With standard significance testing you can only reject a null hypothesis, never accept it. But if your posterior distribution is almost entirely inside a ROPE around the null hypothesis then you can accept the null hypothesis.

A similar concept is used in the testing of equivalence or non-inferiority with frequentist methods.
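
For anyone who wants to reproduce this kind of ROPE calculation, here is a minimal sketch (the flip counts and the flat Beta(1, 1) prior are my own illustrative assumptions, not the data used in the Insight article):

```python
from scipy import stats

# Hypothetical data: 5020 heads in 10,000 flips, with a flat Beta(1, 1) prior.
heads, flips = 5020, 10_000
posterior = stats.beta(1 + heads, 1 + flips - heads)

# Posterior probability that the bias lies inside the ROPE [0.49, 0.51],
# i.e. that the coin is "practically fair" by this judgement call.
p_rope = posterior.cdf(0.51) - posterior.cdf(0.49)
print(round(p_rope, 3))
```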
 
