  • #1
Confessions of a moderate Bayesian part 3
Read part 1: How to Get Started with Bayesian Statistics
Read part 2: Frequentist Probability vs Bayesian Probability
Bayesian statistics by and for non-statisticians
Background
One of the things that I like about Bayesian statistics is that it rather closely matches the way that I think about science, although of course in a more formal manner. Also, many of the more important concepts of scientific philosophy arise naturally and are enforced naturally when using Bayesian methods. I have a decidedly anti-philosophy philosophy of science, so I like being able to reduce scientific philosophy to features of a...

Continue reading...
 
Likes WWGD, weirdoguy, madness and 2 others
  • #2
I think this may be my last one for a while. I have a little idea about one more that would focus on using Bayesian and frequentist concepts and methods together. But currently the only example I can think of is justifying the usual misinterpretation of a confidence interval using both Bayesian and frequentist probabilities together.

@Greg Bernhardt it looks like you are still listed as the author here in the forums even though it is fixed on the Insights page
 
Likes Greg Bernhardt
  • #3
Dale said:
@Greg Bernhardt it looks like you are still listed as the author here in the forums even though it is fixed on the Insights page
Yeah, there was a glitch during publishing, and now Cloudflare is blocking my access to the database I need in order to fix it. I'll work on it later today.
 
Likes Dale
  • #5
I think another problem with non-informative priors is that they are not well-defined for continuous variables, since a smooth change of variables may take a "non-informative" prior into an "informative" one.
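A concrete sketch of that (my own illustration, not from the Insight): take ##\theta\sim\text{Uniform}(0,1)## as a "non-informative" prior and reparameterize with the smooth map ##\phi=\theta^2##. The induced density is
$$p(\phi)=\left|\frac{d\theta}{d\phi}\right|=\frac{1}{2\sqrt{\phi}},\qquad \phi\in(0,1),$$
which is far from flat, so the prior that was "non-informative" for ##\theta## is quite informative about ##\phi##.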
 
  • #6
atyy said:
I think another problem with non-informative priors is that they are not well-defined for continuous variables, since a smooth change of variables may take a "non-informative" prior into an "informative" one.
Yes, that was a problem, but there has been some well-recognized work on that by Harold Jeffreys. He came up with a non-informative prior that is invariant under coordinate transformations. But even with Jeffreys' prior you can still have cases where it cannot be normalized.

I think his approach is well received at this point, but my opinion is still that non-informative priors should generally be avoided.
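For reference, the standard sketch of why Jeffreys' prior is reparameterization-invariant: it is defined as ##\pi(\theta)\propto\sqrt{I(\theta)}##, with ##I(\theta)## the Fisher information. Under a smooth reparameterization ##\phi=h(\theta)## the Fisher information transforms as
$$I(\phi)=I(\theta)\left(\frac{d\theta}{d\phi}\right)^2,$$
so ##\sqrt{I(\phi)}=\sqrt{I(\theta)}\left|\frac{d\theta}{d\phi}\right|##, which is exactly the ordinary change-of-variables rule for a probability density. Both parameterizations therefore describe the same prior.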
 
Likes atyy
  • #7
atyy said:
I think another problem with non-informative priors is that they are not well-defined for continuous variables, since a smooth change of variables may take a "non-informative" prior into an "informative" one.

Under a smooth change of variables in frequentist statistics, important things may not go smoothly. For example, if ##f(x_1,x_2,\dots,x_n)## is a formula for an unbiased estimate of the variance of a random variable, then ##\sqrt{f(x_1,x_2,\dots,x_n)}## need not be an unbiased estimator for the standard deviation of the same random variable. In picking techniques for estimation, it matters whether you want a "good" estimate of a parameter versus a "good" estimate of some function of that parameter. So it isn't surprising that a measure of information applied to a prior distribution for a parameter may give a different value when applied to a prior distribution for some function of that parameter.
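A minimal numerical sketch of this (my own example; the normal distribution, sample size, and constants are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0              # true standard deviation
n, trials = 5, 200_000   # small n makes the bias visible

# n normal samples per trial
x = rng.normal(0.0, sigma, size=(trials, n))

# ddof=1 gives the unbiased 1/(n-1) variance estimator
s2 = x.var(axis=1, ddof=1)

print("mean of s^2:", s2.mean())           # ~= sigma^2 = 4.0 (unbiased)
print("mean of s:  ", np.sqrt(s2).mean())  # < sigma = 2.0 (biased low)
```

The square root of an unbiased variance estimator systematically underestimates ##\sigma## because the square root is concave (Jensen's inequality).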
 
Likes atyy and Dale
  • #8
Stephen Tashi said:
Under a smooth change of variables in frequentist statistics, important things may not go smoothly.
One of the nice features of Bayesian methods that I did not address is related to this.
Stephen Tashi said:
if ##f(x_1,x_2,\dots,x_n)## is a formula for an unbiased estimate of the variance of a random variable, then ##\sqrt{f(x_1,x_2,\dots,x_n)}## need not be an unbiased estimator for the standard deviation of the same random variable.
If you have random samples of the posterior distribution of the variance, then the square roots of those samples are random samples of the posterior distribution of the standard deviation. The same holds for any function of the posterior samples.
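A sketch of what that looks like in practice, using a conjugate normal model with known mean and made-up hyperparameters (none of this is from the Insight):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: normal with known mean 0 and unknown variance
data = rng.normal(0.0, 2.0, size=30)

# Inverse-gamma prior on the variance (illustrative hyperparameters)
a0, b0 = 2.0, 2.0

# Conjugate update for a normal likelihood with known mean:
a_n = a0 + len(data) / 2
b_n = b0 + np.sum(data**2) / 2

# Posterior samples of the variance: inverse-gamma via reciprocal of a gamma
var_samples = 1.0 / rng.gamma(shape=a_n, scale=1.0 / b_n, size=100_000)

# Posterior samples of the standard deviation are just the square roots
sd_samples = np.sqrt(var_samples)

print("posterior mean of sigma^2:", var_samples.mean())
print("posterior mean of sigma:  ", sd_samples.mean())
```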
 
  • #9
This paper uses both Frequentist and Bayesian error analysis - IIRC they use 2 different priors: https://arxiv.org/abs/astro-ph/9812133

I like the huge amount of philosophy in these posts despite @Dale's aversion to interpretive issues :wink:
 
Likes Dale
  • #10
atyy said:
This paper uses both Frequentist and Bayesian error analysis - IIRC they use 2 different priors: https://arxiv.org/abs/astro-ph/9812133
That seems like a good use of the technique. Most of my papers with a Bayesian analysis also had a frequentist analysis. But recently I have had a couple that just went pure Bayesian.

atyy said:
I like the huge amount of philosophy in these posts despite @Dale's aversion to interpretive issues :wink:
I must admit that I felt a little embarrassed to write so much philosophy. But in the end I went ahead and did it anyway.

I know that there is an interpretation of QM based on Bayesian probability, but I honestly don’t have an informed opinion on it. So I won’t have a post about that for the foreseeable future.
 
Likes atyy
  • #11
This video might be interesting to people who read your Bayesian insights.
It would be great if you could take a look at it and share your thoughts:

"The medical test paradox: Can redesigning Bayes rule help?"

 
Likes Dale
  • #12
Swamp Thing said:
It would be great if you could take a look at it and share your thoughts:
Yes, I thought it was a very good video. Well done and largely accurate.

I mentioned this form of Bayes' theorem in the section on the strength of evidence. I didn't go into detail, but I actually prefer the odds form of Bayes' theorem for doing simple "by hand" computations. It can be made even simpler by using log(odds), though developing an intuitive feel for logarithmic scales is challenging.
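For readers who haven't seen it, the odds form is
$$O(H\mid E)=O(H)\times\frac{P(E\mid H)}{P(E\mid \neg H)},$$
posterior odds equal prior odds times the Bayes factor. With illustrative numbers of the kind used in the video (1-in-1000 prevalence, 90% sensitivity, 9% false-positive rate): prior odds ##1:999##, Bayes factor ##0.90/0.09=10##, posterior odds ##10:999\approx 1:100##, so a positive test still leaves only about a 1% probability of disease.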

Another place that this form can be valuable is in assessing legal evidence. But some courts have ruled against it for rather uninformed reasons. Being a legal expert doesn’t make you a statistical expert.
 
  • #13
I tried to post a comment on the article, but it said "You must be logged in to post a comment". I was already logged in, but I clicked on the link anyway, and it just took me back to the article.
So, here's my comment:

Dale said:
"we can calculate ##P(X\in[0.49,0.51]\mid E)=0.969##. Meaning that even though the coin is probably not exactly fair it is also probably close enough to fair to be considered equivalent for all practical purposes."

I'm not following the argument here. We could calculate ##P(X\in[0,1]\mid E)=1## and argue any observed frequency is fair. Don't we need to penalise the expansion of the range to include 0.5? I.e. increase p?
 
  • #14
haruspex said:
We could calculate ##P(X\in[0,1]\mid E)=1## and argue any observed frequency is fair.
This is not a pure calculation. Earlier I had defined ##X\in[0.49,0.51]## as being practically equivalent to a fair coin. This is a concept known as a ROPE (region of practical equivalence). So in this case a coin that is unfair by less than 0.01 is practically fair. That is a judgement call.

A ROPE is not something that is derived mathematically. It is something based on practical knowledge, or in the medical context on clinical judgement. I doubt that anyone would seriously consider a coin with ##X=0.99## to be practically fair. So you could calculate ##P(X\in[0,1]\mid E)## but I don't think many people would agree that it is a ROPE for ##X=0.5##.

A ROPE allows one to provide evidence in support of a null hypothesis. With standard significance testing you can only reject a null hypothesis, never accept it. But if your posterior distribution is almost entirely inside a ROPE around the null hypothesis then you can accept the null hypothesis.

A similar concept is used in the testing of equivalence or non-inferiority with frequentist methods.
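As a sketch of how such a ROPE probability can be computed with a conjugate beta-binomial model (the flip counts here are made up for illustration, not the numbers behind the 0.969 in the Insight):

```python
from scipy.stats import beta

# Hypothetical coin-flip data: h heads in n flips, uniform Beta(1, 1) prior
h, n = 5000, 10000

# Beta posterior parameters for the heads probability X
a, b = 1 + h, 1 + (n - h)

# Posterior probability that X lies inside the ROPE [0.49, 0.51]
rope_mass = beta.cdf(0.51, a, b) - beta.cdf(0.49, a, b)
print(f"P(X in [0.49, 0.51] | E) = {rope_mass:.3f}")
```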
 
