Bad astronomy = bad statistics?

In summary: Mike Brown and Konstantin Batygin have published a paper showing that the observed orbital alignments are not the result of observation bias, and that the chance of the alignments being a coincidence under the null hypothesis is just 0.2%. The thread discusses why this does not translate into a 99.8% probability that Planet Nine is real.
  • #1
Vanadium 50
Phil Plait, creator of Bad Astronomy, has an article on Planet 9. Overall, it's pretty good, but there was one part that got my hackles up:

https://twitter.com/plutokiller/status/1087495838379651072
Mike Brown and Konstantin Batygin (who have been the leading force behind the idea of Planet Nine being out there) have published a paper showing that the alignments are not from any observation bias, and in fact the chance of the alignments being a coincidence is just 0.2%... or, to make it more clear, the chance of the alignments being real is 99.8%.

It may be more clear, but at a cost of being more wrong.
 
  • #2
Indeed, that is a typical misrepresentation of frequentist statements as Bayesian ones. The correct statement would be that if there is no Planet Nine, then the alignment would be smaller in 99.8% of the cases, which is not the same as saying that there is a 99.8% probability of the alignment being caused by something like Planet Nine. Very bad statistics indeed.
 
  • #3
It should be pointed out that this statement is not in the preprint by Brown and Batygin.
 
  • #4
Even as a frequentist, the statement is untrue. A frequentist would say that under the null hypothesis, the probability of getting a result at least this discrepant is 1/500. That does not say anything whatever about a particular alternative hypothesis.
 
  • #5
Vanadium 50 said:
Even as a frequentist, the statement is untrue. A frequentist would say that under the null hypothesis, the probability of getting a result at least this discrepant is 1/500. That does not say anything whatever about a particular alternative hypothesis.
This was kind of my point. The 0.2% is a frequentist statement about how unlikely data at least this extreme is under the null hypothesis. The 99.8% statement is a Bayesian statement about the probability of the null hypothesis being false (for which you need additional information about all alternative hypotheses, their priors, and their likelihoods). The only 99.8% statement that can be made based on the statements in the paper is that the null hypothesis would produce a less extreme alignment in 99.8% of the cases.
 
  • #6
Good points. The situation is similar to the question of whether someone who has just won a lottery can logically conclude it is vastly unlikely that they won by random chance. Everyone else can easily conclude that they did (just as an alien who had observed 500 other solar systems like ours, none showing this behavior, could conclude our orbits are a chance outlier), but what about the winner? If they only ever bought 100 lottery tickets in their life, each with a 1 in a million chance of winning, the chance they would ever have won by random chance is only 0.01%. So by the questionable logic above, they should conclude there is a 99.99% chance that they won for some kind of reason other than random chance.
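As a quick check of that arithmetic, here is a minimal sketch (the 100 tickets and 1-in-a-million odds are just the illustrative numbers above):

```python
# Probability of winning at least once in 100 independent plays,
# each with a 1-in-a-million chance of winning.
p_single = 1e-6
n_tickets = 100

p_ever = 1 - (1 - p_single) ** n_tickets
print(f"P(win at least once) = {p_ever:.6%}")  # about 0.01%
```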

Now, interestingly, it is possible that they could conclude they won for some kind of reason, but only if they think it is not extremely unlikely that seemingly random events actually happen for reasons. So that's the Bayesian element-- we cannot say they either are, or are not, justified in imagining some higher cause if they win a lottery-- it depends on how reasonable they think, a priori, it is to attribute cause to seemingly random events. (For example, maybe they bought the ticket because they had a feeling that they would win this time, so the issue becomes how willing are they to believe that feelings like that are meaningful.) In the case of planet 9, the entire argument is predicated on the idea that it would not be surprising if there was a planet 9, and it would not be surprising if such a planet could exist in a parameter space where it could have the desired effect. If we instead think that is quite unlikely, or if we search for a long time and don't see what we think we should see, then we reassess the likelihood of the planet-- without any change in the 0.2% figure from Plait's argument. The 99.8% figure seems more like an upper limit on the likelihood of the planet, and we only approach that figure if we already expected such a planet to be there.

On the other hand, there are a number of different observations that planet 9 seems to help us understand, so taking all the factors into consideration, the likelihood does seem to grow. But we can all agree that a careful Bayesian type analysis would be needed to come up with anything that looks like a probability that it is real, in the sense of the odds of a "good bet." A posteriori statistics are so much trickier than a priori ones! (And ironically, it means Phil Plait is guilty of bad astronomy there.)
 
  • #7
Ken G said:
The 99.8% figure seems more like an upper limit on the likelihood of the planet, and we only approach that figure if we already expected such a planet to be there.
It is not an upper limit. That figure actually has nothing to do with the alternative hypothesis. Bayes' theorem just tells you that if you had equal priors on planet 9 vs no planet 9, and planet 9 predicts the result with 98% probability while no planet 9 predicts it with only 2% probability, then the posterior probabilities are going to be 98% for planet 9 and 2% for no planet 9. If planet 9 instead predicts the result with 100% probability, then its posterior probability is going to be larger than 98%.

Also, if your prior is 100% (or 0%) planet 9, no data in the world is going to change this and your posterior will be 100% (or 0%) planet 9 as well.
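To make the arithmetic concrete, here is a minimal sketch of Bayes' theorem for the two-hypothesis case (the 98%/2% likelihoods are the illustrative numbers from this post, not anything from the actual paper):

```python
def posterior_p9(prior_p9, like_p9, like_null):
    """Posterior probability of planet 9 from Bayes' theorem.

    prior_p9:  prior probability assigned to planet 9 existing
    like_p9:   probability of the observed data if planet 9 exists
    like_null: probability of the observed data with no planet 9
    """
    evidence = prior_p9 * like_p9 + (1 - prior_p9) * like_null
    return prior_p9 * like_p9 / evidence

# Equal priors; planet 9 predicts the result with 98% probability,
# no planet 9 predicts it with 2%: the posterior is 98% vs 2%.
print(posterior_p9(0.5, 0.98, 0.02))  # 0.98

# A prior of exactly 1 (or 0) ignores the data entirely.
print(posterior_p9(1.0, 0.98, 0.02))  # 1.0
```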
 
  • #8
Yeah, Bayesian results require some actual calculation. Indeed, I can understand starting with equal priors on planet 9 or no planet 9 (which is a big leap already, since simply not knowing if it's there or not just means our prior isn't saying much), but how would we know that planet 9's existence predicts the data when it depends on the attributes of planet 9? That would be a hard calculation in itself, because not only would you need a prior on planet 9's existence, you'd also need a prior probability density on its possible orbit and mass. It seems to me you'd want to take some expected distribution of possible planet 9s, and see what probability measure is consistent with the behavior of the other objects. Then you could regard that as a relative probability you could compare with the 0.2% probability measure you conclude with no planet 9. This would seem to make it far less easy to favor the presence of planet 9, because there is a lot of parameter space out there where the existence of planet 9 would still not explain the behavior we see. If we think that only 0.1% of all possible planet 9s would be consistent with the data, then it sounds twice as likely that the behavior is random chance than that it's from planet 9! Framed, that way, the 99.8% figure not only sounds wrong, it sounds drastically wrong.
 
  • #9
Ken G said:
but how would we know that planet 9's existence predicts the data when it depends on the attributes of planet 9? That would be a hard calculation in itself, because not only would you need a prior on planet 9's existence, you'd also need a prior probability density on its possible orbit and mass. It seems to me you'd want to take some expected distribution of possible planet 9s, and see what probability measure is consistent with the behavior of the other objects.
Yes, if you have a model with any sort of parameters, you need to put a prior distribution on that model's parameter space and find the likelihood of the data given that prior.
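As a sketch of what that marginalisation looks like in practice (the likelihood function and prior ranges below are hypothetical stand-ins, not values from any actual planet 9 analysis):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for P(data | planet-9 parameters); in a real analysis
# this would come from dynamical modelling of the observed alignments.
def likelihood(mass, sma):
    return (np.exp(-0.5 * ((mass - 6.0) / 2.0) ** 2)
            * np.exp(-0.5 * ((sma - 500.0) / 150.0) ** 2))

# Prior over the planet's parameter space (assumed uniform ranges:
# mass in Earth masses, semi-major axis in AU).
n = 100_000
mass = rng.uniform(1.0, 20.0, n)
sma = rng.uniform(200.0, 1000.0, n)

# Marginal likelihood of the planet-9 model: the likelihood averaged over
# the prior. Only the slice of parameter space that actually fits the data
# contributes, which is the built-in Occam's razor mentioned below.
print(f"P(data | planet 9) ~ {likelihood(mass, sma).mean():.4f}")
```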

Ken G said:
Then you could regard that as a relative probability you could compare with the 0.2% probability measure you conclude with no planet 9.
You mean "apply Bayes' theorem". In the actual Bayesian analysis you will not actually use the 0.2 %, you will use the actual likelihood of precisely your data (not the probability of getting a more extreme result in the null hypothesis).

Ken G said:
This would seem to make it far less easy to favor the presence of planet 9, because there is a lot of parameter space out there where the existence of planet 9 would still not explain the behavior we see.
This is a kind of natural Occam's razor that is built into the Bayesian analysis. A model that explains the data well in a larger part of its parameter space, or a model with fewer (relevant) parameters whose predictions line up with observation, is going to be favoured. However, as long as you can find a part of parameter space that has a sufficiently high likelihood, you are still going to favour a model with more parameters over one that simply does not predict the data.

Ken G said:
If we think that only 0.1% of all possible planet 9s would be consistent with the data, then it sounds twice as likely that the behavior is random chance than that it's from planet 9!
It is wrong to think along those lines too and I think you are misrepresenting the data. First of all, the Bayesian analysis deals with the likelihood, not the probability of generating extreme data. Second, even if you for some reason chose to use the probability of generating extreme data based on the different hypotheses, I find it unlikely that the planet 9 hypothesis would provide something that was significantly worse than a 0.2% probability in most of the parameter space. The actual probability of getting such extreme data would have to be marginalised over the model parameters. It has very little to do with just saying "predict the data", which is a rather vague statement.

Ken G said:
Framed that way, the 99.8% figure not only sounds wrong, it sounds drastically wrong.
The wrong part is in the interpretation. You really cannot make any statement at all regarding model probability in a Frequentist setting. However, as pointed out above, your analysis is not exactly kosher either.
 
  • #10
The basic idea in what I was saying is that if you imagine two million solar systems that seem similar enough to ours that you can imagine ours is drawn at random from that sample (however you choose to make that distinction, which is already tricky), and one million of them have a planet 9 and one million don't (based on some assumptions about the appropriate planet 9 parameter space), then we could ask which of those sets of a million contain more examples of behaviors similar to what we see in our solar system. The article seemed to suggest that fraction would be 0.2%, or 2000 systems, in the "no planet 9" sample (I don't know how they arrive at 0.2%, but it seems to be their conclusion). I was saying we must also ask: what would be the fraction in the "random planet 9" sample? Phil Plait seemed to think that if only 0.2% of the no-planet-9 sample exhibited the odd behaviors we see, then we should conclude we are 99.8% likely to have been drawn from the planet 9 sample. That seems to assume that the entire "random planet 9" sample exhibits the odd behavior, not just some small fraction that have the right planet 9 attributes. It seems more likely that the random planet 9 sample would also quite rarely exhibit the behavior we see, so the more informative comparison should be the relative sizes of those two samples. It might not favor planet 9 at all, depending on our assumptions about a generic planet 9 parameter space.
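A counting version of that thought experiment, as a sketch (the 0.2% is the figure from the article; the 0.1% fraction for the random-planet-9 sample is a hypothetical number, as in post #8):

```python
# One million simulated systems in each sub-sample.
n_each = 1_000_000

n_aligned_no_p9 = int(0.002 * n_each)  # 0.2% of the no-planet-9 sample: 2000 systems
n_aligned_p9 = int(0.001 * n_each)     # hypothetical: 0.1% of random planet 9s fit the data

# Given that we observe the aligned behaviour, the chance we were drawn
# from the planet-9 sample is just the ratio of the counts.
p_from_p9 = n_aligned_p9 / (n_aligned_p9 + n_aligned_no_p9)
print(f"P(planet-9 sample | behaviour) = {p_from_p9:.1%}")  # about 33%
```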
 
  • #11
Ken G said:
That seems to assume that the entire "random planet 9" sample exhibits the odd behavior, not just some small fraction that have the right planet 9 attributes.
Actually, it assumes that 99.8 % of that sample has that behaviour.

Ken G said:
It seems more likely that the random planet 9 sample would also quite rarely exhibit the behavior we see, so the more informative comparison should be the relative sizes of those two samples. It might not favor planet 9 at all, depending on our assumptions about a generic planet 9 parameter space.
The point, again, is that you are trying to do a Bayesian analysis based on a frequentist measure. What you should look at is the likelihood of your exact observation, not the probability of having a result as or more extreme under the null hypothesis.

Regardless, even if you do that, it is unlikely that a planet 9 scenario would give you a smaller fraction than in the no planet 9 scenario. You seem to be grasping for a Bayesian analysis and the proper thing to do then would be to look at the actual likelihoods rather than integrated ones.
 
  • #12
I think we're getting off track. Plait is wrong not because he's not a Bayesian. Plait is wrong because he assumes there are only two logical possibilities: totally random motion, or Planet 9.
 
  • #13
Vanadium 50 said:
I think we're getting off track. Plait is wrong not because he's not a Bayesian. Plait is wrong because he assumes there are only two logical possibilities: totally random motion, or Planet 9.
I disagree with this. Even if there were just two logical possibilities, the statement would not be correct.
 
  • #14
And for my part, I don't see that either the binary aspect (yes or no on planet 9) or the integrated-versus-differential character of the probability distribution is the key element of what is wrong in the logic. Both just seem like binning issues to me, a matter of how coarse or fine the binning is. I'm thinking that the core issue is the relative fairness of the comparison being made.

A better-controlled analogy might make these issues clearer. Let's say we give a 10-question true/false test to a large group of people, with detailed questions about our own personal life, things that people who don't know us at all have no idea how to answer. But we know that half the people in the group actually do know us quite well, our friends and family, and the other half don't know us from a hole in the wall. Then the first test we grade gets 9/10, and we want to ask: what are the chances that this test came from the cohort that doesn't know us at all? By the logic being critiqued, one might say that the probability of getting at least 9 out of 10 by random chance is 11/2^10, or about 1%, so that means there's a 99% chance the test came from people who know us. We all agree this is wrong, but it's not wrong because 10/10 is being binned with 9/10 as some kind of integral of "as good or better" than the test we graded, and it's not wrong because we are treating the people as either knowing us well or not knowing us at all. It's wrong because we have not included the fact that even people who know us might not know all our personal details, so while they might do better than 5/10 on average, they might not average as well as 9/10. We would need to calculate how well we expect our friends and family to do on the exam on average, and figure out what fraction would get 9/10 or better. If that fraction is, say, only 10%, then we should compare the 10% to the 1% and conclude it is ten times more likely the exam came from that cohort-- but not 99% likely, more like 91%.
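The arithmetic of that analogy, as a short sketch (the 10% figure for the friends-and-family cohort is the illustrative number from the paragraph above):

```python
from math import comb

# Chance of scoring at least 9/10 on a true/false test by pure guessing:
# (C(10,9) + C(10,10)) / 2**10 = 11/1024, about 1%.
p_guess = (comb(10, 9) + comb(10, 10)) / 2 ** 10
print(f"{p_guess:.2%}")  # 1.07%

# If only 10% of the cohort that knows us would score that well, then with
# the two cohorts equally sized the posterior is 10% / (10% + 1%):
p_knows = 0.10
print(f"{p_knows / (p_knows + p_guess):.1%}")  # about 90%, not 99%
```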

So that latter calculation seems to be the key missing element in the evidence for planet 9. If 0.2% of solar systems show behavior as aberrant as, or more aberrant than, what we see in ours, that alone doesn't establish that random chance is an unlikely explanation. It means we need to look at anything else we think could explain the behavior, and assess how often, in a random sample, that explanation would produce the behavior. That number might be something like 1%, who knows, but the point is that it must be calculated to assess the relative likelihoods. If its fraction is only 5 times higher, then we can only say it is 5 times more likely that there is a planet 9 than that there isn't (if we are ambivalent about its existence, giving it a 50/50 chance out of ignorance prior to looking at any data). And of course that's a very difficult number to determine, since we'd need to know some kind of probability distribution of possible planet 9s. I think that last part is what we are all saying in different ways: the evidence for planet 9 is not necessarily anywhere close to 99.8%, even if there seems to be only a 0.2% chance of getting the aberrant behavior we see from purely random chance.
 
  • #15
Ken G said:
you'd also need a prior probability density on its possible orbit and mass.
Yes, that type of approach is pretty standard. Such "parameters of parameters", i.e. the parameters of the prior distributions, are sometimes called hyperparameters.
 
  • #16
Pluto is the ninth planet. :-p
 
  • #17
Ah yes, hyperparameters-- that makes sense.
 

