Bad astronomy = bad statistics?


Vanadium 50

Phil Plait, creator of Bad Astronomy, has an article on Planet 9. Overall, it's pretty good, but there was one part that got my hackles up:

https://twitter.com/plutokiller/status/1087495838379651072
Mike Brown and Konstantin Batygin (who have been the leading force behind the idea of Planet Nine being out there) have published a paper showing that the alignments are not from any observation bias, and in fact the chance of the alignments being a coincidence is just 0.2%... or, to make it more clear, the chance of the alignments being real is 99.8%.
It may be more clear, but at a cost of being more wrong.
 

Orodruin

Indeed, that is a typical misrepresentation of frequentist statements as Bayesian ones. The correct statement would be that if there is no Planet Nine, then the alignment would be smaller in 99.8% of the cases, which is not the same as saying that there is a 99.8% probability of the alignment being caused by something like Planet Nine. Very bad statistics indeed.
 

Orodruin

It should be pointed out that this statement is not in the preprint by Brown and Batygin.
 

Vanadium 50

Even as a frequentist, the statement is untrue. A frequentist would say that under the null hypothesis, the probability of getting a result at least this discrepant is 1/500. That does not say anything whatever about a particular alternative hypothesis.
 

Orodruin

Even as a frequentist, the statement is untrue. A frequentist would say that under the null hypothesis, the probability of getting a result at least this discrepant is 1/500. That does not say anything whatever about a particular alternative hypothesis.
This was kind of my point. The 0.2% is a frequentist statement about how unlikely the data is given the null hypothesis. The 99.8% statement is a Bayesian statement about the likelihood of the null hypothesis being false (for which you need additional information about all alternative hypotheses and their likelihood). The only 99.8% statement that can be made based on the statements in the paper is that the null hypothesis would produce a less extreme alignment in 99.8% of the cases.
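To make the distinction concrete, here is a minimal sketch in Python with purely hypothetical numbers (treating the 0.2% loosely as a likelihood just for illustration): the posterior probability of the null depends on the prior and on how well the alternative predicts the data, so no single number follows from the 0.2% alone.

Code:
def posterior_null(prior_null, like_null, like_alt):
    """P(H0 | data) from Bayes' theorem for two exhaustive hypotheses."""
    prior_alt = 1.0 - prior_null
    evidence = prior_null * like_null + prior_alt * like_alt
    return prior_null * like_null / evidence

like_null = 0.002  # stand-in likelihood of the alignment under "no Planet 9"

for prior_null in (0.5, 0.9, 0.99):       # how plausible Planet 9 seemed a priori
    for like_alt in (1.0, 0.1, 0.01):     # how well Planet 9 predicts the alignment
        p = posterior_null(prior_null, like_null, like_alt)
        print(f"prior(H0)={prior_null:.2f}  P(data|H1)={like_alt:.2f}  "
              f"-> P(H0|data)={p:.3f}")
# None of these posteriors is forced to be 0.002, so "99.8% chance the
# alignment is real" does not follow from the 0.2% figure by itself.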
 

Ken G

Good points. The situation is similar to the question of whether someone who has just won a lottery can logically conclude it is vastly unlikely that they won by random chance. Of course everyone else can easily conclude it was just chance (just as an alien who surveyed 500 other solar systems like ours, none of which showed this behavior, could conclude our orbits are the way they are out of pure stochastic chance), but what about the person who won? If they only ever bought 100 lottery tickets in their life, and each had a 1 in a million chance of winning, the chance they would have won by random chance is only 0.01%. So using the questionable logic, they should conclude there is a 99.99% chance that they won for some kind of reason other than random chance.
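For the arithmetic, a quick check (assuming 100 independent tickets, each with a one-in-a-million chance of winning):

Code:
p_win = 1 - (1 - 1e-6) ** 100   # chance of winning at least once in a lifetime
print(p_win)                    # ~9.9995e-05, i.e. about 0.01%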

Now, interestingly, it is possible that they could conclude they won for some kind of reason, but only if they think it is not extremely unlikely that seemingly random events actually happen for reasons. So that's the Bayesian element-- we cannot say they either are, or are not, justified in imagining some higher cause if they win a lottery-- it depends on how reasonable they think, a priori, it is to attribute cause to seemingly random events. (For example, maybe they bought the ticket because they had a feeling that they would win this time, so the issue becomes how willing are they to believe that feelings like that are meaningful.) In the case of planet 9, the entire argument is predicated on the idea that it would not be surprising if there was a planet 9, and it would not be surprising if such a planet could exist in a parameter space where it could have the desired effect. If we instead think that is quite unlikely, or if we search for a long time and don't see what we think we should see, then we reassess the likelihood of the planet-- without any change in the 0.2% figure from Plait's argument. The 99.8% figure seems more like an upper limit on the likelihood of the planet, and we only approach that figure if we already expected such a planet to be there.

On the other hand, there are a number of different observations that planet 9 seems to help us understand, so taking all the factors into consideration, the likelihood does seem to grow. But we can all agree that a careful Bayesian type analysis would be needed to come up with anything that looks like a probability that it is real, in the sense of the odds of a "good bet." A posteriori statistics are so much trickier than a priori ones! (And ironically, it means Phil Plait is guilty of bad astronomy there.)
 

Orodruin

The 99.8% figure seems more like an upper limit on the likelihood of the planet, and we only approach that figure if we already expected such a planet to be there.
It is not an upper limit. It actually has nothing to do with the alternative hypothesis. It just tells you that if you had equal priors on planet 9 vs no planet 9, planet 9 predicts the result with 98% probability, and no planet 9 predicts it with 2% probability, then the posterior probabilities are going to be 98% for planet 9 and 2% for no planet 9. If planet 9 instead predicts the result with 100% probability, then its posterior probability is going to be larger than 98%.

Also, if your prior is 100% (or 0%) planet 9, no data in the world is going to change this and your posterior will be 100% (or 0%) planet 9 as well.
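Those numbers are easy to verify with Bayes' theorem; the sketch below assumes, as the figures imply, a 2% likelihood of the result under "no planet 9":

Code:
def posterior_p9(like_p9, like_no_p9, prior_p9=0.5):
    """Posterior probability of Planet 9 for two exhaustive hypotheses."""
    prior_no = 1.0 - prior_p9
    return prior_p9 * like_p9 / (prior_p9 * like_p9 + prior_no * like_no_p9)

print(posterior_p9(0.98, 0.02))                # 0.98  -> 98% planet 9, 2% no planet 9
print(posterior_p9(1.00, 0.02))                # ~0.9804 -> larger than 98%
print(posterior_p9(0.98, 0.02, prior_p9=1.0))  # 1.0   -> a 100% prior never moves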
 

Ken G

Yeah, Bayesian results require some actual calculation. Indeed, I can understand starting with equal priors on planet 9 or no planet 9 (which is a big leap already, since simply not knowing if it's there or not just means our prior isn't saying much), but how would we know that planet 9's existence predicts the data when it depends on the attributes of planet 9? That would be a hard calculation in itself, because not only would you need a prior on planet 9's existence, you'd also need a prior probability density on its possible orbit and mass. It seems to me you'd want to take some expected distribution of possible planet 9s, and see what probability measure is consistent with the behavior of the other objects. Then you could regard that as a relative probability you could compare with the 0.2% probability measure you conclude with no planet 9. This would seem to make it far less easy to favor the presence of planet 9, because there is a lot of parameter space out there where the existence of planet 9 would still not explain the behavior we see. If we think that only 0.1% of all possible planet 9s would be consistent with the data, then it sounds twice as likely that the behavior is random chance as that it's from planet 9! Framed that way, the 99.8% figure not only sounds wrong, it sounds drastically wrong.
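One way to picture that calculation is a marginalised likelihood: average the probability of producing the observed alignment over a prior on the hypothetical planet's mass and orbit. The sketch below is purely illustrative; the parameter ranges, the toy "produces the alignment" function, and all the numbers are invented, not taken from any Planet 9 model.

Code:
import random
random.seed(0)

def p_alignment_given_params(mass, a):
    """Invented probability that a planet of this mass (Earth masses) and
    semi-major axis (AU) would produce the observed alignment."""
    good = (5 < mass < 10) and (400 < a < 800)   # pretend only a narrow band works
    return 0.5 if good else 0.001

def marginal_likelihood(n_samples=100_000):
    """Average the alignment probability over a broad prior on the parameters."""
    total = 0.0
    for _ in range(n_samples):
        mass = random.uniform(1, 20)      # prior: uniform in Earth masses
        a = random.uniform(200, 1200)     # prior: uniform in AU
        total += p_alignment_given_params(mass, a)
    return total / n_samples

p_data_given_p9 = marginal_likelihood()
p_data_given_no_p9 = 0.002                # stand-in for the no-planet-9 figure
print(p_data_given_p9, p_data_given_p9 / p_data_given_no_p9)
# Only part of the parameter space explains the data, so the marginalised
# likelihood is far below 1 even if the planet exists.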
 

Orodruin

but how would we know that planet 9's existence predicts the data when it depends on the attributes of planet 9? That would be a hard calculation in itself, because not only would you need a prior on planet 9's existence, you'd also need a prior probability density on its possible orbit and mass. It seems to me you'd want to take some expected distribution of possible planet 9s, and see what probability measure is consistent with the behavior of the other objects.
Yes, if you have a model with any sort of parameters, you need to put a prior distribution on that model's parameter space and find the likelihood of the data given that prior.

Then you could regard that as a relative probability you could compare with the 0.2% probability measure you conclude with no planet 9.
You mean "apply Bayes' theorem". In the actual Bayesian analysis you will not actually use the 0.2 %, you will use the actual likelihood of precisely your data (not the probability of getting a more extreme result in the null hypothesis).

This would seem to make it far less easy to favor the presence of planet 9, because there is a lot of parameter space out there where the existence of planet 9 would still not explain the behavior we see.
This is a kind of natural Occam's razor that is built into the Bayesian analysis. A model that explains the data well in a larger part of its parameter space, or a model with fewer (relevant) parameters whose predictions line up with observation, is going to be favoured. However, as long as you can find a part of parameter space that has a sufficiently high likelihood, you are still going to favour a model with more parameters over one that simply does not predict the data.

If we think that only 0.1% of all possible planet 9s would be consistent with the data, then it sounds twice as likely that the behavior is random chance as that it's from planet 9!
It is wrong to think along those lines too, and I think you are misrepresenting the data. First of all, the Bayesian analysis deals with the likelihood, not the probability of generating extreme data. Second, even if you for some reason chose to use the probability of generating extreme data based on the different hypotheses, I find it unlikely that the planet 9 hypothesis would provide something that was significantly worse than a 0.2% probability in most of the parameter space. The actual probability of getting such extreme data would have to be marginalised over the model parameters. It has very little to do with just saying "predict the data", which is a rather vague statement.
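As a sketch of what marginalising over model parameters means in practice, and of where the built-in Occam's razor comes from, here is a toy Bayes factor with invented likelihood functions and priors: a parameter-free null against a one-parameter alternative whose prior can be made arbitrarily vague.

Code:
import math
import random
random.seed(1)

def likelihood_null(x):
    """Null model: x is drawn from a standard normal centred on 0."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def likelihood_alt(x, theta):
    """Alternative: same shape, shifted by an unknown parameter theta."""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2 * math.pi)

def evidence_alt(x, prior_width, n=200_000):
    """Marginal likelihood: average over a uniform prior theta ~ U(-w, w)."""
    total = 0.0
    for _ in range(n):
        total += likelihood_alt(x, random.uniform(-prior_width, prior_width))
    return total / n

x_obs = 3.0   # a "surprising" observation under the null
for width in (5, 50, 500):
    bf = evidence_alt(x_obs, width) / likelihood_null(x_obs)
    print(f"prior width {width:>3}: Bayes factor alt/null ~ {bf:.1f}")
# The vaguer the prior on theta, the smaller the Bayes factor, even though
# some theta fits x_obs perfectly: that is the Occam penalty.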

Framed that way, the 99.8% figure not only sounds wrong, it sounds drastically wrong.
The wrong part is in the interpretation. You really cannot make any statement at all regarding model probability in a Frequentist setting. However, as pointed out above, your analysis is not exactly kosher either.
 

Ken G

The basic idea in what I was saying is that if you imagine two million solar systems that seem similar enough to ours that you can imagine ours is drawn at random from that sample (however you choose to make that distinction, which is already tricky), and one million of them have a planet 9 and one million don't (based on some assumptions about the appropriate planet 9 parameter space), then we could ask which of those sets of a million contains more examples of behaviors similar to what we see in our solar system. The article seemed to suggest that fraction would be 0.2%, or 2000 here, in the "no planet 9" sample (I don't know how they arrive at 0.2%, but it seems to be their conclusion). I was saying we must also ask, what would be the fraction in the "random planet 9" sample? Phil Plait seemed to think that if only 0.2% of the no-planet-9 sample exhibited the odd behaviors we see, then we should conclude we are 99.8% likely to have been drawn from the planet 9 sample. That seems to assume that the entire "random planet 9" sample exhibits the odd behavior, not just some small fraction that have the right planet 9 attributes. It seems more likely that the random planet 9 sample would also quite rarely exhibit the behavior we see, so the more informative comparison should be the relative sizes of those two samples. It might not favor planet 9 at all, depending on our assumptions about a generic planet 9 parameter space.
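That thought experiment can be written directly as a toy comparison; the fractions below are invented placeholders, not estimates from any paper.

Code:
N = 1_000_000
frac_odd_no_p9 = 0.002        # 0.2% of the "no planet 9" million look this aligned
frac_odd_random_p9 = 0.004    # made up: only some random planet 9s do the job

n_odd_no_p9 = int(N * frac_odd_no_p9)          # ~2000 systems
n_odd_random_p9 = int(N * frac_odd_random_p9)  # ~4000 systems

# Among all systems showing the odd behavior, the fraction that came from
# the planet 9 half (the relative-size comparison described above):
p_from_p9 = n_odd_random_p9 / (n_odd_random_p9 + n_odd_no_p9)
print(p_from_p9)   # ~0.67 with these made-up numbers, nowhere near 0.998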
 

Orodruin

That seems to assume that the entire "random planet 9" sample exhibits the odd behavior, not just some small fraction that have the right planet 9 attributes.
Actually, it assumes that 99.8 % of that sample has that behaviour.

It seems more likely that the random planet 9 sample would also quite rarely exhibit the behavior we see, so the more informative comparison should be the relative sizes of those two samples. It might not favor planet 9 at all, depending on our assumptions about a generic planet 9 parameter space.
The point, again, is that you are trying to do a Bayesian analysis based on a frequentist measure. What you should look at is the likelihood of your exact observation, not the probability of having a result as or more extreme under the null hypothesis.

Regardless, even if you do that, it is unlikely that a planet 9 scenario would give you a smaller fraction than in the no planet 9 scenario. You seem to be grasping for a Bayesian analysis and the proper thing to do then would be to look at the actual likelihoods rather than integrated ones.
 

Vanadium 50

I think we're getting off track. Plait is wrong not because he's not a Bayesian. Plait is wrong because he assumes there are only two logical possibilities: totally random motion, or Planet 9.
 

Orodruin

I think we're getting off track. Plait is wrong not because he's not a Bayesian. Plait is wrong because he assumes there are only two logical possibilities: totally random motion, or Planet 9.
I disagree with this. Even if there were just two logical possibilities the statement would not be correct.
 

Ken G

And for my part, I don't see that either the bimodal aspect (yes or no on planet 9) or the integrated versus differential character of the probability distribution is a key element of what is wrong in the logic. Those both just seem like binning issues to me, like how coarse or fine the binning is. I'm thinking that the core issue is the relative fairness of the comparison being made.

A better-controlled analogy might make these issues clearer. Let's say we give a 10-question true/false test to a large group of people that contains detailed questions about our own personal life, things that people who don't know us at all have no idea how to answer. But we know that half the people in the group actually do know us quite well, our friends and family, and the other half don't know us from a hole in the wall. Then the first test we grade gets 9/10, and we want to ask: what are the chances that this test came from the cohort that doesn't know us at all? By the logic being critiqued, one might say that the probability of getting at least 9 out of 10 by random chance is 11/2^10, or about 1%, so that means there's a 99% chance the test came from people who know us. We all agree this is wrong, but it's not wrong because 10/10 is being binned with 9/10 as some kind of integral of "as good or better" than the test we graded, and it's not wrong because we are looking at the people as either knowing us well, or not knowing us at all. It's wrong because we have not included the fact that even people who know us might not know all our personal details, so might do better than 5/10 on average, but might not do as well as 9/10 on average. We would need to calculate how well we expect our friends and family to do on the exam on average, and figure out what fraction would get 9/10 or better. If that fraction is, say, only 10%, then we should compare the 10% to the 1% and conclude it is ten times more likely the exam came from that cohort-- but not 99% likely, more like 91%.
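The arithmetic in that analogy is easy to check; the 10% figure for the "knows us" cohort is, as above, just an assumption of the example.

Code:
from math import comb

# Chance of at least 9 correct out of 10 true/false questions by pure guessing.
p_guess = sum(comb(10, k) for k in (9, 10)) / 2**10
print(p_guess)   # 11/1024 ~ 0.0107, i.e. about 1%

# Equal-sized cohorts: assume 10% of the "knows us" half scores >= 9/10.
p_knows = 0.10
posterior_knows = p_knows / (p_knows + p_guess)
print(posterior_knows)   # ~0.90 (the 91% above comes from rounding 1.07% to 1%)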

So that latter calculation seems to be the key missing element in the evidence for planet 9. If there is a 0.2% fraction of solar systems with behavior as aberrant as, or more aberrant than, what we see in ours, that doesn't mean it's unlikely to be due to random chance. It means we need to look at anything else we think could explain the behavior, and assess how often that explanation would produce the behavior in a random sample. That number might be something like 1%, who knows, but the point is it must be calculated to assess the relative likelihoods. If that fraction is only 5 times higher, then we can only say it is 5 times more likely that there is a planet 9 than that there isn't (if we are ambivalent about its existence, giving it a 50/50 chance out of ignorance prior to looking at any data). And of course that's a very difficult number to determine, since we'd need to know some kind of probability distribution of possible planet 9s. I think that last part is what we are all saying in different ways: the evidence for planet 9 is not necessarily anywhere close to 99.8% even if there seems to be only a 0.2% chance of getting the aberrant behavior we see from purely random chance.
 
you'd also need a prior probability density on its possible orbit and mass.
Yes, that type of approach is pretty standard. Sometimes such parameters of parameters are called hyperparameters.
 

Ken G

Ah yes, hyperparameters-- that makes sense.
 
