ArXiv crackpot filter developed by accident

A program designed to categorize arXiv submissions inadvertently aids in identifying crackpot theories, as these submissions often do not fit conventional categories. The discussion highlights the challenges faced by non-native English speakers in scientific communication, which can lead to misclassification as outsiders or crackpots. Participants emphasize that effective scientific writing should prioritize clarity and standard terminology to avoid confusion. The conversation also touches on the need for better fact-checking mechanisms to distinguish between genuine scientific discourse and unfounded claims. Overall, the thread underscores the complexities of language and classification in the scientific community.
  • #31
Sir Isaac Newton famously and graciously wrote: If I have seen further it is by standing on the shoulders of Giants.

I would add: If you stand on the shoulders of giants, it is hard to see what is right in front of you.

My serious point is this: we require our youngsters to invest years of hard work learning the grand edifice of their science. Once they have done so, it can be painful for them to accept something that does not fit. I think it is no accident that breakthroughs often come from the young or weirdos -- they are not yet entrapped in the warm and blinding company of their peers. We do not want to be bored by crackpots, and they surely would disrupt a forum designed to explain the scientific edifice, but we also do not want to miss a grand idea just because the presentation or presenter is rough around the edges.
 
  • #32
Bill McKeeman said:
we also do not want to miss a grand idea just because the presentation or presenter is rough around the edges
I don't think this has ever happened.

One of the most humbling experiences of my life was going into my PhD advisor's office with a grand idea. Without saying a word he turned to his filing cabinet and pulled out a paper. There was my grand idea, published a decade previously. The thing about this experience is that it happened repeatedly over the course of years.

Grand ideas simply don't come out of nowhere. The claim that we are missing out on any grand ideas from crackpots is just fantasy.
 
  • #33
If it ever happens, you'll never see it. The idea disappears forever when that happens. I bet it has happened at least once, but more likely twice or more. How great of ideas were they? The world will never know.
 
  • #34
BiGyElLoWhAt said:
The idea disappears forever when that happens
I really doubt that. Big ideas tend to be independently developed by more than one person at the same time. They are, after all, standing on the same shoulders.
 
  • #35
Regarding Dale's comment: I agree that we cannot afford to clutter our intellectual life with repetitious versions of previously presented results. I too was once embarrassed to find my great idea for navigation in a book on the shelf of the Chief of Naval Navigation. It worked out well. I got to meet the real inventor Admiral Weems who encouraged me to try again. Tonight I watched The Imitation Game about Alan Turing and WW II cryptology which caused me to reflect on a truly weird individual who led a breakthrough into a new technology.
 
  • #36
Bill McKeeman said:
I think it is no accident that breakthroughs often come from the young or weirdos -- they are not yet entrapped in the warm and blinding company of their peers. We do not want to be bored by crackpots, and they surely would disrupt a forum designed to explain the scientific edifice, but we also do not want to miss a grand idea just because the presentation or presenter is rough around the edges.

Dale said:
I don't think this has ever happened.

It depends on how big a threshold you have for a "grand idea."

My most widely cited paper was in a field I had never published in before. My PhD is in experimental atomic physics; the paper described a hypothesis my wife and I developed in blast-related traumatic brain injury. We've offended lots of neuroscientists since then due to our rough edges and lack of knowledge regarding the secret handshake and subtle use of the language. But our hypothesis that the brain can be injured by the blast wave impacting the thorax has since been experimentally verified and stood the test of time.

Unlike many crackpot papers, my wife and I spent a long time and carefully reviewed a large volume of relevant literature while developing the idea and writing the paper. We considered carefully what kinds of experiments would be needed to test the hypothesis and whether there was already a convincing experimental disproof in the literature. We knew that there was a strong bias among experts in the field of brain injury (and medical community in general) that only insults to the head/brain can cause traumatic brain injuries. See: https://arxiv.org/ftp/arxiv/papers/0812/0812.4757.pdf

We knew through private communications and a review of the literature that our paper would likely be subject to strong biases in the peer-review process. As a result, we chose to publish it in a journal that was not peer reviewed (Medical Hypotheses) as well as arXiv. The follow-up paper where we estimated injury thresholds for the thoracic mechanism of blast-induced traumatic brain injury was published through a back door: a special issue of a journal reporting findings presented at a conference (as well as arXiv). The existing biases against the new idea would probably have otherwise prevented either paper from being published through the more traditional peer review process of submitting the paper to established experts. See: https://arxiv.org/ftp/arxiv/papers/1102/1102.1508.pdf

Both papers have been widely cited, and the benefit of hindsight shows them to be basically right, although the threshold paper somewhat overestimated the blast pressure needed to produce a traumatic brain injury by impacting the thorax. The whole field of blast-TBI shows how wrong the experts can be. For many decades most of the brain injury experts scoffed at the idea of blast-induced TBI (shell shock) and mis-attributed all the symptoms to psychological effects and malingering.
 
  • #37
Today a crackpot gave a talk in my university.
Unfortunately I wasn't good enough at math and physics to criticize him properly (he was of the advanced kind, and the audience were master's and doctoral students and two professors! Oh... and a postdoc!) but I did put up a fight.
It's amazing how many people actually take them seriously.
 
  • #38
I suppose let me reiterate. I'm not suggesting that we are missing out on any idea from "crack pots", but that we are potentially missing out on ideas from papers mislabeled as crackpottery, and that fall through the cracks.
As said in the article, there isn't a clear definition of what is considered crackpottery, we just sort of "know" when it is or isn't, which is why it's so surprising that this labeling algorithm is pulling out the crackpot papers. This is mainly due to the correlation between odd phrases and crackpottery. The two don't necessarily have to go hand in hand, either. Looking through the comments after the article, the point arises: How many of Einstein's papers would be filtered out as crackpottery?
Of course, those would probably belong to the group of crackpot papers that were then realized were actually not crack pot papers.
 
  • #39
Dr. Courtney said:
My most widely cited paper...
And so your grand idea was not lost.
 
  • #40
BiGyElLoWhAt said:
that we are potentially missing out on ideas from papers mislabeled as crackpottery
Again, I don't think that this happens. Important ideas are usually developed by multiple people independently.

To me, this is like saying that the NFL is missing out on all of the touchdowns that could be scored by people who have played sports video games. Or that medicine is missing out on all of the lives that could have been saved by all of the people who have watched an episode of Grey's Anatomy. In principle it could happen, but in practice it is a nonexistent concern.

BiGyElLoWhAt said:
How many of Einstein's papers would be filtered out as crackpottery?
Probably 0. He was not a crackpot and he did not write like a crackpot.
 
  • #41
BiGyElLoWhAt said:
How many of Einstein's papers would be filtered out as crackpottery?
Of course, those would probably belong to the group of crackpot papers that were then realized were actually not crack pot papers.

Maybe a few (some of the papers he wrote towards the end of his life WERE borderline crackpottery). However, that would be true of many papers written during that era. One of the signs of a crackpot is someone who writes in an anachronistic style, usually because they are not familiar with modern terminology or notation (typically because someone like Tesla is their big hero). Modern physics and its many conventions when it comes to style of writing etc. has only really been around since the 1930s and did not really take off until after WWII (due to the explosion in the number of researchers and the number of journals and the establishment of English as the lingua franca of physics).
If someone were to submit an article written using 1970s notation (using, say, exclusively CGS units*) to a journal and I was the referee, I would reject it outright and I would presume it was written by a crackpot.

*Yes I do know CGS units are still used in certain fields (magnetic materials etc), but they are all weird...
 
  • #42
Dale said:
Probably 0. He was not a crackpot and he did not write like a crackpot.

I think you're missing what I'm saying, which is just emphasizing one of the points made in the article. Define: Crackpottery. (rigorously define, please).
The article emphasized the ability of this algorithm to find crackpot articles, but the surprising thing is not that it does this, it's that it isn't designed to. It has effectively found a correlation between a phrase's lack of commonality and the quality of substance of the containing article. This in no way says that someone who uses odd phrases is a crackpot, they might just not have either A) the experience to know which phrases are common enough to use, or B) the ability to care about which phrases are common enough to use.

Is it possible that we have missed zero ideas from the ill definition of crackpottery? Sure. Is it possible (and in my opinion more likely) that we have missed at least one good idea from this ill definition? Also yes.
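To make the "lack of commonality" idea concrete, here is a minimal sketch of how a text could be scored by how unusual its phrases are relative to a reference corpus. This is not the arXiv classifier from the article; the function names (tokenize, bigrams, build_reference, rarity_score), the use of word pairs as "phrases", and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter

PUNCT = ".,;:!?()\"'"

def tokenize(text):
    # Lowercase words with surrounding punctuation stripped.
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def bigrams(tokens):
    # Adjacent word pairs stand in for "phrases" in this sketch.
    return list(zip(tokens, tokens[1:]))

def build_reference(corpus):
    # Count how often each phrase appears in the reference texts.
    counts = Counter()
    for doc in corpus:
        counts.update(bigrams(tokenize(doc)))
    return counts

def rarity_score(text, counts):
    # Mean surprisal (negative log probability) of the text's phrases,
    # using add-one smoothing so unseen phrases get a small probability.
    grams = bigrams(tokenize(text))
    if not grams:
        return 0.0
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves mass for unseen phrases
    surprisal = [-math.log((counts.get(g, 0) + 1) / (total + vocab)) for g in grams]
    return sum(surprisal) / len(surprisal)

# Hypothetical toy corpus; a real reference set would be far larger.
reference = build_reference([
    "we measure the cross section of the process",
    "we measure the decay rate of the particle",
])
conventional = rarity_score("we measure the cross section", reference)
unusual = rarity_score("aether vortices refute the cross section", reference)
print(conventional < unusual)
```

A high score here only says the wording is uncommon in the reference set. It says nothing about whether the content is right, which is exactly the gap between "odd phrasing" and "crackpot" raised above.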
 
  • #43
f95toli said:
some of the papers he wrote towards the end of his life WERE borderline crackpottery
Which papers are you referring to? As far as I know, when he was dying, he was working on Kaluza-Klein. Surely that's not what you're referring to? If so then it follows that string theory is crackpottery?
 
  • #44
BiGyElLoWhAt said:
I think you're missing what I'm saying, which is just emphasizing one of the points made in the article. Define: Crackpottery. (rigorously define, please).
Crackpottery in physics is the thing measured by the following test:
http://math.ucr.edu/home/baez/crackpot.html

Similar measurement instruments could be developed in other fields.
 
  • #45
Dale said:
Crackpottery in physics is the thing measured by the following test:
http://math.ucr.edu/home/baez/crackpot.html
Similar measurement instruments could be developed in other fields.

That's interesting. I'm not sure how this actually works, though. It seems as though the more points, the more crackpot it is. Considering number 1 is a 5 point starting credit, everything is a little bit crackpotty.
2) Were I working on, say, quantum gravity, it is likely that Einstein and Feynman would pop up on occasion, rendering my otherwise perfectly valid paper highly crackpotty. I suppose (from reading further on) that he's referring to the misspellings?
3) I'm pretty sure there's a reward out there to prove asymptotic freedom (maybe it's quark confinement?). I would assume that this reward would be equally applicable to someone proving the nonexistence of the proof for asymptotic freedom (or quark confinement).
4) Quite honestly, most of that is just silly, and I could probably come up with another list that included none or few of these elements and it could be equally applicable.

It is kind of funny that someone put the time into coming up with that list, though.
I particularly like number 31) "30 points for claiming that your theories were developed by an extraterrestrial civilization (without good evidence)."
I think the theory would potentially be overlooked by the "good evidence" part :biggrin:
 
  • #46
However, this does seem to justify my reemphasis of the point that there is no rigorous definition of whether a paper is crackpottery or not.
 
  • #47
BiGyElLoWhAt said:
Considering number 1 is a 5 point starting credit, everything is a little bit crackpotty.
That's a -5 starting credit!

BiGyElLoWhAt said:
It is kind of funny that someone put the time into coming up with that list, though.

Not funny when you remember that he used to be (and maybe still is) flooded with letters from crackpots!
 
  • #48
I suppose that is a (-)5 point credit. Apologies.
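For anyone else who misread the sign, the arithmetic can be sketched as below. Only the two entries quoted in this thread are included (the -5 starting credit and the 30 points for the extraterrestrial claim); the dictionary key and the function name are made up for illustration, and the rest of the linked list is deliberately left out.

```python
# The -5 starting credit, as corrected in this thread.
STARTING_CREDIT = -5

# Points per occurrence; only the one item quoted in this thread is listed.
# The key name is a hypothetical label, not wording from the index itself.
POINTS = {
    "extraterrestrial_origin_without_good_evidence": 30,
}

def crackpot_index(observed):
    # observed maps an item label to how many times it occurs in the paper.
    return STARTING_CREDIT + sum(POINTS[item] * n for item, n in observed.items())

print(crackpot_index({}))
print(crackpot_index({"extraterrestrial_origin_without_good_evidence": 1}))
```

So a paper that triggers nothing on the list ends up below zero rather than starting out "a little bit crackpotty".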
 
  • #49
Also, you have to read the names carefully: Einstien, Feynmann and Hawkins!
They are misspelled.
 
  • #50
I did see that, but only after his point about messaging him about how he misspelled Einstien. I just sort of skimmed through it, if that wasn't obvious.
 
  • #51
Dale said:
I don't think this has ever happened.

Galois couldn't get his paper accepted because it was written so badly. "The ink was almost white," the reviewer said.

Grassmann couldn't get linear algebra accepted because of the rough presentation. He self-published.
 
  • #52
Hornbein said:
Galois couldn't get his paper accepted because it was written so badly. "The ink was almost white," the reviewer said.

Well then he should have put more time into writing his paper. It's like complaining that your paper isn't accepted because it has a spelling error every other line.

Grassmann couldn't get linear algebra accepted because of the rough presentation. He self-published.

What do you mean by rough presentation: ambiguous? Or perhaps too fast (using the trivial-argument)? In both cases I'd say that it doesn't matter how good your research is if the reader has to go to extraordinary lengths to comprehend it,
be it using a magnifying glass to read it or figuring out new techniques with only a statement of the problem and the result.
 
  • #53
JorisL said:
What do you mean with rough presentation, ambiguous? Or perhaps too fast (using the trivial-argument)?

He was self-taught and used strange terminology, I think. In at least one case the referee was enthusiastic about the contents, but thought the presentation was too poor. It didn't help that his ideas were so original.
 
  • #54
Hornbein said:
He was self-taught and used strange terminology, I think. In at least one case the referee was enthusiastic about the contents, but thought the presentation was too poor.

Isn't that normal? How can you understand something if the terminology isn't standard, adding an extra layer of difficulty?
Ideally the referee would help him contact someone that's willing to help but this requires even a basic understanding of what's actually done.
 
  • #55
JorisL said:
Isn't that normal? How can you understand something if the terminology isn't standard, adding an extra layer of difficulty?
Ideally the referee would help him contact someone that's willing to help but this requires even a basic understanding of what's actually done.

The question under discussion is whether we might "miss a grand idea just because the presentation or presenter is rough around the edges." The question is not whether poor presentations are difficult to understand.
 
  • #56
BiGyElLoWhAt said:
However, this does seem to justify my reemphasis of the point that there is no rigorous definition of whether a paper is crackpottery or not.
The point is that it isn't necessary to have a separate definition, just a measurement that you define as crackpottery.

The arXiv filter itself can be considered a measurement and crackpottery can be defined as a particular score or range of scores on that measurement. You could even take some reference standard crackpots and calibrate other similar measurements. All without ever providing a non empirical definition.
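As a rough sketch of what that operational definition could look like: take scores from some measurement, calibrate a cutoff against a few reference-standard papers, and call anything at or above the cutoff crackpottery. The function names, the choice of the lowest reference score as the cutoff, and the numbers are all hypothetical.

```python
def calibrate_threshold(reference_scores, margin=0.0):
    # Cutoff = lowest score among the reference-standard crackpot papers,
    # optionally lowered by a safety margin.
    if not reference_scores:
        raise ValueError("need at least one reference score")
    return min(reference_scores) - margin

def is_crackpottery(score, threshold):
    # Purely empirical: the label is defined by the score, nothing else.
    return score >= threshold

# Hypothetical scores for papers agreed in advance to be reference crackpots.
threshold = calibrate_threshold([42.0, 57.0, 88.0])
print(is_crackpottery(60.0, threshold))
print(is_crackpottery(3.0, threshold))
```

A different measurement can be calibrated the same way by scoring the same reference papers with it and recomputing the cutoff.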
 
  • #57
I suppose that's a valid point.
 
  • #58
Hornbein said:
Galois couldn't get his paper accepted because it was written so badly. "The ink was almost white," the reviewer said.

Grassmann couldn't get linear algebra accepted because of the rough presentation. He self-published.
And we have not missed their ideas; we have them and use them.

I think that Galois' case is pretty much a worst case example, the dissemination was delayed by about 10 years due to the poor presentation.
 
  • #59
Dale said:
, the dissemination was delayed by about 10 years due to the poor presentation.

And the fact that Galois was unable to revise anything in those 10 years, being rather inconveniently dead.
 
  • #60
Vanadium 50 said:
And the fact that Galois was unable to revise anything in those 10 years, being rather inconveniently dead.
Yes, pretty much a worst case indeed.
 
