ArXiv crackpot filter developed by accident

  • #1
34,060
9,935

Main Question or Discussion Point

A very interesting blog post (from @hossi).

A program that helps sort arXiv submissions into categories frequently struggles with crackpot submissions - because they do not fit in anywhere. The program was never designed for it, but it helps find them.
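
For anyone curious how a sorter can end up doubling as a filter, here is a minimal sketch. It is not arXiv's actual classifier: the category vocabularies, the `classify` function and the 0.2 threshold are all made up for illustration. The idea is only that a text which scores poorly against every category "does not fit in anywhere".

```python
import math
import re
from collections import Counter

# Hypothetical vocabulary snippets per category (illustrative only, not real arXiv data).
TRAINING = {
    "hep-ph": "resonance scalar singlet gauge group dimension-5 operators "
              "higgs boson coupling diphoton luminosity",
    "astro-ph": "galaxy redshift stellar mass halo luminosity survey "
                "supernova cosmic microwave background",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9-]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# One word-count profile per category.
PROFILES = {cat: Counter(tokenize(text)) for cat, text in TRAINING.items()}


def classify(text, threshold=0.2):
    """Return (best_category, scores); best_category is None when nothing fits.

    The threshold is an arbitrary assumption chosen for this toy example.
    """
    doc = Counter(tokenize(text))
    scores = {cat: cosine(doc, profile) for cat, profile in PROFILES.items()}
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return None, scores  # fits no category well: candidate for a human look
    return best, scores


physics_label, _ = classify(
    "We assume the resonance is a scalar singlet under the gauge group "
    "and does not mix with the Higgs boson"
)
odd_label, _ = classify(
    "Energy and solid material are night and day. "
    "This mystery slaps us all across the face."
)
print(physics_label, odd_label)
```

With these toy inputs the first text lands in "hep-ph", while the second shares no vocabulary with either profile and comes back as `None`. A real system would of course be trained on far more text, and a low score only means "unusual", not "wrong".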
 

Answers and Replies

  • #2
Drakkith
Staff Emeritus
Science Advisor
20,746
4,457
Haha! Nice!
 
  • #3
1,486
3,051
mfb said:
A very interesting blog post (from @hossi).

A program that helps sort arXiv submissions into categories frequently struggles with crackpot submissions - because they do not fit in anywhere. The program was never designed for it, but it helps find them.

Fun! And interesting too...
 
  • #4
12,683
9,218
I almost automatically have to think of the documentary about Darwin's life I saw today. Or of this Japanese mathematician whose proof nobody else in the world can understand. (Sorry, I've forgotten the name and the conjecture.) And Einstein was lucky that a solar eclipse came around. And his "biggest stupidity" now has a value.
 
  • #5
259
786
mfb said:
A very interesting blog post (from @hossi).

A program that helps sort arXiv submissions into categories frequently struggles with crackpot submissions - because they do not fit in anywhere. The program was never designed for it, but it helps find them.

Nice indeed. Over time I have come to loathe crackpottery because it does so much damage to those who are still learning. They come to believe things that haven't been proven, that on occasion can be proven wrong, and that are sometimes extremely biased toward a subject with more crackpottery.

However, this is another reason to stick to my own language when writing a scientific paper. Otherwise I think I would fall into that spat-out category, even though I'm not a crackpot, just because English is not my first language, and if I were to write a paper in English, I would 100% certainly use words not used by native English speakers. Not because I'm not trained in science, but because I'm not natively trained in English.

Interesting to note: when I once took an IQ test at a doctor's office, the test was in my language and I scored above average. Yet in the crappy online IQ tests that are given in English, I score average and on occasion even lower than average, because they are in English and sometimes I don't even understand the instructions correctly (what they are asking me to do). So in my language I'm above average; in English, average and sometimes even below. Another reason for me to believe that online IQ tests are biased in favor of some groups at the expense of other groups.

So sorry beforehand, my PF fellows, if I sometimes sound below average in my posts. :oops:

http://backreaction.blogspot.de/2016/05/the-holy-grail-of-crackpot-filtering.html said:
It doesn’t surprise me much – you can see this happening in comment sections all over the place: The “insiders” can immediately tell who is an “outsider.” Often it doesn’t take more than a sentence or two, an odd expression, a term used in the wrong context, a phrase that nobody in the field would ever use. It is only consequential that with smart software you can tell insiders from outsiders even more efficiently than humans.
It is therefore also consequential that a non-native English speaker runs a higher chance of being classified as an outsider by an English-speaking community. :sorry:

Not because the person doesn't know science, but because the person doesn't communicate the same way, even if they apply the same scientific method. :confused:
 
  • #6
34,060
9,935
Most scientists learn English as a foreign language. Scientific English is not so hard to learn I think. You don't want to use complicated grammar or unusual words* anyway.

*unusual as in "not used in everyday language AND not a scientific expression".
 
  • #7
12,683
9,218
mfb said:
Most scientists learn English as a foreign language. Scientific English is not so hard to learn I think. You don't want to use complicated grammar or unusual words* anyway.

*unusual as in "not used in everyday language AND not a scientific expression".

I was once told by a scientist (with a staccato accent): "Scientific English is broken English."
 
  • #8
MathematicalPhysicist
Gold Member
4,220
172
mfb said:
Most scientists learn English as a foreign language. Scientific English is not so hard to learn I think. You don't want to use complicated grammar or unusual words* anyway.

*unusual as in "not used in everyday language AND not a scientific expression".

This week, while reading Principles of Algebraic Geometry by Griffiths and Harris, I came across the word "abut"; I had never encountered this word in my life.

So I guess that scientific English also uses quite a lot of unusual words; it depends on when the book was published. The further back in the past, the more the language differs from present-day English.
 
  • #9
2,788
586
It is therefore also consequential that a non-native English speaker runs a higher chance of being classified as an outsider by an English-speaking community. :sorry:

Not because the person doesn't know science, but because the person doesn't communicate the same way, even if they apply the same scientific method. :confused:

As someone who knows English as a second language, I've never encountered such a thing! I've been here since high school, and my English surely wasn't as good then as it is now. I can even say that I got better at English partly because of my involvement with physicsforums, and during all these years I don't remember being labeled a crackpot because of not being good enough at English. So I'm sure you don't need to worry about this.
When you regularly read scientific writing, either forum posts or papers, you can easily recognize a crackpot. I myself have experienced it several times. After reading even a sentence or two of the post, I said to myself that this guy is surely a crackpot, and it had nothing to do with the way s/he used English. In fact, as far as I can remember, all of the crackpots I recognized were native speakers.
 
  • #10
437
103
mfb said:
Most scientists learn English as a foreign language. Scientific English is not so hard to learn I think. You don't want to use complicated grammar or unusual words* anyway.

*unusual as in "not used in everyday language AND not a scientific expression".

The fact that other people may not have a good command of a certain language should not preclude the use of less than common words. This is especially true if the word in question has a highly defined meaning. It should be self-evident that one should give consideration to the intended recipient of a communication and select words accordingly.
I do not think it is appropriate that I should limit your vocabulary due to my lack of understanding of the language being used.

Cheers,

Billy
 
  • #11
34,060
9,935
That was not my point. Taken randomly from a random theoretical particle physics preprint on arXiv:
Using concrete examples, we demonstrate the most sensitive channels and relevant bounds, as well as the required integrated luminosity to rule out particular models explaining the diphoton excess. For concreteness, we assume throughout the paper that the resonance is a scalar singlet under the SM gauge group. Hence, its interactions with SM particles are captured at leading order by a set of dimension-5 operators suppressed by a new physics scale ##\Lambda## [11]. We further assume that the new resonance does not mix with the SM Higgs boson, as existing and projected limits from Higgs coupling measurements set strong indirect constraints [12].
You don't want to write literature - the physics is complicated enough, adding complicated grammar doesn't help anyone. And you also want to use words with a clear meaning to everyone, which usually means you use the words everyone else uses. I'm sure you can find different ways to say "leading order", for example, but why should you? There is certainly a synonym for "resonance", but everyone calls it resonance because everyone knows what everyone else means by that word.

Taken from a random crackpot webpage:
Energy and solid material are night and day. One is never like the other. Yet here, they say you get what is widely believed to be physical matter that does not behave the same in all frames of reference. They say we're all seeing solid material behaving like energy -- This mystery cannot be. It slaps us all across the face, demanding satisfaction, demanding a sensible explanation to prevent our imaginations from getting in the way of what is really happening. Here tiny bits of physical matter appear to imaginative theories to not act the same in all frames of reference. They say solid material creates a pattern of waves and no fact explains why, only theory; only guesses, through which our imagination's desperate appeals for attention and affirmation thrive in the face of no sustained argument to the contrary -- science fails where over imagination takes root.
See the difference?
That is not literature either, but the choice of words is completely different.
 
  • #12
437
103
Hi mfb,
I don't think we are in any disagreement. Most anyone not directly involved in physics would likely not understand the word "diphoton" used in the first text for example. I myself have only a very limited understanding of that word and I assume it means a resonance particle.

My comment was less about syntax and more about the use of words that clearly define something in standard language. For example, the word "bruise" is a word most anyone would understand and communicates well enough in some cases. It is not as descriptive as the term "subcutaneous hematoma" a word that many people would consider perhaps unusual.

I 100% agree with your example of the use of the word resonance; why would one call that term anything else? I don't see any synonym that I could use to replace that word. I also agree that the use of "literary" devices has little or no place in the language of science. While the phrase "the force that through the green fuse drives the flower" may produce a smile, it would get in the way of someone understanding biology....lol

I had little issue understanding both of the texts you posted, but I don't think either one was very well written. Both communicated. The first one was based on the laws of physics as we currently understand them. The second was pure speculation and opinion with no basis in fact, written in a manner to elicit an emotional response.

Real "crackpots" are easy to spot. They have a personal agenda inconsistent with generally accepted facts and are highly emotionally involved in their beliefs. On the other hand, it is sometimes very difficult to spot people who are uninformed and are stating something as fact. I hear so-called subject matter experts on the TV news, in print and on the internet who have very little idea what they are talking about. This includes very well educated people who should know better. These folks are more problematic than the crackpots.

More than a crackpot algorithm, we need a fact-checking algorithm. Until that happens, those people who actually know and understand things will be burdened with the responsibility of exposing and correcting the nonsense that gets written by crackpots and the uninformed alike.

Cheers,

Billy
 
  • #13
TeethWhitener
Science Advisor
Gold Member
1,689
1,010
mfb said:
That was not my point. Taken randomly from a random theoretical particle physics preprint on arXiv:
You don't want to write literature - the physics is complicated enough, adding complicated grammar doesn't help anyone. And you also want to use words with a clear meaning to everyone, which usually means you use the words everyone else uses. I'm sure you can find different ways to say "leading order", for example, but why should you? There is certainly a synonym for "resonance", but everyone calls it resonance because everyone knows what everyone else means by that word.

Taken from a random crackpot webpage:
See the difference?
That is not literature either, but the choice of words is completely different.

This is one of the best examples I've ever seen.
 
  • #15
e.bar.goum
Science Advisor
Education Advisor
951
388
mfb said:
Most scientists learn English as a foreign language. Scientific English is not so hard to learn I think. You don't want to use complicated grammar or unusual words* anyway.

*unusual as in "not used in everyday language AND not a scientific expression".

Indeed. As a native English speaker, I've had to make a definite effort to simplify my choice of words and sentence structure for Scientific English.
 
  • #16
BiGyElLoWhAt
Gold Member
1,560
113
You know what? I actually really like the conclusion. That's been something that I've been struggling with for a bit. It seems like a lot of stuff is really quick to get tossed out if it doesn't have support by Kaku, or Hawking, or ...
 
  • #17
Drakkith
Staff Emeritus
Science Advisor
20,746
4,457
BiGyElLoWhAt said:
It seems like a lot of stuff is really quick to get tossed out if it doesn't have support by Kaku, or Hawking, or ...

What?
 
  • #18
sophiecentaur
Science Advisor
Gold Member
24,291
4,319
Can there really be any surprise about all this? If we used a machine-like language to live our lives then we could only communicate simple machine-like ideas. Language is greater than this.
Science is not easy, even less at the cutting edge and we must expect confusion whenever everyday language is used as a description.
Maths does a good job, here.
I detect notes of complaint about the fact that non-native English speakers find difficulty and demands for non-mathematical explanations.
We are stuck with the way language is so rich and the way it is used. No complaints.
 
  • #19
BiGyElLoWhAt
Gold Member
1,560
113
Drakkith said:
What?

I don't know, maybe it's just because I'm an undergrad and I'm basing this on personal experience (people within the department). Also, there seem to be some (in my opinion) interesting and neat theories that have fizzled out. Maybe there's a reason for that, as I'm not necessarily versed in any of these in any sort of depth.
What I was referencing was the last couple paragraphs:
Blog said:
Conventional science isn’t bad science. But we also need unconventional science, and we should be careful to not assign the label “crackpottery” too quickly. If science is what scientists do, scientists should pay some attention to the science of what they do.
Maybe it's, again, just not accessible to me as an undergrad, but I don't tend to see many "out there" theories. I even did a google search yesterday trying to find some, and the most "out there" thing that I came across was Unparticle theory.

The blog post also cited a stat that the rate of production of these "out there" theories has declined by about a percentage point, from 3 point something to 2 point something, and the given reason for this was conformism. Not sure if that was the researchers' conclusion or the blogger's, though.

Blog said:
They found that having previously unlikely combinations in the quoted literature is positively correlated with the later impact of a paper. They also note that the fraction of papers with such ‘unconventional’ combinations has decreased from 3.54% in the 1980s to 2.67% in the 1990s, “indicating a persistent and prominent tendency for high conventionality.”
 
  • #20
34,060
9,935
I guess 2.67% of the 1990s' publication rate is still more than 3.54% of the 1980s' publication rate. Science is getting more complex, so you need more and more papers to cover the same fields and also new fields. You can't have something like special relativity in every new publication.

BiGyElLoWhAt said:
but I don't tend to see many "out there" theories

They are "out there" for a good reason, and they are rare for the same good reason.
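
To spell out the comparison of the two percentages: a sketch, where ##N_{80}## and ##N_{90}## are symbols introduced here (not from the blog post) for the total number of papers published in the 1980s and in the 1990s.

$$0.0267\,N_{90} > 0.0354\,N_{80} \iff \frac{N_{90}}{N_{80}} > \frac{0.0354}{0.0267} \approx 1.33$$

So the absolute number of "unconventional" papers goes up whenever the total output grew by more than about a third between the two decades. No publication counts are given in this thread, so that growth is the assumption behind the "I guess".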
 
  • #21
BiGyElLoWhAt
Gold Member
1,560
113
That very well may be. I suppose I would just like to see more of it. Of course, with rigor and all of the necessary things, but crazy ideas like the invariance of the speed of light.
 
  • #22
Nugatory
Mentor
12,619
5,171
BiGyElLoWhAt said:
It seems like a lot of stuff is really quick to get tossed out if it doesn't have support by Kaku, or Hawking, or ...

For example?
 
  • #23
BiGyElLoWhAt
Gold Member
1,560
113
Nugatory said:
For example?

I know generalizing is always bad, but I was pretty much generalizing to anything that isn't generally accepted. As an example, let's just drop any type of preon theory. I found those via the internet, and they seem to make sense (to me). However, there are many things like this that seem to have been "swept under the rug", so to speak. Again, this is based on my personal experience in my physics department, and the general attitude of other students as well as the profs, as well as what gets talked about in class and out.

I suppose I should mention that I'm sure there were at least some problems with it, either not coinciding with what we have (I know I myself couldn't resolve Rishon theory with the generation of mesons conceptually), or other reasons. I just feel like, if nothing else, these theories have value in a timeline sort of way, a history lesson so to speak. Not to derail this further, but I read a very recent psych article where they showed one group of physics students a bunch of failure stories from Einstein, Galileo, and Curie. Those students' marks went up w.r.t. the control and the students who were read success stories. This is the article. One important detail is that the primary attribute was motivation, but improved scores are improved scores, IMO.
 
  • #24
Dr. Courtney
Education Advisor
Insights Author
Gold Member
3,139
2,161
Well, that certainly explains why some of my ballistics papers are often set aside for a week or so for moderation. I think the computer may be spitting them out because there are so few submissions in ballistics on arXiv. Once a human looks at one, it makes sense, and they quickly discover that I am one of the most widely published physicists in ballistics in the past decade, but there are simply too few submissions in ballistics for the computer to make sense of.

Something different is going on in Physics Education (physics.ed-ph). We actually had a paper that had already been accepted and appeared in an educational journal (Eur J Phys) reclassified as popular physics (physics.pop-ph). Still, we got a lot of great press on that one, with a mention in the MIT Tech Review blog and Physics World. See:

https://arxiv.org/ftp/arxiv/papers/1305/1305.0966.pdf
 
  • #25
29,072
5,340
BiGyElLoWhAt said:
The blog post also cited a stat that the rate of production of these "out there" theories has declined by about a percentage point, from 3 point something to 2 point something, and the given reason for this was conformism.

You appear to be substantially misunderstanding what the author is talking about in the referenced statistic. The author is not talking about crackpots but about scientific authors who use unusual combinations of references, in other words authors who establish new connections between existing disciplines. Crackpots generally don't even know the literature of a single discipline, let alone multiple.
 
