Visualisation of every paper on the arXiv


Discussion Overview

The discussion revolves around a visualization of scientific papers from the arXiv, specifically focusing on a map that represents 865,906 papers. Participants explore the nature of the visualization, including the meaning of spatial coordinates and the labeling of papers.

Discussion Character

  • Exploratory, Technical explanation, Conceptual clarification

Main Points Raised

  • Some participants inquire about the axes of the map, seeking clarification on what they represent.
  • One participant suggests that the spatial coordinates may not have quantitative meaning, indicating that the dots represent concept bubbles rather than specific data points.
  • Another participant speculates that larger dots could indicate papers that are more heavily cited.
  • There is a suggestion that the spacing of the papers might relate to how they cite one another.
  • A later post provides detailed information about the labeling process on the map, explaining how labels are generated based on titles and abstracts, and how thematic regions are identified and labeled.
  • Future plans for improving the labeling process are mentioned, including a desire for a smoother transition between zoom levels.

Areas of Agreement / Disagreement

Participants express curiosity and propose various interpretations of the visualization, but there is no consensus on the meaning of the spatial arrangement or the significance of the labels.

Contextual Notes

Limitations include the lack of clarity on the quantitative meaning of spatial coordinates and the assumptions underlying the labeling process.

A map of 865,906 scientific papers from the arXiv

Paperscape1.jpg


Click here for the big map
http://paperscape.org/
 
What are the axes?
 
Vanadium 50 said:
What are the axes?

The website might tell you more.
 
I think the spatial coordinates don't have any quantitative meaning. If you zoom in, all the little dots are actually concept bubbles surrounded by finer concept bubbles.
 
OK... they're actually each a paper. I'm guessing the bigger ones are more heavily cited?
 
Maybe their spacing is related to how they cite one another?
 
"The labels on the map are generated mostly automatically. When zoomed out, arXiv categories are displayed, and the position of the category label is computed as the average of all papers in that category. As you zoom in, these category labels disappear, and are replaced by individual labels on top of each paper, so long as that paper is “big enough” on screen. The labels for each paper are determined by analysing the title and abstract, looking for common keywords.

We have now added a third layer to this labelling process: we identify by eye regions of the map that have a definite theme, and give these regions a generic, but not too generic, label. For example, we can clearly identify the “neutrino” area in the north, and the “inflation” area at the interface of hep-th and astro-ph.

These new labels make the transition from arXiv category to keyword labels a bit easier to follow, and also allow you to more easily understand where you are on the map.

In the future we plan to implement a more sophisticated way of labelling that transitions smoothly between zoom levels, much like in a map of the geographic world. If you have any suggestions for this, please leave us a comment."

-Development Blog
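
To make the first two labelling layers from the quoted post concrete, here is a rough Python sketch. It is not Paperscape's actual code: the paper tuples, the coordinates, the `min_pixels` threshold, and both function names are made up for illustration. It assumes the category label sits at the unweighted mean position of that category's papers, and that "big enough on screen" means the paper's radius times the zoom factor exceeds some pixel cutoff.

```python
from collections import defaultdict

# Hypothetical sample data: (paper_id, arXiv category, x, y, radius).
# All values are invented for illustration only.
papers = [
    ("p1", "hep-th", 0.0, 0.0, 4.0),
    ("p2", "hep-th", 2.0, 4.0, 1.0),
    ("p3", "astro-ph", 10.0, 10.0, 2.0),
]


def category_label_positions(papers):
    """Place each category label at the mean (x, y) of its papers.

    Assumption: a plain unweighted average; the real map may weight papers.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # category -> [sum_x, sum_y, count]
    for _paper_id, category, x, y, _radius in papers:
        entry = sums[category]
        entry[0] += x
        entry[1] += y
        entry[2] += 1
    return {cat: (sx / n, sy / n) for cat, (sx, sy, n) in sums.items()}


def papers_to_label(papers, zoom, min_pixels=6.0):
    """Return ids of papers whose on-screen radius reaches the cutoff.

    Assumption: on-screen radius = map radius * zoom; min_pixels is hypothetical.
    """
    return [pid for pid, _cat, _x, _y, radius in papers
            if radius * zoom >= min_pixels]


if __name__ == "__main__":
    print(category_label_positions(papers))
    print(papers_to_label(papers, zoom=2.0))
    print(papers_to_label(papers, zoom=3.0))
```

Zooming in raises `zoom`, so more papers pass the cutoff and pick up individual labels, which matches the behaviour the blog describes for the transition away from category labels.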
 
