Einstein's papers are now publicly available online:
http://www.theverge.com/2014/12/5/7340029/digital-einstein-archives-princeton
The discussion centers around the accessibility of Einstein's papers online, exploring the implications of their availability in the context of "Big Data" and indexing challenges. Participants share links to the digital archives and reflect on the volume of data involved.
Participants express varying views on the implications of the volume of Einstein's papers and the challenges of indexing large datasets. There is no consensus on the classification of the data as "big data," and the discussion remains unresolved regarding the best approaches to managing such information.
Participants reference different scales of data and indexing challenges without resolving the definitions of "big data" or the effectiveness of browsing large archives.
Readers interested in digital archives, data management, and the historical context of scientific writings may find this discussion relevant.
Doug Huffman said:
My neighbors and I recently discussed "Big Data" and the difficulty indexing it. Precisely the volume of Einstein's papers was mentioned. I struggle with Martin Luther's 15,000 pages of writings and sermons. Browsing is not particularly useful in big-data.

15,000 pages is a daunting amount of data when it's dumped in front of you, but it's not "big data" in the modern sense. At 15,000 pages, a few hundred words per page, and maybe ten characters per word, you're talking tens of megabytes. A random American grocery store generates and warehouses that much data every hour or so.
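
For anyone who wants to check the back-of-envelope estimate above, here is a minimal sketch in Python. The figure of 300 words per page is an assumed stand-in for "a few hundred", one byte per character assumes plain ASCII-like text, and the function name is purely illustrative.

```python
def estimate_text_size_bytes(pages, words_per_page, chars_per_word, bytes_per_char=1):
    """Rough size of a plain-text corpus, ignoring markup, images, and compression."""
    return pages * words_per_page * chars_per_word * bytes_per_char


# Assumed inputs: 15,000 pages, ~300 words per page, ~10 characters per word.
size = estimate_text_size_bytes(15_000, 300, 10)
print(f"{size / 1e6:.0f} MB")  # prints "45 MB"
```

Changing the words-per-page guess to anything between 200 and 500 still keeps the total in the tens of megabytes, which is the point of the estimate. Scanned page images would of course be far larger than the raw text.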