CERN releases 300TB of LHC data to public

Greg Bernhardt · Apr 24, 2016

Ok, who has a spare supercomputer?
http://cms.web.cern.ch/news/cms-releases-new-batch-research-data-lhc

What is their motivation for this, and realistically what can come from it?

enorbet · Apr 24, 2016

Hello Greg
I have to ask if your question is rhetorical. I read the linked page and a few of the pages it links to, and at its most basic it looks like due diligence to me. Also, since 300TB is the total, one is not required to download it all. The article even mentioned that some universities are using the data to task students with plotting and verifying, so it seems a terrific way to get students involved with some of the most exciting data ever collected.

Additionally, some ummm "less than stringently scientific groups" ; ) for some reason feel compelled to misinterpret and even lie about CERN. While some institutions employ the "ignore absurdity and maybe it will go away" method, in all honesty that has not seemed to work out well, at the very least since public opinion does indeed have a powerful effect on funding. Now it is possible to reply to such "villagers with pitchforks" with a simple, "We have released 300 TB of our data and you are welcome to examine and question any part of it. Thank you for your interest." : D

mfb · Apr 24, 2016

Many studies can be done with small subsets of those 300 TB.

Various groups will probably look at the whole dataset - the number of analyses you can do is always limited by manpower, so there are things CMS did not study with that data. Theorists with pet models about physics beyond the standard model that would lead to very obscure signals can now check the data on their own.

From a broader perspective: the public funded the experiments. It already has access to the final published results (all CERN results are open access, and all particle physics gets uploaded to arXiv anyway), but why not also give access to the raw data?

ShayanJ · Apr 25, 2016

It seems they also have some other things for people to download. Some sort of simulated data and simulation tools maybe. I'm not quite sure what they are. Could anyone explain?

mfb · Apr 25, 2016

Simulations of how the detector reacts to given (also simulated) events, yes. You need that for most analyses. You have to know what a possible signal will look like, what the background looks like, and how your detector will react to all those events.
Experiments never trust those simulations, and check their accuracy with independent analyses, but you cannot do entirely without simulations.
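
To make the "check their accuracy" part a bit more concrete, here is a minimal sketch of one common pattern: measure a selection efficiency once in data and once in simulation, then take the ratio as a correction ("scale factor") for the simulation. The function names and all event counts below are made up for illustration and are not taken from CMS or from the released dataset.

```python
import math


def efficiency(passed, total):
    """Return (efficiency, simple binomial uncertainty) for passed/total events."""
    eff = passed / total
    err = math.sqrt(eff * (1.0 - eff) / total)
    return eff, err


def scale_factor(data_pass, data_total, mc_pass, mc_total):
    """Return the data/simulation efficiency ratio and its propagated uncertainty."""
    eff_data, err_data = efficiency(data_pass, data_total)
    eff_mc, err_mc = efficiency(mc_pass, mc_total)
    sf = eff_data / eff_mc
    # Relative uncertainties of the two efficiencies added in quadrature.
    rel_err = math.sqrt((err_data / eff_data) ** 2 + (err_mc / eff_mc) ** 2)
    return sf, sf * rel_err


# Hypothetical counts: the simulation is slightly too optimistic here.
sf, sf_err = scale_factor(data_pass=9200, data_total=10000,
                          mc_pass=95000, mc_total=100000)
print(f"data/MC scale factor = {sf:.3f} +/- {sf_err:.3f}")
```

A ratio away from 1 by more than its uncertainty indicates that the simulation does not describe this selection well and needs a correction or a systematic uncertainty.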

ChrisVer · Apr 25, 2016

It is quite interesting how MC simulations are disfavored vs. data-driven background estimates in particle physics... On the other hand, other fields have adopted MC (like using MC to simulate the interaction of cosmic rays with the lunar surface: https://arxiv.org/abs/1604.03349).

mfb · Apr 25, 2016

It is known that the MC descriptions (simulations) are not that good, and without data-driven estimates you rarely know how large the deviations from data are.
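
As a toy illustration of what a data-driven estimate can look like, here is a minimal sideband sketch. It assumes a background that is flat in mass, and it uses a pseudo-dataset generated on the spot (a uniform background plus a Gaussian peak) rather than real CMS data. The window boundaries, event counts, and function names are all invented for the example.

```python
import random


def count_in_window(masses, window):
    """Count events with lo <= mass < hi."""
    lo, hi = window
    return sum(1 for m in masses if lo <= m < hi)


def sideband_background_estimate(masses, signal_window, sidebands):
    """Estimate the background in the signal window from the sideband density.

    Assumes the background is flat in mass, so the sideband event density
    can be scaled by the width of the signal window.
    """
    lo, hi = signal_window
    sideband_width = sum(b - a for a, b in sidebands)
    n_sideband = sum(count_in_window(masses, sb) for sb in sidebands)
    return n_sideband * (hi - lo) / sideband_width


rng = random.Random(42)  # fixed seed so the toy is reproducible

# Toy "data": flat background between 100 and 150, plus a narrow peak at 125.
background = [rng.uniform(100.0, 150.0) for _ in range(5000)]
signal = [rng.gauss(125.0, 1.5) for _ in range(300)]
data = background + signal

signal_window = (120.0, 130.0)
sidebands = [(105.0, 115.0), (135.0, 145.0)]

expected_bkg = sideband_background_estimate(data, signal_window, sidebands)
observed = count_in_window(data, signal_window)
excess = observed - expected_bkg
print(f"expected background: {expected_bkg:.0f}, observed: {observed}, excess: {excess:.0f}")
```

The point of the sketch is that the background under the peak is taken from the data itself, so no simulation of the background shape or rate is needed, only the assumption that it is flat across the three regions.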

Haelfix · Apr 25, 2016

I've heard that statement made about the MC method before, but I never understood why that is. Is it because of the dimensionality of the space being considered, or is it more to do with details of the actual detectors?

mfb · Apr 25, 2016

Both.

The description of the detector is not perfect: you never get the radiation length of every component exactly right, you never know the exact asymmetry of your detector in its response to kaons/antikaons, you don't get the exact amount of charge sharing between adjacent channels after radiation damage right, and hundreds of similar details.

The simulation of the proton-proton collisions is not perfect. This is mainly due to nonperturbative QCD effects. You don't know the parton distribution functions exactly, and the hadronization description is not perfect. In addition, you have to limit the calculations of some processes to fixed order, and so on. There are processes that can be modeled very well, while others don't have any purely theoretical predictions and rely on experimental data.
