CERN releases 300TB of LHC data to public

In summary: it is interesting that MC simulations are disfavored versus data-driven background estimates in particle physics, while other fields have embraced MC methods (for example, using MC to simulate the interaction of cosmic rays with the lunar surface, https://arxiv.org/abs/1604.03349). One reason the data-driven methods work better is that they reproduce the actual distribution of particles in the collisions: MC simulations often produce distributions that look too smooth and featureless, while data-driven estimates capture the more complex features seen in real data.
  • #2
Hello Greg
I have to ask if your question is rhetorical? I read the linked page and a few pages it links to, and it looks like due diligence at its most basic. Also, since 300 TB is a total, one is not required to download it all. The article even mentions that some universities are using the data to task students with plotting and verifying results, so it seems a terrific way to get students involved with some of the most exciting data ever collected.

Additionally, among some, ummm, "less than stringently scientific" groups ; ) there is for some reason a compulsion to misinterpret and even lie about CERN. While some institutions employ the "ignore absurdity and maybe it will go away" method, in all honesty that has not seemed to work out well, at the very least because public opinion does indeed have a powerful effect on funding. Now it is possible to reply to such "villagers with pitchforks" with a simple, "We have released 300 TB of our data and you are welcome to examine and question any part of it. Thank you for your interest." : D
 
  • #3
Many studies can be done with small subsets of those 300 TB.

Various groups will probably look at the whole dataset - the number of analyses you can do is always limited by manpower, so there are things CMS did not study with that data. Theorists with pet models about physics beyond the standard model that would lead to very obscure signals can now check the data on their own.

From a broader perspective: the public funded the experiments. It already has access to the final published results (all CERN results are open access, and all particle physics gets uploaded to arXiv anyway), but why not also give access to the raw data?
 
  • #4
It seems they also have some other things for people to download. Some sort of simulated data and simulation tools maybe. I'm not quite sure what they are. Could anyone explain?
 
  • #5
Simulations of how the detector reacts to given (also simulated) events, yes. You need that for most analyses: you have to know what a possible signal will look like, what the background looks like, and how your detector will react to all those events.
Experiments never blindly trust those simulations, and check their accuracy with independent analyses, but it does not work completely without simulations.
 
  • #6
It is quite interesting how MC simulations are disfavored vs data-driven background estimates in particle physics... on the other hand, other fields have adopted the MCs (like using MC to simulate the interaction of cosmic rays with the lunar surface, https://arxiv.org/abs/1604.03349).
 
  • #7
It is known that the MC descriptions (simulations) are not that good, and without data-driven estimates you rarely know how large the deviations from data are.
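The idea behind a data-driven background estimate can be shown with a toy sideband subtraction: measure the background level in regions next to the signal window and interpolate it in, rather than trusting a simulation. This is a minimal sketch with invented numbers, not real CMS data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy invariant-mass spectrum: a flat background plus a Gaussian
# "signal" peak at 91 GeV (all numbers are illustrative only).
background = rng.uniform(60.0, 120.0, size=10000)
signal = rng.normal(91.0, 2.5, size=500)
mass = np.concatenate([background, signal])

# A 12 GeV signal window with 12 GeV sidebands on either side.
signal_lo, signal_hi = 85.0, 97.0
side_lo = (73.0, 85.0)
side_hi = (97.0, 109.0)

# Count events in the sidebands and scale into the signal window.
n_side = np.sum((mass > side_lo[0]) & (mass < side_lo[1])) \
       + np.sum((mass > side_hi[0]) & (mass < side_hi[1]))
side_width = (side_lo[1] - side_lo[0]) + (side_hi[1] - side_hi[0])
window_width = signal_hi - signal_lo

# Data-driven background estimate in the signal window:
bkg_estimate = n_side * window_width / side_width

n_window = np.sum((mass > signal_lo) & (mass < signal_hi))
signal_estimate = n_window - bkg_estimate
print(f"in window: {n_window}, est. background: {bkg_estimate:.0f}, "
      f"est. signal: {signal_estimate:.0f}")
```

No simulation enters the estimate at all; the background under the peak is inferred from the data itself, which is why the method sidesteps mismodelling.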
 
  • #8
I've heard that statement made about the MC method before, but I never understood why that is. Is it because of the dimensionality of the space being considered, or is it more to do with the details of the actual detectors?
 
  • #9
Both.

The description of the detector is not perfect: you never get the radiation length of every component exactly right, you never know the exact asymmetry of your detector's response to kaons versus antikaons, you don't get the exact amount of charge sharing between adjacent channels after radiation damage right, and hundreds of similar details.

The simulation of the proton-proton collisions is not perfect. This is mainly due to nonperturbative QCD effects. You don't know the parton distribution functions exactly, the hadronization description is not perfect. In addition, you have to limit the calculations of some processes to fixed order, and so on. There are processes that can be modeled very well, while others don't have any purely theoretical predictions and rely on experimental data.
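The detector-description point can be illustrated with a toy example: if the simulation assumes a slightly wrong energy resolution, the simulated peak comes out the wrong width compared to data. All resolution numbers below are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy resonance: every event has true mass 91 GeV (think Z boson).
true_m = np.full(100000, 91.0)

# Suppose the real detector smears masses with a 3% relative
# resolution, while the simulation models it as 2.5%.
data = true_m * (1 + 0.030 * rng.standard_normal(true_m.size))
sim = true_m * (1 + 0.025 * rng.standard_normal(true_m.size))

# The simulated peak is narrower than the real one - exactly the kind
# of mismodelling a data-driven comparison is designed to expose.
print(f"peak width in data: {data.std():.2f} GeV")
print(f"peak width in simulation: {sim.std():.2f} GeV")
```

Real analyses face hundreds of such effects at once (radiation lengths, charge sharing, asymmetric responses), each known only approximately, which is why the residual differences are calibrated against data.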
 

1. What is CERN?

CERN (European Organization for Nuclear Research) is a research organization that operates the largest particle physics laboratory in the world, located in Geneva, Switzerland. It is known for its flagship project, the Large Hadron Collider (LHC), which is the world's largest and most powerful particle accelerator.

2. What is the LHC?

The Large Hadron Collider (LHC) is a circular particle accelerator with a circumference of about 27 kilometres, housed in an underground tunnel at CERN. It accelerates particles to very high energies and collides them to study the fundamental structure of matter and the universe.

3. What data has been released by CERN?

CERN has released 300 terabytes (TB) of data from its experiments at the LHC to the public. The release consists of data recorded by the Compact Muon Solenoid (CMS) detector, one of the experiments involved in the discovery of the Higgs boson in 2012.

4. How can the public access the data?

The data released by CERN is available for free on the CERN Open Data Portal, where anyone can download and analyze it. The portal also provides tools and resources for users to understand and work with the data.
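The portal also offers simplified derived datasets (for example, dimuon event lists in CSV form) aimed at students. A minimal sketch of the kind of analysis one could run on such a file, computing the dimuon invariant mass from four-momenta; the column names and rows below are invented for illustration, not the portal's actual schema:

```python
import csv
import io
import math

# A few made-up rows in the style of a simplified dimuon CSV sample
# (columns: energy and momentum components of two muons, in GeV).
raw = """E1,px1,py1,pz1,E2,px2,py2,pz2
45.65,20.0,30.0,28.0,45.65,-20.0,-30.0,-28.0
29.97,5.0,-12.0,27.0,27.60,-4.0,11.0,25.0
"""

def invariant_mass(row):
    """Invariant mass of the two-muon system from (E, px, py, pz)."""
    E = float(row["E1"]) + float(row["E2"])
    px = float(row["px1"]) + float(row["px2"])
    py = float(row["py1"]) + float(row["py2"])
    pz = float(row["pz1"]) + float(row["pz2"])
    return math.sqrt(max(E * E - px * px - py * py - pz * pz, 0.0))

masses = [invariant_mass(r) for r in csv.DictReader(io.StringIO(raw))]
print([f"{m:.1f} GeV" for m in masses])
```

The first (back-to-back) pair reconstructs to roughly the Z boson mass, which is the classic first exercise: plotting many such masses reveals the Z peak directly in the open data.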

5. Why did CERN release this data?

CERN released this data to promote open science and make the research conducted at the LHC more transparent and accessible to the public. It also allows for collaboration and new discoveries to be made by scientists and researchers all over the world.
