LHC Part 4: Searching for New Particles and Decays - Comments

SUMMARY

The forum discussion centers on the analysis of particle collisions at the Large Hadron Collider (LHC), specifically the search for new particles and decays. Participants discuss the challenges of handling vast datasets, with the ATLAS and CMS experiments facing roughly 80 TB/s at a 40 MHz bunch crossing rate before triggering. The conversation highlights statistical pitfalls such as the look-elsewhere effect and p-hacking, and addresses the public availability of LHC datasets for analysis. Users express anticipation for upcoming results and note that studies of the 7 TeV and 8 TeV datasets are still ongoing.

PREREQUISITES
  • Understanding of particle physics and the LHC's role in experimental research.
  • Familiarity with statistical methods in data analysis, including the look-elsewhere effect.
  • Knowledge of data processing techniques for large datasets, particularly in high-energy physics.
  • Experience with data analysis software, specifically C++ and Python for handling .root files.
NEXT STEPS
  • Research the look-elsewhere effect and its implications in particle physics experiments.
  • Explore the data processing techniques used in high-energy physics, focusing on the ATLAS and CMS experiments.
  • Learn about the structure and analysis of .root files in C++ and Python.
  • Investigate the public datasets released by ATLAS, CMS, and LHCb for practical analysis opportunities.
USEFUL FOR

Particle physicists, data analysts in high-energy physics, researchers interested in statistical methods, and anyone looking to analyze LHC datasets.

mfb submitted a new PF Insights post

LHC Part 4: Searching for New Particles and Decays


Continue reading the Original PF Insights Post.
 
Excellent article with very clear explanations.
By looking at more places, we made it more likely to see larger statistical fluctuations. This is called look-elsewhere-effect or trials factor.
This is also known as the Green Jelly Bean effect in the medical sciences or p-hacking in the social sciences.
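To see how quickly a trials factor inflates the chance of a fluctuation, here is a minimal sketch, assuming independent search bins; the bin count and significance are illustrative, not from any real analysis:

```python
# Hedged sketch of the look-elsewhere effect (trials factor).
# Assumes independent search bins; all numbers are illustrative.

def global_p_value(p_local: float, n_trials: int) -> float:
    """Probability that at least one of n_trials independent searches
    fluctuates at least as strongly as the local p-value."""
    return 1.0 - (1.0 - p_local) ** n_trials

# A 3-sigma local excess corresponds to p_local ~ 0.00135 (one-sided).
p_local = 0.00135

# Looking in 100 independent mass bins:
p_global = global_p_value(p_local, 100)
print(f"local p = {p_local}, global p = {p_global:.3f}")  # ~0.126
```

A locally impressive 3-sigma excess thus becomes a roughly 1-in-8 occurrence once you account for looking in 100 places, which is why experiments quote both local and global significances.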
A really weird statistical fluctuation, a new particle, or some weird experimental effect?
Any guesses at this point, or should we all wait for Friday to see what the additional analyses have turned up?
 
Thanks :)
Ygggdrasil said:
Any guesses at this point, or should we all wait for Friday to see what the additional analyses have turned up?
Wait for Friday. There are various rumors around, I won't comment on them.

I'll post results here as soon as they are public.

Found this nice description by CERN, slightly different focus but a large overlap in the topics. With more pictures!
 
Great overview. As a quantum gravity guy trying to learn more about phenomenology, your articles are just perfect!
 
What kind of data files and analytics software are you guys using to dig through 2.5+ quadrillion collision events?
 
stoomart said:
What kind of data files and analytics software are you guys using to dig through 2.5+ quadrillion collision events?
That's why triggers are used: to reduce the rate of recorded events to a manageable size [not on a local computer, of course]. https://inspirehep.net/record/1196429/files/soft-2004-007.pdf
For local computers, the sizes you're dealing with depend on the amount of recorded data and the analysis you are doing.
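A toy illustration of what a trigger decision does, with a made-up threshold and made-up event energies; real triggers use far more sophisticated criteria than a single energy cut:

```python
# Toy sketch of a trigger: record only "interesting" (high-energy) events.
# The 100 GeV threshold and the event energies are made up for illustration.

THRESHOLD_GEV = 100.0

# Energies (GeV) of a handful of simulated collision events.
event_energies = [12.5, 340.0, 7.1, 98.9, 101.2, 55.0, 500.0, 3.3]

# The trigger decision: keep the event only if it passes the threshold.
recorded = [e for e in event_energies if e > THRESHOLD_GEV]

print(f"kept {len(recorded)} of {len(event_energies)} events")  # kept 3 of 8
```

The same keep/discard logic, applied at a 40 MHz input rate in custom hardware and software, is what brings the data volume down to something that can actually be written to disk.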
 
The experiments start with a 40 MHz bunch crossing rate. At ~2 MB/event (ATLAS/CMS, lower for LHCb) that is 80 TB/s. You cannot even read out such a data rate. The experiments read out a small part and look for the most interesting collisions there (mainly looking for high-energy processes). That reduces the event rate to ~100 kHz (ATLAS/CMS) or 1 MHz (LHCb). 200 GB/s are then fed into computer farms and analyzed in more detail. Again the data is reduced to the most interesting events, ~1 kHz for ATLAS/CMS and ~10 kHz for LHCb. Those are stored permanently. The information about which possible physics process happened there (e.g. "the reconstruction found two high-energy electrons") is also stored.
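The quoted rates follow from simple arithmetic (decimal units assumed throughout):

```python
# Back-of-the-envelope check of the data rates quoted above.
BUNCH_CROSSING_HZ = 40e6   # 40 MHz bunch crossing rate
EVENT_SIZE_BYTES = 2e6     # ~2 MB/event (ATLAS/CMS)

# Raw rate before any trigger: far too much to read out.
raw_rate = BUNCH_CROSSING_HZ * EVENT_SIZE_BYTES   # bytes/s
print(raw_rate / 1e12, "TB/s")                    # 80.0 TB/s

# After the first trigger stage (~100 kHz for ATLAS/CMS):
farm_rate = 100e3 * EVENT_SIZE_BYTES
print(farm_rate / 1e9, "GB/s")                    # 200.0 GB/s
```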

Individual analyses can then access those datasets. As an example, an analysis could look for events with two high-energy electrons: those might have a rate of 3 Hz during data-taking, which means you have something like 12 million events (~20 TB for ATLAS/CMS). That number varies a lot between analyses; some have just a few thousand events, some have 100 million. Those events are then processed by the computing grid, typically producing a smaller dataset (gigabytes) with just the information you care about. The GB-sized files are typically .root files and studied with C++ or Python on single computers or a few computers at a time. Everything before that is code and data formats developed for the individual experiments.
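A minimal, stdlib-only sketch of the kind of selection such an analysis might run on ntuple-like data. A real analysis would read .root files (e.g. with ROOT in C++ or uproot in Python); the branch names and cuts here are hypothetical:

```python
# Hedged sketch of an ntuple-style event selection.
# In practice the events would come from a .root file; here they are
# hard-coded dicts with hypothetical branch names, purely for illustration.

events = [
    {"n_electrons": 2, "electron_pt": [45.0, 38.0]},  # pT in GeV
    {"n_electrons": 1, "electron_pt": [120.0]},
    {"n_electrons": 2, "electron_pt": [15.0, 12.0]},
]

# Select events with exactly two electrons, both above a 25 GeV cut.
selected = [
    ev for ev in events
    if ev["n_electrons"] == 2 and all(pt > 25.0 for pt in ev["electron_pt"])
]

print(f"{len(selected)} event(s) pass the selection")  # 1 event(s)
```

Chaining cuts like this over millions of events, then histogramming the surviving ones, is the bread and butter of the final analysis step described above.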

ALICE has much lower event rates, so the earlier steps are easier there, the later steps look very similar.
 
Sounds like a similar data pipeline to what I use for mining network and host events for interesting security events, though on a much larger scale (I'm currently condensing 20-30 million filtered and stored events per day into ~30 "interesting" events). Thanks for the link @ChrisVer, I'm definitely going to read through it and maybe even get me some LHC data to play around with. Thanks guys!
 
  • #10
I think some datasets became available to public last year? I think if you search for it, you may find a way to access them without having to be a member of the collaboration, and they should be easy to deal with on a local machine.
 
  • #11
Both ATLAS and CMS released older datasets, LHCb has some tiny example datasets but will publish more soon, ALICE will follow as well.
ATLAS
CMS

The full released CMS dataset has 300 TB. Dealing with that on a local machine can be "tricky," but they also have smaller subsamples. I don't know the size of the ATLAS data, but I would expect it to be similar.
Both datasets come with additional simulation samples necessary to understand the detector better.
 
  • #12
Unfortunately it looks like the TAG data (event metadata) that I'm interested in analyzing is only stored in a relational database, which is not available for download.
 
  • #13
Very nice and understandable article, thanks @mfb! I did not read it until today, but better late than never.
 
  • #14
It's good they released ATLAS datasets, but that is only 8 TeV. We all know the fun stuff happens past 10 TeV. Meanwhile we anxiously await ALICE. An event happens they just don't have capabilities to interpret dataset.
 
  • #15
Tommy 101 said:
but that is only 8 TeV.
I am not sure, but I guess that's the point. Data that have been thoroughly studied should be accessible to groups or people outside the collaboration. Data is always useful [educationally, or even for researchers and some theorists].
Also, don't underestimate the 8 TeV data; studies are still being done on those samples... after all, ATLAS is not just a machine dedicated to searching for new physics, it studies the Standard Model too (cross sections, polarizations, etc.).

Tommy 101 said:
We all know the fun stuff happens past 10 TeV
Do we? It may even start past 20 TeV.

Tommy 101 said:
Meanwhile we anxiously await for ALICE. An event happens they just don't have capabilities to interpret dataset.
I don't understand your last sentence here; could you make it clearer?
 
  • #16
The recent W mass measurement was done with 7 TeV data, and there are various studies with 7 and 8 TeV ongoing. Precision measurements take time.

The most recent datasets are not made public yet because the collaborations that built the detectors and did all the work to gather the data also want to analyze it first. This is not particle-physics specific. A group doing some laser physics or whatever also doesn't release raw data before writing a publication about the results. Chances are good you'll never see the raw data for most experiments. In particle physics, you do.
 
  • #17
Someone made a video about the same topic, and with nice animations.
 

Similar threads

  • · Replies 23 ·
Replies
23
Views
3K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 33 ·
2
Replies
33
Views
7K
  • · Replies 11 ·
Replies
11
Views
3K
  • Sticky
  • · Replies 2 ·
Replies
2
Views
7K
  • · Replies 49 ·
2
Replies
49
Views
12K
  • · Replies 48 ·
2
Replies
48
Views
8K
  • · Replies 69 ·
3
Replies
69
Views
14K
  • Sticky
  • · Replies 7 ·
Replies
7
Views
8K