A massive 300 terabytes of data from CERN, one of humanity's most ambitious scientific undertakings, which is being used to investigate the origins of the universe among other things, has been released to the general public.
The data can be studied by students, physicists and everyone else (although not everyone will understand it) to gain interesting, cutting-edge insights into the nature of the universe.
The data from CERN comes in three parts. The first consists of raw data that can be used to derive the final results; there are also “derived” datasets that are easier to work with.
CERN provides some of the most advanced particle accelerators and other infrastructure needed for high-energy physics research, and as a result, many experiments have been built and are being run at CERN.
The collision data in the primary datasets is in the AOD, or Analysis Object Data, format. The simulated data, on the other hand, is in the AODSIM format. Both formats contain the information associated with high-level physics objects (such as muons, electrons, etc.), tracks with associated hits, calorimetric clusters with associated hits, vertices, and information about event selection (triggers).
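To make that list of contents a little more concrete, here is a rough Python sketch of how a single event holding those pieces could be modelled. This is not the actual AOD schema or any CMS software: every class, field name and value below is hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Hit:
    """One detector measurement (hypothetical fields, arbitrary units)."""
    detector_id: int
    energy: float


@dataclass
class Track:
    """A reconstructed trajectory plus the hits associated with it."""
    hits: List[Hit] = field(default_factory=list)


@dataclass
class CaloCluster:
    """A calorimetric cluster plus the hits associated with it."""
    hits: List[Hit] = field(default_factory=list)

    def total_energy(self) -> float:
        # Sum the energy of every hit assigned to this cluster.
        return sum(hit.energy for hit in self.hits)


@dataclass
class PhysicsObject:
    """A high-level object such as a muon or an electron."""
    kind: str
    momentum: float


@dataclass
class Event:
    """Illustrative container mirroring the categories listed in the article."""
    physics_objects: List[PhysicsObject] = field(default_factory=list)
    tracks: List[Track] = field(default_factory=list)
    clusters: List[CaloCluster] = field(default_factory=list)
    vertices: List[Tuple[float, float, float]] = field(default_factory=list)
    fired_triggers: List[str] = field(default_factory=list)

    def count(self, kind: str) -> int:
        # Number of high-level objects of the requested kind in this event.
        return sum(1 for obj in self.physics_objects if obj.kind == kind)


def example_event() -> Event:
    """Build a toy event; all values are made up, not real CMS data."""
    hits = [Hit(detector_id=1, energy=1.5), Hit(detector_id=2, energy=2.0)]
    return Event(
        physics_objects=[
            PhysicsObject("muon", 25.0),
            PhysicsObject("muon", 18.0),
            PhysicsObject("electron", 30.0),
        ],
        tracks=[Track(hits=hits)],
        clusters=[CaloCluster(hits=hits)],
        vertices=[(0.0, 0.0, 0.0)],
        fired_triggers=["hypothetical_dimuon_trigger"],
    )
```

With a structure like this, a simple selection such as "keep events with two muons" boils down to checking `example_event().count("muon")`, which is roughly the kind of question the trigger and object information in the real files is meant to answer.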
Beyond the data itself, CERN scientists are also providing a set of tools to facilitate its exploration.
Speaking on the topic, Kati Lassila-Perini, a physicist who works on the Compact Muon Solenoid, said:
“Once we’ve exhausted our exploration of the data, we see no reason not to make them available publicly. The benefits are numerous, from inspiring high school students to the training of the particle physicists of tomorrow. And personally, as CMS’s data preservation coordinator, this is a crucial part of ensuring the long-term availability of our research data.”
The CMS detector is a five-storey-high digital camera that records hundreds of images per second of the debris from collisions in the Large Hadron Collider. The released data was collected in 2011 and represents about half of the data the CMS detector collected that year.
CERN has also prepared a Linux environment that can be booted in a virtual machine and used to study the data. Those interested can also find a number of scripts and apps in the virtual environment, or on GitHub.
Researchers in charge of the data release have also created special “masterclasses,” along with datasets and tools designed with high school students in mind. That’s one way to get introduced to the higher sciences!
The release reflects growing efforts by scientists and companies to make technology and science more widely available. This is the second such release of data by CMS, which had previously made 27 TB of data public.