
Large Hadron Collider (LHC) Data Grid:

Background:
The Large Hadron Collider (LHC) is the largest particle accelerator in the world, built by the European Organization for Nuclear Research (CERN) over a span of more than ten years beginning in 1998. The facility conducts fundamental experiments in high energy particle physics, aiming to answer some of the most basic questions about the origin and nature of the universe. The volume of data produced is unprecedented: roughly 140 million sensors deliver readings about 40 million times per second, and the sheer number of collisions easily produces around 15 petabytes of data annually. Because the computing resources needed to process all of this data are unreasonably high for any single site, a globally distributed computing infrastructure, the Worldwide LHC Computing Grid (WLCG), was created to share the burden of computing.
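To see why distribution is unavoidable, consider a rough back-of-envelope estimate based on the figures above. The short Python sketch below assumes, purely for illustration, one byte per sensor per collision; the real readout and trigger chain is far more elaborate, but even this crude model shows that the raw detector output exceeds the roughly 15 petabytes archived per year by many orders of magnitude.

```python
# Rough estimate of LHC raw data rates, using the figures quoted above.
# The one-byte-per-sensor-per-collision assumption is illustrative only.

SENSORS = 140e6            # ~140 million detector sensors
COLLISIONS_PER_SEC = 40e6  # ~40 million collisions per second
BYTES_PER_SENSOR = 1       # illustrative assumption, not the real format
SECONDS_PER_YEAR = 3.15e7
ARCHIVED_PER_YEAR = 15e15  # ~15 petabytes actually stored annually

raw_rate = SENSORS * COLLISIONS_PER_SEC * BYTES_PER_SENSOR  # bytes/second
raw_per_year = raw_rate * SECONDS_PER_YEAR                  # bytes/year

print(f"raw rate:          {raw_rate / 1e15:.1f} PB/s")
print(f"raw yearly volume: {raw_per_year / 1e21:.0f} ZB")
print(f"reduction needed:  {raw_per_year / ARCHIVED_PER_YEAR:.1e}x")
```

Even under this deliberately naive assumption, the required reduction factor comes out near 10^7, which is the basic argument both for aggressive filtering at the detectors and for sharing the remaining load across a worldwide grid.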
This layered computing grid currently consists of four levels, or tiers, numbered 0 through 3. Tier-0 is the CERN data center; it records, reconstructs, and provides long-term archival of all LHC data, yet accounts for less than 20% of the grid's total capacity. Tier-1 reprocesses the data, archives it, and distributes it further downstream. Its 11 data centers, located in more than 10 countries, provide round-the-clock support for the grid and also store a share of the simulated data that the tier-2 sites produce; a dedicated high-speed optical network connects these tier-1 centers. Tier-2 consists of universities and other research institutions that provide computing power for storage and expert analysis of data, spread across 140 sites in more than 35 countries. Finally, individual scientists access the grid through local resources, which collectively form tier-3.
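The tier layout just described can be summarized as a tiny data model. The sketch below is a minimal illustration in Python using only the roles and site counts quoted in this section; the Tier class and its fields are invented for this example and do not correspond to any actual WLCG software.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tier:
    """One level of the WLCG hierarchy, as described above."""
    level: int
    sites: str               # number/kind of sites, as quoted in the text
    roles: tuple[str, ...]   # responsibilities of this tier

WLCG_TIERS = (
    Tier(0, "1 (the CERN data center)",
         ("record", "reconstruct", "long-term archive")),
    Tier(1, "11 data centers in 10+ countries",
         ("reprocess", "archive", "distribute downstream")),
    Tier(2, "140 sites in 35+ countries",
         ("store", "analyze", "produce simulated data")),
    Tier(3, "local resources of individual scientists",
         ("end-user access to the grid",)),
)

for tier in WLCG_TIERS:
    print(f"Tier-{tier.level}: {tier.sites} -> {', '.join(tier.roles)}")
```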
Motivation:
The WLCG architecture distributes the computing load among many data centers across the globe. This model of distributed computing also spreads the financial burden among many stakeholders and efficiently draws on the expertise and knowledge of a globally diverse pool of researchers and scientists. The elimination of any single point of failure, together with built-in data redundancy and the availability of expert support, makes this distributed model a natural choice for a data-intensive project such as this. Above all, it accelerates the process of discovery in high energy physics by giving a worldwide community of researchers and physicists access to the grid's resources.
Challenges:
The WLCG is the world's largest computing grid. The initial focus of the model was to provide stable service to the community rather than a long-term sustainable computing model, so the WLCG evolved through years of service and data challenges specific to the scientific community involved. One shortcoming of this infrastructure has been the high cost of manpower; the challenge now facing the WLCG community is to operate the infrastructure more efficiently, with less effort, as its scope and size continue to grow. Other challenges include a lack of effective communication between tier-2 sites, the absence of a central WLCG operations team, multiple (and fragile) experiment software installation and runtime configuration systems, poor documentation, and unsupervised, inconsistent middleware validation.
Accomplishments:
1. The discovery of the so-called "God particle," which ended almost 50 years of searching for the Higgs boson, an integral part of the Standard Model and other theories of particle physics, can be directly attributed to the work and power of the LHC and its auxiliary facilities. This discovery was certainly an impetus for the 2013 Nobel Prize in Physics awarded to Peter Higgs and Francois Englert (the original predictors of the particle).
2. The seamless access the grid provides to data and results for thousands of researchers and scientists across the globe would not otherwise have been possible.
3. New discoveries and insights in particle physics have seen a huge boost thanks to the enormous processing capability and sheer scope of collaboration the grid provides to the community.
Future Plans:
The tiered architecture is evolving towards more of a mesh model in terms of the hosting of central services and data custody. To provide a coherent and sustainable infrastructure, the WLCG has adopted the following tasks:
Establish a core team to coordinate WLCG operations
Simplify the middleware stack and improve documentation and procedures
Improve the middleware distribution and configuration mechanisms
Strengthen the participating sites
