EB21102033 HashirSaudKhan SuperComp

Class: BcSc (UOK)
Section: B (Evening)
Name: Hashir Saud Khan
Seat No : EB21102033
Super Computers
A supercomputer is a very powerful computer. The performance of a supercomputer is
commonly measured in floating-point operations per second (FLOPS) instead of million
instructions per second (MIPS). Since 2017, there have been supercomputers that can perform
over 10^17 FLOPS (a hundred quadrillion FLOPS, 100 petaFLOPS or 100 PFLOPS).
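The unit prefixes used throughout this report can be checked with a small conversion sketch. The helper name format_flops and the prefix table are illustrative choices, not part of any standard tool.

```python
# SI prefixes commonly used for supercomputer performance figures.
PREFIXES = [("E", 10**18), ("P", 10**15), ("T", 10**12), ("G", 10**9), ("M", 10**6)]


def format_flops(flops: float) -> str:
    """Render a raw FLOPS value using the largest prefix that keeps it >= 1."""
    for symbol, factor in PREFIXES:
        if flops >= factor:
            return f"{flops / factor:g} {symbol}FLOPS"
    return f"{flops:g} FLOPS"


# 10^17 FLOPS is the "hundred quadrillion" figure quoted above.
print(format_flops(10**17))
```

With this table, 10^17 FLOPS is rendered as 100 PFLOPS, matching the figure in the text.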
Top 10 supercomputers :
1. Dell Frontera
2. HPC5, Italy
3. Frontera, U.S.
4. JUWELS Booster Module, Germany
5. Tianhe-2A, China
6. Lenovo SuperMUC-NG
7. Cray/HPE Trinity
8. Sunway TaihuLight
9. Summit, U.S.
10. Sierra, U.S.
Dell Frontera:
Introduction:
Frontera is one of the most powerful supercomputers in the world, and the
fastest supercomputer on a university campus. Hundreds of U.S. scientists and
engineers use Frontera each year to power discoveries across a wide range of
scientific domains, from the quantum to the cosmic scale. Frontera is funded by
the National Science Foundation (NSF) and serves as the leadership-class system in
NSF’s cyberinfrastructure ecosystem. The system is currently the ninth most
powerful supercomputer in the world and the fastest non-accelerated (primarily
CPU-based) system in the world. Researchers are awarded time on Frontera based on
their need for very large-scale computing, and the ability to efficiently use a
supercomputer on the scale of Frontera. Frontera opens up new possibilities in
science and engineering by providing computational capability that makes it
possible for investigators to tackle much larger and more complex research
challenges across a wide spectrum of domains. Frontera also has multiple
storage systems, as well as interfaces to cloud and archive systems, and a set of
application nodes for hosting virtual servers.
Block Diagram:
Software:
It runs CentOS Linux 7 as its operating system and uses the Intel compiler. Its
math library is Intel MKL 18.0.5 and its MPI implementation is Intel MPI 18.0.5.
Functional Unit:
Frontera has two computing subsystems: a primary computing system
focused on double-precision performance, and a second subsystem focused
on single-precision streaming-memory computing. Frontera is a Dell C6420
system equipped with Intel processors. Using 448,448 of its Intel Xeon Platinum cores,
the system achieves 23.5 petaflops.
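The two figures above imply an average throughput per core. The sketch below works that out; the function name per_core_gflops is hypothetical, and the result is only an average over the quoted core count, not a measured per-core value.

```python
def per_core_gflops(total_petaflops: float, cores: int) -> float:
    """Average throughput per core in GFLOPS, given a system total in petaflops."""
    return total_petaflops * 1e15 / cores / 1e9


# Figures quoted above: 23.5 petaflops on 448,448 cores.
frontera = per_core_gflops(23.5, 448_448)
print(f"{frontera:.1f} GFLOPS per core")
```

This comes to a little over 52 GFLOPS per core.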
HPC5, Italy :
HPC5 is a supercomputer built by Dell and installed by Eni. It is capable of 51.721
petaflops and is ranked 9th in the Top500 as of November 2021. In June 2020,
HPC5 ranked 6th in the Green500. HPC5 is an upgrade to the HPC4 system, which
was built by Hewlett Packard Enterprise and used by Eni. It is also called HPC4+.
Design :
HPC5 spans 1,820 Dell EMC PowerEdge C4140 servers, each with
two Intel Xeon Gold 6252 24-core processors and four NVIDIA V100 GPU accelerators. In
total, the system comprises 7,280 NVIDIA V100 GPUs.
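The totals follow directly from the per-server configuration. A minimal sketch of that arithmetic is given here; the NodeSpec class and its field names are illustrative, and the CPU core total is derived from the quoted figures rather than stated in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-server hardware counts (illustrative structure)."""
    cpus: int
    cores_per_cpu: int
    gpus: int


# Per-server figures quoted above for HPC5.
hpc5_node = NodeSpec(cpus=2, cores_per_cpu=24, gpus=4)
HPC5_SERVERS = 1_820

total_gpus = HPC5_SERVERS * hpc5_node.gpus
total_cpu_cores = HPC5_SERVERS * hpc5_node.cpus * hpc5_node.cores_per_cpu

print(f"GPUs: {total_gpus:,}")
print(f"CPU cores: {total_cpu_cores:,}")
```

The GPU total reproduces the 7,280 figure in the text; the same configuration gives 87,360 CPU cores.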
Block diagram:
Software :
CentOS Linux 7
Functional units :
Processor :
The Intel Xeon Gold 6252 is a 24-core processor running at 2.1 GHz. Its
virtualization technology makes it easier to migrate environments between hosts, and
its Enhanced SpeedStep technology allows tradeoffs to be made between performance
and power consumption.
Frontera, US:
Frontera is the successor to Stampede2. It was planned as an academic supercomputer
with a peak performance of 35-40 petaflops, and was expected to deliver a
3x speedup in real-application performance over Blue Waters at about one-third
the cost.
Block diagram:
Software :
CentOS Linux 7
Functional units :
Processor :
Intel Xeon Platinum 8280. Intel Xeon Platinum processors are aimed at mission-critical
and hybrid cloud workloads, real-time analytics, machine learning, and artificial
intelligence. Process: 14 nm. Clock speed: up to 4 GHz.
JUWELS Booster Module:

INTRODUCTION:
A newcomer to the list, the Atos-built BullSequana machine was recently installed
at the Forschungszentrum Jülich (FZJ) in Germany. It is part of a modular system
architecture whose modules are integrated using the ParTec Modulo Cluster
Software Suite. The Booster Module uses AMD EPYC processors with NVIDIA A100
GPUs for acceleration, similar to the fifth-ranked Selene system. Running by itself,
the JUWELS Booster Module achieved 44.1 petaflops on HPL, which makes
it the most powerful system in Europe.
BLOCK DIAGRAM:
DETAILED DESCRIPTION OF FUNCTIONAL UNITS:
The JUWELS Booster Module uses AMD EPYC processors with NVIDIA A100 GPUs for
acceleration. The University of Edinburgh signed an agreement to use JUWELS to
pursue research in the fields of particle physics, astronomy, cosmology and
nuclear physics.
In 2021, JUWELS Booster, among eight other supercomputing systems, participated
in the MLPerf HPC training benchmark. This benchmark was developed by a
consortium of artificial intelligence developers from academia, research labs, and
industry, with the aim of evaluating without bias the training and inference performance of
hardware, software, and services used for AI. JUWELS also ranked among the top
15 on the worldwide Green500 list of energy-efficient supercomputers.
The Simulation and Data Laboratory (SimLab) for Climate Science
at Forschungszentrum Jülich uses JUWELS to detect gravity waves in
the atmosphere. Its programs continuously download operational radiance
measurements from NASA's data servers and run computations on them.
Tianhe-2A:
INTRODUCTION:
Also known as the Milky Way-2A, this system was developed by China’s National
University of Defense Technology (NUDT) and is deployed at the National
Supercomputer Center in Guangzhou. It is powered by Intel Xeon CPUs and NUDT’s
Matrix-2000 DSP accelerators and achieves 61.4 petaflops on HPL.
BLOCK DIAGRAM:
DETAILED DESCRIPTION OF FUNCTIONAL UNITS:
It was the world's fastest supercomputer according to the TOP500 lists for June
2013, November 2013, June 2014, November 2014, June 2015, and November
2015. The record was surpassed in June 2016 by the Sunway TaihuLight. In 2015,
plans by Sun Yat-sen University, in collaboration with the Guangzhou district and
city administration, to double its computing capacity were stopped when the U.S.
government rejected Intel's application for an export license for the CPUs
and coprocessor boards.
In response to the U.S. sanction, China introduced the Sunway
TaihuLight supercomputer in 2016. It substantially outperforms the Tianhe-2
and now ranks fourth in the TOP500 list while using completely domestic technology,
including the Sunway manycore microprocessor. The sanction also affected the
upgrade of Tianhe-2 to Tianhe-2A, which replaced U.S. technology.
According to NUDT, Tianhe-2 will be used for simulation, analysis, and
government security applications.
LIST OF SOFTWARE USED:

Tianhe-2 ran on Kylin Linux, a version of the operating system developed by
NUDT. The original system had 16,000 nodes, each with two Intel Ivy Bridge Xeon
processors and three Xeon Phi coprocessors, for a total of 3,120,000 cores.
Lenovo SuperMUC-NG:
Introduction:
SuperMUC was a supercomputer of the Leibniz Supercomputing Centre (LRZ) of
the Bavarian Academy of Sciences. It was housed in the LRZ's data centre in
Garching near Munich. It was decommissioned in January 2020, having been
superseded by the more powerful SuperMUC-NG. SuperMUC (the suffix 'MUC'
alludes to the IATA code of Munich's airport) is operated by the Leibniz
Supercomputing Centre, a European centre for supercomputing. In order to
house its hardware, the infrastructure space of the Leibniz Supercomputing Centre
was more than doubled in 2012. SuperMUC was the fastest European supercomputer
when it entered
operation in the summer of 2012 and is currently ranked #20 in the Top500 list of
the world's fastest supercomputers. SuperMUC serves European researchers in
many fields, including medicine, astrophysics, quantum chromodynamics,
computational fluid dynamics, computational chemistry, life sciences, genome
analysis and earthquake simulations.
Block Diagram:
Functional Units:
SuperMUC is an IBM iDataPlex system containing 19,252 Intel Xeon Sandy Bridge-EP
and Westmere-EX multi-core processors (155,656 cores), for a peak
performance of about 3 PFLOPS (3 × 10^15 FLOPS). It has 340 TB of main memory
and 15 PB of hard disk space. It uses a new form of cooling that IBM developed, called
Aquasar, that uses hot water to cool the processors. IBM claims that this design
saves 40 percent of the energy normally needed to cool a comparable system.
SuperMUC is connected to powerful visualization systems, which consist of a large
4K stereoscopic powerwall as well as a five-sided CAVE artificial virtual reality
environment.
Cray/HPE Trinity:
Introduction:
Trinity (or ATS-1) is a United States supercomputer built by the National
Nuclear Security Administration (NNSA) for the Advanced Simulation and
Computing Program (ASC). The aim of the ASC program is to simulate, test,
and maintain the United States nuclear stockpile.
Block Diagram:
Functional Units:
Its architecture is Cray XC40 with a memory capacity of 2.07 PiB. Its peak
performance is 41.5 PF/s. It has 19,420 compute nodes. Parallel file system
capacity is 78 PB (69 PiB) and Burst Buffer capacity is 3.7 PB. Its footprint is
4,606 sq ft, with a power requirement of 8.6 MW.
Trinity was built in 2 stages. The first stage incorporated the Intel Xeon Haswell
processor, while the second stage added a significant performance increase using the
Intel Xeon Phi Knights Landing processor. There are 301,952 Haswell and 678,912
Knights Landing processor cores in the combined system, yielding a total peak
performance of over 40 PF/s (petaflops).
There are 5 primary storage tiers: Memory, Burst Buffer, Parallel File System,
Campaign Storage, and Archive.
Memory:
2 PiB of DDR4 DRAM provide physical memory for the machine. Each processor also
has DRAM built on to the tile, providing additional memory capacity. The data in
this tier is highly transient and is typically in residence for only a few seconds,
being overwritten continuously.
Burst Buffer:
Cray supplies the 300 XC40 DataWarp blades, each of which contains 2 Burst Buffer
nodes and 4 SSD drives. There is a total of 3.78 PB of storage in this tier, capable
of moving data at a rate of up to 2 TB/s. In this tier, data is typically resident
for a few hours, with data being overwritten in approximately that same time frame.
Parallel File System:
Trinity uses a Sonexion-based Lustre file system with a total capacity of 78 PB.
Throughput on this tier is about 1.8 TB/s (1.6 TiB/s). It is used to stage data in
preparation for HPC operations. Data residence in this tier is typically several weeks.
Campaign Storage:
The MarFS file system fits into the Campaign Storage tier and combines properties of
POSIX and object storage models. The capacity of this tier is growing at a rate of
about 30 PB/year, with a current capacity of over 100 PB. In testing, LANL scientists
were able to create 968 billion files in a single directory at a rate of 835 million
file creations per second. This storage is designed to be more robust than typical
object storage, while sacrificing some of the end-user functionality that a POSIX
system would provide. Performance of this tier is between 100 and 300 GB/s of
throughput. Data residence in this tier is longer term, typically lasting several months.
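To get a feel for how capacity and throughput relate in the faster tiers, the sketch below estimates how long it would take to write a tier's full capacity at its quoted peak rate. The function name seconds_to_fill is hypothetical, and the calculation assumes decimal units (1 PB = 1000 TB) and sustained peak throughput, which real workloads are unlikely to reach.

```python
# Figures quoted above: (capacity in PB, peak throughput in TB/s).
TIERS = {
    "Burst Buffer": (3.78, 2.0),
    "Parallel File System": (78.0, 1.8),
}


def seconds_to_fill(capacity_pb: float, throughput_tbs: float) -> float:
    """Time to write a tier's full capacity at peak throughput (1 PB = 1000 TB)."""
    return capacity_pb * 1000 / throughput_tbs


for name, (capacity, throughput) in TIERS.items():
    hours = seconds_to_fill(capacity, throughput) / 3600
    print(f"{name}: {hours:.1f} hours to fill at peak rate")
```

Under these assumptions the Burst Buffer could be filled in about half an hour and the parallel file system in about twelve hours.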
Sunway TaihuLight:
Introduction:
The Sunway TaihuLight is a Chinese supercomputer which, as of November 2021,
is ranked fourth in the TOP500 list, with a LINPACK benchmark rating of 93
petaflops. This is nearly three times as fast as the previous Tianhe-2, which ran at
34 petaflops. The name translates as "divine power, the light of Taihu Lake". As of
June 2017, it is ranked as the 16th most energy-efficient supercomputer in the
Green500, with an efficiency of 6.051 GFLOPS/watt. It was designed by the
National Research Center of Parallel Computer Engineering & Technology
(NRCPC) and is located at the National Supercomputing Center in Wuxi,
Jiangsu province, China.
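The performance and efficiency figures together imply a power draw. The sketch below derives it, assuming both figures describe the same benchmark run (the text quotes them for different dates, so treat the result as an estimate). The function name power_megawatts is hypothetical.

```python
def power_megawatts(petaflops: float, gflops_per_watt: float) -> float:
    """Power draw implied by a performance figure and an efficiency figure."""
    watts = petaflops * 1e15 / (gflops_per_watt * 1e9)
    return watts / 1e6


# Figures quoted above: 93 petaflops at 6.051 GFLOPS per watt.
taihulight_mw = power_megawatts(93, 6.051)
print(f"{taihulight_mw:.2f} MW")
```

The estimate works out to a little over 15 MW.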
Block Diagram:
Software:
The system runs on its own operating system, Sunway RaiseOS 2.0.5,
which is based on Linux. The system has its own customized
implementation of OpenACC 2.0 to aid the parallelization of code.
Functional Units:
The Sunway TaihuLight uses a total of 40,960 Chinese-designed
SW26010 manycore 64-bit RISC processors based on the Sunway
architecture. Each processor chip contains 256 processing cores, and
an additional four auxiliary cores for system management (also RISC
cores, just more fully featured), for a total of 10,649,600 CPU cores
across the entire system.
The processing cores feature 64 KB of scratchpad memory for data
(and 16 KB for instructions) and communicate via a network on a chip,
instead of having a traditional cache hierarchy.
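The core count quoted for the whole system can be reproduced from the per-chip figures. The sketch below does so and also sums the data scratchpad across the processing cores; the scratchpad total is a derived figure, assuming "64 KB" means 64 KiB and that only the 256 processing cores per chip carry it.

```python
PROCESSORS = 40_960
PROCESSING_CORES_PER_CHIP = 256
AUXILIARY_CORES_PER_CHIP = 4
SCRATCHPAD_KIB_PER_CORE = 64  # assumed to be KiB, data scratchpad only

# Every chip contributes its processing cores plus its auxiliary cores.
total_cores = PROCESSORS * (PROCESSING_CORES_PER_CHIP + AUXILIARY_CORES_PER_CHIP)

# Data scratchpad summed over processing cores only (assumption stated above).
scratchpad_gib = (
    PROCESSORS * PROCESSING_CORES_PER_CHIP * SCRATCHPAD_KIB_PER_CORE / 1024 / 1024
)

print(f"Total cores: {total_cores:,}")
print(f"Aggregate data scratchpad: {scratchpad_gib:g} GiB")
```

The core total matches the 10,649,600 figure in the text.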
Summit, US:
Summit is a supercomputer developed by IBM for Oak Ridge National
Laboratory. It is capable of 200 petaFLOPS, making it the second fastest
supercomputer. It was the fastest supercomputer until 2020. It was the fifth
most energy-efficient supercomputer as of 2019. Summit was also the first
supercomputer to reach exaFLOP speed, a figure reported for a mixed-precision
calculation rather than for its 200 petaFLOPS rating. The Summit supercomputer provides
scientists and researchers the opportunity to solve complex tasks in the
fields of energy, artificial intelligence, human health and other research
areas. It has been used in earthquake simulation, extreme weather
simulation using AI, materials science, genomics and in predicting the
lifetime of neutrinos in physics.
Hardware :
Each one of its 4,608 nodes has over 600 GB of coherent memory (96
GB HBM2 plus 512 GB DDR4 SDRAM) which is addressable by all CPUs
and GPUs plus 800 GB of non-volatile RAM that can be used as a burst
buffer or as extended memory.[19] The POWER9 CPUs and NVIDIA
Volta GPUs are connected using NVIDIA's high-speed NVLink. This allows
for a heterogeneous computing model. To provide a high rate of data
throughput, the nodes are connected in a non-blocking fat-tree
topology using a dual-rail Mellanox EDR InfiniBand interconnect for both
storage and inter-process communications traffic, which delivers
200 Gbit/s of bandwidth between nodes.
Block diagram :
Functional units :
Weighing over 340 tons, Summit takes up 5,600 sq. ft. of floor space
at Oak Ridge National Laboratory. Summit consists of 256 compute
racks, 40 storage racks, 18 switching director racks, and 4 infrastructure
racks. Servers are linked via Mellanox IB EDR interconnect in a three-
level non-blocking fat-tree topology.
Compute Rack:
Each of Summit's 256 Compute Racks consists of 18 Compute Nodes
along with a Mellanox IB EDR for a non-blocking fat-tree interconnect
topology (actually appears to be pruned 3-level fat-trees). With 18
nodes, each rack has 9 TiB of DDR4 memory and another 1.7 TiB of
HBM2 memory for a total of 10.7 TiB of memory. A rack has a 59 kW
max power and a total of 864 TF/s of peak compute power (ORNL
reports 775 TF/s).
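The rack-level figures can be cross-checked against the per-node memory sizes given in the Hardware section. This is a sketch of that arithmetic; it assumes the per-node "GB" figures are binary gigabytes (GiB), which is what makes them line up with the TiB totals quoted for a rack.

```python
NODES_PER_RACK = 18
COMPUTE_RACKS = 256
DDR4_GIB_PER_NODE = 512  # text says "GB"; treated as GiB here (assumption)
HBM2_GIB_PER_NODE = 96   # text says "GB"; treated as GiB here (assumption)
RACK_PEAK_TFLOPS = 864

total_nodes = NODES_PER_RACK * COMPUTE_RACKS
ddr4_tib = NODES_PER_RACK * DDR4_GIB_PER_NODE / 1024
hbm2_tib = NODES_PER_RACK * HBM2_GIB_PER_NODE / 1024
peak_tflops_per_node = RACK_PEAK_TFLOPS / NODES_PER_RACK

print(f"Nodes in system: {total_nodes:,}")
print(f"DDR4 per rack: {ddr4_tib:g} TiB, HBM2 per rack: {hbm2_tib:.1f} TiB")
print(f"Total per rack: {ddr4_tib + hbm2_tib:.1f} TiB")
print(f"Peak per node: {peak_tflops_per_node:g} TF/s")
```

The node total agrees with the 4,608 nodes mentioned earlier, and the memory totals agree with the rounded 9 TiB, 1.7 TiB and 10.7 TiB figures.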
Compute Node:
The basic compute node is the Power Systems AC922 (Accelerated
Computing), formerly codenamed Witherspoon. The AC922 comes in a
19-inch 2U rack-mount case.
Each node has two 2200 W power supplies, 4 PCIe Gen 4 slots, and a
BMC card. There are two 22-core POWER9 processors per node, each
with 8 DIMMs.
Socket:
Since IBM POWER9 processors have native on-die NVLink connectivity,
the GPUs are connected directly to the CPUs. The POWER9 processor has
six NVLink 2.0 Bricks which are divided into three groups of two Bricks.
Since NVLink 2.0 has bumped the signaling rate to 25 GT/s, two Bricks
allow for 100 GB/s of bandwidth between the CPU and GPU. In addition,
there are 48 PCIe Gen 4 lanes for I/O.
The Volta GPUs have 6 NVLink 2.0 Bricks which are divided into three
groups. One group is used for the CPU while the other two groups
interconnect every GPU to every other GPU. As with the GPU-CPU link,
the aggregated bandwidth between two GPUs is also 100 GB/s.
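The 100 GB/s figure can be derived from the signaling rate. The sketch below assumes each Brick has 8 lanes per direction and that the quoted bandwidth counts both directions; neither detail is stated in the text, so both are labeled as assumptions. The function name link_bandwidth_gbytes is hypothetical.

```python
SIGNALING_GBIT_PER_LANE = 25       # 25 GT/s, one bit per transfer (from the text)
LANES_PER_BRICK_PER_DIRECTION = 8  # assumption, not stated in the text
DIRECTIONS = 2                     # assumption: quoted figure is bidirectional


def link_bandwidth_gbytes(bricks: int) -> float:
    """Aggregate bandwidth in GB/s for a group of NVLink 2.0 Bricks."""
    # Gbit/s per Brick in one direction, converted to GB/s (8 bits per byte).
    per_direction = SIGNALING_GBIT_PER_LANE * LANES_PER_BRICK_PER_DIRECTION / 8
    return bricks * per_direction * DIRECTIONS


print(f"Two Bricks: {link_bandwidth_gbytes(2):g} GB/s")
print(f"Six Bricks: {link_bandwidth_gbytes(6):g} GB/s")
```

Under these assumptions a group of two Bricks gives 100 GB/s, and all six Bricks together give 300 GB/s.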
System software:
Summit runs Red Hat Enterprise Linux.
Super-computing often pairs standard hardware at scale with additional,
highly-specialized components, which is why Summit is using Linux -
specifically, Red Hat Enterprise Linux. Red Hat Enterprise Linux forms a
common bridge at the operating system to effectively link all of
Summit’s resources together, making it easier for individual application
stacks to take advantage of the specific resources that they need.
Sierra, US:
Sierra, or ATS-2, is a supercomputer. It was manufactured for
the Lawrence Livermore National Laboratory for use by the National
Nuclear Security Administration as the second Advanced Technology
System in 2018. It is primarily used for predictive applications
in stockpile stewardship, helping to assure the safety, reliability and
effectiveness of the United States' nuclear weapons.
DESIGN :
The architectural design of Sierra is very similar to that of Summit. The
Sierra system uses IBM POWER9 CPUs in conjunction
with NVIDIA Tesla V100 GPUs. The nodes in Sierra are Witherspoon IBM
S922LC OpenPOWER servers with two GPUs per CPU and four GPUs per
node. These nodes are connected with EDR InfiniBand. In 2019 Sierra
was upgraded with IBM Power System AC922 nodes.
BLOCK DIAGRAM:
OPERATING SYSTEM:
Like Summit, Sierra also uses a Linux operating system.
Functional units:
Power9:
POWER9 is a family of superscalar, multithreading, multi-core
microprocessors produced by IBM, based on the Power ISA. It was
announced in August 2016.[2] The POWER9-based processors are being
manufactured using a 14 nm FinFET process,[3] in 12- and 24-core
versions, for scale out and scale up applications,[3] and possibly other
variations, since the POWER9 architecture is open for licensing and
modification by the OpenPOWER Foundation members.
GPU:
With 640 Tensor Cores, V100 is the world’s first GPU to break the 100
teraFLOPS (TFLOPS) barrier of deep learning performance. The next
generation of NVIDIA NVLink™ connects multiple V100 GPUs at up to
300 GB/s to create the world’s most powerful computing servers. AI
models that would consume weeks of computing resources on previous
systems can now be trained in a few days. With this dramatic reduction
in training time, a whole new world of problems will now be solvable
with AI.
