EB21102033 HashirSaudKhan SuperComp
Section: B (Evening)
Name: Hashir Saud Khan
Seat No : EB21102033
Supercomputers
A supercomputer is a very powerful computer. The performance of a supercomputer is
commonly measured in floating-point operations per second (FLOPS) instead of million
instructions per second (MIPS). Since 2017, there have been supercomputers that can perform
over 10^17 FLOPS (a hundred quadrillion FLOPS, 100 petaFLOPS or 100 PFLOPS).
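The unit conversion above can be checked with a short Python sketch. The helper name to_pflops is hypothetical and is used here only for illustration.

```python
# Hypothetical helper for illustration: convert raw FLOPS to petaFLOPS.
PETA = 10**15  # 1 PFLOPS = 10^15 floating-point operations per second


def to_pflops(flops):
    """Return the given FLOPS value expressed in PFLOPS."""
    return flops / PETA


# 10^17 FLOPS is the figure quoted in the text above.
print(to_pflops(10**17), "PFLOPS")
```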
Top 10 supercomputers :
1. Dell Frontera
2. HPC5, Italy
3. Frontera, US
4. JUWELS Booster Module, Germany
5. Tianhe-2A, China
6. Lenovo SuperMUC-NG
7. Cray/HPE Trinity
8. Sunway TaihuLight
9. Summit, U.S.
10. Sierra, U.S.
Dell Frontera:
Introduction:
Frontera is one of the most powerful supercomputers in the world, and the
fastest supercomputer on a university campus. Hundreds of U.S. scientists and
engineers use Frontera each year to power discoveries across a wide range of
scientific domains, from the quantum to the cosmic scale. Frontera is funded by
the National Science Foundation (NSF) and serves as the leadership-class system in NSF’s
cyberinfrastructure ecosystem. The system is currently the ninth most powerful
supercomputer in the world and the fastest non-accelerated (primarily CPU-based)
system in the world. Researchers are awarded time on Frontera based on
their need for very large-scale computing, and the ability to efficiently use a
supercomputer on the scale of Frontera. Frontera opens up new possibilities in
science and engineering by providing computational capability that makes it
possible for investigators to tackle much larger and more complex research
challenges across a wide spectrum of domains. Frontera also has multiple
storage systems, as well as interfaces to cloud and archive systems, and a set of
application nodes for hosting virtual servers.
Block Diagram:
Software:
It uses CentOS Linux 7 as its operating system and the Intel compiler. Its math
library is Intel MKL 18.0.5 and its MPI implementation is Intel MPI 18.0.5.
Functional Unit:
Frontera has two computing subsystems: a primary computing system
focused on double-precision performance, and a second subsystem focused
on single-precision streaming-memory computing. Frontera is a Dell C6420
system equipped with Intel processors. Using 448,448 of its Intel Xeon
Platinum cores, the system achieves 23.5 petaflops.
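As a rough illustration of these figures, the Python sketch below divides the quoted 23.5 petaflops by the 448,448 cores to estimate the average throughput per core. The function name per_core_gflops is hypothetical, and the result is only a back-of-the-envelope average, not a measured per-core value.

```python
# Back-of-the-envelope estimate; per_core_gflops is a hypothetical helper.
def per_core_gflops(total_pflops, cores):
    """Average GFLOPS per core, given system PFLOPS and core count."""
    total_gflops = total_pflops * 1_000_000  # 1 PFLOPS = 10^6 GFLOPS
    return total_gflops / cores


# Figures quoted in the text above for Frontera.
frontera = per_core_gflops(23.5, 448_448)
print(f"{frontera:.1f} GFLOPS per core on average")
```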
HPC5, Italy :
HPC5 is a supercomputer built by Dell and installed by Eni, capable of 51.721
petaflops, and is ranked 9th in the Top500 as of November 2021. In June 2020,
HPC5 ranked 6th in the Green500. HPC5 is an upgrade to the HPC4 system, which
was built by Hewlett Packard Enterprise and used by Eni. It is also called HPC4+.
Design :
HPC5 spans 1,820 Dell EMC PowerEdge C4140 servers, each with
two Intel Gold 6252 24-core processors and four Nvidia V100 GPU accelerators. In
total, the system comprises 7,280 NVIDIA V100 GPUs.
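The GPU total follows directly from the per-server configuration. The Python sketch below reproduces that arithmetic and, as a derived figure that is not stated in the text, also computes the total number of CPU cores. The function name system_totals is hypothetical.

```python
# Node configuration quoted in the text above.
SERVERS = 1820
GPUS_PER_SERVER = 4
CPUS_PER_SERVER = 2
CORES_PER_CPU = 24


def system_totals(servers, gpus_per_server, cpus_per_server, cores_per_cpu):
    """Return (total_gpus, total_cpu_cores) for a homogeneous cluster."""
    total_gpus = servers * gpus_per_server
    total_cores = servers * cpus_per_server * cores_per_cpu
    return total_gpus, total_cores


gpus, cores = system_totals(SERVERS, GPUS_PER_SERVER, CPUS_PER_SERVER, CORES_PER_CPU)
print("GPUs:", gpus)
print("CPU cores (derived, not quoted in the text):", cores)
```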
Block diagram:
Software :
CentOS Linux 7
Functional units :
Processor :
The Intel Xeon Gold 6252 is a 24-core processor clocked at 2.1 GHz that
provides the CPU performance of each server. Its Virtualization
Technology enables migration of more environments. It also supports Enhanced
SpeedStep technology, which allows tradeoffs to be made between performance
and power consumption.
Frontera, US:
Frontera is the successor to Stampede 2 and was planned as an academic supercomputer
with a peak performance of 35-40 petaflops. Frontera was expected to deliver a
3x speedup in real-application performance over Blue Waters at about one-third
the cost.
Block diagram:
Software :
CentOS Linux 7
Functional units :
Processor :
Intel Xeon Platinum 8280. Intel Xeon Platinum processors are aimed at critical and
hybrid cloud workloads, real-time analytics, machine learning, and artificial
intelligence. The processor is built on a 14 nm process and has a clock speed of up to 4 GHz.
Tianhe-2A:
INTRODUCTION:
Also known as the Milky Way-2A, this system is developed by China’s National
University of Defense Technology (NUDT) and deployed at the National
Supercomputer Center in Guangzhou. It is powered by Intel Xeon CPUs and NUDT’s
Matrix-2000 DSP accelerators and achieves 61.4 petaflops on HPL.
BLOCK DIAGRAM:
DETAILED DESCRIPTION OF FUNCTIONAL UNITS:
It was the world's fastest supercomputer according to the TOP500 lists for June
2013, November 2013, June 2014, November 2014, June 2015, and November
2015. The record was surpassed in June 2016 by the Sunway TaihuLight. In 2015,
plans of the Sun Yat-sen University in collaboration with Guangzhou district and
city administration to double its computing capacities were stopped by a U.S.
government rejection of Intel's application for an export license for the CPUs
and coprocessor boards.
In response to the U.S. sanction, China introduced the Sunway
TaihuLight supercomputer in 2016, which substantially outperforms the Tianhe-2
and now ranks fourth in the TOP500 list while using completely domestic technology,
including the Sunway manycore microprocessor. The sanction also affected the
upgrade of Tianhe-2 to Tianhe-2A, in which the U.S. technology was replaced.
According to NUDT, Tianhe-2 will be used for simulation, analysis, and
government security applications.
Lenovo SuperMUC-NG:
Introduction:
SuperMUC was a supercomputer of the Leibniz Supercomputing Centre (LRZ) of
the Bavarian Academy of Sciences. It was housed in the LRZ's data centre in
Garching near Munich. It was decommissioned in January 2020, having been
superseded by the more powerful SuperMUC-NG. SuperMUC (the suffix 'MUC'
alludes to the IATA code of Munich's airport) is operated by the Leibniz
Supercomputing Centre, a European centre for supercomputing. In order to
house its hardware, the infrastructure space of the Leibniz Supercomputing Centre
was more than doubled in 2012. SuperMUC was the fastest European supercomputer when it entered
operation in the summer of 2012 and is currently ranked #20 in the Top500 list of
the world's fastest supercomputers. SuperMUC serves European researchers in
many fields, including medicine, astrophysics, quantum chromodynamics,
computational fluid dynamics, computational chemistry, life sciences, genome
analysis and earthquake simulations.
Block Diagram:
Functional Units:
SuperMUC is an IBM iDataPlex system containing 19,252 Intel Xeon Sandy Bridge-EP
and Westmere-EX multi-core processors (155,656 cores), for a peak
performance of about 3 PFLOPS (3 × 10^15 FLOPS). It has 340 TB of main memory
and 15 PB of hard disk space. It uses a new form of cooling that IBM developed, called
Aquasar, that uses hot water to cool the processors. IBM claims that this design
saves 40 percent of the energy normally needed to cool a comparable system.
Cray/HPE Trinity:
Block Diagram:
Functional Units:
Its architecture is Cray XC40, with a memory capacity of 2.07 PiB. Its peak
performance is 41.5 PF/s. It has 19,420 compute nodes. Parallel file system
capacity is 78 PB (69 PiB) and Burst Buffer capacity is 3.7 PB. Its footprint is
4,606 sq ft, with a power requirement of 8.6 MW.
Trinity was built in 2 stages. The first stage incorporated the Intel Xeon Haswell
processor, while the second stage added a significant performance increase using the
Intel Xeon Phi Knights Landing processor. There are 301,952 Haswell and 678,912
Knights Landing processor cores in the combined system, yielding a total peak
performance of over 40 PF/s (petaflops).
There are 5 primary storage tiers: Memory, Burst Buffer, Parallel File System,
Campaign Storage, and Archive.
2 PiB of DDR4 DRAM provide physical memory for the machine. Each processor also
has DRAM built on to the tile, providing additional memory capacity. The data in
this tier is highly transient and is typically in residence for only a few seconds,
being overwritten continuously.
Cray supplies the three hundred XC40 DataWarp blades that each contain 2 Burst
Buffer nodes and 4 SSD drives. There is a total of 3.78 PB of storage in this tier,
capable of moving data at a rate of up to 2 TB/s. In this tier, data is typically
resident for a few hours, with data being overwritten in approximately that same
time frame.
Trinity uses a Sonexion-based Lustre file system with a total capacity of 78 PB.
Throughput on this tier is about 1.8 TB/s (1.6 TiB/s). It is used to stage data in
preparation for HPC operations. Data residence in this tier is typically several weeks.
The MarFS file system fits into the Campaign Storage tier and combines properties of
POSIX and object storage models. The capacity of this tier is growing at a rate of
about 30 PB/year, with a current capacity of over 100 PB. In testing, LANL scientists
were able to create 968 billion files in a single directory at a rate of 835 million
file creations per second. This storage is designed to be more robust than typical
object storage, while sacrificing some of the end-user functionality that a POSIX
system would provide. Performance of this tier is between 100 and 300 GB/s of
throughput. Data residence in this tier is longer term, typically lasting several months.
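To show how the capacity and throughput figures of the storage tiers relate, the Python sketch below estimates how long it would take to read or write an entire tier at its quoted peak rate. This is an idealised calculation: the dictionary keys and the drain_time_seconds helper are hypothetical, and real transfers would not sustain peak throughput.

```python
# Capacity and throughput figures quoted in the text above (decimal units).
# The tier labels and the drain_time_seconds helper are hypothetical.
TIERS = {
    "burst_buffer": {"capacity_tb": 3780.0, "throughput_tb_s": 2.0},
    "parallel_file_system": {"capacity_tb": 78000.0, "throughput_tb_s": 1.8},
}


def drain_time_seconds(capacity_tb, throughput_tb_s):
    """Idealised time to read or write a whole tier at peak throughput."""
    return capacity_tb / throughput_tb_s


for name, tier in TIERS.items():
    seconds = drain_time_seconds(tier["capacity_tb"], tier["throughput_tb_s"])
    print(f"{name}: {seconds / 3600:.2f} hours at peak rate")
```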
Sunway TaihuLight:
Introduction:
The Sunway TaihuLight is a Chinese supercomputer which, as of November 2021,
is ranked fourth in the TOP500 list, with a LINPACK benchmark rating of 93
petaflops. This is nearly three times as fast as the previous Tianhe-2, which ran
at 34 petaflops. The name is translated as divine power, the light of Taihu Lake.
As of June 2017, it is ranked as the 16th most energy-efficient supercomputer in the
Green500, with an efficiency of 6.051 GFlops/watt. It was designed by the
National Research Center of Parallel Computer Engineering & Technology
(NRCPC) and is located at the National Supercomputing Center in Wuxi in the
city of Wuxi, in Jiangsu province, China.
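The figures in this introduction can be related with some simple arithmetic. The Python sketch below computes the speedup over Tianhe-2 and the power draw implied by dividing the LINPACK rating by the Green500 efficiency. It assumes that both numbers refer to the same benchmark run, which the text does not state, and both function names are hypothetical.

```python
# Hypothetical helpers; inputs are the figures quoted in the text above.
def speedup(new_pflops, old_pflops):
    """Ratio of two performance ratings."""
    return new_pflops / old_pflops


def implied_power_mw(pflops, gflops_per_watt):
    """Power in MW implied by performance divided by efficiency.

    Assumes the rating and the efficiency come from the same run.
    """
    gflops = pflops * 1e6             # 1 PFLOPS = 10^6 GFLOPS
    watts = gflops / gflops_per_watt  # GFLOPS / (GFLOPS per watt) = watts
    return watts / 1e6                # watts -> megawatts


print(f"Speedup over Tianhe-2: {speedup(93, 34):.2f}x")
print(f"Implied power draw: {implied_power_mw(93, 6.051):.2f} MW")
```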
Block Diagram:
Software:
The system runs on its own operating system, Sunway RaiseOS 2.0.5,
which is based on Linux. The system has its own customized
implementation of OpenACC 2.0 to aid the parallelization of code.
Functional Units:
The Sunway TaihuLight uses a total of 40,960 Chinese-designed
SW26010 manycore 64-bit RISC processors based on the Sunway architecture. Each
processor chip contains 256 processing cores, and an additional four
auxiliary cores for system management (also RISC cores, just more
fully featured) for a total of 10,649,600 CPU cores across the entire
system.
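The total core count can be reproduced from the per-chip figures. The following Python sketch is only a check of that arithmetic, and the function name total_cores is hypothetical.

```python
# Per-chip figures quoted in the text above.
PROCESSORS = 40_960
COMPUTE_CORES_PER_CHIP = 256
AUX_CORES_PER_CHIP = 4  # auxiliary cores for system management


def total_cores(processors, compute_cores, aux_cores):
    """Total CPU cores across all processor chips."""
    return processors * (compute_cores + aux_cores)


print(total_cores(PROCESSORS, COMPUTE_CORES_PER_CHIP, AUX_CORES_PER_CHIP))
```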
Summit, US:
Summit is a supercomputer developed by IBM for Oak Ridge National
Laboratory. It is capable of 200 petaFLOPS, making it the second fastest
supercomputer. It was the fastest supercomputer until 2020. It is the fifth
most energy-efficient supercomputer as of 2019. Summit was the first
supercomputer to reach exaFLOP speed, on a non-standard metric. The Summit
supercomputer provides
scientists and researchers the opportunity to solve complex tasks in the
fields of energy, artificial intelligence, human health and other research
areas. It has been used in earthquake simulation, extreme weather
simulation using AI, materials science, genomics and in predicting the
lifetime of neutrinos in physics.
Hardware :
Each one of its 4,608 nodes has over 600 GB of coherent memory (96
GB HBM2 plus 512 GB DDR4 SDRAM), which is addressable by all CPUs
and GPUs, plus 800 GB of non-volatile RAM that can be used as a burst
buffer or as extended memory. The POWER9 CPUs and Nvidia
Volta GPUs are connected using NVIDIA's high-speed NVLink. This allows
for a heterogeneous computing model. To provide a high rate of data
throughput, the nodes are connected in a non-blocking fat-tree
topology using a dual-rail Mellanox EDR InfiniBand interconnect for both
storage and inter-process communications traffic, which delivers
200 Gbit/s of bandwidth between nodes.
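The per-node memory figures can be added up to confirm the "over 600 GB" claim and to estimate the coherent memory of the whole system. The system-wide total below is a derived figure that is not quoted in the text, and the function name coherent_memory_gb is hypothetical.

```python
# Per-node figures quoted in the text above.
NODES = 4608
HBM2_GB = 96
DDR4_GB = 512
NVRAM_GB = 800


def coherent_memory_gb(hbm2_gb, ddr4_gb):
    """Coherent memory per node: GPU HBM2 plus CPU DDR4."""
    return hbm2_gb + ddr4_gb


per_node = coherent_memory_gb(HBM2_GB, DDR4_GB)
system_total_gb = per_node * NODES  # derived, not quoted in the text

print("Coherent memory per node (GB):", per_node)
print("Coherent memory, whole system (GB):", system_total_gb)
print("Non-volatile RAM, whole system (GB):", NVRAM_GB * NODES)
```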
Block diagram :
Functional units :
Weighing over 340 tons, Summit takes up 5,600 sq. ft. of floor space
at Oak Ridge National Laboratory. Summit consists of 256 compute
racks, 40 storage racks, 18 switching director racks, and 4 infrastructure
racks. Servers are linked via Mellanox IB EDR interconnect in a three-
level non-blocking fat-tree topology.
Compute Rack:
Sierra, U.S.:
BLOCK DIAGRAM:
OPERATING SYSTEM:
Like Summit, Sierra also uses a Linux operating system.
Functional units:
Power9:
POWER9 is a family of superscalar, multithreading, multi-core
microprocessors produced by IBM, based on the Power ISA. It was
announced in August 2016. The POWER9-based processors are
manufactured using a 14 nm FinFET process, in 12- and 24-core
versions, for scale-out and scale-up applications, and possibly other
variations, since the POWER9 architecture is open for licensing and
modification by the OpenPOWER Foundation members.
GPU:
With 640 Tensor Cores, V100 is the world’s first GPU to break the 100
teraFLOPS (TFLOPS) barrier of deep learning performance. The next
generation of NVIDIA NVLink™ connects multiple V100 GPUs at up to
300 GB/s to create the world’s most powerful computing servers. AI
models that would consume weeks of computing resources on previous
systems can now be trained in a few days. With this dramatic reduction
in training time, a whole new world of problems will now be solvable
with AI.