04. Huawei Atlas Data Center AI Training
1 Trends
3 Scenario
2 Huawei Confidential
Development Trends
Shifting from General to Heterogeneous Computing, Advancing Moore's Law
Unstructured data increases exponentially
A paradigm shift from general to heterogeneous computing
• Moore's Law is weakening
• Innovations in heterogeneous computing architecture will break the bottlenecks and advance Moore's Law
[Figure: growth of structured and unstructured data alongside CPU, GPU, and FPGA computing, 1980 to 2030]
Challenges
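[Figure: compute used in AI training over time, starting from the Perceptron. Compute doubled every 2 years before 2012 and has doubled every 3.4 months since 2012.]
Unit: petaflop/s-days
Data source: OpenAI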
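The two doubling periods on this slide imply very different yearly growth rates. The following is a minimal sketch of that arithmetic, assuming plain exponential growth; the function name `growth_factor` is illustrative and does not come from the deck.

```python
def growth_factor(months_elapsed: float, doubling_months: float) -> float:
    """Multiplicative growth after months_elapsed, given a doubling period."""
    return 2.0 ** (months_elapsed / doubling_months)

# Growth over one year when compute doubles every 2 years (24 months).
slow = growth_factor(12, 24)
# Growth over one year when compute doubles every 3.4 months.
fast = growth_factor(12, 3.4)

print(f"2-year doubling:    x{slow:.2f} per year")
print(f"3.4-month doubling: x{fast:.1f} per year")
```

Under these assumptions, the slower trend grows by roughly 1.4x per year, while the faster trend grows by more than 11x per year.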
Huawei Data Center Solution: Building the Industry's Most Powerful AI Training Platform
Ascend 910 AI processor
Ascend 910
High integration
32 Huawei Da Vinci AI Cores + 16 TaiShan Cores + 100G RoCE v2 NIC, integrating AI computing, general computing, and I/O all in one processor
Architecture: Da Vinci
256 TFLOPS FP16
128-channel full HD video decoding
High-speed connection
Integrates the HCCS, PCIe 4.0, and 100G RoCE
[Figure: computing power comparison, normalized to 16-bit: bar 1 at 45T, bar 2 at 90T, bar 3 at 125T, alongside Ascend 910]
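The headline figures for the Ascend 910 can be related with simple arithmetic. The sketch below assumes that the 256 TFLOPS FP16 peak divides evenly across the 32 Da Vinci AI Cores, and that the peak of an 8-processor Atlas 800 server is a plain sum of per-processor peaks. Both are assumptions for illustration, not statements from the deck.

```python
ASCEND_910_TFLOPS_FP16 = 256   # peak per processor, from the slide
DA_VINCI_CORES = 32            # AI cores per processor, from the slide
PROCESSORS_PER_SERVER = 8      # Atlas 800 configuration, from the deck

# Assumption: peak throughput divides evenly across the AI cores.
tflops_per_core = ASCEND_910_TFLOPS_FP16 / DA_VINCI_CORES

# Assumption: server peak is the sum of the processor peaks.
server_tflops = ASCEND_910_TFLOPS_FP16 * PROCESSORS_PER_SERVER
server_pflops = server_tflops / 1000

print(tflops_per_core)   # 8.0
print(server_pflops)     # 2.048
```

The second result agrees with the "up to 2 PFLOPS FP16 per server" figure quoted for the Atlas 800.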
Ascend 910
16 CPU Cores
Integrated high-speed network port
Atlas 800
Kunpeng 920 processor
32 DDR4 DIMMs
PCIe module
PSU
8 x 100GE optical ports
Atlas 800
Atlas 800 AI Server: Industry's Highest Computing Density
Model: 9000
Atlas 800 AI server
• Computing power: 2 PFLOPS
• Height: 4U
• Power consumption: 5.5 kW
Competitor's server
• Computing power: 1 PFLOPS
• Height: 4U
• Power consumption: 4.4 kW
• Superior computing power: 8 Ascend 910 AI processors, up to 2 PFLOPS FP16 per server, outstripping competitor products by far
[Figure: to deliver 2 PFLOPS, Vendor 1 needs 8U while Atlas 800 needs 4U: 2x computing density]
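The 2x computing density claim can be checked from the figures on this slide, read as 2 PFLOPS, 4U, and 5.5 kW for the Atlas 800 and 1 PFLOPS, 4U, and 4.4 kW for the competitor. The sketch below is illustrative only; the `Server` class and its property names are hypothetical, and the power-efficiency ratio is a derived figure that the slide itself does not state.

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    pflops: float     # peak FP16 computing power
    height_u: int     # chassis height in rack units
    power_kw: float   # power consumption

    @property
    def pflops_per_u(self) -> float:
        """Computing density: peak PFLOPS per rack unit."""
        return self.pflops / self.height_u

    @property
    def tflops_per_kw(self) -> float:
        """Power efficiency: peak TFLOPS per kilowatt."""
        return self.pflops * 1000 / self.power_kw

atlas = Server("Atlas 800", pflops=2, height_u=4, power_kw=5.5)
rival = Server("Competitor", pflops=1, height_u=4, power_kw=4.4)

density_ratio = atlas.pflops_per_u / rival.pflops_per_u
efficiency_ratio = atlas.tflops_per_kw / rival.tflops_per_kw

print(f"density ratio:    {density_ratio:.1f}x")
print(f"efficiency ratio: {efficiency_ratio:.1f}x")
```

With these inputs the density ratio is 2.0x, matching the slide, and the derived peak compute per kilowatt is about 1.6x that of the competitor.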
Atlas 900
[Figure: benchmark time comparison across three systems]
Time: 79.8s, 70.2s, 59.8s (15% faster)
Computing power per processor: 125 TFLOPS, 125 TFLOPS, 256 TFLOPS
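The 15% figure appears to compare the 59.8s result with the 70.2s result; that pairing is an assumption based on the numbers, not something the slide states. A minimal sketch of the calculation, with an illustrative function name:

```python
def time_reduction(baseline_s: float, new_s: float) -> float:
    """Fractional reduction in elapsed time relative to the baseline."""
    return (baseline_s - new_s) / baseline_s

reduction = time_reduction(70.2, 59.8)
print(f"{reduction:.1%}")   # 14.8%
```

Rounded to the nearest whole percent, this gives the 15% shown on the slide.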
Atlas 900
iLossless, a unique, intelligent lossless switching algorithm, learns from network traffic in real time, achieving zero packet loss and E2E μs-level latency.
[Figure: AI servers interconnected through HCCS, PCIe 4.0, and 100G RoCE]
Atlas Accelerates AI Model Training for Various Applications
[Figure: training workflow. Inputs: dataset, algorithms & models. Steps: parameter tuning, computing, model verification. Output: trained models, followed by tailoring and quantization, then AI services.]
Applications: video analysis, gene research, weather forecast, oil exploration
Platforms: Atlas 300T AI accelerator card (Model: 9000), Atlas 800 AI server (Model: 9000), Atlas 900 AI cluster
AI Supercomputing: Builds the Infrastructure for Cloud Services
Atlas 900
AI supercomputing powered by Kunpeng and Ascend
256–1024 PFLOPS FP16
World's No. 1 in performance benchmark test: 59.8s
[Figure: benchmark time comparison: 76.8s, 70.2s, 59.8s]
AI applications boost the development of the Greater Bay Area (GBA)
Smart transportation, smart healthcare, smart finance, and more
National strategy
• AI supercomputing platform with international
Shenzhen
• computing system and robot system in the GBA
• Improves the basic position and innovation capability of AI research on open source platforms and intelligent applications in the GBA
• Attracts national AI resources and talents
Benchmarking with NVIDIA: TCO reduced by 9.3% for the same computing power
1. Doubled computing power
Interconnects over 1024 Ascend 910 AI processors, providing double computing performance on a single chip
3. Over 60% electricity saving and 80% smaller footprint
Hybrid liquid cooling system for 50 kW per rack, PUE < 1.1
Ultra-high-density prefabricated modular equipment room, low power consumption, fast deployment, and exascale CloudBrain cluster rollout in six months
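Two of the figures on this slide lend themselves to a quick sanity check: the aggregate peak of a 1024-processor cluster, and the facility power implied by a PUE of 1.1 at 50 kW per rack. The sketch below assumes perfect linear scaling of peak compute (an idealization) and uses the standard definition of PUE as total facility power divided by IT power. The function names are illustrative.

```python
TFLOPS_PER_ASCEND_910 = 256  # FP16 peak per processor, from the deck

def cluster_pflops(num_processors: int) -> float:
    """Aggregate peak in PFLOPS, assuming perfect linear scaling."""
    return num_processors * TFLOPS_PER_ASCEND_910 / 1000

def facility_power_kw(it_power_kw: float, pue: float) -> float:
    """Total facility power, using PUE = facility power / IT power."""
    return it_power_kw * pue

peak_1024 = cluster_pflops(1024)
rack_total = facility_power_kw(50, 1.1)

print(f"{peak_1024:.1f} PFLOPS")   # 262.1 PFLOPS
print(f"{rack_total:.1f} kW")      # 55.0 kW
```

Under these assumptions, 1024 processors give about 262 PFLOPS, close to the lower end of the quoted 256–1024 PFLOPS range, and a PUE below 1.1 keeps cooling and other overhead under 5 kW for each 50 kW rack.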
Thank you.
Bring digital to every person, home and organization for a fully connected, intelligent world.