High Bandwidth Memory (HBM)
Introduction
HBM (High Bandwidth Memory) is a newer type of memory ("RAM") for CPUs and GPUs. Multiple
DRAM dies are stacked vertically and packaged together with the processor, forming a
high-capacity, high-bit-width memory array.
Disadvantages of HBM
Poor flexibility
HBM was first initiated by AMD in 2008, with the original aim of reducing the power
consumption and physical size of computer memory. Over the following years, AMD worked
through the technical problems of die stacking and found industry partners with experience
in stacking storage media, including SK Hynix, as well as manufacturers in the interposer
and packaging fields.
HBM was first manufactured by SK Hynix in 2013, the same year it was adopted by JEDEC (the
Joint Electron Device Engineering Council) as the JESD235 standard. The first GPU to use
HBM was AMD's Fiji (Radeon R9 Fury X) in 2015. The following year, Samsung began mass
production of HBM2, and NVIDIA's Tesla P100 became the first GPU to use HBM2.
HBM's physical form reveals its first shortcoming: a lack of flexibility in system
configuration. For PCs, expanding memory capacity has long been a routine capability. But
because HBM is packaged together with the main chip, there is no possibility of capacity
expansion; the specification is fixed at the factory. This goes a step beyond today's
notebooks, where DDR memory is merely soldered to the motherboard: HBM is integrated by the
chip manufacturer itself, so its flexibility is weaker still, especially for OEMs.
For most chip manufacturers selling processors into the mass market (including the
infrastructure market), launching SKUs with many different memory capacities is unlikely,
for reasons that include cost. These processors already ship in many configurations (Intel
Core processors, for example, span numerous models); further subdividing each by memory
capacity could make the manufacturing cost hard to justify.
Capacity is too small
The second problem with HBM is that its capacity is more limited than DDR's. A single HBM
package can stack 8 DRAM dies; at 8 Gbit per die, that is 8 GByte per stack. A
supercomputing chip like the A64FX provides 4 HBM interfaces, i.e., 4 HBM stacks, for a
total on-package capacity of 32 GByte.
Such a capacity is small compared with DDR. Ordinary consumer PCs commonly exceed 32 GByte
of memory. Not only do PC and server motherboards carry large numbers of expandable memory
slots, but some DDR4/5 DIMMs also stack DRAM dies. Using relatively high-end stacked dies, a
2-rank RDIMM (registered DIMM) can reach 128 GByte; with the 96 DIMM slots of a high-end
server, that is up to 12 TByte, as the sketch below works out.
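
A minimal sketch of the capacity arithmetic above, in Python, using only the figures quoted
in the text:

# Capacity arithmetic from the text (all figures quoted above).
GBIT_PER_DIE = 8        # 8 Gbit per DRAM die
DIES_PER_STACK = 8      # 8-high die stack in one HBM package

stack_gbyte = GBIT_PER_DIE * DIES_PER_STACK / 8   # Gbit -> GByte
print(f"one HBM stack:        {stack_gbyte:.0f} GByte")      # 8 GByte

# An A64FX-style chip exposes 4 HBM interfaces (4 stacks):
print(f"4 stacks on one chip: {4 * stack_gbyte:.0f} GByte")  # 32 GByte

# DDR side: 128 GByte RDIMMs filling 96 slots in a high-end server
print(f"96 x 128 GByte DIMMs: {96 * 128 / 1024:.0f} TByte")  # 12 TByte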
HBM3
From the PC era to the mobile and AI era, chip architecture has shifted from CPU-centric to
data-centric. The challenge AI poses is not only compute power but also memory bandwidth:
even though DDR and GDDR rates are relatively high, many AI algorithms and neural networks
repeatedly run into memory bandwidth limits. HBM, with its focus on high bandwidth, has
become the preferred DRAM for high-performance chips.
At the moment, JEDEC has not yet released the final draft of the HBM3 standard, but the IP
vendors involved in drafting it are already prepared. Not long ago, Rambus was the first to
announce a memory subsystem supporting HBM3; more recently, Synopsys announced the
industry's first complete HBM3 IP and verification solution.
As early as the beginning of 2021, SK Hynix gave a forward-looking outlook on HBM3
performance, citing bandwidth above 665 GB/s and I/O speeds above 5.2 Gbps, while noting
these were transitional figures. Also in 2021, data released by IP vendors raised the
ceiling further: Rambus announced an HBM3 memory subsystem with I/O speeds as high as
8.4 Gbps and memory bandwidth as high as 1.075 TB/s.
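
These headline numbers follow directly from the per-pin rate. A minimal sketch of the
conversion, assuming HBM's 1024-bit interface per stack (not stated in the text, but the
quoted figures are consistent with it):

# Per-stack HBM bandwidth from per-pin I/O rate.
BUS_WIDTH_BITS = 1024   # assumed 1024-bit interface per HBM stack

def stack_bandwidth_gbs(pin_rate_gbps):
    # GB/s = (Gbit/s per pin) * (bus width in bits) / (8 bits per byte)
    return pin_rate_gbps * BUS_WIDTH_BITS / 8

print(stack_bandwidth_gbs(5.2))  # 665.6  -> SK Hynix's ">665 GB/s" figure
print(stack_bandwidth_gbs(8.4))  # 1075.2 -> Rambus's "1.075 TB/s" figure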
In June 2022, Taiwan's Creative Electronics released an AI/HPC/networking platform based on
TSMC's CoWoS technology, equipped with HBM3 controller and PHY IP supporting I/O speeds of
up to 7.2 Gbps. Creative Electronics is also applying for an interposer wiring patent that
supports zigzag wiring at any angle and allows the HBM3 IP to be split across two SoCs.
The complete HBM3 IP solution announced by Synopsys provides controller, PHY, and
verification IP for 2.5D multi-die package systems, which the company says lets designers
use lower-power, higher-bandwidth memory in their SoCs. Synopsys' DesignWare HBM3 controller
and PHY IP build on silicon-proven HBM2E IP, with the HBM3 PHY implemented on a 5 nm
process. Each pin can run at up to 7200 Mbps, raising memory bandwidth to 921 GB/s.
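
Plugging the 7200 Mbps pin rate into the same per-stack conversion sketched above (reusing
the hypothetical stack_bandwidth_gbs helper) reproduces the quoted figure:

print(stack_bandwidth_gbs(7.2))  # 921.6 -> the ~921 GB/s quoted for the Synopsys IP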
At present, Micron, Samsung, SK Hynix, and other memory manufacturers are already tracking
this new DRAM standard. SoC designer Socionext has worked with Synopsys to bring HBM3 into
its multi-die designs. Beyond the x86 ecosystem, which must of course be supported, Arm's
Neoverse N2 platform plans to support HBM3, and SiFive has added HBM3 IP to its RISC-V SoCs.
Even before JEDEC published the official HBM3 standard (it ultimately did so as JESD238 in
January 2022), the ecosystem was already moving ahead of it.
References
www.utmel.com
www.jedec.org