Parallel Computer Architectures
Although computers keep getting faster, the demands placed on them
are increasing at least as fast. Although clock rates are continually
rising, circuit speed cannot be increased indefinitely. The speed of
light is already a major problem for designers of high-end computers,
and the prospects of getting electrons and photons to move faster are
dim. Heat-dissipation issues are turning supercomputers into state-of-
the-art air conditioners. Finally, as transistor sizes continue to shrink,
at some point each transistor will have so few atoms in it that
quantum mechanical effects may become a major problem.
Therefore, in order to handle larger and larger problems, computer
architects are turning increasingly to parallel computers. While it may
not be possible to build a computer with one CPU and a cycle time of
0.001 nsec, it may well be possible to build one with 1000 CPUs each
with a cycle time of 1 nsec. Although the latter design uses slower
CPUs than the former one, its total computing capacity is
theoretically the same.
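The equivalence of the two designs' total capacity can be checked with simple arithmetic. The sketch below uses the illustrative numbers from the text (one hypothetical CPU at 0.001 nsec versus 1000 CPUs at 1 nsec each):

```python
# Aggregate capacity comparison, using the illustrative numbers
# from the text (cycle time in nanoseconds).

# One hypothetical CPU with a 0.001 nsec cycle time:
single_cpu_cycle_ns = 0.001
single_capacity = 1 / (single_cpu_cycle_ns * 1e-9)   # cycles per second

# 1000 CPUs, each with a 1 nsec cycle time:
n_cpus = 1000
per_cpu_cycle_ns = 1.0
parallel_capacity = n_cpus / (per_cpu_cycle_ns * 1e-9)

# Both work out to 10^12 cycles per second in total.
print(single_capacity, parallel_capacity)
```

Of course, this equality is only theoretical: it assumes the workload can be split perfectly across the 1000 CPUs with no communication or synchronization overhead.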
2 CENG 222 - Spring 2012-2013 Dr. Yuriy ALYEKSYEYENKOV
Parallel Computer Architectures
Parallelism can be introduced at various levels. At the
lowest level, it can be added to the CPU chip, by pipelining and
superscalar designs with multiple functional units. It can also
be added by having very long instruction words with implicit
parallelism. Special features can be added to a CPU to allow it
to handle multiple threads of control at once.
Finally, multiple CPUs can be put together on the same
chip. Together, these features can pick up perhaps a factor of
10 in performance over purely sequential designs. At the next
level, extra CPU boards with additional processing capacity can
be added to a system. Usually, these plug-in CPUs have
specialized functions, such as network packet processing,
multimedia processing, or cryptography. For specialized
applications, they can also gain a factor of perhaps 5 to 10.
On-Chip Multithreading
All modern, pipelined CPUs have an inherent problem: when a
memory reference misses the level 1 and level 2 caches, there is
a long wait until the requested word (and its associated cache
line) is loaded into the cache, so the pipeline stalls. One
approach to dealing with this situation, called on-chip
multithreading, allows the CPU to manage multiple threads of
control at the same time in an attempt to mask these stalls. In
short, if thread 1 is blocked, the CPU still has a chance of
running thread 2 in order to keep the hardware fully occupied.
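The effect of masking stalls can be illustrated with a toy simulation. The parameters below (a 3-cycle miss penalty after every issued instruction, one issue slot per cycle) are assumptions for illustration, not figures from the text:

```python
# Toy simulation of on-chip multithreading (illustrative parameters):
# each thread issues one instruction and then stalls for MISS_PENALTY
# cycles waiting on memory. With enough threads, the CPU can always
# find a ready thread and the stalls are hidden.

MISS_PENALTY = 3     # cycles a thread is blocked after each issue
TOTAL_CYCLES = 1000

def utilization(n_threads):
    """Fraction of cycles in which some thread issues an instruction."""
    ready_at = [0] * n_threads   # first cycle each thread can issue again
    busy = 0
    for cycle in range(TOTAL_CYCLES):
        for t in range(n_threads):
            if ready_at[t] <= cycle:
                busy += 1                                # thread t issues
                ready_at[t] = cycle + 1 + MISS_PENALTY   # then stalls
                break                                    # one issue slot/cycle
    return busy / TOTAL_CYCLES

print(utilization(1))  # 0.25: a lone thread is stalled 3 of every 4 cycles
print(utilization(4))  # 1.0: four threads keep the hardware fully occupied
```

With one thread the pipeline is idle 75% of the time; with four threads there is always a ready thread, so utilization reaches 100%, which is exactly the point of on-chip multithreading.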
[Figure: layout of a packet on the Ethernet; the payload is labeled ‘‘useful information’’.]
The figure shows what the packet looks like on the Ethernet; on the
telephone line, it is similar except with a ‘‘telephone line header’’
instead of an Ethernet header. Header management is important and
is one of the things network processors can do.
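Header management can be sketched in a few lines of code. In the sketch below, the Ethernet II header layout (6-byte destination MAC, 6-byte source MAC, 2-byte type field) is the real format; the inner ‘‘IP-like’’ and ‘‘TCP-like’’ headers are simplified placeholders, not the actual protocol layouts:

```python
import struct

# The payload ("useful information") is wrapped in successively
# lower-level headers, ending with the 14-byte Ethernet header.

payload = b"useful information"

# Simplified placeholder headers (NOT real TCP/IP layouts):
tcp_like = struct.pack("!HH", 80, 12345) + payload      # two port fields
ip_like = struct.pack("!H", len(tcp_like)) + tcp_like   # a length field

# Real Ethernet II header layout: dst MAC, src MAC, type (0x0800 = IPv4).
dst_mac = bytes.fromhex("ffffffffffff")
src_mac = bytes.fromhex("0242ac110002")
eth_frame = struct.pack("!6s6sH", dst_mac, src_mac, 0x0800) + ip_like

# Swapping the link-layer header (e.g. for the telephone line) only
# requires stripping the first 14 bytes and prepending a new header:
inner_packet = eth_frame[14:]
print(inner_packet == ip_like)  # True
```

This is the kind of strip-and-prepend work a network processor does at line rate for every packet that crosses between different link technologies.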
COPROCESSORS. Network Processors
One kind of hardware solution for fast packet processing is to use a
custom ASIC (Application-Specific Integrated Circuit). Such a chip
is like a hardwired program that does whatever set of processing
functions it was designed for. Many current routers use ASICs.
ASICs have many problems, however. First, they take a long time to
design and a long time to manufacture. They are also rigid, so if new
functionality is needed, a new chip has to be designed and
manufactured.
A second solution is the FPGA (Field Programmable Gate Array)
which is a collection of gates that can be organized into the desired
circuit by rewiring them in the field. These chips have a much shorter
time to market than ASICs and can be rewired in the field by
removing them from the system and inserting them into a special
reprogramming device. On the other hand, they are complex, slow,
and expensive, making them unattractive except for niche
applications.
COPROCESSORS. Network Processors
PPEs, variously called Protocol/Programmable/Packet Processing Engines.