02 Parallel Platforms Architecture-I

Parallel Computer Architecture/Programming Models
Introduction to Parallel and Distributed Computing
Lecture 2 – Parallel Architecture


Learning Outcomes
• Define parallelism in hardware and software.
• List the types of parallel computer architectures.
• Explain different parallel computer architectures.
• Define Flynn’s taxonomy.

Outline
• Parallel architecture types
• Instruction-level parallelism
• Vector processing
• Flynn’s Taxonomy
• Shared memory
– Memory organization: UMA, NUMA
– Coherency: CC-UMA, CC-NUMA
• Interconnection networks
• Distributed memory
• Clusters
• Clusters of SMPs
• Heterogeneous clusters of SMPs
Introduction to Parallel Computing, University of Oregon, IPCC
Parallelism and Parallel vs. Distributed

• Parallel computing generally


means:
– Vector processing of data
– Multiple CPUs in a single
computer
• Distributed computing
generally means:
– Multiple CPUs across many
computers

Different Workers
• Different threads in the same core
• Different cores in the same CPU
• Different CPUs in a multi-processor system
• Different machines in a distributed system
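
As an illustration (not part of the original slides), a minimal C++ sketch of these workers in software, assuming std::thread: hardware_concurrency() reports how many hardware threads the machine exposes, and one software thread is launched per hardware thread.

    // workers.cpp -- illustrative sketch only, not from the original slides.
    // Build: g++ -std=c++17 -pthread workers.cpp -o workers
    #include <iostream>
    #include <thread>
    #include <vector>

    int main() {
        // Hardware threads = cores x hardware threads per core (may report 0 if unknown).
        unsigned n = std::thread::hardware_concurrency();
        if (n == 0) n = 4;

        std::vector<std::thread> workers;
        for (unsigned id = 0; id < n; ++id) {
            workers.emplace_back([id] {
                // Each software thread is a separate worker that the OS may
                // schedule on a different core or hardware thread.
                (void)id;
            });
        }
        for (auto& t : workers) t.join();
        std::cout << "ran " << n << " worker threads\n";
        return 0;
    }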



How do you get parallelism in the hardware?
• Task parallelism
– Distribute instructions (tasks) across processors
• Data parallelism
– Distribute data across processors (see the sketch after this list)
• Instruction-Level Parallelism (ILP)
• Bit-level parallelism
• Processor parallelism
– Increase number of processors
• Memory system parallelism
– Increase number of memory units
– Increase bandwidth to memory
• Communication parallelism
– Increase amount of interconnection between elements
– Increase communication bandwidth
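
A hedged sketch of task vs. data parallelism using plain C++ threads (the file name and variable names are illustrative, not from the slides): two threads doing different work is task parallelism; two threads doing the same work on different halves of an array is data parallelism.

    // task_vs_data.cpp -- illustrative sketch only.
    // Build: g++ -std=c++17 -pthread task_vs_data.cpp -o task_vs_data
    #include <algorithm>
    #include <iostream>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main() {
        std::vector<double> a(1'000'000, 1.0);

        // Task parallelism: two threads perform *different* operations concurrently.
        double sum = 0.0, maxv = 0.0;
        std::thread t1([&] { sum  = std::accumulate(a.begin(), a.end(), 0.0); });
        std::thread t2([&] { maxv = *std::max_element(a.begin(), a.end()); });
        t1.join(); t2.join();

        // Data parallelism: two threads perform the *same* operation on different halves.
        double lo = 0.0, hi = 0.0;
        auto mid = a.begin() + a.size() / 2;
        std::thread t3([&] { lo = std::accumulate(a.begin(), mid, 0.0); });
        std::thread t4([&] { hi = std::accumulate(mid, a.end(), 0.0); });
        t3.join(); t4.join();

        std::cout << sum << " " << maxv << " " << (lo + hi) << "\n";
        return 0;
    }
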
Parallel Architecture Types
• Uniprocessor
– Scalar processor
– Vector processor
– Data parallel (Flynn’s Taxonomy: SISD, SIMD, MIMD)
• SPMD
• Shared Memory Multiprocessor (SMP)
– Shared memory address space
– Bus-based memory system
– Interconnection network
[Diagrams: scalar and vector uniprocessors with a single memory; SMPs in which processors reach memory over a shared bus or an interconnection network]

Parallel Architecture Types (2)
• Distributed Memory Multiprocessor
– Message passing between nodes
– Massively Parallel Processor (MPP)
• Cluster of SMPs
– Shared memory addressing within an SMP node
– Message passing between SMP nodes
– Can also be regarded as an MPP if the processor count is large
[Diagrams: a distributed-memory machine built from processor/memory pairs on an interconnection network, and a cluster of SMP nodes joined through network interfaces]
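
A hedged sketch of the cluster-of-SMPs model, assuming an MPI installation is available (MPI, mpicxx, and mpirun are assumptions, not part of the original slides): message passing combines results between processes on different nodes, while shared-memory threads do the work inside each process.

    // cluster_sketch.cpp -- illustrative hybrid message-passing + threads sketch.
    // Build (assumes MPI is installed): mpicxx -std=c++17 -pthread cluster_sketch.cpp
    // Run: mpirun -np 4 ./a.out
    #include <mpi.h>
    #include <cstdio>
    #include <thread>
    #include <vector>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);               // typically one MPI process per SMP node
        int rank = 0, size = 1;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        // Shared-memory parallelism inside the node: threads share the partial[] array.
        unsigned nthreads = std::thread::hardware_concurrency();
        if (nthreads == 0) nthreads = 2;
        std::vector<double> partial(nthreads, 0.0);
        std::vector<std::thread> pool;
        for (unsigned t = 0; t < nthreads; ++t)
            pool.emplace_back([&, t] { partial[t] = 1.0; /* stand-in for real work */ });
        for (auto& t : pool) t.join();

        double local_sum = 0.0;
        for (double p : partial) local_sum += p;

        // Message passing between nodes: combine per-node results at rank 0.
        double global_sum = 0.0;
        MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            std::printf("global sum = %f (from %d processes)\n", global_sum, size);

        MPI_Finalize();
        return 0;
    }

In practice the intra-node part is more often written with OpenMP than raw threads, but the structure is the same: shared memory inside a node, messages between nodes.
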
Parallel Architecture Types (3)
• Multicore
– Multicore processor; cores can be hardware multithreaded (hyperthreading)
– GPU accelerator attached (e.g., over PCI)
– “Fused” processor + accelerator
• Multicore SMP+GPU Cluster
– Shared memory addressing within an SMP node
– Message passing between SMP nodes
– GPU accelerators attached
[Diagrams: a multicore chip with per-core caches and shared memory, a GPU attached over PCI, a fused processor/accelerator, and a cluster of SMP+GPU nodes on an interconnection network]

Instruction-Level Parallelism

Simultaneous execution of multiple instructions of a program.
• Opportunities for splitting up instruction processing:
– Pipelining within an instruction
– Pipelining between instructions
– Overlapped execution
– Multiple functional units
– Out-of-order execution
– Multi-issue execution
– Superscalar processing
– Hardware multithreading (hyperthreading)
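
A small illustrative example (not from the slides) of what ILP hardware exploits: the first loop is one long dependence chain, while the second keeps four independent partial sums that a pipelined, multi-issue core can overlap.

    // ilp_sketch.cpp -- illustrative; actual speedup depends on the core and compiler.
    #include <cstdio>

    int main() {
        const int N = 1 << 20;
        static float x[1 << 20];
        for (int i = 0; i < N; ++i) x[i] = 1.0f;

        // One dependence chain: each add must wait for the previous one.
        float s = 0.0f;
        for (int i = 0; i < N; ++i) s += x[i];

        // Four independent chains: a superscalar, pipelined core can keep
        // several floating-point adds in flight at once.
        float s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        for (int i = 0; i < N; i += 4) {
            s0 += x[i];      s1 += x[i + 1];
            s2 += x[i + 2];  s3 += x[i + 3];
        }
        std::printf("%f %f\n", s, s0 + s1 + s2 + s3);
        return 0;
    }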

Parallelism in Single Processor Computers

• History of processor architecture innovation
[Figure: a taxonomy of single-processor parallelism. Unpipelined machines with multiple execution units used scoreboarding (CDC 6600) or reservation stations (IBM 360/91); pipelined machines issuing only scalar instructions used horizontal control / VLIW (FPS AP-120B) or issue-when-ready scheduling (CDC 7600); machines with vector instructions were register-to-register (CRAY-1) or memory-to-memory (CDC Cyber-205)]

Vector Processing
• Scalar processing
– Processor instructions operate on scalar values
– integer registers and floating-point registers
• Vectors
– Set of scalar data
– Vector registers
• integer, floating point (typically)
– Vector instructions operate on vector registers (SIMD)
• Issues
– Vector unit pipelining
– Multiple vector units
– Vector chaining
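
A minimal sketch (not from the slides; the function name saxpy is illustrative) of a vector-friendly operation: the loop applies the same multiply-add to every element, so a vector unit or SIMD instructions can process several elements per instruction, and with optimization enabled mainstream compilers will often auto-vectorize it.

    // saxpy.cpp -- illustrative; e.g. g++ -O3 saxpy.cpp (the compiler may emit SIMD code).
    #include <cstdio>
    #include <vector>

    // y = a*x + y over whole vectors: the classic vector operation.
    void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
        for (std::size_t i = 0; i < x.size(); ++i)
            y[i] = a * x[i] + y[i];   // same operation applied to every element
    }

    int main() {
        std::vector<float> x(1024, 2.0f), y(1024, 1.0f);
        saxpy(3.0f, x, y);
        std::printf("y[0] = %f\n", y[0]);   // 3*2 + 1 = 7
        return 0;
    }

Vector chaining and multiple vector units, listed above, are hardware ways of overlapping several such vector operations.
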
Flynn’s Taxonomy and Data Parallelism
• One of the more widely used parallel computer
classifications, since 1966, is called Flynn’s Taxonomy
• It distinguishes multiprocessor computers according to
the dimensions of Instruction and Data
                       Instructions
                       Single (SI)    Multiple (MI)
Data   Single (SD)     SISD           MISD
       Multiple (MD)   SIMD           MIMD

• SISD: Single instruction stream, Single data stream (a single-threaded process)
• SIMD: Single instruction stream, Multiple data streams (vector processing)
• MISD: Multiple instruction streams, Single data stream (pipeline architecture)
• MIMD: Multiple instruction streams, Multiple data streams (multi-threaded programming)
SISD Machines
• A serial (non-parallel) computer
• Single instruction: Only one instruction stream is acted on by the CPU during any one clock cycle
• Single data: Only one data stream is used as input during any one clock cycle
• Deterministic execution
• Oldest and most prevalent form of computer

• Examples: most PCs, single-CPU workstations and mainframes

[Figure: one processor fetching a single instruction stream and operating on a single data stream]

SIMD Machines (I)
• A type of parallel computer
• Single instruction: All processing units execute the same instruction at any given clock cycle
• Multiple data: Each processing unit can operate on a different data element
• It typically has an instruction dispatcher, a very high-bandwidth internal network, and a very large array of small-capacity processing units
• Best suited for specialized problems, e.g., image processing
• Two varieties: Processor Arrays and Vector Pipelines
• Examples: Connection Machine, MasPar MP-1 and MP-2, IBM 9000, Cray C90, Fujitsu VP, etc.
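
A hedged illustration (the function name brighten and the example values are made up, not from the slides) of the kind of problem SIMD machines suit: every pixel receives the same instruction, only the data differs, so an array of processing elements or a vector unit can execute it in lockstep.

    // brighten.cpp -- illustrative data-parallel kernel.
    #include <algorithm>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Same instruction stream for every element; only the data differs.
    void brighten(std::vector<std::uint8_t>& image, int offset) {
        for (auto& px : image)
            px = static_cast<std::uint8_t>(std::min(255, px + offset));
    }

    int main() {
        std::vector<std::uint8_t> image(640 * 480, 100);  // a flat gray "image"
        brighten(image, 40);
        std::printf("first pixel is now %d\n", int(image[0]));  // 140
        return 0;
    }
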
SIMD Machines (II)
[Figure]

SIMD Machines (III)
[Figure]

MISD Machines (I)

• A single data stream is fed into multiple processing units
• Each processing unit operates on the data independently via independent instruction streams
• Very few actual machines: CMU’s C.mmp computer (1971)
• Possible use: multiple frequency filters operating on a single signal stream

MISD Machines (II)
[Figure]

MIMD Machines (I)
• Multiple instruction: Every processor may execute a different instruction stream
• Multiple data: Every processor may work with a different data stream
• Execution can be synchronous or asynchronous, deterministic or nondeterministic
• Examples: most current supercomputers, grids, networked parallel computers, and multiprocessor SMP computers
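
A small MIMD sketch (not from the slides): each thread runs its own instruction stream on its own data, asynchronously, so the order of the two output lines can differ from run to run.

    // mimd_sketch.cpp -- illustrative sketch only.
    // Build: g++ -std=c++17 -pthread mimd_sketch.cpp -o mimd_sketch
    #include <algorithm>
    #include <cstdio>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main() {
        std::vector<int> a(1000), b(1000);
        std::iota(a.begin(), a.end(), 1);
        std::iota(b.begin(), b.end(), 1000);

        // Two different instruction streams on two different data streams.
        std::thread summer([&] {
            long sum = std::accumulate(a.begin(), a.end(), 0L);
            std::printf("sum = %ld\n", sum);
        });
        std::thread counter([&] {
            auto evens = std::count_if(b.begin(), b.end(),
                                       [](int v) { return v % 2 == 0; });
            std::printf("evens = %ld\n", static_cast<long>(evens));
        });

        // The two printed lines may appear in either order: execution is
        // asynchronous and nondeterministic, which is normal for MIMD.
        summer.join();
        counter.join();
        return 0;
    }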

MIMD Machines (II)
[Figure]

Parallel vs. Distributed
[Diagrams: left, several processors operating on instruction and data streams held in one shared memory; right, multiple machines, each with its own processor and memory, connected by a network for data transfer]

• Parallel: multiple CPUs within a shared-memory machine
• Distributed: multiple machines, each with its own memory, connected over a network
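
A hedged, POSIX-only sketch (not from the slides) contrasting the two models: threads in one process communicate through shared memory, while separate processes, like separate machines, must exchange an explicit message (a pipe stands in for the network here).

    // shared_vs_message.cpp -- illustrative; POSIX (Linux) only.
    // Build: g++ -std=c++17 -pthread shared_vs_message.cpp
    #include <cstdio>
    #include <thread>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main() {
        // Parallel / shared memory: the worker thread writes, main simply reads the same variable.
        int shared_value = 0;
        std::thread worker([&] { shared_value = 42; });
        worker.join();
        std::printf("shared memory: %d\n", shared_value);

        // Distributed / message passing: a separate process has its own memory,
        // so the result must be sent as a message.
        int fd[2];
        if (pipe(fd) != 0) return 1;
        pid_t child = fork();
        if (child == 0) {                     // child: the "remote" worker with its own memory
            close(fd[0]);
            int result = 42;
            write(fd[1], &result, sizeof result);
            close(fd[1]);
            _exit(0);
        }
        close(fd[1]);
        int received = 0;
        read(fd[0], &received, sizeof received);
        close(fd[0]);
        waitpid(child, nullptr, 0);
        std::printf("message passing: %d\n", received);
        return 0;
    }
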
Data Parallel Architectures
• Array of simple processors with memory
• Processors arranged in a regular topology
• Control processor issues instructions
• Specialized synchronization and communication
• Specialized reduction operations
• Array processing

Next Class
• Parallel Computer Architecture/Programming Models (continued)
– Memory
– Interconnection Networks



Home Task and Assignment

• Linux Installation
• Cisco Linux certification
– Linux I
– Linux II
• Linux commands and BASH Scripts (Assignment #1)
