04 - Lecture #4
Computing
LECTURE #4
Agenda
o Parallel Computing Platforms
• Von Neumann Architecture
• Flynn's Taxonomy
o Logical Organization
• Control
• Communication
o Physical Organization
von Neumann Architecture
❖ Named after the Hungarian mathematician John von Neumann, who first
authored the general requirements for an electronic computer in his 1945
paper.
❖ Since then, virtually all computers have followed this basic design, which
differed from earlier computers that were programmed through "hard wiring".
von Neumann Architecture
Parallel Computing Platform
Logical Organization
Models:
Flynn's Classical Taxonomy
❖ There are different ways to classify parallel computers. One of the more
widely used classifications, in use since 1966, is called Flynn's Taxonomy.
❖ Flynn's Taxonomy classifies architectures along two independent dimensions:
the instruction stream and the data stream.
❖ Each of these dimensions can have only one of two possible states: Single or
Multiple.
Flynn's Classical Taxonomy
The matrix below defines the 4 possible classifications according to Flynn:

                         Single Data    Multiple Data
Single Instruction       SISD           SIMD
Multiple Instruction     MISD           MIMD
Flynn's Taxonomy
❖ Single Instruction, Single Data (SISD):
❖ Single data: only one data stream is being used as input during any one
clock cycle.
❖ This is the oldest and, even today, the most common type of computer.
Flynn's Taxonomy
❖ Single Instruction, Multiple Data (SIMD):
❖ Processor Arrays: ILLIAC IV, DAP, Connection Machine CM-2, MasPar MP-1.
❖ Vector Pipelines: IBM 9000, Cray X-MP, Y-MP & C90, Fujitsu VP, NEC SX-2,
Hitachi S820, ETA10.
❖ Example:
for (i = 0; i < 1000; i++)
    c[i] = a[i] + b[i];
Your Turn!
What are the drawbacks of SIMD?
Flynn's Taxonomy
❖ Multiple Instruction, Multiple Data (MIMD): every processor may be
executing a different instruction stream and working with a different data
stream.
Your Turn
Compare SIMD and MIMD.
❖ Suppose you want to do a puzzle that has, say, a thousand pieces.
❖ How much time would it take you on your own?
❖ A friend comes to help!
❖ You'll both reach into the pile of pieces at the same time.
❖ Speedup?
❖ More help!
❖ Contention?
❖ Communication?
❖ Speedup?
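The speedup questions in this analogy can be made concrete with a toy cost model; the sketch below is a minimal C illustration (the work and overhead numbers are hypothetical, not measurements): splitting the work among p helpers divides the piece-placing time, but contention at the shared pile and worker-to-worker communication add overhead that grows with p.

```c
#include <assert.h>

/* Toy speedup model for the puzzle analogy (all numbers hypothetical):
   with p workers the piece-placing work divides by p, but contention and
   communication add overhead that grows with p, so the speedup stays
   below p and eventually declines as more helpers are added. */
double speedup(double work, double overhead_per_worker, int p) {
    double t1 = work;                                /* time working alone */
    double tp = work / p + overhead_per_worker * p;  /* time with p helpers */
    return t1 / tp;
}
```

With work = 1000 and overhead 1 per worker, two helpers give a speedup just under 2, while a thousand helpers spend almost all their time coordinating.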
❖ Now let's try something a little different.
❖ Decomposition?
❖ More?
❖ Easy?
❖ Load balance?
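Load balance in this decomposition can be sketched in C: a block partition that hands each worker an almost equal share of the n pieces, so no worker gets more than one piece over any other (the function name is made up for illustration):

```c
#include <assert.h>

/* Block decomposition with load balancing: n pieces are split among p
   workers; the first (n % p) workers take one extra piece, so shares
   differ by at most one.  Toy sketch of the puzzle analogy, not a
   library routine. */
int pieces_for_worker(int n, int p, int rank) {
    return n / p + (rank < n % p ? 1 : 0);
}
```

For example, 1000 pieces among 3 workers gives shares of 334, 333, and 333.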
Parallel Computing Platform
Logical Organization
Parallel Computing Platform
Logical Organization
❖ Platforms that provide a shared data space are called shared-address-space
machines or multiprocessors.
❖ Platforms that support messaging are called message passing platforms or
multi-computers.
Parallel Computing Platform
Logical Organization
❖ It is important to note the difference between the terms shared address
space and shared memory: a shared address space is a programming abstraction,
while shared memory refers to the physical machine organization.
Parallel Computing Platform
Logical Organization
1- Accessing Shared Data (cont.)
❖ Shared memory machines can be divided into two main classes based upon
memory access times:
➢ Uniform Memory Access (UMA), and
➢ Non-Uniform Memory Access (NUMA).
❖ NUMA machines are often made by physically linking two or more Symmetric
Multiprocessors (SMPs).
(a) Uniform-memory-access shared-address-space computer;
(b) uniform-memory-access shared-address-space computer with caches and memories;
(c) non-uniform-memory-access shared-address-space computer with local memory only.
Your Turn
Compare NUMA and UMA.
Parallel Computing Platform
Logical Organization
❖ Platforms that provide a shared data space are called shared-address-space
machines or multiprocessors.
❖ Platforms that support messaging are called message passing platforms or
multicomputers.
Parallel Computing Platform
Logical Organization
2- Exchanging Messages
❖ Distributed memory systems require a communication network to connect
inter-processor memory.
Parallel Computing Platform
Logical Organization
2- Exchanging Messages (cont.)
❖ These platforms are programmed using (variants of) send and receive
primitives, along with primitives to query a process's ID and the total
number of processes {GetID, NumProcs}.
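As a rough illustration of what send and receive primitives look like, here is a minimal single-slot mailbox in C. The names send_msg and recv_msg are hypothetical; in a real message passing library such as MPI the counterparts would be MPI_Send and MPI_Recv, with MPI_Comm_rank and MPI_Comm_size playing the GetID/NumProcs roles.

```c
#include <assert.h>

/* Hypothetical single-slot "mailbox" illustrating send/receive
   semantics.  This serial sketch returns -1 where a blocking
   implementation would wait instead. */
typedef struct {
    int full;    /* 1 if a message is waiting */
    int value;   /* the message payload */
} Mailbox;

int send_msg(Mailbox *m, int value) {
    if (m->full) return -1;   /* a blocking send would wait here */
    m->value = value;
    m->full = 1;
    return 0;
}

int recv_msg(Mailbox *m, int *out) {
    if (!m->full) return -1;  /* a blocking receive would wait here */
    *out = m->value;
    m->full = 0;
    return 0;
}
```

The blocking vs. non-blocking distinction on the next slide is exactly the choice between waiting at those two points or returning immediately.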
❖ Each processor P (with its own local cache C) is connected to exclusive
local memory, i.e. no other CPU has direct access to it.
❖ Non-blocking vs. blocking communication.
Message Passing Interface (MPI): a distributed-memory parallel programming library
❖ The same program runs on each processor/machine (SPMD, a very useful subset of MIMD).
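The SPMD style is easiest to see in MPI's canonical "hello world": every process runs the same program and differentiates itself only by its rank. This requires an MPI installation; compile with mpicc and launch with, e.g., mpiexec -n 4.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);                 /* start the MPI environment */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's ID */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    printf("Hello from process %d of %d\n", rank, size);
    MPI_Finalize();                         /* shut MPI down cleanly */
    return 0;
}
```

Launched with 4 processes, the same executable prints four lines, one per rank.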
Your Turn

Shared Memory

Advantages:
• Global address space provides a user-friendly programming perspective to memory.
• Data sharing between tasks is both fast and uniform due to the proximity of memory to CPUs.

Disadvantages:
• Lack of scalability between memory and CPUs: adding more CPUs increases traffic on the shared memory-CPU path.
• The programmer is responsible for synchronization.
• Expensive.

Distributed Memory

Advantages:
• Memory is scalable with the number of processors: increase the number of processors and the size of memory increases proportionally.
• Each processor can rapidly access its own memory without interference and without the overhead incurred in trying to maintain cache coherency.
• Cost effective.

Disadvantages:
• The programmer is responsible for many details associated with data communication between processors.
• It is difficult to map existing data structures, based on global memory, to this memory organization.
• Non-uniform (NUMA) memory access times.