
Parallel and Distributed Computing

Computer Science Department
Technical University of Cluj-Napoca

Fall 2011

Introduction

The course covers two interrelated branches: parallel computing and distributed computing.
Which aspects will be covered?
Parallel / distributed architectures
Parallel / distributed algorithms
Parallel / distributed programming
Parallel / distributed operating systems

Content
Parallel computing
Interconnection networks: static networks (metrics, topologies) and dynamic networks (buses, crossbars, multistage networks)
Performance and scalability: metrics, scalability definition, Amdahl's law
Parallel algorithms design
  Parallelization process, case study
  Data dependency
  Decomposition techniques (recursive, data, exploratory, speculative)
  Mapping techniques
    Static (based on data partitioning, task partitioning, hierarchical)
    Dynamic (centralized, distributed)
Dense matrix algorithms: matrix-vector multiplication (1D and 2D partitioning, comparison of 1D to 2D), matrix-matrix multiplication (2D partitioning, Cannon's algorithm)
Sorting algorithms

Content
Distributed computing
Time
  Physical clock synchronization (Cristian, Berkeley, Network Time Protocol)
  Logical clocks (scalar time, vector time, efficient implementation of vector clocks - Singhal-Kshemkalyani)
Distributed mutual exclusion: problem definition, token ring, Suzuki-Kasami, central coordinator, Lamport, Ricart-Agrawala
Causal order: problem definition, Birman-Schiper-Stephenson, Schiper-Eggli-Sandoz
Snapshot: problem definition, Chandy-Lamport, Spezialetti-Kearns, Lai-Yang
Leader election: problem definition
  General networks: FloodMax, OptFloodMax
  Synchronous / asynchronous ring: LeLann, Chang-Roberts, Hirschberg-Sinclair, Franklin, Peterson
  Anonymous ring: Itai-Rodeh
MapReduce - Hadoop

Goal
After completion of this course, the students
will get a good understanding of designing parallel algorithms
will get a good understanding of distributed algorithms

Administrative issues
Courses will be held on Thursday, 18:00-20:00, in room 365
Lab activities will be held on Monday, 08:00-16:00, in 36
Lecture notes will be made available on request
Bibliography will be made available on request
Sometimes it is useful to take notes in class!!!

Administrative issues
Grading policy
Lab 30%
Exam 70%
Optional assignment 10%
All work will be done individually, unless otherwise stated

Lab, Assignment
Parallel Virtual Machine; GRID
Assignment submission via email only; send to: Anca.Rarau@cs.utcluj.ro
Late submissions are not accepted

Literature
Parallel Computing
(Grama) Introduction to Parallel Computing, A. Grama, A. Gupta, G. Karypis, V. Kumar, 2003
(Culler) Parallel Computer Architecture: A Hardware / Software Approach, D.E. Culler, J.P. Singh, A. Gupta, Morgan Kaufmann Publishers, 1999
Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, B. Wilkinson, Prentice Hall, 2004

Distributed Computing
(Kshemkalyani) Distributed Computing: Principles, Algorithms, and Systems, A. D. Kshemkalyani, M. Singhal, Cambridge University Press, 2008
(Coulouris) Distributed Systems: Concepts and Design, G. Coulouris, J. Dollimore, T. Kindberg, Addison-Wesley, Third Edition, 2001
Distributed Algorithms, N.A. Lynch, Morgan Kaufmann Publishers, 1996

Introduction to Parallel Computing

Definition of parallel systems


Almasi and Gottlieb (1989)
collection of processing elements that communicate and cooperate to solve large problems fast

Parallel application categories

Applications in engineering and design
Scientific applications
Commercial applications
Applications in computer science

Parallel applications
weather forecast, 3D plasma modeling, ocean circulation, viscous fluid dynamics, superconductor modeling, vision, chemical dynamics, ...

Why parallel systems?

solve a problem faster
solve a bigger problem in a reasonable time
get a more accurate result in a reasonable time
than in the case of using a single processor

Levels & types of parallelism

Software parallelism - explicit parallelism (programmer gets involved):
  parallel platforms: PVM, MPI, OpenMP
  decomposition techniques
  mapping techniques
  compilers / interpreter
Hardware parallelism - implicit parallelism (programmer does not get involved), exploits the instruction level parallelism:
  pipelining execution
  superscalar execution
  VLIW processors

Parallelism comes in various ways.

Levels & types of parallelism


Critical components of parallel computing from the programmer's perspective
how to express parallel tasks?
how to specify the interaction between parallel tasks?

Levels & types of parallelism


how to express parallel tasks?
each program can be viewed as a parallel task
each instruction can be viewed as a parallel task

for (int i = 0; i < 1000; i++)
    c[i] = a[i] + b[i];
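For instance, each iteration of the loop above is independent, so the iterations can be executed as parallel tasks. A minimal sketch of making this explicit, assuming an OpenMP-capable compiler (the function name is illustrative):

#include <omp.h>

void vector_add(const int *a, const int *b, int *c) {
    /* each iteration is an independent task; OpenMP distributes them over threads */
    #pragma omp parallel for
    for (int i = 0; i < 1000; i++)
        c[i] = a[i] + b[i];
}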

Levels & types of parallelism


how to specify the interaction between parallel tasks?
access a shared data space (multiprocessors)
exchange messages

Levels & types of parallelism


how to specify the interaction between parallel tasks?
access a shared data space (multiprocessors)
  processors interact by modifying data stored in a shared-address space
  memory can be local or global
    UMA (uniform memory access): all processors have the same access time to any memory module (local or global)
    NUMA (non-uniform memory access)
  easy to program
    read-only interactions
    read/write operations need mutual exclusion for concurrent access (see the sketch below)
  caches: cache coherence mechanism
  put / get
  POSIX, OpenMP
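A minimal sketch of shared-address-space interaction, not taken from the slides: two POSIX threads modify the same variable, and a mutex provides the mutual exclusion required for concurrent read/write access.

#include <pthread.h>
#include <stdio.h>

static long counter = 0;                           /* data in the shared address space */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);                 /* mutual exclusion */
        counter++;                                 /* read-modify-write on shared data */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);            /* 200000, thanks to the mutex */
    return 0;
}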

Levels & types of parallelism


how to specify the interaction between parallel tasks?
exchange messages
processors interact (send data, work and synchronize) by message passing
each process has its exclusive address space
send / receive; the sender specifies the target address
mechanism to assign a unique identifier to each process (whoami, numprocs)
MPI, PVM (see the sketch below)
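A minimal sketch of message-passing interaction, assuming an MPI installation (the message content is illustrative):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* "whoami": unique process identifier */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* "numprocs": number of processes */

    if (rank == 0 && size > 1) {
        value = 42;
        /* send data to the process whose identifier is 1 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* each process has its own address space; data arrives only via receive */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("process %d of %d received %d\n", rank, size, value);
    }
    MPI_Finalize();
    return 0;
}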

Levels & types of parallelism


Level of parallelism
Program / process level parallelism
Thread level parallelism
Instruction level parallelism
Lab class (PVM, Grid)

Types of parallelism
Data parallelism
Control (functional) parallelism
(the two types are contrasted in the sketch below)
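A minimal sketch, not from the slides, contrasting the two types with OpenMP (the function names and data are illustrative):

#include <stdio.h>

#define N 1000

/* data parallelism: the same operation applied to different parts of the data */
static void scale(double *a) {
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] *= 2.0;
}

/* control (functional) parallelism: different tasks executed concurrently */
static void analyze(const double *a) {
    double sum = 0.0, max = a[0];
    #pragma omp parallel sections
    {
        #pragma omp section
        { for (int i = 0; i < N; i++) sum += a[i]; }                /* task 1: sum */
        #pragma omp section
        { for (int i = 1; i < N; i++) if (a[i] > max) max = a[i]; } /* task 2: max */
    }
    printf("sum = %f, max = %f\n", sum, max);
}

int main(void) {
    double a[N];
    for (int i = 0; i < N; i++) a[i] = i;
    scale(a);
    analyze(a);
    return 0;
}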

A message passing architecture with p nodes can be emulated on a shared-address-space architecture:
  partition the shared address space into p disjoint parts
  assign one part to each processor
  send / receive become writing / reading the other processor's partition
(a minimal sketch of this emulation follows below)

A shared address space is costly to emulate on a message passing architecture:
  accessing another node's memory requires sending and receiving messages
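A minimal sketch of the emulation, not from the slides: p threads stand in for the p nodes, each owning one "partition" (a mailbox) of the shared address space; send writes into the destination's partition, receive reads from one's own.

#include <pthread.h>
#include <stdio.h>

#define P 2                                  /* number of emulated nodes */

typedef struct {                             /* one partition of the shared space per node */
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    int             value;
    int             full;                    /* 1 while a message is waiting */
} mailbox_t;

static mailbox_t box[P];

static void msg_send(int dest, int value) {  /* "send" = write into dest's partition */
    pthread_mutex_lock(&box[dest].lock);
    while (box[dest].full)
        pthread_cond_wait(&box[dest].ready, &box[dest].lock);
    box[dest].value = value;
    box[dest].full = 1;
    pthread_cond_broadcast(&box[dest].ready);
    pthread_mutex_unlock(&box[dest].lock);
}

static int msg_recv(int me) {                /* "receive" = read from my own partition */
    pthread_mutex_lock(&box[me].lock);
    while (!box[me].full)
        pthread_cond_wait(&box[me].ready, &box[me].lock);
    int v = box[me].value;
    box[me].full = 0;
    pthread_cond_broadcast(&box[me].ready);
    pthread_mutex_unlock(&box[me].lock);
    return v;
}

static void *node(void *arg) {
    int me = *(int *)arg;
    if (me == 0)
        msg_send(1, 42);                     /* node 0 sends 42 to node 1 */
    else
        printf("node 1 received %d\n", msg_recv(1));
    return NULL;
}

int main(void) {
    pthread_t t[P];
    int id[P];
    for (int i = 0; i < P; i++) {
        pthread_mutex_init(&box[i].lock, NULL);
        pthread_cond_init(&box[i].ready, NULL);
        id[i] = i;
        pthread_create(&t[i], NULL, node, &id[i]);
    }
    for (int i = 0; i < P; i++)
        pthread_join(t[i], NULL);
    return 0;
}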

Levels & types of parallelism

(Summary figure: program / process level parallelism and thread level parallelism correspond to data parallelism and control parallelism, exploited as software parallelism - explicit (programmer gets involved): parallel platforms PVM, MPI, OpenMP, decomposition techniques, mapping techniques, compilers / interpreter; instruction level parallelism is exploited as hardware parallelism - implicit (programmer does not get involved): pipelining execution, superscalar execution, VLIW processors. Parallelism comes in various ways.)

Parallel architectures taxonomy

Flynn:
                        Single data    Multiple data
Single instruction      SISD           SIMD
Multiple instruction    MISD           MIMD

Culler:
shared memory multiprocessor
message passing architecture
data parallel architecture (another name for SIMD)
dataflow architecture
systolic architecture

Introduction to Distributed Computing

Definitions of distributed systems


Coulouris et al (2001)
one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages

Tanenbaum (1995)
a collection of independent computers that appear to the user of the system as a single computer

Sloman and Kramer (1987)


one in which several autonomous processors and data stores supporting processes and/or databases interact in order to cooperate to achieve an overall goal. The processes coordinate their activities and exchange information by means of information transfer over a communication network.

Distributed applications
banking systems
applications for conferences

Why distributed systems?


economy: share hardware resources and information
potential improvement in performance and reliability
improvement over the services provided by a single computer
access to a wider variety of resources (e.g. specialized processors, peripherals) that become accessible over a network

Distributed system
Architectural models
Client-server Peer-to-peer

Interaction models
Synchronous Asynchronous

Architectural models
Client-server
the client calls a service of the server (by sending a request message to the server)
the server does the work and sends the result back to the client
a server can act as a client for other servers
issue: centralization (point of failure, bottleneck)
(a minimal sketch of the request / reply pattern follows below)
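A minimal sketch, assuming POSIX sockets; the port number and messages are illustrative, the server handles a single request, and error handling is omitted for brevity.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

#define PORT 5000                                   /* illustrative port */

int main(void) {
    if (fork() == 0) {
        /* client: calls a service by sending a request message to the server */
        sleep(1);                                   /* give the server time to start */
        int s = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in srv = {0};
        srv.sin_family = AF_INET;
        srv.sin_port = htons(PORT);
        inet_pton(AF_INET, "127.0.0.1", &srv.sin_addr);
        connect(s, (struct sockaddr *)&srv, sizeof srv);
        const char *request = "service request";
        write(s, request, strlen(request));
        char reply[64] = {0};
        read(s, reply, sizeof reply - 1);           /* the result comes back */
        printf("client got: %s\n", reply);
        close(s);
        return 0;
    }
    /* server: does the work and sends the result back to the client */
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(PORT);
    bind(listener, (struct sockaddr *)&addr, sizeof addr);
    listen(listener, 1);
    int conn = accept(listener, NULL, NULL);
    char request[64] = {0};
    read(conn, request, sizeof request - 1);
    const char *reply = "service result";
    write(conn, reply, strlen(reply));
    close(conn);
    close(listener);
    wait(NULL);                                     /* reap the client process */
    return 0;
}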

(Figure: clients sending requests to servers and receiving replies)

Architectural models

Peer-to-peer
all processes are equal
every computer holds resources that are commonly shared
no bottleneck for processing and communication
issue: high complexity (finding resources)

(Figure: clients accessing a network of interconnected peers)

Interaction models
Synchronous
  lower and upper bounds on the execution time of processes
  messages are received within a known bounded time
  drift rates between local clocks have a known bound
  global physical time (with a certain precision)
  predictable behavior in terms of timing (suitable for hard real-time applications)
  timeouts can be used to detect failures (see the sketch below)
  difficult and costly to implement
Asynchronous
  no lower and upper bounds on the execution time of processes
  messages are not received within a known bounded time
  drift rates between local clocks do not have a known bound
  no global physical time (logical time is needed)
  unpredictable in terms of timing
  timeouts cannot be used for failure detection
  widespread in practice
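A minimal sketch, not from the slides, of why timeouts detect failures only under the synchronous assumptions: if every reply is guaranteed within a known bound (an assumed 2 seconds here), silence past that bound can safely be interpreted as a failure.

#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

int main(void) {
    int fds[2];
    pipe(fds);                       /* stands in for a channel to a remote process */

    /* nothing is ever written to fds[1]: the "remote process" has failed */

    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fds[0], &readable);
    struct timeval bound = {2, 0};   /* assumed synchronous bound: 2 seconds */

    /* wait for a reply, but no longer than the known bound */
    int ready = select(fds[0] + 1, &readable, NULL, NULL, &bound);
    if (ready == 0)
        printf("no reply within the bound: declare the remote process failed\n");
    else
        printf("reply received in time\n");
    return 0;
}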

Parallel vs. Distributed vs. Concurrent

Parallel vs. distributed


Goal
parallel: solve a problem faster, bigger, more accurate
distributed: resource sharing, improved reliability

Concurrent vs. parallel vs. distributed


Concurrent: processes are executed simultaneously
  on a single processor, or
  on a set of processors which are physically close to each other or physically far from one another

Processor type
parallel: homogeneous
distributed: heterogeneous

Geographical distribution
parallel: close to each other
distributed: far from each other

References
Based on:
Grama: chapter 1
Coulouris: chapters 1, 2
