
Module 3: Parallel Algorithm and Design

Preliminaries – Decomposition Techniques – Characteristics of Tasks and Interactions – Mapping Techniques for Load Balancing – Parallel Algorithm Models.

Reference

Grama, A. (2013). Introduction to parallel computing. Harlow, England: Addison-Wesley.

Introduction to Parallel Algorithms

• Sequential and parallel algorithms

• Key stages
• Identifying portions of the work that can be performed concurrently.
• Mapping the concurrent pieces of work onto multiple processes running in parallel.
• Distributing the input, output, and intermediate data associated with the program.
• Managing accesses to data shared by multiple processors.
• Synchronizing the processors at various stages of the parallel program execution.
3.1. Preliminaries
• Dividing a computation into smaller computations and assigning them to different
processors for parallel execution are the two key steps in the design of parallel algorithms.

3.1.1 Decomposition, Tasks, and Dependency Graphs

• The process of dividing a computation into smaller parts, some or all of which may
potentially be executed in parallel, is called decomposition.
• Tasks are programmer-defined units of computation into which the main computation is
subdivided by means of decomposition.
• Tasks can be of arbitrary size, but once defined, they are regarded as indivisible units of
computation.
Example: Dense matrix - vector multiplication

Figure: Decomposition of dense matrix-vector multiplication into n tasks, where n is the number of rows in the matrix. The portions of the matrix and the input and output vectors accessed by Task 1 are highlighted.
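
The row-wise decomposition in the figure can be sketched in Python as follows. This is a minimal illustration, not the textbook's code: the names `row_task` and `matvec_by_rows` and the 3 x 3 sample matrix are assumptions made for the example. Each task computes one element of the output vector from one row of the matrix and the whole input vector, and the tasks do not depend on each other.

```python
from concurrent.futures import ThreadPoolExecutor

def row_task(row, b):
    """Task i: dot product of one matrix row with the whole vector b."""
    return sum(a_ij * b_j for a_ij, b_j in zip(row, b))

def matvec_by_rows(A, b):
    """Decompose y = A b into len(A) independent tasks, one per row."""
    with ThreadPoolExecutor() as pool:
        # pool.map preserves input order, so y[i] is produced by task i.
        return list(pool.map(lambda row: row_task(row, b), A))

# Hypothetical sample data for illustration only.
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
b = [1, 0, 2]
y = matvec_by_rows(A, b)
print(y)
```

The thread pool is used here only to make the independence of the tasks visible; in CPython, threads are unlikely to speed up pure-Python arithmetic like this.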
Task Dependency Graph

• An abstraction used to express such dependencies among tasks and their relative order of
execution is known as a task dependency graph.

• A task-dependency graph is a directed acyclic graph in which the nodes represent tasks and
the directed edges indicate the dependencies amongst them.

• The task corresponding to a node can be executed when all tasks connected to this node by
incoming edges have completed.

Example: Database query processing

A database storing information about used vehicles


Consider the computations performed in processing the following query:

MODEL="Civic" AND YEAR="2001" AND (COLOR="Green" OR COLOR="White")

The different tables and their dependencies in a query processing operation


An alternate data-dependency graph for the query processing operation
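
One way to read the query as a task-dependency graph is sketched below. It is an illustration under stated assumptions: the five records are made up, the task names (`civic`, `y2001`, and so on) are hypothetical, and the graph shape is just one possible decomposition of the query. Each task produces a table (here, a set of record ids), and a task runs only after all tasks on its incoming edges have completed. The standard-library `graphlib.TopologicalSorter` (Python 3.9+) reports which tasks are ready at each step.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical records: (id, model, year, color). Not from the source.
records = [
    (1, "Civic", "2001", "Green"),
    (2, "Civic", "2001", "Blue"),
    (3, "Corolla", "2001", "White"),
    (4, "Civic", "2000", "White"),
    (5, "Civic", "2001", "White"),
]

def select(field, value):
    """Return the ids of records whose given field equals value."""
    return {r[0] for r in records if r[field] == value}

# Each task takes the tables computed so far and returns a new table.
tasks = {
    "civic": lambda t: select(1, "Civic"),
    "y2001": lambda t: select(2, "2001"),
    "green": lambda t: select(3, "Green"),
    "white": lambda t: select(3, "White"),
    "civic_2001": lambda t: t["civic"] & t["y2001"],
    "green_or_white": lambda t: t["green"] | t["white"],
    "result": lambda t: t["civic_2001"] & t["green_or_white"],
}

# Task-dependency graph: task -> set of tasks it depends on.
deps = {
    "civic": set(), "y2001": set(), "green": set(), "white": set(),
    "civic_2001": {"civic", "y2001"},
    "green_or_white": {"green", "white"},
    "result": {"civic_2001", "green_or_white"},
}

sorter = TopologicalSorter(deps)
sorter.prepare()
tables, waves = {}, []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # tasks whose inputs are all complete
    waves.append(ready)                 # these could run concurrently
    for name in ready:
        tables[name] = tasks[name](tables)
    sorter.done(*ready)

print(waves)
print(tables["result"])
```

The tasks in each entry of `waves` have no dependencies on one another, so the length of the largest entry is the number of tasks that can run at the same time in this decomposition.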


3.1.2 Granularity, Concurrency, and Task-Interaction

• The number and size of tasks into which a problem is decomposed determine the
granularity of the decomposition.

• Fine-grained decomposition – a large number of small tasks

• Coarse-grained decomposition – a small number of large tasks

Decomposition of dense matrix-vector multiplication into four tasks.

The portions of the matrix and the input and output vectors accessed by Task 1 are highlighted
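
Granularity can be made an explicit parameter of the decomposition. The sketch below is an assumption-laden illustration (the helper names `block_rows` and `matvec_blocked` are hypothetical): it splits the rows of the matrix into `num_tasks` contiguous blocks, so `num_tasks = n` gives the fine-grained decomposition with one row per task, and `num_tasks = 4` gives a coarse-grained decomposition like the one in the figure.

```python
def block_rows(n, num_tasks):
    """Split row indices 0..n-1 into num_tasks contiguous, near-equal blocks."""
    base, extra = divmod(n, num_tasks)
    blocks, start = [], 0
    for t in range(num_tasks):
        size = base + (1 if t < extra else 0)  # first `extra` blocks get one more row
        blocks.append(range(start, start + size))
        start += size
    return blocks

def matvec_blocked(A, b, num_tasks):
    """Compute y = A b, where each block of rows is treated as one task."""
    y = [0] * len(A)
    for rows in block_rows(len(A), num_tasks):  # one loop iteration = one task
        for i in rows:
            y[i] = sum(a * x for a, x in zip(A[i], b))
    return y

print([list(r) for r in block_rows(8, 4)])
```

The tasks are executed one after another here; the point is only how the choice of `num_tasks` changes the number and size of tasks, not how they are run.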

Degree of concurrency

• The maximum number of tasks that can be executed simultaneously in a parallel program at
any given time is known as its maximum degree of concurrency
• In most cases, the maximum degree of concurrency is less than the total number of tasks due
to dependencies among the tasks
• For example, the maximum degree of concurrency in the task graphs in the figures above is four.
• In general, for task dependency graphs that are trees, the maximum degree of concurrency is
always equal to the number of leaves in the tree.
• Average degree of concurrency - average number of tasks that can run concurrently over the
entire duration of execution of the program.
• Both the maximum and the average degrees of concurrency usually increase as the granularity
of tasks becomes smaller (finer)

• A feature of a task-dependency graph that determines the average degree of concurrency for a
given granularity is its critical path

• The longest directed path between any pair of start and finish nodes is known as the critical
path.
• The sum of the weights of nodes along this path is known as the critical path length, where the
weight of a node is the size or the amount of work associated with the corresponding task.
• The ratio of the total amount of work to the critical-path length is the average degree of
concurrency.
• Therefore, a shorter critical path favors a higher degree of concurrency.
• The degree of concurrency also depends on the shape of the task-dependency graph; in
general, the same granularity does not guarantee the same degree of concurrency.
Find the average degree of concurrency of the two task-dependency graphs.

                                 First graph    Second graph
Critical path length             27             34
Total amount of work required    63             64
Average degree of concurrency    2.33           1.88
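
The calculation can be checked with a short script. The graph below is an assumption: the slide's figure is not reproduced in the text, so the node weights and edges are chosen for illustration so that they give the totals quoted for the first graph (total work 63, critical path length 27). The critical path length is the largest sum of node weights along any directed path, and the average degree of concurrency is total work divided by that length.

```python
from functools import lru_cache

# Assumed task graph (illustrative): node -> weight, node -> successors.
weights = {1: 10, 2: 10, 3: 10, 4: 10, 5: 6, 6: 9, 7: 8}
succ = {1: [5], 2: [5], 3: [6], 4: [6], 5: [7], 6: [7], 7: []}

@lru_cache(maxsize=None)
def longest_from(node):
    """Weight of the heaviest directed path that starts at `node`."""
    return weights[node] + max((longest_from(s) for s in succ[node]), default=0)

critical_path_length = max(longest_from(n) for n in weights)
total_work = sum(weights.values())
avg_concurrency = total_work / critical_path_length

print(critical_path_length, total_work, round(avg_concurrency, 2))
```

The same functions apply to any directed acyclic task graph: replace `weights` and `succ` with the graph of interest.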
Task Interaction

• Another performance limiting factor is the interaction among tasks running on different
physical processors
• The dependencies in a task-dependency graph usually result from the fact that the output of
one task is the input for another.
• For example, in the database query example, tasks share intermediate data; the table generated
by one task is often used by another task as input.
• Another example: in matrix-vector multiplication, if only one copy of the vector is available,
the tasks must interact to access it.

• The pattern of interaction among tasks is captured by what is known as a task-interaction
graph.
• The nodes in a task-interaction graph represent tasks and the edges connect tasks that interact
with each other.
Can you understand this task interaction graph?
Multiplying a sparse matrix A with a vector b.

• The computation of each element of the result vector is a task.
• Only non-zero elements of matrix A participate in the computation.
• If b is partitioned across the tasks, the task-interaction graph of the
computation is identical to the graph of the matrix A.
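
The link between the sparsity pattern of A and the task-interaction graph can be sketched as follows. The pattern below is made up for illustration, and the sketch assumes that task i computes element i of the result and owns element i of b. Task i then needs b[j] from task j whenever A[i][j] is non-zero, which gives an edge between tasks i and j.

```python
# Hypothetical sparsity pattern: row i -> columns that hold non-zero entries.
nonzero_cols = {
    0: [0, 1, 3],
    1: [0, 1, 2],
    2: [1, 2],
    3: [0, 3],
}

def interaction_graph(nonzero_cols):
    """Return the undirected edges (i, j), i < j, of the task-interaction graph."""
    edges = set()
    for i, cols in nonzero_cols.items():
        for j in cols:
            if i != j:  # a task does not interact with itself
                edges.add((min(i, j), max(i, j)))
    return sorted(edges)

edges = interaction_graph(nonzero_cols)
print(edges)
```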
Processes and Mapping

• A process is an abstract entity that uses the code and data corresponding to a task to produce the
output of that task within a finite amount of time.
• In addition to performing computations, a process may synchronize or communicate with
other processes
• The mechanism by which tasks are assigned to processes for execution is called mapping.
• The task-dependency and task-interaction graphs play an important role in the selection of a
good mapping for a parallel algorithm
• A good mapping should seek to:
• maximize the use of concurrency by mapping independent tasks onto different processes,
• minimize the total completion time, and
• minimize interaction among processes.

The most efficient decomposition-mapping combination is a single task mapped onto a single
process. It wastes no time in idling or interacting, but it achieves no speedup.
Mappings of the task graphs onto four processes (for query processing example)
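
A simple mapping heuristic can be sketched in code. This is an illustrative assumption, not the mapping shown in the figure: tasks are grouped by level in the dependency graph, and tasks on the same level, which are independent of one another, are placed on different processes in round-robin order. The graph `deps` and the function names are hypothetical.

```python
def levels(deps):
    """Level of a task: 0 for tasks with no inputs, else 1 + max level of its inputs."""
    level = {}

    def visit(t):
        if t not in level:
            level[t] = 1 + max((visit(p) for p in deps[t]), default=-1)
        return level[t]

    for t in deps:
        visit(t)
    return level

def map_by_level(deps, num_procs):
    """Assign tasks of the same level to different processes, round-robin."""
    level = levels(deps)
    mapping, next_slot = {}, {}
    for t in sorted(deps, key=lambda t: (level[t], t)):
        slot = next_slot.get(level[t], 0)
        mapping[t] = slot % num_procs
        next_slot[level[t]] = slot + 1
    return mapping

# Assumed 7-task graph: task -> list of tasks it depends on.
deps = {1: [], 2: [], 3: [], 4: [], 5: [1, 2], 6: [3, 4], 7: [5, 6]}
mapping = map_by_level(deps, 4)
print(mapping)
```

This heuristic only exploits concurrency. It places task 6 on a process that holds neither of its inputs, so it does not minimize interaction; a better mapping would also keep dependent tasks on the same process where possible.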
