Parallel Algorithm Design - Preliminaries (Material I, Module 3, 09-Jan-2020)
• Key stages
• Identifying portions of the work that can be performed concurrently.
• Mapping the concurrent pieces of work onto multiple processes running in parallel.
• Distributing the input, output, and intermediate data associated with the program.
• Managing accesses to data shared by multiple processors.
• Synchronizing the processors at various stages of the parallel program execution.
3.1. Preliminaries
• Dividing a computation into smaller computations and assigning them to different
processors for parallel execution are the two key steps in the design of parallel algorithms.
• The process of dividing a computation into smaller parts, some or all of which may
potentially be executed in parallel, is called decomposition.
• Tasks are programmer-defined units of computation into which the main computation is
subdivided by means of decomposition.
• Tasks can be of arbitrary size, but once defined, they are regarded as indivisible units of
computation.
Example: Dense matrix - vector multiplication
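The decomposition in this example can be sketched in code. In a row-wise decomposition of y = Ax, each task computes one element of the output vector, so an n x n matrix yields n independent tasks (the function and variable names below are illustrative, not from the original example):

```python
# Sketch: row-wise decomposition of y = A*x into one task per output row.
# Each task is an independent unit of computation, so all n tasks can
# potentially run in parallel.

def make_tasks(A, x):
    """Return one zero-argument task per output row of A*x."""
    def row_task(i):
        return sum(A[i][j] * x[j] for j in range(len(x)))
    # bind i at definition time so each task computes its own row
    return [lambda i=i: row_task(i) for i in range(len(A))]

A = [[1, 2], [3, 4]]
x = [1, 1]
tasks = make_tasks(A, x)
y = [t() for t in tasks]   # tasks are independent: any execution order works
print(y)                   # [3, 7]
```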
• An abstraction used to express such dependencies among tasks and their relative order of
execution is known as a task dependency graph.
• A task-dependency graph is a directed acyclic graph in which the nodes represent tasks and
the directed edges indicate the dependencies amongst them.
• The task corresponding to a node can be executed when all tasks connected to this node by
incoming edges have completed.
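The execution rule above can be sketched directly: a task becomes runnable once all of its predecessors (incoming edges) have completed. The graph below is a made-up diamond-shaped example, not one from the lecture figures:

```python
from collections import deque

# Sketch: executing a task-dependency graph. Edges point from a task
# to the tasks that consume its output; a task is ready once the
# count of its unfinished predecessors drops to zero.

def run_in_dependency_order(successors, num_tasks):
    indegree = [0] * num_tasks
    for u in successors:
        for v in successors[u]:
            indegree[v] += 1
    ready = deque(t for t in range(num_tasks) if indegree[t] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)                      # "execute" task u
        for v in successors.get(u, ()):
            indegree[v] -= 1
            if indegree[v] == 0:             # all dependencies satisfied
                ready.append(v)
    return order

# Diamond graph: task 0 feeds tasks 1 and 2, which both feed task 3.
order = run_in_dependency_order({0: [1, 2], 1: [3], 2: [3]}, 4)
print(order)  # [0, 1, 2, 3]
```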
• The number and size of the tasks into which a problem is decomposed determine the
granularity of the decomposition.
[Figure: the portions of the matrix and of the input and output vectors accessed by Task 1 are highlighted.]
3.1.2 Granularity, Concurrency, and Task-Interaction
• The maximum number of tasks that can be executed simultaneously in a parallel program at
any given time is known as its maximum degree of concurrency
• In most cases, the maximum degree of concurrency is less than the total number of tasks due
to dependencies among the tasks
• For example, the maximum degree of concurrency of the task-dependency graphs in the figures above is four
• In general, for task dependency graphs that are trees, the maximum degree of concurrency is
always equal to the number of leaves in the tree.
• Average degree of concurrency - average number of tasks that can run concurrently over the
entire duration of execution of the program.
• Both the maximum and the average degrees of concurrency usually increase as the granularity
of tasks becomes smaller (finer)
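The tree case mentioned above is easy to check in code: in a tree-shaped task-dependency graph, the leaves are exactly the tasks with no predecessors, and all of them can run at once. A minimal sketch, assuming the tree is given as child-to-parent edges (each task's output feeds exactly one consumer; the example tree is made up):

```python
# Sketch: maximum degree of concurrency of a tree-shaped task graph
# equals its number of leaves.

def max_concurrency_of_tree(edges, num_tasks):
    # edges: child -> parent; the output of each child task feeds its parent
    consumers = set(edges.values())
    # a leaf is a task that no other task feeds, so it can start immediately
    return sum(1 for t in range(num_tasks) if t not in consumers)

# Complete binary tree of 7 tasks: 0 is the root, tasks 3..6 are leaves.
print(max_concurrency_of_tree({1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}, 7))  # 4
```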
Degree of concurrency
• A feature of a task-dependency graph that determines the average degree of concurrency for a
given granularity is its critical path
• The longest directed path between any pair of start and finish nodes is known as the critical
path.
• The sum of the weights of nodes along this path is known as the critical path length, where the
weight of a node is the size or the amount of work associated with the corresponding task.
• The ratio of the total amount of work to the critical-path length is the average degree of
concurrency.
• Therefore, a shorter critical path favors a higher degree of concurrency.
• The degree of concurrency also depends on the shape of the task-dependency graph and the
same granularity, in general, does not guarantee the same degree of concurrency.
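The definitions above can be combined into a short computation: the critical-path length is the largest node-weight sum over any start-to-finish path, and the average degree of concurrency is total work divided by that length. A sketch with a made-up weighted diamond graph (all names and numbers are illustrative):

```python
# Sketch: critical-path length of a weighted task-dependency graph and
# the resulting average degree of concurrency = total work / critical path.

def critical_path_length(predecessors, weight):
    memo = {}
    def longest_to(t):
        # longest weighted path from any start node that ends at task t
        if t not in memo:
            preds = predecessors.get(t, [])
            memo[t] = weight[t] + max((longest_to(p) for p in preds), default=0)
        return memo[t]
    return max(longest_to(t) for t in weight)

weight = {0: 10, 1: 10, 2: 10, 3: 10}       # 40 units of total work
predecessors = {1: [0], 2: [0], 3: [1, 2]}  # diamond: 0 -> {1,2} -> 3
cp = critical_path_length(predecessors, weight)
avg = sum(weight.values()) / cp
print(cp, round(avg, 2))  # 30 1.33
```

Shortening the critical path (e.g. lighter tasks along it) raises the average degree of concurrency, matching the bullet above.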
Exercise: find the average degree of concurrency of each of the two task-dependency graphs.
• Another performance limiting factor is the interaction among tasks running on different
physical processors
• The dependencies in a task-dependency graph usually result from the fact that the output of
one task is the input for another.
• For example, in the database query example, tasks share intermediate data; the table generated
by one task is often used by another task as input.
• Another example: matrix-vector multiplication when only one copy of the input vector is available, so all tasks must interact to access that shared copy
• A process is an abstract entity that uses the code and data corresponding to a task to produce the
output of that task within a finite amount of time
• In addition to performing computations, a process may synchronize or communicate with
other processes
• The mechanism by which tasks are assigned to processes for execution is called mapping.
• The task-dependency and task-interaction graphs play an important role in the selection of a
good mapping for a parallel algorithm
• A good mapping should seek to:
• maximize the use of concurrency by mapping independent tasks onto different processes,
• minimize the total completion time, and
• minimize interaction among processes
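One simple mapping that serves the first goal can be sketched as follows: if the tasks are mutually independent and of roughly equal size, distributing them round-robin places them on different processes and balances the load (the function name and task ids below are illustrative, not part of the query-processing example):

```python
# Sketch: round-robin mapping of independent, equal-sized tasks onto
# processes. With no dependencies or interactions, this exposes all
# available concurrency and balances load.

def round_robin_mapping(tasks, num_processes):
    mapping = {p: [] for p in range(num_processes)}
    for i, t in enumerate(tasks):
        mapping[i % num_processes].append(t)
    return mapping

mapping = round_robin_mapping(list(range(8)), 4)
print(mapping)  # {0: [0, 4], 1: [1, 5], 2: [2, 6], 3: [3, 7]}
```

For dependent or interacting tasks, a good mapping must also weigh the task-dependency and task-interaction graphs, as noted above.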
In terms of overheads, the most efficient decomposition-mapping combination is a single task mapped onto a single process: it wastes no time idling or interacting, but it also achieves no speedup.
Mappings of the task graphs onto four processes (for query processing example)