
Profiling – III and Revision

Lecture 25
April 17, 2024
Profiling

for (int i=0; i<50; i++)
{
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Alltoall(message, arrSize, MPI_INT, recvMessage, arrSize, MPI_INT,
                 MPI_COMM_WORLD);
}
Alltoall – NPROCS=4, Data size = 4 KB: max. barrier time 0.2 ms
Alltoall – NPROCS=4, Data size = 4 MB: max. barrier time 1.2 ms, max. Alltoall time 7.8 ms
Alltoall – NPROCS=16, Data size = 4 KB and 4 MB [timing plots]
Bcast – 16P, 4 KB and 4 MB [timing plots]
Bcast – 32P, 4 KB and 4 MB [timing plots]
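
A minimal sketch (not the lecture's actual measurement harness) of how such numbers could be collected: each call is bracketed with MPI_Wtime() and the maximum over all ranks is reduced to rank 0. The buffer sizes and arrSize are assumptions, and whether the reported times are per call or accumulated over the 50 iterations is not stated on the slide; this sketch accumulates them.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int arrSize = 1024;   /* ints sent to each destination (assumed) */
    int *message     = calloc((size_t)arrSize * nprocs, sizeof(int));
    int *recvMessage = calloc((size_t)arrSize * nprocs, sizeof(int));

    double tBarrier = 0.0, tAlltoall = 0.0;
    for (int i = 0; i < 50; i++) {
        double t0 = MPI_Wtime();
        MPI_Barrier(MPI_COMM_WORLD);
        double t1 = MPI_Wtime();
        MPI_Alltoall(message, arrSize, MPI_INT,
                     recvMessage, arrSize, MPI_INT, MPI_COMM_WORLD);
        double t2 = MPI_Wtime();
        tBarrier  += t1 - t0;
        tAlltoall += t2 - t1;
    }

    /* Maximum accumulated times over all ranks, reported by rank 0 */
    double maxBarrier, maxAlltoall;
    MPI_Reduce(&tBarrier,  &maxBarrier,  1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    MPI_Reduce(&tAlltoall, &maxAlltoall, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("Max barrier time: %.3f ms, max Alltoall time: %.3f ms\n",
               maxBarrier * 1e3, maxAlltoall * 1e3);

    free(message);
    free(recvMessage);
    MPI_Finalize();
    return 0;
}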
Revision
Communication Graph Mapping
[Figure: communication graph over tasks 0–5; edge weights (bytes exchanged) include 512, 256, and 64]
Linear mapping
[Figure: tasks 0–5 of the communication graph mapped linearly onto a 2×3 processor mesh, task i on processor i]
Q1: What are the communicating pairs?
Q2: Distance/hops between the communicating pairs?
Q3: Total hop-bytes?
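
Q3 asks for hop-bytes: the sum, over communicating task pairs, of the bytes exchanged times the number of network hops between the processors the two tasks are mapped to. The sketch below is illustrative only; its edge list is hypothetical (the exact edges of the figure are not reproduced here), and the linear mapping places task i on processor i of the 2×3 mesh.

/* Illustrative sketch: hop-bytes of a linear mapping on a 2x3 mesh.
   The edge list below is hypothetical, NOT the graph from the slide. */
#include <stdio.h>
#include <stdlib.h>

#define MESH_COLS 3   /* 2 x 3 processor mesh, linear mapping task t -> proc t */

typedef struct { int u, v, bytes; } Edge;

static int hops(int p, int q)   /* Manhattan distance on the mesh */
{
    int pr = p / MESH_COLS, pc = p % MESH_COLS;
    int qr = q / MESH_COLS, qc = q % MESH_COLS;
    return abs(pr - qr) + abs(pc - qc);
}

int main(void)
{
    /* Hypothetical communicating task pairs and bytes exchanged */
    Edge edges[] = { {0, 1, 512}, {1, 2, 256}, {0, 3, 64}, {2, 5, 512} };
    int nedges = sizeof(edges) / sizeof(edges[0]);

    long hopBytes = 0;
    for (int i = 0; i < nedges; i++) {
        int pu = edges[i].u, pv = edges[i].v;   /* linear mapping: proc = task */
        hopBytes += (long)edges[i].bytes * hops(pu, pv);
    }
    printf("Total hop-bytes = %ld\n", hopBytes);
    return 0;
}

With this definition, a good mapping reduces total hop-bytes by placing heavily communicating tasks on nearby processors.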
Estimation Function
• f_est(t, p, M) = cost of placing a task t onto processor p under the current task mapping M
• Estimate how critical it is to place a task in the current cycle; select the task with maximum criticality
• T_k is the set of tasks yet to be placed, and T̄_k the set of tasks already placed: T_k ∩ T̄_k = ∅
• P_k is the set of processors that are available, and P̄_k the set of processors already assigned: P_k ∩ P̄_k = ∅
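
As a rough illustration of how such an estimation function can drive a greedy mapping, the sketch below picks, in each cycle, the unplaced task whose best available placement is judged most critical and assigns it there. Both f_est() and the criticality heuristic here are placeholders, not the lecture's actual cost model.

/* Rough sketch of a greedy placement loop driven by an estimation function.
   f_est() is a trivial stand-in; a real model would use the communication
   graph and the partial mapping M. */
#include <float.h>

#define NT 6   /* number of tasks (illustrative)      */
#define NP 6   /* number of processors (illustrative) */

/* Placeholder cost of placing task t on processor p given partial mapping M
   (mapping[t] == -1 means t is still unplaced, i.e. t is in T_k). */
static double f_est(int t, int p, const int mapping[NT])
{
    (void)mapping;
    return (double)(t + p);
}

void greedy_map(int mapping[NT])
{
    int procUsed[NP] = {0};                        /* complement of P_k */
    for (int t = 0; t < NT; t++) mapping[t] = -1;  /* T_k: all tasks unplaced */

    for (int cycle = 0; cycle < NT; cycle++) {
        int bestTask = -1, bestProc = -1;
        double bestCrit = -DBL_MAX;

        for (int t = 0; t < NT; t++) {
            if (mapping[t] != -1) continue;        /* task already placed */
            double minCost = DBL_MAX;
            int minProc = -1;
            for (int p = 0; p < NP; p++) {
                if (procUsed[p]) continue;         /* processor not available */
                double c = f_est(t, p, mapping);
                if (c < minCost) { minCost = c; minProc = p; }
            }
            /* Criticality heuristic (illustrative): cost of the task's best option */
            if (minCost > bestCrit) {
                bestCrit = minCost; bestTask = t; bestProc = minProc;
            }
        }
        mapping[bestTask] = bestProc;              /* move task to the placed set */
        procUsed[bestProc] = 1;
    }
}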
Graph Model for SpMV

• Computation load? Similar for both processors
• Number of communications? 8 as per this graph; actually 6 (an edge-cut count can overstate communication, since a vertex needed by the same remote process through several edges is sent only once)
Parallel I/O
[Figure: access patterns — in the interleaved pattern, processes P0 and P1 read alternating chunks of a common file]

Each process reads a small chunk of data from a common file

MPI_File_set_view (fh, displacement, etype, filetype, "native", info)
MPI_File_read_all (fh, data, datacount, MPI_INT, status)
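
A minimal end-to-end sketch of this interleaved pattern, assuming blocks of BLOCK ints are dealt round-robin over the processes; the file name, block size, and block count are illustrative only. Each process sets a strided file view with MPI_File_set_view and then reads collectively.

#include <mpi.h>
#include <stdlib.h>

#define BLOCK   1024   /* ints per block (assumed)   */
#define NBLOCKS 4      /* blocks per process (assumed) */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "common.dat",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

    /* Filetype: one block of ints taken out of every nprocs blocks */
    MPI_Datatype filetype;
    MPI_Type_vector(NBLOCKS, BLOCK, BLOCK * nprocs, MPI_INT, &filetype);
    MPI_Type_commit(&filetype);

    /* Displacement (bytes) skips the blocks of lower-ranked processes */
    MPI_Offset disp = (MPI_Offset)rank * BLOCK * sizeof(int);
    MPI_File_set_view(fh, disp, MPI_INT, filetype, "native", MPI_INFO_NULL);

    int *data = malloc((size_t)NBLOCKS * BLOCK * sizeof(int));
    MPI_Status status;
    MPI_File_read_all(fh, data, NBLOCKS * BLOCK, MPI_INT, &status);

    MPI_File_close(&fh);
    MPI_Type_free(&filetype);
    free(data);
    MPI_Finalize();
    return 0;
}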
Multiple Non-contiguous Accesses

[Figure: a 2D array distributed block-wise over a 2×2 process grid P0–P3]
• Every process’ local array is non-contiguous in the file
• Every process needs to make small I/O requests
• Can these requests be merged?
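
These requests can be merged by collective I/O: each process describes its whole block with a subarray filetype and calls MPI_File_read_all, so the MPI-IO layer is free to aggregate the many small requests. The sketch below assumes a square process grid, N divisible by the grid dimension, integer elements, and an illustrative file name.

#include <mpi.h>
#include <math.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, nprocs, N = 1024;   /* global array dimension (assumed) */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int pdim  = (int)sqrt((double)nprocs);  /* assume a pdim x pdim grid      */
    int local = N / pdim;                   /* assume N divisible by pdim     */
    int gsizes[2] = {N, N};
    int lsizes[2] = {local, local};
    int starts[2] = {(rank / pdim) * local, (rank % pdim) * local};

    /* Describe this process' 2D block of the file as one datatype */
    MPI_Datatype filetype;
    MPI_Type_create_subarray(2, gsizes, lsizes, starts,
                             MPI_ORDER_C, MPI_INT, &filetype);
    MPI_Type_commit(&filetype);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "array2d.dat",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_INT, filetype, "native", MPI_INFO_NULL);

    int *block = malloc((size_t)local * local * sizeof(int));
    MPI_File_read_all(fh, block, local * local, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Type_free(&filetype);
    free(block);
    MPI_Finalize();
    return 0;
}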
Revision Q3: 3D domain decomposition
17 //initialize
18 for (int i=0; i<N; i++)
19 for (int j=0; j<N; j++)
20 for (int k=0; k<N; k++)
21 data[i][j][k] = (rank+1) * (i+j+k);

22 int xStart=_____________________________,
yStart=____________________________,
zStart=___________________________;

23 int xEnd=_______________________________,
yEnd=______________________________,
zEnd=_____________________________;
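
One possible way the blanks above could be filled, offered only as a sketch since the process-grid setup is not shown in the question: assume the P processes form a px × py × pz grid (obtained here with MPI_Dims_create), that N is divisible by each dimension, that ranks map to grid coordinates with x varying fastest, and that the Start/End bounds are inclusive.

#include <mpi.h>

#define N 64   /* problem size (assumed for illustration) */

int main(int argc, char **argv)
{
    int rank, P;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &P);

    /* Let MPI pick a 3D factorization of P into px x py x pz */
    int dims[3] = {0, 0, 0};
    MPI_Dims_create(P, 3, dims);
    int px = dims[0], py = dims[1], pz = dims[2];

    /* Rank-to-coordinate convention assumed: x varies fastest */
    int cx = rank % px;
    int cy = (rank / px) % py;
    int cz = rank / (px * py);

    int xStart = cx * (N / px),
        yStart = cy * (N / py),
        zStart = cz * (N / pz);

    int xEnd = xStart + N / px - 1,
        yEnd = yStart + N / py - 1,
        zEnd = zStart + N / pz - 1;

    (void)xEnd; (void)yEnd; (void)zEnd;   /* the bounds would drive the local loops */

    MPI_Finalize();
    return 0;
}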
Revision Q4

A 3D matrix of size NxNxN was written to the file in the usual XYZ
memory order. P processes read this 3D matrix from a file using parallel
I/O following a 1D domain decomposition along Y-axis. Write an MPI
code snippet for this (you may ignore the obvious initializations and
finalizations). Assume that N is divisible by P.
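
A possible solution sketch (not an official answer). It assumes "XYZ memory order" means index order (x, y, z) with z varying fastest (C order), integer elements, and an illustrative file name; each process reads its slab of N/P consecutive Y-planes through a subarray file view and a collective read.

#include <mpi.h>
#include <stdlib.h>

/* Read this rank's Y-slab of the N x N x N matrix from the shared file */
void read_y_slab(int N, int P, int rank)
{
    int gsizes[3] = {N, N, N};              /* global array: N x N x N           */
    int lsizes[3] = {N, N / P, N};          /* local slab: all X, N/P Ys, all Z   */
    int starts[3] = {0, rank * (N / P), 0}; /* 1D decomposition along Y           */

    MPI_Datatype filetype;
    MPI_Type_create_subarray(3, gsizes, lsizes, starts,
                             MPI_ORDER_C, MPI_INT, &filetype);
    MPI_Type_commit(&filetype);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "matrix3d.dat", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_INT, filetype, "native", MPI_INFO_NULL);

    int *local = malloc((size_t)N * (N / P) * N * sizeof(int));
    MPI_File_read_all(fh, local, N * (N / P) * N, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Type_free(&filetype);
    free(local);
}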
Revision Q5
A sequential program P consists of three parts A, B, C. Part B is not
parallelizable. Parts A and C are parallelizable. The sequential runtimes
are Sa, Sb, Sc for the parts A, B, C respectively. Derive the speedup of P
on N processes, where the overhead to parallelize part A is Oa,
overhead to parallelize part C is Oc.
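
One way the derivation might go, assuming parts A and C parallelize perfectly across the N processes and the overheads Oa and Oc simply add to the parallel runtime:

\[
T_{\text{seq}} = S_a + S_b + S_c, \qquad
T_{\text{par}}(N) = \frac{S_a}{N} + O_a + S_b + \frac{S_c}{N} + O_c
\]
\[
\text{Speedup}(N) = \frac{T_{\text{seq}}}{T_{\text{par}}(N)}
                  = \frac{S_a + S_b + S_c}{\dfrac{S_a}{N} + O_a + S_b + \dfrac{S_c}{N} + O_c}
\]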
Revision Q6
