OpenMP To CUDA

This document analyzes how OpenMP parallel constructs map to the CUDA stream programming model. It outlines the execution models and semantics of OpenMP and CUDA, and analyzes the performance of benchmarks mapped from OpenMP to CUDA. Most parallel, worksharing, and synchronization constructs can be mapped, but single, sections, and barrier may not be suitable because of limited parallelism. Most scientific applications are suitable for this mapping approach.

Copyright:

Attribution Non-Commercial (BY-NC)

Uploaded by

Hu Ming

Mapping OpenMP to the Stream Programming Model

Hu Ming, Zhang Fangzhou, Yue Kun

Objective
1. Study how the parallel mechanisms in OpenMP map to the stream programming model (CUDA).
2. Point out which parts are suitable for translation.
3. Analyze typical scientific applications.

Outline
OpenMP vs CUDA: Execution model

OpenMP vs CUDA: Semantics

OpenMP vs CUDA: Performance Analysis of Benchmarks

OpenMP vs CUDA Execution Model

OpenMP vs CUDA Semantics

Parallel Construct
parallel

Worksharing Construct
loop, sections, single

Master and Synchronization Construct
master, critical, barrier, taskwait, atomic, flush, ordered

Data Environment
shared, private, firstprivate, lastprivate, reduction, copyin, copyprivate
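To make a few of the data-environment clauses listed above concrete, here is a minimal sketch in C. The function name `scaled_sum` and its parameters are hypothetical and chosen only for illustration. It uses `shared`, `private`, `firstprivate`, and `reduction` on one loop. If the compiler has no OpenMP support, the pragma is ignored and the loop runs serially with the same result.

```c
/* Hypothetical helper: returns scale * (a[0] + ... + a[n-1]). */
double scaled_sum(const double *a, int n, double scale)
{
    double sum = 0.0;   /* reduction(+:sum): per-thread partial sums are combined at the end */
    double term = 0.0;  /* private(term): each thread works on its own copy */

    /* a is shared by all threads; scale is copied into each thread (firstprivate). */
    #pragma omp parallel for shared(a) private(term) firstprivate(scale) reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        term = scale * a[i];
        sum += term;
    }
    return sum;
}
```

The reduction is the clause that fits a data-centric model most naturally, since every thread does the same work on a different element and only the final combination step needs coordination.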

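OpenMP vs CUDA Semantics

#include <omp.h>

int main(void)
{
    int x = 0;

    /* x is shared by every thread in the team */
    #pragma omp parallel shared(x)
    {
        /* only one thread at a time may execute the update */
        #pragma omp critical
        x = x + 1;
    } /* end of parallel region */

    return 0;
}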
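The slides list both `critical` and `atomic` among the synchronization constructs. As a hedged sketch, the two hypothetical functions below perform the same shared-counter update with each construct. For a single scalar update like this one, `atomic` is the narrower construct and is closer in spirit to hardware atomic operations such as CUDA's `atomicAdd`. The function names are assumptions made for illustration.

```c
/* Hypothetical helper: n increments of a shared counter, guarded by critical. */
long count_with_critical(long n)
{
    long x = 0;
    #pragma omp parallel for shared(x)
    for (long i = 0; i < n; ++i) {
        /* Mutual exclusion: one thread at a time executes the statement. */
        #pragma omp critical
        x = x + 1;
    }
    return x;
}

/* Hypothetical helper: the same update, expressed as an atomic operation. */
long count_with_atomic(long n)
{
    long x = 0;
    #pragma omp parallel for shared(x)
    for (long i = 0; i < n; ++i) {
        /* Only the read-modify-write of x is protected. */
        #pragma omp atomic
        x += 1;
    }
    return x;
}
```

Both functions return `n` regardless of the number of threads; without OpenMP support the pragmas are ignored and the loops run serially.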

OpenMP vs CUDA Semantics

#pragma omp for ordered [clauses...]
    (loop region)
    #pragma omp ordered
        structured_block
    (end of loop region)
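A concrete instance of the `ordered` template above may help. The sketch below uses a hypothetical function, `ordered_squares`, that lets iterations compute their values concurrently but appends them to the output strictly in loop order. The names are assumptions for illustration only.

```c
/* Hypothetical helper: stores 0*0, 1*1, ..., (n-1)*(n-1) in out, in iteration
   order, and returns how many values were appended. */
int ordered_squares(int *out, int n)
{
    int next = 0;  /* shared write position, only touched inside the ordered region */

    #pragma omp parallel for ordered shared(out, next)
    for (int i = 0; i < n; ++i) {
        int sq = i * i;  /* this part may run concurrently, in any order */

        #pragma omp ordered
        {
            /* This block runs one iteration at a time, in loop order. */
            out[next] = sq;
            next = next + 1;
        }
    }
    return next;
}
```

Because the ordered block serializes part of every iteration, the available parallelism is limited to the work done outside it.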

OpenMP vs CUDA Semantics

Most of the directives and clauses can be mapped to stream programs.

OpenMP vs CUDA Performance

OpenMP:
    OS-level threads
    thread-centric parallel processing model
    threads can be complicated

CUDA:
    lightweight hardware threads
    data-centric processing model
    simple control logic
    inefficient at handling branches

Map those constructs that have large parallelism and uniform processing among threads
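A parallel loop over independent elements is the typical construct with large parallelism and uniform per-thread work. The sketch below, written in plain C so it can run anywhere, places an OpenMP loop next to a serial emulation of a kernel-style launch, where each lightweight thread derives its element index from a block index and a thread index. All function and parameter names are hypothetical; this is an illustration of the index mapping, not the translation scheme used in the study.

```c
/* OpenMP form: one loop iteration per element. */
void vec_add_omp(const float *a, const float *b, float *c, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

/* Kernel-style body: the work of one lightweight thread. */
static void vec_add_body(const float *a, const float *b, float *c, int n,
                         int block_idx, int block_dim, int thread_idx)
{
    int i = block_idx * block_dim + thread_idx;  /* global element index */
    if (i < n)                                   /* guard the last, partial block */
        c[i] = a[i] + b[i];
}

/* Serial emulation of launching ceil(n / block_dim) blocks of block_dim threads. */
void vec_add_grid(const float *a, const float *b, float *c, int n, int block_dim)
{
    int grid_dim = (n + block_dim - 1) / block_dim;
    for (int block = 0; block < grid_dim; ++block)
        for (int t = 0; t < block_dim; ++t)
            vec_add_body(a, b, c, n, block, block_dim, t);
}
```

Every emulated thread runs the same body with a single bounds check, which is the uniform, low-branching pattern the slide recommends mapping.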

OpenMP vs CUDA Performance

Not suitable:

single, sections -- they have small parallelism and different processing among threads
master -- parallelism is 1
barrier, taskwait -- demand that all threads be grouped into one block
lastprivate -- processing is not uniform among threads
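To illustrate why `lastprivate` is listed as non-uniform, consider the minimal sketch below. Every thread writes its own private copy of `last`, but only the thread that executes the sequentially final iteration copies its value back to the original variable, so that one thread does work the others do not. The function name `last_square` is hypothetical.

```c
/* Hypothetical helper: returns the value assigned by the final loop iteration.
   Intended for n >= 1. */
int last_square(int n)
{
    int last = 0;

    /* Each thread gets a private copy of last; after the loop, the copy from
       the sequentially last iteration (i == n - 1) is written back. */
    #pragma omp parallel for lastprivate(last)
    for (int i = 0; i < n; ++i)
        last = i * i;

    return last;
}
```

In a kernel-style translation this copy-back would typically appear as a branch taken by a single thread, which is the kind of divergence the slides describe as inefficient on CUDA.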

OpenMP vs CUDA
To understand whether it is reasonable to translate an OpenMP program into a CUDA program, we should analyze the application's patterns.

Conclusion
1. A majority of scientific applications are suitable to be mapped to the stream programming model.
2. Heterogeneous architectures using both CPU and GPU will become more common.

Comments:
1. This paper's work is mainly analysis.
2. We think more real applications should be considered, not just benchmarks.
3. Automatically translating OpenMP programs to CUDA programs may be possible.
