OpenMP To CUDA
Objective
1. Study how the parallel mechanisms of OpenMP map to the stream programming model (CUDA).
2. Point out which parts are suitable for translation.
3. Analyze typical scientific applications.
Outline
OpenMP vs CUDA: Execution model
Worksharing Construct
loop, sections, single
Data Environment
shared, private, firstprivate, lastprivate, reduction, copyin, copyprivate
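As a minimal sketch of how several of these data-environment clauses behave together, the C function below combines shared, private, firstprivate and reduction on one loop. The function name scaled_sum_sq and its parameters are hypothetical, chosen only for illustration. A translator to CUDA would plausibly keep shared data in device global memory and private data in per-thread storage, but that placement is an assumption here, not something the slides state.

```c
#include <stddef.h>

/* Hypothetical helper (not from the slides): returns scale * sum(x[i]^2). */
double scaled_sum_sq(const double *x, size_t n, double scale)
{
    double sum = 0.0;  /* reduction: each thread accumulates a partial sum */
    double tmp = 0.0;  /* private: each thread gets its own uninitialized copy */
    long i;

    /* shared:       x and n are visible to all threads (read-only here)
     * firstprivate: every thread starts with a copy of scale
     * private:      tmp is per-thread scratch space
     * reduction:    partial sums are combined with + at the end of the loop */
    #pragma omp parallel for shared(x, n) firstprivate(scale) private(tmp) reduction(+:sum)
    for (i = 0; i < (long)n; i++) {
        tmp = scale * x[i];
        sum += tmp * x[i];
    }
    return sum;
}
```

For example, calling scaled_sum_sq on the array {1.0, 2.0, 3.0} with scale 2.0 returns 28.0. The code also compiles without OpenMP support, in which case the pragma is ignored and the loop runs sequentially with the same result.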
#pragma omp for ordered [clauses...]
    (loop region)
#pragma omp ordered
    structured_block
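The ordered syntax above can be illustrated with a small C sketch. The function ordered_squares and its arguments are hypothetical names introduced for this example. The squaring may run in parallel, while the block under the ordered directive executes in sequential iteration order.

```c
#include <stddef.h>

/* Hypothetical helper (not from the slides): appends in[i]^2 to out in
 * iteration order and returns the number of values written. */
size_t ordered_squares(const int *in, size_t n, int *out)
{
    size_t next = 0;  /* next free slot in out; only touched inside ordered */
    long i;

    #pragma omp parallel for ordered shared(in, out, next)
    for (i = 0; i < (long)n; i++) {
        int sq = in[i] * in[i];   /* independent work, may run concurrently */

        #pragma omp ordered
        {
            out[next++] = sq;     /* runs in the order i = 0, 1, 2, ... */
        }
    }
    return next;
}
```

Because the ordered region serializes part of every iteration, it limits the parallelism available to a stream-style translation. This sketch only shows the OpenMP semantics and does not claim how the paper maps the construct.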
Most directives and clauses can be mapped to stream programs
Map those constructs that have large parallelism and uniform processing among threads
single, sections: small parallelism and different processing among threads
master: parallelism is 1
barrier, taskwait: require all threads to be grouped into one block (CUDA's __syncthreads() synchronizes only the threads within a block)
lastprivate: processing is not uniform among threads
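To make the contrast between uniform and non-uniform processing concrete, here is a hedged sketch in plain C. The first function is an OpenMP loop with lastprivate. The second emulates, with two nested sequential loops, the block and thread index arithmetic a CUDA kernel would typically use for the same loop. All names (saxpy_omp, saxpy_grid_emulated, block_dim) are hypothetical, and the emulation is an illustration of the idea rather than the translation scheme described in the paper.

```c
#include <stddef.h>

/* OpenMP version: y = a*x + y. After the loop, 'last' holds the value
 * assigned in the sequentially final iteration (requires n > 0). */
float saxpy_omp(size_t n, float a, const float *x, float *y)
{
    float last = 0.0f;
    long i;

    #pragma omp parallel for lastprivate(last)
    for (i = 0; i < (long)n; i++) {
        y[i] = a * x[i] + y[i];
        last = y[i];
    }
    return last;
}

/* Emulation of a grid of blocks in plain C (assumes block_dim > 0).
 * b plays the role of a block index and t of a thread index. */
float saxpy_grid_emulated(size_t n, float a, const float *x, float *y,
                          unsigned block_dim)
{
    float last = 0.0f;
    unsigned grid_dim = (unsigned)((n + block_dim - 1) / block_dim);

    for (unsigned b = 0; b < grid_dim; b++) {
        for (unsigned t = 0; t < block_dim; t++) {
            size_t i = (size_t)b * block_dim + t;  /* global index */
            if (i >= n)
                continue;               /* padding thread: no work */
            y[i] = a * x[i] + y[i];     /* uniform work for every thread */
            if (i == n - 1)
                last = y[i];            /* non-uniform: one thread writes back */
        }
    }
    return last;
}
```

The loop body maps directly because every index does the same work. The lastprivate write-back is the part that singles out one thread, which matches the slide's remark that its processing is not uniform.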
OpenMP vs CUDA
To understand whether it is reasonable to translate an OpenMP program into a CUDA program, we should analyze the patterns of the applications.
Conclusion
1. A majority of scientific applications are suitable for mapping to the stream programming model.
2. Heterogeneous architectures that combine CPUs and GPUs will become more common.
Comments:
1. This paper's work is mainly analysis.
2. We think more real applications should be considered, not just benchmarks.