Supercomputing On Graphics Cards: Marcus Bannerman

What is OpenCL? Why Use OpenCL?

OpenCL Hello World

Supercomputing on Graphics Cards


Marcus Bannerman

marcus.bannerman@cbi.uni-erlangen.de

An Introduction to OpenCL and the C++ Bindings


Outline

1. What is OpenCL?
   - Why was OpenCL Created?
   - The Architecture of OpenCL
   - GPU Power
   - Current Implementations
   - History
   - Resources
2. Why Use OpenCL?
   - An Example
3. OpenCL Hello World
   - Header
   - OpenCL Initialisation
   - Memory Initialisation
   - Running the Kernel
   - Output

Why was OpenCL Created?


- Programmable shaders allowed graphics cards to be used for calculations other than rendering, but the cards had to be tricked into performing these computations.
- Vendors began developing SDKs to make programming shaders easier, but each vendor had its own standard.
- Other devices (DSPs, the IBM Cell processor, etc.) are computationally powerful but lack a standard interface through which to access them.
- Apple wanted to access these resources in its hardware (e.g., the iPhone) and decided a standard interface would be a good thing.


OpenCL is:
- A platform that allows a host program to discover OpenCL-enabled devices (CPU, GPU, DSP, etc.).
- A runtime that allows the host program to manipulate contexts once they are created.
- A JIT compiler to create executables from OpenCL kernels so they may be run on the OpenCL devices.

The kernel language is:
- A subset of ISO C99 (restricted pointer operations, no unified namespace).
- Extended for parallelism, determining thread identity and synchronisation.
- Equipped with many built-in math functions.
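
The idea of "thread identity" can be sketched without any OpenCL at all. The following is plain C++, not OpenCL: `runOverRange` and `addArrays` are hypothetical names invented for this illustration. It calls a function once per index, handing it the value that `get_global_id(0)` would return inside a real kernel, and it assumes both inputs have the same length.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCL API): calls `kernel` once for every
// index in [0, globalSize), passing the index a work-item would obtain
// from get_global_id(0).
template <typename Kernel>
void runOverRange(std::size_t globalSize, Kernel kernel) {
    for (std::size_t id = 0; id < globalSize; ++id) {
        kernel(id);  // sequential here; an OpenCL device may run these in parallel
    }
}

// Example "kernel" body: each work-item adds exactly one pair of elements.
// Assumes a and b have the same number of elements.
std::vector<float> addArrays(const std::vector<float>& a,
                             const std::vector<float>& b) {
    std::vector<float> out(a.size());
    runOverRange(out.size(), [&](std::size_t id) { out[id] = a[id] + b[id]; });
    return out;
}
```

Because each call touches only its own element, the calls are independent of one another, which is what lets a device execute them concurrently.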


Figure: Moore's Law. Transistor count (logarithmic axis) against year, 1970 to 2010, for CPUs and GPUs, with the NVidia Fermi and AMD HD5800 marked. The evolution of processors, following the revised Moore's law of doubling performance every 18 months and transistor count every 2 years.

- NVidia's Fermi architecture is expected to achieve close to a teraFLOP at double precision.
- Of the top 500 supercomputers, positions 33 to 500 range from 98 down to 17 teraFLOPs.
- AMD's Cypress architecture achieves 150 GB/s memory bandwidth.
- Intel's Core i7-965 is benchmarked at 24 GB/s memory bandwidth.


- June 2008: Apple submits an initial proposal for OpenCL to the Khronos Group (the standards committee for OpenGL).
- December 2008: The specification for OpenCL 1.0 is standardised and released.
  - NVidia announces it will support OpenCL alongside its existing CUDA architecture.
  - AMD is replacing its Close to Metal offering with an OpenCL implementation.
- August 2009: AMD releases its first OpenCL development tools, supporting CPUs in OpenCL.
- August 2009: Apple releases Snow Leopard, which has full CPU+GPU OpenCL support.
- September 2009: NVidia releases its GPU OpenCL drivers and SDK.
- October 2009: AMD releases the latest version of its SDK, including OpenCL GPU support.

- Khronos Group: the OpenCL C specification and quick reference card. http://www.khronos.org/opencl
- MacResearch.org: an excellent webcast series on the basics of OpenCL. http://www.macresearch.org
- AMD's OpenCL implementation, from the creators of the C++ bindings. http://ati.amd.com/technology/streamcomputing/opencl.html
- NVidia's OpenCL implementation, with best practice guides. http://www.nvidia.com/object/cuda_opencl.html


An Example

Boundary value of the electrostatic potential

Video taken from the MacResearch.org OpenCL tutorial series, Episode 1.


OpenCL Hello World


Graphics cards have no traditional console output, so a true "Hello, world!" program would be useless.

Aims of this example:
- Demonstrate the initialisation steps required for OpenCL C++.
- Provide an example OpenCL kernel.
- Show that although the initialisation is lengthy, it is straightforward.

OpenCL Hello World, a.k.a. the hard way to square the elements of an array: a simple example program that performs the operation

    Output_i = Input_i^2
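
For comparison, the same operation can be written on the host in a few lines. This is a minimal sketch rather than part of the talk's program, and `squareOnHost` is a hypothetical name; it gives a reference result that the output of the OpenCL version could later be checked against.

```cpp
#include <algorithm>
#include <vector>

// Host-side reference for Output_i = Input_i^2.
// Hypothetical helper name; not part of OpenCL or its C++ bindings.
std::vector<float> squareOnHost(const std::vector<float>& input) {
    std::vector<float> output(input.size());
    // Apply x -> x * x to every element, writing results into output.
    std::transform(input.begin(), input.end(), output.begin(),
                   [](float x) { return x * x; });
    return output;
}
```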


Hello World: Header and kernel


    #include <iostream>
    #include <vector>
    #include <algorithm>
    #include <iterator>  // std::back_inserter
    #include <cstdlib>   // rand
    #include <cstring>   // strlen

    // The OpenCL C++ bindings, with exceptions
    #define __CL_ENABLE_EXCEPTIONS
    #include "cl.hpp"

    const size_t problemSize = 1024;

    // The compute kernel we will run
    const char* kernelSrc =
        "__kernel void squareArray(__global float* input, __global float* output)"
        "{"
        "  output[get_global_id(0)] = input[get_global_id(0)] * input[get_global_id(0)];"
        "}";

    int main()


Hello World: OpenCL Initialisation


    try {
        /* OpenCL Initialisation */
        // Open a context to run the OpenCL kernel in
        cl::Context context(CL_DEVICE_TYPE_GPU);

        // Gather all the kernel sources for the OpenCL program
        cl::Program::Sources source;
        source.push_back(std::make_pair(kernelSrc, strlen(kernelSrc)));

        // Make an OpenCL program
        cl::Program program(context, source);

        // Get all the available devices in the context
        std::vector<cl::Device> devices = context.getInfo<CL_CONTEXT_DEVICES>();

        // Build the kernel sources for all devices in the context
        try { program.build(devices); }
        catch (cl::Error& err)
        {


Hello World: Memory initialisation

            std::cerr << "Building failed, " << err.what() << "(" << err.err() << ")"
                      << "\nRetrieving build log\n"
                      << program.getBuildInfo<CL_PROGRAM_BUILD_LOG>(devices[0])
                      << "\n";
            return 1;
        }

        // Get the squareArray kernel to use in calculations
        cl::Kernel kernel(program, "squareArray");

        // Make a queue to put jobs on the first compute device
        cl::CommandQueue cmdQ(context, devices[0]);


Hello World: Running the kernel

        // Create a vector of random input values
        std::vector<cl_float> input;
        std::generate_n(std::back_inserter(input), problemSize, rand);

        // Start copying this data to the graphics card
        cl::Buffer inputBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                               sizeof(cl_float) * input.size(), &input[0]);

        // Make a buffer to hold the output of the kernel
        cl::Buffer outputBuffer(context, CL_MEM_WRITE_ONLY,
                                sizeof(cl_float) * input.size());


Hello World: Gathering the output

        /* Running on the graphics card */
        // Set the two arguments of the squareArray kernel
        kernel.setArg(0, inputBuffer);
        kernel.setArg(1, outputBuffer);

        // Get a Functor which will run the kernel on every input item
        // in blocks of 64 threads
        cl::KernelFunctor func = kernel.bind(cmdQ, cl::NDRange(input.size()),
                                             cl::NDRange(64));

        // Run the kernel and wait for it to finish
        func().wait();
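
The two NDRange sizes above are related by simple arithmetic: 1024 work-items in blocks of 64 gives 16 work-groups. In OpenCL 1.x the global size must be an exact multiple of the local size, so a problem size that is not a multiple of 64 would need padding. The sketch below shows that arithmetic in plain C++; `workGroupCount` and `roundUpToMultiple` are hypothetical helper names, not part of the OpenCL bindings.

```cpp
#include <cstddef>

// Number of work-groups when globalSize work-items are split into groups of
// localSize. Assumes localSize > 0 and that it divides globalSize exactly.
std::size_t workGroupCount(std::size_t globalSize, std::size_t localSize) {
    return globalSize / localSize;
}

// Smallest multiple of localSize that is greater than or equal to n.
// Assumes localSize > 0.
std::size_t roundUpToMultiple(std::size_t n, std::size_t localSize) {
    return ((n + localSize - 1) / localSize) * localSize;
}
```

For the values on this slide, `workGroupCount(1024, 64)` is 16. If the global size is padded with `roundUpToMultiple`, the kernel would also need a bounds check so the extra work-items do not touch memory past the end of the buffers.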

        /* Checking the outputted data */
        // Make a buffer to hold the outputted data
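
The listing is cut off at this point, so the remainder of the check is not shown. As a sketch of what a check could look like once the results are back in host memory (for example via the command queue's `enqueueReadBuffer`, which is an assumption about how the original proceeds), the plain C++ function below compares each output element with the square of its input. The name `outputIsSquared` and the tolerance value are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical check: returns true if every output[i] equals input[i]^2
// to within a relative tolerance (absolute tolerance for small values).
bool outputIsSquared(const std::vector<float>& input,
                     const std::vector<float>& output,
                     float relTol = 1e-5f) {
    if (input.size() != output.size()) return false;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const float expected = input[i] * input[i];
        // Scale the tolerance by the magnitude of the expected value,
        // but never below 1 so values near zero are compared absolutely.
        const float magnitude = std::fabs(expected);
        const float scale = magnitude > 1.0f ? magnitude : 1.0f;
        if (std::fabs(output[i] - expected) > relTol * scale) return false;
    }
    return true;
}
```

A tolerance is used instead of exact equality because a device's floating-point arithmetic is not guaranteed to round identically to the host's in every case.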