
Faculty of Engineering & Technology

Semester: 6th Year: 3rd


B.Tech CSE-AI
Subject Name: GPU
Subject Code : 203105398

FACULTY OF ENGINEERING AND TECHNOLOGY

BACHELOR OF TECHNOLOGY

GPU COMPUTING
(203105398)

LAB MANUAL

6th SEMESTER
COMPUTER SCIENCE & ENGINEERING DEPARTMENT


CERTIFICATE

This is to certify that Mr./Ms. ………………………… with
Enrolment No. 200303124548 has successfully completed his/her
laboratory experiments in GPU COMPUTING (203105398) in the
Department of PIT-CSE (AI) during the academic year 2022-23.

Date of Submission: ......................... Staff In Charge: ...........................

Head of Department: ...........................................


INDEX

Sr. No. | Experiment Title | Page No. | Starting Date | Ending Date | Grade | Sign

1. Understand the system by various Linux/Windows commands & GPU and CUDA architectures.
2. Understand Google Colab.
3. Analyze the program using gprof profiles.
4. WAP to demonstrate the addition of an array using CUDA code.
5. WAP to demonstrate squaring an array using a simple CUDA kernel.
6. WAP to demonstrate vector-matrix multiplication using GPU global memory.
7. WAP for vector-matrix multiplication, measuring time using CUDA events and using shared memory.
8. WAP to demonstrate vector-matrix multiplication using GPU constant memory; vector v is stored in GPU constant memory.
9. Analyse the program using NVIDIA profilers.
10. Develop a mini project with the help of GPU libraries such as Keras, TensorFlow, GANs, etc.

(The page number, date, grade, and sign columns are left blank to be filled in.)

PRACTICAL: 01
AIM: Understand the system by various Linux/Windows commands & GPU
and CUDA architectures.

GPU ARCHITECTURE:
A Graphics Processing Unit (GPU) is best known as the hardware used to run applications that
weigh heavily on graphics, e.g. 3D modeling software or VDI infrastructures. In the consumer
market, a GPU is mostly used to accelerate gaming graphics. Today, GPGPUs (General-Purpose
GPUs) are the hardware of choice for accelerating computational workloads in modern High
Performance Computing (HPC) landscapes. The GPU architecture is tolerant of memory latency:
compared to a CPU, a GPU works with fewer, and relatively small, memory cache layers. The
reason is that a GPU dedicates more transistors to computation, so it matters less how long it
takes to retrieve data from memory.

Figure :1.1 GPU Architecture


Compute Unified Device Architecture (CUDA) ARCHITECTURE:

CUDA stands for Compute Unified Device Architecture. It is a parallel computing platform and
API (Application Programming Interface) model developed by Nvidia, exposed as an extension
of C/C++ programming. CUDA programs use the Graphics Processing Unit (GPU) to perform
computations in parallel while providing good speed. Using CUDA, one can harness the power
of an Nvidia GPU to perform common computing tasks, such as processing matrices and other
linear algebra operations, rather than simply performing graphical calculations.

Figure :1.2 CUDA Architecture

WORKING OF CUDA:
 GPUs run one kernel (a grid of tasks) at a time.
 Each kernel consists of blocks, which are independent groups of threads.
 Each block contains threads, the basic units of computation.
 The threads in each block typically work together to calculate a value.
 Threads in the same block can share memory.
 In CUDA, sending data between the CPU and the GPU is often the most expensive part
of the computation.


Applications of CUDA
1. Computational finance
2. Climate, weather, and ocean modeling
3. Data science and analytics
4. Deep learning and machine learning
5. Defense and intelligence
6. Manufacturing/AEC
7. Media and entertainment
8. Medical imaging
9. Oil and gas
10. Research
11. Safety and security
12. Tools and management
Linux Commands:

Figure :1.3 pwd command

Figure :1.4 cd command

Figure :1.5 cd .. command

Figure :1.6 ls command


Figure :1.7 mkdir command

Figure :1.8 rmdir command

Figure :1.9 echo command

Figure :1.10 echo command

Figure :1.11 command


PRACTICAL: 03
AIM: Analyze the program using gprof profiles.
To obtain the profile, the program is compiled with the -pg flag (gcc -pg prog.c -o prog), run once to produce gmon.out, and then examined with gprof prog gmon.out.
CODE:
#include <stdio.h>

int binarySearch(int array[], int x, int low, int high) {


while (low <= high) {
int mid = low + (high - low) / 2;

if (array[mid] == x)
return mid;

if (array[mid] < x)
low = mid + 1;

else
high = mid - 1;
}

return -1;
}

int main(void) {
int array[] = {3, 4, 5, 6, 7, 8, 9};
int n = sizeof(array) / sizeof(array[0]);
int x = 4;
int result = binarySearch(array, x, 0, n - 1);
if (result == -1)


printf("Not found");
else
printf("Element is found at index %d", result);
return 0;
}


OUTPUT

Figure :2.1 gprof profiles


PRACTICAL: 02
AIM: Understand Google Colab.

What is Colaboratory?
Colaboratory, or “Colab” for short, is a product from Google Research. Colab allows anybody
to write and execute arbitrary Python code through the browser, and is especially well suited to
machine learning, data analysis, and education. More technically, Colab is a hosted Jupyter
notebook service that requires no setup to use, while providing free access to computing
resources, including GPUs.
Is it really free of charge to use?

Yes. Colab is free of charge to use.

Seems too good to be true. What are the limitations?

Colab resources are not guaranteed and not unlimited, and the usage limits sometimes fluctuate.
This is necessary for Colab to be able to provide resources free of charge. For more details,
see Resource Limits.

Users who are interested in more reliable access to better resources may be interested in Colab
Pro.

Resources in Colab are prioritized for interactive use cases. We prohibit actions associated with
bulk compute, actions that negatively impact others, as well as actions associated with bypassing
our policies. The following are disallowed from Colab runtimes:

 file hosting, media serving, or other web service offerings not related to interactive
compute with Colab
 downloading torrents or engaging in peer-to-peer file-sharing
 using a remote desktop or SSH
 connecting to remote proxies
 mining cryptocurrency
 running denial-of-service attacks
 password cracking


Figure: 3.1 Enabling gpu

Figure: 3.2 Simple python codes


Figure:3.3 Mathematical equation

Figure:3.4 Mathematical equation

Figure:3.5 Mathematical equation

Figure:3.6 Testing of gpu


PRACTICAL: 04
AIM: W.A.P to demonstrate the addition of an array using CUDA code.

CODE:
%%cu
#include <stdio.h>

int main()
{
    int arr[] = {1, 2, 3, 4, 5};
    int sum = 0;
    int length = sizeof(arr) / sizeof(arr[0]);

    for (int i = 0; i < length; i++) {
        sum = sum + arr[i];
    }

    printf("Sum of all the elements of an array: %d\n", sum);

    return 0;
}


OUTPUT

Figure: 4.1 Addition of Arrays


PRACTICAL: 05
AIM: W.A.P to demonstrate squaring an array using a simple CUDA kernel.

CODE:
%%cu
#include <stdio.h>

int main()
{
    int arr[5] = {1, 2, 3, 4, 5};
    int i = 0;

    printf("Array elements:\n");
    for (i = 0; i < 5; i++)
        printf("%d ", arr[i]);

    printf("\nSquare of array elements:\n");
    for (i = 0; i < 5; i++)
        printf("%d ", arr[i] * arr[i]);

    printf("\n");

    return 0;
}


OUTPUT

Figure: 5.1 Squaring an Array


PRACTICAL: 06
AIM: W.A.P to demonstrate vector-matrix multiplication using GPU global
memory.
CODE:
%%cu
#include<stdio.h>
#include<stdlib.h>

__global__ void arradd(int* md, int* nd, int* pd, int size)
{
    // Get unique identification number for a given thread
    int myid = blockIdx.x * blockDim.x + threadIdx.x;

    pd[myid] = md[myid] + nd[myid];
}

int main()
{
int size = 2000 * sizeof(int);
int m[2000], n[2000], p[2000],*md, *nd,*pd;
int i=0;

//Initialize the arrays


for(i=0; i<2000; i++ )
{
m[i] = i;
n[i] = i;
p[i] = 0;
}

// Allocate memory on GPU and transfer the data


cudaMalloc(&md, size);


cudaMemcpy(md, m, size, cudaMemcpyHostToDevice);

cudaMalloc(&nd, size);
cudaMemcpy(nd, n, size, cudaMemcpyHostToDevice);

cudaMalloc(&pd, size);

// Define number of threads and blocks


dim3 DimGrid(10, 1);
dim3 DimBlock(200, 1);

// Launch the GPU kernel function


arradd<<< DimGrid,DimBlock >>>(md,nd,pd,size);

// Transfer the results back to CPU memory


cudaMemcpy(p, pd, size, cudaMemcpyDeviceToHost);

// Free GPU arrays


cudaFree(md);
cudaFree(nd);
cudaFree (pd);

// Print the results


for(i=0; i<2000; i++ )
{
printf("\t%d",p[i]);
}
}


OUTPUT

Figure: 6.1 Vector Matrix Multiplication


PRACTICAL: 07
AIM: W.A.P for vector-matrix multiplication, measuring time using CUDA
events and using shared memory.
CODE:
%%cu
#include<stdio.h>
#include<stdlib.h>

__global__ void arradd(int* md, int* nd, int* pd, int size)
{
    // Get unique identification number for a given thread
    int myid = blockIdx.x * blockDim.x + threadIdx.x;

    pd[myid] = md[myid] * nd[myid];
}

int main()
{
int size = 2000 * sizeof(int);
int m[2000], n[2000], p[2000],*md, *nd,*pd;
int i=0;

//Initialize the arrays


for(i=0; i<2000; i++ )

{
m[i] = i;


n[i] = i;
p[i] = 0;
}

// Allocate memory on GPU and transfer the data


cudaMalloc(&md, size);
cudaMemcpy(md, m, size, cudaMemcpyHostToDevice);

cudaMalloc(&nd, size);
cudaMemcpy(nd, n, size, cudaMemcpyHostToDevice);

cudaMalloc(&pd, size);
dim3 DimGrid(10, 1);
dim3 DimBlock(200, 1);

arradd<<< DimGrid,DimBlock >>>(md,nd,pd,size);


cudaMemcpy(p, pd, size, cudaMemcpyDeviceToHost);
cudaFree(md);
cudaFree(nd);
cudaFree (pd);
for(i=0; i<2000; i++ )
{
printf("\t%d",p[i]);
}
}


OUTPUT

Figure: 7.1 vector matrix multiplication uses shared memory


PRACTICAL: 08
AIM: W.A.P to demonstrate vector-matrix multiplication using GPU constant
memory; the vector v is stored in GPU constant memory.
CODE:


Figure: 8.1 Code


OUTPUT

Figure: 8.2 Vector-matrix multiplication using GPU


PRACTICAL: 09
AIM: Analyse the program using NVIDIA profilers.
The nvidia-smi utility reports the GPU model, driver and CUDA versions, memory usage, and utilization; kernel-level profiling is done with NVIDIA's dedicated profilers such as nvprof or Nsight Systems.
CODE:
!nvidia-smi

OUTPUT

Figure: 9.1 NVIDIA Profile

