Ex 3 (Openmp - Iii) : Int Alg - Matmul2d (Int M, Int N, Int P, Float A, Float B, Float C)

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 8

Reg No : 19BCE2028

Name : Pratham Shah


Slot : L47+L48

EX 3 ( OPENMP –III)

SCENARIO – I

Write a simple OpenMP program for Matrix Multiplication. It is helpful to understand how
threads make use of the cores to enable coarse-grain parallelism.

Note that different threads will write different parts of the result in the array a, so we don't
get any problems during the parallel execution. Note that accesses to matrices b and c are
read-only and do not introduce any problems either.

Code Snippet
/*
 * Parallel 2-D matrix multiplication: a = b * c, where a is m x n,
 * b is m x p and c is p x n.  Returns 0 on completion.
 *
 * Rows of the result are divided statically among the threads, so each
 * thread writes a disjoint set of rows of a (no write conflicts); b and c
 * are only read.
 *
 * NOTE(review): the original listing split the work-sharing directive
 * across lines ("#pragma omp for" / "schedule(static) for (...)"), which
 * does not compile -- a pragma ends at the newline.  The directive and the
 * loop are restored to separate, well-formed lines here.
 */
int alg_matmul2D(int m, int n, int p, float **a, float **b, float **c)
{
    int i, j, k;

    #pragma omp parallel shared(a,b,c) private(i,j,k)
    {
        /* Distribute iterations of the row loop among threads. */
        #pragma omp for schedule(static)
        for (i = 0; i < m; i = i + 1) {
            for (j = 0; j < n; j = j + 1) {
                a[i][j] = 0.;
                for (k = 0; k < p; k = k + 1) {
                    a[i][j] = (a[i][j]) + ((b[i][k]) * (c[k][j]));
                }
            }
        }
    }
    return 0;
}

ALGORITHM:

1. Parallelize the outermost loop which drives the accesses to the result matrix a in the first
dimension.
2. Work-sharing among the threads is such that different threads will calculate different
rows of the result matrix a

SOURCE CODE:

#include<stdio.h>
#include<omp.h>

/*
 * Parallel 2-D matrix multiplication: a = b * c, where a is m x n,
 * b is m x p and c is p x n.  Returns 0 on completion.
 *
 * Each thread computes a disjoint set of rows of a under static
 * scheduling; b and c are read-only inside the region.
 *
 * NOTE(review): the original listing had report page-header text pasted
 * into the function body and placed the for-statement on the same line as
 * the "#pragma omp for schedule(static)" directive, making the loop part
 * of the pragma text.  Both extraction artifacts are repaired here.
 */
int alg_matmul2D(int m, int n, int p, float **a, float **b, float **c)
{
    int i, j, k;

    #pragma omp parallel shared(a,b,c) private(i,j,k)
    {
        #pragma omp for schedule(static)
        for (i = 0; i < m; i = i + 1) {
            for (j = 0; j < n; j = j + 1) {
                a[i][j] = 0.;
                for (k = 0; k < p; k = k + 1) {
                    a[i][j] = (a[i][j]) + ((b[i][k]) * (c[k][j]));
                }
            }
        }
    }
    return 0;
}
/*
 * Driver for alg_matmul2D.
 *
 * NOTE(review): the original called alg_matmul2D(3,4,5, 1.3,4.3,3.1),
 * passing scalar floats where float** matrix arguments are required --
 * that does not compile.  Build small stack matrices instead and print
 * the product so the result is visible.
 */
int main(void)
{
    enum { M = 3, N = 4, P = 5 };
    float as[M][N], bs[M][P], cs[P][N];
    float *a[M], *b[M], *c[P];
    int i, j;

    /* Row-pointer tables so the float** interface can be used. */
    for (i = 0; i < M; i++) { a[i] = as[i]; b[i] = bs[i]; }
    for (i = 0; i < P; i++) { c[i] = cs[i]; }

    /* Simple deterministic test data. */
    for (i = 0; i < M; i++)
        for (j = 0; j < P; j++)
            bs[i][j] = (float)(i + j);
    for (i = 0; i < P; i++)
        for (j = 0; j < N; j++)
            cs[i][j] = (float)(i - j);

    alg_matmul2D(M, N, P, a, b, c);

    for (i = 0; i < M; i++) {
        for (j = 0; j < N; j++)
            printf("%8.2f ", as[i][j]);
        printf("\n");
    }
    return 0;
}

OUTPUT:

For 250x250 matrix: 0.21s sequential time, 0.1s parallel time

For 1000x1000 matrix: 7.5s sequential time, 2s parallel time

RESULTS:

A parallelized program drastically reduces the time of execution in multiplying matrices as


compared to sequential program.

SCENARIO – II

Write an OpenMP program to calculate the value of PI by evaluating the integral of 4/(1 + x^2).
You can use three pragma block directives to handle the critical section, the sum and the
product display.

ALGORITHM:

SOURCE CODE:

#include<stdio.h>
#include<omp.h>
#include<math.h>
Reg No : 19BCE2028
Name : Pratham Shah
Slot : L47+L48

#define PI 3.1415926538837211

/* Main Program */

/*
 * Approximates PI with the midpoint rule applied to the integral of
 * 4/(1+x^2) over [0,1], parallelizing the interval loop with OpenMP and
 * accumulating into a shared variable under a critical section.
 *
 * NOTE(review): fixes to the original -- sumthread was read before ever
 * being initialized (undefined behavior); exit(1) was used without
 * <stdlib.h>; scanf's result was ignored; and the second
 * "#pragma omp critical" sat outside any parallel region, where it is a
 * no-op (that code is plain serial code and is kept as such).
 */
int main(void)
{
    int Noofintervals, i;
    float sum, x, totalsum, h, partialsum, sumthread;

    printf("Enter number of intervals\n");

    if (scanf("%d", &Noofintervals) != 1) {
        printf("Invalid input\n");
        return 1;
    }

    if (Noofintervals <= 0) {
        printf("Number of intervals should be positive integer\n");
        return 1; /* was exit(1), which requires <stdlib.h> */
    }

    sum = 0.0;
    sumthread = 0.0; /* BUG FIX: must start at zero before accumulation */
    h = 1.0 / Noofintervals;

    /*
     * Each thread evaluates the integrand at some midpoints; the shared
     * accumulator is updated inside a critical section to avoid a race.
     */
    #pragma omp parallel for private(x) shared(sumthread, h, Noofintervals)
    for (i = 1; i < Noofintervals + 1; i = i + 1) {
        x = h * (i - 0.5);

        /* OPENMP Critical Section Directive */
        #pragma omp critical
        sumthread = sumthread + 4.0 / (1 + x * x);
    }

    /* Serial code after the parallel region -- no synchronization needed. */
    partialsum = sumthread * h;
    sum = sum + partialsum;

    printf("The value of PI is \t%f \nerror is \t%1.16f\n", sum, fabs(sum - PI));

    return 0;
}

OUTPUT :
Reg No : 19BCE2028
Name : Pratham Shah
Slot : L47+L48

RESULTS:

SCENARIO – III

Write a simple OpenMP program that uses some OpenMP API functions to extract information
about the environment. It should be helpful to understand the language / compiler features of
OpenMP runtime library.

To examine the above scenario, the functions such as omp_get_num_procs(),


omp_set_num_threads(), omp_get_num_threads(), omp_in_parallel(),
omp_get_dynamic() and omp_get_nested() are listed and the explanation is given below to
explore the concept practically.

o omp_set_num_threads() takes an integer argument and requests that the Operating


System provide that number of threads in subsequent parallel regions.
o omp_get_num_threads() (integer function) returns the actual number of threads in the
current team of threads.
o omp_get_thread_num() (integer function) returns the ID of a thread, where the ID ranges
from 0 to the number of threads minus 1. The thread with the ID of 0 is the master thread.
o omp_get_num_procs() returns the number of processors that are available when the
function is called.
o omp_get_dynamic() returns a value that indicates if the number of threads available in
subsequent parallel region can be adjusted by the run time.
o omp_get_nested( ) returns a value that indicates if nested parallelism is enabled.
Reg No : 19BCE2028
Name : Pratham Shah
Slot : L47+L48

ALGORITHM:

Use different OpenMP API functions to see their uses

SOURCE CODE:

1. omp_get_num_threads: Returns the number of threads in the parallel region.

#include <stdio.h>
#include <omp.h>

/*
 * Demonstrates omp_get_num_threads(): the team size is 1 in serial code
 * and equals the requested team size only inside a parallel region.
 */
int main(void)
{
    /* Request four threads for subsequent parallel regions. */
    omp_set_num_threads(4);

    /* Serial context: team of one. */
    printf("%d\n", omp_get_num_threads( ));

    #pragma omp parallel
    #pragma omp master
    {
        /* Inside the region, the master reports the full team size. */
        printf("%d\n", omp_get_num_threads( ));
    }

    /* Back in serial code: team of one again. */
    printf("%d\n", omp_get_num_threads( ));

    /* num_threads(3) overrides omp_set_num_threads for this region only. */
    #pragma omp parallel num_threads(3)
    #pragma omp master
    {
        printf("%d\n", omp_get_num_threads( ));
    }

    printf("%d\n", omp_get_num_threads( ));
}
Output:

2. omp_set_num_threads: Sets the number of threads in upcoming parallel regions, unless


overridden by a num_threads clause.

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

/*
 * Demonstrates omp_set_num_threads(): the first region runs with the
 * default team size, the second with one extra thread.
 *
 * NOTE(review): the original wrote "#pragma single" (missing "omp"), an
 * unknown pragma that compilers silently ignore -- so every thread raced
 * on the assignment to current_num_threads.  Fixed to "#pragma omp
 * single".  The second printf was also labeled "Loop 1" by copy-paste;
 * corrected to "Loop 2".
 */
int main(int argc, char* argv[])
{
    int current_num_threads = 0;

    /* Parallel region sized by OMP_NUM_THREADS (or the runtime default). */
    #pragma omp parallel
    {
        /* Each thread prints its identifier. */
        printf("Loop 1: We are %d threads, I am thread %d.\n",
               omp_get_num_threads(), omp_get_thread_num());

        /* Exactly one thread records the team size. */
        #pragma omp single
        {
            current_num_threads = omp_get_num_threads();
        }
    }

    /* Request one extra thread for subsequent parallel regions. */
    omp_set_num_threads(current_num_threads + 1);

    #pragma omp parallel
    {
        /* Each thread prints its identifier. */
        printf("Loop 2: We are %d threads, I am thread %d.\n",
               omp_get_num_threads(), omp_get_thread_num());
    }

    return EXIT_SUCCESS;
}

Output:

3. omp_get_num_procs():Returns the number of processors that are available when the


function is called.

#include <stdio.h>
#include <omp.h>

/*
 * Demonstrates omp_get_num_procs(): the available-processor count is
 * queried once from serial code and once from inside a parallel region.
 *
 * NOTE(review): report page-header text that the extraction pasted into
 * the function body has been removed.
 */
int main(void)
{
    printf("%d\n", omp_get_num_procs( ));

    #pragma omp parallel
    #pragma omp master
    {
        printf("%d\n", omp_get_num_procs( ));
    }
}
Output:

4. omp_in_parallel():Returns nonzero if called from within a parallel region.

#include <stdio.h>
#include <omp.h>

/*
 * Demonstrates omp_in_parallel(): returns nonzero only when called from
 * within an active parallel region.
 *
 * NOTE(review): the original used printf_s, which is an MSVC / C11 Annex K
 * extension not available in standard toolchains; replaced with printf,
 * matching every other listing in this document.
 */
int main(void)
{
    omp_set_num_threads(4);

    /* Serial context: prints 0. */
    printf("%d\n", omp_in_parallel( ));

    #pragma omp parallel
    #pragma omp master
    {
        /* Parallel context: prints nonzero while the region is active. */
        printf("%d\n", omp_in_parallel( ));
    }
}
Output:

5. omp_get_dynamic():

#include <stdio.h>
#include <omp.h>

/*
 * Demonstrates omp_get_dynamic(): reports whether the runtime may adjust
 * the number of threads, from serial code and from a parallel region.
 */
int main(void)
{
    /* Any nonzero argument enables dynamic adjustment. */
    omp_set_dynamic(9);
    omp_set_num_threads(4);

    /* Query the flag from serial code... */
    printf("%d\n", omp_get_dynamic( ));

    /* ...and again from inside a parallel region. */
    #pragma omp parallel
    #pragma omp master
    {
        printf("%d\n", omp_get_dynamic( ));
    }
}
Output:

6. omp_get_nested():

#include <stdio.h>
#include <omp.h>

/*
 * Demonstrates omp_get_nested(): reports whether nested parallelism is
 * enabled, from serial code and from inside a parallel region.
 *
 * NOTE(review): report page-header text that the extraction pasted
 * between the function header and its opening brace has been removed.
 */
int main(void)
{
    omp_set_nested(1);
    omp_set_num_threads(4);
    printf("%d\n", omp_get_nested( ));

    #pragma omp parallel
    #pragma omp master
    {
        printf("%d\n", omp_get_nested( ));
    }
}
Output:

EXECUTION:

The execution of the codes provides the outputs as shown above.

RESULTS:

The different OpenMP API functions were successfully implemented and their uses were
observed.

You might also like