
ASSIGNMENT NO 1

SUBMITTED BY: TAYYABA RIAZ
REG NO: 2020-BCS-039
BCS-7
COURSE: PARALLEL AND DISTRIBUTED COMPUTING
SUBMITTED TO: MAM MADIHA

QUESTION:
Describe the applications and architecture of the following processors, and give some details
about the machines in which these processors are used.

*1. Uniform Memory Access (UMA):*


Introduction:
UMA, or Uniform Memory Access, is a computer architecture for parallel computing where all
processors in a multiprocessor system have the same access time to a single shared memory.
Unlike NUMA, there is no distinction between local and remote memory access, since all
processors can access any part of memory with roughly the same latency. UMA is
characterized by its simplicity and ease of programming, but as the number of processors
increases, it can run into scalability limitations, as contention for shared memory can become a
bottleneck.
- *Applications:* Scientific computing, simulations, database administration, and general-
purpose computing all make use of UMA designs.
- *Industry:* UMA systems are used by sectors like research, finance, and manufacturing.
Architectural Design: In UMA, all processors have equal access to a single shared memory,
often via a system bus. Although this uniform access makes memory management easier, it
can cause contention for shared resources in systems with many processors.
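The uniform-access idea can be sketched with Python threads, which all see one shared address space at equal cost; the lock below stands in for the shared bus that becomes a contention point (the worker function and counts here are illustrative, not part of any real UMA API):

```python
import threading

# Shared memory: every worker thread sees the same list with the
# same access cost -- the essence of UMA.
shared = [0] * 4
lock = threading.Lock()  # contention point, like a shared system bus

def worker(idx):
    for _ in range(1000):
        with lock:          # all threads compete for the same resource
            shared[idx] += 1

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared)  # [1000, 1000, 1000, 1000]
```

As the number of threads grows, time spent waiting on the lock grows with it, which mirrors the scalability bottleneck described above.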

*2. Non-Uniform Memory Access (NUMA):*


Introduction:
NUMA stands for Non-Uniform Memory Access, a computer architecture designed for use in
parallel computing and multiprocessor systems. In a NUMA system, multiple processors (or
nodes) share a common memory space, but access to this memory is not uniform across all
processors: accessing memory attached to a processor's own node is faster than accessing
memory attached to a remote node. In NUMA systems:
1. **Local Memory Access**:
Each processor has its own local memory, and accessing this local memory is faster than
accessing the memory of other processors or nodes.
2. **Remote Memory Access**:
Higher latency occurs when a processor needs to access memory located on another node
rather than in its local memory, because data must be exchanged between nodes, which takes
longer than a local access. NUMA architecture is used to increase the scalability and
performance of multiprocessor systems by reducing contention for memory access. It is
especially useful on systems with large numbers of processors or cores because it helps
distribute memory access across multiple nodes and reduces the risk of conflicts over shared
memory. Operating systems and software designed with NUMA support leverage this
architecture to reduce remote memory access by optimizing memory allocation and data
placement, thus improving overall system performance in parallel computing scenarios.

- *Applications:* Parallel processing, virtualization, and data analysis are performed using
NUMA architectures in large-scale servers, data centers, and supercomputers.
- *Industry:* High-performance computing (HPC), cloud computing, and scientific research
are all reliant on NUMA systems.
- *Architecture:* In NUMA systems, each processor is located closer to some memory banks
than others. Processors access local memory faster than remote memory, which optimizes
memory access for parallel applications.
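The local-versus-remote cost difference can be illustrated with a toy model (the node count and latency numbers below are made up for illustration; real NUMA latencies depend on the interconnect):

```python
# Toy model of NUMA: each node owns a slice of memory, and touching
# another node's slice pays a simulated interconnect penalty.
NODES = 2
memory = {node: [0] * 8 for node in range(NODES)}

def access_cost(cpu_node, mem_node):
    # Illustrative numbers only: local access is cheap,
    # remote access pays extra to cross the interconnect.
    return 1 if cpu_node == mem_node else 3

def read(cpu_node, mem_node, addr):
    return memory[mem_node][addr], access_cost(cpu_node, mem_node)

_, local_cost = read(0, 0, 3)   # CPU on node 0 reads node 0's memory
_, remote_cost = read(0, 1, 3)  # CPU on node 0 reads node 1's memory
print(local_cost, remote_cost)  # 1 3
```

NUMA-aware software tries to place each task's data so that most reads look like the first call, not the second.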

*3. Distributed Memory/Multicomputing:*


Introduction:
Distributed Memory/Multicomputing is a parallel computing architecture where multiple
independent computers or nodes work together in a coordinated manner. Each node has its
own separate memory and operates independently, communicating and sharing data through
message passing. This architecture is highly scalable and suitable for large-scale parallel
applications, but requires explicit message passing between nodes to exchange data, making
programming more difficult compared to shared-memory systems.
- *Applications:* Distributed memory architectures are commonly used in clusters for
scientific simulations, weather modeling, and parallel computing.
- *Industry:* Research institutions, academia and industries requiring intensive parallel
processing use distributed memory systems.
- *Architecture:* In a distributed memory system, each processor has its own local memory
and communication is achieved by message passing. It requires explicit data transfers between
processors.

*4. Hybrid Distributed Shared Memory:*


Introduction:
A parallel computing architecture called Hybrid Distributed Shared Memory (Hybrid DSM)
combines features of shared and distributed memory systems. A shared memory space that can
be accessed by numerous nodes is created in a hybrid DSM system where several nodes each
have their own local memory but can also share some of it with other nodes. This strategy
seeks to offer the simplicity of shared memory for simpler programming while delivering the
flexibility of distributed memory systems for scalability. In order to control data sharing and
coherence between nodes, a combination of hardware support and software techniques is
typically used.
- *Applications:* Hybrid architectures combine the features of both distributed and shared
memory systems, making them suitable for a variety of parallel workloads, including scientific
simulations and data analysis.
- *Industry:* Research, HPC, and other compute-intensive industries use hybrid architectures.
- *Architecture:* Hybrid systems use a combination of shared memory and message passing
techniques to balance memory access and data sharing between nodes.

*5. Data Flow Architecture:*


Introduction:
A parallel computing paradigm known as "Data Flow Architecture" places more emphasis on the
availability of data than on a predetermined control flow when it comes to task execution. This
architecture enables highly parallel and effective processing because tasks are started
as soon as the input data they need becomes available. Data Flow Architecture can adapt
flexibly to changing workloads and is especially well suited for data-intensive applications. It
contrasts with conventional von Neumann designs where control flow determines the execution
sequence and is frequently linked to parallelism in distributed systems, streaming data
processing, and high-performance computing.
- *Applications:* Dataflow architectures are used in signal processing, multimedia and
graphics applications that require high throughput and parallelism.
- *Industry:* Industries such as media production, telecommunications and image processing
rely on data flow architectures.
- *Architecture:* Data flow processors execute instructions based on data availability rather
than a fixed program counter. They excel in streaming and data intensive applications.
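The data-driven firing rule can be sketched in a few lines: each task runs as soon as all of its input tokens exist, with no program counter dictating the order (the task graph below is a made-up example):

```python
# Minimal dataflow sketch: a task "fires" the moment every input token
# it needs is available, rather than following a fixed program order.
tasks = {
    "add": (lambda a, b: a + b, ["x", "y"], "s"),  # s = x + y
    "dbl": (lambda s: 2 * s, ["s"], "d"),          # d = 2 * s
}
tokens = {"x": 3, "y": 4}  # initially available data

fired = []
while tasks:
    for name, (fn, inputs, output) in list(tasks.items()):
        if all(i in tokens for i in inputs):  # data-driven firing rule
            tokens[output] = fn(*(tokens[i] for i in inputs))
            fired.append(name)
            del tasks[name]

print(tokens["d"], fired)  # 14 ['add', 'dbl']
```

Note that "dbl" could not fire until "add" produced the token "s"; the schedule emerges from data availability alone.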

*6. Array Processor:*


Introduction:
A specific form of computer architecture called an "array processor" was created for the rapid
processing of data matrices or arrays. It is particularly effective for tasks requiring large-scale
numerical calculations, like signal processing and scientific simulations. A typical array
processor consists of several processing components that may run parallel operations on
several array elements at once, greatly accelerating computations. Because it can significantly
speed up activities that need repetitive array-based operations, this architecture is frequently
employed in supercomputers and specialized hardware for scientific and technical purposes.
- *Applications:* Array processors are used in scientific simulations, numerical calculations
and image processing where data is organized into arrays.
- *Industry:* Scientific research, aerospace and computer-aided design (CAD) industries use
array processors.
- *Architecture:* Array processors specialize in performing operations on arrays or matrices,
offering high parallelism and efficient execution of repetitive tasks.
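A minimal sketch of the lockstep idea: one operation is applied across every element of an array, with each position modeling one processing element (real array processors do this in parallel hardware; this is only an illustration):

```python
# SIMD-style sketch: a single "instruction" applied to every element of
# an array, the way an array processor's processing elements work in
# lockstep across the whole array at once.
def vector_op(op, a, b):
    # Each position models one processing element applying the same op.
    return [op(x, y) for x, y in zip(a, b)]

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
print(vector_op(lambda x, y: x + y, a, b))  # [11, 22, 33, 44]
```

One instruction, many data elements: the repetitive array-based work the text mentions is exactly what this pattern accelerates.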

*7. Chained Vector Processor:*


Introduction:
The chained vector processor is a parallel computing architecture that combines elements of
vector processing and pipelining. In this architecture, multiple functional units work together in
a pipeline fashion, forwarding the results of one vector operation directly into the next. It is
designed to increase computational efficiency by processing multiple data elements at the
same time and is well suited to tasks that involve large datasets. Chained vector processors
are often used in scientific and engineering applications, where tasks such as matrix
operations, simulation, and signal processing can benefit from their parallel processing
capabilities. This architecture aims to strike a balance between the vector processing
capabilities of traditional supercomputers and the array processing capabilities seen in
specialized hardware for scientific and numerical computing.
- *Applications:* Chained vector processors are used in multimedia, scientific
simulations, and high-performance computing for tasks requiring repetitive data processing.
- *Industry:* Industries such as graphics rendering, physics simulation, and seismic data
analysis benefit from chained vector processors.
- *Architecture:* Chained vector processors use a series of pipeline stages to process data in
parallel, resulting in high-throughput vectorized operations.
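Chaining can be sketched with Python generators: each stage streams its result directly into the next, element by element, instead of materializing an intermediate vector (the multiply and add stages here are illustrative):

```python
# Chaining sketch: the multiply unit streams each result straight into
# the add unit, one element per step, instead of writing a complete
# intermediate vector back to memory first.
def multiply(xs, scalar):
    for x in xs:
        yield x * scalar  # result forwarded immediately to next stage

def add(xs, scalar):
    for x in xs:
        yield x + scalar

data = [1, 2, 3]
chained = list(add(multiply(data, 10), 1))  # (x * 10) + 1 for each x
print(chained)  # [11, 21, 31]
```

No full intermediate list for the multiply results ever exists; each element flows through both stages in turn, which is the essence of vector chaining.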

*8. Systolic Array:*


Introduction:
A systolic array, also referred to as a systolic architecture, is a parallel computing architecture
that is highly structured and designed for efficient data flow across a network of processing
elements.
These processing elements, often arranged in a grid or array, work in a synchronized, chained
manner, passing data through the system as if it were "pumped" through the array like blood by
the heart (hence the name "systolic"). This architecture is suitable for tasks that involve regular,
repetitive calculations, such as matrix multiplication or signal processing. Systolic arrays are
known for their efficient use of hardware resources and high throughput, making them valuable
in a variety of applications requiring computational speed and efficiency.
- *Applications:* Systolic arrays are used in applications such as image processing, pattern
recognition, and matrix operations.
- *Industry:* Industries such as healthcare (medical imaging), robotics, and signal processing
use systolic arrays.
- *Architecture:* Systolic arrays consist of a grid of processing elements that pass data in a
regular, synchronized manner, making them efficient for data-dependent operations such as
convolutions and matrix multiplications.
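The "pumped" data flow can be simulated for a 2x2 matrix multiply: skewed rows of A and columns of B reach processing element (i, j) one hop per cycle, and each PE accumulates its own output (a toy output-stationary model, not a hardware description):

```python
# Toy systolic array for a 2x2 matrix multiply: PE (i, j) accumulates
# C[i][j] as skewed wavefronts of A (from the left) and B (from the
# top) propagate one hop per cycle through the grid.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
n = 2
C = [[0] * n for _ in range(n)]

for t in range(3 * n - 2):      # cycles until the last wavefront drains
    for i in range(n):
        for j in range(n):
            k = t - i - j       # operand pair reaching PE (i, j) this cycle
            if 0 <= k < n:
                C[i][j] += A[i][k] * B[k][j]

print(C)  # [[19, 22], [43, 50]]
```

PE (0, 0) starts at cycle 0 while PE (1, 1) only finishes at cycle 3: the computation ripples through the grid like a pulse, with no PE ever fetching data from a central memory mid-computation.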

These different multiprocessor and shared-memory architectures cater to a wide range of
applications and industries, offering solutions for parallelism, high-performance computing, and
data-intensive tasks based on their specific characteristics and design principles.
