
Module 4 : Parallel & Pipeline Processing

Mrs. Minakshi S. Ghorpade

Page 1
Introduction to parallel processing concepts
What is Parallel Processing ?
• Parallel processing is a large class of techniques used to perform simultaneous data-processing tasks for the purpose of increasing the computational speed of a computer system.
• Instead of processing each instruction sequentially as in a conventional computer, a parallel processing system performs concurrent data processing to achieve a faster execution time.
• The system may have two or more ALUs and be able to execute two or more instructions at the same time.
• The system may have two or more processors operating concurrently.
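As a rough sketch of the idea in Python (thread workers stand in for multiple processors; the function and variable names are ours), dividing a job among concurrent workers produces the same result as sequential processing while letting the chunks be handled at the same time:

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # Each worker ("processor") sums its own slice of the data.
    return sum(chunk)

data = list(range(1_000))
n_workers = 4
size = len(data) // n_workers
chunks = [data[i * size:(i + 1) * size] for i in range(n_workers)]

# Sequential: one processor handles every chunk in turn.
sequential = sum(partial_sum(c) for c in chunks)

# Parallel: four workers process their chunks concurrently.
with ThreadPoolExecutor(max_workers=n_workers) as pool:
    parallel = sum(pool.map(partial_sum, chunks))

assert sequential == parallel == sum(data)
```

The result is identical either way; the gain of a real parallel system is in elapsed time, not in the answer.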

Page 2
Purpose of Parallel Processing:
• To speed up the computer's processing capability.
• To increase its throughput.

Disadvantage of Parallel Processing
• The amount of hardware increases, so the cost of the system increases.

Classification of Parallel Processing: various ways of classification
1. From the internal organization of the processors
2. From the interconnection structure between processors
3. From the flow of information through the system

Page 3
Flynn's Classification of Computers

• M.J. Flynn proposed a classification for the organization of a computer system by the number of instructions and
data items that are manipulated simultaneously.

• The sequence of instructions read from memory constitutes an instruction stream.

• The operations performed on the data in the processor constitute a data stream.

• Parallel processing may occur in the instruction stream, in the data stream, or both.

• It is a way of organizing a multiple-processor system.

Flynn's classification divides computers into four major groups:
1. Single instruction stream, single data stream (SISD)
2. Single instruction stream, multiple data stream (SIMD)
3. Multiple instruction stream, single data stream (MISD)
4. Multiple instruction stream, multiple data stream (MIMD)
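The four classes follow directly from the two stream counts, which a tiny helper can make explicit (a sketch; the function name is ours):

```python
def flynn_class(instruction_streams, data_streams):
    """Map the number of instruction and data streams to Flynn's category."""
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"

assert flynn_class(1, 1) == "SISD"   # conventional uniprocessor
assert flynn_class(1, 4) == "SIMD"   # array/vector processor
assert flynn_class(4, 1) == "MISD"   # rarely built in practice
assert flynn_class(4, 4) == "MIMD"   # multiprocessor system
```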

Page 4
Single instruction stream, single data stream (SISD)
• A single control unit fetches one instruction stream that operates on one data stream, as in a conventional uniprocessor.

Page 5
Single instruction stream, multiple data stream (SIMD)
• One instruction is broadcast to many processing elements, each operating on its own data item, as in array and vector processors.

Page 6
Multiple instruction stream, single data stream (MISD)
• Multiple processing units apply different instructions to the same data stream; this organization is mostly of theoretical interest.

Page 7
Multiple instruction stream, multiple data stream (MIMD)
• Multiple processors fetch their own instructions and operate on their own data, as in a multiprocessor system.

Page 8
Instruction Level Parallelism
• Instruction-Level Parallelism (ILP) refers to architectures in which multiple operations can be performed in parallel within a single process, with its own set of resources: address space, registers, identifiers, state, and program counter.
• It covers both compiler design techniques and processors designed to execute operations, such as memory load and store, integer addition, and floating-point multiplication, in parallel to improve processor performance.
• Examples of architectures that exploit ILP are VLIW and superscalar architectures.
• A typical ILP processor allows multiple-cycle operations to be pipelined.
• ILP is a measure of how many of the operations in a computer program can be performed simultaneously.
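Whether two operations can be performed simultaneously comes down to whether either one writes a register the other touches. A minimal sketch (the representation of an operation as a written register plus a set of read registers is our assumption):

```python
def independent(op1, op2):
    """True when two operations can issue in the same cycle.
    Each op is (written_register, set_of_read_registers)."""
    w1, r1 = op1
    w2, r2 = op2
    return (w1 not in r2 and   # no RAW: op2 does not read op1's result
            w2 not in r1 and   # no WAR: op2 does not overwrite op1's source
            w1 != w2)          # no WAW: they write different registers

load_a = ("R1", {"R5"})        # R1 <- MEM[R5]
add_b  = ("R2", {"R3", "R4"})  # R2 <- R3 + R4  (touches neither R1 nor R5)
use_a  = ("R6", {"R1"})        # R6 needs the load's result

assert independent(load_a, add_b)       # can execute in parallel
assert not independent(load_a, use_a)   # dependency forces ordering
```

This is the kind of analysis a VLIW compiler or superscalar issue logic performs to find operations that can run together.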

Page 9
Classification
• ILP architectures can be classified in the following ways:

• Sequential architecture:
• Here, the program is not expected to convey any explicit information regarding parallelism to the hardware.
• Dependence architecture:
• Here, the program explicitly states the dependencies between operations, as in a dataflow architecture.
• Independence architecture:
• Here, the program states which operations are independent of each other, so that they can be executed instead of the 'nop's.
Page 10
Pipeline processing

Page 11
Pipeline processing
• Pipelining is the arrangement of the hardware elements of the CPU such that its overall performance is increased.
• Simultaneous execution of more than one instruction takes place in a pipelined process.
• In pipelining, multiple instructions are overlapped in execution.
• General structure of an n-segment pipeline:

Page 12
Example

Page 13
Pipeline stages

• Pipelining organizes the execution of multiple instructions so that they proceed simultaneously.
• Pipelining improves the throughput of the system. In pipelining, each instruction is divided into subtasks.
• Each subtask performs one dedicated part of the work.
• The instruction is divided into 5 subtasks: instruction fetch, instruction decode, operand fetch, instruction execution, and operand store.
• The instruction-fetch subtask performs only the fetching operation, the instruction-decode subtask only decodes the fetched instruction, and so on for the other subtasks.

Page 14
Pipeline stages- Three Stage Pipeline

Page 15
Pipeline stages

An instruction in a process is divided into 5 subtasks, namely:

• In the first subtask, the instruction is fetched.
• The fetched instruction is decoded in the second stage.
• In the third stage, the operands of the instruction are fetched.
• In the fourth stage, arithmetic and logical operations are performed on the operands to execute the instruction.
• In the fifth stage, the result is stored in memory.

Page 16
• The first instruction completes in 5 clock cycles.
• After the first instruction completes, a new instruction completes its execution in every subsequent clock cycle.
• Observe that when the instruction fetch of the first instruction is completed, the instruction fetch of the second instruction starts in the next clock cycle.
• This way the hardware never sits idle; it is always busy performing some operation.
• However, no two instructions can be in the same stage in the same clock cycle.
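This timing behavior can be checked with a small schedule generator (a sketch of an ideal pipeline with no stalls; cycle and stage numbering start at 1):

```python
def schedule(n_instructions, n_stages):
    """Map each clock cycle to the (instruction, stage) pairs active in it."""
    table = {}
    for i in range(n_instructions):
        for s in range(n_stages):
            cycle = i + s + 1          # instruction i enters stage s here
            table.setdefault(cycle, []).append((i + 1, s + 1))
    return table

sched = schedule(10, 5)

# First instruction completes its 5th stage in cycle 5, then one
# instruction finishes per cycle: total cycles = k + (n - 1).
assert sched[5][0] == (1, 5)
assert max(sched) == 5 + (10 - 1)      # 14 cycles for 10 instructions

# No two instructions occupy the same stage in the same cycle.
for entries in sched.values():
    stages = [s for _, s in entries]
    assert len(stages) == len(set(stages))
```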

Page 17
Advantages of Pipelining
• Instruction throughput increases.
• Increasing the number of pipeline stages increases the number of instructions executed simultaneously.
• A faster ALU can be designed when pipelining is used.
• Pipelining increases the overall performance of the CPU.

• Disadvantages of Pipelining
• Designing a pipelined processor is complex.
• The throughput of a pipelined processor is difficult to predict.

Page 18
Instruction pipelining
• Pipeline processing can occur not only in the data stream but in the instruction
stream as well.
• Most digital computers with complex instructions require an instruction pipeline to carry out operations such as fetching, decoding, and executing instructions.
• In general, the computer needs to process each instruction with the following
sequence of steps.
1. Fetch instruction from memory.
2. Decode the instruction.
3. Calculate the effective address.
4. Fetch the operands from memory.
5. Execute the instruction.
6. Store the result in the proper place.

Page 19
Instruction pipelining

• Each step is executed in a particular segment, and there are times when different segments may take different times to operate on the incoming information.
• The organization of an instruction pipeline will be more efficient if the instruction cycle is divided into segments of equal duration.
• One of the most common examples of this type of organization is a four-segment instruction pipeline.

Page 20
Four-segment instruction pipeline

• Segment 1:
• The instruction fetch segment can be implemented using a first-in, first-out (FIFO) buffer.
• Segment 2:
• The instruction fetched from memory is decoded in the second segment, and eventually the effective address is calculated in a separate arithmetic circuit.
• Segment 3:
• An operand from memory is fetched in the third segment.
• Segment 4:
• The instructions are finally executed in the last segment of the pipeline organization.

Page 21
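Segment 1's FIFO fetch buffer can be sketched with a bounded queue (the 4-entry size and the instruction names are illustrative assumptions):

```python
from collections import deque

# A small FIFO prefetch queue lets the fetch segment run ahead while
# later segments drain instructions in their arrival order.
fetch_buffer = deque(maxlen=4)

for instr in ["LOAD", "ADD", "STORE", "JMP"]:
    fetch_buffer.append(instr)            # fetch segment fills the FIFO

assert fetch_buffer.popleft() == "LOAD"   # decode drains first-in first
assert list(fetch_buffer) == ["ADD", "STORE", "JMP"]
```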
Types of Pipelining

• Arithmetic Pipelining
• Instruction Pipelining
• Processor Pipelining
• Unifunction Vs. Multifunction Pipelining
• Static vs Dynamic Pipelining
• Scalar vs Vector Pipelining

Page 22
Advantages of Pipelining
• The cycle time of the processor decreases, which improves instruction throughput. Pipelining does not lower the time it takes to complete a single instruction; rather, it raises the number of instructions that can be processed at once and lowers the delay between completed instructions, i.e., it raises throughput.
• If pipelining is used, the CPU's arithmetic logic unit can be designed to run faster, at the cost of greater complexity.
• Pipelining speeds up execution over an unpipelined core by roughly a factor of the number of stages (assuming the clock frequency also increases by a similar factor), provided the code is well suited to pipelined execution.
• Pipelined CPUs frequently run at a higher clock frequency than the RAM clock frequency (as of 2008 technologies, RAM operates at a low frequency relative to CPU frequencies), increasing the computer's overall performance.
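The stage-count speedup claim can be checked numerically (a sketch assuming an ideal pipeline with no stalls and equal stage times):

```python
def pipeline_speedup(n_instructions, n_stages):
    """Ideal speedup over an unpipelined core with the same stage time:
    n*k unpipelined cycles versus k + (n - 1) pipelined cycles."""
    return (n_instructions * n_stages) / (n_stages + n_instructions - 1)

assert pipeline_speedup(1, 5) == 1.0                 # one instruction: no gain
assert round(pipeline_speedup(1000, 5), 2) == 4.98   # approaches k = 5
```

For long instruction streams the speedup approaches the number of stages, which is why deeper pipelines were historically attractive.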
Page 23
Pipeline Hazards

In a pipelined system, some situations prevent the next instruction from performing its planned task in a particular clock cycle.

"Pipeline hazards are situations that prevent the next instruction from being executed during its designated clock cycle." These hazards create a problem known as stall cycles.

Types of Pipeline Hazards
1. Structural Hazard / Resource Conflict
2. Data Hazard / Data Dependency
3. Control Hazard / Branch Difficulty

Page 25
Structural Hazard/ Resource conflict

• This type of hazard occurs when two different instructions try to use the same resource simultaneously.
• These hazards are caused by two instructions accessing memory at the same time. Such conflicts can be partly resolved by using separate instruction and data memories.

• Structural hazards occur when the processor's hardware is not capable of executing all the instructions in the pipeline simultaneously.
• Structural hazards within a single pipeline are rare in modern processors because the instruction set architecture is designed to support pipelining.

Page 26
Structural Hazard/ Resource conflict continue

• During cycle 3, I1 is fetching its operand (OF), so no other instruction can access memory during that cycle; the same applies to I2 in the following cycle.
• Instruction 3 (I3) is delayed by 2 cycles because it cannot fetch its instruction while memory is being accessed by another instruction.
• Thus a resource dependency can deteriorate the overall performance of pipelined execution.
• The above problem can be solved by using separate instruction and data memories.
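The two-cycle delay to I3 can be reproduced with a small schedule simulation (a sketch; the stage names and the single-memory-port model are our assumptions):

```python
def i3_fetch_cycle(separate_memories):
    """Cycle in which I3 manages to fetch, in a 5-stage pipeline where
    IF and OF both need the memory port."""
    STAGES = ["IF", "ID", "OF", "EX", "WB"]
    MEM_STAGES = {"IF", "OF"}
    mem_busy = set()       # cycles in which the single memory port is taken
    fetch_cycles = []
    next_if = 1
    for _ in range(3):     # schedule I1, I2, I3 in order
        cycle = next_if
        for stage in STAGES:
            if stage in MEM_STAGES and not separate_memories:
                while cycle in mem_busy:
                    cycle += 1              # stall: wait for the memory port
                mem_busy.add(cycle)
            if stage == "IF":
                fetch_cycles.append(cycle)
                next_if = cycle + 1
            cycle += 1
    return fetch_cycles[-1]

assert i3_fetch_cycle(separate_memories=False) == 5   # delayed 2 cycles
assert i3_fetch_cycle(separate_memories=True) == 3    # no structural stall
```

With one shared memory, I3's fetch collides with the operand fetches of I1 and I2; with split instruction/data memories the collision disappears.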

Page 27
Data Hazard/ data Dependency

• Instruction i goes through a full cycle: it is fetched, decoded, its operands are fetched, it is executed, and its result is written back.
• When instruction i+1 is processed, it is fetched and decoded, but its operand cannot be fetched, because the result of R2 and R3 is stored into R1 and that updated value is an operand of the next instruction.
• So for i+1 we cannot fetch the operand, since the value of R1 has not yet been updated. Therefore the second instruction's operand fetch must be delayed until the write-back of the first instruction completes; this situation is called a hazard.
• The result written to R1 is required as an input to the next instruction, so the value of R1 in the second instruction depends on the result of the first. This is called a data dependency, and because of it the pipeline inserts two stall cycles while executing the instructions.
Page 28
There are three situations in which a data hazard can occur:
1. Read after write (RAW), a true dependency
2. Write after read (WAR), an anti-dependency
3. Write after write (WAW), an output dependency
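The three cases can be distinguished by comparing the registers the two instructions read and write. A minimal classifier (the tuple representation is our assumption):

```python
def hazard_type(first, second):
    """Classify how `second` depends on `first`; each instruction is
    (written_register, set_of_read_registers)."""
    w1, r1 = first
    w2, r2 = second
    if w1 in r2:
        return "RAW"   # true dependency: second reads what first writes
    if w2 in r1:
        return "WAR"   # anti-dependency: second overwrites first's source
    if w1 == w2:
        return "WAW"   # output dependency: both write the same register
    return None        # independent

add = ("R1", {"R2", "R3"})    # R1 <- R2 + R3
use = ("R4", {"R1", "R2"})    # R4 <- R1 + R2
assert hazard_type(add, use) == "RAW"
assert hazard_type(use, add) == "WAR"
assert hazard_type(("R1", {"R2"}), ("R1", {"R3"})) == "WAW"
```

Only RAW is a true data dependency; WAR and WAW arise from register reuse and can often be removed by renaming.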

Page 29
Branch hazards

• Branch instructions, particularly conditional branch instructions, create a control dependency between the branch instruction and the fetch stage of the pipeline.
• Since the branch instruction computes the address of the next instruction that the fetch stage should fetch, this takes some time, and additional time is required to flush the pipeline and fetch instructions from the target location.
• The time wasted this way is called the branch penalty.
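The cost of the branch penalty on average performance can be estimated with a simple model (the 20% branch frequency and 2-cycle penalty below are illustrative assumptions, not figures from the slides):

```python
def effective_cpi(base_cpi, branch_fraction, branch_penalty):
    """Average cycles per instruction once branch stalls are included."""
    return base_cpi + branch_fraction * branch_penalty

# Ideal CPI of 1, with 20% branches each costing a 2-cycle flush:
assert effective_cpi(1.0, 0.20, 2) == 1.4
```

Even a short flush raises the average CPI noticeably, which is why branch prediction and delayed branches exist.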

Page 30
Example:

MOV R0, 77H   ; load the first operand
MOV R1, 73H   ; load the second operand
ADD R0, R1    ; R0 <- R0 + R1; may set the carry flag
JC NEXT       ; conditional branch: fetch stage must wait for the outcome

NEXT: MOV R2, R1

Page 31
