Ca07 2014 PDF


COMPUTER ARCHITECTURE AND
ORGANIZATION
Chapter 4.5

Pipelining
An Overview of Pipelining

• Pipelining: An implementation
technique in which multiple
instructions are overlapped in
execution, much like an assembly
line.
• Today, pipelining is nearly universal.

An Overview of Pipelining

An Overview of Pipelining
• The same principles apply to
processors where we pipeline
instruction execution.
• MIPS instructions classically take five
steps:
1. Fetch instruction from memory.
2. Read registers while decoding the instruction.
The regular format of MIPS instructions allows
reading and decoding to occur simultaneously.
3. Execute the operation or calculate an address.
4. Access an operand in data memory.
5. Write the result into a register.

Single-Cycle versus Pipelined
Performance

Single-cycle execution versus
pipelined execution
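
As a rough illustration of the single-cycle versus pipelined comparison, the sketch below computes total execution time for both designs. The stage latencies are assumed values chosen for illustration, not figures taken from this lecture, and the function names are hypothetical. The pipelined model is ideal: it ignores hazards and stalls.

```python
# Assumed stage latencies in picoseconds (illustrative values only).
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

def single_cycle_time(n_instructions, stage_ps=STAGE_PS):
    """Each instruction takes one long cycle that covers every stage."""
    cycle = sum(stage_ps.values())
    return n_instructions * cycle

def pipelined_time(n_instructions, stage_ps=STAGE_PS):
    """Ideal pipeline (no hazards): the clock is set by the slowest stage."""
    cycle = max(stage_ps.values())
    n_stages = len(stage_ps)
    # The first instruction needs n_stages cycles; each later instruction
    # completes one cycle after its predecessor.
    return (n_stages + n_instructions - 1) * cycle

def instruction_latency(pipelined, stage_ps=STAGE_PS):
    """Time for one instruction to pass through the whole datapath."""
    if pipelined:
        return len(stage_ps) * max(stage_ps.values())
    return sum(stage_ps.values())
```

Under these assumed latencies, three instructions take 2400 ps on the single-cycle design and 1400 ps on the pipeline, while the latency of one instruction grows from 800 ps to 1000 ps. For long instruction streams the ratio of the two totals approaches the ratio of the clock periods (800 / 200 = 4 here), not the number of stages, because the stages are unbalanced.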

Designing Instruction Sets for
Pipelining

Pipeline Hazards
• Structural Hazards
• Data Hazards
Example:
add $s0, $t0, $t1
sub $t2, $s0, $t3
Solution: Forwarding or Reordering the
Code?
• Control Hazards

Data Hazards:
Forwarding with Two
Instructions

• Graphical representation of the
instruction pipeline

Data Hazards:
Forwarding with Two
Instructions
• Graphical representation of
forwarding

Data Hazards:
Forwarding with Two
Instructions

• Graphical representation of stalling

Data Hazards:
Reordering the Code
• Consider the following code segment in C:
a = b + e;
c = b + f;
• MIPS code for this segment:
lw $t1, 0($t0)
lw $t2, 4($t0)
add $t3, $t1,$t2
sw $t3, 12($t0)
lw $t4, 8($t0)
add $t5, $t1,$t4
sw $t5, 16($t0)
• Find the hazards in the preceding code segment.

Data Hazards:
Reordering the Code
• Moving up the third lw instruction to
become the third instruction eliminates
both hazards:
lw $t1, 0($t0)
lw $t2, 4($t0)
lw $t4, 8($t0)
add $t3, $t1,$t2
sw $t3, 12($t0)
add $t5, $t1,$t4
sw $t5, 16($t0)
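
The effect of the reordering can be checked with a small sketch that counts load-use stalls in both versions of the sequence. It assumes a five-stage pipeline with full forwarding, so the only stall left is a load immediately followed by an instruction that reads the loaded register. The helper names `parse` and `load_use_stalls` are hypothetical, and the parser handles only the `lw`, `sw`, and `add` forms used on these slides.

```python
import re

def parse(line):
    """Return (opcode, destination register or None, source registers)."""
    opcode = line.split()[0]
    regs = re.findall(r"\$\w+", line)
    if opcode == "sw":          # a store reads its registers, writes none
        return opcode, None, regs
    return opcode, regs[0], regs[1:]

def load_use_stalls(program):
    """Count one-cycle stalls, assuming full forwarding: a stall occurs only
    when a lw is immediately followed by an instruction that reads its result."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        prev_op, prev_dest, _ = parse(prev)
        _, _, curr_srcs = parse(curr)
        if prev_op == "lw" and prev_dest in curr_srcs:
            stalls += 1
    return stalls

original = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "add $t3, $t1,$t2",
    "sw $t3, 12($t0)", "lw $t4, 8($t0)", "add $t5, $t1,$t4",
    "sw $t5, 16($t0)",
]
reordered = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "lw $t4, 8($t0)",
    "add $t3, $t1,$t2", "sw $t3, 12($t0)", "add $t5, $t1,$t4",
    "sw $t5, 16($t0)",
]
```

Under these assumptions `load_use_stalls(original)` finds two stalls (each `add` directly follows the `lw` that produces one of its operands) and `load_use_stalls(reordered)` finds none.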

Control Hazards

• Graphical representation of stalling

Performance of “Stall on Branch”

The BIG Picture

• Pipelining increases the number of simultaneously
executing instructions and the rate at which
instructions are started and completed.
• Pipelining does not reduce the time it takes to
complete an individual instruction, also called the
latency.
• Instruction sets can either simplify or make life
harder for pipeline designers.
• Designers cope with structural, control, and data
hazards.
• Branch prediction and forwarding help make a
computer fast while still getting the right answers.

Pipelined Datapath –
Single-Cycle Datapath

Pipelined Datapath – Five Stages

The datapath has five stages, each named to
correspond to a stage of instruction execution:
• 1. IF: Instruction fetch
• 2. ID: Instruction decode and register file read
• 3. EX: Execution or address calculation
• 4. MEM: Data memory access
• 5. WB: Write back
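
To show how instructions overlap across these five stages, the following sketch builds a multiple-clock-cycle table for an ideal pipeline with no stalls. The function names are hypothetical, and the model simply starts one instruction per clock cycle.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_diagram(instructions):
    """Return one row per instruction; row[c] is the stage occupied in
    clock cycle c + 1, or '' when the instruction is not in the pipeline."""
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i, _ in enumerate(instructions):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage      # instruction i enters IF in cycle i + 1
        rows.append(row)
    return rows

def format_diagram(instructions):
    """Render the table as text, one instruction per line."""
    lines = []
    for name, row in zip(instructions, pipeline_diagram(instructions)):
        cells = " ".join(f"{cell:<3}" for cell in row)
        lines.append(f"{name:<20} {cells}".rstrip())
    return "\n".join(lines)
```

Calling `print(format_diagram(["lw $10, 20($1)", "sub $11, $2, $3", "add $12, $3, $4"]))` prints a staircase in which each instruction starts one cycle after the previous one. Reading a single column of the table gives the single-clock-cycle view: in cycle 5 the first instruction is in WB, the second in MEM, and the third in EX.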

Pipelined Datapath – Exceptions

There are, however, two exceptions to this
left-to-right flow of instructions:
– The write-back stage, which places the result
back into the register file in the middle of the
datapath

– The selection of the next value of the PC,
choosing between the incremented PC and the
branch address from the MEM stage

Pipelined Datapath –
Pipelined Execution

Pipelined Version of the Datapath

The First Pipe Stage
of an Instruction

The Second Pipe Stage
of an Instruction

The Third Pipe Stage
of a load Instruction

The Fourth Pipe Stage
of a load Instruction

The Fifth Pipe Stage
of a load Instruction

Five Pipe Stages
of a store Instruction

• 1. IF: Instruction fetch
• 2. ID: Instruction decode and register file read
• 3. EX: Execution or address calculation
• 4. MEM: Data memory access
• 5. WB: Write back

The Third Pipe Stage
of a store Instruction

The Fourth Pipe Stage
of a store Instruction

The Fifth Pipe Stage
of a store Instruction

The corrected pipelined datapath to
handle the load instruction properly

The portion of the datapath that is used
in all five stages of a load instruction

Multiple-clock-cycle pipeline diagram
of five instructions

Traditional multiple-clock-cycle pipeline
diagram of five instructions

The single-clock-cycle diagram
corresponding to clock cycle 5 of the pipeline

The pipelined datapath
with the control signals identified

The values of the Control Lines

Control Line Stages

The control lines are grouped according to the pipeline stage:
1. Instruction fetch
2. Instruction decode/register file read
3. Execution/address calculation
4. Memory access
5. Write-back

The Control Lines for
the final three stages

The pipelined datapath with
control signals connected to control portions
of pipeline registers

Data Hazards:
Forwarding vs. Stalling

A sequence with many dependences:

sub $2, $1, $3     # Register $2 written by sub
and $12, $2, $5    # 1st operand ($2) depends on sub
or  $13, $6, $2    # 2nd operand ($2) depends on sub
add $14, $2, $2    # 1st ($2) & 2nd ($2) depend on sub
sw  $15, 100($2)   # Base ($2) depends on sub

Pipelined dependences
in a five-instruction sequence

Notation that names the fields
of the pipeline registers

• Left of the period is the name of the
pipeline register
• Right of the period is the name of the field
in that register
• The two pairs of hazard conditions are:
1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
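
The four conditions can be applied to the five-instruction sequence (sub, and, or, add, sw) with a short sketch. It assumes that when instruction j is in EX, instruction j-1 occupies EX/MEM and instruction j-2 occupies MEM/WB. The `Instr` record and `hazard_types` function are hypothetical names, and the check deliberately mirrors the raw conditions above (it does not yet test RegWrite or register $0).

```python
from typing import NamedTuple, Optional

class Instr(NamedTuple):
    text: str
    rd: Optional[int]   # register written, or None if none is written
    rs: int             # first source register
    rt: int             # second source register

def hazard_types(program, j):
    """Hazard conditions that hold while instruction j is in EX.
    At that moment instruction j-1 sits in EX/MEM and j-2 in MEM/WB."""
    found = []
    cur = program[j]
    for label, distance in (("1", 1), ("2", 2)):
        if j - distance < 0:
            continue
        producer = program[j - distance]
        if producer.rd is None:
            continue
        if producer.rd == cur.rs:
            found.append(label + "a")
        if producer.rd == cur.rt:
            found.append(label + "b")
    return found

program = [
    Instr("sub $2, $1, $3", 2, 1, 3),
    Instr("and $12, $2, $5", 12, 2, 5),
    Instr("or $13, $6, $2", 13, 6, 2),
    Instr("add $14, $2, $2", 14, 2, 2),
    Instr("sw $15, 100($2)", None, 2, 15),
]
```

With this model, `and` triggers condition 1a, `or` triggers condition 2b, and `add` and `sw` trigger none, because by the time they reach EX the `sub` result has already left the MEM/WB register.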

The dependences between the pipeline
registers and the inputs to the ALU

ALU and pipeline registers
before adding forwarding

Control of EX conflict
if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd =
ID/EX.RegisterRs)) ForwardA = 10

if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd =
ID/EX.RegisterRt)) ForwardB = 10

Control of MEM conflict

if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd ≠ 0)
and (MEM/WB.RegisterRd =
ID/EX.RegisterRs)) ForwardA = 01

if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd ≠ 0)
and (MEM/WB.RegisterRd =
ID/EX.RegisterRt)) ForwardB = 01
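
The EX and MEM conditions can be combined into one forwarding unit, sketched below. The field and function names are hypothetical stand-ins for the pipeline register fields. One assumption goes beyond the conditions as listed: when both the EX and the MEM condition match the same source register, this sketch gives the EX condition priority, since EX/MEM holds the more recent result.

```python
from dataclasses import dataclass

@dataclass
class PipeRegs:
    ex_mem_reg_write: bool   # EX/MEM.RegWrite
    ex_mem_rd: int           # EX/MEM.RegisterRd
    mem_wb_reg_write: bool   # MEM/WB.RegWrite
    mem_wb_rd: int           # MEM/WB.RegisterRd
    id_ex_rs: int            # ID/EX.RegisterRs
    id_ex_rt: int            # ID/EX.RegisterRt

def forward_select(p, src):
    """Return the 2-bit mux control for one ALU operand (src = register number)."""
    ex_hazard = p.ex_mem_reg_write and p.ex_mem_rd != 0 and p.ex_mem_rd == src
    mem_hazard = p.mem_wb_reg_write and p.mem_wb_rd != 0 and p.mem_wb_rd == src
    if ex_hazard:            # assumed priority: most recent result wins
        return "10"          # forward the ALU result held in EX/MEM
    if mem_hazard:
        return "01"          # forward the value held in MEM/WB
    return "00"              # no forwarding: use the register file value

def forwarding_unit(p):
    """Return (ForwardA, ForwardB)."""
    return forward_select(p, p.id_ex_rs), forward_select(p, p.id_ex_rt)
```

For example, with `and $12,$2,$5` in EX and `sub $2,$1,$3` in EX/MEM, `forwarding_unit(PipeRegs(True, 2, False, 0, 2, 5))` returns `("10", "00")`. The test against register 0 prevents forwarding a value "written" to $0, which must always read as zero.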

The control values for the
forwarding multiplexors

Datapath modified to resolve
hazards via forwarding

Datapath with added
2:1 multiplexor

A pipelined sequence of instructions

Data Hazards and Stalls

• We need a hazard detection unit.
• It operates during the ID stage so that it can
insert the stall between the load and its use.
• The control for the hazard detection unit is this
single condition:
if (ID/EX.MemRead and
((ID/EX.RegisterRt = IF/ID.RegisterRs) or
(ID/EX.RegisterRt = IF/ID.RegisterRt)))
stall the pipeline
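
The stall condition translates directly into a predicate. This is a minimal sketch; the function and parameter names are hypothetical stand-ins for the ID/EX and IF/ID fields named in the condition.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID reads the register that the load
    currently in EX is about to write (a load-use hazard)."""
    return bool(
        id_ex_mem_read
        and (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt)
    )
```

As an illustrative case not taken from the slides, a load into $2 followed immediately by `and $4, $2, $5` gives `must_stall(True, 2, 2, 5)`, which is true. If the instruction in EX is not a load (`id_ex_mem_read` is false), no stall is needed because forwarding can supply the value.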

The way stalls are really inserted
into the pipeline

Pipelined Control Overview

The BIG Picture

• Although the compiler generally relies upon the hardware
to resolve hazards and thereby ensure correct execution,
the compiler must understand the pipeline to achieve the
best performance.

• Otherwise, unexpected stalls will reduce the performance
of the compiled code.

Learning Material

• This lecture covers chapters 4.5-4.7 from
Patterson-Hennessy
