Ca07 2014 PDF

COMPUTER ARCHITECTURE AND ORGANIZATION
Chapter 4.5
Pipelining
An Overview of Pipelining
• Pipelining: an implementation technique in which multiple instructions are overlapped in execution, much like an assembly line.
• Today, pipelining is nearly universal.
• The same principles apply to processors where we pipeline instruction execution.
• MIPS instructions classically take five steps:
1. Fetch instruction from memory.
2. Read registers while decoding the instruction. The regular format of MIPS instructions allows reading and decoding to occur simultaneously.
3. Execute the operation or calculate an address.
4. Access an operand in data memory.
5. Write the result into a register.
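The overlap of the five steps can be sketched as a small table builder. This is a minimal illustration, assuming an ideal pipeline with no hazards; the stage abbreviations IF, ID, EX, MEM, WB are the usual names for the five steps, and the function name `pipeline_diagram` and the three independent `lw` instructions are hypothetical choices made for this example.

```python
# Stage abbreviations for the five classic MIPS steps.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_diagram(instructions):
    """Return one row per instruction; row[c] is the stage occupied in cycle c.

    Assumes an ideal pipeline: instruction i starts one cycle after i-1
    and nothing ever stalls.
    """
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i in range(len(instructions)):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage  # instruction i enters stage s in cycle i + s
        rows.append(row)
    return rows

if __name__ == "__main__":
    # Three independent loads (illustrative only).
    program = ["lw $1, 100($0)", "lw $2, 200($0)", "lw $3, 300($0)"]
    for text, row in zip(program, pipeline_diagram(program)):
        print(f"{text:<16}" + " ".join(f"{cell:<3}" for cell in row))
```

Each row is shifted one cycle to the right of the row above it, so three instructions finish in 7 cycles rather than 15.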
Single-Cycle versus Pipelined Performance
Single-cycle execution versus pipelined execution
Designing Instruction Sets for Pipelining
Pipeline Hazards
• Structural Hazards
• Data Hazards
  Example:
    add $s0, $t0, $t1
    sub $t2, $s0, $t3
  Solution: forwarding or reordering the code?
• Control Hazards
Data Hazards: Forwarding with Two Instructions
• Graphical representation of the instruction pipeline
• Graphical representation of forwarding
• Graphical representation of stalling
Data Hazards: Reordering the Code
• Consider the following code segment in C:
a = b + e;
c = b + f;
• MIPS code for this segment:
lw $t1, 0($t0)
lw $t2, 4($t0)
add $t3, $t1,$t2
sw $t3, 12($t0)
lw $t4, 8($t0)
add $t5, $t1,$t4
sw $t5, 16($t0)
Find the hazards in the code segment above.
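One way to check an answer to this exercise is a small scan for load-use pairs. The sketch below assumes a five-stage pipeline with forwarding, so the only data hazards that still cost a cycle are a `lw` followed immediately by an instruction that reads the loaded register. The helper names `parse` and `load_use_hazards` are hypothetical, and the parser handles only the `lw`, `sw`, and three-register forms used in this segment.

```python
import re

PROGRAM = [
    "lw $t1, 0($t0)",
    "lw $t2, 4($t0)",
    "add $t3, $t1,$t2",
    "sw $t3, 12($t0)",
    "lw $t4, 8($t0)",
    "add $t5, $t1,$t4",
    "sw $t5, 16($t0)",
]

def parse(line):
    """Split one instruction into (op, destination register, source registers)."""
    op, rest = line.split(None, 1)
    regs = re.findall(r"\$\w+", rest)
    if op == "sw":
        # sw reads both of its registers and writes none.
        return op, None, regs
    # lw and R-type instructions write their first register.
    return op, regs[0], regs[1:]

def load_use_hazards(program):
    """Return (load index, user index, register) for each load-use pair."""
    parsed = [parse(line) for line in program]
    hazards = []
    for i in range(len(parsed) - 1):
        op, dest, _ = parsed[i]
        next_sources = parsed[i + 1][2]
        if op == "lw" and dest in next_sources:
            hazards.append((i, i + 1, dest))
    return hazards

if __name__ == "__main__":
    for load, user, reg in load_use_hazards(PROGRAM):
        print(f"{PROGRAM[user]!r} uses {reg} right after {PROGRAM[load]!r}")
```

Under these assumptions the scan flags both `add` instructions, each of which depends on the `lw` immediately before it.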
Data Hazards: Reordering the Code
• Moving up the third lw instruction to become the third instruction eliminates both hazards:
lw $t1, 0($t0)
lw $t2, 4($t0)
lw $t4, 8($t0)
add $t3, $t1,$t2
sw $t3, 12($t0)
add $t5, $t1,$t4
sw $t5, 16($t0)
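The benefit of the reordering can be estimated with a rough cycle count. This is a sketch, assuming a five-stage pipeline with forwarding in which each load-use pair costs exactly one stall cycle; the tuple encoding `(op, dest, sources)` and the function name `cycles` are hypothetical conventions for this example.

```python
def cycles(program, stages=5):
    """Cycles to run `program`: pipeline fill plus one bubble per load-use pair."""
    stalls = sum(
        1
        for prev, nxt in zip(program, program[1:])
        if prev[0] == "lw" and prev[1] in nxt[2]
    )
    return len(program) + (stages - 1) + stalls

# Each instruction is (op, destination register, source registers).
original = [
    ("lw", "$t1", ["$t0"]),
    ("lw", "$t2", ["$t0"]),
    ("add", "$t3", ["$t1", "$t2"]),
    ("sw", None, ["$t3", "$t0"]),
    ("lw", "$t4", ["$t0"]),
    ("add", "$t5", ["$t1", "$t4"]),
    ("sw", None, ["$t5", "$t0"]),
]
# Move the third lw (index 4) up to become the third instruction.
reordered = [original[i] for i in (0, 1, 4, 2, 3, 5, 6)]

if __name__ == "__main__":
    print("original: ", cycles(original), "cycles")
    print("reordered:", cycles(reordered), "cycles")
```

With these assumptions the original order takes 13 cycles and the reordered version 11, a saving of two cycles that matches the two removed stalls.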
Control Hazards
• Graphical representation of stalling
Performance of “Stall on Branch”
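The cost of stalling on every branch can be estimated with a one-line CPI formula. This is a sketch under stated assumptions: all non-branch instructions have a CPI of 1, and each branch adds a fixed number of stall cycles. The branch fraction of 17% and the one-cycle penalty used in the example call are illustrative values, not figures taken from the slide.

```python
def stall_on_branch_cpi(branch_fraction, stall_cycles=1, base_cpi=1.0):
    """Average CPI when every branch costs `stall_cycles` extra cycles."""
    return base_cpi + branch_fraction * stall_cycles

if __name__ == "__main__":
    # Assumed workload: 17% of executed instructions are branches.
    print(f"CPI = {stall_on_branch_cpi(0.17):.2f}")
```

Under these assumptions the average CPI rises from 1 to about 1.17, a slowdown of roughly 17% compared with the ideal pipeline.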
The BIG Picture
• Pipelining increases the number of simultaneously executing instructions and the rate at which instructions are started and completed.
• Pipelining does not reduce the time it takes to complete an individual instruction, also called the latency.
• Instruction sets can either simplify or make life harder for pipeline designers.
• Designers cope with structural, control, and data hazards.
• Branch prediction and forwarding help make a computer fast while still getting the right answers.
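The difference between throughput and latency can be made concrete with a small timing model. This is a sketch only: the per-stage delays in `STAGE_PS` are assumed values chosen for illustration, the single-cycle clock is taken as the sum of all stage delays, and the pipelined clock as the slowest stage. All function names are hypothetical.

```python
# Assumed stage delays in picoseconds (illustrative, not from the slides).
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

def single_cycle_time(n):
    """Total time for n instructions when each takes one long clock cycle."""
    return n * sum(STAGE_PS.values())

def pipelined_time(n):
    """Total time for n instructions in an ideal pipeline with no stalls."""
    clock = max(STAGE_PS.values())  # clock is set by the slowest stage
    return (len(STAGE_PS) + n - 1) * clock

def pipelined_latency():
    """Time for one instruction to pass through all pipeline stages."""
    return len(STAGE_PS) * max(STAGE_PS.values())

if __name__ == "__main__":
    for n in (3, 1_000_000):
        speedup = single_cycle_time(n) / pipelined_time(n)
        print(f"n={n}: {single_cycle_time(n)} ps vs {pipelined_time(n)} ps, "
              f"speedup {speedup:.2f}")
    print("latency per instruction:", pipelined_latency(), "ps")
```

With these assumed delays a single instruction takes 1000 ps in the pipeline versus 800 ps in the single-cycle design, so latency gets slightly worse, while the speedup for long instruction streams approaches 4 because the clock is four times faster.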
Pipelined Datapath – Single-Cycle Datapath
Pipelined Datapath – Five Stages
The datapath has five stages, each named for the stage of instruction execution it corresponds to:
• 1. IF: Instruction fetch
• 2. ID: Instruction decode and register file read
• 3. EX: Execution or address calculation
• 4. MEM: Data memory access
• 5. WB: Write back
Pipelined Datapath – Exceptions
There are, however, two exceptions to this left-to-right flow of instructions:
– The write-back stage, which places the result back into the register file in the middle of the datapath
– The selection of the next value of the PC, choosing between the incremented PC and the branch address from the MEM stage
Pipelined Datapath – Pipelined Execution
Pipelined Version of the Datapath
The First Pipe Stage of an Instruction
The Second Pipe Stage of an Instruction
The Third Pipe Stage of a load Instruction
The Fourth Pipe Stage
The Fifth Pipe Stage
Five Pipe Stages of a store Instruction
• 1. IF: Instruction fetch
• 2. ID: Instruction decode and register file read
• 3. EX: Execution or address calculation
• 4. MEM: Data memory access
• 5. WB: Write back
The Third Pipe Stage
The Fourth Pipe Stage
The Fifth Pipe Stage
The corrected pipelined datapath to handle the load instruction properly
The portion of the datapath that is used in all five stages of a load instruction
Multiple-clock-cycle pipeline diagram of five instructions
Traditional multiple-clock-cycle pipeline diagram of five instructions
The single-clock-cycle diagram corresponding to clock cycle 5 of the pipeline
The pipelined datapath with the control signals identified
The values of the Control Lines
Control Line Stages
Control lines are grouped according to the pipeline stage:
1. Instruction fetch
2. Instruction decode/register file read
3. Execution/address calculation
4. Memory access
5. Write-back
The Control Lines for the final three stages
The pipelined datapath with control signals connected to control portions of pipeline registers
Data Hazards: Forwarding vs. Stalling
A sequence with many dependences:
sub $2, $1,$3 # Register $2 written by sub
and $12,$2,$5 # 1st operand($2) depends on sub
or $13,$6,$2 # 2nd operand($2) depends on sub
add $14,$2,$2 # 1st($2) & 2nd($2) depend on sub
sw $15,100($2) # Base ($2) depends on sub
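Which of these dependences actually need forwarding depends on how far each reader is from the `sub`. The sketch below classifies them by distance. It assumes a five-stage pipeline in which the register file is written in the first half of a clock cycle and read in the second half, so a reader three or more instructions behind the writer gets the new value from the register file. The tuple encoding and the name `classify_dependences` are hypothetical, and the function assumes no later instruction overwrites the register in between.

```python
# Each instruction is (op, destination register, source registers).
PROGRAM = [
    ("sub", "$2", ["$1", "$3"]),
    ("and", "$12", ["$2", "$5"]),
    ("or", "$13", ["$6", "$2"]),
    ("add", "$14", ["$2", "$2"]),
    ("sw", None, ["$15", "$2"]),
]

def classify_dependences(program, producer=0):
    """Map each later reader of the producer's result to where it gets the value."""
    dest = program[producer][1]
    result = {}
    for i in range(producer + 1, len(program)):
        op, _, sources = program[i]
        if dest not in sources:
            continue
        distance = i - producer
        if distance == 1:
            result[op] = "forward from EX/MEM"
        elif distance == 2:
            result[op] = "forward from MEM/WB"
        else:
            # Writer has reached or passed WB by the time the reader is in ID.
            result[op] = "register file"
    return result

if __name__ == "__main__":
    for op, how in classify_dependences(PROGRAM).items():
        print(f"{op:<4} {how}")
```

Under these assumptions only `and` and `or` need forwarded values; `add` and `sw` read `$2` from the register file.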
Pipelined dependences in a five-instruction sequence
Notation that names the fields of the pipeline registers
• Left of the period is the name of the pipeline register
• Right of the period is the name of the field in that register
• The two pairs of hazard conditions are:
1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
The dependences between the pipeline registers and the inputs to the ALU
ALU and pipeline registers before adding forwarding
Control of EX conflict
if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA = 10
if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB = 10
Control of MEM conflict
if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd ≠ 0)
and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01
if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd ≠ 0)
and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01
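The EX and MEM conditions can be combined into one forwarding-unit function. This is a sketch, not the exact hardware: pipeline registers are modeled as plain dictionaries keyed by the field names used above, and the function name `forwarding_unit` is hypothetical. One design choice goes beyond the conditions as written: the EX test is applied after the MEM test so that, when both match, the more recent result in EX/MEM wins.

```python
def forwarding_unit(ex_mem, mem_wb, id_ex):
    """Return (ForwardA, ForwardB) as 2-bit strings.

    "00" selects the register file value, "10" the EX/MEM result,
    "01" the MEM/WB result.
    """
    forward_a = forward_b = "00"

    # MEM hazard: the value is in the MEM/WB pipeline register.
    if mem_wb["RegWrite"] and mem_wb["RegisterRd"] != 0:
        if mem_wb["RegisterRd"] == id_ex["RegisterRs"]:
            forward_a = "01"
        if mem_wb["RegisterRd"] == id_ex["RegisterRt"]:
            forward_b = "01"

    # EX hazard: checked last so the most recent result takes priority.
    if ex_mem["RegWrite"] and ex_mem["RegisterRd"] != 0:
        if ex_mem["RegisterRd"] == id_ex["RegisterRs"]:
            forward_a = "10"
        if ex_mem["RegisterRd"] == id_ex["RegisterRt"]:
            forward_b = "10"

    return forward_a, forward_b

if __name__ == "__main__":
    # sub $2,$1,$3 is in MEM while and $12,$2,$5 is in EX.
    ex_mem = {"RegWrite": True, "RegisterRd": 2}
    mem_wb = {"RegWrite": False, "RegisterRd": 0}
    id_ex = {"RegisterRs": 2, "RegisterRt": 5}
    print(forwarding_unit(ex_mem, mem_wb, id_ex))
```

The test against register 0 matters because `$0` always reads as zero, so a result nominally written to it must never be forwarded.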
The control values for the forwarding multiplexors
Datapath modified to resolve hazards via forwarding
Datapath with added 2:1 multiplexor
A pipelined sequence of instructions
Data Hazards and Stalls
• We need a hazard detection unit.
• It operates during the ID stage so that it can insert the stall between the load and its use.
• The control for the hazard detection unit is this single condition:
if (ID/EX.MemRead and
((ID/EX.RegisterRt = IF/ID.RegisterRs) or
(ID/EX.RegisterRt = IF/ID.RegisterRt)))
stall the pipeline
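The stall condition translates directly into a predicate. This is a minimal sketch that models the ID/EX and IF/ID pipeline registers as dictionaries keyed by the field names in the condition; the function name `must_stall` and the example instructions are hypothetical.

```python
def must_stall(id_ex, if_id):
    """True when the instruction in ID reads the register a load in EX will write."""
    return bool(
        id_ex["MemRead"]
        and id_ex["RegisterRt"] in (if_id["RegisterRs"], if_id["RegisterRt"])
    )

if __name__ == "__main__":
    # lw $2, 20($1) is in EX; and $4, $2, $5 is in ID.
    load_in_ex = {"MemRead": True, "RegisterRt": 2}
    and_in_id = {"RegisterRs": 2, "RegisterRt": 5}
    print(must_stall(load_in_ex, and_in_id))
```

Only `MemRead` is tested because a load is the one instruction whose result is not yet available at the end of EX, so forwarding alone cannot supply it to the next instruction in time.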
The way stalls are really inserted into the pipeline
Pipelined Control Overview
The BIG Picture
• Although the compiler generally relies upon the hardware to resolve hazards and thereby ensure correct execution, the compiler must understand the pipeline to achieve the best performance.
• Otherwise, unexpected stalls will reduce the performance of the compiled code.
Learning Material
• This lecture covers Sections 4.5-4.7 of Patterson and Hennessy.
