Professional Documents
Culture Documents
L8 PipelineHazards 1
L8 PipelineHazards 1
Outline
❖ Pipeline Hazards
❖ Control Hazards
EC792 2
Pipeline Hazards
❖ Hazards: Situations that would cause incorrect execution
If next instruction were launched during its designated clock cycle
1. Structural hazards
Caused by resource contention - a required resource is busy
Using same resource by two instructions during the same cycle
2. Data hazards
An instruction may compute a result needed by next instruction
Hardware can detect dependencies between instructions
Need to wait for previous instruction to complete its data
read/write
3. Control hazards
Caused by instructions that change control flow (branches/jumps)
Delays in changing the flow of control
❖ Hazards complicate pipeline control and limit performance
EC792 3
Structural Hazards
❖ Problem: Shared code & data memory
❖ Conflict for use of a resource
❖ In RISC-V pipeline with a single memory
Load/store requires data access
Instruction fetch would have to stall for that cycle
▪ Would cause a pipeline “bubble”
❖ Hence, pipelined datapaths require separate
instruction/data memories
Or separate instruction/data caches
EC792 4
Structural Hazards
❖ Problem: Conflict in accessing pipeline stages
Attempt to use the same hardware resource by two different
instructions during the same cycle
Structural Hazard
❖ Example
Two instructions are
Writing back ALU result in stage 4 attempting to write
Conflict with writing load data in stage 5 the register file
during same cycle
CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9 Time
EC792 5
Structural Hazards
❖ Problem: Conflict in accessing common memory
Attempt to use the same hardware resource by two different
instructions during the same cycle
❖ Example Structural Hazard
Two instructions are
Reading data from the memory in stage 4 attempting to access
Conflict with reading instruction in stage 1 memory during the
same cycle
CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9 Time
EC792 6
Resolving Structural Hazards
❖ Solution 1: Delay (Stall) Access to Resource
Must have mechanism to delay instruction access to resource
▪ Hardware includes control logic that stalls until earlier instruction is
no longer using contended resource
Delay all write backs to the register file to stage 5
▪ ALU instructions bypass stage 4 (memory) without doing anything
❖ Solution 2: Schedule instruction appropriately
Programmer explicitly avoids scheduling instructions that would
create structural hazards.
❖ Solution 3: Add more hardware resources (more costly)
Add more hardware to eliminate the structural hazard
▪ Have separate instruction and data memory
Redesign the register file to have two write ports
▪ First write port can be used to write back ALU results in stage 4
▪ Second write port can be used to write back load data in stage 5
EC792 7
Next . . .
❖ Pipelining versus Serial Execution
❖ Pipeline Hazards
❖ Control Hazards
EC792 8
Data Hazards
❖ Dependency between instructions causes a data hazard
❖ Three types:
Read after Write hazard (RAW)
Write after Read hazard (WAR)
Write after Write hazard (WAW)
EC792 9
Data Hazards
❖ Read After Write – RAW Hazard
Given two instructions I and J, where I comes before J
Instruction J reads an operand after it is written by I
Called a data dependence in compiler terminology
I: add R17, R18, R19 # R17 is written
J: sub R20, R17, R16 # R17 is read
Hazard occurs when J reads the operand before I writes it
WB = Write Back
IF = Instruction Fetch ID = Decode & EX = Execute MEM = Memory
Register Read Access
Imm NPC2
Next
NPC
PC
+1 Imm26
ALU result 32
Imm16 zero
Instruction 32 Data
Register File
ALUout
Instruction
A
A
BusA
WB Data
Memory Rs 5 Memory
RA L 0
Instruction Rt 5 Address
0 RB E 1 U Data_out
32
1
PC
Address BusB
B
0 0 32
1 RW Data_in
D
1 BusW 32
Rd
32
clk 10
Example of a RAW Data Hazard
Time (cycles) CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8
value of R18 10 10 10 10 10 20 20 20
Program Execution Order
add R20, R18, R13 IF Reg Reg Reg Reg ALU DM Reg
EC792 12
Solution 2: Forwarding ALU Result
❖ The ALU result is forwarded (fed back) to the ALU input
No bubbles are inserted into the pipeline and no cycles are wasted
❖ ALU result is forwarded from ALU, MEM, and WB stages
Time (cycles) CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8
value of R18 10 10 10 10 10 20 20 20
Program Execution Order
EC792 13
Implementing Forwarding
❖ Two multiplexers added at the inputs of A & B registers
Data from ALU stage, MEM stage, and WB stage is fed back
❖ Two signals: ForwardA and ForwardB control forwarding
ForwardA
Im26
Imm26 Imm16
32 32 ALU result
0 A
Result
Register File
Rs BusA 1
L Address
A
RA 2
Instruction
Rt 3 E U Data 0
WData
1
RB BusB
0 Memory
0 32
1 32 32 Data_out 1
RW
D
B
BusW 2 Data_in
3
32
Rd2
Rd4
Rd3
0
1
Rd
clk
ForwardB
EC792 14
Forwarding Control Signals
Signal Explanation
ForwardA = 0 First ALU operand comes from register file = Value of (Rs)
ForwardB = 0 Second ALU operand comes from register file = Value of (Rt)
EC792 15
Forwarding Example
Instruction sequence: When sub instruction is in decode stage
lw R12, 4(R8) ori will be in the ALU stage
ori R15, R9, 2
lw will be in the MEM stage
sub R11, R12, R15
ForwardA = 2 from MEM stage ForwardB = 1 from ALU stage
sub R11,R12,R15 ori R15, R9,2 lw R12,4(R8)
Imm26
Imm16 2
Imm
ext
32 32 ALU result
0 A
Result
Register File
Rs BusA 1 A
L Address
RA 2
Instruction
Rt 3 U Data 0
WData
1
RB BusB
0 Memory
0 32
1 32 32 Data_out 1
RW
D
B
BusW 2 Data_in
3
32
Rd2
Rd3
Rd4
0
1
Rd 1
clk
EC792 16
RAW Hazard Detection
❖ Current instruction being decoded is in Decode stage
Previous instruction is in the Execute stage
Second previous instruction is in the Memory stage
Third previous instruction in the Write Back stage
EC792 17
Hazard Detect and Forward Logic
Imm26
Im26
32 32 ALU result
E
0 A
Result
Register File
Rs BusA 1
L Address
A
RA 2
Instruction
Rt 3 U Data 0
WData
1
RB BusB
0 Memory
0 32
Rd 1 32 32 Data_out 1
RW
D
B
BusW 2 Data_in
3 ALUCtrl 32
Rd2
Rd3
Rd4
0
1
clk
RegDst
ForwardB ForwardA
Hazard Detect
and Forward
func
RegWrite RegWrite RegWrite
Op Main
EX
& ALU
MEM
Control
WB
EC792 18
Next . . .
❖ Pipelining versus Serial Execution
❖ Pipeline Hazards
❖ Control Hazards
EC792 19
Load Delay
❖ Unfortunately, not all data hazards can be forwarded
Load has a delay that cannot be eliminated by forwarding
❖ In the example shown below …
The LW instruction does not read data until end of CC4
Cannot forward data to ADD at end of CC3 - NOT possible
Time (cycles) CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8
forward data to
2nd next and later
add R20, R18, R13 IF Reg ALU DM Reg
instructions
EC792 20
Detecting RAW Hazard after Load
❖ Detecting a RAW hazard after a Load instruction:
The load instruction will be in the EX stage
Instruction that depends on the load data is in the decode stage
EC792 21
Stall the Pipeline for one Cycle
❖ ADD instruction depends on LW ➔ stall at CC3
Allow Load instruction in ALU stage to proceed
Freeze PC and Instruction registers (NO instruction is fetched)
Introduce a bubble into the ALU stage (bubble is a NO-OP)
❖ Load can forward data to next instruction after delaying it
Time (cycles) CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8
EC792 22
Showing Stall Cycles
❖ Stall cycles can be shown on instruction-time diagram
❖ Hazard is detected in the Decode stage
❖ Stall indicates that instruction is delayed
❖ Instruction fetching is also delayed after a stall
❖ Example:
Data forwarding is shown using blue & red arrows
CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9 CC10 Time
R2 data is available at ex stage and for sub alu i/p
is given by r2 and r18
EC792 23
Hazard Detect, Forward, and Stall
Imm26
Im26
32 32 ALU result
E
0 A
Result
Register File
Rs BusA 1
L Address
A
RA 2
Instruction
Rt 3 U Data 0
WData
1
RB BusB
PC
0 Memory
0 32
Rd 1 32 32 Data_out 1
RW
D
B
BusW 2 Data_in
3 32
Rd2
Rd3
Rd4
0
1
clk
RegDst
ForwardB ForwardA
Disable PC
func
Hazard Detect
Forward, & Stall
MemRead
Stall RegWrite RegWrite
Op Control Signals RegWrite
Main & ALU 0
EX
WB
=0
EC792 24
Code Scheduling to Avoid Stalls
❖ Compilers reorder code in a way to avoid load stalls
❖ Consider the translation of the following statements:
A = B + C; D = E – F; // A thru F are in Memory
EC792 25
Name Dependence: Write After Read
❖ Instruction J writes its result after it is read by I
❖ Called anti-dependence by compiler writers
I: sub R12, R9, R11 # R9 is read
J: add R9, R10, R11 # R9 is written
❖ Results from reuse of the name R9
❖ NOT a data hazard in the 5-stage pipeline because:
Reads are always in stage 2
Writes are always in stage 5, and
Instructions are processed in order
❖ Anti-dependence can be eliminated by renaming
Use a different destination register for add (eg, R13)
EC792 26
Name Dependence: Write After Write
❖ Same destination register is written by two instructions
❖ Called output-dependence in compiler terminology
I: sub R9, R12, R11 # R9 is written
J: add R9, R10, R11 # R9 is written again
❖ Not a data hazard in the 5-stage pipeline because:
All writes are ordered and always take place in stage 5
❖ However, can be a hazard in more complex pipelines
If instructions are allowed to complete out of order, and
Instruction J completes and writes R9 before instruction I
❖ Output dependence can be eliminated by renaming R9
❖ Read After Read is NOT a name dependence
EC792 27
Hazards due to Loads and Stores
❖ Consider the following statements:
sw R10, 0(R1)
lw R11, 6(R5)
Is there any Data Hazard possible in the above code sequence?
If 0(R1) == 6(R5), then it means the same memory location!!
Data dependency through Data Memory
But no Data Hazard since the memory is not accessed by the two
instructions simultaneously – writing and reading happens in two
consecutive clock cycles
But, in an out-of-order execution processor, this can lead to a
Hazard!
❖ Solving data hazards: Stall / Schedule / Forward / Speculate
Speculate: Guess that there is not a problem, if incorrect kill speculative
instruction and restart (later in the course)
EC792 28