Professional Documents
Culture Documents
Lec14 Pipeline Riscv - Key
Lec14 Pipeline Riscv - Key
Pipelining RISC-V
1
Administrivia
2
Review
Computer Science 61C Spring 2021 Kolb & Weaver
• Controller
• Tells universal datapath how to execute each instruction
• Instruction timing
• Set by instruction complexity, architecture, technology
• Pipelining increases clock frequency, “instructions per second”
• But does not reduce time to complete instruction
• Performance measures
• Different measures depending on objective
• Response time
• Jobs / second
• Energy per task
3
Levels of Representation/Interpretation
Machine
Interpretation
Hardware Architecture Description
(e.g., block diagrams)
Architecture
Implementation
Processor Memory
Enable?
Control Read/Write
Program
Datapath
Address
PC Bytes
Write
Registers
Data
Arithmetic & Logic Unit Data
Read
(ALU)
Data
Processor-Memory Interface
5
Pipelining Overview (Review)
6 PM 7 8 9 Nicholas Weaver
Nicholas Weaver
Instruction Fetch 200 ps 200 ps
or t3, t4, t5
or t3, t4, t5
or t3, t4, t5
instruction sequence
sw t0, 4(t3)
lw t0, 8(t3)
tcycle
addi t2, t2, 1 = 200 ps
9
RISC-V Pipeline
tinstruction = 1000 ps
Resource use in a Nicholas Weaver
sw t0, 4(t3)
lw t0, 8(t3)
tcycle
addi t2, t2, 1 = 200 ps
10
Single Cycle Datapath
Computer Science 61C Spring 2021 Kolb & Weaver
registers
rd
instruction
memory
PC
rs
memory
ALU
Data
rt
+4 imm
11
Pipeline registers
Computer Science 61C Spring 2021 Kolb & Weaver
registers
rd
instruction
memory
PC
rs
memory
ALU
Data
rt
+4 imm
13
IF for Load, Store, …
Computer Science 61C Spring 2021 Kolb & Weaver
14
ID for Load, Store, …
Computer Science 61C Spring 2021 Kolb & Weaver
15
EX for Load
Computer Science 61C Spring 2021 Kolb & Weaver
16
MEM for Load
Computer Science 61C Spring 2021 Kolb & Weaver
17
WB for Load – Oops!
Computer Science 61C Spring 2021 Kolb & Weaver
Wrong
register
number!
18
Corrected Datapath for Load
Computer Science 61C Spring 2021 Kolb & Weaver
19
Pipelined RISC-V RV32I Datapath
Recalculate PC+4 in M stage to avoid
sending both PC and PC+4 downNicholas Weaver
pipeline
pcX pcM
pcF pcD +4
+4 Reg[]
rs1X
DataD
1
aluX DataA ALU DMEM
0 AddrD Addr
pcF+4 Branch aluM DataR
IMEM AddrA
DataB
Comp. DataW
AddrB
instD
rs2X rs2M
immX
Imm.
instM
instW
instX
pcX pcM
pcF pcD +4
+4 Reg[]
rs1X
DataD
1
aluX DataA ALU DMEM
0 AddrD Addr
pcF+4 Branch aluM DataR
IMEM AddrA
DataB
Comp. DataW
AddrB
instD
rs2X rs2M
immX
Imm.
instM
instW
instX
Pipeline registers separate stages, hold data for each instruction in flight
21
Pipelined Control
Computer Science 61C Spring 2021 Kolb & Weaver
22
Pipelining Hazards
Computer Science 61C Spring 2021 Kolb & Weaver
24
Regfile Structural Hazards
Computer Science 61C Spring 2021 Kolb & Weaver
• Each instruction:
• can read up to two operands in decode stage
• can write one value in writeback stage
• Avoid structural hazard by having separate “ports”
• two independent read ports and one independent write port
• Three accesses per cycle can happen simultaneously
25
Structural Hazard: Memory Access
Computer Science 61C Spring 2021 Kolb & Weaver
memories
slt t6, t0, t3
sw t0, 4(t3)
lw t0, 8(t3)
26
Instruction and Data Caches
Computer Science 61C Spring 2021 Kolb & Weaver
27
Structural Hazards – Summary
Computer Science 61C Spring 2021 Kolb & Weaver
or t3, t4, t5
instruction sequence
sw t0, 4(t3)
lw t0, 8(t3)
29
Register Access Policy
• Exploit high speed of register
file (100 ps) Nicholas Weaver
1) WB updates value
add t0, t1, t2 2) ID reads new value
instruction sequence
• Indicated in diagram by
or t3, t4, t5 shading
sw t0, 4(t3)
lw t0, 8(t3)
Value of s0 5 5 5 5 5/9 9 9 9 9
or t6, s0, t3
sw s0, 8(t3)
• Bubble:
• effectively NOP: affected pipeline stages do “nothing”
32
Stalls and Performance
Computer Science 61C Spring 2021 Kolb & Weaver
33
Solution 2: Forwarding
Nicholas Weaver
5 5 5 5 5/9 9 9 9 9
Value of t0
add t0, t1, t2
or t3, t0, t5
instruction sequence
sw t0, 8(t3)
35
Detect Need for Forwarding
(example) Compare destination of
older instructions in
Nicholas Weaver
instD.rs1
or t3, t0, t5
36
Forwarding Path for RA to ALU
Nicholas Weaver
pcX pcM
pcF pcD +4
+4 Reg[]
rs1X
DataD
1
aluX DataA ALU DMEM
0 AddrD Addr
pcF+4 Branch aluM DataR
IMEM AddrA
DataB
Comp. DataW
AddrB
instD
rs2X rs2M
immX
Imm.
instM
instW
instX
Forwarding Control
Logic
37
Actual Forwarding Path Location…
Computer Science 61C Spring 2021 Kolb & Weaver
• We forward with two muxes just after RS1x and RS2x pipeline
registers
• The output of the read registers
• Select either the register, the output of the ALUm register, or the
output of the MEM/WB register
38
Agenda
Computer Science 61C Spring 2021 Kolb & Weaver
• Hazards
• Structural
• Data
• R-type instructions
• Load
• Control
• Superscalar processors
39
Load Data Hazard
Nicholas Weaver
1 cycle stall
unavoidable
forward
unaffected
40
Stall Pipeline
Computer Science 61C Spring 2021 Kolb & Weaver
Stall
repeat and
instruction
and forward
41
lw Data Hazard
Computer Science 61C Spring 2021 Kolb & Weaver
• Hazards
• Structural
• Data
• R-type instructions
• Load
• Control
• Superscalar processors
44
Control Hazards
Computer Science 61C Spring 2021 Kolb & Weaver
45
Observation
Computer Science 61C Spring 2021 Kolb & Weaver
46
Kill Instructions after Branch if Taken
Computer Science 61C Spring 2021 Kolb & Weaver
Convert to NOP
or t6, s0, t3
PC updated reflecting
label: xxxxxx
branch outcome
47
Reducing Branch Penalties
Computer Science 61C Spring 2021 Kolb & Weaver
48
Branch Prediction
Computer Science 61C Spring 2021 Kolb & Weaver
Taken branch
beq t0, t1, label
…..
Check guess correct
49
Implementing Branch Prediction...
Computer Science 61C Spring 2021 Kolb & Weaver
50
Agenda
Computer Science 61C Spring 2021 Kolb & Weaver
• Hazards
• Structural
• Data
• R-type instructions
• Load
• Control
• Superscalar processors
51
Increasing Processor Performance
Computer Science 61C Spring 2021 Kolb & Weaver
1. Clock rate
• Limited by technology and power dissipation
2. Pipelining
• “Overlap” instruction execution
• Deeper pipeline: 5 => 10 => 15 stages
• Less work per stage à shorter clock cycle
• But more potential for hazards (CPI > 1)
3. Multi-issue “superscalar” processor
52
Superscalar Processor
Computer Science 61C Spring 2021 Kolb & Weaver
P&H p. 340
54
Benchmark: CPI of Intel Core i7
Nicholas Weaver
CPI = 1
55
And That Is A Beast...
Computer Science 61C Spring 2021 Kolb & Weaver