Lesson 2 Branch Prediction
Outline
beq $s1, $s2, L1   # go to L1 if ($s1 == $s2)
bne $s1, $s2, L1   # go to L1 if ($s1 != $s2)
Conditional branch instruction format:
op (6 bit) | rs (5 bit) | rt (5 bit) | address (16 bit)
Execution of beq $x,$y,offset in the pipeline stages:
IF  Instruction Fetch & PC increment
ID  Instruction Decode; register read of $x and $y
EX  Execution; computation of (PC+4+offset)
ME  Memory Access; write of PC
WB  Write Back
[Figure: datapath for a conditional branch. Instruction bits [25-21] and [20-16] select the two registers read from the Register File; the ALU compares their contents and its Zero output goes to the control logic of the conditional branch; bits [15-0] are sign-extended from 16 to 32 bits, shifted left by 2 bits and added to PC + 4 (from the fetch unit) to form the Branch Target Address]
[Figure: pipelined datapath with the stages IF Instruction Fetch, ID, EX Execution, MEM and WB Write Back and the pipeline registers ID/EX, EX/MEM and MEM/WB, showing the PC, Instruction Memory, Register File (RF), sign extension, 2-bit left shifter, adders, ALU with Zero output (Branch Outcome), Data Memory and PC Write]
Branch Hazards
[Figure: pipeline diagram of five consecutive instructions, each going through the stages IF, ID, EX, ME, WB]
The branch instruction may or may not change the PC in the MEM stage, but by then the next 3 instructions have already been fetched and their execution has started.
If the branch is not taken, the pipeline execution is correct and no cycle is lost.
If the branch is taken, the 3 instructions that follow it must be flushed from the pipeline, and the lw instruction at the branch target address (L1) must be fetched.
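The cost of these flushes can be estimated with a small calculation. The sketch below is only an illustration in Python: the function names, the 5-stage ideal pipeline and the example instruction counts are assumptions, while the penalty of 3 flushed instructions per taken branch comes from the text above.

```python
def total_cycles(n_instructions, n_taken_branches, flush_penalty=3, n_stages=5):
    """Cycles of an ideal pipeline plus the flush cost of the taken branches.

    Assumes no other hazards: the pipeline fills once, then retires one
    instruction per cycle, and each taken branch wastes `flush_penalty` cycles.
    """
    ideal = n_stages + (n_instructions - 1)
    return ideal + flush_penalty * n_taken_branches


def cpi(n_instructions, n_taken_branches, flush_penalty=3):
    """Steady-state cycles per instruction, ignoring the pipeline fill time."""
    return 1 + flush_penalty * n_taken_branches / n_instructions


if __name__ == "__main__":
    # Hypothetical program: 100 instructions, 10 of them taken branches.
    print(total_cycles(100, 10), cpi(100, 10))
```

With these hypothetical numbers, every taken branch adds 3 cycles, so a program in which 10% of the instructions are taken branches runs at a CPI of about 1.3 instead of 1.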
[Figure: pipeline diagram in which the instructions following the branch (or $13, $6, $2 and add $14, $2, $2) are delayed by three stall cycles]
[Figure: pipeline diagram in which the same instructions are delayed by two stall cycles]
[Figure: pipelined datapath (IF/ID, EX/MEM and MEM/WB registers) showing the PC, Instruction Memory, Register File, sign extension, 2-bit left shift, adders, ALU, Branch Outcome and Data Memory]
[Figure: pipeline diagram in which the same instructions are delayed by one stall cycle]
[Figure: pipeline diagrams with one and two stall cycles inserted after the IF stage]
[Figure: pipeline diagram of a branch followed by Instruction i+1, Instruction i+2, Instruction i+3 and Instruction i+4, each going through IF, ID, EX, ME, WB]
Taken branch
[Figure: pipeline diagram of a taken branch: Instruction i+1 is fetched, then Branch target, Branch target+1 and Branch target+2 go through IF, ID, EX, ME, WB]
4) Profile-Driven Prediction
[Figure: pipeline diagram of five instructions, including lw $8, 500($0), each going through IF, ID, EX, ME, WB]
[Figure: pipeline diagram of five instructions, up to Instr. i+3, each going through IF, ID, EX, ME, WB]
Taken branch
[Figure: pipeline diagram of five instructions for the taken branch case, each going through IF, ID, EX, ME, WB]
[Figure: examples of scheduling the branch delay slot, with branches of the form "if $2 == 0 then" and "if $1 == 0 then" and the moved instruction or $7, $8, $9]
To make the optimization legal for the target and fall-through cases, it must be OK to execute the moved instruction when the branch goes in the unexpected direction.
By OK we mean that the instruction in the branch delay slot is executed but its work is wasted (the program still executes correctly).
This is the case, for example, if the destination register is an unused temporary register when the branch goes in the unexpected direction.
[Figure: BHT (Branch History Table) with 2^k entries, each giving a T/NT prediction]
In a loop branch, even if the branch is almost always taken and then not taken once, the 1-bit BHT mispredicts twice (rather than once) for each execution of the loop:
At the last loop iteration, the prediction bit says taken, while we need to exit from the loop.
When we re-enter the loop, at the end of the first loop iteration we need to take the branch to stay in the loop, while the prediction bit says to exit from the loop, because it was flipped by the previous execution of the last iteration.
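The double misprediction can be reproduced with a few lines of simulation. This is a minimal sketch in Python, not part of the original slides: the function name, the initial prediction (taken) and the loop length of 10 iterations are assumptions chosen for illustration.

```python
def one_bit_mispredictions(outcomes, initial_prediction=True):
    """Count the mispredictions of a 1-bit predictor over a trace of outcomes.

    `outcomes` is a list of booleans (True = taken). The single prediction
    bit always stores the last outcome of the branch.
    """
    prediction = initial_prediction
    misses = 0
    for taken in outcomes:
        if prediction != taken:
            misses += 1
        prediction = taken  # the bit is overwritten with the actual outcome
    return misses


# Hypothetical loop branch: taken 9 times, then not taken once at loop exit.
loop_once = [True] * 9 + [False]

if __name__ == "__main__":
    print(one_bit_mispredictions(loop_once))      # first execution of the loop
    print(one_bit_mispredictions(loop_once * 3))  # loop executed three times
```

Under these assumptions the first execution of the loop costs one misprediction (the exit), and every later execution costs two: one when the loop is re-entered and one at the exit.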
    if (a == 2) a = 0;     bb1
L1: if (b == 2) b = 0;     bb2
L2: if (a != b) {};        bb3

    subi r3,r1,2
    bnez r3,L1      ; bb1
    add  r1,r0,r0
L1: subi r3,r2,2
    bnez r3,L2      ; bb2
    add  r2,r0,r0
L2: sub  r3,r1,r2
    beqz r3,L3      ; bb3
L3:
Branch bb3 is correlated with the previous branches bb1 and bb2.
If the previous branches are both not taken, a and b are both set to 0, so bb3 will be taken (a == b).
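The correlation can be checked by enumerating a few input values. The sketch below is an illustrative Python model of the assembly fragment, not code from the slides: the function name and the tested value range are assumptions, and a branch is modelled as taken exactly when its bnez/beqz condition holds.

```python
def run_branches(a, b):
    """Return (bb1_taken, bb2_taken, bb3_taken) for the fragment above."""
    bb1_taken = (a != 2)   # bnez r3,L1 skips "a = 0" when a != 2
    if not bb1_taken:
        a = 0
    bb2_taken = (b != 2)   # bnez r3,L2 skips "b = 0" when b != 2
    if not bb2_taken:
        b = 0
    bb3_taken = (a == b)   # beqz r3,L3 is taken when a - b == 0
    return bb1_taken, bb2_taken, bb3_taken


# Try every combination of a and b in a small, arbitrary range.
outcomes = [run_branches(a, b) for a in range(4) for b in range(4)]
both_not_taken = [o for o in outcomes if not o[0] and not o[1]]

if __name__ == "__main__":
    # Whenever bb1 and bb2 are both not taken, bb3 is taken.
    print(all(o[2] for o in both_not_taken))
```

A predictor that looks only at the past behaviour of bb3 cannot see this relation, whereas one that also records the outcomes of bb1 and bb2 can exploit it.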
Prof. Cristina Silvano Politecnico di Milano
[Figure: BHT with 2^k entries indexed by the Branch Address (k bit)]
[Figure: BHT indexed by a 4-bit branch address, with 2^4 entries, each holding a 2-bit Prediction]
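A table of this kind can be modelled in a few lines. The following Python sketch is an assumption-laden illustration rather than the scheme drawn on the slide: the class and function names are hypothetical, each entry is modelled as a 2-bit saturating counter starting at 0, and the index is taken from the PC after dropping the two byte-offset bits of a word-aligned instruction address.

```python
class TwoBitBHT:
    """Branch History Table with 2**index_bits entries of 2-bit counters."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # Counter states 0..3: 0 and 1 predict not taken, 2 and 3 predict taken.
        # Starting every counter at 0 is an assumption.
        self.counters = [0] * (1 << index_bits)

    def _index(self, pc):
        # Drop the 2 byte-offset bits, keep the low index_bits (assumption).
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def count_mispredictions(bht, pc, outcomes):
    """Predict, compare and update for each outcome; return the miss count."""
    misses = 0
    for taken in outcomes:
        if bht.predict(pc) != taken:
            misses += 1
        bht.update(pc, taken)
    return misses


if __name__ == "__main__":
    bht = TwoBitBHT(index_bits=4)
    loop = [True] * 9 + [False]          # hypothetical loop branch at PC 0x40
    print(count_mispredictions(bht, 0x40, loop))  # warm-up execution
    print(count_mispredictions(bht, 0x40, loop))  # later execution
```

Once the counter has warmed up, a single not-taken outcome at loop exit only moves it from 3 to 2, so the prediction on re-entering the loop is still taken.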
[Figure: chart of the Frequency of Mispredictions (axis from 0% to 14%) for the benchmarks li, eqntott, espresso, gcc, fpppp, spice, doduc, tomcatv, matrix300, nasa7]
[Figure: GA Predictor. Labels: BHT indexed by the k-bit PC, BHR (e.g. T NT NT T), PHT, T/NT prediction outputs]
[Figure: GShare Predictor. The k-bit PC is XORed with the BHR (e.g. T NT NT T) to index the PHT, which gives the T/NT prediction]
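The indexing scheme named in the GShare figure (PC XOR BHR selects a PHT entry) can be sketched as follows. This Python model is an illustration under assumptions, not the slide's exact design: the class and function names are hypothetical, the PHT entries are modelled as 2-bit saturating counters starting at 1, and the PC is shifted right by 2 before the XOR.

```python
class GShare:
    """GShare sketch: a k-bit global history XORed with k PC bits indexes the PHT."""

    def __init__(self, k=4):
        self.mask = (1 << k) - 1
        self.bhr = 0                   # global history, newest outcome in bit 0
        self.pht = [1] * (1 << k)      # 2-bit counters, initial value is an assumption

    def _index(self, pc):
        return ((pc >> 2) ^ self.bhr) & self.mask

    def predict(self, pc):
        return self.pht[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.pht[i] = min(3, self.pht[i] + 1)
        else:
            self.pht[i] = max(0, self.pht[i] - 1)
        # Shift the actual outcome into the global history register.
        self.bhr = ((self.bhr << 1) | int(taken)) & self.mask


def count_mispredictions(predictor, pc, outcomes):
    """Predict, compare and update for each outcome; return the miss count."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


if __name__ == "__main__":
    p = GShare(k=4)
    alternating = [True, False] * 4    # hypothetical branch alternating T, NT
    print(count_mispredictions(p, 0x40, alternating))         # warm-up
    print(count_mispredictions(p, 0x40, [True, False] * 10))  # after warm-up
```

In this model an alternating branch is predicted correctly once the history has warmed up, because the two history patterns select two different PHT counters.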
[Figure: Branch Target Buffer (BTB) with associative lookup. BTB entry: Tag, Target, Stat (e.g. T-T-NT). Other labels: Present, Target Address, T/NT]
Speculation
Hardware-Based Speculation