Professional Documents
Culture Documents
15 - 370w18 - Pipeline Last
15 - 370w18 - Pipeline Last
15 - 370w18 - Pipeline Last
EECS Department
University of Michigan in Ann Arbor, USA
beq 1 1 10
add 3 4 5
time
taken?
Address of the
current branch
target address
If firstUsing
branch taken,
global second
history definitely
prediction, we not takento base the predictio
are able
on the direction taken by the first if. If (x<1), we know that the sec
EECS 370: Introduction to The University of Michigan
Computer Organization 10
6
Two Level Global Branch Prediction
First level: Global branch history register (N bits)
The direction of last N branches
Second level: Table of saturating counters for each history entry
The direction the branch took the last time the same history was seen
Pattern History Table (PHT)
00 …. 00
1 1 ….. 1 0 00 …. 01
2 3
previous one 00 …. 10
GHR
(global index
history 0 1
register) 11 …. 11
index
0 1
11 …. 11
q Data hazards: since register reads occur in stage 2 and register writes
occur in stage 5 it is possible to read the wrong value if is about to be
written.
q Control hazards: A branch instruction may change the PC, but not until
stage 4. What do we fetch before that?
q Exceptions: How do you handle exceptions in a pipelined processor with 5
instructions in flight?
Instr.
Data
Memory
Memory
Register
A
L
M
PC
File
U
U
X
Register
PC
Instr.
File
Mem
Register
File
Data
Memory
Register
File
Register
File
RF
RF
q Combine
Techniques
• SIMD
+
MP
+
MT=
SIMT
Inst.
Mem
RF
Data
Mem
q MT
used
to
hide
RF
memory
latencies
PC
RF
q SIMD
used
to
decrease
power
of
fetch/control
RF
q MP
used
to
improve
RF
throughput
EECS 370: Introduction to The University of Michigan 29
Computer Organization
LC2k
Pipeline
Summary
Fetch
Decode
Execute
Memory
WB
PC
read
Need
register
values
Branches
resolved
Register
values
produced
q Data
hazards
q Hazard
exists
if
producer-‐consumer
of
a
register
within
a
2
instrucIon
window
q Note
for
project,
the
window
is
3
instrucIons
q Detect
and
stall
–
insert
enough
noops
to
separate
producer
and
consumer
by
>
2
instrucIons
(>
3
instrucIons
for
project)
q Detect
and
forward
q Handles
all
cases
except
LW-‐USE,
need
1
noop
here
q Control
hazards
q Detect
and
stall
–
needs
3
noops
inserted
aler
each
branch
q Predict
and
squash
q Zero
noops
if
predict
correctly
q 3
if
predict
incorrectly
add
1
2
3
nor
1
4
5
add
4
6
7
What
value
is
wri]en
to
the
ALU
result
field
of
the
Mem/WB
pipeline
register
at
the
end
of
cycle
5.
nand
3
4
5
add
3
5
6
q How
many
cycles
are
saved
if
you
perform
speculate
and
squash
for
the
following
code
(assume
that
branches
are
predicted
to
be
not
taken)?
add
1
2
3
beq
1
5
1
nor
6
4
1
add
3
4
5
q Assume
the
branch
is
taken:
How
many
cycles
to
execute
this
code?
q Assume
the
branch
is
not
taken:
How
many
cycles
execute
this
code?
q How
many
cycles
are
saved
if
you
perform
speculate
and
squash
for
the
following
code
(assume
that
branches
are
predicted
to
be
not
taken)?
add
1
2
3
beq
1
5
1
nor
6
4
1
add
3
4
5
q Assume
the
branch
is
taken:
How
many
cycles
to
execute
this
code?
3
instr
+
3
stalls
+
4
to
empty
pipe
=
10
cycles
q Assume
the
branch
is
not
taken:
How
many
cycles
execute
this
code?
4
instr
+
4
to
empty
pipe
=
8
cycles
EECS 370: Introduction to The University of Michigan 39
Computer Organization
Calcula=ng
performance
with
control
hazards
Assume
the
first
branch
is
taken
50%
add
1
2
3
of
the
Ime
and
the
loop
iterates
100
beq
1
5
1
Imes,
and
forwarding
for
all
data
lw
6
4
1
hazards.
add
3
4
5
1. How
many
cycles
does
the
code
beq
5
7
-‐5
take
assuming
detect
and
stall
for
halt
control
hazards?
Assume halt is resolved
in WB stage