Chapter 6: Superscalar: Adapted From Mary Jane Irwin at Penn State University For
[Figure: three timing diagrams plotting instructions against time. The first two show Instructions 1 to 5; the third shows Instructions 1 to 6.]
Superpipelined vs Superscalar
Superpipelined processors have longer instruction latency than superscalar (SS)
processors, which can degrade performance in the presence of true dependencies.
Superscalar processors are more susceptible to resource conflicts, but these can
be resolved with additional hardware.
[Figure: pipeline diagram (IF, ID, EX, WB) for six instructions, I1 to I6, listed in instruction order. I1 is drawn with two EX stages. I4 uses the same function unit as I3; I5 uses the data value produced by I4; I6 uses the same function unit as I5. The machine can fetch/decode 2 and commit 2 instructions in parallel, and needs forwarding hardware. 8 cycles in total.]
[Figure: pipeline diagram (IF, ID, EX, WB) for six instructions, I1 to I6, listed in instruction order. I1 is drawn with two EX stages. I4 uses the same function unit as I3; I5 uses the data value produced by I4; I6 uses the same function unit as I5. 7 cycles in total.]
[Figure: pipeline diagram (IF, ID, EX, WB) for six instructions, I1 to I6, listed in instruction order. I1 is drawn with two EX stages. I4 uses the same function unit as I3; I5 uses the data value produced by I4; I6 uses the same function unit as I5. 6 cycles in total.]
issued in-order through a FIFO queue to maintain correct data flow. If there is
not a free reservation station of the appropriate type, the instruction queue
stalls.
Read operands - wait until there are no data hazards, then read the operands
Write result - send the result on the common data bus (CDB) to be grabbed by any
waiting registers or reservation stations
All instructions pass through the issue stage in order, but instructions stalling on
operands can be bypassed by later instructions whose operands are available.
RAW hazards are handled by delaying instructions in reservation stations until all
their operands are available.
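
The issue, read-operands, and write-result steps described above can be illustrated with a small model. This is only a sketch under simplifying assumptions, not the design from the slides: the names `Station`, `TomasuloSketch`, `issue`, `ready_tags`, and `execute_and_broadcast` are hypothetical, all stations share a single pool rather than being split by function-unit type, and execution latency is ignored.

```python
import operator
from dataclasses import dataclass
from typing import Optional

OPS = {"add": operator.add, "mul": operator.mul}


@dataclass
class Station:
    """One reservation-station entry (hypothetical layout)."""
    op: str
    dest: str
    # An operand is ready when its tag is None; otherwise the tag names
    # the station that will produce the value.
    tag1: Optional[int] = None
    val1: Optional[float] = None
    tag2: Optional[int] = None
    val2: Optional[float] = None

    def ready(self) -> bool:
        return self.tag1 is None and self.tag2 is None

    def snoop(self, tag: int, value: float) -> None:
        """Grab a value from the CDB if this entry is waiting on its tag."""
        if self.tag1 == tag:
            self.tag1, self.val1 = None, value
        if self.tag2 == tag:
            self.tag2, self.val2 = None, value


class TomasuloSketch:
    def __init__(self, regs, num_stations):
        self.regs = dict(regs)   # register name -> value
        self.producer = {}       # register name -> tag of pending writer
        self.stations = {}       # tag -> Station
        self.free_tags = list(range(num_stations))

    def _read(self, reg):
        # RAW hazard: record the producer's tag instead of a stale value.
        if reg in self.producer:
            return self.producer[reg], None
        return None, self.regs[reg]

    def issue(self, op, dest, src1, src2):
        """In-order issue; returns None (stall) if no station is free."""
        if not self.free_tags:
            return None
        tag = self.free_tags.pop(0)
        st = Station(op, dest)
        st.tag1, st.val1 = self._read(src1)
        st.tag2, st.val2 = self._read(src2)
        self.stations[tag] = st
        self.producer[dest] = tag
        return tag

    def ready_tags(self):
        return [t for t, s in self.stations.items() if s.ready()]

    def execute_and_broadcast(self, tag):
        """Execute a ready entry and broadcast its result on the CDB."""
        st = self.stations.pop(tag)
        assert st.ready(), "operands must be available before execution"
        result = OPS[st.op](st.val1, st.val2)
        for other in self.stations.values():
            other.snoop(tag, result)
        # Only the most recent writer of a register updates it.
        if self.producer.get(st.dest) == tag:
            self.regs[st.dest] = result
            del self.producer[st.dest]
        self.free_tags.append(tag)
        return result


m = TomasuloSketch({"F2": 2.0, "F4": 3.0, "F8": 1.0}, num_stations=3)
i1 = m.issue("mul", "F0", "F2", "F4")   # operands ready
i2 = m.issue("add", "F6", "F0", "F8")   # waits on I1's result (RAW)
i3 = m.issue("add", "F10", "F2", "F4")  # operands ready, may bypass I2
print(m.ready_tags())                   # [0, 2]
m.execute_and_broadcast(i3)             # I3 completes before I2
m.execute_and_broadcast(i1)             # broadcast wakes up I2
print(m.ready_tags())                   # [1]
```

In this run the second instruction sits in its station until the first broadcasts its result, while the third, issued later, executes first because its operands were already available.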
[Figure: reservation stations, each entry holding Tag/Data operand pairs. Entries numbered 8 to 10 feed the FP Adders; entries numbered 11 to 13 feed the FP Multipliers.]
Load and Store instructions to different memory addresses can be done in any order, but the
relative order of a Store and other accesses to the same memory location must be maintained.
Before issuing a Load from the instruction queue, make sure that its effective address does
not match the address of any Store instruction in the Store buffers. If there is a match, stall
the instruction queue until the corresponding Store completes. (Alternatively, the Store
could forward the value to the corresponding Load.)
Before issuing a Store from the instruction queue, make sure that its effective address does
not match the address of any Store or Load instruction in the Store or Load buffers.
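
The two address checks above can be sketched as follows. This is a minimal illustration, assuming each buffer simply records the effective addresses of in-flight accesses; the class `MemoryBuffers` and its method names are hypothetical, and the store-to-load forwarding alternative is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryBuffers:
    """Effective addresses of in-flight Loads and Stores (hypothetical)."""
    load_addrs: list = field(default_factory=list)
    store_addrs: list = field(default_factory=list)

    def issue_load(self, addr: int) -> bool:
        # A Load must not pass a pending Store to the same address.
        if addr in self.store_addrs:
            return False  # stall the instruction queue
        self.load_addrs.append(addr)
        return True

    def issue_store(self, addr: int) -> bool:
        # A Store must not pass a pending Store or Load to the same address.
        if addr in self.store_addrs or addr in self.load_addrs:
            return False  # stall the instruction queue
        self.store_addrs.append(addr)
        return True

    def complete_load(self, addr: int) -> None:
        self.load_addrs.remove(addr)

    def complete_store(self, addr: int) -> None:
        self.store_addrs.remove(addr)


buffers = MemoryBuffers()
buffers.issue_store(0x100)         # Store to 0x100 is now pending
print(buffers.issue_load(0x100))   # False: must wait for the Store
print(buffers.issue_load(0x200))   # True: different address
buffers.complete_store(0x100)
print(buffers.issue_load(0x100))   # True: the Store has completed
```

A `False` return stands in for stalling the instruction queue; accesses to different addresses proceed in any order.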