
The Reorder Buffer

Hardware Speculative Execution

  Need HW buffer for results of uncommitted instructions: reorder buffer
–  Reorder buffer can be operand source
–  Once operand commits, result is found in register
–  3 fields: instr. type, destination, value
–  Use reorder buffer number instead of reservation station number to tag results
–  Instructions commit in order
–  As a result, it's easy to undo speculated instructions on mispredicted branches or on exceptions
Hardware Speculative Execution

  Need HW buffer for results of uncommitted instructions: reorder buffer
–  3 fields:
•  instruction type
•  destination
•  value
–  once operand commits, result is found in register file
–  reservation station points to a ROB entry for pending source operand
–  flush ROB on a mispredicted branch
–  instructions are committed in-order from the ROB
•  handles precise exceptions
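The bullets above can be sketched as a small circular buffer. This is a minimal illustration, not the hardware design from the slides: the names `RobEntry`, `ReorderBuffer`, `allocate`, `write_result`, `commit` and `flush_after` are hypothetical, and the `ready` flag is an assumed addition to the three fields listed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    instr_type: str                # e.g. "ADDD", "SD", "BNEZ"
    dest: Optional[str]            # destination register, None for a branch
    value: Optional[float] = None  # filled in when the result is written
    ready: bool = False            # assumed extra flag: result present


class ReorderBuffer:
    """Circular FIFO: allocate at the tail, commit from the head."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0
        self.tail = 0
        self.count = 0

    def allocate(self, instr_type, dest):
        if self.count == len(self.slots):
            return None  # ROB full: dispatch must stall
        idx = self.tail
        self.slots[idx] = RobEntry(instr_type, dest)
        self.tail = (self.tail + 1) % len(self.slots)
        self.count += 1
        return idx

    def write_result(self, idx, value):
        self.slots[idx].value = value
        self.slots[idx].ready = True

    def commit(self, regfile):
        """Retire the head entry if its result is present, else return None."""
        entry = self.slots[self.head] if self.count else None
        if entry is None or not entry.ready:
            return None
        if entry.dest is not None:
            regfile[entry.dest] = entry.value
        self.slots[self.head] = None
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return entry

    def flush_after(self, branch_idx):
        """Mispredicted branch: discard every entry younger than the branch."""
        new_tail = (branch_idx + 1) % len(self.slots)
        idx = new_tail
        while idx != self.tail:
            self.slots[idx] = None
            self.count -= 1
            idx = (idx + 1) % len(self.slots)
        self.tail = new_tail
```

Because `commit` only ever looks at the head, a younger instruction that finishes early cannot update the register file before an older one, which is what makes the flush safe.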
Four/Five Steps of Speculative ROB/Tomasulo Algorithm
1. Dispatch: get instruction from FP Op Queue
If reservation station and reorder buffer slot free, issue instr & send operands & reorder buffer no. for destination.
2. Issue: wait on operands
When both operands ready then execute; if not ready, watch CDB for result; when both in reservation station, proceed to execute
3. Execute
4. Write result: finish execution (WB)
Write on Common Data Bus to all awaiting FUs & reorder buffer; mark reservation station available.
5. Commit: update register with reorder result
When instr. at head of reorder buffer & result present, update register with result (or store to memory) and remove instr from reorder buffer.
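As a rough sketch of steps 1, 2 and 4, the fragment below resolves a source operand at dispatch and then delivers a result over the CDB. The helper names (`read_operand`, `broadcast`, `ready`) and the dictionary layout are assumptions made for illustration; the values are chosen to mirror the MULD F8, F4, F2 instruction waiting on ROB0.

```python
def read_operand(reg, regfile, reg_tag, rob_values):
    """Return ("value", v) if the operand is available, else ("tag", rob_index)."""
    tag = reg_tag.get(reg)
    if tag is None:
        return ("value", regfile[reg])     # committed value in the register file
    if tag in rob_values:
        return ("value", rob_values[tag])  # finished but uncommitted: read the ROB
    return ("tag", tag)                    # still executing: watch the CDB


def broadcast(stations, rob_values, tag, value):
    """Write result: deliver (tag, value) to the ROB and to waiting stations."""
    rob_values[tag] = value
    for st in stations:
        st["src"] = [("value", value) if s == ("tag", tag) else s
                     for s in st["src"]]


def ready(station):
    """A station may execute once every source holds a value."""
    return all(kind == "value" for kind, _ in station["src"])


regfile = {"F0": 0.0, "F2": 2.0, "F4": 4.0}
reg_tag = {"F4": 0}   # F4 will be produced by ROB0 (the ADDD)
rob_values = {}       # results written to the ROB but not yet committed

muld = {"op": "MULD", "dest": 1,
        "src": [read_operand(r, regfile, reg_tag, rob_values)
                for r in ("F4", "F2")]}
print(muld["src"], ready(muld))   # [('tag', 0), ('value', 2.0)] False

broadcast([muld], rob_values, 0, 2.0)   # ADDD writes 2.0 for ROB0
print(muld["src"], ready(muld))   # [('value', 2.0), ('value', 2.0)] True
```

An instruction dispatched after the broadcast but before ROB0 commits would get the value 2.0 from `rob_values`, which is the sense in which the reorder buffer is an operand source.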
HW Speculative Execution

  The re-order buffer and in-order commit allow us to flush the speculative instructions from the machine when a misprediction is discovered.
  ROB is another possible source of operands
  ROB can provide precise exceptions in an out-of-order machine
  ROB allows us to ignore exceptions raised by speculative code that is later flushed.
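One way to picture the last two bullets: record a fault in the ROB entry and act on it only when that entry reaches the head. The sketch below assumes a `deque` of dictionaries as the ROB; `PreciseException`, `commit_ready`, `squash_after` and `make_entry` are hypothetical names, not part of the slides.

```python
from collections import deque


class PreciseException(Exception):
    """Raised at commit, when the faulting instruction is the oldest in flight."""


def make_entry(op, dest=None, value=None, done=True, fault=False):
    return {"op": op, "dest": dest, "value": value, "done": done, "fault": fault}


def commit_ready(rob, regfile):
    """Commit finished entries in program order, stopping at the first unfinished one."""
    while rob and rob[0]["done"]:
        entry = rob.popleft()
        if entry["fault"]:
            rob.clear()  # younger instructions never reach architectural state
            raise PreciseException(entry["op"])
        if entry["dest"] is not None:
            regfile[entry["dest"]] = entry["value"]


def squash_after(rob, branch_pos):
    """Mispredicted branch at position branch_pos: drop every younger entry."""
    while len(rob) > branch_pos + 1:
        rob.pop()


regs = {"F6": 6.0, "F8": 4.0}
rob = deque([make_entry("ADDD", "F6", 10.0),
             make_entry("BNEZ"),
             make_entry("DIVD", "F8", fault=True)])  # wrong-path instruction that faulted
squash_after(rob, 1)     # the branch turns out to be mispredicted
commit_ready(rob, regs)  # no exception: the faulting entry was flushed
print(regs)              # {'F6': 10.0, 'F8': 4.0}
```

If the faulting entry is on the correct path instead, `commit_ready` raises only after every older entry has updated `regs` and before any younger one has, which is the precise-exception property.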
ROB/Tomasulo – cycle 0
Loop: ADDD F4, F2, F0
      MULD F8, F4, F2
      ADDD F6, F8, F6
      SUBD F8, F2, F0
      SUBI …
      BNEZ …, Loop
Instruction Queue (next to dispatch first): ADDD F4, F2, F0; MULD F8, F4, F2; ADDD F6, F8, F6; SUBD F8, F2, F0; SUBI …
Registers: F0 0.0, F2 2.0, F4 4.0, F6 6.0, F8 8.0
ROB: empty
FP adders (1-3): empty
FP mult's (1-2): empty
integer: empty
ROB/Tomasulo – cycle 1
Instruction Queue (next to dispatch first): MULD F8, F4, F2; ADDD F6, F8, F6; SUBD F8, F2, F0; SUBI; BNEZ
Registers: F0 0.0, F2 2.0, F4 4.0 (ROB0), F6 6.0, F8 8.0
ROB (H = head): 0 ADDD F4 - H; 1-6 empty
FP adders: 1 ADDD 2.0 0.0 ROB0; 2, 3 empty
FP mult's: empty
integer: empty
ROB/Tomasulo – cycle 2
Instruction Queue (next to dispatch first): ADDD F6, F8, F6; SUBD F8, F2, F0; SUBI; BNEZ; ADDD F4, F2, F0
Registers: F0 0.0, F2 2.0, F4 4.0 (ROB0), F6 6.0, F8 8.0 (ROB1)
ROB (H = head): 0 ADDD F4 - H; 1 MULD F8 -; 2-6 empty
FP adders: 1 ADDD 2.0 0.0 ROB0; 2, 3 empty
FP mult's: 1 MULD ROB0 2.0 ROB1; 2 empty
integer: empty
ROB/Tomasulo – cycle 3
Instruction Queue (next to dispatch first): SUBD F8, F2, F0; SUBI; BNEZ; ADDD F4, F2, F0; MULD F8, F4, F2
Registers: F0 0.0, F2 2.0, F4 4.0 (ROB0), F6 6.0 (ROB2), F8 8.0 (ROB1)
ROB (H = head): 0 ADDD F4 - H; 1 MULD F8 -; 2 ADDD F6 -; 3-6 empty
FP adders: 1 ADDD 2.0 0.0 ROB0; 2 ADDD ROB1 6.0 ROB2; 3 empty
FP mult's: 1 MULD ROB0 2.0 ROB1; 2 empty
integer: empty
ROB/Tomasulo – cycle 4
Instruction Queue (next to dispatch first): SUBI; BNEZ; ADDD F4, F2, F0; MULD F8, F4, F2; ADDD F6, F8, F6
Registers: F0 0.0, F2 2.0, F4 4.0 (ROB0), F6 6.0 (ROB2), F8 8.0 (ROB3)
ROB (H = head): 0 ADDD F4 - H; 1 MULD F8 -; 2 ADDD F6 -; 3 SUBD F8 -; 4-6 empty
FP adders: 1 ADDD 2.0 0.0 ROB0; 2 ADDD ROB1 6.0 ROB2; 3 SUBD 2.0 0.0 ROB3
FP mult's: 1 MULD ROB0 2.0 ROB1; 2 empty
integer: empty
ROB/Tomasulo – cycle 5
Instruction Queue (next to dispatch first): BNEZ; ADDD F4, F2, F0; MULD F8, F4, F2; ADDD F6, F8, F6; SUBD F8, F2, F0
Registers: F0 0.0, F2 2.0, F4 4.0 (ROB0), F6 6.0 (ROB2), F8 8.0 (ROB3)
ROB (H = head): 0 ADDD F4 2.0 H; 1 MULD F8 -; 2 ADDD F6 -; 3 SUBD F8 -; 4 SUBI; 5, 6 empty
FP adders: 1 ADDD 2.0 0.0 ROB0; 2 ADDD ROB1 6.0 ROB2; 3 SUBD 2.0 0.0 ROB3
FP mult's: 1 MULD 2.0 2.0 ROB1; 2 empty
integer: SUBI
Result on CDB: 2.0 (ROB0)
ROB/Tomasulo – cycle 6
Instruction Queue (next to dispatch first): ADDD F4, F2, F0; MULD F8, F4, F2; ADDD F6, F8, F6; SUBD F8, F2, F0; SUBI
Registers: F0 0.0, F2 2.0, F4 2.0, F6 6.0 (ROB2), F8 8.0 (ROB3)
ROB (H = head): 0 empty; 1 MULD F8 - H; 2 ADDD F6 -; 3 SUBD F8 -; 4 SUBI; 5 BNEZ; 6 empty
FP adders: 1 empty; 2 ADDD ROB1 6.0 ROB2; 3 SUBD 2.0 0.0 ROB3
FP mult's: 1 MULD 2.0 2.0 ROB1; 2 empty
integer: SUBI, BNEZ
ROB/Tomasulo – cycle 7
Instruction Queue (next to dispatch first): MULD F8, F4, F2; ADDD F6, F8, F6; SUBD F8, F2, F0; SUBI; BNEZ
Registers: F0 0.0, F2 2.0, F4 2.0 (ROB6), F6 6.0 (ROB2), F8 8.0 (ROB3)
ROB (H = head): 0 empty; 1 MULD F8 - H; 2 ADDD F6 -; 3 SUBD F8 -; 4 SUBI; 5 BNEZ; 6 ADDD F4
FP adders: 1 ADDD 2.0 0.0 ROB6; 2 ADDD ROB1 6.0 ROB2; 3 SUBD 2.0 0.0 ROB3
FP mult's: 1 MULD 2.0 2.0 ROB1; 2 empty
integer: SUBI, BNEZ
ROB/Tomasulo – cycle 8
Instruction Queue (next to dispatch first): ADDD F6, F8, F6; SUBD F8, F2, F0; SUBI; BNEZ; ADDD F4, F2, F0
Registers: F0 0.0, F2 2.0, F4 2.0 (ROB6), F6 6.0 (ROB2), F8 8.0 (ROB0)
ROB (H = head): 0 MULD F8 -; 1 MULD F8 - H; 2 ADDD F6 -; 3 SUBD F8 2.0; 4 SUBI; 5 BNEZ; 6 ADDD F4
FP adders: 1 ADDD 2.0 0.0 ROB6; 2 ADDD ROB1 6.0 ROB2; 3 SUBD 2.0 0.0 ROB3
FP mult's: 1 MULD 2.0 2.0 ROB1; 2 MULD ROB6 2.0 ROB0
integer: SUBI, BNEZ
Result on CDB: 2.0 (ROB3)
ROB/Tomasulo – cycle 9
Instruction Queue (next to dispatch first): ADDD F6, F8, F6; SUBD F8, F2, F0; SUBI; BNEZ; ADDD F4, F2, F0
Registers: F0 0.0, F2 2.0, F4 2.0 (ROB6), F6 6.0 (ROB2), F8 8.0 (ROB0)
ROB (H = head): 0 MULD F8 -; 1 MULD F8 - H; 2 ADDD F6 -; 3 SUBD F8 2.0; 4 SUBI; 5 BNEZ; 6 ADDD F4
FP adders: 1 ADDD 2.0 0.0 ROB6; 2 ADDD ROB1 6.0 ROB2; 3 empty
FP mult's: 1 MULD 2.0 2.0 ROB1; 2 MULD ROB6 2.0 ROB0
integer: SUBI, BNEZ
ROB/Tomasulo – cycle 11
Instruction Queue (next to dispatch first): ADDD F6, F8, F6; SUBD F8, F2, F0; SUBI; BNEZ; ADDD F4, F2, F0
Registers: F0 0.0, F2 2.0, F4 2.0 (ROB6), F6 6.0 (ROB2), F8 8.0 (ROB0)
ROB (H = head): 0 MULD F8 -; 1 MULD F8 - H; 2 ADDD F6 -; 3 SUBD F8 2.0; 4 SUBI; 5 BNEZ; 6 ADDD F4 2.0
FP adders: 1 ADDD 2.0 0.0 ROB6; 2 ADDD ROB1 6.0 ROB2; 3 empty
FP mult's: 1 MULD 2.0 2.0 ROB1; 2 MULD 2.0 2.0 ROB0
integer: SUBI, BNEZ
Result on CDB: 2.0 (ROB6)
ROB/Tomasulo – cycle 15
Instruction Queue (next to dispatch first): ADDD F6, F8, F6; SUBD F8, F2, F0; SUBI; BNEZ; ADDD F4, F2, F0
Registers: F0 0.0, F2 2.0, F4 2.0 (ROB6), F6 6.0 (ROB2), F8 8.0 (ROB0)
ROB (H = head): 0 MULD F8 -; 1 MULD F8 4.0 H; 2 ADDD F6 -; 3 SUBD F8 2.0; 4 SUBI val; 5 BNEZ; 6 ADDD F4 2.0
FP adders: 1 empty; 2 ADDD 4.0 6.0 ROB2; 3 empty
FP mult's: 1 MULD 2.0 2.0 ROB1; 2 MULD 2.0 2.0 ROB0
integer: BNEZ
Result on CDB: 4.0 (ROB1)
ROB/Tomasulo – cycle 16
Instruction Queue (next to dispatch first): SUBD F8, F2, F0; SUBI; BNEZ; ADDD F4, F2, F0; MULD F8, F4, F2
Registers: F0 0.0, F2 2.0, F4 2.0 (ROB6), F6 6.0 (ROB2), F8 4.0 (ROB0)
ROB (H = head): 0 MULD F8 -; 1 ADDD F6 -; 2 ADDD F6 - H; 3 SUBD F8 2.0; 4 SUBI val; 5 BNEZ; 6 ADDD F4 2.0
FP adders: 1 ADDD ROB0 ROB2 ROB1; 2 ADDD 4.0 6.0 ROB2; 3 empty
FP mult's: 1 empty; 2 MULD 2.0 2.0 ROB0
integer: BNEZ
Branch: Mispredicted!
ROB/Tomasulo – cycle 17
Instruction Queue: all entries flushed
Registers: F0 0.0, F2 2.0, F4 2.0, F6 6.0 (ROB2), F8 4.0 (ROB3)
Note beside the registers: RST entries from branch
ROB (H = head): 0 flushed; 1 flushed; 2 ADDD F6 - H; 3 SUBD F8 2.0; 4 SUBI val; 5 BNEZ nt; 6 flushed
FP adders: 1 ADDD ROB0 ROB2 ROB1; 2 ADDD 4.0 6.0 ROB2; 3 empty
FP mult's: 1 empty; 2 MULD 2.0 2.0 ROB0
integer: empty
ROB/Tomasulo – cycle 19
Instruction Queue: empty
Registers: F0 0.0, F2 2.0, F4 2.0, F6 6.0 (ROB2), F8 4.0 (ROB3)
ROB (H = head): 0, 1 empty; 2 ADDD F6 10.0 H; 3 SUBD F8 2.0; 4 SUBI val; 5 BNEZ nt; 6 empty
FP adders: 1 empty; 2 ADDD 4.0 6.0 ROB2; 3 empty
FP mult's: empty
integer: empty
Result on CDB: 10.0 (ROB2)
ROB/Tomasulo – cycle 20
Instruction Queue: empty
Registers: F0 0.0, F2 2.0, F4 2.0, F6 10.0, F8 4.0 (ROB3)
ROB (H = head): 0-2 empty; 3 SUBD F8 2.0 H; 4 SUBI val; 5 BNEZ nt; 6 empty
FP adders: empty
FP mult's: empty
integer: empty
ROB/Tomasulo – cycle 21
Instruction Queue: empty
Registers: F0 0.0, F2 2.0, F4 2.0, F6 10.0, F8 2.0
ROB (H = head): 0-3 empty; 4 SUBI val H; 5 BNEZ nt; 6 empty
FP adders: empty
FP mult's: empty
integer: empty
HW support for More ILP

  Speculation: allow an instruction to issue that is dependent on a branch predicted to be taken, without any consequences (including exceptions) if the branch is not actually taken ("HW undo")
  Often combined with dynamic scheduling
  Tomasulo: separate speculative bypassing of results from real bypassing of results
–  When instruction no longer speculative, write results (instruction commit)
–  execute out-of-order but commit in order
Our Favorite Loop Revisited

Loop: LD F0, 0(R1)
      MULTD F4,F0,F2
      SD 0(R1), F4
      SUBI R1,R1,#8
      BNEZ R1, Loop

Analyze impact of:
  Register Renaming (360/91)
  Memory Disambiguation (360/91)
  Control Speculation (ROB, etc.)
  Memory Speculation (some modern superscalars)
Dependence Analysis
[Figure: dependence graph over successive iterations of the loop (nodes LD, MUL, SD, SUBI, BNEZ), built up over four slides. Edge types are added in this order: Data Dependence, Control Dependence, Name Dependence, Memory Dependence.]
Timing Assuming 1 cycle/instr., Infinitely Wide Machine
[Figure: the dependence graph (Memory, Control, Data and Name Dependence edges) with each node labeled by the cycle in which it executes, 1 through 20.]
Critical Path = 20
Let's Add Register Renaming …
[Figure: the dependence graph with Memory, Control, Data and Name Dependence edges.]
Name Dependences Disappear..
[Figure: the same graph after register renaming.]
Timing With Register Renaming
[Figure: the dependence graph with each node labeled by the cycle in which it executes, 1 through 14.]
Critical Path = 14
Let's Add Memory Disambiguation
[Figure: the dependence graph with Memory, Control, Data and Name Dependence edges.]
With Memory Disambiguation
[Figure: the same graph with nodes labeled A added in each iteration; the legend entry Memory Dependence is replaced by Address Dependence.]
Timing with Memory Disambiguation
[Figure: the dependence graph (Address, Control, Data and Name Dependence edges) with each node labeled by the cycle in which it executes, 1 through 11.]
Critical Path = 11
Control Runs Ahead, queuing up work..
Add control speculation (ROB) …
[Figure: the dependence graph with Address, Control, Data and Name Dependence edges.]
With Control Speculation
[Figure: the same graph with control speculation applied.]
Timing With Control Speculation
[Figure: the dependence graph with each node labeled by the cycle in which it executes, 1 through 7.]
Critical Path Length = 7
Add Memory Dependence Speculation
[Figure: the dependence graph with Address, Control, Data and Name Dependence edges.]
With Memory Dependence Speculation
[Figure: the same graph with memory dependence speculation applied.]
Critical Path Length = 7 (again)
The memory critical path disappears; however, the induction variable path remains.
[Figure: the dependence graph with each node labeled by the cycle in which it executes, 1 through 7.]
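The critical path lengths quoted on the timing slides (20, 14, 11, 7) can be reproduced with a small longest-path calculation. This is a sketch under stated assumptions rather than the model behind the figures: the 20-instruction window (a leading SUBI/BNEZ, three full iterations, then LD/MUL/SD) and the dependence rules are chosen to be consistent with the slides, address computation is folded into LD and SD, and `critical_path` with its flags is a hypothetical helper.

```python
# Per-instruction register reads/writes for the loop body (taken from the listing).
LOOP = [
    ("LD",   {"R1"},       {"F0"}, "load"),
    ("MUL",  {"F0", "F2"}, {"F4"}, "alu"),
    ("SD",   {"R1", "F4"}, set(),  "store"),
    ("SUBI", {"R1"},       {"R1"}, "alu"),
    ("BNEZ", {"R1"},       set(),  "branch"),
]
# Assumed window: SUBI, BNEZ, three full iterations, then LD, MUL, SD.
TRACE = LOOP[3:] + LOOP * 3 + LOOP[:3]
MEM = ("load", "store")


def critical_path(trace, renaming=False, disambiguation=False, speculation=False):
    """Longest dependence chain in cycles, with 1 cycle per instruction."""
    done = []  # done[i] = cycle in which instruction i executes
    for j, (_, reads, writes, kind) in enumerate(trace):
        preds = []
        # Data dependences: the most recent earlier writer of each source register.
        for reg in reads:
            for i in range(j - 1, -1, -1):
                if reg in trace[i][2]:
                    preds.append(i)
                    break
        for i in range(j):
            _, reads_i, writes_i, kind_i = trace[i]
            # Name dependences (WAR and WAW) are removed by register renaming.
            if not renaming and writes & (reads_i | writes_i):
                preds.append(i)
            # Memory dependences: keep program order for any pair involving a store.
            if (not disambiguation and kind in MEM and kind_i in MEM
                    and "store" in (kind, kind_i)):
                preds.append(i)
            # Control dependences: wait for every earlier branch to resolve.
            if not speculation and kind_i == "branch":
                preds.append(i)
        done.append(1 + max((done[i] for i in preds), default=0))
    return max(done)


print(critical_path(TRACE))                                       # 20
print(critical_path(TRACE, renaming=True))                        # 14
print(critical_path(TRACE, renaming=True, disambiguation=True))   # 11
print(critical_path(TRACE, True, True, True))                     # 7
```

With every dependence type removed except true data dependences, what is left is the SUBI-to-SUBI chain on R1 followed by one LD, MUL, SD sequence, which is the induction variable path.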
A New Loop
[Figure: dependence graph over successive iterations of the loop (nodes A, LD, ADDU, BNE, BNEZ) with Address, Control, Data and Name Dependence edges.]
Loop: LB R4, 0(R1)
      LB R5, 0(R2)
      ADDU R1,R1,1
      ADDU R2,R2,1
      BNE R4,R5, Mismatch
      BNEZ R4, Loop
What is this code? Is it parallel?
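One plausible reading of the loop is a byte-by-byte comparison of two NUL-terminated strings, in the style of strcmp. The sketch below is a high-level equivalent written for illustration; `first_mismatch` and its return convention are assumptions, not something stated on the slide.

```python
def first_mismatch(a: bytes, b: bytes) -> int:
    """Walk two NUL-terminated buffers in step.

    Returns the index of the first differing byte, or -1 if both
    strings end at the same position without differing.
    """
    i = 0
    while True:
        x, y = a[i], b[i]   # LB R4, 0(R1) ; LB R5, 0(R2)
        i += 1              # ADDU R1,R1,1 ; ADDU R2,R2,1
        if x != y:          # BNE R4,R5, Mismatch
            return i - 1
        if x == 0:          # BNEZ R4, Loop falls through at the terminator
            return -1


print(first_mismatch(b"abc\0", b"abd\0"))  # 2
print(first_mismatch(b"hi\0", b"hi\0"))    # -1
```

The loads and pointer increments of one iteration do not depend on data from the previous one, apart from the increments themselves; only the two branches order the iterations, so how much overlap is possible depends on whether the branches can be speculated.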
A New Loop
[Figure: the dependence graph (Address, Control, Data and Name Dependence edges) with each node labeled by the cycle in which it executes, 1 through 9.]
Loop: LW R4, 0(R1)
      LW R5, 0(R2)
      ADDU R1,R1,4
      ADDU R2,R2,4
      BNE R4,R5, Mismatch
      BNEZ R4, Loop
Renaming + Dynamic Scheduling
[Figure: dependence graph over successive iterations of the loop (nodes A, LD, ADDU, BNE, BNEZ) with Address, Control, Data and Name Dependence edges.]
Renaming, Dynamic Scheduling: No overlap of loop iterations
[Figure: the same graph with each node labeled by the cycle in which it executes, 1 through 9.]
Renaming, Dynamic Scheduling, Speculation - Allows Overlap
[Figure: the same graph with each node labeled by the cycle in which it executes, 1 through 4.]
