Professional Documents
Culture Documents
Reorder Buffer
Reorder Buffer
• destination
• value
1
2 1
integer 3 2
FP adders FP mult’s
ROB
0 ADDD F4 - H
ROB/Tomasulo – cycle 1 1
2
3
Loop:ADDD F4, F2, F0 Instruction Queue
4
5
MULD F8, F4, F2 6
BNEZ
ADDD F6, F8, F6
SUBI
SUBD F8, F2, F0 F0 0.0
SUBD F8, F2, F0 F2 2.0
SUBI … F4 4.0 ROB0
ADDD F6, F8, F6
BNEZ … F6 6.0
MULD F8, F4, F2 F8 8.0
FP adders FP mult’s
ROB
0 ADDD F4 - H
ROB/Tomasulo – cycle 2 1 MULD F8 -
2
3
Loop:ADDD F4, F2, F0 Instruction Queue 5
4
MULD F8, F4, F2
ADDD F4, F2, F0 6
ADDD F6, F8, F6
BNEZ 0.0
SUBD F8, F2, F0 F0
SUBI F2 2.0
SUBI … F4 4.0 ROB0
SUBD F8, F2, F0
BNEZ … F6 6.0
ADDD F6, F8, F6 F8 8.0 ROB1
FP adders FP mult’s
ROB
0 ADDD F4 - H
ROB/Tomasulo – cycle 3 1 MULD F8 -
2 ADDD F6 -
3
Loop:ADDD F4, F2, F0 Instruction Queue 5
4
MULD F8, F4, F2
MULD F8, F4, F2 6
ADDD F6, F8, F6
ADDD F4, F2, F0
SUBD F8, F2, F0 F0 0.0
BNEZ F2 2.0
SUBI … F4 4.0 ROB0
SUBI
BNEZ … F6 6.0 ROB2
SUBD F8, F2, F0 F8 8.0 ROB1
FP adders FP mult’s
ROB
0 ADDD F4 - H
ROB/Tomasulo – cycle 4 1 MULD F8 -
2 ADDD F6 -
3 SUBD F8 -
Loop:ADDD F4, F2, F0 Instruction Queue 5
4
MULD F8, F4, F2
ADDD F6, F8, F6 6
ADDD F6, F8, F6
MULD F8, F4, F2
SUBD F8, F2, F0 F0 0.0
ADDD F4, F2, F0 F2 2.0
SUBI … F4 4.0 ROB0
BNEZ
BNEZ … F6 6.0 ROB2
SUBI F8 8.0 ROB3
FP adders FP mult’s
ROB
0 ADDD F4 2.0 H
ROB/Tomasulo – cycle 5 1 MULD F8 -
2 ADDD F6 -
3 SUBD F8 -
Loop:ADDD F4, F2, F0 Instruction Queue 5
4 SUBI
MULD F8, F4, F2
SUBD F8, F2, F0 6
ADDD F6, F8, F6
ADDD F6, F8, F6 0.0
SUBD F8, F2, F0 F0
MULD F8, F4, F2 F2 2.0
SUBI … F4 4.0 ROB0
ADDD F4, F2, F0
BNEZ … F6 6.0 ROB2
BNEZ F8 8.0 ROB3
FP adders FP mult’s
2.0 (ROB0)
ROB
0
ROB/Tomasulo – cycle 6 1 MULD F8 - H
2 ADDD F6 -
3 SUBD F8 -
Loop:ADDD F4, F2, F0 Instruction Queue
4 SUBI
5 BNEZ
MULD F8, F4, F2 6
SUBI
ADDD F6, F8, F6
SUBD F8, F2, F0 0.0
SUBD F8, F2, F0 F0
ADDD F6, F8, F6 F2 2.0
SUBI … F4 2.0
MULD F8, F4, F2
BNEZ … F6 6.0 ROB2
ADDD F4, F2, F0 F8 8.0 ROB3
SUBI 1
BNEZ 2 ADDD ROB1 6.0 ROB2 1 MULD 2.0 2.0 ROB1
integer 3 SUBD 2.0 0.0 ROB3 2
FP adders FP mult’s
ROB
0
ROB/Tomasulo – cycle 7 1 MULD F8 - H
2 ADDD F6 -
3 SUBD F8 -
Loop:ADDD F4, F2, F0 Instruction Queue
4 SUBI
5 BNEZ
MULD F8, F4, F2 6 ADDD F4
BNEZ
ADDD F6, F8, F6
SUBI 0.0
SUBD F8, F2, F0 F0
SUBD F8, F2, F0 F2 2.0
SUBI … F4 2.0 ROB6
ADDD F6, F8, F6
BNEZ … F6 6.0 ROB2
MULD F8, F4, F2 F8 8.0 ROB3
FP adders FP mult’s
ROB
0 MULD F8 -
ROB/Tomasulo – cycle 8 1 MULD F8 - H
2 ADDD F6 -
3 SUBD F8 2.0
Loop:ADDD F4, F2, F0 Instruction Queue 5
4 SUBI
BNEZ
MULD F8, F4, F2 ADDD F4
ADDD F4, F2, F0 6
ADDD F6, F8, F6
BNEZ 0.0
SUBD F8, F2, F0 F0
SUBI F2 2.0
SUBI … F4 2.0 ROB6
SUBD F8, F2, F0
BNEZ … F6 6.0 ROB2
ADDD F6, F8, F6 F8 8.0 ROB0
FP adders FP mult’s
2.0 (ROB3)
ROB
0 MULD F8 -
ROB/Tomasulo – cycle 9 1 MULD F8 - H
2 ADDD F6 -
3 SUBD F8 2.0
Loop:ADDD F4, F2, F0 Instruction Queue 5
4 SUBI
BNEZ
MULD F8, F4, F2 ADDD F4
ADDD F4, F2, F0 6
ADDD F6, F8, F6
BNEZ 0.0
SUBD F8, F2, F0 F0
SUBI F2 2.0
SUBI … F4 2.0 ROB6
SUBD F8, F2, F0
BNEZ … F6 6.0 ROB2
ADDD F6, F8, F6 F8 8.0 ROB0
FP adders FP mult’s
ROB
0 MULD F8 -
ROB/Tomasulo – cycle 11 1 MULD F8 - H
2 ADDD F6 -
3 SUBD F8 2.0
Loop:ADDD F4, F2, F0 Instruction Queue 5
4 SUBI
BNEZ
MULD F8, F4, F2 ADDD F4 2.0
ADDD F4, F2, F0 6
ADDD F6, F8, F6
BNEZ 0.0
SUBD F8, F2, F0 F0
SUBI F2 2.0
SUBI … F4 2.0 ROB6
SUBD F8, F2, F0
BNEZ … F6 6.0 ROB2
ADDD F6, F8, F6 F8 8.0 ROB0
FP adders FP mult’s
2.0 (ROB6)
ROB
0 MULD F8 -
ROB/Tomasulo – cycle 15 1 MULD F8 4.0 H
2 ADDD F6 -
3 SUBD F8 2.0
Loop:ADDD F4, F2, F0 Instruction Queue 5
4 SUBI val
BNEZ
MULD F8, F4, F2 ADDD F4 2.0
ADDD F4, F2, F0 6
ADDD F6, F8, F6
BNEZ 0.0
SUBD F8, F2, F0 F0
SUBI F2 2.0
SUBI … F4 2.0 ROB6
SUBD F8, F2, F0
BNEZ … F6 6.0 ROB2
ADDD F6, F8, F6 F8 8.0 ROB0
1
BNEZ 2 ADDD 4.0 6.0 ROB2 1 MULD 2.0 2.0 ROB1
integer 3 2 MULD 2.0 2.0 ROB0
FP adders FP mult’s
4.0 (ROB1)
ROB
0 MULD F8 -
ROB/Tomasulo – cycle 16 1 ADDD F6 -
2 ADDD F6 - H
3 SUBD F8 2.0
Loop:ADDD F4, F2, F0 Instruction Queue 5
4 SUBI val
BNEZ
MULD F8, F4, F2 ADDD F4 2.0
MULD F8, F4, F2 6
ADDD F6, F8, F6
ADDD F4, F2, F0
SUBD F8, F2, F0 F0 0.0
BNEZ F2 2.0
SUBI … F4 2.0 ROB6
SUBI
BNEZ … F6 6.0 ROB2
SUBD F8, F2, F0 F8 4.0 ROB0
FP adders FP mult’s
ROB
0
ROB/Tomasulo – cycle 19 1
2 ADDD F6 10.0 H
3 SUBD F8 2.0
Loop:ADDD F4, F2, F0 Instruction Queue
4 SUBI val
5 BNEZ nt
MULD F8, F4, F2 6
ADDD F6, F8, F6
SUBD F8, F2, F0 F0 0.0
F2 2.0
SUBI … F4 2.0
BNEZ … F6 6.0 ROB2
F8 4.0 ROB3
1
2 ADDD 4.0 6.0 ROB2 1
integer 3 2
FP adders FP mult’s
10.0 (ROB2)
ROB
0
ROB/Tomasulo – cycle 20 1
2
3 SUBD F8 2.0 H
Loop:ADDD F4, F2, F0 Instruction Queue
4 SUBI val
5 BNEZ nt
MULD F8, F4, F2 6
ADDD F6, F8, F6
SUBD F8, F2, F0 F0 0.0
F2 2.0
SUBI … F4 2.0
BNEZ … F610.0
F8 4.0 ROB3
1
2 1
integer 3 2
FP adders FP mult’s
ROB
0
ROB/Tomasulo – cycle 21 1
2
3
Loop:ADDD F4, F2, F0 Instruction Queue
4 SUBI val H
5 BNEZ nt
MULD F8, F4, F2 6
ADDD F6, F8, F6
SUBD F8, F2, F0 F0 0.0
F2 2.0
SUBI … F4 2.0
BNEZ … F6 10.0
F8 2.0
1
2 1
integer 3 2
FP adders FP mult’s
HW support for More ILP
Analysis BNEZ
LD SUBI
MUL
BNEZ
SD
LD SUBI
MUL
BNEZ
SD
LD
SUBI
Data Dependence
MUL
SD BNEZ
Analysis BNEZ
LD SUBI
MUL
BNEZ
SD
LD SUBI
MUL
BNEZ
SD
Control Dependence
LD
SUBI
Data Dependence
MUL
SD BNEZ
Analysis BNEZ
LD SUBI
MUL
BNEZ
SD
LD SUBI
MUL
BNEZ
SD
Control Dependence
LD
SUBI
Data Dependence
MUL
Name Dependence SD BNEZ
Analysis BNEZ
LD SUBI
MUL
BNEZ
SD
LD SUBI
8 11
Memory Dependence 9
12
10
Control Dependence
13
16
Data Dependence
14
Name Dependence 15 17
3 6
4
7
5
8 11
Memory Dependence 9
12
10
Control Dependence
13
16
Data Dependence
14
Name Dependence 15 17
Renaming … BNEZ
LD SUBI
MUL
BNEZ
SD
LD SUBI
Disappear.. BNEZ
LD SUBI
MUL
BNEZ
SD
LD SUBI
Register Renaming 2
3 3
4
4
5
6 5
Memory Dependence 7
6
8
Control Dependence
9
7
Data Dependence
10
Name Dependence 11 8
3 3
4
4
5
6 5
Memory Dependence 7
6
8
Control Dependence
9
7
Data Dependence
10
Name Dependence 11 8
Disambiguation BNEZ
LD SUBI
MUL
BNEZ
SD
LD SUBI
Address Dependence 5 6
6
7
Control Dependence 7
7
7
Data Dependence
7 8
Name Dependence 9 8
9
Loop: LD F0, 0(R1) 9
9 10
MULTD F4,F0,F2
11
SD 0(R1), F4
SUBI R1,R1,#8
BNEZ R1, Loop
1
Control Runs Ahead,
Critical Path = 11 2
queuing up work..
3
3 3
3 4
4
5
5
5 5
Address Dependence 5 6
6
7
Control Dependence 7
7
7
Data Dependence
7 8
Name Dependence 9 8
9
Loop: LD F0, 0(R1) 9
9 10
MULTD F4,F0,F2
11
SD 0(R1), F4
SUBI R1,R1,#8
BNEZ R1, Loop
Add control SUBI
speculation (ROB)
BNEZ
… A
LD SUBI
A MUL
BNEZ
SD
A
LD SUBI
Speculation A
BNEZ
LD SUBI
A MUL
BNEZ
SD
A
LD SUBI
2
2
2 2
2 3
3
4
3
3 3
Address Dependence 3 4
3
5
Control Dependence 4
4 4
Data Dependence
4 5
Name Dependence 6 4
5
Loop: LD F0, 0(R1) 5
5 6
MULTD F4,F0,F2
7
SD 0(R1), F4
SUBI R1,R1,#8
BNEZ R1, Loop
Critical Path
1
Lengths = 7
2
2
2 2
2 3
3
4
3
3 3
Address Dependence 3 4
3
5
Control Dependence 4
4 4
Data Dependence
4 5
Name Dependence 6 4
5
Loop: LD F0, 0(R1) 5
5 6
MULTD F4,F0,F2
7
SD 0(R1), F4
SUBI R1,R1,#8
BNEZ R1, Loop
Add Memory SUBI
Dependence
BNEZ
Speculation A
LD SUBI
A MUL
BNEZ
SD
A
LD SUBI
Address Dependence 3 4
4
5
Control Dependence 4
4
4
Data Dependence
4 5
Name Dependence 6 5
5
Loop: LD F0, 0(R1) 5
5 6
MULTD F4,F0,F2
7
SD 0(R1), F4
SUBI R1,R1,#8
BNEZ R1, Loop
A New Loop A
A
LD
ADDU ADDU
LD
BNE
BNEZ
A ADDU ADDU
A
LD
Address Dependence LD
Data Dependence
BNEZ
Name Dependence
A ADDU ADDU
Loop: LB R4, 0(R1) A
LD
LB R5, 0(R2) LD
ADDU R1,R1,1
BNE
ADDU R2,R2,1
BNE R4,R5, Mismatch
BNEZ
BNEZ R4, Loop
what is this code? Is it parallel?
A New Loop 1
1
1
2 2
1
4 5 5
4
4
Address Dependence 4
Control Dependence 5
Data Dependence
6
Name Dependence
7 8 8
Loop: LW R4, 0(R1) 7
7
LW R5, 0(R2) 7
ADDU R1,R1,4
8
ADDU R2,R2,4
BNE R4,R5, Mismatch
9
BNEZ R4, Loop
Renaming A
A
LD
ADDU ADDU
+ LD
BNEZ
A ADDU ADDU
A
LD
Address Dependence LD
Data Dependence
BNEZ
Name Dependence
A ADDU ADDU
Loop: LW R4, 0(R1) A
LD
LW R5, 0(R2) LD
ADDU R1,R1,4
BNE
ADDU R2,R2,4
BNE R4,R5, Mismatch
BNEZ
BNEZ R4, Loop
1
Renaming, 1
1
1 1
1
Dynamic Scheduling: 2
No overlap of loop
iterations 3
4 4 4
4
4
Address Dependence 4
Control Dependence 5
Data Dependence
6
Name Dependence
7 7 7
Loop: LW R4, 0(R1) 7
7
LW R5, 0(R2) 7
ADDU R1,R1,4
8
ADDU R2,R2,4
BNE R4,R5, Mismatch
9
BNEZ R4, Loop
Renaming, 1
1
1
Dynamic Scheduling, 1
1 1
Speculation 2
- Allows Overlap 2
2 2 2
2
2
Address Dependence 2
Control Dependence 3
Data Dependence
3
Name Dependence
3 3 3
Loop: LW R4, 0(R1) 3
3
LW R5, 0(R2) 3
ADDU R1,R1,4
4
ADDU R2,R2,4
BNE R4,R5, Mismatch
4
BNEZ R4, Loop