Introduction To Instruction Level Parallelism (ILP) : ECE338 Parallel Computer Architecture Spring 2022
fld f0,0(x1)
fadd.d f4,f0,f2
fsd f4,0(x1)
addi x1,x1,-8
bne x1,x2,Loop

sd $s0, 0($s0)
ld $s1, -20($s1)
Potential data dependence. Detectable only at execution (not good).
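Why the store/load pair can only be checked at execution can be sketched in a few lines of Python. This is an illustrative model, not part of the slides: the function names are hypothetical, and it assumes `sd` and `ld` each access 8 bytes. Whether the two accesses touch the same memory depends on the values held in `$s0` and `$s1`, which the compiler generally does not know.

```python
def effective_address(base_value: int, offset: int) -> int:
    """Address computed by a load/store: register value plus immediate offset."""
    return base_value + offset


def accesses_overlap(addr_a: int, addr_b: int, width: int = 8) -> bool:
    """True if two accesses of `width` bytes share at least one byte."""
    return abs(addr_a - addr_b) < width


def store_load_dependence(s0: int, s1: int, width: int = 8) -> bool:
    """Model of `sd $s0, 0($s0)` followed by `ld $s1, -20($s1)`.

    s0 and s1 are the run-time register values (assumed inputs).
    Returns True when the load reads bytes the store just wrote.
    """
    store_addr = effective_address(s0, 0)
    load_addr = effective_address(s1, -20)
    return accesses_overlap(store_addr, load_addr, width)


# Same two instructions, different register values, different answer.
same_location = store_load_dependence(s0=1000, s1=1020)  # both access address 1000
far_apart = store_load_dependence(s0=1000, s1=2000)      # 1000 versus 1980
```

The same instruction pair is dependent for one set of register values and independent for another, which is why the dependence is only detectable once the addresses have been computed.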
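Register Renaming by the compiler
Original code:
fdiv.d f0,f2,f4
fadd.d f6,f0,f8
fsd f6,0(x1)
fsub.d f8,f10,f14    WAR on f8 (fadd.d reads f8 before fsub.d writes it)
fmul.d f6,f10,f8     WAW on f6 (fadd.d and fmul.d both write f6)
After renaming with extra registers S and T:
fdiv.d f0,f2,f4
fadd.d S,f0,f8
fsd S,0(x1)
fsub.d T,f10,f14
fmul.d f6,f10,T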
Not always possible to have two extra registers available (at compile time!)
ECE 338 Parallel Computer Architecture 7
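The effect of renaming can be checked with a small dependence scanner. The sketch below is a simplified illustration, not a method from the slides: the tuple encoding `(opcode, dest, sources)` and the function names are assumptions made here, and the scanner compares every pair of instructions without tracking intervening redefinitions.

```python
def find_dependences(instrs):
    """Return (kind, i, j, register) for every instruction pair i < j.

    Each instruction is (opcode, dest, sources); dest is None for stores.
    Simplification: pairs are compared directly, ignoring any write to the
    same register that happens between instruction i and instruction j.
    """
    found = []
    for i, (_, dest_i, srcs_i) in enumerate(instrs):
        for j in range(i + 1, len(instrs)):
            _, dest_j, srcs_j = instrs[j]
            if dest_i is not None and dest_i in srcs_j:
                found.append(("RAW", i, j, dest_i))   # true data dependence
            if dest_j is not None and dest_j in srcs_i:
                found.append(("WAR", i, j, dest_j))   # antidependence
            if dest_i is not None and dest_i == dest_j:
                found.append(("WAW", i, j, dest_i))   # output dependence
    return found


def name_dependences(instrs):
    """Keep only WAR and WAW, the dependences that renaming can remove."""
    return [d for d in find_dependences(instrs) if d[0] != "RAW"]


original = [
    ("fdiv.d", "f0", ("f2", "f4")),
    ("fadd.d", "f6", ("f0", "f8")),
    ("fsd",    None, ("f6", "x1")),
    ("fsub.d", "f8", ("f10", "f14")),
    ("fmul.d", "f6", ("f10", "f8")),
]

renamed = [
    ("fdiv.d", "f0", ("f2", "f4")),
    ("fadd.d", "S",  ("f0", "f8")),
    ("fsd",    None, ("S", "x1")),
    ("fsub.d", "T",  ("f10", "f14")),
    ("fmul.d", "f6", ("f10", "T")),
]

before = name_dependences(original)
after = name_dependences(renamed)
```

On the original sequence the scanner reports the WAR on f8 and the WAW on f6, plus a WAR on f6 between `fsd` and `fmul.d`. On the renamed sequence only the true (RAW) dependences remain.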
Control Dependence
• Control dependence determines the ordering of an instruction i with respect to a branch instruction
S1;
if p2 {
S2;
}
• Instruction S2 cannot be moved before the p2 branch
• Instruction S1 cannot be moved after the p2 branch so that its
execution is controlled by the branch
– But, in some cases, instruction movement can happen
add x1,x2,x3
beq x12,x0,skip
sub x4,x5,x6
add x5,x4,x9
skip:
or x7,x8,x9
If x4 isn't used after skip, it is possible to move sub before the branch.
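The condition "x4 isn't used after skip" is a liveness question, which can be sketched as a scan over the code that follows the label. This is a minimal illustration under stated assumptions: the `(dest, sources)` encoding and the function names are hypothetical, the code after `skip` is treated as straight-line, nothing beyond the listed instructions is assumed to read the register, and exceptions raised by the moved instruction are ignored.

```python
def is_live(reg, instrs):
    """True if `reg` is read before being overwritten in straight-line code.

    Each instruction is (dest, sources). Assumption: no code beyond `instrs`
    reads `reg`; a real compiler would run a full liveness analysis.
    """
    for dest, srcs in instrs:
        if reg in srcs:
            return True    # old value is consumed, so it must be preserved
        if dest == reg:
            return False   # overwritten before any read, old value is dead
    return False


def can_hoist_above_branch(dest_reg, code_at_branch_target):
    """The write to dest_reg may move above the branch if the taken path
    never observes the value it would clobber."""
    return not is_live(dest_reg, code_at_branch_target)


# Code at `skip:` from the slide: or x7,x8,x9
after_skip = [("x7", ("x8", "x9"))]
hoist_sub = can_hoist_above_branch("x4", after_skip)

# Hypothetical variant in which the code at skip reads x4.
after_skip_reads_x4 = [("x7", ("x4", "x9"))]
hoist_sub_variant = can_hoist_above_branch("x4", after_skip_reads_x4)
```

For the slide's code the check allows the move, because `or x7,x8,x9` does not read x4. In the hypothetical variant the move is rejected, since the taken path would see the value written by `sub`.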
Instruction Level Parallelism
• ILP is restricted by all these dependences
• ILP extraction is usually transparent to the programmer