Computer Architecture Unit 6

Unit 6 Instruction-Level Parallelism and its Exploitation

Structure:
6.1 Introduction
Objectives
6.2 Dynamic Scheduling
Advantages of dynamic scheduling
Limitations of dynamic scheduling
6.3 Overcoming Data Hazards
6.4 Dynamic Scheduling Algorithm – The Tomasulo Approach
6.5 High performance Instruction Delivery
Branch target buffer
Advantages of branch target buffer
6.6 Hardware-based Speculation
6.7 Summary
6.8 Glossary
6.9 Terminal Questions
6.10 Answers

6.1 Introduction
In pipelining, two or more instructions that are independent of each other
can overlap in execution. This potential for overlap is known as
instruction-level parallelism (ILP), so called because the instructions can
be evaluated in parallel. The amount of parallelism is quite small in
straight-line code sequences, which have no branches except at the entry and
the exit. The simplest and most widely used way to increase parallelism is
to exploit parallelism among the iterations of a loop. This is termed
“loop-level parallelism”.
In the previous unit, you studied the design space of pipelines, including
aspects such as the pipelined execution of integer and Boolean instructions
and the pipelined processing of loads and stores. In this unit, we will
discuss how hazards are overcome with dynamic scheduling, together with its
examples and algorithm. We will also examine high-performance instruction
delivery and hardware-based speculation.

Manipal University of Jaipur B1648 Page No. 126



Objectives:
After studying this unit, you should be able to:
• describe the process of overcoming data hazards with dynamic
scheduling
• give examples of dynamic scheduling
• describe the Tomasulo approach to dynamic scheduling
• identify techniques for overcoming data hazards with dynamic scheduling
• analyse the concept of high-performance instruction delivery
• explain hardware-based speculation

6.2 Dynamic Scheduling


A simple pipeline fetches an instruction and issues it, unless there is a
data dependence between an instruction already in the pipeline and the
fetched instruction that cannot be hidden with bypassing or forwarding.
When the data dependence cannot be hidden, the hazard detection hardware
stalls the instruction pipeline. In this scenario, new instructions are
neither fetched nor issued until the dependence is resolved. One way to
reduce these stalls is to schedule the instructions so that dependent
instructions are separated, which decreases the actual hazards and their
resultant stalls. When this scheduling is done by the compiler before the
program runs, it is termed static scheduling.
The other category of scheduling is dynamic scheduling, which is
hardware-based. In this approach, the hardware rearranges instruction
execution at run time to reduce the stalls, while maintaining the data flow
and exception behaviour of the program.
6.2.1 Advantages of dynamic scheduling
Dynamic scheduling has the following advantages:
1. It handles cases where the data dependences between instructions are
not known at compile time.
2. It simplifies the task of the compiler.
3. It permits code compiled with one pipeline in mind to execute
efficiently on a different pipeline.


6.2.2 Limitations of dynamic scheduling


Dynamic scheduling has several limitations:
• The pipelining techniques we have used so far use in-order instruction
issue. This is a major limitation. In-order issue means that if an
instruction is stalled in the pipeline, no later instruction can
proceed. Therefore, when two closely spaced instructions are dependent
on each other, a stall occurs.
• When there are multiple functional units, these units may sit idle.
Suppose an instruction j depends on a long-running instruction i that is
currently executing in the pipeline. In that case, all instructions
following j must be stalled until instruction i has finished and
instruction j can begin execution. For example, consider this code
sequence:

Here F0, F1, F2, …, F14 are floating-point registers (FPRs), and DIVD,
ADDD and SUBD are floating-point operations on double-precision operands
(denoted by the suffix D). The dependence of ADDD on DIVD causes a stall
in the pipeline, so the SUBD instruction cannot execute even though it does
not depend on any instruction in the pipeline. If instructions were not
required to execute in program order, this limitation could be removed.
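The cost of in-order execution for this example can be illustrated with a minimal Python sketch. The code sequence itself is not reproduced in this copy of the unit, so the three instructions below are an assumed reconstruction that is consistent with the description (ADDD depends on DIVD, SUBD is independent of both). The latencies, the one-instruction-per-cycle issue rule and the name start_cycles are also assumptions made for illustration, and structural, WAR and WAW hazards are ignored.

```python
# Assumed reconstruction of the example: (operation, destination, source1, source2).
PROGRAM = [
    ("DIVD", "F0", "F2", "F4"),
    ("ADDD", "F10", "F0", "F8"),   # reads F0, which DIVD produces
    ("SUBD", "F12", "F8", "F14"),  # independent of DIVD and ADDD
]

# Hypothetical execution latencies in clock cycles.
LATENCY = {"DIVD": 40, "ADDD": 2, "SUBD": 2}


def start_cycles(program, in_order):
    """Return the cycle in which each instruction begins execution.

    Only RAW dependences are modelled. With in_order=True an instruction
    may not start before the instruction preceding it has started.
    """
    ready = {}       # register -> cycle at which its new value is available
    starts = []
    prev_start = 0
    for position, (op, dest, src1, src2) in enumerate(program):
        earliest = position + 1  # one instruction enters the pipeline per cycle
        operands_ready = max(ready.get(src1, 0), ready.get(src2, 0))
        start = max(earliest, operands_ready)
        if in_order:
            start = max(start, prev_start + 1)
        starts.append(start)
        ready[dest] = start + LATENCY[op]
        prev_start = start
    return starts


print(start_cycles(PROGRAM, in_order=True))   # [1, 41, 42]
print(start_cycles(PROGRAM, in_order=False))  # [1, 41, 3]
```

Under these assumptions SUBD starts in cycle 42 when execution is in order, but in cycle 3 when it is allowed to start as soon as its operands are ready.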
In the DLX pipeline (DLX is a RISC processor architecture), both
structural and data hazards are checked during instruction decode (ID). If
an instruction can execute properly, it is issued from ID. To allow the
SUBD to begin executing, the issue process must be separated into two
parts:
• Firstly, checking for structural hazards
• Secondly, waiting for the absence of data hazards
Structural hazards can still be checked when an instruction is issued.
Therefore, in-order instruction issue is still used. However, an
instruction should begin execution as soon as its data operands are
available.


Therefore, a pipeline that executes instructions out of order also
completes them out of order.
Out-of-order completion, however, creates difficulties in exception
handling. The exceptions generated in a dynamically scheduled processor
may be imprecise, because an instruction may have completed before an
earlier-issued instruction raises an exception. In such a scenario, it is
difficult to restart execution after the interrupt.
To allow out-of-order execution, the ID (Instruction Decode) pipe stage
must be split into two stages:
1. Issue – In this stage, the instructions are decoded and checked for
structural hazards.
2. Read operands – In this stage, the instruction waits until no data
hazards remain and then reads its operands.
The IF (instruction fetch) stage precedes the issue stage. It fetches
instructions into a latch or a queue, from which they are then issued. The
EX (Execution) stage follows the read operands stage. Depending on the
complexity of the operation, execution may take multiple cycles.
Consequently, we must distinguish between the point at which an
instruction begins execution and the point at which it completes
execution. Doing so allows multiple instructions to be in execution at the
same time.
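The two checks that result from splitting ID can be sketched in Python as follows. This is only an illustration under stated assumptions: the Instr record, the functional-unit names (fp_div, fp_add_1, fp_add_2) and the two helper functions are hypothetical and are not part of DLX, and the register numbers follow an assumed version of the DIVD/ADDD/SUBD example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str
    dest: str
    srcs: tuple
    unit: str  # functional unit the instruction needs (hypothetical name)


def can_issue(instr, busy_units):
    """Issue stage: pass only if the required functional unit is free,
    i.e. there is no structural hazard."""
    return instr.unit not in busy_units


def can_read_operands(instr, pending_writes):
    """Read-operands stage: pass only if no source register is still
    waiting to be written by an earlier instruction (no RAW hazard)."""
    return not any(src in pending_writes for src in instr.srcs)


# Assumed machine state: DIVD is executing, ADDD has issued and is waiting.
busy_units = {"fp_div", "fp_add_1"}
pending_writes = {"F0", "F10"}  # destinations of DIVD and ADDD

addd = Instr("ADDD", "F10", ("F0", "F8"), "fp_add_1")
subd = Instr("SUBD", "F12", ("F8", "F14"), "fp_add_2")

print(can_read_operands(addd, pending_writes))  # False: F0 is not ready yet
print(can_issue(subd, busy_units))              # True: fp_add_2 is free
print(can_read_operands(subd, pending_writes))  # True: F8 and F14 are ready
```

In this sketch SUBD passes both checks while ADDD is still waiting for F0, so SUBD can enter execution first. If SUBD needed the same adder as ADDD, it would be held at the issue stage instead.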
Self Assessment Questions
1. The methodology that separates dependent instructions in order to
minimize data/structural hazards and the consequent stalls is termed
__________________.
2. To commence the execution of the SUBD, we need to separate
the issue process into two parts: firstly ___________ and secondly
___________.
3. The _______________ stage precedes the issue stage.
4. The ________________ stage follows the read operands stage, as in
the DLX pipeline.

6.3 Overcoming Data Hazards


Now let us discuss the methods of overcoming data hazards with dynamic
scheduling in this section.


Dynamic Scheduling with a Scoreboard


In a dynamically scheduled pipeline, all instructions pass through the issue
stage in order (in-order issue); however, they can be stalled or bypass each
other in the second stage (read operands) and thus enter execution out of
order. Scoreboarding is a technique that permits out-of-order instruction
execution when sufficient resources are available and there are no data
dependences.
The technique is named after the scoreboard of the CDC 6600, in which this
capability was developed. (The CDC 6600 was a mainframe computer
manufactured by Control Data Corporation.)
Out-of-order instruction execution may give rise to WAR (Write after Read,
a type of data hazard) hazards, which do not occur in the DLX floating-point
and integer pipelines.
Suppose that the destination of SUBD in the earlier example is F8. The code
sequence then becomes:

In this example there is an antidependence between ADDD and SUBD through
F8. If SUBD executes and writes F8 before ADDD has read its operands, the
antidependence is violated and ADDD uses the wrong value. Similarly, to
avoid violating output dependences, WAW (Write after Write) data hazards
must be detected. The scoreboard technique helps to minimize or remove both
structural and data hazards by stalling the later instruction involved in
the dependence. The goal of a scoreboard is to execute one instruction in
each clock cycle (when no structural hazards exist). Therefore, when an
instruction is stalled, other independent instructions may be issued and
executed. The scoreboard takes full responsibility for issuing and
executing instructions, including all hazard detection. Taking advantage of
out-of-order execution requires several instructions to be in execution
simultaneously. This can be achieved in either of two ways:
1. By utilizing pipelined functional units
2. By using multiple functional units
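The RAW, WAR and WAW hazards discussed in this section can be detected by comparing the registers of two instructions. The following Python sketch shows one way to do this. The code sequence with F8 as the SUBD destination is not reproduced in this copy of the unit, so the instructions below are an assumed reconstruction that matches the description, and the function name hazards is hypothetical.

```python
def hazards(earlier, later):
    """Return the set of data hazards that `later` has with `earlier`.

    Each instruction is a tuple (operation, destination, source1, source2),
    and `earlier` precedes `later` in program order.
    """
    _, e_dest, *e_srcs = earlier
    _, l_dest, *l_srcs = later
    found = set()
    if e_dest in l_srcs:
        found.add("RAW")  # later reads what earlier writes
    if l_dest in e_srcs:
        found.add("WAR")  # later overwrites what earlier still has to read
    if l_dest == e_dest:
        found.add("WAW")  # both write the same register
    return found


# Assumed reconstruction of the modified example (SUBD now writes F8).
DIVD = ("DIVD", "F0", "F2", "F4")
ADDD = ("ADDD", "F10", "F0", "F8")
SUBD = ("SUBD", "F8", "F8", "F14")

print(hazards(DIVD, ADDD))  # {'RAW'}
print(hazards(ADDD, SUBD))  # {'WAR'}
print(hazards(DIVD, SUBD))  # set()
```

A scoreboard performs comparisons of this kind in hardware, against every instruction that is still in flight, before it lets an instruction read its operands or write its result.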