UNIT
PART - A
Embedded computers are computers that are lodged inside other devices where the presence of the
computer is not immediately obvious. These devices range from everyday machines to handheld
digital devices, and they have a wide range of processing power and cost.
Response time is the time between the start and the completion of an event; it is also referred to
as execution time or latency. Throughput is the total amount of work done in a given amount of
time.
It measures the ability of a system to handle transactions, which consist of database accesses and
updates. Airline reservation systems and bank ATMs are examples of transaction-processing (TP) systems.
Amdahl's law states that the performance improvement to be gained from using some faster
mode of execution is limited by the fraction of the time the faster mode can be used.
Toy benchmarks are typically between 10 and 100 lines of code and produce a result the user
already knows before running the program, e.g., Puzzle.
In this technique, a dynamic execution profile of the program, which indicates how often each
instruction is executed, is maintained.
Suppose that we are considering an enhancement to the processor of a server system used for
web serving. The new CPU is 10 times faster on computation in the web-serving application than
the original processor. Assume that the original CPU is busy with computation 40% of the time
and is waiting for I/O 60% of the time. What is the overall speedup gained by incorporating the
enhancement?
Fraction enhanced = 0.4; Speedup enhanced = 10
Speedup overall = 1 / ((1 - 0.4) + 0.4/10) = 1/0.64 ≈ 1.56
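The calculation above can be checked with a short script (the function name is ours, for illustration):

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    # Amdahl's law: overall speedup is limited by the unenhanced fraction.
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# Web-serving example: computation is 40% of the time, new CPU is 10x faster.
print(amdahl_speedup(0.4, 10))  # ≈ 1.56
```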
Temporal locality states that recently accessed items are likely to be accessed again in the near
future. Spatial locality states that items whose addresses are near one another tend to be
referenced close together in time.
CPU time = Instruction count × Clock cycle time × Cycles per instruction (CPI)
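As a quick numeric check of the formula (the program characteristics below are hypothetical):

```python
def cpu_time_seconds(instruction_count, cpi, clock_cycle_ns):
    # CPU time = instruction count x clock cycle time x CPI.
    return instruction_count * clock_cycle_ns * 1e-9 * cpi

# Hypothetical program: 10^9 instructions, CPI of 2, a 1 ns clock cycle.
print(cpu_time_seconds(1_000_000_000, 2, 1))  # ≈ 2.0 seconds
```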
The hybrid approach reduces the variability in size and work of the variable-length architecture
but provides multiple instruction lengths to reduce code size.
15. Explain pipeline hazard and mention the different hazards in pipeline.
Hazards are situations that prevent the next instruction in the instruction stream from executing
during its designated clock cycle. Hazards reduce the overall performance from the ideal speedup
gained by pipelining. The three classes of hazards are,
Structural hazards.
Data hazards.
Control hazards.
Forwarding can be generalized to include passing a result directly to the functional unit that
requires it. The result is forwarded from the pipeline register corresponding to the output of one
unit to the input of another.
18. Consider an unpipelined processor. Assume that it has a 1ns clock cycle and that it uses 4
cycles for ALU operations and branches and 5 cycles for memory operations. Assume that the
relative frequencies of these operations are 40%, 20% and 40% respectively. Suppose that due to
clock skew and setup, pipelining the processor adds 0.2 ns of overhead to the clock. Ignoring any
latency impact, how much speedup in the instruction execution rate will we gain from a pipeline?
Average instruction execution time unpipelined
= clock cycle × average CPI = 1 ns × ((40% × 4) + (20% × 4) + (40% × 5)) = 4.4 ns.
Average instruction execution time on the pipelined processor = 1 ns + 0.2 ns = 1.2 ns.
Speedup = Avg. instruction time unpipelined / Avg. instruction time pipelined = 4.4 ns / 1.2 ns ≈ 3.7
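The arithmetic in this answer can be reproduced directly (a sketch of the numbers above):

```python
# Unpipelined: ALU ops and branches take 4 cycles, memory ops take 5; 1 ns clock.
avg_cpi = 0.40 * 4 + 0.20 * 4 + 0.40 * 5   # 4.4 cycles per instruction
time_unpipelined = 1.0 * avg_cpi            # 4.4 ns per instruction

# Pipelined: one instruction per cycle, plus 0.2 ns of skew/setup overhead.
time_pipelined = 1.0 + 0.2                  # 1.2 ns per instruction

print(time_unpipelined / time_pipelined)    # ≈ 3.67
```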
19. Briefly explain the different conventions for ordering the bytes within a larger object?
Little-endian byte order puts the byte whose address is "x…x000" at the least-significant
position in the double word. Big-endian byte order puts the byte whose address is "x…x000" at
the most-significant position in the double word.
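The two byte orders can be observed with Python's struct module (the 32-bit value here is arbitrary):

```python
import struct

value = 0x01020304
little = struct.pack('<I', value)  # little-endian: LSB at the lowest address
big = struct.pack('>I', value)     # big-endian: MSB at the lowest address
print(little.hex())  # 04030201
print(big.hex())     # 01020304
```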
Data hazards arise when an instruction depends on the results of a previous instruction in a way
that is exposed by the overlapping of instructions in the pipeline.
PART - B
1. Discuss the different ways in which an instruction set architecture can be classified. Stack
architecture, accumulator architecture, register-memory architecture, register-register
architecture.
2. Explain memory addressing and discuss the different addressing modes in an instruction set
architecture. Little endian and big endian; register, immediate, displacement, register indirect,
indexed, direct or absolute, memory indirect, autoincrement, autodecrement, and scaled addressing
modes.
3. Explain the operands and operations used for media and signal processing. Operands: fixed
point, blocked floating point. Operations: partitioned add, SIMD (single instruction,
multiple data), paired single operations.
4. Explain with examples the various hazards in pipelining. Data hazards, structural hazards,
control hazards.
5. Discuss data hazards in detail and explain the technique used to overcome them.
Data hazards: how they occur, with an example; the forwarding technique.
PART - A
Data dependence
Name dependence
Control Dependence
if p1 {s1;}
if p2 {s2;}
This technique uses in-order instruction issue and execution: instructions are issued in program
order, and if an instruction is stalled in the pipeline, no later instructions can proceed.
A reservation station fetches and buffers an operand as soon as it is available, eliminating the
need to get the operand from a register.
Loop: L.D    F0,0(R1)
      ADD.D  F4,F0,F2
      S.D    F4,0(R1)
      DADDUI R1,R1,#-8
      BNE    R1,R2,Loop
In dynamic scheduling the hardware rearranges the instruction execution to reduce the stalls
while maintaining data flow and exception behavior.
The pipeline may already have completed instructions that are later in program order than the
instruction causing the exception, and it may not yet have completed some instructions that
are earlier in program order than the instruction causing the exception.
These predictors use several levels of branch-prediction tables together with an algorithm for
choosing among the multiple predictors.
To reduce the branch penalty we need to know from what address to fetch by end of IF
(instruction fetch). A branch prediction cache that stores the predicted address for the next
instruction after a branch is called a branch-target buffer or branch target cache.
The goal of multiple issue processors is to allow multiple instructions to issue in a clock cycle.
They come in two flavors: superscalar processors and VLIW processors.
13. What is a branch-prediction buffer?
It is a small memory indexed by the lower portion of the address of the branch instruction. The
memory contains a bit that says whether the branch was recently taken or not.
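In practice each buffer entry is often a 2-bit saturating counter rather than a single bit, so that a branch must mispredict twice before the prediction flips. A minimal sketch of one such entry (class and method names are ours, for illustration):

```python
class TwoBitPredictor:
    """One 2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.counter = 1  # start in the weakly-not-taken state

    def predict(self):
        return self.counter >= 2  # True means "predict taken"

    def update(self, taken):
        # Saturating increment/decrement toward the observed outcome.
        self.counter = min(3, self.counter + 1) if taken else max(0, self.counter - 1)

p = TwoBitPredictor()
p.update(True)
p.update(True)      # two taken outcomes train the counter up to 3
print(p.predict())  # True
```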
Superscalar processors issue a varying number of instructions per clock and are either statically
scheduled or dynamically scheduled.
It combines three key ideas: dynamic branch prediction to choose which instructions to execute,
speculation to allow the execution of instructions before control dependences are resolved, and
dynamic scheduling to deal with the scheduling of different combinations of basic blocks.
18. How many branch-selected entries are in a (2,2) predictor that has a total of 8K bits in the
prediction buffer?
2² × 2 × (number of prediction entries selected by the branch) = 8K
Hence, the number of prediction entries selected by the branch = 1K.
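The sizing works out as follows: each branch-selected entry holds 2² two-bit counters, one per global-history pattern, so the entry count is the total bits divided by that product:

```python
m, n = 2, 2                # (m, n) predictor: m bits of global history, n-bit counters
total_bits = 8 * 1024      # 8K bits of prediction storage
entries = total_bits // (2**m * n)
print(entries)             # 1024, i.e. 1K entries selected by the branch address
```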
19. What is the advantage of using the instruction type field in the ROB?
The instruction type field specifies whether the instruction is a branch, a store, or a register operation.
The advantage of a tournament predictor is its ability to select the right predictor for the right branch.
PART - B
1. What is instruction-level parallelism? Explain in detail the various dependences that arise in
ILP. Define ILP. Dependences include data dependence, name dependence and control
dependence.
2. Discuss Tomasulo's algorithm to overcome data hazards using dynamic scheduling.
Dynamic scheduling with an example; how data hazards are caused; architecture; reservation
stations.
3. Explain how to reduce branch cost with dynamic hardware prediction. Basic branch
prediction and prediction buffers; correlating branch predictors; tournament-based predictors.
4. Explain how hardware-based speculation is used to overcome control dependence. Ideas in
hardware-based speculation: dynamic scheduling, speculation, dynamic branch prediction.
Architecture and example; reorder buffer.