Via Assignmnt

1) Consider the following fragment of C code:
for (i=0; i<=100; i=i+1) { a[i] = b[i] + c; }

Assume that a and b are arrays of words and the base address of a is in $a0 and the base
address of b is in $a1. Register $t0 is associated with variable i and register $s0 with c. Write the
code in MIPS.
ANS:
from the given data it is represented as:
$a0=a , $a1=b, $t0=i , $s0=c
loop:
clear $t0;
#clear the value of i that is i=0
addi $s0, $zero, 100
# ADD immediate i.e.put the value in c as c=0+100
lw $t1, 0($a1)
#LOAD WORD Here 0 is offset value and $a1 is base

# address and $t1 = b[i]
add $t1, $t1, $s0
# ADD $t1 = b[i]=b[i]+c
sw $t1, 0($a0)
# STORE WORD $t1 to address of a[i]
addi $a0, $a0, 4
# ADD immediate $a0 = address of a[i+1]
addi $a1, $a1, 4
# ADD immediate $a0 = address of a[i+1]
addi $t0, $t0, 1
# ADD immediate $t0 = $t0 + 1
beq $t0, $s0, exit
#BRANCH IF EQUAL if ($t0 = 100) EXIT
j loop
# Unconditional Jump to loop
exit:
# EXIT THE PROGRAM
2) Add comments to the following MIPS code and describe in one sentence what it
computes. Assume that $a0 is used for the input and initially contains n, a positive integer.
Assume that $v0 is used for the output.
begin: addi $t0, $zero, 0
addi $t1, $zero, 1
loop: slt $t2, $a0, $t1
bne $t2, $zero, finish
add $t0, $t0, $t1

addi $t1, $t1, 2
j loop
finish: add $v0, $t0, $zero
ANS
begin: addi $t0, $zero, 0
# ADD immediate ,$t0=sum=0
addi $t1, $zero, 1
# ADD immediate, $t1=k=1
loop: slt $t2, $a0, $t1
#SET LESS THEN, Set $t2 if n is less than c i.e.$t2=1 if k>n
bne $t2, $zero, finish
#Branch not equal, Branch to finish if $t2 not equal to zero i.e.
# $t2=1(n>k)
add $t0, $t0, $t1
#ADD sum= sum+k
addi $t1, $t1, 2
# ADD immediate ,k=k+2
j loop
# unconditional jump to loop
finish: add $v0, $t0, $zero
#ADD, $v0=$t0+0 i.e. output=sum
Output $v0 is the sum of the odd positive integers 1 + 3 + 5 + which are less than or equal to n.
Q3. We wish to compare the performance of two different computers: M1 and M2. The
following measurements have been made on these computers:
Program
Time on M1
Time on M2
2 sec
1.5 sec
5 sec
10 sec
Program
Instructions executed on M1
Instructions executed on M2
1
5 109
6 109
(a) Which computer is faster for each program, and how many times as fast is it?
Ans.
Time
Time
Speed = execute P1 on M 1 execute P 1 on M 2
If the ratio is greater than 1, then M2 is faster.

But if ratio is between 0 and 1, then M1 is faster.
For Program 1
M2 is 1.33 times as fast as M1. (2/1.5 = 1.33)
For program 2
M1 is 2 times as fast as M2. (10/5 = 2)
(b) Find the instruction execution rate (instructions per second) for each computer when
running program 1.
ANS.
Instruction execution rate=
No of Instructions Executed
( Instruction per sec)
Instruction Count
Instruction execution rate for M1 is,

(5*10^9) / 2 = 2.5*10^9 (instructions per sec)
Instruction execution rate for M1 is,
(6*10^9) / 1.5 = 4*10^9 (instructions per sec)
(c) The clock rates for M1 and M2 are 3 GHz and 5 GHz respectively. Find the CPI for
program 1 on both machines.
Ans.
CPI (Clock Per Instruction )=
Executiontime Clock Rate

InstructionCount
For M1,
CPI = (2*3*10^9) / (5*10^9) = 1.2 CPI
For M2
CPI = (1.5*5*10^9) / (6*10^9) = 1.25 CPI
(d) Suppose that program 1 must be executed 1600 times each hour. Any remaining time should
be used to run program 2. Which computer is faster for this workload? Performance is
measured here by the throughput of program 2.
Ans.
On M1,
Time required for P1 in 1 hr. =
1600*2
3200 s.
Time Left for P2
3600 3200
400s
P2 can Run
400 / 5
80 times in an hr.
Time required for P1 in 1 hr. =
1600*1.5
2400 s.
Time Left for P2
3600 2400
1200 s.
P2 can Run
1200 / 10
120 times in an hr.
On M2,
Hence, from the above data we can say that the performance of M2 is better than M1 because
throughput of M2 in terms of P2 is more than that of M1.
1 PIPELINE OPTIMIZATION TECHNIQUES:
Given a Resource Reservation Table, we would like to set up a systematic method which
optimizes the throughput of the process using this table. For maximum throughput, we would
like to launch new operations as frequently as possible. Thus, we want to minimize the time
gap between launching two operations. This is called the Sample Period (SP).
The minimum Sampling Period

Consider an operation in which the busiest resource is used for n cycles. If we launch a new
operation every n cycles, this resource will be used 100% of the time. If we launch operations
any more frequently than this, the resource will not have enough time to do its work.
Therefore, the minimum possible Sample Period is equal to the maximum number of cycles
for which the busiest of the resource(s) is in operation.
Sampling Period
We want to minimize the sampling period. But the sampling period need not be a constant. SP
can cycle through a finite set of values. We should therefore define an Average Sampling
period ASP. The minimum value of this average Sampling Period (MASP) is given by the
number of cycles for which the busiest resource is used in an operation.
Cyclic Sampling Period
Consider the following reservation table:
0
RSC1
RSC2
RSC3
0
0
0
0
Now the next operation can be launched in cycle 1 itself. However, the following one can only
be launched after a gap of 3 cycles in cycle
ROM
RAM
ALU
0
0
1
1
0
2
0
1
0
3
1
0
1
4
2
1
5
3
2
6
2
3
2
7
3
2
3
8
4
3
9
5
4
Again, the next operation can be launched in the next cycle (in cycle 5) and after that, with a
gap of 3 cycles in cycle 8
10
4
5
4
Average sampling period:
ROM
RAM
ALU
0
0
1
1
0
2
0
1
0
3
1
0
1
4
2
1
5
3
2
6
2
3
2
7
3
2
3
8
4
3
9
5
4
New operations can be launched in clock periods 0, 1, 4, 5, 8, 9 . . . . Thus, the sample period
cycles through the values {1, 3}. The average of the cycle is called the Average Sampling
Period (ASP). The Average Sampling period (ASP) is 2 here. The whole pattern repeats every
4 cycles. This is called the period p.
Minimum Average Sampling Period
The minimum value of the Average Sampling Period (MASP) is given by the maximum
number of cycles for which a resource is busy during an operation. Therefore, given a
reservation table, MASP is known. If the actual average Sampling Period is equal to MASP,
the system is already optimum and nothing needs to be done. If the actual average Sampling
Period is greater than MASP, we can attempt to modify the reservation table, such that MASP
is achieved.
Pipeline Optimization
1. For a given reservation table, find the current average sample period (ASP).
2. Find the largest no. of cycles for which a resource is busy.
3. This is equal to the Minimum possible Average Sampling Time (MASP).
4. If ASP = MASP, there is nothing to be done.
5. Else, we should try to re-schedule events such that MASP is achieved.
10
4
5
4
Method to achieve MASP We first consider various cycles whose average is the desired
MASP. For example, if MASP is 2, we can have cycles of {2}, {1,3} or {1,1,4} etc. The
periods are 2, 4 and 6 in these three cases. Dinesh Sharma Pipeline Optimization The
Generator Set For each cycle, we construct a generator set G, which contains elements of the
cycle, their sums taken two at a time, three at a time etc., modulo periodicity p. In our
example, cycles are {2}, {1,3} and {1,1,4} For a cycle of {2}, p = 2, so G = {0} For a cycle
of {1,3}, p = 4, so G = {0,1,3} For a cycle of {1,1,4}, p = 6, so G = {0,1,2,4,5}.
For each selected cycle, We now construct the Source set S. This contains integers 0 throughp1, from which all members of G except 0 have been removed. In our example, cycles are {2},
{1,3} and {1,1,4} Cycle p G S {2}, 2 {0} {0,1} {1,3}, 4 {0,1,3} {0,2} {1,1,4}, 6 {0,1,2,4,5}
{0,3}
Dinesh Sharma Pipeline Optimization Design Sets For each selected cycle, We construct
Design sets Di which have the property that: if a D and b D then |a b| also D. In our
example, Cycle p S D sets {2}, 2 {0,1} {0}, {1} and {0,1} {1,3}, 4 {0,2} {0}, {2}, {0,2}
{1,1,4}, 6 {0,3} {0}, {3}, {0,3}
The Design sets do not depend on the reservation table. The sets G, S and Di are constructed
from the repetition cycles whose average value is the MASP. Therefore we can make a library
of these in advance for different combinations of MASP values and cycles - and use them
when needed.
Row Vectors
We construct a row vector for each resource in the reservation table. The row vector is a set
which contains the clock period in which a specific resource is busy.
Resource Reservation Table
0
ROM
RAM
ALU
0
0
0
0
In this example, the row vector for ROM is {0,1}, for RAM is {1,3} and for ALU is {2}.
Matching Rows with Design Sets
Choose a particular cycle with the desired MASP. (Say MASP = 2, cycle = {2}).
Pick the corresponding design sets. (In this example, D = {0}, {1}, {0,1}).
For each resource, take its row vector and take a design set with the same cardinality.
Align these according to defined rules.
Rules for Alignment of the First elements
Compare R(1) and D(1).
If these are equal, nothing needs to be done.
Else,
If R(1) < D(1), add D(1)-R(1) to all members of R
If R(1) > D(1), add R(1)-D(1) to all members of D
This is equivalent to a rigid shift of R or D till their first members are aligned.

Via Assignmnt

Uploaded by

Copyright:

Available Formats

You might also like

Via Assignmnt

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Via Assignmnt

Uploaded by

Copyright:

Available Formats

1) Consider the following fragment of C code:

for (i=0; i<=100; i=i+1) { a[i] = b[i] + c; }

#clear the value of i that is i=0

addi $s0, $zero, 100

# ADD immediate i.e.put the value in c as c=0+100

#LOAD WORD Here 0 is offset value and $a1 is base

add $t1, $t1, $s0

# ADD $t1 = b[i]=b[i]+c

# STORE WORD $t1 to address of a[i]

addi $a0, $a0, 4

# ADD immediate $a0 = address of a[i+1]

addi $a1, $a1, 4

# ADD immediate $a0 = address of a[i+1]

addi $t0, $t0, 1

# ADD immediate $t0 = $t0 + 1

beq $t0, $s0, exit

#BRANCH IF EQUAL if ($t0 = 100) EXIT

# Unconditional Jump to loop

# EXIT THE PROGRAM

add $t0, $t0, $t1

# ADD immediate ,$t0=sum=0

addi $t1, $zero, 1

# ADD immediate, $t1=k=1

loop: slt $t2, $a0, $t1

#SET LESS THEN, Set $t2 if n is less than c i.e.$t2=1 if k>n

bne $t2, $zero, finish

add $t0, $t0, $t1

#ADD sum= sum+k

addi $t1, $t1, 2

# ADD immediate ,k=k+2

# unconditional jump to loop

finish: add $v0, $t0, $zero

#ADD, $v0=$t0+0 i.e. output=sum

If the ratio is greater than 1, then M2 is faster.

Instruction execution rate for M1 is,

Executiontime Clock Rate

Time Left for P2

Time required for P1 in 1 hr. =

Time Left for P2

120 times in an hr.

1 PIPELINE OPTIMIZATION TECHNIQUES:

The minimum Sampling Period

Average sampling period:

Rules for Alignment of the First elements

Compare R(1) and D(1).

If these are equal, nothing needs to be done.

You might also like