coa 4
M4- ARITHMETIC
4.1 Introduction
4.1.1 Numbers
Addition and subtraction are basic operations performed on a digital computer. The Arithmetic and Logic Unit
(ALU) performs these operations along with logical operations such as AND, OR, NOT and XOR.
In the sign-and-magnitude representation, the only difference between a positive number and the negative
number of the same magnitude is the MSB (the sign bit).
1010
4.1.2 Addition of Positive Numbers
Example (single-bit additions):
    0      0      1      1
  + 0    + 1    + 0    + 1
  ---    ---    ---    ---
    0      1      1     10
Carry bit
When two 1s are added, a carry bit is generated and moved to the next higher bit position. In the
example below, the addition at the 3rd bit position generates a carry, which moves to the 4th bit.
Ex:   0101
    + 0110
    ------
      1011
Ex: 2 + 3 = ?              1 + 6 = ?
      2   0010               1   0001
    + 3   0011             + 6   0110
    ----------             ----------
      5   0101               7   0111
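The column-by-column procedure used in these examples can be sketched in Python. This is only an illustrative sketch: the function name add_bits is hypothetical and not part of the original notes.

```python
def add_bits(a: str, b: str) -> str:
    """Add two equal-length binary strings column by column, right to left."""
    carry = 0
    out = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        out.append(str(total % 2))  # sum bit for this column
        carry = total // 2          # carry moves to the next higher column
    if carry:
        out.append("1")             # carry out of the most significant column
    return "".join(reversed(out))


print(add_bits("0101", "0110"))  # 1011
print(add_bits("0010", "0011"))  # 0101 (2 + 3 = 5)
print(add_bits("0001", "0110"))  # 0111 (1 + 6 = 7)
```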
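-5 in 2's complement is 1011 -------------- (1)
Step 2:
2 is written as 0010
-2 in 1's complement is the bitwise opposite of 2, i.e. 1101
-2 in 2's complement = -2 in 1's complement + 1
  1101
+ 0001
  1110, i.e. -2 -------------- (2)
Add (1) & (2):
  1011
+ 1110
 11001
Ignoring the carry out of the MSB gives 1001, i.e. -7.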
Note: If the numbers are already in 2's complement form, simply add them.
Example: X - Y = ?, where X = 1101 and Y = 1001.
Step 1: Find the 2's complement of Y (1001): first take the 1's complement of 1001, which is 0110, then
add 1 to get the 2's complement, which is 0111.
Step 2: Add this value (0111) to X (1101):
  1101
+ 0111
 10100
Ignore the carry out of the MSB (the leading 1); the answer is 0100.
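The two steps above can be sketched in Python for 4-bit words. The names BITS, twos_complement and subtract are hypothetical helpers introduced only for this illustration.

```python
BITS = 4  # word length used in the examples above (an assumption of this sketch)


def twos_complement(value: int, bits: int = BITS) -> int:
    """Step 1: flip every bit (1's complement), then add 1."""
    mask = (1 << bits) - 1
    ones = value ^ mask
    return (ones + 1) & mask


def subtract(x: int, y: int, bits: int = BITS) -> int:
    """Step 2: add the 2's complement of y to x and drop the carry out."""
    mask = (1 << bits) - 1
    return (x + twos_complement(y, bits)) & mask


print(format(twos_complement(0b1001), "04b"))   # 0111
print(format(subtract(0b1101, 0b1001), "04b"))  # 0100
```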
Example 3: The examples above started from bit patterns. How do we calculate (-7) - (-5)?
Solution: First obtain -7 in 2's complement.
7 is written as 0111
-7 in 1's complement is the bitwise opposite of 7, i.e. 1000
-7 in 2's complement is obtained by adding 1 to the 1's complement, which gives 1001 -------------- (1)
Example 4: How do we subtract 1 from -7, i.e. (-7) - 1 = ?
Solution: Step 1: Find the 2's complement representations of -7 and -1.
Step 2: Add these two values.
-7 in 2's complement is 1001.
-1 in 2's complement is 1111.
Adding the two values gives 11000. Ignoring the carry out of the MSB gives the answer 1000, i.e. -8.
Case 1: Take a simple addition, 7 + 1 = ? In 4-bit binary the following is obtained:
0111 (i.e. 7) + 0001 (i.e. 1) = 1000. But 1000 is -8 in 2's complement!
So 7 + 1 appears to give -8, which is clearly wrong. The sum has overflowed into the sign (MSB) position.
Case 2: When we add -4 and -6 we get 1100 + 1010 = 1 0110; dropping the carry leaves 0110, i.e. +6.
So -4 + (-6) appears to give +6, which is wrong. There is again an overflow.
When does an overflow occur?
Overflow can occur only when we add two numbers of the same sign (i.e. two positive numbers or two negative
numbers).
In Case 2, two things can be observed:
(a) A carry out beyond the MSB
(b) A change of sign at the MSB
4.1.5 Characters
Computers do not handle only numbers; they also handle non-numeric data such as letters and punctuation
marks. Codes are used to represent all of these. One such code is ASCII. Unicode is
another code, which also covers the scripts of other languages such as Hindi. Hex value 0x0905 represents अ,
0x0906 represents आ, etc.
Summary:
Wheel representation of 2’s complement numbers.
Let us consider adding +7 to -3. To do this using the wheel option, first
locate 7 (i.e. 0111) then move 13 steps to the right (13 steps because 2's
complement for -3 is 1101 which is 13). This gives the answer 0100 (+4)
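Moving around the wheel is addition modulo 16 for 4-bit numbers. The short sketch below reproduces the +7 and -3 example; the function names wheel_add and to_signed are hypothetical.

```python
def wheel_add(start: int, steps: int, bits: int = 4) -> int:
    """Move `steps` positions around a wheel with 2**bits positions."""
    return (start + steps) % (1 << bits)


def to_signed(pattern: int, bits: int = 4) -> int:
    """Read a bit pattern as a 2's complement value."""
    if pattern & (1 << (bits - 1)):
        return pattern - (1 << bits)
    return pattern


result = wheel_add(0b0111, 0b1101)  # start at +7, move 13 steps (-3 is 1101)
print(format(result, "04b"), to_signed(result))  # 0100 4
```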
The table shown below depicts all the 3 forms of representation of numbers:
Sum (Si) Logic
The sum bit $s_i$ is 1 (i.e. ON) in four scenarios.
Scenario 1: $x_i = 0$, $y_i = 0$ and $c_i = 1$, i.e. $\bar{x}_i \bar{y}_i c_i$
Scenario 2: $x_i = 0$, $y_i = 1$ and $c_i = 0$, i.e. $\bar{x}_i y_i \bar{c}_i$
Scenario 3: $x_i = 1$, $y_i = 0$ and $c_i = 0$, i.e. $x_i \bar{y}_i \bar{c}_i$
Scenario 4: $x_i = 1$, $y_i = 1$ and $c_i = 1$, i.e. $x_i y_i c_i$
Hence, $s_i = \bar{x}_i \bar{y}_i c_i + \bar{x}_i y_i \bar{c}_i + x_i \bar{y}_i \bar{c}_i + x_i y_i c_i$ (Note: + here means logical OR, not arithmetic addition)
In short, $s_i = x_i \oplus y_i \oplus c_i$, i.e. xi XOR yi XOR ci
Carry (Ci+1) Logic
The carry-out is 1 in the following scenarios:
Scenario 1: $x_i = 0$, $y_i = 1$ and $c_i = 1$, i.e. $\bar{x}_i y_i c_i$, which is covered by $y_i c_i$
Scenario 2: $x_i = 1$, $y_i = 0$ and $c_i = 1$, i.e. $x_i \bar{y}_i c_i$, which is covered by $x_i c_i$
Scenario 3: $x_i = 1$, $y_i = 1$ and $c_i = 0$, i.e. $x_i y_i \bar{c}_i$, which is covered by $x_i y_i$
Scenario 4: $x_i = 1$, $y_i = 1$ and $c_i = 1$, i.e. $x_i y_i c_i$
Scenarios 3 and 4 together are covered by the single term $x_i y_i$.
Hence, the carry-out is $c_{i+1} = y_i c_i + x_i c_i + x_i y_i$
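The sum and carry expressions derived above can be checked against ordinary arithmetic for all eight input combinations. The following is a minimal sketch; the function name full_adder is chosen here for illustration.

```python
from itertools import product


def full_adder(x: int, y: int, c: int) -> tuple[int, int]:
    """Return (sum bit, carry-out) for one bit position."""
    s = x ^ y ^ c                        # s_i = x_i XOR y_i XOR c_i
    c_out = (y & c) | (x & c) | (x & y)  # c_{i+1} = y_i c_i + x_i c_i + x_i y_i
    return s, c_out


# Each row of the truth table must agree with x + y + c written in binary.
for x, y, c in product((0, 1), repeat=3):
    s, c_out = full_adder(x, y, c)
    assert 2 * c_out + s == x + y + c
    print(x, y, c, "->", "sum", s, "carry", c_out)
```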
Circuits for Si and Ci+1
Both the above circuits can be put together and shown as a full adder (FA) as below:
(Ex: If n = 8 and we want to add two 32-bit numbers, we need 4 such ripple-carry adders.)
A cascade of k n-bit adders is shown as:
Addition:
Addition of two n-bit numbers is fairly straightforward:
Step 1: Obtain the 2's complement representation of X (X is an n-bit number)
Step 2: Obtain the 2's complement representation of Y (Y is an n-bit number)
Step 3: Use Fig 6.2 (b) to add these numbers. C0 is equal to 0 in this case.
Xn-1 and Yn-1 are the sign bits (MSB).
The problem with overflow is that the sum has a different sign from the operands.
Note: Overflow is seen when
$x_{n-1} = 0$, $y_{n-1} = 0$ and $s_{n-1} = 1$, i.e. $\bar{x}_{n-1} \bar{y}_{n-1} s_{n-1}$
or
$x_{n-1} = 1$, $y_{n-1} = 1$ and $s_{n-1} = 0$, i.e. $x_{n-1} y_{n-1} \bar{s}_{n-1}$
Overflow can also be detected from the two most significant carries, as $c_n \oplus c_{n-1}$ (i.e. Cn XOR Cn-1). If the result is 1,
an overflow occurred; otherwise no overflow occurred.
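Both overflow tests can be sketched in Python and compared with each other. The function name add_with_overflow and the default 4-bit word length are assumptions made for this illustration only.

```python
def add_with_overflow(x: int, y: int, bits: int = 4) -> tuple[int, bool]:
    """Add two bit patterns; return (sum pattern, overflow flag)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    total = (x & mask) + (y & mask)
    s = total & mask

    # Test 1: operands share a sign but the sum has the opposite sign.
    xs, ys, ss = bool(x & sign), bool(y & sign), bool(s & sign)
    by_sign = (not xs and not ys and ss) or (xs and ys and not ss)

    # Test 2: carry into the MSB (c_{n-1}) differs from carry out of it (c_n).
    c_n = total >> bits
    c_n_minus_1 = ((x & (sign - 1)) + (y & (sign - 1))) >> (bits - 1)
    by_carry = bool(c_n ^ c_n_minus_1)

    assert by_sign == by_carry  # the two tests agree
    return s, by_sign


for a, b in [(0b0111, 0b0001), (0b1100, 0b1010), (0b0010, 0b0011)]:
    s, overflow = add_with_overflow(a, b)
    print(format(a, "04b"), "+", format(b, "04b"), "=", format(s, "04b"), overflow)
```

The first two pairs are Case 1 (7 + 1) and Case 2 (-4 + -6) from earlier in the section; both report an overflow, while 2 + 3 does not.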
This single circuit can both add and subtract two n-bit numbers. Note that each bit of Y is XORed
with the Add/Sub control line before being fed to the adder. The XOR
logic follows the table below:
yi    Add/Sub Control    yi XOR Add/Sub Control
0           0                      0
0           1                      1
1           0                      1
1           1                      0
Addition:
Note: If Add/Sub = 0, then yi XOR Add/Sub is the same as yi, so Y passes through unchanged. To add X and Y we
do the following:
Step 1: Pass X to lines X0 … Xn-1
Step 2: Pass Y to lines Y0 … Yn-1
Step 3: Pass 0 on the Add/Sub line, which also serves as the carry-in C0.
Subtraction:
So, the circuit should do the following
(a) Calculate 2’s complement of Y (i.e. ex 1)
(b) Add it to X (ex 7)
With Add/Sub Control = 1, the XOR gates on the Y inputs convert Y to its 1's complement. The same control line
sets the carry-in C0 to 1, which adds 1 to the sum and so completes the 2's complement of Y.
Hence the circuit computes X + (2's complement of Y), i.e. X - Y.
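The behaviour of the combined adder/subtractor can be sketched bit by bit in Python. The helper names to_bits, from_bits and add_sub are hypothetical, and the sketch models only the logic described above, not the actual figure.

```python
def to_bits(value: int, n: int) -> list[int]:
    """Split a value into n bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits: list[int]) -> int:
    """Reassemble a value from bits given least significant bit first."""
    return sum(b << i for i, b in enumerate(bits))


def add_sub(x: int, y: int, add_sub_control: int, n: int = 4) -> tuple[int, int]:
    """Return (n-bit result, carry-out). Control 0 adds, control 1 subtracts."""
    carry = add_sub_control              # the control line also drives C0
    result = []
    for xi, yi in zip(to_bits(x, n), to_bits(y, n)):
        y_in = yi ^ add_sub_control      # XOR gate on each Y input
        result.append(xi ^ y_in ^ carry)
        carry = (xi & y_in) | (xi & carry) | (y_in & carry)
    return from_bits(result), carry


print(format(add_sub(0b0111, 0b0001, 1)[0], "04b"))  # 0110 (7 - 1 = 6)
print(format(add_sub(0b0010, 0b0011, 0)[0], "04b"))  # 0101 (2 + 3 = 5)
```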
We can write $P_i = x_i \oplus y_i$ instead of $P_i = x_i + y_i$ and the expression for $c_{i+1}$ still holds, for the following reason.
xi    yi    xi + yi    xi XOR yi
0     0        0           0
0     1        1           1
1     0        1           1
1     1        1           0
Note: + means logical OR and $\oplus$ means XOR
The only difference is the last row, where $x_i \oplus y_i$ gives 0 instead of 1, but this is compensated by $G_i$:
$G_i = x_i y_i = 1 \cdot 1 = 1$, leading to $c_{i+1} = G_i + P_i c_i = 1 + 0 = 1$. Hence, we can write
$c_{i+1} = G_i + P_i c_i$, where $G_i = x_i y_i$ and $P_i = x_i \oplus y_i$, and we know $s_i = x_i \oplus y_i \oplus c_i$
So, we can write the basic cell for a single bit adder as below.
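We can expand $c_{i+1}$ recursively as below:
$c_{i+1} = G_i + P_i c_i$
$= G_i + P_i [G_{i-1} + P_{i-1} c_{i-1}]$
$= G_i + P_i G_{i-1} + P_i P_{i-1} c_{i-1}$
$= G_i + P_i G_{i-1} + P_i P_{i-1} [G_{i-2} + P_{i-2} c_{i-2}]$
$= G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + P_i P_{i-1} P_{i-2} c_{i-2}$
$c_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + \dots + P_i P_{i-1} \cdots P_1 G_0 + P_i P_{i-1} \cdots P_0 c_0$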
Notice one important thing: to calculate the (i+1)th carry, the only carry you need is C0. The ripple
chain is no longer needed.
As soon as the values of X, Y and C0 are applied, every carry is obtained in 3 gate delays (instead of a
delay that grows with n), as below:
o One gate delay to calculate ALL Pi and Gi
o One gate delay for the AND logic (ex: PiGi-1)
o One more gate delay for the OR logic (ex: Gi + PiGi-1 + …)
Hence, in 3 gate delays we get the carry.
o The sum needs one final XOR, i.e. one more gate delay.
Hence, the sum is obtained in four gate delays.
The carries C1, C2, C3 and C4 can be expressed in terms of G and P as below:
C1 = G0 + P0C0
C2 = G1 + P1C1 = G1 + P1(G0 + P0C0) = G1 + P1G0 + P1P0C0
C3 = G2 + P2C2 = G2 + P2G1 + P2P1G0 + P2P1P0C0
C4 = G3 + P3C3 = G3 + P3G2 + P3P2G1 + P3P2P1G0 + P3P2P1P0C0
A circuit that implements these equations is called a carry-lookahead adder.
The carry-lookahead circuit calculates C1, C2, C3 and C4 using Pi and Gi. It needs 3 gate delays for all
carries and 4 gate delays for the sums. In comparison, the 4-bit ripple-carry adder needs 7 gate delays for S3
and 8 gate delays for C4.
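The four carry equations can be written out directly in Python and checked against ordinary addition. This is a sketch of the logic only (it says nothing about gate delays); the function name cla_4bit is hypothetical.

```python
def cla_4bit(x: int, y: int, c0: int = 0) -> tuple[int, int]:
    """4-bit carry-lookahead addition; returns (4-bit sum, C4)."""
    xb = [(x >> i) & 1 for i in range(4)]
    yb = [(y >> i) & 1 for i in range(4)]
    g = [a & b for a, b in zip(xb, yb)]  # generate:  Gi = xi yi
    p = [a ^ b for a, b in zip(xb, yb)]  # propagate: Pi = xi XOR yi

    # Every carry is computed from G, P and C0 only; no carry waits on another.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    s = sum((p[i] ^ carries[i]) << i for i in range(4))  # Si = Pi XOR Ci
    return s, c4


print(cla_4bit(0b0101, 0b0110))  # (11, 0), i.e. sum 1011 with C4 = 0
```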
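Multiple 4-bit carry-lookahead adders can be used to implement an n-bit adder.
For example: Eight 4-bit adders can be connected together to form a 32-bit adder.
With the carry rippling between the 4-bit blocks, C4 is available after 3 gate delays and each later block adds
2 more, so C32 is available after 17 gate delays and S31 after 18.
In comparison, a 32-bit ripple-carry adder makes S31 and C32 available after 63 and 64 gate delays respectively.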
Higher-level Generate and Propagate Functions
The figure below shows the 16-bit carry-lookahead adder built using 4-bit adders.
The square box represents a single cell that implements the partial product for one bit, as shown:
o Each row i, where 0 ≤ i ≤ 3, adds the multiplicand to the incoming partial product PPi to generate
the outgoing partial product PP(i+1) if qi = 1. If qi = 0, PPi is passed vertically downwards
unchanged.
o Note: The worst-case signal propagation delay path runs from the upper right corner of the array to
the high-order product bit at the bottom left corner of the array.
o For an n × n array, this path has a total of 6(n-1)-1 gate delays, including the initial AND gate delay
in all cells.
The other method to perform multiplication is to use the adder circuitry in the ALU for a number of
sequential steps. The figure below shows the sequential circuit binary multiplier:
Round 1:
Step 1: Add M and A only if q0 = 1. Here q0 = 1, so M is added to A.
Step 3: Shift the combined C, A, Q value right by 1 bit. The new values in C, A and Q after the shift
are
Round 2:
Step 1: Add M and A only if q0 = 1. This is true.
Step 2: Store the results in “C” and “A” (i.e. the carry and sum generated in Step 1).
Step 3: Shift “C”, “A” and “Q” right by 1 bit. The values of “C”, “A” and “Q” after the shift are
Round 3:
Step 1: Add M and A if q0 = 1.
This condition fails (q0 ≠ 1), so no addition is done.
Step 2: The sum and carry generated in Step 1 are stored in “A” and “C” respectively.
Note: Since no addition was done, “C” and “A” are not updated, so “C”,
“A” and “Q” remain as they were after Step 3 of Round 2.
Round 4:
Step 1: Add M and A if q0 = 1.
Step 2: The sum and carry generated in Step 1 are copied to “A” and “C” respectively.
Step 3: Shift “C”, “A” and “Q” right by 1 bit. The values in C, A and Q after the
shift are:
Final answer:
The values in A and Q, concatenated, form the product.
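The rounds above can be sketched as a loop over the C, A and Q registers. The operand values in this sketch (M = 1101, Q = 1011) are an assumption chosen so that q0 follows the same pattern as the rounds described (1, 1, 0, 1); they are not taken from the missing figures, and the function name sequential_multiply is hypothetical.

```python
def sequential_multiply(m: int, q: int, n: int = 4) -> int:
    """Shift-and-add multiplication of two n-bit unsigned numbers."""
    mask = (1 << n) - 1
    c, a = 0, 0
    for round_no in range(1, n + 1):
        if q & 1:                        # Step 1: add M to A only if q0 = 1
            total = a + m
            c, a = total >> n, total & mask  # Step 2: carry to C, sum to A
        # Step 3: shift C, A and Q right by one bit as a single register
        combined = ((c << (2 * n)) | (a << n) | q) >> 1
        c = 0
        a = (combined >> n) & mask
        q = combined & mask
        print(f"Round {round_no}: C={c} A={a:0{n}b} Q={q:0{n}b}")
    return (a << n) | q                  # product is A and Q concatenated


product = sequential_multiply(0b1101, 0b1011)  # 13 x 11
print(product, format(product, "08b"))  # 143 10001111
```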
The above steps can be written in the following table to visualize in one shot.
5.1 Introduction
Instruction Set Processor (ISP) or processor executes machine instructions and coordinates
the activities of other units.
It is also termed Central Processing Unit (CPU). The term “Central” is less appropriate today
because many modern computer systems include several processing units.
Organization of processors has evolved over the years, driven by developments in technology
and need to provide high performance.
To achieve high performance, the various functional units are made to operate in parallel.
Such high-performance processors have:
* Pipelined organization – execution of one instruction is started before the execution of the
preceding instruction is completed.
* Superscalar operation – several instructions are fetched and executed at the same time.
Here, we discuss the basic ideas that are common to all processors.
MUX – multiplexer selects either output of Y or constant 4 (to increment the contents of
program counter).
ML → R: Fetch the contents of a given memory location and load them into a processor
register.
R → ML: Store a word of data from a processor register into a memory location.
5.2.1 Register Transfers
Instruction execution involves data transfers from one register to another.
For each register Ri, two control signals are used: Riin and Riout. They are shown symbolically in Figure
7.2.
When Riin=1, mux selects data on the bus. Data is loaded into flip-flop at rising edge of the
clock.
When Riin=0, mux feeds back the value currently stored in flip-flop.
When Riout=0, gate’s output is in high-impedance (electrically disconnected) state. i.e. open-
state of switch.
When Riout=1, gate drives the bus to 0 or 1, depending on the value of Q.
To fetch a word of information from memory, processor has to specify the address of the
memory location where this information is stored and request a Read operation.
Information can be instruction or an operand
Processor transfers the required address to MAR.
Processor uses control lines to indicate that a Read operation is needed.
When the requested data are received from the memory, they are stored in register MDR.
The processor completes one internal data transfer in one clock cycle.
The speed of operation of the addressed device varies with the device.
Devices include cache memory, register in memory mapped I/O devices, main memory, etc.
The cache responds to a read request in one clock cycle.
When cache miss occurs, request is forwarded to main memory which introduces several clock
cycles delay.
To accommodate variability in response time, the processor waits until it receives an indication
that requested Read operation has been completed.
A control signal called Memory Function Completed(MFC) is used for this purpose.
Addressed device sets this signal to 1 to indicate that the contents of the specified location
have been read and are available on the data lines of the memory bus.
Consider the instruction Move (R1), R2. The actions needed to execute this instruction are:
o MAR ← [R1]
o Start a Read operation on the memory bus.
o Wait for the MFC response from the memory.
o Load MDR from the memory bus.
o R2 ← [MDR]
Contents of MAR are always available on the address lines of memory bus.
When a new address is loaded into MAR, it will appear on the memory bus at the beginning of
the next clock cycle as shown.
A Read control signal is activated at the same time MAR is loaded.
This signal will cause the bus interface circuit to send a read command, MR(Memory Read)
on the bus.
MDRinE is active waiting for a response from the memory.
Data received from memory are loaded into MDR at the end of the clock cycle in which MFC
signal is received.
In the next clock cycle, MDRout is activated to transfer the data to register R2.
Signals are activated as follows:
1. R1out, MARin, Read
2. MDRinE, WMFC (wait for arrival of the MFC signal)
3. MDRout, R2in
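The effect of the three control steps on the registers can be modelled with a small Python sketch. It ignores timing: the wait for MFC is modelled as an immediate memory read, which is a simplification. The function name move_indirect and the dictionary-based register and memory models are assumptions made for illustration.

```python
def move_indirect(regs: dict, memory: dict) -> dict:
    """Return the register contents after executing Move (R1), R2."""
    regs = dict(regs)  # work on a copy so the caller's registers are untouched
    # Step 1: R1out, MARin, Read -> the address in R1 is placed in MAR
    regs["MAR"] = regs["R1"]
    # Step 2: MDRinE, WMFC -> the addressed word arrives in MDR
    regs["MDR"] = memory[regs["MAR"]]
    # Step 3: MDRout, R2in -> the word is transferred to R2
    regs["R2"] = regs["MDR"]
    return regs


registers = {"R1": 0x100, "R2": 0, "MAR": 0, "MDR": 0}
memory = {0x100: 42}  # hypothetical memory contents
print(move_indirect(registers, memory)["R2"])  # 42
```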
The offset X used in a branch instruction is the difference between the branch target address
and the address immediately following the branch instruction.
Ex:- If branch instruction is at 2000, branch target address is 2050, then value of X must be 46.
(This is because PC would have incremented during fetch phase, so it would be pointing to
2004 already. Therefore, only 46 is the offset.)
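The offset calculation in this example can be written as a one-line function. The constant INSTRUCTION_BYTES and the function name branch_offset are hypothetical names used only in this sketch.

```python
INSTRUCTION_BYTES = 4  # PC advances by 4 during the fetch phase, as in the example


def branch_offset(branch_address: int, target_address: int) -> int:
    """Offset X = target address minus the already-incremented PC."""
    updated_pc = branch_address + INSTRUCTION_BYTES
    return target_address - updated_pc


print(branch_offset(2000, 2050))  # 46
```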
For a conditional branch, we need to check the status of the condition codes before loading a new
value into PC.
For a Branch<0 (branch-on-negative) instruction, step 4 is replaced with: Offset-field-of-IRout, Add, Zin, If N=0 then
End. If N=0, the processor returns to step 1 immediately after step 4.
If N=1, step 5 is performed to load a new value into PC, thus performing the branch operation.
All general-purpose registers are combined into a single block called the register file.
The register file has 3 ports:
o 2 output ports, allowing the contents of two different registers to be accessed simultaneously and
placed on buses A and B.
o 1 input port, allowing the data on bus C to be loaded into a third register during the same clock cycle.
Buses A and B are used to transfer the source operands to the A and B inputs of the ALU.
The output of the ALU is transferred over bus C.
If the ALU simply passes one of its two input operands unmodified to bus C, this is indicated by the control signal R=A or
R=B.
Using an incrementer unit eliminates the need to add 4 to the PC using the ALU and an Add operation.
Ex: Control sequence for the instruction Add R4, R5, R6 for the 3-bus organization
1. PCout, R=B, MARin, Read, IncPC
Contents of PC are passed through the ALU using the R=B control signal and loaded into MAR to start a
memory read operation. PC is incremented by 4 to point to the next instruction in sequence.
2. WMFC
The processor waits for the MFC signal from memory.
3. MDRoutB, R=B, IRin
The instruction code is received in MDR and transferred to IR. This completes the fetch phase.
4. R4outA, R5outB, SelectA, Add, R6in, End
The instruction is decoded and the add operation takes place.
The decoder/encoder block is a combinational circuit that generates the requested control
signals (outputs) depending on the states of all its inputs.
The step decoder provides a separate signal line for each step, or time slot, in the control
sequence.
Output of instruction decoder consists of a separate line for each machine instruction.
For any instruction loaded in IR, one of the output lines INS1 through INSm is set to 1 and all
other lines are set to 0.
Input signals to the encoder block are combined to generate the individual control signals like
Yin, PCout, Add, End, etc.
Logic Function:
End signal starts a new instruction fetch cycle by resetting the control step counter to its
starting value.
The control hardware can be viewed as a state machine that changes from one state to
another in every clock cycle, depending on the contents of IR, condition codes and external
inputs.
The outputs of the state machine are the control signals.
The sequence of operations carried out by the machine is determined by the wiring of the logic
elements, hence the name “hardwired”.
Most processors today use separate caches for instructions and data.
The processor is connected to the system bus through a bus interface.
To increase the potential for concurrent operations, several integer and floating-point units are provided.
The control unit can generate the control signals for any instruction by sequentially reading the
CWs of the micro routine from the control store.
To accomplish this, the organization of CU can be:
Micro Program Counter (µPC) is used to read the control words sequentially from the control
store.
Every time a new instruction is loaded into the IR, the output of the block labelled “starting
address generator” is loaded into the µPC.
µPC is automatically incremented by the clock, causing successive microinstructions to be
read from the control store.
Therefore, the control signals are delivered to various parts of the processor in correct
sequence.
This organization cannot handle a situation, wherein the CU has to check the status of
condition codes or external inputs.
Hardwired control handles this situation by including an appropriate logic function in the
encoder circuit.
In microprogrammed control, alternative approach is to use conditional branch
microinstructions.
The micro-routine for the Branch<0 instruction works as follows: after Branch<0 is loaded into IR, a branch
microinstruction transfers control to the corresponding micro-routine, which is assumed to start
at location 25 in the control store.
The microinstruction at location 25 tests the N bit of the condition codes.
o If it is 0, a branch takes place to location 0 to fetch a new machine instruction.
o Otherwise, the microinstruction at location 26 is executed, followed by the one at location 27.
To support this microprogram branching, CU is as shown:
In this CU, the µPC is incremented every time a new microinstruction is fetched from the
microprogram memory, except in the following situations:
o When a new instruction is loaded into IR, the µPC is loaded with starting address of
µroutine for that instruction.
o When a Branch µinstruction is encountered and the branch condition is satisfied, the
µPC is loaded with the branch address.
o When an End µinstruction is encountered, the µPC is loaded with the address of first
CW in the µ-routine for instruction fetch cycle.
5.6.1 Microinstructions
A straightforward way to structure microinstructions is to assign one bit position to each control
signal.
This scheme has a serious drawback: assigning individual bits to each control signal results in
long microinstructions because the number of required signals is large.
Only a few bits are set to 1 in any given microinstruction, which means the available bit space is poorly used.
Approaches to design a format for microinstructions:
1) Assume that the processor contains only 4 general-purpose registers, R0, R1, R2 and R3.
Some of the connections in this processor are permanently enabled, such as the output of IR to the
decoding circuits and both inputs to the ALU.
Connections to the various registers require 20 gating signals.
Control signals such as Read, Write, Select, WMFC and End also need space.
Assume the ALU can perform 16 functions, including Add, Subtract, AND and XOR.
In total, 42 control signals are needed.
Disadvantage of this approach: Most signals are not needed simultaneously, and many
signals are mutually exclusive, so this space can be reduced.
2) The signals can be grouped so that all mutually exclusive signals are placed in same group.
A binary coding scheme is used to represent the signals within a group.
Disadvantage of this approach: this format requires a little more hardware because
decoding circuits must be used to decode the bit patterns of each field into individual
control signals.
Advantage: this format results in a smaller control store. Only 20 bits are needed to store
the patterns for the 42 signals.
3) Enumerate the patterns of required signals in all possible microinstructions.
Each meaningful combination of active control signals can be assigned a distinct code
that represents the microinstruction.
Such full encoding reduces the length of microinstructions but increases the complexity of the required
decoder circuits.
Such highly encoded schemes, which use compact codes to specify only a small number
of control functions in each µinstruction, are referred to as a “vertical organization”.
“Horizontal organization” is a minimally encoded scheme in which many resources can be
controlled with a single microinstruction, as shown in Figure 7.15.
This organization is useful when a higher operating speed is desired and when the
machine structure allows parallel use of resources.
The second approach is a horizontal organization.
Branch Address Modification using Bit-ORing
From the flowchart, it can be seen that branches are made to different addresses
because some parts of the micro routine are shared among all the microprograms.
At a point labelled α, a decision is to be made about branching:
o If direct mode is specified, instruction at location 170 is bypassed and control
goes to 171
o If indirect mode is specified, then the µinstruction at location 170 is executed to
fetch the operand from memory.
This is performed using a technique called bit-ORing.
Bit-ORing
Simplest way to transfer control directly to location 171 is to make the preceding branch
µinstruction specify the address 170 and then use an OR gate to change the LSB of this
address to 1 if direct addressing mode is specified. This is known as bit-ORing technique.
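The effect of the OR gate on the branch address can be sketched in a few lines of Python. The addresses are treated as octal here, which is an assumption, and the function name next_address is hypothetical.

```python
def next_address(branch_address: int, direct_mode: bool) -> int:
    """OR a 1 into the LSB of the branch address when direct mode is used."""
    return branch_address | (1 if direct_mode else 0)


print(oct(next_address(0o170, direct_mode=False)))  # 0o170 (indirect: run 170)
print(oct(next_address(0o170, direct_mode=True)))   # 0o171 (direct: skip to 171)
```

This works because the LSB of address 170 is 0, so ORing in a 1 selects the next location without needing a second branch microinstruction.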
                                           Octal   Binary
Address generated by instruction decoder   101     001 000 001
Indexed                                    161     001 110 001
Autodecrement                              141     001 100 001
Autoincrement                              121     001 010 001
Register direct                            101     001 000 001
Register indirect                          111     001 001 001
8th bit: 0 = Direct, 1 = Indirect
The processor has 16 registers that can be used for addressing, each specified using a 4-bit code.
There are 2 stages of decoding:
o The microinstruction field must be decoded to determine that an Rsrc or Rdst register is
involved.
o The decoded output is then used to gate the contents of the Rsrc or Rdst field in IR
into a second decoder, which produces the gating signals for the actual registers R0 to R15.
The micro routine for Add (Rsrc)+, Rdst has two bit-ORing examples:
1) Microinstruction at location 003:
There are 5 starting addresses for the micro routine, depending on the addressing mode.
These addresses differ in the middle octal digit only.
The 3 bits to be ORed with the middle digit are supplied by decoding circuitry connected
to the src address.
The decoding circuits generate the starting address of a given µroutine on the basis of opcode
in IR.
The next address bits are fed through OR gates to µAR.
The address can be modified depending on the data in the IR, condition codes and external
inputs.
Reconsidering the instruction, “Add (Rsrc)+, Rdst”
o µroutine is shown in Figure 7.21
o if we use the control structure just designed, we need to modify the µinstruction format
designed on Figure 7.19
Extra fields to be added along with the previous format are:
o Signal ORmode is used to indicate whether bit-ORing is used or not.
o Signal ORindsrc is used to indicate whether indirect addressing of source operand is
used for wide branching in the flowchart of Figure 7.20.
o One bit in the µinstruction is used to indicate when the output of the instruction decoder
is to be gated into the µAR.
o Each µinstruction contains an 8-bit field that holds the address of the next µinstruction.
5.6.6 Emulation
Given a computer with a certain instruction set, it is possible to define additional machine
instructions and implement them with extra µroutines using microprogrammed control.
Suppose the instruction set of a different computer M2 is added to a given computer M1.
Machine-language programs of M2 can then be run on M1, i.e. M1 emulates M2.
Emulation:
o Allows obsolete equipment to be replaced with up-to-date machines.
o Requires no software changes to run existing programs.
o Facilitates transitions to new computer systems with minimal disruption.
o Is easier when the machines involved have similar architectures.
o However, it can also be done between machines with different architectures.
Problem:
Write the control sequence of execution of the instruction ADD (R3),R1. For this sequence of
instructions, the processor is driven by a continuously running clock such that each control step is
2ns in duration. How long will the processor have to wait in steps 2 & 5, assuming that a memory
read operation takes 16ns to complete? Also compute the percentage of time for which the
processor is idle during the execution of this instruction.
Solution:
Control sequence:
1. PCout, MARin, Read, Select4, Add, Zin
2. Zout, PCin, Yin, WMFC
3. MDRout, IRin
Therefore, total execution time = (5 × 2) + (2 × 16) = 42 ns.
The processor is idle during the memory read operations, i.e. for 32 ns out of 42 ns.
Therefore, processor idle time = 32 ns / 42 ns = 76.2% of the total time.
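The arithmetic in this solution can be reproduced with a short Python sketch. The constant and function names are hypothetical; the step counts (5 steps without a memory read, 2 steps with one) are taken from the formula above.

```python
STEP_NS = 2            # duration of one control step
MEMORY_READ_NS = 16    # duration of one memory read
NON_MEMORY_STEPS = 5   # steps that take a single control step
MEMORY_STEPS = 2       # steps 2 and 5, which wait for a memory read


def execution_time_ns() -> int:
    return NON_MEMORY_STEPS * STEP_NS + MEMORY_STEPS * MEMORY_READ_NS


def idle_fraction() -> float:
    return MEMORY_STEPS * MEMORY_READ_NS / execution_time_ns()


print(execution_time_ns())        # 42
print(f"{idle_fraction():.1%}")   # 76.2%
```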
Ability to handle large/complex instruction sets: difficult with hardwired control, easier with microprogrammed control.