
UNIT-I

Digital Computers

https://jntuhbtechadda.blogspot.com/
Introduction
What is a Computer?
A computer is an electronic device that
- Accepts information from the outside world
- Stores it
- Processes it to produce results
- Transfers the results to the outside world

How are computers classified based on the kind of I/O?


- Analog Computers
- Digital Computers
- Hybrid Computers
Analog Computer:
- Is set up according to initial conditions, which can later be changed freely
- Deals with continuously varying physical quantities such as temperature, pressure, humidity, etc.
- Solutions to a given problem are obtained by measuring the variables/parameters

Digital Computer:
- Performs Various Computational Tasks
- Represents the information as discrete values
- Uses variables to store information

Hybrid Computer:
- Consists of both Analog and Digital components
- Solves differential equations and complex mathematical
equations
- Handles logical and numerical operations

What could be the result of our logic and/or decision making?

Our logic and/or decision making results in either True-or-False or Yes-or-No, i.e., one of two states.

Hence, digital computers function more reliably with only two states.
So, digital computers use the binary number system, which has two digits: 0 and 1. A binary digit is called a bit.
They are also called binary computers.
The information in digital computers is represented / stored as
groups of bits.
What can these groups of bits represent?
- Values
- Letters of alphabet
- Special Symbols
- Instructions or Commands

What is the decimal equivalent for “1001011”?


1 × 2⁶ + 0 × 2⁵ + 0 × 2⁴ + 1 × 2³ + 0 × 2² + 1 × 2¹ + 1 × 2⁰ = 75
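The positional weighting above can be sketched in Python (the function name is illustrative):

```python
# A minimal sketch of positional binary-to-decimal conversion.
def binary_to_decimal(bits: str) -> int:
    """Sum each bit times its power-of-two weight, LSB at index 0."""
    total = 0
    for i, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** i
    return total

print(binary_to_decimal("1001011"))   # 75
```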

A computer system is subdivided into two functional entities:
- Hardware
- Software
Hardware consists of all the electronic and electromechanical components, i.e., the physical parts.
Software consists of the programs and data that are manipulated to perform various data-processing tasks.

What is a program?
A sequence of instructions for the computer to perform a task
is called a program.

The data manipulated by the program usually constitute the database.
Software is basically classified into two types:
- System software consists of a collection of programs that make more effective use of the computer. It is an indispensable part of a total computer system.
  Ex: OS, linkers, loaders, compilers, interpreters, etc.
- Application software consists of programs written by the user to solve particular problem(s).
Examples?

Block Diagram of Digital Computer
[Block diagram: the Central Processing Unit (CPU), containing the Arithmetic & Logic Unit (ALU), registers, and the control unit; Random Access Memory (RAM); and an Input-Output Processor (IOP) connected to the input and output devices. The interconnections provide communication and control the transfer of data.]


Points of View
There exist three different points of view for computer hardware:
- Computer Architecture
- Computer Organization
- Computer Design

[Diagram: Computer Architecture concerns the behavior and specifications: instruction formats, instruction set, and addressing modes. Computer Organization concerns the structure and the connections among hardware components, with verification of their operations. Computer Design (computer implementation) concerns the determination of the hardware and the way of connecting the parts.]
Classification of Architectures
All computer architectures are based on one of the following:
1. Von Neumann Architecture
2. Harvard Architecture

1. Von Neumann Architecture:

[Diagram: a single memory holds both instructions and data, connected to the processor by a common bus carrying address, data, and control lines.]
2. Harvard Architecture:

[Diagram: separate memories and separate buses for instructions and data.]
Register Transfer
and
Microoperations

Topics to be Covered
Ø Register Transfer language

Ø Register Transfer

Ø Bus and Memory Transfers

Ø Arithmetic Microoperations

Ø Logic Microoperations

Ø Shift Microoperations

Ø Arithmetic Logic Shift Unit


Register Transfer language
A digital system:
- varies in size and complexity from a few integrated circuits to a complex of interconnected and interacting digital computers.
- uses a modular approach.
- is an interconnection of digital hardware modules using common data and control paths to accomplish a specific information-processing task.

The parts of the modules, called digital components, include:
- Registers,
- Decoders,
- Arithmetic elements, and
- Control logic.
Hence, a Digital module is defined by:
- A set of registers
- Operations performed on the data stored in them, called microoperations.

So, a microoperation is an elementary operation performed on


the information stored in one or more registers.

The result of the operation may:
- replace the previous information of a register, or
- be transferred to another register.
Examples: shift, count, clear, and load.
Now, the internal hardware organization of a digital computer
is defined by specifying the:
1. Set of registers and their functions.
2. Sequence of microoperations performed on the binary
information stored in the registers.
3. Control that initiates the sequence of microoperations

One way to specify the sequence of microoperations is by


explaining every operation in words. But, it involves a lengthy
descriptive explanation.

Hence, it is convenient to use a suitable symbology for the


description of the sequence of transfers between registers and
the related microoperations along with the control signals.
The symbolic notation used is called a Register Transfer
Language (RTL).
In this, the symbols define:
- Various types of microoperations
- Associated hardware
- Control Signals

Register Transfer
Computer Registers are designated by capital letters
(sometimes followed by numerals) to denote the function of
the register.
Examples:
MAR (Memory Address Register) - holds an address of a
memory location
PC (Program Counter) – holds the address of the next
instruction
IR (Instruction Register) – contains the current instruction
R1 (Processor Register) – holds the temporary data.

The individual flip-flops in an n-bit register are numbered in


sequence from 0 through n - 1, starting from 0 in the rightmost
position and increasing the numbers toward the left.
The following figures illustrate different ways of representing registers in a block diagram.
a> The most common way to represent a register is by a
rectangular box with the name of the register inside.

b> The individual bits can be specified as numbers.

c> The numbering of bits in a 16-bit register can be marked on


top of the box.

d> A 16-bit register, say PC, can be partitioned into two parts: bits 0 through 7 are assigned the symbol L (for low byte) and bits 8 through 15 are assigned the symbol H (for high byte).
A replacement operator (←) is used to denote the information
transfer from one register to another.
To transfer the content of register R1 (the source register) into register R2 (the destination register), the statement R2 ← R1 is used. It results in:
- Replacement of the previous content of R2.
- The content of R1 remaining unaltered.
- The content of R1 being transferred simultaneously to R2 through the circuits from the outputs of R1 to the inputs of R2.

A register transfer under a predetermined condition can be denoted as:
If (P = 1) then (R2 ← R1)
i.e., using an if-then statement. Here, P is a control signal.
For convenience, the control variables are separated from the
register transfer operation using a control function, i.e., a
Boolean variable that is equal to either 1 or 0.
By including the control function, the above statement
becomes:
P: R2 ← R1
The Hardware required to implement the above statement is:

Here,
- The n outputs of register R1 are connected to the n inputs of register R2, where n is the length of the registers.
- The load input of R2 is activated by the control variable P.
- Both register R2 and the control variable are synchronized with the same clock.
- P is activated by the control section on the rising edge of a clock pulse at time t.
- The next positive transition of the clock, at time t + 1, finds the load input active, and the data inputs of R2 are then loaded into the register in parallel.
- P may go back to 0 at time t + 1; otherwise, the transfer will occur with every clock pulse transition.
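The clocked behavior above can be sketched as a small Python model (the class and attribute names are made up for illustration): P gates the load input, and the transfer happens only on a clock edge where P is active.

```python
# Behavioral sketch of the conditional transfer P: R2 <- R1.
# Register contents are plain integers; each call models one rising clock edge.
class TransferCircuit:
    def __init__(self, r1: int, r2: int):
        self.R1, self.R2, self.P = r1, r2, 0

    def clock_edge(self) -> None:
        """On a positive clock transition, load R2 from R1 only if P is active."""
        if self.P == 1:
            self.R2 = self.R1   # old content of R2 is replaced; R1 is unaltered

c = TransferCircuit(r1=0b1010, r2=0)
c.clock_edge()            # P = 0: no transfer
c.P = 1
c.clock_edge()            # P = 1: R2 <- R1
```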
It is illustrated in the following timing diagram

The basic symbols of the register transfer notation are:

BUS and Memory Transfer
A digital system consists of many registers, and paths are required to transfer information among them.
If separate lines were used between each pair of registers, the number of wires would become excessive.
Hence, the most efficient way is to use a common bus system.

A bus is a path consisting of a set of common lines, one for each bit of a register, through which binary information is transferred one register at a time.

Control signals determine which register is selected by the bus during each particular register transfer.
One way of constructing a common bus is with multiplexers.
A multiplexer is a combinational circuit that takes n inputs and provides a single output based on its selection lines.
A 4 × 1 multiplexer with two selection lines S0 and S1 is:

The multiplexer selects the source register whose binary information is placed onto the bus.
A common bus system constructed for four registers is:

- Four 4 X 1 Multiplexers each having four data inputs (0
through 3) and two selection inputs (S0 and S1) are used.
- The two selection lines S0 and S1 are connected to the
selection inputs of all the four multiplexers.
- These selection lines choose the four bits of one register and
transfer them into the four-line common bus.
- With both of the select lines at low logic, i.e. S1S0 = 00, the
0 data inputs of all four multiplexers are selected and
applied to the outputs to form the bus.
- In the above case, the bus lines receive the content of
register A
- Similarly, when S1S0 = 01, register B is selected, and the
bus lines will receive the content of register B.
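A rough Python sketch of this multiplexer-based selection (register names and contents are illustrative): the two select bits choose which register's bits appear on the bus.

```python
# Sketch of a 4-bit common bus built from four 4x1 multiplexers.
# regs[i] holds the 4 bits of register i; select = S1S0 picks the source.
def bus_output(regs: list[list[int]], select: int) -> list[int]:
    """Multiplexer k routes bit k of the selected register onto bus line k."""
    return [regs[select][k] for k in range(4)]

A = [0, 0, 1, 1]   # register A, bit 0 first
B = [1, 0, 1, 0]
C = [1, 1, 1, 1]
D = [0, 1, 0, 0]
bus = bus_output([A, B, C, D], select=0b01)   # S1S0 = 01 selects register B
```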
The function table shown below indicates the register selected by the bus for each of the four possible binary values of the selection lines.

Note:
- The number of multiplexers needed to construct the bus is equal to the number of bits in each register.
- The size of each multiplexer must be k × 1, since it multiplexes k data lines, one from each of the k registers.

The transfer of information from a bus into one of the
destination registers can be accomplished by:
- Connecting the bus lines to the inputs of all destination
registers
- Activating the load control of the particular destination
register selected.

Now, the symbolic statements for a register transfer using a common bus system are:
BUS ← C, R1 ← BUS
If the presence of the bus system is known, it may be convenient to show the transfer as:
R1 ← C
Three State Bus Buffers:
Another way of constructing a common bus system is using
three-state gates instead of multiplexers.
A three-state gate is a digital component that exhibits three states:
- Two states equivalent to logic 1 and 0
- A third, high-impedance state, which behaves like an open circuit and has no logical significance
In a common bus system, the most commonly used three-state gate is the buffer gate.
The graphic symbol of a three-state buffer gate is:

From the above figure, the observations are:
- It differs from a conventional buffer by having both a normal input and a control input.
- When the control input is 1, the output is enabled and the gate behaves like a normal buffer.
- When the control input is 0, the output is disabled and the gate enters the high-impedance state.
Decoder: A decoder is a combinational circuit that has n input lines and a maximum of 2ⁿ output lines. When the decoder is enabled, exactly one of these outputs goes active high, based on the combination of inputs present.
By using the high-impedance state, a large number of three-state gate outputs can be wired together to form a common bus line without danger of loading effects.

- The outputs of four buffers are connected together to form a
single bus line.
- At any given time, only one buffer has to be in the active
state.
- So, the control inputs to the buffers should determine the
buffer to communicate with the bus line.
- Hence, only one control input has to be active.
- This can be ensured by using a decoder, as:
o With the enable input of the decoder as 0, all of its
four outputs are 0, and the bus line is in a high-
impedance state because all four buffers are disabled.
o When the enable input is active, i.e., 1, one of the three-state buffers is enabled, depending on the binary value on the select inputs of the decoder.
Random Access Memory (RAM)
RAM is a collection of storage cells together with associated
circuits needed to transfer information in and out of storage.
The memory stores binary information in groups of bits called
words.
Each memory location is identified by a unique address and holds one word of information.
Let us assume the RAM contains r = 2ᵏ words and each word is of n bits. Hence, it needs:
- n data input lines
- n data output lines
- k address lines
- A Read control line
- A Write control line
Memory Transfer
A memory word is designated by the letter M.
The address of the memory word needs to be specified for memory transfer operations.
The Address Register, AR holds the address and the Data
register, DR holds the data involved in the transfer.
The transfer of information from a memory unit to the outside
is called a Read operation.
So, the Read signal causes a transfer of information into the
data register (DR) from the memory word (M) selected by the
address register (AR).
Thus, a read operation can be stated as:
Read: DR ← M [AR]
The transfer of new information into the memory for storing is
called a Write operation.
So, the Write signal causes a transfer of information from
register DR into the memory word (M) selected by address
register (AR).
Hence, the corresponding write operation can be stated as:
Write: M [AR] ← DR
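The read and write transfers can be sketched in Python, modeling memory as a list of words (the function names are illustrative):

```python
# Sketch of the memory transfers Read: DR <- M[AR] and Write: M[AR] <- DR,
# with memory modeled as a list of words indexed by address.
M = [0] * 8          # a tiny 8-word memory
AR, DR = 0, 0        # address register and data register

def write(mem, ar, dr):
    mem[ar] = dr     # M[AR] <- DR

def read(mem, ar):
    return mem[ar]   # DR <- M[AR]

AR, DR = 3, 0b1011
write(M, AR, DR)     # store the word
DR = read(M, AR)     # read it back into DR
```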

The read and write operations are illustrated by:

Micro-Operations
A microoperation is an elementary operation performed on information stored in registers.

Computer system microoperations are broadly classified into:


Ø Arithmetic microoperations
Ø Logic microoperations
Ø Shift microoperations

Arithmetic Microoperations
• In general, the arithmetic microoperations deal with the operations performed on numeric data stored in the registers.

• The basic Arithmetic Micro-operations are classified into the


following categories:
– Addition
– Subtraction
– Increment
– Decrement

• Some additional Arithmetic Micro-operations are:


– Add with carry
– Subtract with borrow
– Transfer/Load, etc.
The arithmetic microoperations are tabulated as:

In the above table, the multiplication and division operations are not mentioned. In some digital systems they are implemented by combinational circuits, but in general:
• Multiplication is implemented with a sequence of add and shift microoperations.
• Division is implemented with a sequence of subtract and shift microoperations.
Binary Adder
The Add micro-operation’s implementation using Hardware
requires:
- Registers to hold the data
- Digital components to perform the arithmetic addition
A binary adder is a digital circuit that performs the arithmetic sum of two binary numbers of any length.
Binary Adders are classified into two types:
- Half Adder
- Full Adder
Half Adder:

Full Adder using two Half Adders:

Now, a Binary Adder is constructed using full-adder circuits
connected in series, with the output carry from one full-adder
connected to the input carry of the next full-adder.
The following figure illustrates the construction of 4-bit binary
adder using four full adders

The augend bits (A) and the addend bits (B) are designated by
subscript numbers from right to left, with subscript '0'
denoting the low-order bit.
The carry inputs, C0 to C3, are connected in a chain through the full-adders. C4 is the output carry generated by the last full-adder circuit.
The sum outputs, S0 to S3, generate the required arithmetic sum bits of the augend and addend.

In general, for an n-bit binary adder:


- n full adders are required
- Output carry from each full-adder is connected to the input
carry of the next-high-order full-adder.
- The data bits for A input comes from source register, say
R1 and data bits for B input comes from source register R2.
- The resultant sum of the data inputs of A and B can be
transferred to a third register or to one of the source
registers (R1 or R2).
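The carry chain described above can be sketched in Python as a behavioral model, not the gate-level circuit (function names are illustrative):

```python
# Sketch of a 4-bit ripple-carry adder built from full-adder stages.
def full_adder(a: int, b: int, c_in: int) -> tuple[int, int]:
    s = a ^ b ^ c_in                       # sum bit
    c_out = (a & b) | (c_in & (a ^ b))     # carry out
    return s, c_out

def ripple_add(A: list[int], B: list[int], c0: int = 0):
    """A and B are bit lists, index 0 = low-order bit; returns (sum bits, C4)."""
    carry, S = c0, []
    for a, b in zip(A, B):
        s, carry = full_adder(a, b, carry)
        S.append(s)
    return S, carry

S, C4 = ripple_add([1, 0, 1, 0], [1, 1, 0, 0])   # 0101 + 0011 = 1000
```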
Binary Adder-Subtractor
The subtraction microoperation can be performed by:
- Finding the 2's complement of the addend bits.
- Adding it to the augend bits.

Note:
§ The 2's complement can be obtained by taking the 1's complement and adding one to the least significant bit.
§ The 1's complement can be obtained by inverting each bit of the given number.

Both the addition and subtraction operations can be combined into one common circuit by including an exclusive-OR gate with each full-adder.
A 4-bit binary adder-subtractor is:

The mode input M decides the kind of operation:
- At low logic, i.e., 0, the circuit acts as an adder.
- At high logic, i.e., 1, the circuit acts as a subtractor.

Each exclusive-OR gate receives input M and one of the B inputs.
With M = 0,
- The output from XORs will be B⊕ 0 = B.
- So, the full-adders receive the value of B, the input carry is
0, and the circuit performs A plus B.

With M =1,
- The output from XORs will be B⊕ 1 = Bʹ, i.e., B inputs are
complemented.
- Since C0 = 1, a 1 is added through the input carry, to get 2’s
complement of B.
- Hence, the circuit performs the operation A plus the 2's
complement of B, i.e., A - B.
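A behavioral Python sketch of the adder-subtractor: each B bit is XORed with M, and M also supplies the input carry, so M = 1 yields A plus the 2's complement of B.

```python
# Sketch of the 4-bit adder-subtractor with mode input M.
def add_sub(A: list[int], B: list[int], M: int):
    """Bit lists, index 0 = LSB. Returns the 4 result bits and the output carry."""
    carry, S = M, []                      # C0 = M
    for a, b in zip(A, B):
        b = b ^ M                         # complement the B bits when M = 1
        s = a ^ b ^ carry
        carry = (a & b) | (carry & (a ^ b))
        S.append(s)
    return S, carry

diff, _ = add_sub([1, 1, 1, 0], [0, 1, 0, 0], M=1)   # 0111 - 0010 = 0101
```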

Binary Incrementer
The increment micro-operation adds ‘1’ to the value stored in
a register.
For instance, a 4-bit register has a binary value 0110, when
incremented by one the value becomes 0111.
A 4-bit combinational circuit incrementer, by cascading four
half adders is as shown below:

In the above figure,
- The least significant half adder will have logic-1 as one
input and least significant bit of the number to be
incremented as other input.
- The output carry from one half-adder is connected to one of
the inputs of the next-higher-order half-adder.
- The binary incrementer circuit receives the four bits from A0
through A3, adds one to it, and generates the incremented
output in S0 through S3.
- The output carry C4 will be 1 only after incrementing binary
1111.

Note: An n-bit binary incrementer can be constructed by


including ‘n’ half-adders.
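The half-adder chain can be sketched behaviorally (assuming bit lists with index 0 as the LSB):

```python
# Sketch of a 4-bit incrementer as a chain of half adders; the first stage
# adds logic-1 to the LSB and each carry feeds the next-higher stage.
def half_adder(a: int, b: int) -> tuple[int, int]:
    return a ^ b, a & b                 # (sum, carry)

def increment(A: list[int]):
    """A is a bit list, index 0 = LSB; returns (S bits, output carry C4)."""
    carry, S = 1, []                    # constant 1 into the first half adder
    for a in A:
        s, carry = half_adder(a, carry)
        S.append(s)
    return S, carry

S, C4 = increment([0, 1, 1, 0])        # 0110 + 1 = 0111
```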
Binary Decrementer
Decrementing means subtracting 1 from a given value.
Consider R – 1:
- It can be represented as R + (–1), where –1 is represented as the 2's complement of 1, which is all 1's.
- Adding the contents of R to the 2's complement of 1 results in the decrement.
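A minimal sketch of this in Python, assuming a 4-bit register width:

```python
# Sketch of decrementing by adding the all-1's word (the 2's complement of 1).
def decrement(r: int, n: int = 4) -> int:
    """R - 1 computed as R + (2**n - 1), keeping only the low n bits."""
    minus_one = (1 << n) - 1            # all 1's, e.g. 1111 for n = 4
    return (r + minus_one) & ((1 << n) - 1)

result = decrement(0b0110)              # 0110 - 1 = 0101
```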

Arithmetic Circuit
All the arithmetic microoperations can be implemented in one
composite arithmetic circuit.
The basic component of this arithmetic circuit is the parallel
adder.
By controlling the data inputs to the adder, it is possible to
obtain different types of arithmetic operations.

The arithmetic circuit contains:


- Four full-adder circuits that constitute the 4-bit adder.
- Four multiplexers for choosing different operations.
- Two selection signals S0 & S1.
- Two 4-bit inputs A and B.
- A 4-bit output D.
Function Table:

Logic Microoperations
- Specify binary operations for strings of bits stored in
registers.
- Consider each bit of the register separately and treat them as
binary variables.
- Seldom used in scientific computations.
- Useful for bit manipulation of binary data.

Ex: The exclusive-OR microoperation on the contents of registers R1 and R2, under control function P, is symbolized as:
P: R1 ← R1 ⊕ R2

The special symbols used to denote logic operations are:
- OR denoted by ˅
- AND denoted by ˄
- Complement i.e., 1’s complement is denoted by a bar on top
of the symbol representing the register.

The symbol + has two meanings, depending on where it occurs:
- In a microoperation, it denotes the addition operation.
- In a control function, it denotes the OR operation.
Ex: P + Q : R1 ← R2 + R3, R4 ← R5 ˅ R6
Here, P + Q is P OR Q, while R2 + R3 is arithmetic addition.

List of Logic Microoperations
16 different logic operations can be performed with two binary
variables.

They can be determined from all possible truth tables obtained


with two binary variables as shown below:

Hardware Implementation
The hardware implementation requires that logic gates be
inserted for each bit or pair of bits in the registers to perform
the required logic function.

Although there are 16 logic microoperations, only four are used: AND, OR, XOR (exclusive-OR), and complement.
From these four operations, all others can be derived.

The following figure shows one stage of a circuit that


generates the four basic logic microoperations.
It also provides the function table.
S1 S0 | Output    | Operation
0  0  | E = A ⊕ B | XOR
0  1  | E = A ∨ B | OR
1  0  | E = A ∧ B | AND
1  1  | E = A′    | Complement
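One stage can be sketched behaviorally in Python, following the select assignments in the table above:

```python
# Sketch of one stage of the logic circuit:
# S1S0 = 00 -> XOR, 01 -> OR, 10 -> AND, 11 -> complement of A.
def logic_stage(a: int, b: int, s1: int, s0: int) -> int:
    if (s1, s0) == (0, 0):
        return a ^ b        # XOR
    if (s1, s0) == (0, 1):
        return a | b        # OR
    if (s1, s0) == (1, 0):
        return a & b        # AND
    return 1 - a            # complement of A

# One stage with a = 1, b = 0, for every select combination.
E = [logic_stage(1, 0, s1, s0) for s1 in (0, 1) for s0 in (0, 1)]
```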
Some Applications
To manipulate individual bits or a portion of a word stored in a
register, the logic microoperations are very useful.
Logic Microoperations can be used on a register to:
- Change bit values
- Delete a group of bits
- Insert new bit values
The operations are:
§ Selective-Set
§ Selective-Complement
§ Selective-Clear
§ Mask
§ Insert
§ Clear
Selective-set: This operation sets the desired bit(s) of register A to 1.
This can be done by:
- placing 1's in the desired positions of register B, with 0's elsewhere, and
- ORing register B with register A (A ← A ˅ B).

Selective-complement: This operation complements/inverts the desired bit(s) of register A.
This can be done by:
- placing 1's in the desired positions of register B, with 0's elsewhere, and
- XORing register B with register A (A ← A ⊕ B).

Selective-clear: This operation clears the desired bit(s) of register A to 0.
This can be done in two ways.
In the first method:
- place 1's in the desired positions of register B, with 0's elsewhere, and
- perform A ← A ˄ Bʹ.

In the second method:
- place 0's in the desired positions of register B, with 1's elsewhere, and
- perform A ← A ˄ B.

Mask: This operation is similar to the selective-clear operation
except that only a portion of the bits of A are cleared.
This can be done:
- by placing 0(s) in the specified portion in the register B, the
remaining portion as 1(s) &
- performing A ← A ˄ B.

Insert: This operation inserts a new value into a group of bits.


This is done by:
- first masking the bits and
- then ORing them with the required value.

Clear: This operation clears all the bits of register A.
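These operations can be sketched with Python bitwise operators on 8-bit values (an assumed width; B acts as the mask, and the function names are illustrative):

```python
# Sketch of the logic-microoperation applications on 8-bit register values.
def selective_set(A, B):        return A | B          # OR: set where B has 1's
def selective_complement(A, B): return A ^ B          # XOR: invert where B has 1's
def selective_clear(A, B):      return A & ~B & 0xFF  # A AND B': clear where B has 1's
def mask(A, B):                 return A & B          # AND: clear where B has 0's
def insert(A, value, field):
    """First mask out the field bits of A, then OR in the new value."""
    return mask(A, ~field & 0xFF) | (value & field)
def clear(A):                   return 0

result = selective_set(0b1010, 0b0110)                # -> 0b1110
```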


Shift Microoperations
Shift microoperations are used for:
- serial transfer of data
- in conjunction with arithmetic, logic, and other data-
processing operations.
The contents of a register can be either shifted to the left or the
right. In both, the first flip-flop receives its binary information
from the serial input.
During a shift-left operation the serial input transfers a bit into
the rightmost position.
During a shift-right operation the serial input transfers a bit into
the leftmost position.
There are three types of shifts:
- Logical
- Circular
- Arithmetic
Symbolic Designation Description
R ← shl R Logical Shift-left register R
R ← shr R Logical Shift-right register R
R ← cil R Circular shift-left register R
R ← cir R Circular shift-right register R
R ← ashl R Arithmetic shift-left R
R ← ashr R Arithmetic shift-right R

A logical shift is one that transfers 0 through the serial input.


The symbols shl and shr are used to denote logical shift-left
and shift-right microoperations.

The circular shift (rotate) operation, circulates the bits of the
register around the two ends without loss of information.
This is accomplished by connecting the serial output of the
shift register to its serial input.
The symbols cil and cir are used for the circular shift left and
right, respectively.

An arithmetic shift is a microoperation that shifts a signed
binary number either to the left or right.

An arithmetic shift-left multiplies a signed binary number by


2. An arithmetic shift-right divides the number by 2.

Arithmetic shifts must leave the sign bit unchanged because


the sign of the number remains the same after the operation.

The leftmost bit in a register holds the sign bit, and the
remaining bits hold the number.
The sign bit is 0 for positive and 1 for negative. Negative
numbers are in 2's complement form.
Let us consider a register R of n bits.
Bit Rn-1, in the leftmost position, holds the sign bit.
Rn-2 is the most significant bit of the number and R0 is the least significant bit.

The arithmetic shift-right leaves the sign bit unchanged and


shifts the number (including the sign bit) to the right.
Thus Rn-1 remains the same, Rn-2 receives the bit from Rn-1 and
so on for the other bits in the register. The bit in R0 is lost.
The arithmetic shift-left inserts a 0 into R0, and shifts all other
bits to the left.
The initial bit of Rn-1 is lost and replaced by the bit from Rn-2.
Because of this, there is a possibility of sign reversal if the bit
in Rn-1 changes in value after the shift, due to overflow.

An overflow occurs after an arithmetic shift left, if Rn-1 is not


equal to Rn-2 before the shift.

An overflow flip-flop V, can be used to detect an arithmetic
shift-left overflow. Hence
V = Rn-1 ⊕ Rn-2
If V = 0, there is no overflow.
If V = 1, there is an overflow and a sign reversal after the shift.
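The shifts and the overflow check can be sketched on an integer-held register (an 8-bit width is assumed for illustration):

```python
# Sketch of the shift microoperations on an n-bit register held as an integer.
N = 8
MASK = (1 << N) - 1

def shl(r):  return (r << 1) & MASK                        # logical shift-left, 0 in
def shr(r):  return r >> 1                                 # logical shift-right, 0 in
def cil(r):  return ((r << 1) | (r >> (N - 1))) & MASK     # circular shift-left
def cir(r):  return ((r >> 1) | ((r & 1) << (N - 1))) & MASK   # circular shift-right
def ashr(r):                                               # arithmetic right: keep sign bit
    sign = r & (1 << (N - 1))
    return (r >> 1) | sign
def ashl(r):                                               # arithmetic left; V flags overflow
    v = ((r >> (N - 1)) & 1) ^ ((r >> (N - 2)) & 1)        # V = R(n-1) xor R(n-2)
    return (r << 1) & MASK, v

res, V = ashl(0b01000001)     # R(n-1) != R(n-2): overflow, V = 1
```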

Hardware Implementation
One way to construct a shift unit is with a bidirectional shift register with parallel load.
In this:
- Information can be transferred to the register in parallel.
- Can be shifted either to the right or left.
In this type of shift unit,
- A clock pulse is needed for loading the data into the register.
- Another pulse is needed to initiate the shift.

With many registers in a processor unit, it is more efficient to


implement the shift operation with a combinational circuit.
In this implementation:
- The content of a register to be shifted is placed onto a
common bus.
- The output of the common bus is connected to the
combinational shifter.
- The shifter shifts the loaded content.
- The result is loaded back into the register.
This whole operation requires only one clock pulse for loading
the shifted value into the register.
A 4-bit combinational circuit shifter can be constructed with:
- Four Multiplexers
- Four data inputs, A0 through A3
- Four data outputs, H0 through H3
- Two serial inputs, one for shift left (IL) and the other for
shift right (IR)
- One Selection Input, S
For S = 0, the input data are
shifted right (down in the
diagram).
For S = 1, the input data are
shifted left (up in the diagram).
The function table shows which
input goes to each output after the
shift.

A shifter with n data inputs


and outputs requires n
multiplexers.
Arithmetic Logic Shift Unit
In a computer system, instead of having individual registers perform the microoperations directly, a number of storage registers are connected to a common operational unit called the Arithmetic Logic Unit (ALU).
To perform a microoperation:
- The contents of specified registers are placed in the inputs
of the common ALU.
- The ALU performs an operation.
- The result of the operation is then transferred to a
destination register.
Note: All the above operations can be performed during one
clock pulse period.
To perform shift operations there are two possibilities:
1. A separate shift unit
2. The shift unit as part of the overall ALU.
One stage of an arithmetic logic shift unit is:

In the above Circuit:
- The subscript i designates a typical stage.
- Inputs Ai and Bi are applied to both the arithmetic and logic
units.
- A particular microoperation is selected with inputs S1 and
S0.
- The data in the 4 X 1 multiplexer are selected with inputs S3
and S2.
- The inputs to the multiplexer are:
ØAn arithmetic output in Di.
ØA logic output in Ei.
ØAi+1 for the shift-right operation
ØAi-1 for the shift-left operation
The one stage ALU circuit provides:
- Eight arithmetic operations,
- Four logic operations,
- Two shift operations.
Each operation is selected with the five variables S3, S2, S1, S0,
and Ci.

- The one stage circuit must be repeated n times for an n-bit
ALU.
- The output carry Ci+1 of a given arithmetic stage must be
connected to the input carry Ci of the next stage in
sequence.
- The input carry to the first stage, Cin, provides a selection
variable for the arithmetic operations.

Basic Computer
Organization and Design

Topics to be Covered
ØInstruction Codes

ØComputer Registers

ØComputer Instructions

ØTiming and Control

ØInstruction Cycle

ØMemory Reference Instructions

ØInput–Output and Interrupt


Instruction Codes
The organization of the computer is defined by:
- Internal Registers
- Timing and Control Structure
- Set of Instructions
The internal organization of a digital system is defined by the sequence of microoperations performed on the data stored in its registers.

The general-purpose digital computer is capable of:
- Executing various microoperations
- Performing a specific sequence of operations as instructed
The user of a computer can specify the sequence of operations
by means of a program.
Hence, a program is a set of instructions that specify:
- Operations
- Operands
- Sequence of processing

The task of data processing may be altered by specifying either:
- A new program with different instructions, or
- The same instructions with different data
- A Computer Instruction is a binary code that specifies a
sequence of microoperations to be performed.
- Every computer has its own unique instruction set.
- Instructions and data are stored in memory.
- The stored program concept, the important property of a
general-purpose computer, is the ability to store and execute
instructions.
- The computer reads each instruction from memory and
places it in a control register.
- The control unit interprets the binary code of the instruction and
executes it by issuing a sequence of microoperations.

An instruction code is a group of bits that instructs the computer to perform a specific operation.
It is usually divided into parts, each with its own particular
interpretation.
- The first part is the operation part, called the operation code.
- The remaining parts specify registers, memory words, and/or
the operands themselves.
The operation code of an instruction is a group of bits that defines the operation; the number of bits depends on the total number of available operations.
Ex: Add, Subtract, Multiply, Shift, Complement, etc.
The operation code must consist of at least n bits for a given 2^n (or fewer) distinct operations.
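As a small worked example, the minimum opcode width for a given operation count can be computed directly from this rule:

```python
import math

# Minimum number of opcode bits for a given count of distinct
# operations: n bits can encode up to 2**n operations, so we need
# the ceiling of log2 (and at least one bit).
def opcode_bits(num_operations):
    return max(1, math.ceil(math.log2(num_operations)))
```

For example, 16 operations need 4 bits, while 25 operations need 5.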
The relationship between a computer operation and a
microoperation is:
- An operation is part of an instruction, i.e., a binary code that
instructs the computer to perform a specific operation.
- After receiving the instruction from memory, the control
unit interprets the operation code bits to issue a sequence of
control signals for initiation of microoperations in internal
registers.
- Thus, for every operation code, the control issues a
sequence of microoperations needed for the hardware
implementation of the specified operation.
So, an operation code is sometimes called a macrooperation because it specifies a set of microoperations.
The specified operation must be performed on some data
stored in processor registers or in memory.
So, an instruction code also specifies both the source and the destination.
Memory words can be specified in instruction codes by their
address.
Processor registers can be specified by a binary code of k bits
that selects one of 2^k registers.
Each computer has its own instruction code format.
These formats are framed by computer designers while specifying the architecture of the computer.
Stored Program Organization
Let us organize a simple computer with:
- One processor register
- An instruction code format

The instruction format consists of two parts:
- The first part specifies the operation to be performed
- The second part specifies the memory address of the operand
The operand read from memory is used as data, operated on together with the data stored in the processor register.
The stored program organization is illustrated in the accompanying figure.
In the instruction format:
- Four bits form the operation code (opcode), specifying one out of 16 possible operations
- 12 bits specify the memory address of an operand, as the memory size is 4096 words
A single processor register is usually named the accumulator
and labeled AC.
The operation is performed with the memory operand and the
content of AC (Second Operand).
For some instructions, an instruction code does not need an
operand from memory, i.e., bits 0 through 11 can be used for
other purposes.
For example, operations on data stored in the AC register, such
as clear AC, complement AC, and increment AC.
Indirect Address
The second part of an instruction code (bits 0-11) can specify:
- The actual operand, called an immediate operand.
- The address of the operand, called a direct address.
- The address of a memory word that contains the address of the
operand, called an indirect address.
Note: One bit in the instruction code can be used to
distinguish between a direct and an indirect address.
The instruction format consists of:
- A 3-bit operation code
- A 12-bit address
- An indirect address mode bit,
designated by I
I = 0 for a Direct Address; I = 1 for an Indirect Address
Direct and indirect addressing are demonstrated in the accompanying figure.
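The effective-address resolution just described can be sketched in Python, with memory modeled as a plain list (a sketch of the idea, not the hardware):

```python
# Effective-address resolution with the mode bit I.
# I = 0: the address field itself is the operand address (direct).
# I = 1: the addressed word holds the operand address (indirect).
def effective_address(i_bit, address, memory):
    return memory[address] if i_bit else address
```

For example, with memory[300] = 1350, a direct reference to 300 yields the address 300, while an indirect reference yields 1350.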
Computer Registers
For the execution of the program stored in the memory, the
registers for the following purposes are required:
- A counter to calculate the address of the next instruction.
- Storing the instruction code after it is read from memory.
- Manipulation of data.
- Holding a memory address.
In the above figure:
The memory unit has a capacity of 4096 words and each word
contains 16 bits. So,
- All the registers holding an address should be 12 bits wide
ØAddress Register (AR)
ØProgram Counter (PC)
- All the registers holding data/instructions should be 16 bits
wide.
Ø Data Register (DR)
Ø Accumulator (AC)
Ø Instruction Register (IR)
Ø Temporary Register (TR)
Hence, the processor (CPU) should also be 16 bits wide.
The input register (INPR) receives an 8-bit character from an input
device.
The output register (OUTR) holds an 8-bit character for an output device.
In the instruction code:
- 12 bits for the address of an operand.
- 3 bits to specify operation, i.e., Opcode
- 1 bit to specify a direct or indirect address

The PC:
- goes through a counting sequence by incrementing, so that
the computer reads the instructions sequentially.
- sequential execution continues unless a branch instruction,
which calls for a transfer to a nonconsecutive instruction in the
program, is encountered.
- the address part of a branch instruction is transferred to
become the address of the next instruction.
Common Bus System
Our basic computer has:
- Eight Registers,
- A Memory Unit, &
- A Control Unit.
So, paths must be provided to transfer information from one register to another and between memory and registers.

An efficient way is to use a common bus system, built either with multiplexers or with three-state buffer gates.

The connection of the registers and memory of the basic computer to the common bus system is shown in the accompanying figure.
S2S1S0 – Selection Lines
LD – Load
INR – Increment
CLR – Clear
E – End Carry
Computer Instructions
The basic computer supports three instruction code formats of
16 bits each. In this:
- The operation code (opcode) part of the instruction contains
three bits.
- The meaning of the remaining 13 bits depends on the
opcode.
1> A memory-reference instruction, out of the 13 bits, uses:
- 12 bits to specify an address.
- One bit to specify the addressing mode I: I = 0 for a direct
address and I = 1 for an indirect address.
2> A register-reference instruction specifies:
- The operation code 111 with a 0 in the leftmost bit (bit 15)
of the instruction.
- An operation on or a test of the AC register by using the
other 12 bits.
3> An input-output instruction specifies:
- The operation code 111 with a 1 in the leftmost bit of the
instruction.
- The remaining 12 bits specify the type of input-output
operation or test performed.
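The classification of the three formats can be sketched as a small Python function, using bits 12-14 as the opcode and bit 15 as I (as in the formats above):

```python
# Classify a 16-bit instruction word of the basic computer:
# opcode != 111 -> memory-reference;
# opcode == 111 with I = 0 -> register-reference;
# opcode == 111 with I = 1 -> input-output.
def classify(word):
    opcode = (word >> 12) & 0b111
    i_bit = (word >> 15) & 1
    if opcode != 0b111:
        return "memory-reference"
    return "input-output" if i_bit else "register-reference"
```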
Instruction Set Completeness
The instruction set designed for a computer is said to be
complete if it includes a sufficient number of instructions in
each of the following categories:
1. Arithmetic, logical, and shift instructions.
2. Instructions for moving information to and from memory
and processor registers.
3. Program control instructions together with instructions that
check status conditions.
4. Input and output instructions.
- ADD, INC, AND, CMA, CLA, CIR, CIL
- LDA, STA
- BUN, BSA, ISZ, SPA, SNA, SZA, SZE, SKI, SKO
- INP, OUT
Timing and Control
A master clock generator’s clock pulses are used to control the
timing of all the flip-flops and registers of both the system and
control unit.

The clock pulses able to change the state of a register only


after it is enabled by a control signal.

The control signals generated in the control unit, provide:


- Control inputs for the multiplexers in the common bus
- Control inputs in processor registers
- Microoperations for the accumulator
There are two major types of control organization:
- Hardwired Control
- Microprogrammed control
The control logic in hardwired control is implemented with:
- Gates
- Flip-flops
- Decoders
- Other digital circuits
Advantage:
It can be optimized to produce a fast mode of operation.
Disadvantage:
To modify or change the design, the wiring among the various components needs to be changed.
In the microprogrammed organization:
- The control information is stored in a control memory.
- The control memory is programmed to initiate the required
sequence of microoperations.
Advantage:
The required changes or modifications can be done by
updating the microprogram in control memory.
Disadvantage:
Optimization at the hardware level is not possible.
The block diagram of the control unit, shown below, consists of:
- Two decoders
- A sequence counter
- A number of control logic gates
Instruction Cycle
The program residing in memory is executed in the computer
by going through a cycle for each instruction.
Each instruction cycle in turn is subdivided into a sequence of
subcycles or phases.
In the basic computer, each instruction cycle consists of the following phases:
1. Fetch an instruction from memory.
2. Decode the instruction.
3. Read the effective address from memory if the instruction
has an indirect address.
4. Execute the instruction.
The above steps are repeated for each instruction until a HALT instruction is encountered.
Fetch and Decode
Initially:
- The program counter PC is loaded with the address of the
first instruction in the program.
- The sequence counter SC is cleared to 0, providing a
decoded timing signal T0.
The microoperations for the fetch and decode phases are specified using register transfer statements as given below:
T0: AR ← PC
T1: IR ← M[AR], PC ← PC + 1
T2: D0, ... , D7 ←Decode IR(12-14), AR←IR(0-11), I←IR(15)
At T0, to perform AR ← PC :
1. Place the content of PC onto the bus by making the bus
selection inputs S2S1S0 equal to 010.
2. Transfer the content of the bus to AR by enabling the LD
input of AR.
At T1, to perform IR ← M[AR], PC ← PC + 1:
1. Enable the read input of memory.
2. Place the content of memory onto the bus by making
S2S1S0 = 111.
3. Transfer the content of the bus to IR by enabling the LD
input of IR.
4. Increment PC by enabling the INR input of PC.
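The fetch-phase transfers just described can be modeled with plain Python variables, memory as a list (a sketch of the register transfers, not of the bus hardware):

```python
# Fetch phase: T0: AR <- PC, then T1: IR <- M[AR], PC <- PC + 1.
def fetch(pc, memory):
    ar = pc            # T0: PC on the bus, loaded into AR
    ir = memory[ar]    # T1: memory word on the bus, loaded into IR
    pc = pc + 1        #     PC incremented via its INR input
    return ar, ir, pc
```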
Determine the Type of Instruction
D7ʹIT3: AR ← M[AR]
D7ʹIʹT3: Nothing
D7IʹT3: RR Instruction
D7IT3: IO Instruction
Register-Reference Instructions
Memory-Reference Instructions
The function of the memory-reference instructions can be
defined precisely by means of register transfer notation as
shown below:
The effective address is placed in the address register AR:
- During timing signal T2 when I = 0.
- During timing signal T3 when I = 1 as AR ← M[AR].
The execution of the memory-reference instructions starts with
timing signal T4.

AND to AC (AND):
D0T4: DR ← M [AR]
D0T5: AC ← AC ∧ DR, SC ← 0

ADD to AC (ADD):
D1T4: DR ← M [AR]
D1T5: AC ← AC + DR, E ← Cout, SC ← 0
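The ADD microoperation, including the transfer of the output carry into E, can be sketched as follows (assuming unsigned 16-bit registers with wraparound):

```python
# ADD to AC with the end carry stored in E.
def add_to_ac(ac, dr):
    total = ac + dr
    return total & 0xFFFF, (total >> 16) & 1   # new AC, new E
```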
Load to AC (LDA):
D2T4: DR ← M [AR]
D2T5: AC ← DR, SC ← 0

Store AC (STA):
D3T4: M [AR] ← AC, SC ← 0

Branch Unconditionally (BUN):
D4T4: PC ← AR, SC ← 0

Increment and Skip if Zero (ISZ):
D6T4: DR ← M [AR]
D6T5: DR ← DR + 1
D6T6: M [AR] ← DR, if (DR = 0) then (PC ← PC + 1), SC ← 0
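The ISZ semantics can be sketched as (16-bit wraparound assumed; memory modeled as a list):

```python
# ISZ: increment the addressed memory word; if the result is zero,
# skip the next instruction by incrementing PC.
def isz(address, memory, pc):
    memory[address] = (memory[address] + 1) & 0xFFFF
    if memory[address] == 0:
        pc += 1
    return pc
```

A word holding 0xFFFF (i.e., -1) becomes 0 after the increment, so the next instruction is skipped; any other value just gets incremented.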
Branch and Save Return Address (BSA):
- Useful for branching to a portion of the program called a
subroutine or procedure.
- Stores the address of the next instruction in sequence into a
memory location specified by the effective address.
- Transfers the effective address plus one to PC to serve as the
address of the first instruction in the subroutine.
D5T4: M [AR] ← PC, AR ← AR + 1
D5T5: PC ← AR, SC ← 0
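The BSA transfers can be sketched as (memory as a list):

```python
# BSA: save the return address (PC) at the effective address, then
# branch to the word after it, which is the subroutine entry point.
def bsa(ar, pc, memory):
    memory[ar] = pc    # D5T4: M[AR] <- PC
    return ar + 1      # D5T4/T5: AR <- AR + 1, PC <- AR
```

For example, calling a subroutine at effective address 135 while PC = 21 stores 21 in location 135 and continues execution at 136.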
Input-Output and Interrupt
For the full functionality of a computer, the communication
with the external environment is essential, because:
- Instructions and data stored in memory must be entered
through some input device.
- Computational results must be available to the user through
some output device.
Commercial computers include many types of input and output devices.

To demonstrate, a terminal unit with a keyboard as an input device and a printer as an output device is used.
Input-Output Configuration
- The terminal sends and receives serial information, in units of
eight bits of an alphanumeric code.
- The serial information from the keyboard is shifted into the
input register INPR.
- The serial information for the printer is stored in the output
register OUTR.
- These two registers communicate with a communication
interface serially and with the AC in parallel.
The input-output configuration is shown in the accompanying figure.
Input-Output Instructions
Input and output instructions are needed for:
- Transferring information to and from AC register
- Checking the flag bits
- Controlling the interrupt facility
These instructions have 1111 in bits 12-15 (i.e., D7 = 1 and I = 1); the remaining bits of the instruction specify the particular operation.
Program Interrupt
The process of communication can be achieved in two ways:
- Programmed Control Transfer
- Interrupt Controlled Transfer
Programmed Control Transfer:
In this, the computer
- Keeps checking the flag bit
- On set, initiates an information transfer.
The information flow rate of the computer is much higher than
that of the input-output device, which makes this transfer inefficient.
An alternative is to let the external device inform the computer once it is ready for the transfer; this leads to…
Interrupt Controlled Transfer:
- This type of transfer uses the interrupt facility.
- The computer does not check the flags, while running a program.
- Once a flag is set, the computer is momentarily interrupted from
proceeding with the current program.
- The computer deviates momentarily from the current work to
take care of the input or output transfer.
- It then resumes the current program, from the point prior to the
interrupt.
The interrupt enable flip-flop (IEN) can be set and cleared with two instructions:
- When IEN is cleared to 0 (using the IOF instruction), the flags cannot interrupt the computer.
- When IEN is set to 1 (using the ION instruction), the computer can be interrupted.
These two instructions provide the programmer, the capability
of making a decision to use or not to use the interrupt facility.
Interrupt Cycle
The R flip-flop is set to 1 if:
- IEN = 1 and either FGI or FGO is equal to 1, and
- none of the timing signals T0, T1, or T2 is active.
The condition for setting flip-flop R to 1 can be expressed as:
T0ʹT1ʹT2ʹ(IEN)(FGI + FGO): R ← 1
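The setting condition for R can be written directly as a boolean function:

```python
# R is set when no fetch/decode timing signal is active, interrupts
# are enabled, and at least one of the I/O flags is raised:
# T0'T1'T2'(IEN)(FGI + FGO).
def set_r(t0, t1, t2, ien, fgi, fgo):
    return (not t0) and (not t1) and (not t2) and ien and (fgi or fgo)
```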
The modified fetch and decode phases will be:
RʹT0: AR ← PC
RʹT1: IR ← M[AR], PC ← PC + 1
RʹT2: D0, ... , D7 ←Decode IR(12-14), AR←IR(0-11),
I←IR(15)
During Interrupt Cycle:
RT0: AR ← 0, TR ← PC
RT1: M [AR ] ← TR, PC ← 0
RT2: PC ← PC + 1, IEN ← 0, R ← 0, SC ← 0
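The interrupt-cycle transfers can be sketched as follows (memory as a list, the flip-flops as a dict):

```python
# Interrupt cycle: save the return address in location 0, branch to
# location 1, and clear IEN and R.
def interrupt_cycle(pc, memory, flags):
    memory[0] = pc          # RT0/RT1: AR <- 0, TR <- PC, M[AR] <- TR
    pc = 0 + 1              # RT1/RT2: PC <- 0, then PC <- PC + 1
    flags["IEN"] = False    # RT2: IEN <- 0
    flags["R"] = False      #       R <- 0
    return pc
```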
UNIT-II
Microprogrammed
Control
Topics to be Covered
ØControl Memory
ØAddress Sequencing
ØMicroprogram Example
ØDesign of Control Unit
Control Memory
• The control unit in a digital computer initiates sequences of
microoperations.
• There are finite number of different types of
microoperations in a given system.
• The number of sequences of microoperations that are
performed determines the complexity of the digital system.
• In a hardwired control unit, the control signals are generated by hardware using conventional logic design techniques.
• Microprogramming, an elegant and systematic method, is a second alternative for designing the control unit of a digital computer.
The control function that specifies a microoperation is a binary
variable or control variable.
- In one binary state, the corresponding microoperation is
executed.
- The opposite binary state does not change the state of the
registers in the system.
- The active state may be either the 1 state or the 0 state,
depending on the application.
For example, in a bus-organized system, the selection of the paths in multiplexers, decoders, and arithmetic logic units is based on the control signals (a group of bits) specifying microoperations.
So, the control unit initiates a series of sequential steps of
microoperations.
But, at any given time, certain microoperations are to be
initiated, while others remain idle.
The control variables at any given time can be represented by
a string of 1's and 0's called a control word.
Control words can be programmed to perform various
operations on the components of the system.
A control unit, in which binary control variables are stored in
memory is called a microprogrammed control unit.
Each word in control memory represents a microinstruction.
The microinstruction specifies one or more microoperations
for the system.
A sequence of microinstructions constitutes a microprogram.
Since alterations of the microprogram are not needed, the
control memory can be a read-only memory (ROM).
ROM words are made permanent during the hardware
production of the unit.
The use of a microprogram involves placing all control
variables in words of ROM for use by the control unit through
successive read operations.
The content of the word in ROM at a given address specifies a
microinstruction.
In dynamic microprogramming, a microprogram will be
loaded initially from an auxiliary memory such as a magnetic
disk.
Control units that use dynamic microprogramming employ a
writable control memory, but is used mostly for reading.
A computer that employs a microprogrammed control unit will
have two separate memories:
- Main Memory.
- Control Memory.
Main Memory:
- Available to the user for storing the programs, consisting of
machine instructions and data.
- The contents may alter when the data are manipulated and
the program is changed.

Control Memory:
- Holds a fixed microprogram that cannot be altered by the
occasional user.
The microprogram consists of microinstructions that specify
various internal control signals for execution of register
microoperations.
Each machine instruction initiates a series of microinstructions in control memory.

These microinstructions generate the microoperations to:
- fetch the instruction from main memory
- evaluate the effective address
- execute the operation specified by the instruction
- return control to the fetch phase in order to repeat the cycle
for the next instruction
The general configuration of a microprogrammed control unit
is demonstrated in the following block diagram:
- The control memory here is a ROM, in which all control information is permanently stored.
- The control (memory) address register (CAR) specifies the
address of the microinstruction.
- The control data register (CDR) holds the microinstruction
read from the memory.
- The microinstruction contains a control word, that specifies
one or more microoperations for the data processor
- After the execution, the control must determine the next address
through Next Address Generator (Sequencer).
- The location of the next microinstruction may be the one next in
sequence, or may be located somewhere else in the control
memory.
- So, a microinstruction contains bits for initiating
microoperations in the data processor part and bits that also
determine the address sequence for the control memory.
- The next address may also be a function of external input
conditions.
Typical functions of a microprogram sequencer are:
• loading an initial address to start the control operations
• incrementing the control address register by one
• loading an address from control memory
• transferring an external address
The CDR holds the present microinstruction while the next
address is computed and read from memory.
Because of this, the CDR is sometimes called a pipeline register.
This configuration requires a two-phase clock:
- One clock applied to the address register.
- Second one to the data register.
The system can operate without the CDR by applying a single-phase clock to the address register.
The control word and next-address information are taken
directly from the control memory.
The ROM operates as a combinational circuit, with the address
value as the input and the corresponding word as the output.
The content of the specified word in ROM remains in the
output wires as long as its address value remains in the CAR.
No read signal is needed as in a random-access memory.
With the single phase clock, the CAR is the only component in
the control system that receives clock pulses.
The other two components, the sequencer and the control
memory are combinational circuits and do not need a clock.
The main advantage of microprogrammed control is that, once the hardware configuration is established, there is no need for further hardware or wiring changes.

To establish a different control sequence for the system, it is sufficient to specify a different set of microinstructions for control memory.
Address Sequencing
Microinstructions are stored in control memory in groups,
with each group constituting a routine.
Each computer instruction has its own microprogram routine
to generate the corresponding microoperations.
The hardware controlling the address sequencing must be capable of:
- Sequencing the microinstructions within a routine.
- Branching from one routine to another.
Let us discuss the steps involved during the execution of a single computer instruction.
1. Fetch Routine:
As soon as the computer is turned on, an initial address, usually
the address of the instruction fetch routine, is loaded into the CAR.
The fetch routine is sequenced by incrementing the control
address register through the rest of its microinstructions.
At the end of the fetch routine, the instruction is in the IR of
the computer.
2. Effective Address Calculation Routine:
Next, the control memory must go through this routine, to
determine the effective address of the operand.
This routine can be reached through a branch microinstruction,
based on the status of the mode bits of the instruction.
After the completion of this routine, the address of the operand
is available in the memory address register (MAR).
3. Execution Routine:
Depending on the operation code, the microoperation steps
will be generated.
Each instruction has its own microprogram routine stored in a
given location of control memory.
The transformation from the instruction code bits to an address
of the routine is referred to as a mapping process.
Once the required routine is reached, the microinstructions may be sequenced by incrementing the CAR.
Sometimes, the sequence of microoperations will depend on
values of certain status bits in processor registers.
Microprograms that employ subroutines will require an
external register for storing the return address.
4. Return to the Fetch Routine:
After the completion of execution, the control must return to
the fetch routine.
This is accomplished by executing an unconditional branch
microinstruction to the first address of the fetch routine.
The address sequencing capabilities required in a control memory are:
1. Incrementing of the control address register.
2. Unconditional branch or conditional branch, depending on
status bit conditions.
3. A mapping process from the bits of the instruction to an
address for control memory.
4. A facility for subroutine call and return.
In the above block diagram of a control memory, the selection
of the next microinstruction address can happen in the following ways:
- The incrementer increments the content of the CAR by one,
to select the next microinstruction in sequence.
- Branching is achieved by specifying the branch address in
one of the fields of the microinstruction.
- Conditional branching is obtained by using part of the
microinstruction to select a specific status bit in order to
determine its condition.
- An external address is transferred into control memory via a
mapping logic circuit.
- The return address for a subroutine is taken from a special
register when returning from the subroutine.
Conditional Branching
The branch logic facilitates the decision-making capabilities in
the control unit.
The status conditions are special bits, providing information
such as:
- Carry-out of an adder,
- Sign bit of a number,
- Mode bits of an instruction,
- Input or output status conditions.
The information in these bits can be tested for either 0 or 1,
and actions initiated based on the result.
The status bits and the field in the microinstruction specifying
a branch address, control the conditional branch decisions.
The simplest way is to test the specified condition:
- if the condition is met, branch to the indicated address;
- otherwise, the address register is incremented.
Let us implement this using a multiplexer, considering eight status bit conditions in the system.
Three bits in the microinstruction are used to specify any one
of eight status bit conditions and provide the selection
variables for the multiplexer.
- A ‘1’ output in the multiplexer generates a control signal to
transfer the branch address from the microinstruction into
the CAR.
- A ‘0’ output in the multiplexer causes the address register to
be incremented.
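A sketch of this branch selection logic, with the status bits modeled as a list indexed by the 3-bit select field:

```python
# Branch logic sketch: the select field picks one of eight status
# bits; a 1 loads the branch address into CAR, a 0 increments CAR.
def next_car(car, select, status_bits, branch_address):
    return branch_address if status_bits[select] else car + 1
```

Fixing one status bit permanently at 1 and selecting it gives the unconditional branch described next.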
An unconditional branch can be implemented by loading the
branch address from control memory into the CAR.
This can be accomplished by fixing the value of one status bit at the input of the multiplexer, so that it is always equal to 1.

A reference to this bit by the status bit select lines from control
memory causes the branch address to be loaded into the CAR
unconditionally.
Mapping of Instruction
A special type of branching is used to transfer control to the
first word of a microprogram routine for an instruction.
The status bits for this type of branch are the bits in the
operation code part of the instruction.
Here,
- An operation code of four bits is used to specify up to 16
distinct instructions.
- The control memory has 128 words, requiring an address of
seven bits, from 0000000 to 1111111.
- For each operation code there exists a microprogram routine
in control memory that executes the instruction.
- This provides for each computer instruction a microprogram
routine with a capacity of four microinstructions.
- If the routine needs more than four microinstructions, it can
use addresses 1000000 through 1111111.
- If it uses fewer than four microinstructions, the unused
memory locations would be available for other routines.
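This mapping can be sketched as follows. The assumed bit layout is the classic scheme: the 4-bit opcode is placed between a leading 0 and two trailing 0s, giving a 7-bit address of the form 0 xxxx 00 and a block of four microinstruction words per instruction:

```python
# Mapping sketch: 4-bit opcode -> 7-bit control memory address of
# the form 0 xxxx 00 (assumed layout; each instruction gets a block
# of four words in the 128-word control memory).
def map_opcode(opcode):
    return (opcode & 0b1111) << 2
```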
Implementation of Mapping in ROM:
Mapping concept can be extended to a more general mapping
rule by using a ROM to specify the mapping function.
In this configuration:
- The bits of the instruction specify the address of a mapping
ROM.
- The contents of the mapping ROM give the bits for the
CAR.

Advantages:
- The microprogram routine executing the instruction can be
placed in any desired location in control memory.
- The mapping concept provides flexibility for adding
instructions for control memory as the need arises.
Implementation of Mapping using PLD:
The mapping function is sometimes implemented by means of
an integrated circuit called a programmable logic device (PLD).
A PLD is similar to ROM, except that it uses AND and OR
gates with internal electronic fuses.
The interconnection between inputs, AND gates, OR gates,
and outputs can be programmed as in ROM.
A mapping function, expressed in terms of Boolean
expressions can be implemented conveniently with a PLD.
Subroutines
Subroutines are programs used by other routines to accomplish
a particular task, and can be called from any point.
The identical sections of Microprograms can be saved as
Subroutines.
Example: The sequence of microoperations for generating the effective address of the operand is common to all memory-reference instructions.
So, this sequence could be a subroutine, called from other routines for effective address computation.
Microprograms using subroutines must have a provision for:
- Storing the return address during a subroutine call.
- Restoring the address during a subroutine return.
This can be accomplished by:
- Placing the incremented output from the CAR into SBR.
- Branching to the beginning of the subroutine.
- SBR providing the return address to the main routine.
The best way is to implement the SBR as a register file organized as a last-in, first-out (LIFO) stack.
Microprogram Example
After the configuration of a computer and establishment of its
microprogrammed control unit, the designer's task is to
generate the microcode for the control memory.
This code generation is called microprogramming and is similar to conventional machine language programming.

A simple digital computer is considered to show the process of microprogramming.

The computer used here is similar but not identical to the basic computer.
Computer Configuration
From the above block diagram it is observed that:
It consists of two memory units:
- Main memory for storing instructions and data.
- Control memory for storing the microprogram.

Four registers are associated with the processor unit:


- Program Counter PC.
- Address Register AR.
- Data Register DR.
- Accumulator register AC.
Two registers are associated with the control unit:
- Control Address Register CAR.
- Subroutine Register SBR.
Multiplexers are used for the transfer of information among
the registers in the processor, instead of a common bus.

- DR can receive information from AC, PC, or memory.


- AR can receive information from PC or DR.
- PC can receive information only from AR.
- The arithmetic, logic, and shift unit performs
microoperations with data from AC and DR and places the
result in AC.
- Memory receives its address from AR.
- Input data written to memory come from DR.
- Data read from memory can go only to DR.

The computer instruction format is:

It consists of three fields:


- 1-bit field for indirect addressing symbolized by I.
- 4-bit operation code (opcode).
- 11-bit address field.

Four of the 16 possible memory-reference instructions:

- ADD instruction adds the content of the operand found in
the effective address to the content of AC.
- BRANCH instruction causes a branch to the effective
address if the operand in AC is negative.
- STORE instruction transfers the content of AC into the
memory word specified by the effective address.
- EXCHANGE instruction swaps the data between AC and
the memory word specified by the effective address.

To keep the microprogramming example simple, only the four instructions considered here are microprogrammed.
The remaining 12 instructions can also be microprogrammed
in a similar manner.
Microinstruction Format
The microinstruction format for the control memory is:

The 20 bits of the microinstruction are divided into four functional parts:
- The three fields F1, F2, and F3 specify microoperations for
the computer.
- The CD field selects status bit conditions.
- The BR field specifies the type of branch to be used.
- The AD field contains a branch address.
- The address field is seven bits wide, since the control
memory has 128 = 2^7 words.
The microoperations are subdivided into three fields of three
bits each.
The three bits in each field are encoded to specify seven
distinct microoperations as:

Symbolic Microinstructions
Microinstructions can be specified using symbols.
A symbolic microprogram can be translated into its binary
equivalent by an assembler, much like a conventional machine-language program.
Each line of the assembly language microprogram defines a
symbolic microinstruction.

Each symbolic microinstruction is divided into five fields:


- Label.
- Microoperations.
- CD.
- BR.
- AD.
The fields specify the following information:
1. The label field may be empty or it may specify a symbolic
address. A label is terminated with a colon (:).
2. The microoperations field consists of one, two, or three
symbols, separated by commas.
3. The CD field has one of the letters U, I, S, or Z.
4. The BR field contains one of the four symbols.
5. The AD field specifies a value for the address field of the
microinstruction in one of three possible ways:
a. With a symbolic address, which must also appear as a label.
b. With the symbol NEXT to designate the next address in
sequence.
c. When the BR field contains a RET or MAP symbol, the AD
field is left empty and is converted to seven zeros by the
assembler.
The pseudoinstruction ORG is used to define the origin, or
first address, of a microprogram routine.
For example, ORG 64 informs the assembler to place the next
microinstruction in control memory at decimal address 64,
which is equivalent to the binary address 1000000.

The Fetch Routine
The control memory has 128 words, and each word contains
20 bits.
So, to microprogram the control memory, it is necessary to
determine the bit values of each of the 128 words.
The first 64 words (addresses 0 to 63) are occupied by the
routines for the 16 instructions; the remaining words can be used for any other purpose.
A convenient starting location for the fetch routine is address
64.

The microinstructions needed for the fetch routine are:


AR ← PC
DR ← M[AR], PC ← PC + 1
AR ← DR(0-10), CAR(2-5) ← DR(11-14), CAR(0,1,6) ← 0
The fetch routine needs three microinstructions, which are
placed in control memory at addresses 64, 65, and 66.
Using the assembly language conventions, the symbolic
microprogram for the fetch routine as follows:

ORG 64
FETCH: PCTAR U JMP NEXT
READ, INCPC U JMP NEXT
DRTAR U MAP
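The three fetch microinstructions can be sketched as a Python function. This is an illustrative model only, assuming the 16-bit instruction layout given earlier (I in bit 15, opcode in bits 11-14, address in bits 0-10):

```python
# Sketch of the fetch routine's three microinstructions.

def fetch(memory, pc):
    ar = pc                      # AR <- PC
    dr = memory[ar]              # DR <- M[AR]
    pc = pc + 1                  # PC <- PC + 1
    ar = dr & 0x7FF              # AR <- DR(0-10)
    opcode = (dr >> 11) & 0xF    # CAR(2-5) <- DR(11-14)
    car = opcode << 2            # CAR(0,1,6) <- 0, giving 0 xxxx 00
    return ar, pc, car
```

The final step is exactly the MAP transformation: the CAR receives the opcode shifted left by two bit positions within a 7-bit address.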

Symbolic Microprogram
The execution of the third (MAP) microinstruction in the fetch
routine results in a branch to address 0xxxx00, where xxxx are
the four bits of the operation code.

Example:
For the ADD instruction, whose opcode is 0000, the MAP
microinstruction transfers the address 0000000 to CAR,
the start address for the ADD routine in control memory.
Similarly, the first addresses for the BRANCH and STORE
routines are 0 0001 00 (decimal 4) and 0 0010 00 (decimal 8),
respectively.
The first addresses for the other 13 routines are 12,
16, 20, …, 60, i.e., four words in control memory are reserved for each
routine.
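The 0 xxxx 00 mapping can be sketched as a simple shift (an illustrative helper, not part of the text):

```python
# Each 4-bit opcode selects a four-word block in control memory:
# the MAP transformation inserts two low zeros and one high zero.

def map_address(opcode):
    return (opcode & 0xF) << 2
```

This reproduces the routine start addresses 0, 4, 8, 12, …, 60 quoted above.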
In each routine, the microinstructions are required for:
- Evaluating the effective address.
- Executing the instruction.

As the indirect address mode is associated with all memory-reference instructions, the microinstructions for indirect
addressing are stored as a subroutine, INDRCT, located after
the fetch routine.

The INDRCT subroutine has two microinstructions:


INDRCT: READ U JMP NEXT
DRTAR U RET
The partial symbolic microprogram with the ADD,
BRANCH, STORE, EXCHANGE, fetch, and INDRCT routines is:
Binary Microprogram
The symbolic microprogram is a convenient form for writing
microprograms as it is easy to read and understand.
However, the microprogram is stored in control memory in binary form,
not symbolic form.
So, the symbolic microprogram must be translated to binary
either by means of an assembler program or by the user.

The equivalent binary form of the above symbolic microprogram is:

Design of Control Unit
The bits of the microinstruction are divided into fields, with
each field defining a distinct, separate function.
The various fields encountered in instruction formats provide:
- Control bits to initiate microoperations in the system.
- Special bits to specify the way that the next address is to be
evaluated.
- An address field for branching.

The number of control bits that initiate microoperations can be
reduced by grouping mutually exclusive variables into fields.
Encoding the k bits in each field then provides 2^k
microoperations.
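Field encoding and decoding can be sketched as follows. The seven F1 symbols listed are assumed from the microoperation symbols this example machine uses (code 000 reserved for "no operation"); the `decode` helper is illustrative:

```python
# Sketch: a 3-bit encoded field drives a 3-to-8 decoder.
# Code 000 selects no operation, leaving seven distinct
# microoperations per field.

F1 = ["ADD", "CLRAC", "INCAC", "DRTAC", "DRTAR", "PCTAR", "WRITE"]

def decode(field_bits, microops):
    if field_bits == 0:
        return None                  # 000: no operation selected
    return microops[field_bits - 1]  # codes 1..7 select a microop
```
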
The disadvantages are:
- Each field requires a decoder to produce the corresponding
control signals.
- So, requires additional hardware external to the control
memory.
- Increases the delay time of the control signals as they must
propagate through the decoding circuits.

The nine bits of the microoperation field are divided into three
subfields of three bits each.
The control memory output of each subfield must be decoded
to provide the distinct microoperations.
The outputs of the decoders are connected to the appropriate
inputs in the processor unit.
Microprogram Sequencer
The basic components of a microprogrammed control unit are:
- Control Memory.
- Circuits selecting the next address.
The address selection part is called a microprogram
sequencer.

A microprogram sequencer can be constructed with digital
functions to suit a particular application or a wide range of applications.

For this, an integrated-circuit sequencer must provide an
internal organization that can be adapted to a wide range of
applications.
The purpose of a microprogram sequencer is to provide an
address to the control memory to read and execute a
microinstruction.
The next-address logic of the sequencer determines the
specific address source to be loaded into the CAR.
The choice of the address source is guided by the next-address
information bits that the sequencer receives from the present
microinstruction.

Commercial sequencers include within the unit an internal
register stack used for temporary storage of addresses during
microprogram looping and subroutine calls.
Some sequencers provide an output register which can
function as the address register for the control memory.
BVRIT HYDERABAD College of Engineering for Women

Central Processing Unit

Introduction:
Intel produced the first 4-bit microprocessor, the 4004, in 1971 and the 8-bit 8008 in 1972; neither
was general purpose. In 1974, Intel launched the first general-purpose 8-bit microprocessor, the 8080,
and later the 8085, with a few more features added to the architecture to make it a functionally
complete microprocessor.
The limitations of 8-bit microprocessors are:
- Low speed
- Low memory addressing capability
- Limited number of general purpose registers
- Less powerful instruction set

General Register Organization


· The Central Processing Unit (CPU), often called the brain of the computer, performs
data-processing operations. Figure 3.1 shows the three major parts of the CPU.

Intermediate data is stored in the register set during the execution of the instructions. The
microoperations required for executing the instructions are performed by the arithmetic
logic unit, while the control unit supervises the transfer of information among the registers
and instructs the ALU which operation to perform. The computer instruction set provides
the specifications for the design of the CPU, which largely involves choosing the hardware
for implementing the machine instructions.
· The need for memory locations arises for storing pointers, counters, return addresses,
temporary results, and partial products. Memory access consumes most of the time of an
operation in a computer, so it is more convenient and more efficient to store these intermediate
values in processor registers.


Faculty: P Kavitha, G Nagamani, R S Murali Nath Subject: COA



· A common bus system is employed to connect the registers, which are included in the CPU
in large numbers. Communication between registers is needed not only for direct data transfers
but also for performing various microoperations. In the bus organization for such CPU registers
shown in Figure 3.2, the registers are connected to two multiplexers (MUX) that form two buses,
A and B. The selection lines of each multiplexer select one register as the input data for the
particular bus.

OPERATION OF CONTROL UNIT:


The control unit directs the information flow through the ALU by:
- selecting the various components in the system
- selecting the function of the ALU

Example: the microoperation R1 ← R2 + R3 requires:

[1] MUX A selector (SELA): BUS A ← R2
[2] MUX B selector (SELB): BUS B ← R3
[3] ALU operation selector (OPR): ALU to ADD
[4] Decoder destination selector (SELD): R1 ← Out Bus
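The four selector fields can be packed into a single control word. The sketch below assumes the common textbook layout of SELA(3) | SELB(3) | SELD(3) | OPR(5) = 14 bits, with illustrative encodings R1 = 001, R2 = 010, R3 = 011, and ADD = 00010; the actual binary codes come from the encoding tables that accompany this section:

```python
# Sketch: pack the four selector fields into one 14-bit control word.

def control_word(sela, selb, seld, opr):
    return (sela << 11) | (selb << 8) | (seld << 5) | opr
```

Under these assumed encodings, the control word for R1 ← R2 + R3 is 010 011 001 00010.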

Control Word

Encoding of register selection fields


Encoding of ALU operations


Stack organization:
A stack is a storage device that stores information in such a manner that the item
stored last is the first item retrieved.
The stack in digital computers is essentially a memory unit with an address register that
can count only. The register that holds the address for the stack is called a stack pointer
(SP) because its value always points at the top item in the stack.
The physical registers of a stack are always available for reading or writing. It is
the content of the word that is inserted or deleted.

Register stack:

Figure 5.1: Block diagram of a 64-word stack

A stack can be placed in a portion of a large memory, or it can be organized as a collection
of a finite number of memory words or registers. The figure shows the organization of a
64-word register stack.
The stack pointer register SP contains a binary number whose value is equal to the address
of the word that is currently on top of the stack. Three items are placed in the stack: A, B,
and C, in that order. Item C is on top of the stack so that the content of SP is now 3.


To remove the top item, the stack is popped by reading the memory word at address 3 and
decrementing the content of SP. Item B is now on top of the stack since SP holds address
2.
To insert a new item, the stack is pushed by incrementing SP and writing a word in the
next-higher location in the stack.
In a 64-word stack, the stack pointer contains 6 bits because 2^6 = 64.
Since SP has only six bits, it cannot hold a number greater than 63 (111111 in binary).
When 63 is incremented by 1, the result is 0, since 111111 + 1 = 1000000 in binary, but
SP can accommodate only the six least significant bits.
Similarly, when 000000 is decremented by 1, the result is 111111. The one-bit register
FULL is set to 1 when the stack is full, and the one-bit register EMTY is set to 1 when the
stack is empty of items.
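The six-bit wraparound amounts to modulo-64 arithmetic, as this small sketch shows (the helper names are illustrative):

```python
# SP arithmetic is modulo 64 because the pointer has only six bits:
# incrementing 63 wraps to 0, and decrementing 0 wraps to 63.

def inc6(sp):
    return (sp + 1) & 0b111111

def dec6(sp):
    return (sp - 1) & 0b111111
```
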
DR is the data register that holds the binary data to be written into or read out of the stack.

PUSH:
If the stack is not full (FULL =0), a new item is inserted with a push operation. The push
operation consists of the following sequences of microoperations:

SP ← SP + 1                      Increment stack pointer
M[SP] ← DR                       Write item on top of the stack
IF (SP = 0) then (FULL ← 1)      Check if stack is full
EMTY ← 0                         Mark the stack not empty

The stack pointer is incremented so that it points to the address of the next-higher word. A
memory write operation inserts the word from DR into the top of the stack.
SP holds the address of the top of the stack and that M[SP] denotes the memory
word specified by the address presently available in SP.
The first item stored in the stack is at address 1. The last item is stored at address 0. If SP
reaches 0, the stack is full of items, so FULL is set to 1. This condition is reached if the
top item prior to the last push was in location 63 and, after incrementing SP, the last item
is stored in location 0.
Once an item is stored in location 0, there are no more empty registers in the stack. If an item
is written in the stack, obviously the stack cannot be empty, so EMTY is cleared to 0.

POP:
A new item is deleted from the stack if the stack is not empty (if EMTY = 0). The pop
operation consists of the following sequences of microoperations:

DR ← M[SP]                       Read item on top of the stack
SP ← SP - 1                      Decrement stack pointer
IF (SP = 0) then (EMTY ← 1)      Check if stack is empty
FULL ← 0                         Mark the stack not full


The top item is read from the stack into DR. The stack pointer is then decremented. If
its value reaches zero, the stack is empty, so EMTY is set to 1.

This condition is reached if the item read was in location 1.


Once this item is read out, SP is decremented and reaches the value 0, which is the initial
value of SP. If a pop operation reads the item from location 0 and SP is then decremented,
SP changes to 111111, which is equivalent to decimal 63.
In this configuration, the word in address 0 receives the last item in the stack. Note also
that an erroneous operation will result if the stack is pushed when FULL = 1 or popped
when EMTY = 1.
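The push and pop sequences, together with the FULL and EMTY flip-flops, can be sketched as a small Python class. This is an illustrative software model of the 64-word register stack, not a hardware description:

```python
# Sketch of the 64-word register stack with FULL/EMTY flags,
# following the push and pop microoperation sequences above.

class RegisterStack:
    def __init__(self):
        self.mem = [0] * 64
        self.sp = 0
        self.full = 0
        self.emty = 1

    def push(self, dr):
        assert self.full == 0, "push on a full stack is erroneous"
        self.sp = (self.sp + 1) & 0x3F   # SP <- SP + 1 (6-bit)
        self.mem[self.sp] = dr           # M[SP] <- DR
        if self.sp == 0:                 # if SP = 0 then FULL <- 1
            self.full = 1
        self.emty = 0                    # EMTY <- 0

    def pop(self):
        assert self.emty == 0, "pop on an empty stack is erroneous"
        dr = self.mem[self.sp]           # DR <- M[SP]
        self.sp = (self.sp - 1) & 0x3F   # SP <- SP - 1 (6-bit)
        if self.sp == 0:                 # if SP = 0 then EMTY <- 1
            self.emty = 1
        self.full = 0                    # FULL <- 0
        return dr
```

Pushing A, B, C leaves SP = 3 with C on top, exactly as in the figure's description; popping everything back returns SP to 0 and sets EMTY.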

Memory Stack.

The implementation of a stack in the CPU is done by assigning a portion of memory to
a stack operation and using a processor register as a stack pointer.
Figure shows a portion of computer memory partitioned into three segments: program,
data, and stack.
The program counter PC points at the address of the next instruction in the program
which is used during the fetch phase to read an instruction.

Figure: Computer memory with program, data, and stack segments

The address registers AR points at an array of data which is used during the execute
phase to read an operand.
The stack pointer SP points at the top of the stack which is used to push or pop items into
or from the stack.
The three registers are connected to a common address bus, and either one can provide an
address for memory.
As shown in Figure 5.2, the initial value of SP is 4001 and the stack grows with
decreasing addresses. Thus the first item stored in the stack is at address 4000, the second
item is stored at address 3999, and the last address that can be used for the stack is 3000.
We assume that the items in the stack communicate with a data register DR.

PUSH
A new item is inserted with the push operation as follows:
SP ← SP - 1
M[SP] ← DR
The stack pointer is decremented so that it points at the address of the next word. A
memory write operation inserts the word from DR into the top of the stack.

POP
A new item is deleted with a pop operation as follows:
DR ← M[SP]
SP ← SP + 1
The top item is read from the stack into DR.
The stack pointer is then incremented to point at the next item in the stack.
The two microoperations needed for either the push or the pop are (1) an access to memory
through SP, and (2) updating SP. Which of the two microoperations is done first, and
whether SP is updated by incrementing or decrementing, depends on the organization of
the stack.
In the figure, the stack grows by decreasing the memory address; a stack may also be
constructed to grow by increasing the memory address. The advantage of a memory stack
is that the CPU can refer to it without having to specify an address, since the address
is always available and automatically updated in the stack pointer.
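The memory-stack behavior of the figure can be sketched in the same style (SP initialized to 4001, stack growing toward lower addresses; the 3000 lower limit is not enforced in this illustrative model):

```python
# Sketch of the memory stack: push decrements SP, pop increments it.

class MemoryStack:
    def __init__(self):
        self.mem = {}
        self.sp = 4001

    def push(self, dr):
        self.sp -= 1             # SP <- SP - 1
        self.mem[self.sp] = dr   # M[SP] <- DR

    def pop(self):
        dr = self.mem[self.sp]   # DR <- M[SP]
        self.sp += 1             # SP <- SP + 1
        return dr
```

The first item pushed lands at address 4000, the second at 3999, matching the figure's description.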

Instruction formats
Instruction fields:
OP-code field - specifies the operation to be performed
Address field - designates a memory address (or addresses) or a processor register (or registers)
Mode field - specifies the way the operand or the effective address is determined.

The number of address fields in the instruction format depends on the internal organization of CPU.
The three most common CPU organizations:

1. Single Accumulator Organization


2. General Register Organization
3. Stack Organization

Based on the number of address fields used, there are four types of instruction formats:


1. Three address instruction
2. Two address instruction
3. One address instruction
4. Zero address instruction

Three and two address instruction formats can be used in General register
organization.


One address instruction format is used in Single Accumulator organization


Zero address instruction format is used in Stack organization

ADDRESSING MODES

Specifies a rule for interpreting or modifying the address field of the instruction (before the
operand is actually referenced)

* A variety of addressing modes exist:
- to give programming flexibility to the user
- to use the bits in the address field of the instruction efficiently

TYPES OF ADDRESSING MODES :

Register Indirect Mode


Instruction specifies a register which contains the memory address of the operand
- Saves instruction bits, since a register address is shorter than a memory address
- Slower to acquire an operand than either register addressing or direct memory addressing
- EA = [IR(R)]   ([x]: content of x)

Auto-increment or Auto-decrement features:
Same as Register Indirect, but:
when the address in the register is used to access memory, the value in the register is
incremented or decremented by 1 (after or before the execution of the instruction)

Direct Address Mode


Instruction specifies the memory address which can be used directly to access the physical memory
- Faster than the other memory addressing modes
- Too many bits are needed to specify the address for a large physical memory space
- EA = IR(address), (IR(address): address field of IR)

Indirect Addressing Mode


The address field of an instruction specifies the address of a memory location that contains
the address of the operand
- When the abbreviated address is used, large physical memory can be addressed with a relatively
small number of bits
- Slow to acquire an operand because of an additional memory access
- EA = M[IR(address)]

Relative Addressing Modes


The address field of an instruction specifies a part of the address (an abbreviated address)
which is used along with a designated register to calculate the address of the operand
PC Relative Addressing Mode(R = PC)
- EA = PC + IR(address)
- Address field of the instruction is short
- Large physical memory can be accessed with a small number of address bits

Indexed Addressing Mode


XR: Index Register
- EA = XR + IR(address)

Base Register Addressing Mode
BAR: Base Address Register
- EA = BAR + IR(address)

ADDRESSING MODES - EXAMPLES


Memory snapshot (two-word instruction at 200-201; PC = 200, R1 = 400, XR = 100;
AC receives the operand):

Address   Memory
200       Load to AC (mode)
201       Address = 500
202       Next instruction
399       450
400       700
500       800
600       900
702       325
800       300

Addressing Mode      Effective Address                 Content of AC
Direct address       500    /* AC ← (500) */           800
Immediate operand    -      /* AC ← 500 */             500
Indirect address     800    /* AC ← ((500)) */         300
Relative address     702    /* AC ← (PC+500) */        325
Indexed address      600    /* AC ← (XR+500) */        900
Register             -      /* AC ← R1 */              400
Register indirect    400    /* AC ← (R1) */            700
Autoincrement        400    /* AC ← (R1)+ */           700
Autodecrement        399    /* AC ← -(R1) */           450
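The effective-address column of the table can be reproduced with a short sketch. The memory snapshot and register values are taken from the example; PC is 202 after fetching the two-word instruction, and the function name is illustrative:

```python
# Effective-address computation for the example snapshot.

MEM = {399: 450, 400: 700, 500: 800, 600: 900, 702: 325, 800: 300}
PC_AFTER_FETCH, R1, XR, ADDR = 202, 400, 100, 500

def effective_address(mode):
    return {
        "direct":            ADDR,                   # EA = IR(address)
        "indirect":          MEM[ADDR],              # EA = M[IR(address)]
        "relative":          PC_AFTER_FETCH + ADDR,  # EA = PC + IR(address)
        "indexed":           XR + ADDR,              # EA = XR + IR(address)
        "register_indirect": R1,                     # EA = [IR(R)]
    }[mode]
```

Indexing the memory snapshot with each effective address yields exactly the "Content of AC" column of the table.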


Data Transfer Instructions


Data transfer instructions move data from one place in the computer to another
without changing the data content.
The most common transfers are between memory and processor registers, between
processor registers and input or output, and between the processor registers themselves.
The load instruction has been used mostly to designate a transfer from memory to
a processor register, usually an accumulator.
The store instruction designates a transfer from a processor register into memory.
The move instruction has been used in computers with multiple CPU registers to designate
a transfer from one register to another. It has also been used for data transfers between
CPU registers and memory or between two memory words.
The exchange instruction swaps information between two registers or a register and a
memory word.
The input and output instructions transfer data among processor registers and input
or output terminals.
The push and pop instructions transfer data between processor registers and a
memory stack.

Data Manipulation Instructions


Three basic types:
- Arithmetic instructions
- Logical and bit manipulation instructions
- Shift instructions


Arithmetic Instructions

Name Mnemonic
Increment INC
Decrement DEC
Add ADD
Subtract SUB
Multiply MUL
Divide DIV
Add with Carry ADDC
Subtract with Borrow SUBB
Negate(2’s Complement) NEG

Logical and Bit Manipulation Instructions

Name                 Mnemonic
Clear                CLR
Complement           COM
AND                  AND
OR                   OR
Exclusive-OR         XOR
Clear carry          CLRC
Set carry            SETC
Complement carry     COMC
Enable interrupt     EI
Disable interrupt    DI

Shift Instructions

Name                       Mnemonic
Logical shift right        SHR
Logical shift left         SHL
Arithmetic shift right     SHRA
Arithmetic shift left      SHLA
Rotate right               ROR
Rotate left                ROL
Rotate right thru carry    RORC
Rotate left thru carry     ROLC

Program Control Instructions

It is sometimes convenient to supplement the ALU circuit in the CPU with a status
register where status bit conditions be stored for further analysis. Status bits are also
called condition-code bits or flag bits.
Figure 5.3 shows the block diagram of an 8-bit ALU with a 4-bit status register. The four
status bits are symbolized by C, S, Z, and V. The bits are set or cleared as a result of an
operation performed in the ALU.

Figure 5.3: Status Register Bits

1. Bit C (carry) is set to 1 if the end carry C8 is 1. It is cleared to 0 if the carry is 0.


2. Bit S (sign) is set to 1 if the highest-order bit F7 is 1. It is cleared to 0 if
the bit is 0.

3. Bit Z (zero) is set to 1 if the output of the ALU contains all 0's; it is cleared to 0
otherwise. In other words, Z = 1 if the output is zero and Z = 0 if the output is not
zero.
4. Bit V (overflow) is set to 1 if the exclusive-OR of the last two carries is equal to
1, and cleared to 0 otherwise. This is the condition for an overflow when negative
numbers are in 2's complement. For the 8-bit ALU, V = 1 if the output is greater
than +127 or less than -128.
The status bits can be checked after an ALU operation to determine certain
relationships that exist between the values of A and B.
If bit V is set after the addition of two signed numbers, it indicates an overflow condition.
If Z is set after an exclusive-OR operation, it indicates that A = B.
A single bit in A can be checked to determine if it is 0 or 1 by masking all bits except
the bit in question and then checking the Z status bit.
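The four status bits for an 8-bit addition can be sketched as follows. This is an illustrative model, not a hardware description; V is computed as the exclusive-OR of the carries into and out of bit 7, matching the definition above:

```python
# Sketch: compute C, S, Z, V for the 8-bit addition A + B.

def add8_flags(a, b):
    c7 = ((a & 0x7F) + (b & 0x7F)) >> 7   # carry into bit 7
    total = a + b
    c8 = total >> 8                       # carry out of bit 7 (end carry)
    f = total & 0xFF                      # 8-bit result
    c = c8                                # C: end carry
    s = (f >> 7) & 1                      # S: highest-order bit F7
    z = 1 if f == 0 else 0                # Z: output all zeros
    v = c7 ^ c8                           # V: XOR of the last two carries
    return f, c, s, z, v
```

For example, adding 100 + 100 sets V (the signed result 200 exceeds +127), while adding 255 + 1 (that is, -1 + 1 in 2's complement) sets C and Z with no overflow.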

Program Control Instructions

CMP and TST instructions do not retain the results of their operations (subtraction and AND,
respectively). They only set or clear certain flags.

Conditional Branch Instructions


Mnemonic Branch condition Tested condition
BZ Branch if zero Z=1
BNZ Branch if not zero Z=0
BC Branch if carry C=1
BNC Branch if no carry C=0
BP Branch if plus S=0
BM Branch if minus S=1
BV Branch if overflow V=1
BNV Branch if no overflow V=0
Unsigned compare conditions (A - B)
BHI Branch if higher A > B
BHE Branch if higher or equal A ≥ B
BLO Branch if lower A < B
BLOE Branch if lower or equal A ≤ B
BE Branch if equal A = B
BNE Branch if not equal A ≠ B
Signed compare conditions (A - B)
BGT Branch if greater than A > B
BGE Branch if greater or equal A ≥ B
BLT Branch if less than A < B
BLE Branch if less or equal A ≤ B
BE Branch if equal A = B
BNE Branch if not equal A ≠ B


Subroutine Call and Return

SUBROUTINE CALL - names used for this instruction include:
- Call subroutine
- Jump to subroutine
- Branch to subroutine
- Branch and save return address

Two most important operations are implied:

* Branch to the beginning of the Subroutine


- Same as the Branch or Conditional Branch

* Save the return address to get the address of the location in the calling program upon exit from
the subroutine
- Locations for storing the return address:
• A fixed location in the subroutine (memory)
• A fixed location in memory
• In a processor register
• In a memory stack - the most efficient way

PROGRAM INTERRUPT:
The concept of program interrupt is used to handle a variety of problems that arise out
of normal program sequence.
Program interrupt refers to the transfer of program control from a currently running program
to another service program as a result of an external or internal generated request. Control
returns to the original program after the service program is executed.
After a program has been interrupted and the service routine has been executed, the CPU
must return to exactly the same state that it was in when the interrupt occurred.
Only if this happens will the interrupted program be able to resume exactly as if nothing had
happened.
The state of the CPU at the end of the execute cycle (when the interrupt is recognized)
is determined from:

1. The content of the program counter


2. The content of all processor registers
3. The content of certain status conditions

The interrupt facility allows the running program to proceed until the input or output device
sets its ready flag. Whenever a flag is set to 1, the computer completes the execution of the
instruction in progress and then acknowledges the interrupt.
The result of this action is that the return address is stored in location 0. The instruction in
location 1 is then performed; this initiates a service routine for the input or output transfer.
The service routine can be stored beginning at location 1.
The service routine must have instructions to perform the following tasks:

1. Save contents of processor registers.


2. Check which flag is set.
3. Service the device whose flag is set.

4. Restore contents of processor registers.


5. Turn the interrupt facility on.
6. Return to the running program.

Types of interrupts.:
There are three major types of interrupts that cause a break in the normal execution of a program.
They can be classified as:
1. External interrupts
2. Internal interrupts
3. Software interrupts

1) External interrupts:
External interrupts come from input-output (I/O) devices, from a timing device, from a
circuit monitoring the power supply, or from any other external source.
Examples of causes of external interrupts are an I/O device requesting transfer of data, an I/O
device finishing a transfer of data, elapsed time of an event, or power failure. A timeout interrupt
may result from a program that is in an endless loop and has thus exceeded its time allocation.
Power failure interrupt may have as its service routine a program that transfers the complete
state of the CPU into a nondestructive memory in the few milliseconds before power ceases.
External interrupts are asynchronous. External interrupts depend on external conditions
that are independent of the program being executed at the time.

2) Internal interrupts:
Internal interrupts arise from illegal or erroneous use of an instruction or data.
Internal interrupts are also called traps.
Examples of interrupts caused by internal error conditions are register overflow, attempt to
divide by zero, an invalid operation code, stack overflow, and protection violation. These
error conditions usually occur as a result of a premature termination of the instruction
execution. The service program that processes the internal interrupt determines the corrective
measure to be taken.
Internal interrupts are synchronous with the program. If the program is rerun, the
internal interrupts will occur in the same place each time.

3) Software interrupts:
A software interrupt is a special call instruction that behaves like an interrupt rather than a
subroutine call. It can be used by the programmer to initiate an interrupt procedure at any
desired point in the program.
The most common use of software interrupt is associated with a supervisor call instruction.
This instruction provides means for switching from a CPU user mode to the supervisor mode.
Certain operations in the computer may be assigned to the supervisor mode only, as for
example, a complex input or output transfer procedure. A program written by a user must run
in the user mode.
When an input or output transfer is required, the supervisor mode is requested by means of a
supervisor call instruction. This instruction causes a software interrupt that stores the old CPU
state and brings in a new PSW that belongs to the supervisor mode.
The calling program must pass information to the operating system in order to specify
the particular task requested.

Faculty: P Kavitha, G Nagamani, R S Murali Nath Subject: COA


UNIT III

Computer Arithmetic
Ms. Nagamani G
Ms. Kavitha P
Mr. Murali Nath R S

Addition & Subtraction
The Addition and Subtraction operations can be
performed on the data represented as:

- Signed-Magnitude

- Signed-2’s Complement

- Floating-Point

- BCD

- Decimal
Signed-Magnitude

Addition: A + B (A: Augend, B: Addend)

Subtraction: A - B (A: Minuend, B: Subtrahend)
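The standard signed-magnitude rule (add the magnitudes when the signs agree; otherwise subtract the smaller magnitude from the larger and take the sign of the larger) can be sketched in software. This is an illustrative model only, not the register-level hardware of the next slides; the function name and the (sign, magnitude) representation are assumptions made for the sketch:

```python
def sm_addsub(As, A, Bs, B, subtract=False, bits=8):
    """Signed-magnitude add/subtract on (sign, magnitude) pairs.
    Signs: 0 = plus, 1 = minus. Illustrative sketch only."""
    if subtract:
        Bs ^= 1                      # A - B is performed as A + (-B)
    mask = (1 << bits) - 1
    if As == Bs:                     # same signs: add the magnitudes
        return As, (A + B) & mask    # carry beyond 'bits' is dropped
    # different signs: subtract the smaller magnitude from the larger
    if A >= B:
        return As, A - B
    return Bs, B - A

# (+7) + (-3) = +4 ; (+3) - (+7) = -4
print(sm_addsub(0, 7, 1, 3))                 # (0, 4)
print(sm_addsub(0, 3, 0, 7, subtract=True))  # (1, 4)
```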

Hardware Implementation

Flowchart for Add and Subtract operations

Signed-2’s Complement

Floating-Point

Flowchart

BCD

Multiplication algorithms

Manual Process
This process involves shift and add operations.

The sign of the product is positive if both operands have the same sign;
otherwise it is negative.
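The manual shift-and-add process can be sketched as follows. This is a behavioural model for unsigned magnitudes, not the register-level hardware of the next slide:

```python
def shift_add_multiply(multiplicand, multiplier, bits=8):
    """Unsigned shift-and-add multiplication: examine the multiplier
    bits from least to most significant, and for each 1 bit add the
    correspondingly shifted multiplicand to the product."""
    product = 0
    for i in range(bits):
        if (multiplier >> i) & 1:          # this multiplier bit is 1,
            product += multiplicand << i   # so add the shifted multiplicand
    return product

print(shift_add_multiply(23, 19))  # 437
```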
Hardware Implementation

Flowchart

Example

Booth Algorithm
Signed 2’s Complement Numbers
A string of 0s in the multiplier requires no addition, only shifting.
A string of 1s in the multiplier from bit weight 2^k down to bit weight 2^m can be
treated as 2^(k+1) - 2^m.

The multiplication process is:

1. The multiplicand is subtracted from the partial product upon
encountering the first least significant 1 in a string of 1's in the
multiplier (bit pair 10).
2. The multiplicand is added to the partial product upon encountering
the first 0 (the previous bit being 1) in a string of 0's in the multiplier
(bit pair 01).
3. The partial product does not change when the multiplier bit is
identical to the previous multiplier bit (00 or 11).
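The three rules can be sketched in software. This is a behavioural model of Booth recoding, assuming an n-bit two's-complement multiplier; the hardware version of the following slides instead works with registers and arithmetic shifts:

```python
def booth_multiply(m, q, n=8):
    """Booth's algorithm: recode the n-bit two's-complement multiplier q
    and add/subtract shifted copies of the multiplicand m accordingly."""
    q &= (1 << n) - 1                  # two's-complement bit pattern of q
    acc, q_prev = 0, 0                 # partial product and the Qn+1 bit
    for i in range(n):
        bit = (q >> i) & 1
        if bit == 1 and q_prev == 0:   # 10: first 1 of a string of 1s
            acc -= m << i              #     subtract shifted multiplicand
        elif bit == 0 and q_prev == 1: # 01: first 0 after a string of 1s
            acc += m << i              #     add shifted multiplicand
        q_prev = bit                   # 00 or 11: partial product unchanged
    # A string of 1s that reaches the sign bit is never closed by a final
    # addition; its recoded value is -2^m, which is exactly the string's
    # two's-complement value, so no end-of-loop correction is needed.
    return acc

print(booth_multiply(-9, 13))   # -117
print(booth_multiply(7, -5))    # -35
```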
Hardware Implementation

Flowchart

Example

Array Multiplier

Floating-Point Numbers
The multiplication of two floating-point numbers
requires:
- Multiplication of the mantissas &
- Addition of the exponents

The multiplication algorithm can be subdivided into four parts:
1. Check for zeros
2. Add the exponents
3. Multiply the mantissas
4. Normalize the product
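The four parts can be illustrated on a toy floating-point format. This sketch assumes a normalized fraction mantissa with 0.5 <= |m| < 1 and an unbiased base-2 exponent; real formats add exponent bias handling, rounding, and overflow/underflow checks:

```python
def fp_multiply(a, b):
    """Multiply two (mantissa, exponent) pairs following the four steps.
    Mantissas are fractions with 0.5 <= |mantissa| < 1 (or exactly 0)."""
    (ma, ea), (mb, eb) = a, b
    if ma == 0 or mb == 0:            # 1. check for zeros
        return (0.0, 0)
    e = ea + eb                       # 2. add the exponents
    m = ma * mb                       # 3. multiply the mantissas
    while abs(m) < 0.5:               # 4. normalize the product
        m *= 2
        e -= 1
    return (m, e)

# 0.5 * 2^3 times 0.5 * 2^2 -> 0.25 * 2^5 -> normalized to 0.5 * 2^4
print(fp_multiply((0.5, 3), (0.5, 2)))  # (0.5, 4)
```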

Division algorithms

Input-Output Organization 1

UNIT IV INPUT-OUTPUT ORGANIZATION

• Peripheral Devices

• Input-Output Interface

• Asynchronous Data Transfer

• Modes of Transfer

• Priority Interrupt

• Direct Memory Access

Computer Organization Computer Architectures Lab

PERIPHERAL DEVICES

Input Devices:
• Keyboard
• Optical input devices
  - Card Reader
  - Paper Tape Reader
  - Bar code reader
  - Digitizer
  - Optical Mark Reader
• Magnetic Input Devices
  - Magnetic Stripe Reader
• Screen Input Devices
  - Touch Screen
  - Light Pen
  - Mouse
• Analog Input Devices

Output Devices:
• Card Puncher, Paper Tape Puncher
• CRT
• Printer (Impact, Ink Jet, Laser, Dot Matrix)
• Plotter
• Analog
• Voice


INPUT/OUTPUT INTERFACES
* Provides a method for transferring information between internal storage
(such as memory and CPU registers) and external I/O devices

* Resolves the differences between the computer and peripheral devices

- Nature of the devices
  Peripherals - Electromechanical devices
  CPU or Memory - Electronic devices

- Data transfer rate
  Peripherals - Usually slower
  CPU or Memory - Usually faster than peripherals
  Some kind of synchronization mechanism may be needed

- Unit of information
  Peripherals - Byte
  CPU or Memory - Word

- Operating modes
  Peripherals - Autonomous, asynchronous
  CPU or Memory - Synchronous

I/O BUS AND INTERFACE MODULES


[Figure: I/O bus — data, address, and control lines from the processor to the interface modules of the peripherals (keyboard and display terminal, printer, magnetic disk, magnetic tape)]
Each peripheral has an interface module associated with it

Interface
- Decodes the device address (device code)
- Decodes the commands (operation)
- Provides signals for the peripheral controller
- Synchronizes the data flow and supervises
the transfer rate between peripheral and CPU or Memory
Typical I/O instruction format: Op. code | Device address | Function code (Command)

CONNECTION OF I/O BUS

[Figure: Connection of the I/O bus to the CPU — the I/O instruction's op. code, device address, and function code, together with the accumulator register and the computer's I/O control, drive the data lines, function code lines, sense lines, and device address lines of the I/O bus]


I/O BUS AND MEMORY BUS


Functions of Buses

* MEMORY BUS is for information transfers between the CPU and main memory (MM)

* I/O BUS is for information transfers between the CPU
and I/O devices through their I/O interfaces

Physical Organizations
* Many computers use a common single bus system
for both memory and I/O interface units
- Use one common bus but separate control lines for each function
- Use one common bus with common control lines for both functions
* Some computer systems use two separate buses,
one to communicate with memory and the other with I/O interfaces
I/O Bus
- Communication between CPU and all interface units is via a common
I/O Bus
- An interface connected to a peripheral device may have a number of
data registers, a control register, and a status register
- A command is passed to the peripheral by sending it
to the appropriate interface register
- Function code and sense lines are not needed (transfer of data, control,
and status information is always via the common I/O Bus)

ISOLATED vs MEMORY MAPPED I/O


Isolated I/O
- Separate I/O read/write control lines in addition to memory read/write
control lines
- Separate (isolated) memory and I/O address spaces
- Distinct input and output instructions

Memory-mapped I/O
- A single set of read/write control lines
(no distinction between memory and I/O transfer)
- Memory and I/O addresses share the common address space
-> reduces memory address range available
- No specific input or output instruction
-> The same memory reference instructions can
be used for I/O transfers
- Considerable flexibility in handling I/O operations
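Memory-mapped I/O can be modelled as an address decoder that routes one part of the address space to device registers and the rest to ordinary memory. The base address and register file below are made-up illustrative values:

```python
class MemoryMappedBus:
    """Toy address decoder: addresses >= IO_BASE hit device registers,
    everything below is ordinary memory (illustrative values only)."""
    IO_BASE = 0xFF00

    def __init__(self):
        self.memory = bytearray(self.IO_BASE)
        self.io_regs = {}                   # device-register file

    def write(self, addr, value):
        if addr >= self.IO_BASE:            # decoded as an I/O access
            self.io_regs[addr] = value
        else:                               # ordinary memory access
            self.memory[addr] = value

    def read(self, addr):
        if addr >= self.IO_BASE:
            return self.io_regs.get(addr, 0)
        return self.memory[addr]

bus = MemoryMappedBus()
bus.write(0x0010, 0x42)            # the same store operation lands in
bus.write(0xFF01, 0x42)            # memory or a device register after decoding
print(hex(bus.read(0xFF01)))       # 0x42, read back from the device register
```

The same memory-reference instructions serve both destinations; only the decoded address decides whether memory or an interface register responds, which is exactly the flexibility (and the reduced address range) noted above.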


ASYNCHRONOUS DATA TRANSFER

Synchronous and Asynchronous Operations


Synchronous - All devices derive their timing information from a common clock line
Asynchronous - No common clock

Asynchronous Data Transfer


Asynchronous data transfer between two independent units requires that
control signals be transmitted between the communicating units to
indicate the time at which data is being transmitted

Two Asynchronous Data Transfer Methods


Strobe pulse
- A strobe pulse is supplied by one unit to indicate
to the other unit when the transfer has to occur

Handshaking
- A control signal is accompanied with each data
being transmitted to indicate the presence of data
- The receiving unit responds with another control
signal to acknowledge receipt of the data

STROBE CONTROL
* Employs a single control line to time each transfer
* The strobe may be activated by either the source or the destination unit

Source-Initiated Strobe for Data Transfer and Destination-Initiated Strobe for Data Transfer:

[Figure: Block diagrams and timing diagrams — in each case a data bus connects the source and destination units, with a single strobe line timing the transfer; in the timing diagram the strobe pulse brackets the interval during which valid data is on the bus]

HANDSHAKING

Strobe Methods

Source-Initiated
The source unit that initiates the transfer has no way of knowing
whether the destination unit has actually received the data.

Destination-Initiated
The destination unit that initiates the transfer has no way of knowing
whether the source has actually placed the data on the bus.

To solve this problem, the HANDSHAKE method introduces a second
control signal to provide a Reply to the unit that initiates the transfer.


SOURCE-INITIATED TRANSFER USING HANDSHAKE


[Figure: Source-initiated transfer using handshaking — a data bus plus a 'data valid' line (source to destination) and a 'data accepted' line (destination to source), with the timing diagram]

Sequence of events:
1. Source unit: places data on the bus and enables 'data valid'.
2. Destination unit: accepts data from the bus and enables 'data accepted'.
3. Source unit: disables 'data valid' and invalidates the data on the bus.
4. Destination unit: disables 'data accepted' and is ready to accept data (initial state).
* Allows arbitrary delays from one state to the next
* Permits each unit to respond at its own data transfer rate
* The rate of transfer is determined by the slower unit
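The four-step sequence can be simulated in software; a minimal sketch in which the 'data valid' and 'data accepted' lines gate each step (the dictionary-as-bus model is an illustration only):

```python
def source_initiated_handshake(items):
    """Simulate source-initiated handshaking: the 'data valid' and
    'data accepted' control lines gate each step of every transfer."""
    bus = {"data": None, "data_valid": 0, "data_accepted": 0}
    received = []
    for item in items:
        bus["data"] = item          # 1. source places data on the bus
        bus["data_valid"] = 1       #    and enables 'data valid'
        if bus["data_valid"]:       # 2. destination accepts the data
            received.append(bus["data"])
            bus["data_accepted"] = 1
        if bus["data_accepted"]:    # 3. source invalidates the data
            bus["data_valid"] = 0
            bus["data"] = None
        if not bus["data_valid"]:   # 4. destination returns to initial state
            bus["data_accepted"] = 0
    return received

print(source_initiated_handshake([10, 20, 30]))  # [10, 20, 30]
```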

DESTINATION-INITIATED TRANSFER USING HANDSHAKE


[Figure: Destination-initiated transfer using handshaking — a data bus plus a 'ready for data' line (destination to source) and a 'data valid' line (source to destination), with the timing diagram]

Sequence of events:
1. Destination unit: ready to accept data; enables 'ready for data'.
2. Source unit: places data on the bus and enables 'data valid'.
3. Destination unit: accepts data from the bus and disables 'ready for data'.
4. Source unit: disables 'data valid' and invalidates the data on the bus (initial state).

* Handshaking provides a high degree of flexibility and reliability because the
successful completion of a data transfer relies on active participation by both units
* If one unit is faulty, data transfer will not be completed
-> Can be detected by means of a timeout mechanism

ASYNCHRONOUS SERIAL TRANSFER

Asynchronous Serial Transfer

- Employs special bits inserted at both ends of the character code
- Each character consists of three parts: a Start bit, the Data bits, and Stop bits.

[Figure: A transmitted character — start bit (1 bit), character bits (e.g., 1 1 0 0 0 1 0 1), stop bits (at least 1 bit)]

A character can be detected by the receiver from the knowledge of four rules:
- When data are not being sent, the line is kept in the 1-state (idle state)
- The initiation of a character transmission is detected
by a Start Bit, which is always a 0
- The character bits always follow the Start Bit
- After the last character, a Stop Bit is detected when
the line returns to the 1-state for at least 1 bit time

The receiver knows in advance the transfer rate of the
bits and the number of information bits to expect.
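The framing rules can be sketched as a transmitter that wraps each character in start and stop bits and a receiver that applies the four rules. The sketch assumes 8 data bits sent LSB-first and a single stop bit, details the slide does not fix:

```python
def frame(ch, data_bits=8):
    """Start bit (0) + data bits (LSB first) + one stop bit (1)."""
    bits = [0]                                   # start bit, always 0
    bits += [(ch >> i) & 1 for i in range(data_bits)]
    bits.append(1)                               # stop bit, always 1
    return bits

def deframe(line, data_bits=8):
    """Scan an idle-1 line, detect start bits, and extract characters."""
    chars, i = [], 0
    while i < len(line):
        if line[i] == 0:                         # start bit found
            data = line[i + 1:i + 1 + data_bits]
            ch = sum(b << k for k, b in enumerate(data))
            assert line[i + 1 + data_bits] == 1  # stop bit must be 1
            chars.append(ch)
            i += data_bits + 2                   # skip past this frame
        else:
            i += 1                               # idle (1-state)
    return chars

line = [1, 1] + frame(0x45) + [1] + frame(0x4B) + [1, 1, 1]
print([hex(c) for c in deframe(line)])  # ['0x45', '0x4b']
```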

Modes of Transfer

• The binary information that is received from an external
device is usually stored in the memory unit.
• The information that is transferred from the CPU to the
external device originates from the memory unit.
• The CPU merely processes the information; the source
and target is always the memory unit.
• Data transfer between the CPU and the I/O devices may be
done in 3 different modes:
• Programmed I/O
• Interrupt-initiated I/O
• DMA (Direct Memory Access)


Programmed I/O
• It is the result of the I/O instructions that are
written in the computer program.
• Each data-item transfer is initiated by an instruction in the
program.
• Usually the transfer is between a CPU register and memory.
In this mode it requires constant monitoring of the
peripheral devices by the CPU.
• In this mode the I/O device does not have direct access to the
memory unit.
• A transfer from an I/O device to memory requires the
execution of several instructions by the CPU, including
an input instruction to transfer the data from the device to the
CPU and a store instruction to transfer the data from the CPU
to memory.

• In programmed I/O, the CPU stays in a program loop
until the I/O unit indicates that it is ready for data transfer.
• This is a time-consuming process since it needlessly
keeps the CPU busy. This situation can be avoided by
using an interrupt facility.


Interrupt-Initiated I/O

• In programmed I/O we saw that the CPU is kept busy
unnecessarily.
• This situation can very well be avoided by using an
interrupt-driven method for data transfer.
• The interrupt facility and special commands are used to
inform the interface to issue an interrupt request signal
whenever data is available from any device.
• In the meantime the CPU can proceed with the execution of
another program. The interface meanwhile keeps
monitoring the device. Whenever it determines that the
device is ready for data transfer, it initiates an interrupt
request signal to the computer.


• Upon detection of an external interrupt signal, the CPU
momentarily stops the task it was performing,
branches to the service program to process
the I/O transfer, and then returns to the task it was
originally performing.
• Note: Both programmed I/O and interrupt-driven
I/O require the active intervention of the
processor to transfer data between memory and the I/O
module, and any data transfer must traverse
a path through the processor.


• Thus both these forms of I/O suffer from two inherent
drawbacks:
– The I/O transfer rate is limited by the speed with which the processor
can test and service a device.
– The processor is tied up in managing an I/O transfer; a number of
instructions must be executed for each I/O transfer.

To avoid processor intervention in data transfer
between I/O and memory, we use the most predominantly
used technique, called DMA (Direct Memory Access).


Direct Memory Access


• The data transfer between a fast storage medium such as
a magnetic disk and the memory unit is limited by the speed of
the CPU.
• Thus we can allow the peripherals to communicate directly
with the memory using the memory buses, removing the
intervention of the CPU.
• This type of data transfer technique is known as DMA, or
direct memory access.
• During DMA the CPU is idle and has no control over
the memory buses.
• The DMA controller takes over the buses to manage the
transfer directly between the I/O devices and the memory
unit.

[Figure: CPU bus signals for DMA transfer — the bus request (BR) and bus grant (BG) lines between the DMA controller and the CPU]

• Bus Request: used by the DMA controller to
request that the CPU relinquish control of the buses.
• Bus Grant: activated by the CPU to inform the
external DMA controller that the buses are in the high-impedance
state and the requesting DMA controller can take
control of the buses. Once the DMA controller has taken control
of the buses, it transfers the data. This transfer can take
place in two ways:
• Burst transfer
• Cycle stealing


What is a DMA Controller?


• The term DMA stands for direct memory access. The
hardware device used for direct memory access is
called the DMA controller.
• The DMA controller is a control unit, part of the I/O
device's interface circuit, which can transfer blocks
of data between I/O devices and main memory with
minimal intervention from the processor.

• The DMA controller provides an interface between the
bus and the input-output devices. Although it
transfers data without the intervention of the processor, it is
controlled by the processor.


• The processor initializes the DMA controller by
sending the starting address, the number of words in the
data block, and the direction of transfer of data, i.e., from
I/O devices to the memory or from main memory to
I/O devices. More than one external device can be
connected to the DMA controller.


DMA in Computer Architecture

[Figure: DMA in computer architecture]

• The DMA controller contains an address unit for
generating addresses and selecting the I/O device for
transfer. It also contains a control unit and a data
count register for keeping count of the number of blocks
transferred and indicating the direction of transfer of
data. When the transfer is completed, the DMA controller informs
the processor by raising an interrupt. The typical
block diagram of the DMA controller is shown in the
figure.


[Figure: Typical block diagram of the DMA controller — address unit, control unit, and data count register between the processor/memory bus and the I/O devices]

DMA Transfer

[Figure: DMA transfer in a computer system — CPU, DMA controller, memory, and I/O device connected by the address, data, and control buses]

• The DMA controller has to share the bus with the
processor to make the data transfer.
• The device that holds the bus at a given time is
called the bus master.
• When a transfer from an I/O device to the memory or
vice versa has to be made, the processor stops the
execution of the current program, increments the
program counter, moves data onto the stack, and then sends
a DMA select signal to the DMA controller over the
address bus.


• If the DMA controller is free, it requests control of the
bus from the processor by raising the bus request
signal.
• The processor grants the bus to the controller by raising
the bus grant signal; now the DMA controller is the bus
master.
• The processor initializes the DMA controller by
sending the memory addresses, the number of blocks of
data to be transferred, and the direction of data transfer.
• After assigning the data transfer task to the DMA
controller, instead of waiting idly till completion of the
data transfer, the processor resumes the execution
of the program after retrieving instructions from the
stack.


• The DMA controller now has full control of the buses and
can interact directly with memory and I/O devices
independent of the CPU.
• It makes the data transfer according to the control
instructions received from the processor.
• After completion of the data transfer, it disables the bus
request signal and the CPU disables the bus grant signal,
thereby moving control of the buses back to the CPU.


• When an I/O device wants to initiate a transfer, it
sends a DMA request signal to the DMA controller,
which acknowledges the request if it is free.
• The controller then requests the bus from the processor by
raising the bus request signal. After receiving
the bus grant signal, it transfers the data from the
device. For an n-channel DMA controller, n
external devices can be connected.


• The DMA controller transfers the data in three modes:
• a) Burst Mode: In this mode the DMA controller hands the buses
back to the CPU only after completion of the whole data transfer.
Meanwhile, if the CPU requires the bus, it has to stay
idle and wait for the data transfer to finish.
• b) Cycle Stealing Mode: In this mode, the DMA controller gives
control of the buses back to the CPU after the transfer of every byte. It
continuously issues a request for bus control, makes
the transfer of one byte, and returns the bus. By this the
CPU doesn't have to wait a long time if it needs the
bus for a higher-priority task.
• c) Transparent Mode: Here, the DMA controller transfers data only
when the CPU is executing an instruction which does
not require the use of the buses.

Interrupts

• Data transfer between the CPU and the peripherals is
initiated by the CPU. But the CPU cannot start the
transfer unless the peripheral is ready to communicate
with the CPU.
• When a device is ready to communicate with the CPU, it
generates an interrupt signal. A number of input-output
devices are attached to the computer, and each device is
able to generate an interrupt request.
• The main job of the interrupt system is to identify the
source of the interrupt. There is also the possibility that
several devices will request CPU communication
simultaneously; then the interrupt system has to decide
which device is to be serviced first.

Priority Interrupt
• A priority interrupt is a system that decides the order in
which various devices that generate interrupt
signals at the same time will be serviced by the CPU.
• The system also has the authority to decide which conditions are
allowed to interrupt the CPU while some other interrupt
is being serviced.
• Generally, devices with high-speed transfer such
as magnetic disks are given high priority and slow
devices such as keyboards are given low priority.
• When two or more devices interrupt the computer
simultaneously, the computer services the device with
the higher priority first.


• Types of Interrupts:
• Hardware Interrupts
• When the signal for the processor comes from an external
device or hardware, the interrupt is known
as a hardware interrupt.
Example: when we press any key on the keyboard to perform
some action, the key press generates
an interrupt signal for the processor to perform a certain
action. Such an interrupt can be of two types:
• Maskable Interrupt: A hardware interrupt which can
be delayed when a higher-priority interrupt has
occurred at the same time.


• Non-Maskable Interrupt: A hardware interrupt which cannot be delayed and
should be processed by the processor immediately.
• Software Interrupts
• An interrupt that is caused by any internal system of the
computer is known as a software interrupt. It
can also be of two types:
• Normal Interrupt: Interrupts that are caused by
software instructions are called normal software
interrupts.
• Exception: Unplanned interrupts which are produced
during the execution of a program are
called exceptions, such as division by zero.


• Establishing priority for devices can be done in 2
ways: software and hardware.
• Polling is a software procedure in which there is a common
branch address for all interrupts.
• The program that takes care of interrupts goes to the
branch address and starts polling the interrupt
sources in sequence.
• The highest-priority device is checked first, then the lower-
priority devices.
• The disadvantage here is that if many interrupts are raised,
the time required to poll them can exceed the time required
to service the I/O device.
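Software polling can be sketched as a loop over interrupt flags in priority order; the device names and the flag dictionary here are hypothetical:

```python
def poll(flags, priority_order):
    """Return the highest-priority device whose interrupt flag is set,
    checking the interrupt sources one by one in sequence."""
    for device in priority_order:     # highest-priority source checked first
        if flags.get(device):
            return device             # branch to this device's service routine
    return None                       # no pending interrupt

flags = {"disk": False, "printer": True, "keyboard": True}
order = ["disk", "printer", "keyboard"]
print(poll(flags, order))  # printer — the first set flag in priority order
```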

• So we can use a hardware priority interrupt function.
• This accepts interrupts from many devices, determines
which has the highest priority, and serves that
one first.
• To speed up the operation, each interrupt source has
its own vector address, which allows the CPU to access the
source's service routine directly. Thus no polling is
required.

• The hardware priority function can be established by
either serial or parallel connection of the interrupt lines.

• Daisy-Chaining Priority
• The daisy-chaining method of establishing priority
consists of a serial connection of all devices that request
an interrupt.
• The device with the highest priority is placed in the first
position, followed by lower-priority devices, up to the
device with the lowest priority, which is placed last in the
chain.
• This method of connection between three devices and
the CPU is shown in the figure. The interrupt request line is
common to all devices and forms a wired-logic
connection.


• If any device has its interrupt signal in the low-level
state, the interrupt line goes to the low-level state and
enables the interrupt input in the CPU.
• When no interrupts are pending, the interrupt line stays
in the high-level state and no interrupts are recognized
by the CPU.
• The CPU responds to an interrupt request by enabling
the interrupt acknowledge line. This signal is received by
device 1 at its PI (priority in) input.
• The acknowledge signal passes on to the next device
through the PO (priority out) output only if device 1 is not
requesting an interrupt.


• If device 1 has a pending interrupt, it blocks the
acknowledge signal from the next device by placing a 0
on its PO output.
• It then proceeds to insert its own interrupt vector
address (VAD) onto the data bus for the CPU to use
during the interrupt cycle.

[Figure: Daisy-chain priority interrupt — devices connected serially, each device's PO feeding the next device's PI; the interrupt request line is common to all devices, and each device can place its VAD on the data bus]

• A device with PI = 0 generates PO = 0 as output, to
inform the next lower-priority device that the acknowledge signal is
blocked.
• A device requesting an interrupt that has PI = 1 makes its PO = 0,
to block the acknowledge signal from the next-priority device.
• That device then places its vector address (VAD)
onto the processor bus, to be used during the interrupt cycle.


• Thus the device with PI = 1 and PO = 0 is the one with
the highest priority that is requesting an interrupt, and
this device places its VAD on the data bus.
• The daisy-chain arrangement gives the highest priority
to the device that receives the interrupt acknowledge
signal from the CPU.
• The farther the device is from the first position, the lower
is its priority.

[Figure: One stage of the daisy-chain priority arrangement — the RF flip-flop, an open-collector inverter onto the common interrupt line, and the PI/PO logic with the VAD enable]

• The figure shows the internal logic that must be included
within each device when connected in the daisy-chaining
scheme. The device sets its RF flip-flop when it wants to
interrupt the CPU.
• The output of the RF flip-flop goes through an open-
collector inverter, a circuit that provides the wired logic
for the common interrupt line.
• If PI = 0, both PO and the enable line to VAD are equal
to 0, irrespective of the value of RF. If PI = 1 and RF = 0,
then PO = 1 and the vector address is disabled.
• This condition passes the acknowledge signal to the
next device through PO.
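The PI/PO logic above reduces to two Boolean equations per device — PO = PI AND (NOT RF), and VAD-enable = PI AND RF. A sketch of a chain of devices, listed from highest priority to lowest:

```python
def daisy_chain(rf_bits):
    """Given each device's interrupt-request flip-flop RF (listed from
    highest priority to lowest), propagate PI/PO down the chain and
    return the index of the device allowed to place its VAD, if any."""
    pi = 1                        # CPU's acknowledge enters the first device
    for i, rf in enumerate(rf_bits):
        enable_vad = pi and rf    # active device: PI = 1 and RF = 1
        po = pi and not rf        # pass acknowledge on only if not requesting
        if enable_vad:
            return i              # this device drives its VAD onto the bus
        pi = po                   # next device's PI is this device's PO
    return None                   # no device was requesting an interrupt

print(daisy_chain([0, 1, 1]))  # 1 — device 1 blocks device 2's acknowledge
print(daisy_chain([0, 0, 0]))  # None
```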


• The device is active when PI = 1 and RF = 1. This
condition places a 0 on PO and enables the vector
address onto the data bus.
• It is assumed that each device has its own distinct
vector address. The RF flip-flop is reset after a sufficient
delay to ensure that the CPU has received the vector
address.


• The parallel priority interrupt method uses a register
whose bits are set separately by the interrupt signal from
each device.
• Priority is established according to the position of the
bits in the register.
• In addition to the interrupt register, the circuit may
include a mask register whose purpose is to control the
status of each interrupt request.
• The mask register can be programmed to disable lower-
priority interrupts while a higher-priority device is being
serviced.
• It can also provide a facility that allows a high-priority
device to interrupt the CPU while a lower-priority device
is being serviced.
[Figure: Parallel priority interrupt hardware — interrupt register, mask register, AND gates, priority encoder, IST and IEN flip-flops, and the VAD output register]

• It consists of an interrupt register whose individual bits
are set by external conditions and cleared by program
instructions.
• The magnetic disk, being a high-speed device, is given
the highest priority. The printer has the next priority,
followed by a character reader and a keyboard.
• The mask register has the same number of bits as the
interrupt register. By means of program instructions, it is
possible to set or reset any bit in the mask register.
• Each interrupt bit and its corresponding mask bit are
applied to an AND gate to produce the four inputs to a
priority encoder.


• In this way an interrupt is recognized only if its corresponding mask bit is set to 1 by the program. The priority encoder generates two bits of the vector address, which are transferred to the CPU.
• Another output from the encoder sets an interrupt status flip-flop IST when an interrupt that is not masked occurs. The interrupt enable flip-flop IEN can be set or cleared by the program to provide overall control over the interrupt system.
• The output of IST ANDed with IEN provides a common interrupt signal for the CPU.
• The interrupt acknowledge INTACK signal from the CPU enables the bus buffers in the output register, and a vector address VAD is placed on the data bus.
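The behaviour described above can be sketched in a few lines of Python. This is an illustrative model only, not the hardware in the figure: the 4-bit registers, the bit ordering (bit 3 = highest priority), and the VAD base value are all assumptions made for the sketch.

```python
def priority_interrupt(interrupt_reg, mask_reg, ien=True, vad_base=0b11000000):
    """interrupt_reg / mask_reg: 4-bit integers; bit 3 is the highest priority.

    Returns the vector address sent to the CPU, or None if no unmasked
    interrupt is pending (or the interrupt system is disabled via IEN).
    """
    unmasked = interrupt_reg & mask_reg      # the per-bit AND gates
    if not ien or unmasked == 0:
        return None                          # IST stays 0: no interrupt to the CPU
    for bit in (3, 2, 1, 0):                 # priority encoder: highest bit wins
        if unmasked & (1 << bit):
            return vad_base | bit            # two encoder bits complete VAD
```

A masked request (mask bit 0) is simply invisible to the encoder, which is exactly how a lower-priority device is held off while a higher-priority one is serviced.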
MEMORY ORGANIZATION
Syllabus
• Memory hierarchy
• Main Memory
• Auxiliary Memory
• Associative Memory
• Cache Memory
Memory
Memory Hierarchy:

• In computer system design, the memory hierarchy is an enhancement that organizes memory so as to minimize access time. The memory hierarchy design is based on a program behaviour known as locality of reference.

We can infer the following characteristics of memory hierarchy design from the figure above:
• Capacity:
It is the global volume of information the
memory can store. As we move from top to
bottom in the Hierarchy, the capacity increases.
• Access Time:
It is the time interval between the read/write
request and the availability of the data. As we
move from top to bottom in the Hierarchy, the
access time increases.
• Performance:
Earlier, when computer systems were designed without a memory hierarchy, the speed gap between CPU registers and main memory grew because of the large difference in access time. This resulted in lower system performance, so an enhancement was required; the memory hierarchy design provides it and increases the performance of the system. One of the most significant ways to increase system performance is to minimize how far down the memory hierarchy one has to go to manipulate data.
• Cost per bit:
As we move from bottom to top in the Hierarchy, the cost
per bit increases i.e. Internal Memory is costlier than
External Memory.

Main Memory
• The main memory acts as the central storage unit
in a computer system. It is a relatively large and
fast memory which is used to store programs and
data during the run time operations.
• The primary technology used for the main
memory is based on semiconductor integrated
circuits. The integrated circuits for the main
memory are classified into two major units.
• RAM (Random Access Memory) integrated circuit
chips
• ROM (Read Only Memory) integrated circuit chips
RAM Chips
• The RAM integrated circuit chips are further classified into
two possible operating modes, static and dynamic
• A static RAM consists essentially of flip-flops that store the binary information. The stored information is volatile: it remains valid only as long as power is applied to the system. Static RAM is easy to use and takes less time to perform read and write operations than dynamic RAM.
• A dynamic RAM stores the binary information as electric charges on capacitors, which are provided inside the chip by MOS transistors. Dynamic RAM consumes less power and provides larger storage capacity in a single memory chip.

• A 128 * 8 RAM chip has a memory capacity of 128
words of eight bits (one byte) per word. This requires a
7-bit address and an 8-bit bidirectional data bus.
• The 8-bit bidirectional data bus allows the transfer of
data either from memory to CPU during
a read operation or from CPU to memory during
a write operation.
• The read and write inputs specify the memory
operation, and the two chip select (CS) control inputs
are for enabling the chip only when the microprocessor
selects it.
• The bidirectional data bus is constructed using three-
state buffers.

• The output generated by three-state buffers can be
placed in one of the three possible states which include
a signal equivalent to logic 1, a signal equal to logic 0,
or a high-impedance state.
• Note: The logic 1 and 0 are standard digital signals
whereas the high-impedance state behaves like an
open circuit, which means that the output does not
carry a signal and has no logic significance.
• The following function table specifies the operations of
a 128 * 8 RAM chip.
• From the functional table, we can conclude that the
unit is in operation only when CS1 = 1 and CS2 = 0. The
bar on top of the second select variable indicates that
this input is enabled when it is equal to 0.
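The function-table behaviour can be modelled with a small Python sketch. This is a hypothetical model, not the chip's actual circuitry: the `'Z'` return value stands in for the high-impedance state of the three-state buffers.

```python
def ram_chip(cs1, cs2, rd, wr, addr, data_in, memory):
    """Model of a 128 x 8 RAM chip's function table.

    memory: list of 128 byte values. Returns the byte read,
    or 'Z' when the data bus is in the high-impedance state.
    """
    if not (cs1 == 1 and cs2 == 0):
        return 'Z'                            # chip not selected: bus floats
    if wr:
        memory[addr & 0x7F] = data_in & 0xFF  # 7-bit address, 8-bit word
        return 'Z'
    if rd:
        return memory[addr & 0x7F]
    return 'Z'                                # selected but no operation requested
```

Note how every chip-select combination other than CS1 = 1, CS2 = 0 leaves the bus floating, which is what lets several chips share one data bus.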

ROM Chips
• The primary component of the main memory is RAM
integrated circuit chips, but a portion of memory may
be constructed with ROM chips.
• A ROM memory is used for keeping programs and data
that are permanently resident in the computer.
• Apart from the permanent storage of data, the ROM
portion of main memory is needed for storing an initial
program called a bootstrap loader. The primary
function of the bootstrap loader program is to start the
computer software operating when power is turned
on.
• ROM chips are also available in a variety of sizes and
are also used as per the system requirement.

• A ROM chip has a similar organization as a RAM
chip. However, a ROM can only perform read
operation; the data bus can only operate in an
output mode.
• The 9-bit address lines in the ROM chip specify
any one of the 512 bytes stored in it.
• The value for chip select 1 and chip select 2 must
be 1 and 0 for the unit to operate. Otherwise, the
data bus is said to be in a high-impedance state.
CONNECTION OF MEMORY TO CPU

Auxiliary Memory
• In addition to main memory, a computer system has a large storage called auxiliary or secondary storage.
• It is larger in capacity, lower in cost, and has a longer access time.
• The physical properties of such devices are complex, but their logical properties can be characterized and compared by a few parameters:
• Namely: access mode, access time, transfer rate, cost, and capacity.
• The access mode may be direct or indirect.
• Access time: the average time required to reach a storage location and access the data.

• In magnetic disks and tapes:
• Access time = seek time + transfer time
• Because seek time is much larger than transfer time, the data on these devices is maintained in the form of records/blocks.
• Magnetic disks consist of cylindrical surfaces divided into tracks, which are in turn divided into sectors that hold the information.
Associative Memory
• An associative memory can be considered as a memory
unit whose stored data can be identified for access by
the content of the data itself rather than by an address
or memory location.
• Associative memory is often referred to as Content
Addressable Memory (CAM).
• Associative memory is much slower than RAM, and is
rarely encountered in mainstream computer designs.
• When a write operation is performed on associative
memory, no address or memory location is given to the
word.
• The memory itself is capable of finding an empty
unused location to store the word.

• On the other hand, when the word is to be read from
an associative memory, the content of the word, or
part of the word, is specified.
• The words which match the specified content are
located by the memory and are marked for reading.
• Associative memory is expensive to implement as
integrated circuitry.
• Associative memory can be used in certain very-high-
speed searching applications where search time is very
critical and must be very short.

H/W organization of Associative Memory

• From the block diagram, we can say that an
associative memory consists of a memory array
and logic for 'm' words with 'n' bits per word.
• The functional registers like the argument
register A and key register K each have n bits, one
for each bit of a word. The match
register M consists of m bits, one for each
memory word.
• The words which are kept in the memory are
compared in parallel with the content of the
argument register.

• The key register (K) provides a mask for choosing
a particular field or key in the argument word.
• If the key register contains a binary value of all
1's, then the entire argument is compared with
each memory word.
• Otherwise, only those bits in the argument that
have 1's in their corresponding position of the
key register are compared.
• Thus, the key provides a mask for identifying a
piece of information which specifies how the
reference to memory is made.

• The diagram represents the relation between the memory array and the external registers in an associative memory.
• The cells present inside the memory array are
marked by the letter C with two subscripts.
• The first subscript gives the word number and the
second specifies the bit position in the word. For
instance, the cell Cij is the cell for bit j in word i.
• A bit Aj in the argument register is compared with
all the bits in column j of the array provided that
Kj = 1. This process is done for all columns j = 1, 2,
3......, n.

• If a match occurs between all the unmasked
bits of the argument and the bits in word i,
the corresponding bit Mi in the match register
is set to 1. If one or more unmasked bits of the
argument and the word do not match, Mi is
cleared to 0.
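The match logic just described can be sketched compactly in Python. This is an illustration of the comparison rule, not the parallel hardware: a word matches when every bit position selected by the key register agrees with the argument register.

```python
def cam_match(words, A, K):
    """words: list of n-bit integers (the memory array); A: argument register;
    K: key register. Returns the match register as a list of 0/1 bits."""
    # (word XOR A) has a 1 wherever the word differs from the argument;
    # masking with K keeps only the bit positions the key selects.
    return [1 if (word ^ A) & K == 0 else 0 for word in words]
```

With K all 1s the entire argument is compared; with a partial key, only the masked field must match.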

Cache Memory
• The data or contents of the main memory that
are used frequently by CPU are stored in the
cache memory so that the processor can easily
access that data in a shorter time.
• Whenever the CPU needs to access memory, it first checks the cache memory. If the data is not found there, the CPU then accesses the main memory.
• Cache memory is placed between the CPU and
the main memory. The block diagram for a cache
memory can be represented as:

• The cache is the fastest component in the memory hierarchy
and approaches the speed of CPU components.
• The basic operation of a cache memory is as follows:
– When the CPU needs to access memory, the cache is
examined. If the word is found in the cache, it is read from
the fast memory.
– If the word addressed by the CPU is not found in the
cache, the main memory is accessed to read the word.
– A block of words containing the one just accessed is then transferred from main memory to cache memory. The block size may vary from one word (the one just accessed) to about 16 words adjacent to the one just accessed.
– The performance of the cache memory is frequently
measured in terms of a quantity called hit ratio.

– When the CPU refers to memory and finds the word in
cache, it is said to produce a hit.
– If the word is not found in the cache, it is in main
memory and it counts as a miss.
– The ratio of the number of hits divided by the total
CPU references to memory (hits plus misses) is the hit
ratio.
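The hit-ratio definition above is a one-line computation; this small helper is purely illustrative:

```python
def hit_ratio(hits, misses):
    """Number of hits divided by total CPU references to memory (hits + misses)."""
    total = hits + misses
    return hits / total if total else 0.0
```

A hit ratio of 0.9, for example, means 9 out of every 10 memory references were satisfied from the cache.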
The basic characteristic of cache memory is its fast access time; therefore, little or no time must be wasted when searching for words in the cache. The transformation of data from main memory to cache memory is referred to as a mapping process.
We have three types of mapping procedures
• Associative mapping
• Direct mapping
• Set-associative mapping
To help understand the mapping procedure, we have the following
example:

Associative mapping:

• The fastest and most flexible cache organization uses an associative memory.
• The associative memory stores both the address and data of the memory word.
• This permits any location in cache to store any word from main memory.
• The address value of 15 bits is shown as a five-digit octal number and its corresponding 12-bit word is shown as a four-digit octal number.

 A CPU address of 15 bits is placed in the argument register and the associative memory is searched for a matching address.
 If the address is found, the corresponding 12-bit data is read and sent to the CPU.
 If not, the main memory is accessed for the word.
 If the cache is full, an address-data pair must be displaced to make room for a pair that is needed and not presently in the cache.

Direct Mapping
• Associative memory is expensive compared to RAM.
• In the general case, there are 2^k words in cache memory and 2^n words in main memory (in our case, k = 9, n = 15).
• The n-bit memory address is divided into two fields: k bits for the index and n - k bits for the tag field.
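The tag/index split is simple bit slicing, sketched here for the sizes used in the text (n = 15, k = 9); the lookup helper and its dictionary-based cache are illustrative assumptions, not the hardware organization:

```python
def split_address(addr, n=15, k=9):
    """Split an n-bit address into (tag, index): high n-k bits and low k bits."""
    index = addr & ((1 << k) - 1)
    tag = (addr >> k) & ((1 << (n - k)) - 1)
    return tag, index

def direct_lookup(cache, addr):
    """cache: dict index -> (tag, data), one entry per index (direct mapping)."""
    tag, index = split_address(addr)
    entry = cache.get(index)
    return entry[1] if entry and entry[0] == tag else None   # hit or miss
```

Two addresses with the same index but different tags compete for the same cache slot, which is exactly the limitation set-associative mapping relaxes.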

Set-Associative Mapping
• The disadvantage of direct mapping is that two words with the same index in their address but with different tag values cannot reside in cache memory at the same time.
• Set-associative mapping is an improvement over direct mapping in that each word of cache can store two or more words of memory under the same index address.
• In the next slide, each index address refers to two data words and their associated tags.
• Each tag requires six bits and each data word has 12 bits, so the word length is 2 * (6 + 12) = 36 bits.
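A two-way set-associative lookup can be sketched as follows. The dictionary of two-entry lists is an illustrative stand-in for the 36-bit cache words described above:

```python
def set_assoc_lookup(cache, tag, index):
    """cache: dict index -> list of (tag, data) pairs (two per set here)."""
    for stored_tag, data in cache.get(index, []):
        if stored_tag == tag:
            return data                       # hit
    return None                               # miss
```

Unlike direct mapping, two words with the same index but different tags can now be resident at the same time.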

Unit V

CISC & RISC Characteristics
• An important aspect of computer architecture is the instruction set provided to the computer.
• The instruction set selected for a computer determines the way machine language programs are constructed.

• Early computers had small instruction sets, mainly because of the costly hardware needed to implement them.
CISC
• As digital hardware became cheaper, the number and complexity of instructions increased.
• A system can have 100 or even more than 200 instructions.
• These computers also support many data types and a large number of addressing modes.

• The trend toward computer hardware complexity was influenced by:
– Upgrading existing models
– Adding instructions to facilitate translation of high-level language programs into machine-level programs
– Moving functions from software implementation to hardware implementation
– A computer with a large number of instructions is called a Complex Instruction Set Computer (CISC)
• In the early 1980s, computers appeared that use few instructions with simple constructs, so the instructions can be executed much faster within the CPU without having to use memory as often. Such a machine is called a Reduced Instruction Set Computer (RISC).

CISC Characteristics
• A large number of instructions
• A large number of addressing modes
• Some instructions that perform specialized tasks and are used infrequently
• Variable-length instruction formats
• Instructions that manipulate operands in memory

• With more instructions and addressing modes, more hardware logic is required to implement them; this logic is complex and thus slows down the computations.

RISC Characteristics
• It involves an attempt to reduce the execution
time by simplifying instruction set
• Characteristics
– Few instructions
– Few addressing modes
– Memory access limited to load and store
– All operations are done within registers of CPU
– Fixed length instruction format easy for decoding
– Single cycle instruction execution(Pipelining)

– Uses large no of registers
– Efficient instruction Pipeline
– Overlapped register windows
• Much time is consumed saving and restoring values to and from memory or the stack when implementing subroutine call and return; to avoid this, RISC processors use overlapped register windows.

• Register Windows:
• Procedure CALL and RETURN occur frequently in high-level programming languages. When translated into machine language, a procedure CALL produces a sequence of instructions that:
– Save register values.
– Pass parameters needed for the procedure.
– Call a subroutine to execute the body of the procedure.

• After a procedure RETURN, the program will:
– Restore the old register values.
– Pass results to the calling program.
– Return from the subroutine.

• Saving and restoring registers and passing parameters
and results involve time-consuming operations.
• To overcome this, one of the following techniques must
be used:
• 1. Using multiple register banks: each procedure is allocated its own bank of registers. This eliminates the need for saving and restoring register values.
• 2. Using the memory stack to store the parameters that
are needed by the procedure, but this requires a
memory access every time the stack is accessed.

• 3. Using overlapped register windows (RISC
processors) to provide the passing of
parameters and avoid the need for saving and
restoring register values.

• For overlapped register windows:
• Each procedure CALL results in the allocation
of a new window consisting of a set from the
register file for use by the new procedure.
• Each procedure CALL activates a new register
window by incrementing a pointer, while the
RETURN statement decrements the pointer
and causes the activation of the previous
window

• Windows for adjacent procedures have
overlapping registers that are shared to
provide the passing of parameters and results.

• Example:
• The system has 74 registers:
• R0-R9: Ten hold parameters shared by all procedures.
• R10-R73: Sixty four registers are divided into FOUR
windows to accommodate procedures A, B, C & D.
• Each register window consists of 10 local registers and
TWO sets of 6 registers common to adjacent windows.
• Local Registers: used for local variables.
• Common Registers: used for exchange of parameters
and results between adjacent procedures.

• Each procedure has access to 32 registers while it is active: global + local + overlapping registers, i.e. 10 + 10 + 6 + 6 = 32 registers.
• To calculate Register Window Size:
• G: No. of global registers.
• L: No. of local registers in each window.
• C: No. of registers common to two windows
• W: No. of windows.
• Window size (S): the number of registers available to each window:
S = L + 2C + G
• Register file (F): the total number of registers needed in the processor:
F = (L + C)W + G
• For the given example: G = 10, L = 10, C = 6 and W = 4, so
S = L + 2C + G = 32 registers
F = (L + C)W + G = 74 registers
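The two formulas above translate directly to code and can be checked against the worked example (G = 10, L = 10, C = 6, W = 4):

```python
def window_size(G, L, C):
    """Registers visible to one active procedure: S = L + 2C + G."""
    return L + 2 * C + G

def register_file(G, L, C, W):
    """Total registers needed in the processor: F = (L + C)W + G."""
    return (L + C) * W + G
```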
Chap. 9 Pipeline and Vector Processing
 Parallel Processing
 Simultaneous data processing tasks for the purpose of increasing the computational speed
 Perform concurrent data processing to achieve faster execution time
 Multiple Functional Units: parallel processing example
 Separate the execution unit into eight functional units operating in parallel: adder-subtractor, integer multiply, logic unit, shift unit, incrementer, floating-point add-subtract, floating-point multiply, floating-point divide
 Computer Architectural Classification
  » Data-Instruction Stream: Flynn
  » Serial versus Parallel Processing: Feng
  » Parallelism and Pipelining: Händler
 Flynn's Classification
 1) SISD (Single Instruction - Single Data stream)
  [Figure: control unit CU sends the instruction stream IS to a processing unit PU, which exchanges a data stream DS with a memory module MM]
  » For practical purposes: only one processor is useful
  » Example systems: Amdahl 470V/6, IBM 360/91
 2) SIMD (Single Instruction - Multiple Data stream)
  » Vector or array operations: one vector operation includes many operations on a data stream
  » Example systems: CRAY-1, ILLIAC-IV
  [Figure: one CU broadcasts the instruction stream IS to processing units PU 1 … PU n, each operating on its own data stream DS 1 … DS n from memory modules MM 1 … MM n of a shared memory]

 3) MISD (Multiple Instruction - Single Data stream)
  » Data stream bottleneck
  [Figure: control units CU 1 … CU n issue instruction streams IS 1 … IS n to processing units PU 1 … PU n, all operating on a single data stream DS through shared memory MM 1 … MM n]

 4) MIMD (Multiple Instruction - Multiple Data stream)
  » Multiprocessor system
  [Figure: control units CU 1 … CU n drive processing units PU 1 … PU n, which access memory modules MM 1 … MM n through a shared memory]
 Main topics in this chapter
  » Pipeline processing:
    Arithmetic pipeline: adder/multiplier pipeline
    Instruction pipeline
  » Vector processing: large vectors, matrices
  » Array processing: array processors for array data
    Attached array processor
    SIMD array processor
 Pipelining
 Decomposing a sequential process into suboperations
 Each subprocess is executed in a special dedicated segment concurrently

 Pipelining Example
 Multiply and add operation: Ai * Bi + Ci (for i = 1, 2, …, 7)
 3 suboperation segments:
  » 1) R1 ← Ai, R2 ← Bi : input Ai and Bi
  » 2) R3 ← R1 * R2, R4 ← Ci : multiply and input Ci
  » 3) R5 ← R3 + R4 : add Ci

 Pipelining: General Considerations
 4-segment pipeline:
  » S: combinational circuit for a suboperation
  » R: register (holds intermediate results between segments)
 Space-time diagram:
  » Shows segment utilization as a function of time
  » Tasks T1, T2, …, T6: the total operation performed by a task requires it to pass through all the segments

Clock cycle:  1   2   3   4   5   6   7   8   9
Segment 1:    T1  T2  T3  T4  T5  T6
Segment 2:        T1  T2  T3  T4  T5  T6
Segment 3:            T1  T2  T3  T4  T5  T6
Segment 4:                T1  T2  T3  T4  T5  T6

 Speedup S: nonpipeline time / pipeline time
 With a pipeline: a k-segment pipeline with clock cycle time tp executes n tasks in (k + n - 1) clock cycles
 Without a pipeline: each task takes tn
 S = n · tn / ((k + n - 1) · tp)
 For the example: S = 6 · 6tp / ((4 + 6 - 1) · tp) = 36tp / 9tp = 4
  » n: number of tasks (6)
  » tn: time to complete each task without a pipeline (6 cycle times = 6tp)
  » tp: clock cycle time (1 clock cycle)
  » k: number of segments (4)
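The speedup formula evaluates directly; the sketch below checks it for the example values (k = 4 segments, n = 6 tasks, tn = 6 cycle times, tp = 1 cycle):

```python
def speedup(n, k, tn, tp):
    """Pipeline speedup S = n*tn / ((k + n - 1)*tp)."""
    return (n * tn) / ((k + n - 1) * tp)
```

As n grows large, (k + n - 1) approaches n and the speedup approaches tn/tp, i.e. k when tn = k·tp.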

 Instruction Pipeline
 Instruction Cycle
1) Fetch the instruction from memory
2) Decode the instruction
3) Calculate the effective address
4) Fetch the operands from memory
5) Execute the instruction
6) Store the result in the proper place

 Instruction Pipeline
 Example: four-segment instruction pipeline
 Four-segment CPU pipeline:
  » 1) FI: instruction fetch
  » 2) DA: decode instruction & calculate effective address
  » 3) FO: operand fetch
  » 4) EX: execution
 Timing of the instruction pipeline (instruction 3 is a branch):
  [Timing diagram, steps 1-13: instructions enter FI, DA, FO, EX one step apart. With no branch, all seven instructions flow without interruption. When instruction 3 is a branch, instruction 4 is fetched in step 4 but held until the branch executes in step 6; its fetch is repeated in step 7 and instructions 5-7 follow.]

Computer System Architecture Chap. 9 Pipeline and Vector Processing Dept. of Info. Of Computer

 Pipeline Conflicts: three major difficulties
 1) Resource conflicts
  » memory access by two segments at the same time
 2) Data dependency
  » an instruction depends on the result of a previous instruction, but this result is not yet available
 3) Branch difficulties
  » branch and other instructions (interrupt, return, …) that change the value of the PC
 Data dependency solutions:
  » Hardware interlock: circuitry detects the dependency and delays the instruction that needs the result of the previous instruction
  » Operand forwarding: the result of the previous instruction is routed directly to the unit that needs it
  » Software — delayed load: the compiler inserts no-operation instructions after the previous instruction

 Delayed Branch: the compiler either
  » 1) inserts no-operation instructions, or
  » 2) rearranges the instructions

(a) Using no-operation instructions:
Clock cycles:        1 2 3 4 5 6 7 8 9 10
1. Load              I A E
2. Increment           I A E
3. Add                   I A E
4. Subtract                I A E
5. Branch to X               I A E
6. No-operation                I A E
7. No-operation                  I A E
8. Instruction in X                I A E

(b) Rearranging the instructions:
Clock cycles:        1 2 3 4 5 6 7 8
1. Load              I A E
2. Increment           I A E
3. Branch to X           I A E
4. Add                     I A E
5. Subtract                  I A E
6. Instruction in X            I A E

 9-5 RISC Pipeline
 RISC CPU: instruction pipeline with single-cycle instruction execution and compiler support
 Example: three-segment instruction pipeline
 3 suboperations of the instruction cycle:
  » 1) I: instruction fetch
  » 2) A: instruction decode and ALU operation
  » 3) E: transfer the output of the ALU to a register, memory, or the PC
 Delayed load: when an instruction (here ADD R1 + R2) needs a register that the previous instruction is still loading, the compiler inserts a no-operation instruction to resolve the conflict.

(a) Pipeline timing with data conflict:
Clock cycles:    1 2 3 4 5 6
1. Load R1       I A E
2. Load R2         I A E
3. Add R1+R2         I A E   (conflict: R2 not yet loaded)
4. Store R3            I A E

(b) Pipeline timing with delayed load:
Clock cycles:    1 2 3 4 5 6 7
1. Load R1       I A E
2. Load R2         I A E
3. No-operation      I A E
4. Add R1+R2           I A E
5. Store R3              I A E

 9-6 Vector Processing
 Science and engineering applications:
  Long-range weather forecasting, petroleum exploration, seismic data analysis, medical diagnosis, aerodynamics and space flight simulations, artificial intelligence and expert systems, mapping the human genome, image processing
 Vector Operations
  Arithmetic operations on large arrays of numbers
  Conventional scalar processor:
   » Fortran language:
       DO 20 I = 1, 100
       20 C(I) = A(I) + B(I)
   » Machine language:
       Initialize I = 0
       20 Read A(I)
       Read B(I)
       Store C(I) = A(I) + B(I)
       Increment I = I + 1
       If I <= 100 go to 20
       Continue
  Vector processor:
   » Single vector instruction:
       C(1:100) = A(1:100) + B(1:100)
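The contrast between the scalar loop and the single vector operation can be mimicked in Python (illustrative only; a real vector processor executes the whole-array form as one instruction):

```python
A = list(range(100))
B = list(range(100, 200))

# Scalar style: read A(I), read B(I), store C(I), increment I, test the bound.
C_scalar = []
for i in range(100):
    C_scalar.append(A[i] + B[i])

# Vector style: one operation expressed over the whole arrays at once.
C_vector = [a + b for a, b in zip(A, B)]
```

The scalar version performs the read/add/store/increment/test bookkeeping 100 times; the vector form states the operation once over the entire index range.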
 Vector Instruction Format:

  Operation code | Base address source 1 | Base address source 2 | Base address destination | Vector length
  ADD            | A                     | B                     | C                        | 100

 Matrix Multiplication
  3 x 3 matrix multiplication: n^2 = 9 inner products

    [a11 a12 a13]   [b11 b12 b13]   [c11 c12 c13]
    [a21 a22 a23] x [b21 b22 b23] = [c21 c22 c23]
    [a31 a32 a33]   [b31 b32 b33]   [c31 c32 c33]

  » c11 = a11*b11 + a12*b21 + a13*b31 : one inner product (9 in total)
 Cumulative multiply-add operation: n^3 = 27 multiply-adds
  c ← c + a*b
  » Initialize c11 = 0; then c11 ← c11 + a11*b11, c11 ← c11 + a12*b21, c11 ← c11 + a13*b31 : 3 multiply-adds per inner product, 9 x 3 = 27 multiply-adds
 Pipeline for calculating an inner product:
  Floating-point multiplier pipeline: 4 segments
  Floating-point adder pipeline: 4 segments
  C = A1B1 + A2B2 + A3B3 + … + AkBk
  [Figure: source A and source B feed the multiplier pipeline, whose products feed the adder pipeline; snapshots after the 1st, 4th, 8th, and subsequent clock inputs show successive products A1B1, A2B2, … advancing through the segments]
  » Four-section summation: the adder accumulates four interleaved partial sums (e.g. A1B1 + A5B5, A2B2 + A6B6, …), giving
    C = (A1B1 + A5B5 + A9B9 + A13B13 + …)
      + (A2B2 + A6B6 + A10B10 + A14B14 + …)
      + (A3B3 + A7B7 + A11B11 + A15B15 + …)
      + (A4B4 + A8B8 + A12B12 + A16B16 + …)
 Memory Interleaving:
  Simultaneous access to memory from two or more sources using one memory bus system
  Even/odd address memory access
  [Figure: the address bus feeds four memory modules, each with its own address register (AR), memory array, and data register (DR), all sharing a common data bus]
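Low-order interleaving (of which even/odd access is the two-module case) maps consecutive addresses to different modules so they can be accessed simultaneously. A minimal sketch of the address mapping, assuming the low-order bits select the module:

```python
def module_of(addr, modules=4):
    """Return (module, word-within-module) for a low-order interleaved memory."""
    return addr % modules, addr // modules
```

With four modules, addresses 0, 1, 2, 3 land in modules 0, 1, 2, 3, so a sequential sweep keeps all modules busy.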

 Supercomputer

 Supercomputer = vector instructions + pipelined floating-point arithmetic
 Performance evaluation indexes:
  » MIPS: million instructions per second
  » FLOPS: floating-point operations per second (megaflops: 10^6, gigaflops: 10^9)
 Cray supercomputers (Cray Research):
  » Cray-1: 80 megaflops, 4 million 64-bit words of memory
  » Cray-2: 12 times more powerful than the Cray-1
 VP supercomputers (Fujitsu):
  » VP-200: 300 megaflops, 32 million words of memory, 83 vector instructions, 195 scalar instructions
  » VP-2600: 5 gigaflops

 9-7 Array Processing
 Array processing is used to deal with processing of large arrays of data.
 Array processors, also known as vector processors, perform computations on large arrays of data.
 Thus, they are used to improve the performance of the computer.

 Types of Array Processors


• There are basically two types of array processors:
• Attached Array Processors
• SIMD Array Processors

 Attached Array Processors


• An attached array processor is a processor which is attached to a general purpose computer and
its purpose is to enhance and improve the performance of that computer in numerical
computational tasks.
• It achieves high performance by means of parallel processing with multiple functional units.

 SIMD Array Processor

 SIMD is the organization of a single computer containing multiple processors operating in parallel.
 The processing units are made to operate under the control of a common control unit, thus providing a single instruction stream and multiple data streams.

 It contains a set of identical processing elements (PEs), each having a local memory M.
 Each processing element includes an ALU and registers.
 The master control unit controls all the operations of the processing elements. It also decodes the instructions and determines how each instruction is to be executed.
 The main memory is used for storing the program.
 The control unit is responsible for fetching the instructions.
 Vector instructions are sent to all PEs simultaneously and the results are returned to memory.
 The best-known SIMD array processor is the ILLIAC IV computer developed by the Burroughs Corporation.
 SIMD processors are highly specialized computers. They are suitable only for numerical problems that can be expressed in vector or matrix form; they are not suitable for other types of computations.

BVRIT HYDERABAD College of Engineering for Women PAGE NO:1

Multi Processors
Syllabus
Characteristics of Multiprocessors, Interconnection Structures, Interprocessor arbitration, Interprocessor
communication and synchronization, Cache Coherence.

Characteristics of Multiprocessors:
The interconnection of two or more CPUs with memory and input-output equipment is called as multiprocessor
system.

The term "processor" here refers to


- A central processing unit (CPU) or
- An input-output processor (IOP).

But, a system with a single CPU and one or more IOPs is usually not a multiprocessor system, unless the IOP has
computational facilities comparable to a CPU.
So, a multiprocessor system implies the existence of multiple CPUs, with one or more IOPs.

A multicomputer system consists of several autonomous computers interconnected with each other by means of
communication lines.

Both multiprocessor and multicomputer systems support concurrent operations.


However, there exists few differences between them and they are:
1. Construction of multicomputer is easier and cost effective than a multiprocessor.
2. Programming is easier in multiprocessor system compared to multicomputer system.
3. Multiprocessor supports parallel computing, Multicomputer supports distributed computing.

The reliability of the system is improved with multiprocessing because a failure or error in one processor has a
limited effect on the rest of the system, as the second processor can be assigned to perform the functions of the
disabled processor.
Hence, the system will continue to function correctly with probably some loss in efficiency.
High performance can be achieved using a multiprocessor organization as the computations can proceed in
parallel in one of the following two ways.
1. Multiple independent jobs can be operated in parallel.
2. A single job can be partitioned into multiple parallel tasks.

1. Multiple independent jobs:


- The overall function can be partitioned into a number of independent tasks.
- Each task is assigned to separate processor, sometimes to a special purpose processor.
- Examples:
a. In a computer system
 One processor performs the computations for an industrial process control.
 Other processors monitor and control the various parameters, such as temperature and flow rate.

NAME OF THE FACULTY:MuraliNath R S, Nagamani G, Kavitha P SUBJECT:COA



b. In a computer system
 One processor performs high-speed floating-point mathematical computations.
 Other processor(s) take(s) care of routine data-processing tasks.

2. Multiple Parallel Tasks:


Here a program is divided into parallel executable tasks and this can be achieved in one of two ways:
a. The user can explicitly declare for parallel execution of certain tasks in the program. An operating system
with programming language constructs suitable for specifying parallel processing is required.
b. A compiler with multiprocessor software can automatically detect parallelism in a user's program by
checking for data dependency in the entire program using following criteria:
 If one part of a program depends on data generated in another part, then the part yielding the needed data
must be executed first.
 Two parts of a program that do not use data generated by each other can run concurrently.

Based on the organization of memory, multiprocessors are classified into two types:
1. Shared-memory or tightly coupled multiprocessor
2. Distributed-memory or loosely coupled multiprocessor

1. Shared-memory or tightly coupled multiprocessor:


- Shares a common memory
- May have cache memory for each processor
- Information is shared by placing in the common global memory.

2. Distributed-memory or loosely coupled multiprocessor:


- Each processor has its own local memory
- Switching scheme is used to connect the processors
- Information is exchanged in the form of packets using routing mechanism. A packet consists of an address, the
data content, and some error detection code.
- Packets can be addressed to a specific processor, a subset of processors or all the processors.

Tightly coupled systems can tolerate a higher degree of interaction between tasks.
Loosely coupled systems are most efficient for minimal interaction between tasks.

Interconnection Structures:
A multiprocessor system consists of following components:
- CPUs,
- IOPs connected to input-output devices,
- A memory unit that may be partitioned into a number of separate modules

Depending on the number of transfer paths available between the processors and memory in a shared-memory
system or distributed-memory system, the interconnection among the components can have different physical
configurations. They are:
1. Time-shared common bus
2. Multiport memory
3. Crossbar switch

4. Multistage switching network


5. Hypercube system
1. Time-Shared Common Bus:
In a common-bus multiprocessor system, a number of processors are connected through a common path to a
memory unit as shown below.

In this configuration:
- At any given time, only one processor having the control over the bus is allowed to communicate with the
memory or another processor.
- Any processor wishing to initiate a transfer:
 First determines the availability of the bus.
 Once the bus is available, the address of the destination is used to initiate the transfer and a command is
issued to indicate the operation to be performed.
 The destination recognizes its address on the bus and responds to the sender's control signals
accordingly.

The system will resolve the conflicts by incorporating a bus controller that establishes priorities among the
requesting units.
Limitations:
- A single common-bus system is restricted to one transfer at a time. So, all other processors are either busy
with internal operations or must be idle waiting for the bus.
- Hence, the total overall transfer rate within the system is limited by the speed of the single bus.

To overcome these, the processors in the system can be kept busy through the implementation of two or more
independent buses to permit multiple simultaneous bus transfers. But, this increases the overall system cost and
complexity.

A more economical implementation of a dual bus structure is as shown below.


In the above configuration:


- A number of local buses are used, each connecting to its own local memory and to one or more processors.
- Each local bus may be connected to a CPU, an IOP, or any combination of processors along with local
memory. Part of the local memory can be cache memory.
- A system bus controller links each local bus to a common system bus.
- The memory or IOP (not in the figure) connected to the common system bus is shared by all processors.
- At a given time, only one processor is allowed to communicate with the shared memory and other common
resources through the system bus.
- All other processors are kept busy communicating with their local memory and IO devices.

2. Multiport Memory:
A multiport memory system employs separate buses between each memory module and each CPU as shown
below.

In the above configuration:


- Each processor bus consisting of the address, data and control lines is connected to each memory module.
- The memory module is having four ports to accommodate the buses.
- The internal control logic of the memory module determines which one of the ports has access at any given
time.

- The conflicts are resolved by assigning fixed priorities to each memory port, and the processors are
connected accordingly.
Advantage:
High transfer rate can be achieved using multiple paths between processors and memory.

Disadvantages:
- Expensive memory control logic
- Large number of cables and connectors.

In view of the above, this interconnection structure is appropriate only for a small number of processors.

3. Crossbar Switch:
The crossbar switch organization shown below consists of a number of crosspoints that are placed at
intersections between processor buses and memory module paths.

In the above organization:


- The small square at each crosspoint is a switch to determine the path from a processor to a memory module.
- Each switch point has control logic to set up the transfer path between a processor and memory based on the
address placed in the bus.
- Switch point also resolves multiple requests for access to the same memory module on a predetermined
priority basis.


The functional design of a crossbar switch connected to one memory module is illustrated below.

The multiplexers and arbitration logic:


- Select the data, address, and control lines from one CPU for communication with the memory module.
- Establish priority levels, implemented with a priority encoder, to select one CPU when several conflict for
access to the same memory.

A crossbar switch organization uses a separate path associated with each memory module to support
simultaneous transfers. However, the hardware required to implement the switch can become quite large and
complex.

4. Multistage Switching Network:


The basic component of a multistage network is a two-input, two-output interchange switch as shown below.

Each 2 X 2 switch has


- Two inputs labeled A and B, and two outputs, labeled 0 and 1.
- The control signals (not shown) to establish the interconnection between the input and output terminals.
- The capability to connect either input A or B to either of the outputs.
- The capability to arbitrate between conflicting requests.

The 2 x 2 switch can be used as a building block, to construct multistage network to control the communication
between a number of sources and destinations.


Let us consider the following binary tree:

- The two processors P1 and P2 are connected through switches to eight memory modules marked in binary
from 000 through 111.
- The path from source to a destination is determined from the binary bits of the destination number.
 The first bit determines the switch output in the first level.
 The second bit specifies the output of the switch in the second level
 The third bit specifies the output of the switch in the third level.

For example, to connect P1 to memory 101, a path is formed with:


- output 1 in the first-level switch,
- output 0 in the second-level switch
- output 1 in the third-level switch

The above example illustrates that either P1 or P2 can be connected to any one of the eight memories.
But, certain request patterns cannot be satisfied simultaneously.
For example, if P1 is connected to one of the destinations 000 through 011, P2 can be connected to only one of
the destinations 100 through 111.

Many different topologies have been proposed for multistage switching networks to control
- Processor-memory communication in a tightly coupled multiprocessor system
- Communication between the processing elements in a loosely coupled system.

The omega switching network shown below is one of the popular topologies.


In the above configuration:


- There is exactly one path from each source to any particular destination.
- Some request patterns cannot be connected simultaneously. For example, any two sources cannot be
connected simultaneously to destinations 000 and 001.

The establishment of the path takes place as:


- The source initiates the request by sending a 3-bit destination number in the switching network.
- Each level examines a different bit in the 3-bit number to determine the 2 x 2 switch setting.
 Level 1 inspects the most significant bit.
 Level 2 inspects the middle bit, and
 Level 3 inspects the least significant bit.
- At each 2 x 2 switch, the input is routed to the upper output if the specified bit is 0 or to the lower output
if the bit is 1.
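The level-by-level bit inspection can be sketched directly. This is a toy helper with invented names, not tied to a particular network size:

```python
def route(destination_bits):
    """Multistage routing sketch: each switch level inspects one bit of the
    binary destination number (MSB first) and routes the request to the
    upper output on a 0 bit, or the lower output on a 1 bit."""
    assert set(destination_bits) <= {"0", "1"}
    return ["upper" if bit == "0" else "lower" for bit in destination_bits]

# Destination 101: lower output at level 1, upper at level 2, lower at level 3.
path = route("101")
```

The returned list is the setting chosen at each of the three switch levels along the path.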

In a tightly coupled multiprocessor system, the source is a processor and the destination is a memory module.
- The first pass through the network sets up the path.
- Succeeding passes are used to transfer the address into the memory and then transfer the data in either
direction based on Read or Write signals

In a loosely coupled multiprocessor system, both the source and destination are processing elements.
- The first pass through the network sets up the path.
- Then the source transfers the message to the destination.

5. Hypercube Interconnection:
The hypercube or binary n-cube multiprocessor structure is a loosely coupled system composed of N = 2^n nodes
interconnected in an n-dimensional binary cube.
- Each node can be a CPU, local memory or an I/O interface.
- Each node has direct communication paths, called edges, to n other neighbor nodes.
- There are 2^n distinct n-bit binary addresses that can be assigned to the nodes.
- Each node address differs from that of each of its n neighbors by exactly one bit position.
For example, the three neighbors of the node with address 100 in a three-cube structure are 000, 110, and
101. Each of these binary numbers differs from address 100 by one bit value.

The hypercube structure for n = 1, 2, and 3 is as given below.

In general, an n-cube structure has 2^n nodes with a processor residing in each node.


In an n-cube structure, one to n links are required to route messages from a source node to a destination node.
For example, in a three-cube structure, node 000:
- Can communicate directly with node 001.
- Must cross at least two links to communicate with 011 (from 000 to 001 to 011 or from 000 to 010 to 011).
- Go through at least three links to communicate from node 000 to node 111.

A routing procedure is based on the result of exclusive-OR of the source node address with the destination node
address. The resulting binary value will have 1 bits corresponding to the axes on which the two nodes differ.
The message is then sent along any one of the axes.
For example, in a three-cube structure, a message from 010 to 001 produces an exclusive-OR result equal to 011.
The message can be sent along the second axis to 000 and then through the third axis to 001.
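The exclusive-OR routing procedure can be sketched as a small illustration. The axis order here is chosen most-significant-first, which reproduces the 010 → 000 → 001 example, though any order of the differing axes yields a shortest path:

```python
def hypercube_route(src, dst, n=3):
    """Hypercube routing sketch: the XOR of source and destination addresses
    marks the axes on which they differ; flip one differing bit per hop.
    Bits are flipped from the most significant differing axis down, but any
    order gives a path of the same (minimal) length."""
    path = [src]
    node = src
    diff = src ^ dst
    for axis in range(n - 1, -1, -1):
        if diff & (1 << axis):
            node ^= (1 << axis)      # move along this axis to a neighbor
            path.append(node)
    return path

# Message from 010 to 001: XOR = 011, so two links are crossed.
hops = hypercube_route(0b010, 0b001)
```

The number of links crossed equals the number of 1 bits in the XOR result, so node 000 needs three links to reach node 111.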

Interprocessor Arbitration:
A number of buses at various levels are used to transfer the information between the components in computer
systems.
- Internal buses for transfer of information between processor registers and ALU.
- A Memory bus for transferring data, address and read/write information
- An I/O bus is used to transfer information to and from I/O devices

A system bus connects major components in a multiprocessor system, such as CPUs, IOPs and memory.

System Bus: The lines in a system bus are divided into three functional groups:
- Data
- Address
- Control
In addition, there will be power distribution lines also to the components.
Data Lines:
- Provides a path for the transfer of data between processors and common memory.
- Usually in the multiples of 8.
- Terminated with three-state buffers
- Bidirectional
Address Lines:
- Used to identify a memory address or any other source or destination (I/O ports)
- Determines the maximum possible memory capacity
- Terminated with three-state buffers
- Unidirectional, from processor to memory or I/O ports
Control Lines:
- Provide signals for controlling the information transfer between components.
- These include:
 Timing signals indicate the validity of data and address information.
 Command signals specify operations to be performed.
 Transfer Signals such as
 Memory read and write.
 Acknowledge of a transfer.
 Interrupt requests.
 Bus control signals – Bus request, Bus grant & Arbitration.


Data transfers over the system bus may be:


- Synchronous
- Asynchronous

In a synchronous bus,
- Each data item is transferred from the source to destination units during a pre-known time slice.
- Both source and destination are driven by a common clock.
- In the case of separate clocks, synchronization signals are transmitted periodically to keep all clocks in step
with each other.

In an asynchronous bus, the transfer of each data item is accomplished by using handshaking control signals.

The following figure lists the 86 lines available in the IEEE standard 796 multibus.

Arbitration Procedures:
All the requests from the processor(s) are serviced by arbitration procedures based on the priorities. There are
three kinds of arbitration procedures:
- Serial
- Parallel
- Dynamic
Serial Arbitration Procedure:
- This is a hardware bus priority resolving technique.
- The units requesting the control of the system bus are connected in series as a daisy-chain connection.
- The processors/units are assigned priority based on their position along the priority control line.
The daisy-chain connection of four arbiters is shown below:


In the above figure:


- Each processor has its own bus arbiter logic with priority-in (PI) and priority-out (PO) lines.
- The PO of each arbiter is connected to the PI of next-lower priority arbiter.
- The PI of the highest-priority unit is maintained at a logic 1 value.
- The PO output of a particular arbiter is 0 if its PI is 1 and it is requesting the system bus; if its PI is 1 and it
is not requesting, PO is 1.
- In this way the priority is passed to the next unit in the chain.
- If PI is 0, then PO will also be 0.
- The device with PI equal to 1 and PO equal to 0 receives control of the system bus by activating the bus
busy line.

In case of a higher-priority processor requests the bus; the lower-priority processor must complete its bus
operation before relinquishing the control.
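The PI/PO priority propagation above can be simulated with a short sketch; the function name and return convention are illustrative:

```python
def daisy_chain(requests):
    """Serial (daisy-chain) arbitration sketch. Arbiters are listed from
    highest to lowest priority; PI of the first arbiter is held at 1, and
    once an arbiter claims the bus it passes PO = 0 down the chain.
    Returns the index of the winning arbiter, or None if no one requests."""
    pi = 1
    winner = None
    for i, requesting in enumerate(requests):
        if pi == 1 and requesting:
            winner = i      # PI = 1 and requesting: claim the bus, drive PO = 0
            pi = 0
        # if PI is 0, PO stays 0; if PI is 1 and not requesting, PO stays 1
    return winner

# Arbiters 1 and 3 request; the higher-priority arbiter 1 wins.
winner = daisy_chain([False, True, False, True])
```

Because priority is positional, the unit closest to the head of the chain always wins among simultaneous requesters.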

Parallel Arbitration Procedure:


- This is a hardware bus priority resolving technique.
- It uses an external priority encoder and decoder.

In the above figure:


- Each bus arbiter has a bus request output line and a bus acknowledge input line.
- The request line is enabled, when its corresponding processor requests the access to the system bus.
- The processor takes control of the bus when its corresponding acknowledge input line is enabled.
- An orderly transfer of control is achieved with the help of bus busy line.
- All four request lines are connected to a 4 x 2 priority encoder.
- The encoder generates a 2-bit code, representing the highest-priority device among the requests.
- This 2-bit output from the encoder drives the 2 x 4 decoder, which enables the appropriate acknowledge line.
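The encoder/decoder pair can be sketched as follows; this is an illustrative function in which index 0 is the highest-priority request line:

```python
def parallel_arbitrate(requests):
    """Parallel arbitration sketch: a priority encoder picks the
    highest-priority active request (index 0 is highest), and a decoder
    asserts exactly one acknowledge line for that unit."""
    ack = [False] * len(requests)
    for i, req in enumerate(requests):   # encoder: first active request wins
        if req:
            ack[i] = True                # decoder: enable that ack line
            break
    return ack

# Units 1 and 2 request; unit 1 has higher priority and is acknowledged.
ack_lines = parallel_arbitrate([False, True, True, False])
```

Unlike the serial chain, all requests are examined at once, so the arbitration delay does not grow with the number of units.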


Dynamic Arbitration Algorithms:


- Both serial and parallel arbitration procedures use a static priority algorithm as the priority of each device is
fixed.
- A dynamic priority algorithm provides the capability for the system to change the priority of the devices
while in operation.

Few arbitration procedures that use dynamic priority algorithms are:


 Time slice
 Polling
 Least Recently Used (LRU)
 First-Come First-Serve (FCFS)
 Rotating daisy-chain

Time slice Algorithm:


- A fixed-length time slice of bus time is allocated to each processor serially in round-robin fashion.
- The service provided to each processor is independent of its location along the bus.
- No preference is given to any processor as the same amount of time is allocated.

Polling Algorithm:
- The poll lines connected to all units are used by the bus controller to define an address for each device.
- The bus controller sequences through the addresses in a prescribed manner.
- A processor that requires access, recognizes its address, activates the bus busy line and accesses the bus.
- The polling process continues by choosing a different processor, after a number of bus cycles.
- This process is programmable and the selection priority can be altered under program control.

LRU Algorithm:
- The highest priority is given to the unit that has not used the bus for the longest interval.
- The priorities are adjusted after a number of bus cycles based on the usage.
- No unit is favored as the priorities are dynamically changed and every unit will get an opportunity.

FCFS Algorithm:
- All the requests are served in the order received.
- A queue is established by the bus controller to maintain the arrived bus requests.
- Each unit must wait for its turn to use the bus on a First-In, First-Out (FIFO) basis.

Rotating daisy-chain Algorithm:


- It is a dynamic extension of serial daisy-chain algorithm.
- There is no central bus controller.
- The priority-out of the last device is connected to the priority-in of the first device, forming closed loop.
- Each arbiter's priority for a given bus cycle is determined by its position relative to the arbiter currently
controlling the bus.
- The arbiter releasing the bus will have the lowest priority.
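As one concrete illustration of a dynamic priority policy, the LRU algorithm can be sketched like this. Here `last_used` is hypothetical bookkeeping of the bus cycle at which each unit last held the bus, not part of any real bus standard:

```python
def lru_arbitrate(requests, last_used):
    """LRU dynamic-priority sketch: among the requesting units, grant the
    bus to the one whose last use of the bus is oldest. last_used[i] is
    the bus-cycle number at which unit i last held the bus."""
    candidates = [i for i, requesting in enumerate(requests) if requesting]
    if not candidates:
        return None
    return min(candidates, key=lambda i: last_used[i])

# Units 0 and 2 request; unit 2 has not used the bus for longer, so it wins.
winner = lru_arbitrate([True, False, True, False], last_used=[9, 5, 2, 7])
```

Recomputing priorities from usage history each cycle is what makes the policy dynamic: no unit is permanently favored.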


Interprocessor Communication and Synchronization


Interprocessor Communication
In a multiprocessor system a facility must be provided for communication among the various processors.
- A communication path can be established through common input-output channels.
- In a shared memory multiprocessor system,
 A portion of memory, acting as a message center, is available for all the processors to access.
 Each processor can leave messages for other processors and pick up messages intended for it.
 The sending processor structures a request, a message, or a procedure and places it in the memory
mailbox by setting the appropriate status bits.
 The receiving processor can check the mailbox periodically to determine if there are valid messages for
it.
 The response time in this procedure is poor.
 This procedure can be made more efficient when the sending processor alerts the receiving processor by
means of an interrupt signal.
 The interrupt alerts the receiving processor that a new message was inserted by the interrupting processor.
- A communication path between two CPUs can be established through a link between two IOPs associated
with the two different CPUs. This type of link allows each CPU to treat the other as an IO device so that
messages can be transferred through the IO path.
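The mailbox scheme can be sketched with a shared queue. This is a hypothetical illustration in which the blocking `get()` plays the role of the interrupt that alerts the receiving processor; a periodic check of the queue would instead model the slower status-bit polling:

```python
import queue
import threading

mailbox = queue.Queue()   # shared memory region acting as the message center

def sender():
    # The sending processor structures a message and places it in the mailbox.
    mailbox.put(("P1", "request: compute checksum"))

def receiver(results):
    # The blocking get() models the interrupt: the receiver wakes as soon as
    # a valid message is inserted, rather than checking periodically.
    src, msg = mailbox.get()
    results.append((src, msg))

results = []
t = threading.Thread(target=receiver, args=(results,))
t.start()
sender()
t.join()
```

With polling, the receiver's response time depends on how often it checks the mailbox; the interrupt-style wakeup removes that delay.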

To prevent conflicting use of shared resources by several processors there must be a provision in the operating
system for assigning resources to processors.
There are three organizations that have been used in the design of operating system for multiprocessors:
- Master-slave configuration
- Separate operating system
- Distributed operating system.

Master-Slave Configuration:
- One processor, designated the master, always executes the operating system functions.
- The remaining processors, denoted as slaves, do not perform operating system functions, but requests the
service by interrupting the master.

Separate operating system:


- Every processor may have its own copy of the entire operating system, as in loosely coupled systems.
- Each processor can execute the operating system routines it needs.

Distributed operating system:


- The operating system routines are distributed among the available processors.
- But, each particular operating system function is assigned to only one processor at a time.
- Also referred to as a floating operating system as the routines float from one processor to another and the
execution of the routines may be assigned to different processors at different times.

In a loosely coupled multiprocessor system,


- There is no shared memory for passing information.
- The memory is distributed among the processors.
- The communication between processors is by message passing through IO channels.
- The communication is initiated by the sender, calling a procedure that resides in the memory of the
destination to establish a channel.

- A message is then sent with a header and various data objects used to communicate between nodes.
- In the case of availability of multiple paths, the operating system in each node contains routing information.
- The communication efficiency of the interprocessor network depends on:
 Communication routing protocol.
 Processor speed.
 Data link speed.
 Topology of the network.

Interprocessor Synchronization
The multiprocessor’s instruction set contains basic instructions to implement communication and
synchronization between cooperating processes.
- Communication refers to the exchange of data between different processes.
- Synchronization refers to the control information, needed to enforce the correct sequence of processes and
to ensure mutually exclusive access to shared writable data.

Multiprocessor systems include various mechanisms to deal with the synchronization of resources.
- Low-level primitives to enforce mutual exclusion are implemented directly by the hardware.
- A number of hardware mechanisms for mutual exclusion have been developed. One of the most popular
methods is through the use of a binary semaphore.

Mutual Exclusion with a Semaphore:


Mutual exclusion is a mechanism that:
- Guarantees orderly access to shared memory and other shared resources.
- Protects data from being changed simultaneously by two or more processors.
- Enables one processor to exclude or lock out access to a shared resource by other processors while it is in a
critical section.

A critical section is a program sequence that, once begun, must complete execution before another processor
accesses the same shared resource.
A binary variable called a semaphore is often used to indicate whether or not a processor is executing a critical
section. A semaphore is a software-controlled flag that is stored in a memory location that all processors can
access.
- When the semaphore is equal to 1, a processor is executing a critical program, and the shared memory is not
available to other processors.
- When the semaphore is equal to 0, the shared memory is available to any requesting processor.
Processors that share the same memory segment agree by convention not to use the memory segment unless
the semaphore is equal to 0, indicating that memory is available. They also agree to set the semaphore to 1
when they are executing a critical section and to clear it to 0 when they are finished.

Testing and setting the semaphore is itself a critical operation and must be performed as a single indivisible
operation. If it is not, two or more processors may test the semaphore simultaneously and then each set it,
allowing them to enter a critical section at the same time. This would allow simultaneous execution of critical
sections, which can result in erroneous initialization of control parameters and a loss of essential information.

A semaphore can be initialized by means of a test-and-set instruction in

conjunction with a hardware lock mechanism. A hardware lock is a processor-generated signal that serves to
prevent other processors from using the system bus as long as the signal is active. The test-and-set instruction
tests and sets a semaphore and activates the lock mechanism during the time that the instruction is being
executed. This prevents other processors from changing the semaphore between the time that the processor is
testing it and the time that it is setting it.

Assume that the semaphore is a bit in the least significant position of a memory word whose address is
symbolized by SEM. Let the mnemonic TSL designate the "test and set while locked" operation. The
instruction

TSL SEM

will be executed in two memory cycles (the first to read and the second to write) without interference, as
follows:

R ← M[SEM]    Test semaphore
M[SEM] ← 1    Set semaphore
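The indivisible test-and-set can be sketched in software as follows. This is a hedged illustration with invented names, in which Python's `threading.Lock` stands in for the hardware bus lock that makes the two memory cycles atomic:

```python
import threading

class TestAndSetLock:
    """Sketch of TSL-style mutual exclusion. The read of the old semaphore
    value and the write of 1 must be one indivisible step; here a Python
    lock models the hardware bus-lock signal that guarantees this."""
    def __init__(self):
        self.sem = 0                        # M[SEM]: 0 = free, 1 = busy
        self._bus_lock = threading.Lock()   # stands in for the hardware lock

    def test_and_set(self):
        with self._bus_lock:                # bus locked for both memory cycles
            old = self.sem                  # R <- M[SEM]   (test)
            self.sem = 1                    # M[SEM] <- 1   (set)
        return old

    def acquire(self):
        while self.test_and_set() == 1:     # spin until the semaphore was 0
            pass

    def release(self):
        self.sem = 0                        # clear on leaving the critical section

lock = TestAndSetLock()
counter = 0

def worker():
    global counter
    for _ in range(200):
        lock.acquire()                      # critical section: shared update
        counter += 1
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

If `test_and_set` were split into a separate test and a separate set, two threads could both observe 0 and enter the critical section together, exactly the failure the text describes.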

Cache Coherence
• Cache memory has the advantage of reducing the average memory access time of each processor.
• It creates a problem in multiprocessor systems, as each processor maintains its own cache and updated data
might not be reflected in all the local cache memories.
• This cache incoherence occurs in multiprocessor systems with shared writable data.
• Shared read-only data does not result in any issue.

• Another configuration that may cause cache incoherence is DMA in conjunction with an IOP connected to the
system bus.


Solutions to cache coherence problems

• Do not allow a private cache for each processor; instead maintain one shared cache. But this increases the
average access time.
• Attach a private cache to each processor, but allow only non-shared and read-only data to be stored in the
caches – such data is called cachable.
– Shared writable data are noncachable.
– The cost increases because of the extra software overhead.
• Centralized global table in the compiler – which allows a writable block to reside in at most one cache.
– The status of memory blocks is maintained in the centralized global table (CGT).
– Blocks are identified as RO (read only) or RW (read & write).
– All caches may hold copies of RO blocks, while only one cache holds a given RW block. Updates in the
cache that holds the RW block will not affect the other caches, as they do not have that block.

 Hardware solution:
– Hardware is used to implement a bus-watching mechanism over all caches attached to the bus,
called a snoopy cache controller.
– Several schemes have been proposed using the snoopy controller.
– One simple approach uses the write-through policy.

 Steps:
– All snoopy controllers watch the bus for memory write operations.
– When a word is updated in a cache, main memory is also updated (write-through).
– The local snoopy controllers of all the other caches check whether they hold a copy of the data that has
been overwritten.
– If a copy exists in a remote cache, that location is marked invalid rather than updated.
– A future reference to an invalid location results in a cache miss, and the fresh value is read from main
memory.
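The write-through invalidation steps can be sketched as a minimal model with invented class names, where deleting a cache line stands in for marking it invalid:

```python
class SnoopyBus:
    """Write-through invalidation sketch: on every memory write, all other
    caches snoop the bus and invalidate their copy of the written address."""
    def __init__(self):
        self.memory = {}
        self.caches = []

    def write(self, writer, addr, value):
        writer.lines[addr] = value
        self.memory[addr] = value            # write-through to main memory
        for cache in self.caches:            # every snoopy controller watches
            if cache is not writer and addr in cache.lines:
                del cache.lines[addr]        # mark the remote copy invalid

class Cache:
    def __init__(self, bus):
        self.lines = {}
        self.bus = bus
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:           # an invalidated line misses...
            self.lines[addr] = self.bus.memory[addr]   # ...and is refetched
        return self.lines[addr]

bus = SnoopyBus()
c1, c2 = Cache(bus), Cache(bus)
bus.write(c1, 0x10, 5)
c2.read(0x10)            # c2 now holds a copy of address 0x10
bus.write(c1, 0x10, 7)   # snooping invalidates c2's copy
value = c2.read(0x10)    # miss, so the fresh value comes from memory
```

Because the write goes through to memory immediately, a later miss in any cache is guaranteed to see the newest value.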
