Professional Documents
Culture Documents
Unit-I Digital Computers
Unit-I Digital Computers
Digital Computers
https://jntuhbtechadda.blogspot.com/
Introduction
What is a Computer?
Computer is an Electronic Device, that
- Accepts the information from the outside world.
- Stores it
- Process it to Produce the results
- Transfers the results to the outside world
Digital Computer:
- Performs Various Computational Tasks
- Represents the information as discrete values
- Uses variables to store information
https://jntuhbtechadda.blogspot.com/
Hybrid Computer:
- Consists of both Analog and Digital components
- Solves differential equations and complex mathematical
equations
- Handles logical and numerical operations
https://jntuhbtechadda.blogspot.com/
A computer system is subdivided into two functional entities:
- Hardware
- Software
Hardware, consists of all the electronic and
electromechanical components, i.e., the physical parts.
Software, consists of the programs and data that are
manipulated to perform various data-processing tasks.
What is a program?
A sequence of instructions for the computer to perform a task
is called a program.
https://jntuhbtechadda.blogspot.com/
Block Diagram of Digital Computer
Random Access Memory
(RAM)
Registers
Arithmetic & Logic Unit Central Processing Unit
(ALU) (CPU)
Control Unit
Input-Output Processor
Input Devices Output Devices
(IOP)
Computer Architecture
Addressing Modes
Specifications
&
Structure
https://jntuhbtechadda.blogspot.com/
Connection among
Hardware Components
Computer Organization
Verification for
Operations
Determination of
Hardware
Computer Design
Way of
Connecting the Parts
Computer
Implementation https://jntuhbtechadda.blogspot.com/
Classification of Architectures
All the Computer Architectures are based on the one of the
following:
1. Von-Neumann Architecture
2. Harvard Architecture
1. Von-Neumann Architecture:
Address
Bus Data
Control
https://jntuhbtechadda.blogspot.com/
2. Harvard Architecture:
https://jntuhbtechadda.blogspot.com/
Register Transfer
and
Microoperations
https://jntuhbtechadda.blogspot.com/
Topics to be Covered
Ø Register Transfer language
Ø Register Transfer
Ø Arithmetic Microoperations
Ø Logic Microoperations
Ø Shift Microoperations
https://jntuhbtechadda.blogspot.com/
Register Transfer
Computer Registers are designated by capital letters
(sometimes followed by numerals) to denote the function of
the register.
Examples:
MAR (Memory Address Register) - holds an address of a
memory location
PC (Program Counter) – holds the address of the next
instruction
IR (Instruction Register) – contains the current instruction
R1 (Processor Register) – holds the temporary data.
https://jntuhbtechadda.blogspot.com/
Here,
- The n outputs of register R1 are connected to the n inputs of
register R2, n will be replaced with the length of the register.
- The load input of R2 is activated by the control variable, P
- Both the register R2 and control variable are synchronized
with the same clock.
- P is activated by the control section on the rising edge of a
clock pulse at time t.
- On the next positive transition of the clock at time t + 1,
finds the load input active and the data inputs of R2 are then
loaded into the register in parallel.
- P may go back to 0 at time t + 1; otherwise, the transfer will
occur with every clock pulse transition.
https://jntuhbtechadda.blogspot.com/
It is illustrated in the following timing diagram
https://jntuhbtechadda.blogspot.com/
BUS and Memory Transfer
A digital system consists of many registers and a path is
required for transfer of information among them.
Suppose separate lines are used between each register and
other registers number of wires become excessive.
Hence, most efficient way is to use a Common Bus System
https://jntuhbtechadda.blogspot.com/
- Four 4 X 1 Multiplexers each having four data inputs (0
through 3) and two selection inputs (S0 and S1) are used.
- The two selection lines S0 and S1 are connected to the
selection inputs of all the four multiplexers.
- These selection lines choose the four bits of one register and
transfer them into the four-line common bus.
- With both of the select lines at low logic, i.e. S1S0 = 00, the
0 data inputs of all four multiplexers are selected and
applied to the outputs to form the bus.
- In the above case, the bus lines receive the content of
register A
- Similarly, when S1S0 = 01, register B is selected, and the
bus lines will receive the content of register B.
https://jntuhbtechadda.blogspot.com/
The Function Table shown below indicates the register
selected by the bus for each of the four possible binary value
of the selection lines.
Note:
- The number of
multiplexers needed to
construct the bus is equal to
the number of bits in each
register.
https://jntuhbtechadda.blogspot.com/
The transfer of information from a bus into one of the
destination registers can be accomplished by:
- Connecting the bus lines to the inputs of all destination
registers
- Activating the load control of the particular destination
register selected.
https://jntuhbtechadda.blogspot.com/
From the above figure, the observations are:
- It differs from normal/conventional buffer by having both
normal and control inputs.
- For control input 1, the output is enabled and behaves like a
normal buffer.
- For control input 0, the output is disabled and enters into
high-impedance state.
Decoder: Decoder is a
combinational circuit that has ‘n’
input lines and maximum of 2n
output lines. One of these outputs
will be active High based on the
combination of inputs present,
when the decoder is enabled.
https://jntuhbtechadda.blogspot.com/
By using the high-impedance state feature, a large number of
three-state gate outputs can be connected with wires to form a
common bus line without endangering loading effects,
https://jntuhbtechadda.blogspot.com/
- The outputs of four buffers are connected together to form a
single bus line.
- At any given time, only one buffer has to be in the active
state.
- So, the control inputs to the buffers should determine the
buffer to communicate with the bus line.
- Hence, only one control input has to be active.
- This can be ensured by using a decoder, as:
o With the enable input of the decoder as 0, all of its
four outputs are 0, and the bus line is in a high-
impedance state because all four buffers are disabled.
o With the enable input is active i.e., 1, one of the
three-state buffers will be active, depending on the
binary value in the select inputs of the decoder.
https://jntuhbtechadda.blogspot.com/
Random Access Memory (RAM)
RAM is a collection of storage cells together with associated
circuits needed to transfer information in and out of storage.
The memory stores binary information in groups of bits called
words.
Each memory cell is indicated by an unique address and holds
one word of information.
Let us assume the RAM contains r = 2k words and each word
is of ‘n’ bits. Hence, it needs:
- ‘n’ data input lines
- ‘n’ data output lines
- ‘k’ address lines
- A Read control line
https://jntuhbtechadda.blogspot.com/
- A Write control line
Memory Transfer
A memory word is designated by the letter M.
The address of the memory word need to be specified for the
memory transfer operations.
The Address Register, AR holds the address and the Data
register, DR holds the data involved in the transfer.
The transfer of information from a memory unit to the outside
is called a Read operation.
So, the Read signal causes a transfer of information into the
data register (DR) from the memory word (M) selected by the
address register (AR).
Thus, a read operation can be stated as:
Read: DR ← M [AR]
https://jntuhbtechadda.blogspot.com/
The transfer of new information into the memory for storing is
called a Write operation.
So, the Write signal causes a transfer of information from
register DR into the memory word (M) selected by address
register (AR).
Hence, the corresponding write operation can be stated as:
Write: M [AR] ← DR
https://jntuhbtechadda.blogspot.com/
Micro-Operations
Microoperation is the elementary operation performed on
information stored in registers.
https://jntuhbtechadda.blogspot.com/
Arithmetic Microoperations
• In general, the Arithmetic Micro-operations deals with the
operations performed on numeric data stored in the registers.
https://jntuhbtechadda.blogspot.com/
Full Adder using two Half Adders:
https://jntuhbtechadda.blogspot.com/
Now, a Binary Adder is constructed using full-adder circuits
connected in series, with the output carry from one full-adder
connected to the input carry of the next full-adder.
The following figure illustrates the construction of 4-bit binary
adder using four full adders
The augend bits (A) and the addend bits (B) are designated by
subscript numbers from right to left, with subscript '0'
denoting the low-order bit.
https://jntuhbtechadda.blogspot.com/
The carry inputs starts from C0 to C3 connected in a chain
through the full-adders. C4 is the resultant output carry
generated by the last full-adder circuit.
The sum outputs (S0 to S3) generates the required arithmetic
sum bits of augend and addend bits.
Note:
§ The 2's compliment can be obtained by taking the 1's
compliment and adding one to the least significant bit.
§ The 1’s complement can be obtained by inverting each bit
of the given number.
With M =1,
- The output from XORs will be B⊕ 1 = Bʹ, i.e., B inputs are
complemented.
- Since C0 = 1, a 1 is added through the input carry, to get 2’s
complement of B.
- Hence, the circuit performs the operation A plus the 2's
complement of B, i.e., A - B.
https://jntuhbtechadda.blogspot.com/
Binary Incrementer
The increment micro-operation adds ‘1’ to the value stored in
a register.
For instance, a 4-bit register has a binary value 0110, when
incremented by one the value becomes 0111.
A 4-bit combinational circuit incrementer, by cascading four
half adders is as shown below:
https://jntuhbtechadda.blogspot.com/
In the above figure,
- The least significant half adder will have logic-1 as one
input and least significant bit of the number to be
incremented as other input.
- The output carry from one half-adder is connected to one of
the inputs of the next-higher-order half-adder.
- The binary incrementer circuit receives the four bits from A0
through A3, adds one to it, and generates the incremented
output in S0 through S3.
- The output carry C4 will be 1 only after incrementing binary
1111.
https://jntuhbtechadda.blogspot.com/
Arithmetic Circuit
All the arithmetic microoperations can be implemented in one
composite arithmetic circuit.
The basic component of this arithmetic circuit is the parallel
adder.
By controlling the data inputs to the adder, it is possible to
obtain different types of arithmetic operations.
https://jntuhbtechadda.blogspot.com/
Logic Microoperations
- Specify binary operations for strings of bits stored in
registers.
- Consider each bit of the register separately and treat them as
binary variables.
- Seldom used in scientific computations.
- Useful for bit manipulation of binary data.
https://jntuhbtechadda.blogspot.com/
The special symbols used to denote logic operations are:
- OR denoted by ˅
- AND denoted by ˄
- Complement i.e., 1’s complement is denoted by a bar on top
of the symbol representing the register.
https://jntuhbtechadda.blogspot.com/
List of Logic Microoperations
16 different logic operations can be performed with two binary
variables.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Hardware Implementation
The hardware implementation requires that logic gates be
inserted for each bit or pair of bits in the registers to perform
the required logic function.
https://jntuhbtechadda.blogspot.com/
Mask: This operation is similar to the selective-clear operation
except that only a portion of the bits of A are cleared.
This can be done:
- by placing 0(s) in the specified portion in the register B, the
remaining portion as 1(s) &
- performing A ← A ˄ B.
https://jntuhbtechadda.blogspot.com/
The circular shift (rotate) operation, circulates the bits of the
register around the two ends without loss of information.
This is accomplished by connecting the serial output of the
shift register to its serial input.
The symbols cil and cir are used for the circular shift left and
right, respectively.
https://jntuhbtechadda.blogspot.com/
An arithmetic shift is a microoperation that shifts a signed
binary number either to the left or right.
The leftmost bit in a register holds the sign bit, and the
remaining bits hold the number.
The sign bit is 0 for positive and 1 for negative. Negative
numbers are in 2's complement form.
https://jntuhbtechadda.blogspot.com/
Let us consider the register R of n bits.
Bit Rn-1 in the leftmost position holding the sign bit.
Rn-2 is the most significant bit of the number and R0 is the least
significant bit.
https://jntuhbtechadda.blogspot.com/
An overflow flip-flop V, can be used to detect an arithmetic
shift-left overflow. Hence
V = Rn-1 ⊕ Rn-2
If V = 0, there is no overflow.
If V = 1, there is an overflow and a sign reversal after the shift.
https://jntuhbtechadda.blogspot.com/
Hardware Implementation
A shift unit would be constructed in one way using a
bidirectional shift register with parallel load.
In this:
- Information can be transferred to the register in parallel.
- Can be shifted either to the right or left.
In this type of shift unit,
- A clock pulse is needed for loading the data into the register.
- Another pulse is needed to initiate the shift.
https://jntuhbtechadda.blogspot.com/
In the above Circuit:
- The subscript i designates a typical stage.
- Inputs Ai and Bi are applied to both the arithmetic and logic
units.
- A particular microoperation is selected with inputs S1 and
S0.
- The data in the 4 X 1 multiplexer are selected with inputs S3
and S2.
- The inputs to the multiplexer are:
ØAn arithmetic output in Di.
ØA logic output in Ei.
ØAi+1 for the shift-right operation
ØAi-1 for the shift-left operation
https://jntuhbtechadda.blogspot.com/
The one stage ALU circuit provides:
- Eight arithmetic operations,
- Four logic operations,
- Two shift operations.
Each operation is selected with the five variables S3, S2, S1, S0,
and Ci.
https://jntuhbtechadda.blogspot.com/
- The one stage circuit must be repeated n times for an n-bit
ALU.
- The output carry Ci+1 of a given arithmetic stage must be
connected to the input carry Ci of the next stage in
sequence.
- The input carry to the first stage, Cin, provides a selection
variable for the arithmetic operations.
https://jntuhbtechadda.blogspot.com/
Basic Computer
Organization and Design
https://jntuhbtechadda.blogspot.com/
Topics to be Covered
ØInstruction Codes
ØComputer Registers
ØComputer Instructions
ØInstruction Cycle
https://jntuhbtechadda.blogspot.com/
Indirect Address
The second part of an instruction code (0-11 bits) can specify:
- Actual Operand, called as Immediate Operand.
- Address of the operand, called as Direct Address.
- Address of a memory word, consisting the address of the
operand, called as Indirect Address.
Note: One bit in the instruction code can be used to
distinguish between a direct and an indirect address.
The instruction format consists:
- 3-bit Operation Code
- 12-bit Address
- An indirect address mode bit
designated by I
https://jntuhbtechadda.blogspot.com/
I = 0 for a Direct Address; I = 1 for an Indirect Address
The Direct and Indirect Addressing is demonstrated using:
https://jntuhbtechadda.blogspot.com/
Computer Registers
For the execution of the program stored in the memory, the
registers for the following purposes are required:
- A counter to calculate the address of the next instruction.
- Storing the instruction code after it is read from memory.
- Manipulation of data.
- Holding a memory address.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
In the above figure:
The memory unit has a capacity of 4096 words and each word
contains 16 bits. So,
- All the registers holding the address should be of 12 bits size
ØAddress Register (AR)
ØProgram Counter (PC)
- All the registers holding the data/instruction should be of 16
bits.
Ø Data Register (DR)
Ø Accumulator (AC)
Ø Instruction Register (IR)
Ø Temporary Register (TR)
Hence, the processor (CPU) also should be of 16 bits.
The input register (INPR) receives an 8-bit character from an input
device.
The output register (OUTR) holds an 8-bit character for an output
https://jntuhbtechadda.blogspot.com/
device.
In the instruction code:
- 12 bits for the address of an operand.
- 3 bits to specify operation, i.e., Opcode
- 1 bit to specify a direct or indirect address
The PC:
- goes through a counting sequence by incrementing, so that
the computer reads the instructions sequentially.
- sequential execution continues unless a branch instruction,
that calls for a transfer to a nonconsecutive instruction in the
program, is encountered.
- the address part of a branch instruction is transferred to
become the address of the next instruction.
https://jntuhbtechadda.blogspot.com/
Common Bus System
Our basic computer has:
- Eight Registers,
- A Memory Unit, &
- A Control Unit.
https://jntuhbtechadda.blogspot.com/
Computer Instructions
The basic computer supports three instruction code formats of
16 bits each. In this:
- The operation code (opcode) part of the instruction contains
three bits.
- The meaning of the remaining 13 bits depends on the
opcode.
1> A memory-reference instruction, out of 13 bits uses:
- 12 bits to specify an address.
- One bit to specify the addressing mode I. I is equal to 0 for
direct address and to 1 for indirect address.
https://jntuhbtechadda.blogspot.com/
2> A register-reference instruction specifies:
- The operation code 111 with a 0 in the leftmost bit (bit 15)
of the instruction.
- An operation on or a test of the AC register by using the
other 12 bits.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Instruction Set Completeness
The instruction set designed for a computer is said to be
complete, if it includes sufficient number of instructions in
each of the following categories:
1. Arithmetic, logical, and shift instructions.
2. Instructions for moving information to and from memory
and processor registers.
3. Program control instructions together with instructions that
check status conditions.
4. Input and output instructions.
https://jntuhbtechadda.blogspot.com/
Register-Reference Instructions
https://jntuhbtechadda.blogspot.com/
Memory-Reference Instructions
The function of the memory-reference instructions can be
defined precisely by means of register transfer notation as
shown below:
https://jntuhbtechadda.blogspot.com/
The effective address is placed in the address register AR:
- During timing signal T2 when I = 0.
- During timing signal T3 when I = 1 as AR ← M[AR].
The execution of the memory-reference instructions starts with
timing signal T4.
AND to AC (AND):
D0T4: DR ← M [AR]
D0T5: AC ← AC /\ DR, SC ← 0
ADD to AC (ADD):
D1T4: DR ← M [AR]
D1T5: AC ← AC + DR, E ← Cout, SC ← 0
https://jntuhbtechadda.blogspot.com/
Load to AC (LDA):
D2T4: DR ← M [AR]
D2T5: AC ← DR, SC ← 0
Store AC (STA):
D3T4: M [AR] ← AC, SC ← 0
https://jntuhbtechadda.blogspot.com/
Input-Output and Interrupt
For the full functionality of a computer, the communication
with the external environment is essential, because:
- Instructions and data stored in memory must be entered
through some input device.
- Computational results must be available to the user through
some output device.
https://jntuhbtechadda.blogspot.com/
The input-output configuration is:
https://jntuhbtechadda.blogspot.com/
Input-Output Instructions
Input and output instructions are needed for:
- Transferring information to and from AC register
- Checking the flag bits
- Controlling the interrupt facility
These instructions have an operation code 1111 (12-15), with
D7 = 1 and I = 1 and the remaining bits of the instruction
specify the particular operation.
https://jntuhbtechadda.blogspot.com/
Program Interrupt
The process of communication can be achieved in two ways:
- Programmed Control Transfer
- Interrupt Controlled Transfer
Programmed Control Transfer:
In this, the computer
- Keeps checking the flag bit
- On set, initiates an information transfer.
The information flow rate of the computer is much higher than
that of the input-output device, makes this transfer inefficient.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Interrupt Cycle
The R flip-flop is set to 1 if
- IEN = 1 and either FGI or FGO are equal to 1.
- The timing signals T0, T1 or T2 are not active.
The condition for setting flip-flop R to 1 can be expressed as:
T0ʹT1ʹ T2ʹ(IEN)(FGI + FGO): R ← 1
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
UNIT-II
Microprogrammed
Control
https://jntuhbtechadda.blogspot.com/
Topics to be Covered
ØControl Memory
ØAddress Sequencing
ØMicroprogram Example
https://jntuhbtechadda.blogspot.com/
Control Memory
• The control unit in a digital computer initiates sequences of
microoperations.
• There are finite number of different types of
microoperations in a given system.
• The number of sequences of microoperations that are
performed determines the complexity of the digital system.
Control Memory:
- Holds a fixed microprogram that cannot be altered by the
occasional user.
https://jntuhbtechadda.blogspot.com/
The microprogram consists of microinstructions that specify
various internal control signals for execution of register
microoperations.
A reference to this bit by the status bit select lines from control
memory causes the branch address to be loaded into the CAR
unconditionally.
https://jntuhbtechadda.blogspot.com/
Mapping of Instruction
A special type of branching is used to transfer control to the
first word of a microprogram routine for an instruction.
The status bits for this type of branch are the bits in the
operation code part of the instruction.
https://jntuhbtechadda.blogspot.com/
Here,
- An operation code of four bits is used to specify up to 16
distinct instructions.
- The control memory has 128 words, requiring an address of
seven bits, from 0000000 to 1111111.
- For each operation code there exists a microprogram routine
in control memory that executes the instruction.
- This provides for each computer instruction a microprogram
routine with a capacity of four microinstructions.
- If the routine needs more than four microinstructions, it can
use addresses 1000000 through 1111111.
- If it uses fewer than four microinstructions, the unused
memory locations would be available for other routines.
https://jntuhbtechadda.blogspot.com/
Implementation of Mapping in ROM:
Mapping concept can be extended to a more general mapping
rule by using a ROM to specify the mapping function.
In this configuration:
- The bits of the instruction specify the address of a mapping
ROM.
- The contents of the mapping ROM give the bits for the
CAR.
Advantages:
- The microprogram routine executing the instruction can be
placed in any desired location in control memory.
- The mapping concept provides flexibility for adding
instructions for control memory as the need arises.
https://jntuhbtechadda.blogspot.com/
Implementation of Mapping using PLD:
The mapping function is sometimes implemented by means of
an integrated circuit called programmable logic device(PLD).
A PLD is similar to ROM, except that it uses AND and OR
gates with internal electronic fuses.
The interconnection between inputs, AND gates, OR gates,
and outputs can be programmed as in ROM.
A mapping function, expressed in terms of Boolean
expressions can be implemented conveniently with a PLD.
https://jntuhbtechadda.blogspot.com/
Subroutines
Subroutines are programs used by other routines to accomplish
a particular task and can ne called from any point.
The identical sections of Microprograms can be saved as
Subroutines.
https://jntuhbtechadda.blogspot.com/
Microprograms using subroutines must have a provision for:
- Storing the return address during a subroutine call.
- Restoring the address during a subroutine return.
https://jntuhbtechadda.blogspot.com/
Microprogram Example
After the configuration of a computer and establishment of its
microprogrammed control unit, the designer's task is to
generate the microcode for the control memory.
The computer used here is similar but not identical to the basic
computer. https://jntuhbtechadda.blogspot.com/
Computer Configuration
https://jntuhbtechadda.blogspot.com/
From the above block diagram it is observed that:
It consists of two memory units:
- Main memory for storing instructions and data.
- Control memory for storing the microprogram.
https://jntuhbtechadda.blogspot.com/
The computer instruction format is:
https://jntuhbtechadda.blogspot.com/
- ADD instruction adds the content of the operand found in
the effective address to the content of AC.
- BRANCH instruction causes a branch to the effective
address if the operand in AC is negative.
- STORE instruction transfers the content of AC into the
memory word specified by the effective address.
- EXCHANGE instruction swaps the data between AC and
the memory word specified by the effective address.
https://jntuhbtechadda.blogspot.com/
Symbolic Microinstructions
The microinstructions can be specified by using the symbols.
A symbolic microprogram can be translated into its binary
equivalent using an assembler, similar to conventional.
Each line of the assembly language microprogram defines a
symbolic microinstruction.
https://jntuhbtechadda.blogspot.com/
The Fetch Routine
The control memory has 128 words, and each word contains
20 bits.
So, to microprogram the control memory, it is necessary to
determine the bit values of each of the 128 words.
The first 64 words (addresses 0 to 63) are occupied by the
routines for the 16 instructions, the remaining for any other
purpose.
A convenient starting location for the fetch routine is address
64.
ORG 64
FETCH: PCTAR U JMP NEXT
READ, INCPC U JMP NEXT
DRTAR U MAP
https://jntuhbtechadda.blogspot.com/
Symbolic Microprogram
The execution of the third (MAP) microinstruction in the fetch
routine results in a branch to address 0xxxx00, where xxxx are
the four bits of the operation code.
Example:
For ADD instruction with opcode is 0000, the MAP
microinstruction will transfer the address 0000000 to CAR,
the start address for the ADD routine in control memory.
Similarly, the first address for the BRANCH and STORE
routines are 0 0001 00 (decimal 4) and 0 0010 00 (decimal 8),
respectively.
The first address for the other 13 routines are at address values
12, 16, 20, …, 60, i.e., four words in control memory for each
routine. https://jntuhbtechadda.blogspot.com/
In each routine, the microinstructions are required for:
- Evaluating the effective address.
- Executing the instruction.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Design of Control Unit
The bits of the microinstruction are divided into fields, with
each field defining a distinct, separate function.
The various fields encountered in instruction formats provide:
- Control bits to initiate microoperations in the system.
- Special bits to specify the way that the next address is to be
evaluated.
- An address field for branching.
The nine bits of the microoperation field are divided into three
subfields of three bits each.
The control memory output of each subfield must be decoded
to provide the distinct microoperations.
The outputs of the decoders are connected to the appropriate
inputs in the processor unit.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Microprogram Sequencer
The basic components of a microprogrammed control unit are:
- Control Memory.
- Circuits selecting the next address.
The address selection part is called a microprogram
sequencer.
Introduction:
Intel produced first 4-bit microprocessor 4004 in 1971 and 8-bit microprocessor 8008 in 1972, which
are not general purpose. In 1974, Intel launched the first general purpose 8-bit microprocessor 8080
and later 8085 with few more added features in the architecture to be a functionally complete
microprocessor.
The limitations of 8-bit microprocessors are:
- Low speed
- Low memory addressing capability
- Limited number of general purpose registers
- Less powerful instruction set
Intermediate data is stored in the register set during the execution of the instructions. The
microoperations required for executing the instructions are performed by the arithmetic
logic unit whereas the control unit takes care of transfer of information among the registers
and guides the ALU. The control unit services the transfer of information among the
registers and instructs the ALU about which operation is to be performed. The computer
instruction set is meant for providing the specifications for the design of the CPU. The
design of the CPU largely, involves choosing the hardware for implementing the machine
instructions.
·
· The need for memory locations arises for storing pointers, counters, return address,
temporary results and partial products. Memory access consumes the most of the time off an
operation in a computer. It is more convenient and more efficient to store these intermediate
values in processor registers.
·
·
·
·
·
·
https://jntuhbtechadda.blogspot.com/
· A common bus system is employed to contact registers that are included in the CPU
in a large number. Communications between registers is not only for direct data transfer but
also for performing various micro-operations. A bus organization for such CPU register
shown in Figure 3.2, is connected to two multiplexers (MUX) to form two buses A and B.
The selected lines in each multiplexers select one register of the input data for the particular
bus.
Control Word
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Stack organization:
A stack is a storage device that stores information in such a manner that the item
stored last is the first item retrieved.
The stack in digital computers is essentially a memory unit with an address register that
can count only. The register that holds the address for the stack is called a stack pointer
(SP) because its value always points at the top item in the stack.
The physical registers of a stack are always available for reading or writing. It is
the content of the word that is inserted or deleted.
Register stack:
https://jntuhbtechadda.blogspot.com/
To remove the top item, the stack is popped by reading the memory word at address 3 and
decrementing the content of SP. Item B is now on top of the stack since SP holds address
2.
To insert a new item, the stack is pushed by incrementing SP and writing a word in the
next-higher location in the stack.
In a 64-word stack, the stack pointer contains 6 bits because 26 = 64.
Since SP has only six bits, it cannot exceed a number greater than 63 (111111 in binary).
When 63 are incremented by 1, the result is 0 since 111111 + 1 = 1000000 in binary, but
SP can accommodate only the six least significant bits.
Similarly, when 000000 is decremented by 1, the result is 111111. The one-bit register
FULL is set to 1 when the stack is full, and the one-bit register EMTY is set to 1 when the
stack is empty of items.
DR is the data register that holds the binary data to be written into or read out of the stack.
PUSH:
If the stack is not full (FULL =0), a new item is inserted with a push operation. The push
operation consists of the following sequences of microoperations:
The stack pointer is incremented so that it points to the address of next-higher word. A
memory write operation inserts the word from DR into the top of the stack.
SP holds the address of the top of the stack and that M[SP] denotes the memory
word specified by the address presently available in SP.
The first item stored in the stack is at address 1. The last item is stored at address 0. If SP
reaches 0, the stack is full of items, so FULL is set to 1. This condition is reached if the
top item prior to the last push was in location 63 and, after incrementing SP, the last item
is stored in location 0.
Once an item is stored in location 0, there are no more empty registers in the stack. If an item
is written in the stack, obviously the stack cannot be empty, so EMTY is cleared to 0.
POP:
A new item is deleted from the stack if the stack is not empty (if EMTY = 0). The pop
operation consists of the following sequences of microoperations:
https://jntuhbtechadda.blogspot.com/
The top item is read from the stack into DR. The stack pointer is then decremented. If
its value reaches zero, the stack is empty, so EMTY is set to 1.
Memory Stack.
The address registers AR points at an array of data which is used during the execute
phase to read an operand.
The stack pointer SP points at the top of the stack which is used to push or pop items into
or from the stack.
The three registers are connected to a common address bus, and either one can provide an
address for memory.
As shown in Figure 5.2, the initial value of SP is 4001 and the stack grows with
decreasing addresses. Thus the first item stored in the stack is at address 4000, the second
item is stored at address 3999, and the last address that can be used for the stack is 3000.
We assume that the items in the stack communicate with a data register DR.
https://jntuhbtechadda.blogspot.com/
PUSH
A new item is inserted with the push operation as follows:
SP←SP-
1 M[SP] ←
DR
The stack pointer is decremented so that it points at the address of the next word. A
memory write operation inserts the word from DR into the top of the stack.
POP
A new item is deleted with a pop operation as follows:
DR ←
M[SP] SP
← SP+1
The top item is read from the stack into DR.
The stack pointer is then incremented to point at the next item in the stack.
The two microoperations needed for either the push or pop are (1) an access to memory
through SP, and (2) updating SP.Which of the two microoperations is done first and
whether SP is updated by incrementing or decrementing depends on the organization of
the stack.
In figure. the stack grows by decreasing the memory address. The stack may be
constructed to grow by increasing the memory also.The advantage of a memory stack
is that the CPU can refer to it without having to specify an address, since the address
is always available and automatically updated in the stack pointer.
Instruction formats
Insruction fields:
OP-code field - specifies the operation to be performed
Address field - designates memory address(s) or a processor register(s)
Mode field - specifies the way the operand or the effective address is determined.
The number of address fields in the instruction format depends on the internal organization of CPU.
The three most common CPU organizations:
Three and two address instruction formats can be used in General register
organization. https://jntuhbtechadda.blogspot.com/
ADDRESSING MODES
Specifies a rule for interpreting or modifying the address field of the instruction (before the
operand is actually referenced)
399 450
XR = 100
400 700
AC
500 800
600 900
Addressing Effective Content
Mode Address of AC
Direct address 500 /* AC (500) */ 800 702 325
Immediate operand - /* AC 500 */ 500
Indirect address 800 /* AC ((500)) */ 300
Relative address 702 /* AC (PC+500) */ 325 800 300
Indexed address 600 /* AC (XR+500) */ 900
Register - /*AC R1 */ 400
Register indirect 400 /* AC (R1) */ 700
Autoincrement 400 /* AC (R1)+ */ 700
Autodecrement 399 /* AC -(R) */ 450
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Arithmetic Instructions
Name Mnemonic
Increment INC
Decrement DEC
Add ADD
Subtract SUB
Multiply MUL
Divide DIV
Add with Carry ADDC
Subtract with Borrow SUBB
Negate(2’s Complement) NEG
It is sometimes convenient to supplement the ALU circuit in the CPU with a status
register where status bit conditions be stored for further analysis. Status bits are also
called condition-code bits or flag bits.
Figure 5.3 shows the block diagram of an 8-bit ALU with a 4-bit status register. The four
status bits are symbolized by C, S, Z, and V. The bits are set or cleared as a result of an
operation performed in the ALU.
3. Bit Z (zero) is set to 1 if the output of the ALU contains all 0’s. it is cleared to 0
otherwise. In other words, Z = 1 if the output is zero and Z = 0 if the output is not
zero.
4. Bit V (overflow) is set to 1 if the exclusives-OR of the last two carries is equal to
1, and cleared to 0 otherwise. This is the condition for an overflow when negative
numbers are in 2’s complement. For the 8-bit ALU, V = 1 if the output is greater
than + 127 or less than -128.
The status bits can be checked after an ALU operation to determine certain
relationships that exist between the vales of A and B.
If bit V is set after the addition of two signed numbers, it indicates an overflow condition.
If Z is set after an exclusive-OR operation, it indicates that A = B.
A single bit in A can be checked to determine if it is 0 or 1 by masking all bits except
the bit in question and then checking the Z status bit.
CMP and TST instructions do not retain their results of operations(- and AND, respectively).
They only set or clear certain Flags.
* Save the Return Address to get the address of the location in the Calling Program upon exit from
the Subroutine
- Locations for storing Return Address:
• Fixed Location in the subroutine(Memory)
• Fixed Location in memory
• In a processor Register
• In a memory stack
- most efficient way
PROGRAM INTERRUPT:
The concept of program interrupt is used to handle a variety of problems that arise out
of normal program sequence.
Program interrupt refers to the transfer of program control from a currently running program
to another service program as a result of an external or internal generated request. Control
returns to the original program after the service program is executed.
After a program has been interrupted and the service routine been executed, the CPU
must return to exactly the same state that it was when the interrupt occurred.
Only if this happens will the interrupted program be able to resume exactly as if nothing had
happened.
The state of the CPU at the end of the execute cycle (when the interrupt is recognized)
is determined from:
The interrupt facility allows the running program to proceed until the input or output device
sets its ready flag. Whenever a flag is set to 1, the computer completes the execution of the
instruction in progress and then acknowledges the interrupt.
The result of this action is that the retune address is stared in location 0. The instruction in
location 1 is then performed; this initiates a service routine for the input or output transfer.
The service routine can be stored in location 1.
The service routine must have instructions to perform the following tasks:
Types of interrupts.:
There are three major types of interrupts that cause a break in the normal execution of a program.
They can be classified as:
1. External interrupts
2. Internal interrupts
3. Software interrupts
1) External interrupts:
External interrupts come from input-output (I/0) devices, from a timing device, from a
circuit monitoring the power supply, or from any other external source.
Examples that cause external interrupts are I/0 device requesting transfer of data, I/o device
finished transfer of data, elapsed time of an event, or power failure. Timeout interrupt may
result from a program that is in an endless loop and thus exceeded its time allocation.
Power failure interrupt may have as its service routine a program that transfers the complete
state of the CPU into a nondestructive memory in the few milliseconds before power ceases.
External interrupts are asynchronous. External interrupts depend on external conditions
that are independent of the program being executed at the time.
2) Internal interrupts:
Internal interrupts arise from illegal or erroneous use of an instruction or data.
Internal interrupts are also called traps.
Examples of interrupts caused by internal error conditions are register overflow, attempt to
divide by zero, an invalid operation code, stack overflow, and protection violation. These
error conditions usually occur as a result of a premature termination of the instruction
execution. The service program that processes the internal interrupt determines the corrective
measure to be taken.
Internal interrupts are synchronous with the program. . If the program is rerun, the
internal interrupts will occur in the same place each time.
3) Software interrupts:
A software interrupt is a special call instruction that behaves like an interrupt rather than a
subroutine call. It can be used by the programmer to initiate an interrupt procedure at any
desired point in the program.
The most common use of software interrupt is associated with a supervisor call instruction.
This instruction provides means for switching from a CPU user mode to the supervisor mode.
Certain operations in the computer may be assigned to the supervisor mode only, as for
example, a complex input or output transfer procedure. A program written by a user must run
in the user mode.
When an input or output transfer is required, the supervisor mode is requested by means of a
supervisor call instruction. This instruction causes a software interrupt that stores the old CPU
state and brings in a new PSW that belongs to the supervisor mode.
The calling program must pass information to the operating system in order to specify
the particular task requested.
https://jntuhbtechadda.blogspot.com/
Computer Arithmetic
Ms. Nagamani G
Ms. Kavitha P
Mr. Murali Nath R S
https://jntuhbtechadda.blogspot.com/
Addition & Subtraction
The Addition and Subtraction operations can be
performed on the data represented as:
- Signed-Magnitude
- Signed-2’s Complement
- Floating-Point
- BCD
- Decimal
https://jntuhbtechadda.blogspot.com/
Signed-Magnitude
https://jntuhbtechadda.blogspot.com/
Hardware Implementation
https://jntuhbtechadda.blogspot.com/
Flowchart for Add and Subtract operations
https://jntuhbtechadda.blogspot.com/
Signed-2’s Complement
https://jntuhbtechadda.blogspot.com/
Floating-Point
https://jntuhbtechadda.blogspot.com/
Flowchart
https://jntuhbtechadda.blogspot.com/
BCD
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Multiplication algorithms
https://jntuhbtechadda.blogspot.com/
Manual Process
This process involves shift and add operations.
https://jntuhbtechadda.blogspot.com/
Flowchart
https://jntuhbtechadda.blogspot.com/
Example
https://jntuhbtechadda.blogspot.com/
Booth Algorithm
Signed 2’s Complement Numbers
A String of 0s in the multiplier require no addition, but shifting
A String of 1s in the multiplier from bit weight 2k to weight 2m can be
treated as 2k+1 – 2m.
https://jntuhbtechadda.blogspot.com/
Flowchart
https://jntuhbtechadda.blogspot.com/
Example
https://jntuhbtechadda.blogspot.com/
Array Multiplier
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Floating-Point Numbers
The multiplication of two floating-point numbers
require:
- Multiplication of mantissas &
- Addition of the exponents
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
division algorithms
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Input-Output Organization 1
• Peripheral Devices
• Input-Output Interface
• Modes of Transfer
• Priority Interrupt
https://jntuhbtechadda.blogspot.com/
PERIPHERAL DEVICES
https://jntuhbtechadda.blogspot.com/
INPUT/OUTPUT INTERFACES
* Provides a method for transferring information between internal storage
(such as memory and CPU registers) and external I/O devices
- Unit of Information
Peripherals - Byte
CPU or Memory - Word
- Operating Modes
Peripherals - Autonomous, Asynchronous
CPU or Memory - Synchronous
https://jntuhbtechadda.blogspot.com/
Keyboard
and Printer Magnetic Magnetic
display disk tape
terminal
Interface
- Decodes the device address (device code)
- Decodes the commands (operation)
- Provides signals for the peripheral controller
- Synchronizes the data flow and supervises
the transfer rate between peripheral and CPU or Memory
Typical I/O instruction
Device address
Op. code Function code
(Command)
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Physical Organizations
* Many computers use a common single bus system
for both memory and I/O interface units
- Use one common bus but separate control lines for each function
- Use one common bus with common control lines for both functions
* Some computer systems use two separate buses,
one to communicate with memory and the other with I/O interfaces
I/O Bus
- Communication between CPU and all interface units is via a common
I/O Bus
- An interface connected to a peripheral device may have a number of
data registers , a control register, and a status register
- A command is passed to the peripheral by sending
to the appropriate interface register
- Function code and sense lines are not needed (Transfer of data, control,
and status information is always via the common I/O Bus)
https://jntuhbtechadda.blogspot.com/
Memory-mapped I/O
- A single set of read/write control lines
(no distinction between memory and I/O transfer)
- Memory and I/O addresses share the common address space
-> reduces memory address range available
- No specific input or output instruction
-> The same memory reference instructions can
be used for I/O transfers
- Considerable flexibility in handling I/O operations
https://jntuhbtechadda.blogspot.com/
Handshaking
- A control signal is accompanied with each data
being transmitted to indicate the presence of data
- The receiving unit responds with another control
signal to acknowledge receipt of the data
https://jntuhbtechadda.blogspot.com/
STROBE CONTROL
* Employs a single control line to time each transfer
* The strobe may be activated by either the source or the destination unit
Strobe Strobe
https://jntuhbtechadda.blogspot.com/
HANDSHAKING
Strobe Methods
Source-Initiated
Destination-Initiated
https://jntuhbtechadda.blogspot.com/
Valid data
Data bus
Timing Diagram
Data valid
Data accepted
Data valid
Valid data
Data bus
1 1 0 0 0 1 0 1
Start Character bits Stop
bit bits
(1 bit) (at least 1 bit)
Modes of Transfer
https://jntuhbtechadda.blogspot.com/
Programmed I/O
• It is due to the result of the I/O instructions that are
written in the computer program.
• Each data item transfer is initiated by an instruction in the
program.
• Usually the transfer is from a CPU register and memory.
In this case it requires constant monitoring by the CPU of
the peripheral devices.
• Ex: I/O device does not have direct access to the
memory unit.
• A transfer from I/O device to memory requires the
execution of several instructions by the CPU, including
an input instruction to transfer the data from device to the
CPU and store instruction to transfer the data from CPU
to memory.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
DMA Transfer
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Interrupts
Priority Interrupt
• A priority interrupt is a system which decides the priority
at which various devices, which generates the interrupt
signal at the same time, will be serviced by the CPU.
• The system has authority to decide which conditions are
allowed to interrupt the CPU, while some other interrupt
is being serviced.
• Generally, devices with high speed transfer such
as magnetic disks are given high priority and slow
devices such as keyboards are given low priority.
• When two or more devices interrupt the computer
simultaneously, the computer services the device with
the higher priority first.
https://jntuhbtechadda.blogspot.com/
• Types of Interrupts:
• Hardware Interrupts
• When the signal for the processor is from an external
device or hardware then this interrupts is known
as hardware interrupt.
Example: when we press any key on our keyboard to do
some action, then this pressing of the key will generate
an interrupt signal for the processor to perform certain
action. Such an interrupt can be of two types:
• Maskable Interrupt:The hardware interrupts which can
be delayed when a much high priority interrupt has
occurred at the same time.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
• Main Memory
• Auxiliary Memory
• Associate Memory
• Cache Memory
https://jntuhbtechadda.blogspot.com/
Memory
Memory Hierarchy:
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
We can infer the following characteristics of
Memory Hierarchy Design from above figure:
• Capacity:
It is the global volume of information the
memory can store. As we move from top to
bottom in the Hierarchy, the capacity increases.
• Access Time:
It is the time interval between the read/write
request and the availability of the data. As we
move from top to bottom in the Hierarchy, the
access time increases.
https://jntuhbtechadda.blogspot.com/
• Performance:
Earlier when the computer system was designed without
Memory Hierarchy design, the speed gap increases
between the CPU registers and Main Memory due to large
difference in access time. This results in lower performance
of the system and thus, enhancement was required. This
enhancement was made in the form of Memory Hierarchy
Design because of which the performance of the system
increases. One of the most significant ways to increase
system performance is minimizing how far down the
memory hierarchy one has to go to manipulate data.
• Cost per bit:
As we move from bottom to top in the Hierarchy, the cost
per bit increases i.e. Internal Memory is costlier than
External Memory.
https://jntuhbtechadda.blogspot.com/
Main Memory
• The main memory acts as the central storage unit
in a computer system. It is a relatively large and
fast memory which is used to store programs and
data during the run time operations.
• The primary technology used for the main
memory is based on semiconductor integrated
circuits. The integrated circuits for the main
memory are classified into two major units.
• RAM (Random Access Memory) integrated circuit
chips
• ROM (Read Only Memory) integrated circuit chips
https://jntuhbtechadda.blogspot.com/
RAM Chips
• The RAM integrated circuit chips are further classified into
two possible operating modes, static and dynamic
• The primary compositions of a static RAM are flip-flops that
store the binary information. The nature of the stored
information is volatile, i.e. it remains valid as long as power
is applied to the system. The static RAM is easy to use and
takes less time performing read and write operations as
compared to dynamic RAM.
• The dynamic RAM exhibits the binary information in the
form of electric charges that are applied to capacitors. The
capacitors are integrated inside the chip by MOS
transistors. The dynamic RAM consumes less power and
provides large storage capacity in a single memory chip.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
• A 128 * 8 RAM chip has a memory capacity of 128
words of eight bits (one byte) per word. This requires a
7-bit address and an 8-bit bidirectional data bus.
• The 8-bit bidirectional data bus allows the transfer of
data either from memory to CPU during
a read operation or from CPU to memory during
a write operation.
• The read and write inputs specify the memory
operation, and the two chip select (CS) control inputs
are for enabling the chip only when the microprocessor
selects it.
• The bidirectional data bus is constructed using three-
state buffers.
https://jntuhbtechadda.blogspot.com/
• The output generated by three-state buffers can be
placed in one of the three possible states which include
a signal equivalent to logic 1, a signal equal to logic 0,
or a high-impedance state.
• Note: The logic 1 and 0 are standard digital signals
whereas the high-impedance state behaves like an
open circuit, which means that the output does not
carry a signal and has no logic significance.
• The following function table specifies the operations of
a 128 * 8 RAM chip.
• From the functional table, we can conclude that the
unit is in operation only when CS1 = 1 and CS2 = 0. The
bar on top of the second select variable indicates that
this input is enabled when it is equal to 0.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
ROM Chips
• The primary component of the main memory is RAM
integrated circuit chips, but a portion of memory may
be constructed with ROM chips.
• A ROM memory is used for keeping programs and data
that are permanently resident in the computer.
• Apart from the permanent storage of data, the ROM
portion of main memory is needed for storing an initial
program called a bootstrap loader. The primary
function of the bootstrap loader program is to start the
computer software operating when power is turned
on.
• ROM chips are also available in a variety of sizes and
are also used as per the system requirement.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
• A ROM chip has a similar organization as a RAM
chip. However, a ROM can only perform read
operation; the data bus can only operate in an
output mode.
• The 9-bit address lines in the ROM chip specify
any one of the 512 bytes stored in it.
• The value for chip select 1 and chip select 2 must
be 1 and 0 for the unit to operate. Otherwise, the
data bus is said to be in a high-impedance state.
https://jntuhbtechadda.blogspot.com/
CONNECTION OF MEMORY TO CPU
https://jntuhbtechadda.blogspot.com/
Auxiliary Memory
• In addition to main memory we have large
storage called as Auxiliary or secondary storage
• Bigger in size, lesser in cost, more access time
• Physical properties of a device are complex but
logical properties can be characterized and
compared by few parameters
• Namely – Access mode, access time, transfer
rate, cost and capacity.
• Mode represents direct or indirect
• Access time – average time required to reach
location and access the data
https://jntuhbtechadda.blogspot.com/
• In magnetic disks and tapes
• Acess time = seek time + transfer time
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Associative Memory
• An associative memory can be considered as a memory
unit whose stored data can be identified for access by
the content of the data itself rather than by an address
or memory location.
• Associative memory is often referred to as Content
Addressable Memory (CAM).
• Associative memory is much slower than RAM, and is
rarely encountered in mainstream computer designs.
• When a write operation is performed on associative
memory, no address or memory location is given to the
word.
• The memory itself is capable of finding an empty
unused location to store the word.
https://jntuhbtechadda.blogspot.com/
• On the other hand, when the word is to be read from
an associative memory, the content of the word, or
part of the word, is specified.
• The words which match the specified content are
located by the memory and are marked for reading.
• Associative memory is expensive to implement as
integrated circuitry.
• Associative memory can be used in certain very-high-
speed searching applications where search time is very
critical and must be very short.
https://jntuhbtechadda.blogspot.com/
H/W organization of Associative Memory
https://jntuhbtechadda.blogspot.com/
• From the block diagram, we can say that an
associative memory consists of a memory array
and logic for 'm' words with 'n' bits per word.
• The functional registers like the argument
register A and key register K each have n bits, one
for each bit of a word. The match
register M consists of m bits, one for each
memory word.
• The words which are kept in the memory are
compared in parallel with the content of the
argument register.
https://jntuhbtechadda.blogspot.com/
• The key register (K) provides a mask for choosing
a particular field or key in the argument word.
• If the key register contains a binary value of all
1's, then the entire argument is compared with
each memory word.
• Otherwise, only those bits in the argument that
have 1's in their corresponding position of the
key register are compared.
• Thus, the key provides a mask for identifying a
piece of information which specifies how the
reference to memory is made.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
• The diagram represent the relation between the
memory array and the external registers in an
associative memory.
• The cells present inside the memory array are
marked by the letter C with two subscripts.
• The first subscript gives the word number and the
second specifies the bit position in the word. For
instance, the cell Cij is the cell for bit j in word i.
• A bit Aj in the argument register is compared with
all the bits in column j of the array provided that
Kj = 1. This process is done for all columns j = 1, 2,
3......, n.
https://jntuhbtechadda.blogspot.com/
• If a match occurs between all the unmasked
bits of the argument and the bits in word i,
the corresponding bit Mi in the match register
is set to 1. If one or more unmasked bits of the
argument and the word do not match, Mi is
cleared to 0.
https://jntuhbtechadda.blogspot.com/
Cache Memory
• The data or contents of the main memory that
are used frequently by CPU are stored in the
cache memory so that the processor can easily
access that data in a shorter time.
• Whenever the CPU needs to access memory, it
first checks the cache memory. If the data is not
found in cache memory, then the CPU moves into
the main memory.
• Cache memory is placed between the CPU and
the main memory. The block diagram for a cache
memory can be represented as:
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
• The cache is the fastest component in the memory hierarchy
and approaches the speed of CPU components.
• The basic operation of a cache memory is as follows:
– When the CPU needs to access memory, the cache is
examined. If the word is found in the cache, it is read from
the fast memory.
– If the word addressed by the CPU is not found in the
cache, the main memory is accessed to read the word.
– A block of words one just accessed is then transferred
from main memory to cache memory. The block size may
vary from one word (the one just accessed) to about 16
words adjacent to the one just accessed.
– The performance of the cache memory is frequently
measured in terms of a quantity called hit ratio.
https://jntuhbtechadda.blogspot.com/
– When the CPU refers to memory and finds the word in
cache, it is said to produce a hit.
– If the word is not found in the cache, it is in main
memory and it counts as a miss.
– The ratio of the number of hits divided by the total
CPU references to memory (hits plus misses) is the hit
ratio.
The basic characteristics of cache memory is its fast
access time. Therefore, little time or no time must be
wasted when searching data in cache memory.
Transformation of data from main memory to cache is
called Mapping process.
https://jntuhbtechadda.blogspot.com/
We have three types of mapping procedures
• Associative mapping
• Direct mapping
• Set-associative mapping
To help understand the mapping procedure, we have the following
example:
https://jntuhbtechadda.blogspot.com/
Associative mapping:
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
A CPU address of 15 bits is places in the argument
register and the associative memory us searched for a
matching address
If the address is found, the corresponding 12-bits data is
read and sent to the CPU
If not, the main memory is accessed for the word
https://jntuhbtechadda.blogspot.com/
Direct Mapping
• Associative memory is expensive compared to RAM
• In general case, there are 2^k words in cache memory and 2^n words in
main memory (in our case, k=9, n=15)
• The n bit memory address is divided into two fields: k-bits for the index
and n-k bits for the tag field
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Set-Associative Mapping
• The disadvantage of direct mapping is that two words
with the same index in their address but with different
tag values cannot reside in cache memory at the same
time
• Set-Associative Mapping is an improvement over the
direct-mapping in that each word of cache can store
two or more word of memory under the same index
address
• In the next slide, each index address refers to two data
words and their associated tags
• Each tag requires six bits and each data word has 12
bits, so the word length is 2*(6+12) = 36 bits
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
Unit V
https://jntuhbtechadda.blogspot.com/
CISC & RISC Characteristics
• Important in computer acrhi. Is instruction set
provided to computer.
• Instruction set selected for a computer
determines the way machine language
programs are constructed.
https://jntuhbtechadda.blogspot.com/
• Trend in computer hardware complexity was
influenced by
– Upgrading existing models
– Adding instructions to facilitate Translation of high
level to machine level programs
– Moving the functions from software
implementation to hardware implementation
– A computer with large no of instructions is called
Complex Instruction Set Computer(CISC)
https://jntuhbtechadda.blogspot.com/
• In early 80s, there were computer use few
instructions with simple constructs so that
they can be executed much faster within CPU
without having to use memory often which is
called Reduced Instruction Set
Computer(RISC)
https://jntuhbtechadda.blogspot.com/
CISC Characteristics
• Large no of instructions
• More no of addressing modes
• Some instructions that perform specialized tasks which
are used infrequently
• Variable length instruction formats
• Instructions that manipulate operands in memory
https://jntuhbtechadda.blogspot.com/
RISC Characteristics
• It involves an attempt to reduce the execution
time by simplifying instruction set
• Characteristics
– Few instructions
– Few addressing modes
– Memory access limited to load and store
– All operations are done within registers of CPU
– Fixed length instruction format easy for decoding
– Single cycle instruction execution(Pipelining)
https://jntuhbtechadda.blogspot.com/
– Uses large no of registers
– Efficient instruction Pipeline
– Overlapped register windows
• Here much time is consumed to save and restore values
from memory/stack while implementing Subroutine call
& return to avoid this we use overlapping of register
windows
https://jntuhbtechadda.blogspot.com/
• Register Windows:
• Procedure CALL and RETURN occurs in HL
programming languages. When translated into
machine language,
• a procedure CALL produces a sequence of
instructions that:
• - Save register values. –
• Pass parameters needed for the procedure.
• - Call a subroutine to execute the body of the
procedure.
https://jntuhbtechadda.blogspot.com/
• After a procedure RETURN, the program will:
– Restore the old register values,
• – Pass results to the calling program,
• – Return from the subroutine.
https://jntuhbtechadda.blogspot.com/
• Saving and restoring registers and passing parameters
and results involve time-consuming operations.
• To overcome this, one of the following techniques must
be used:
• 1. Using multiple-register banks: each procedure is
allocated its own bank of registers. This will eliminate
the need for saving and storing register values.
• 2. Using the memory stack to store the parameters that
are needed by the procedure, but this requires a
memory access every time the stack is accessed.
https://jntuhbtechadda.blogspot.com/
• 3. Using overlapped register windows (RISC
processors) to provide the passing of
parameters and avoid the need for saving and
restoring register values.
https://jntuhbtechadda.blogspot.com/
• For overlapped register windows:
• Each procedure CALL results in the allocation
of a new window consisting of a set from the
register file for use by the new procedure.
• Each procedure CALL activates a new register
window by incrementing a pointer, while the
RETURN statement decrements the pointer
and causes the activation of the previous
window
https://jntuhbtechadda.blogspot.com/
• Windows for adjacent procedures have
overlapping registers that are shared to
provide the passing of parameters and results.
https://jntuhbtechadda.blogspot.com/
https://jntuhbtechadda.blogspot.com/
• Example:
• The system has 74 registers:
• R0-R9: Ten hold parameters shared by all procedures.
• R10-R73: Sixty four registers are divided into FOUR
windows to accommodate procedures A, B, C & D.
• Each register window consists of 10 local registers and
TWO sets of 6 registers common to adjacent windows.
• Local Registers: used for local variables.
• Common Registers: used for exchange of parameters
and results between adjacent procedures.
https://jntuhbtechadda.blogspot.com/
• Each procedure has 32 registers when it is
active, these are; global regs + local regs +
overlapping regs 10+10+6+6= 32 registers
• To calculate Register Window Size:
• G: No. of global registers.
• L: No. of local registers in each window.
• C: No. of registers common to two windows
• W: No. of windows.
https://jntuhbtechadda.blogspot.com/
• Window size (S): No. of registers available for
each window.
– Window Size= S= L+2C+G
To Memory
1) SISD (Single Instruction - Single Data stream) Incrementer
Processor
» for practical purpose: only one processor is useful registers
Floatint-point
add-subtract
» Example systems : Amdahl 470V/6, IBM 360/91
Floatint-point
IS multiply
Floatint-point
divide
CU PU MM
IS DS
https://jntuhbtechadda.blogspot.com/
9-2
2) SIMD
Shared
(Single Instruction - Multiple Data stream) DS 1
memmory
PU 1 MM 1
» vector or array operations
one vector operation includes many DS 2
PU 2 MM 2
operations on a data stream
IS
CU
» Example systems : CRAY -1, ILLIAC-IV
DS n
PU n MM n
IS
3) MISD
(Multiple Instruction - Single Data stream)
» Data Stream Bottle neck DS
IS 1 IS 1
CU 1 PU 1
Shared memory
IS 2 IS 2
CU 2 PU 2 MM n MM 2 MM 1
IS n IS n
CU n PU n
DS
https://jntuhbtechadda.blogspot.com/
9-3
4) MIMD
(Multiple Instruction - Multiple Data stream)
» Multiprocessor Shared
memory
IS 1 IS 1 DS
System CU 1 PU 1 MM 1
IS 2 IS 2
CU 2 PU 2 MM 2
v v
IS n IS n
CU n PU n MM n
Pipelining
Pipelining
Decomposing a sequential process into suboperations
Each subprocess is executed in a special dedicated segment concurrently
Pipelining Example
Multiply and add operation : Ai * Bi Ci ( for i = 1, 2, …, 7 )
3 Suboperation Segment
»1) R1 Ai, R2 Bi : Input Ai and Bi
»2) R3 R1* R2, R4 Ci : Multiply and input Ci
»3) R5 R3 R4 : Add Ci
https://jntuhbtechadda.blogspot.com/
9-5
Pipelining
General considerations
4 segment pipeline :
» S : Combinational circuit
for Suboperation
» R : Register(intermediate
results between the
segments)
Space-time diagram :
» Show segment utilization
as a function of time
Task : T1, T2, T3,…, T6 Clock cycles 1 2 3 4 5 6 7 8 9
» Total operation 1 T 1 T 2 T 3 T 4 T 5 T 6
Segment
1 2 3 4 5
all the segment
3 T 1 T 2 T 3 T 4 T5 T 6
4 T 1 T 2 T 3 T4 T 5 T 6
https://jntuhbtechadda.blogspot.com/
9-6
Segment
2 T 1 T 2 T 3 T 4 T 5 T6
3 T 1 T 2 T 3 T 4 T5 T 6
4 T 1 T 2 T 3 T4 T 5 T 6
https://jntuhbtechadda.blogspot.com/
9-7
Instruction Pipeline
Instruction Cycle
1) Fetch the instruction from memory
2) Decode the instruction
3) Calculate the effective address
4) Fetch the operands from memory
5) Execute the instruction
6) Store the result in the proper place
https://jntuhbtechadda.blogspot.com/
9-8
Instruction Pipeline
Example : Four-segment Instruction Pipeline
Four-segment CPU pipeline :
» 1) FI : Instruction Fetch
» 2) DA : Decode Instruction & calculate EA
» 3) FO : Operand Fetch
» 4) EX : Execution
Timing of Instruction Pipeline :
» Instruction 3 Branch
Step : 1 2 3 4 5 6 7 8 9 10 11 12 13
Instruction : 1 FI DA FO EX
2 FI DA FO EX
(Branch) 3 FI DA FO EX
4 FI FI DA FO EX
5 FI DA FO EX
6 FI DA FO EX
7 FI DA FO EX
No Branch Branch
https://jntuhbtechadda.blogspot.com/
Computer System Architecture Chap. 9 Pipeline and Vector Processing Dept. of Info. Of Computer
9-9
https://jntuhbtechadda.blogspot.com/
9-10
Clock cycles : 1 2 3 4 5 6 7 8 9 10
Delayed Branch 1. Load I A E
4. Subtract I A E
5. Branch to X I A E
6. No-operation I A E
7. No-operation I A E
8. Instruction in X I A E
Clock cycles : 1 2 3 4 5 6 7 8
1. Load I A E
2. Increment I A E
3. Branch to X I A E
4. Add I A E
5. Subtract I A E
6. Instruction in X I A E
https://jntuhbtechadda.blogspot.com/
9-11
https://jntuhbtechadda.blogspot.com/
9-12
https://jntuhbtechadda.blogspot.com/
9-13
Vector processor
» Single vector instruction
C(1:100) = A(1:100) + B(1:100)
https://jntuhbtechadda.blogspot.com/
9-14
https://jntuhbtechadda.blogspot.com/
9-15
A8B8 A7B7 A6B 6 A5B5 A4B4 A3B3 A2B2 A1B1 A8B8 A7B7 A6B 6 A5B5 A4B4 A3B3 A2B2 A1B1
DR DR DR DR
https://jntuhbtechadda.blogspot.com/
9-17
https://jntuhbtechadda.blogspot.com/
9-18
https://jntuhbtechadda.blogspot.com/
9-19
https://jntuhbtechadda.blogspot.com/
9-20
It contains a set of identical processing elements (PE's), each of which is having a local
memory M.
Each processor element includes an ALU and registers.
The master control unit controls all the operations of the processor elements. It also decodes
the instructions and determines how the instruction is to be executed.
The main memory is used for storing the program.
The control unit is responsible for fetching the instructions.
Vector instructions are send to all PE's simultaneously and results are returned to the
memory.
The best known SIMD array processor is the ILLIAC IV computer developed by
the Burroughs corps.
SIMD processors are highly specialized computers. They are only suitable for numerical
problems that can be expressed in vector or matrix form and they are not suitable for other
types of computations.
https://jntuhbtechadda.blogspot.com/
9-21
https://jntuhbtechadda.blogspot.com/
BVRIT HYDERABAD College of Engineering for Women PAGE NO:1
Multi Processors
Syllabus
Characteristics of Multiprocessors, Interconnection Structures, Interprocessor arbitration, Interprocessor
communication and synchronization, Cache Coherence.
Characteristics of Multiprocessors:
The interconnection of two or more CPUs with memory and input-output equipment is called as multiprocessor
system.
But, a system with a single CPU and one or more lOPs is usually not a multiprocessor system, unless lOP has
computational facilities comparable to a CPU.
So, a multiprocessor system implies the existence of multiple CPUs, with one or more lOPs.
A multicomputer system consists of several autonomous computers interconnected with each other by means of
communication lines.
The reliability of the system is improved with multiprocessing because a failure or error in one processor has a
limited effect on the rest of the system, as the second processor can be assigned to perform the functions of the
disabled processor.
Hence, the system will continue to function correctly with probably some loss in efficiency.
High performance can be achieved using a multiprocessor organization as the computations can proceed in
parallel in one of the following two ways.
1. Multiple independent jobs can be operated in parallel.
2. A single job can be partitioned into multiple parallel tasks.
b. In a computer system
One processor performs high-speed floating-point mathematical computations.
Other processor(s) take(s) care of routine data-processing tasks.
Based on the organization of memory, the multiprocessors are classified into two types:
1. Shared-memory or tightly coupled microprocessor
2. Distributed- memory or loosely coupled microprocessor
Tightly coupled systems can tolerate a higher degree of interaction between tasks.
Loosely coupled systems are most efficient for minimal interaction between tasks.
Interconnection Structures:
A multiprocessor system consists of following components:
- CPUs,
- IOPs connected to input-output devices,
- A memory unit that may be partitioned into a number of separate modules
Depending on the number of transfer paths available between the processors and memory in a sharedmemory
system or distributed memory system, the interconnection among the components can have different physical
configurations. They are:
1. Time-shared common bus
2. Multiport memory
3. Crossbar switch
NAME OF THE FACULTY:MuraliNath R S, Nagamani G, Kavitha P SUBJECT:COA
https://jntuhbtechadda.blogspot.com/
BVRIT HYDERABAD College of Engineering for Women PAGE NO:3
In this configuration:
- At any given time, only one processor having the control over the bus is allowed to communicate with the
memory or another processor.
- Any processor wishing to initiate a transfer:
First determines the availability of the bus
After the availability of the bus, address the destination is used to initiate the transfer and a command is
issued to indicate the operation to be performed.
The destination recognizes its address in the bus and responds to the sender’s control signals
accordingly.
The system will resolve the conflicts by incorporating a bus controller that establishes priorities among the
requesting units.
Limitations:
- A single common-bus system is restricted to one transfer at a time. So, all other processors are either busy
with internal operations or must be idle waiting for the bus.
- Hence, the total overall transfer rate within the system is limited by the speed of the single bus.
To overcome these, the processors in the system can be kept busy through the implementation of two or more
independent buses to permit multiple simultaneous bus transfers. But, this increases the overall system cost and
complexity.
2. Multiport Memory:
A multiport memory system employs separate buses between each memory module and each CPU as shown
below.
- The conflicts are resolved by assigning fixed priorities to each memory port, and the processors are
connected accordingly.
Advantage:
High transfer rate can be achieved using multiple paths between processors and memory.
Disadvantages:
- Expensive memory control logic
- Large number of cables and connectors.
In lieu of the above, this interconnection structure is appropriate for a small number of processors.
3. Crossbar Switch:
The crossbar switch organization shown below consists of a number of crosspoints that are placed at
intersections between processor buses and memory module paths.
The functional design of a crossbar switch connected to one memory module is illustrated below.
A crossbar switch organization uses a separate path associated with each memory module to support
simultaneous transfers. However, the hardware required to implement the switch can become quite large and
complex.
The 2 x 2 switch can be used as a building block, to construct multistage network to control the communication
between a number of sources and destinations.
- The two processors P1 and P2 are connected through switches to eight memory modules marked in binary
from 000 through 111.
- The path from source to a destination is determined from the binary bits of the destination number.
The first bit determines the switch output in the first level.
The second bit specifies the output of the switch in the second level
The third bit specifies the output of the switch in the third level.
The above example illustrates that either P1 or P2 can be connected to any one of the eight memories.
But, certain request patterns cannot be satisfied simultaneously.
For example, if P1 is connected to one of the destinations 000 through 011, P 2 can be connected to only one of
the destinations 100 through 111.
Many different topologies have been proposed for multistage switching networks to control
- Processor-memory communication in a tightly coupled multiprocessor system
- Communication between the processing elements in a loosely coupled system.
The omega switching network shown below is one of the popular topology.
In a tightly coupled multiprocessor system, the source is a processor and the destination is a memory module.
- The first pass through the network sets up the path.
- Succeeding passes are used to transfer the address into the memory and then transfer the data in either
direction based on Read or Write signals
In a loosely coupled multiprocessor system, both the source and destination are processing elements.
- The first pass through the network sets up the path.
- Then the source transfers the message to the destination.
5. Hypercube Interconnection:
The hypercube or binary n-cube multiprocessor structure is a loosely coupled system composed of N = 2nnodes
interconnected in an n-dimensional binary cube.
- Each node can be a CPU, local memory or an I/O interface.
- Each node has direct communication paths called as edges to n other neighbor nodes.
- There are 2n distinct n-bit binary addresses that can be assigned to the nodes.
- Each node address differs from that of each of its n neighbors by exactly one bit position.
For example, the three neighbors of the node with address 100 in a three-cube structure are 000, 110, and
101. Each of these binary numbers differs from address 100 by one bit value.
In general, an n-cube structure has 2nnodes with a processor residing in each node.
In an n-cube structure, one to n links are required to route messages from a source node to a destination node.
For example, in a three-cube structure, node 000:
- Can communicate directly with node 001.
- Must cross at least two links to communicate with 011 (from 000 to 001 to 011 or from 000 to 010 to 011).
- Go through at least three links to communicate from node 000 to node 111.
A routing procedure is based on the result of exclusive-OR of the source node address with the destination node
address. The resulting binary value will have 1 bits corresponding to the axes on which the two nodes differ.
The message is then sent along any one of the axes.
For example, in a three-cube structure, a message from 010 to 001produces an exclusive-OR result equal to 011.
The message can be sent along the second axis to 000 and then through the third axis to 001.
Interprocessor Arbitration:
A number of buses at various levels are used to transfer the information between the components in computer
systems.
- Internal buses for transfer of information between processor registers and ALU.
- A Memory bus for transferring data, address and read/write information
- An I/O bus is used to transfer information to and from I/O devices
A system bus connects major components in a multiprocessor system, such as CPUs, IOPs and memory.
System Bus: The lines in a system bus are divided into three functional groups:
- Data
- Address
- Control
In addition, there will be power distribution lines also to the components.
Data Lines:
- Provides a path for the transfer of data between processors and common memory.
- Usually in the multiples of 8.
- Terminated with three-state buffers
- Bidirectional
Address Lines:
- Used to identify a memory address or any other source or destination (I/O ports)
- Determines the maximum possible memory capacity
- Terminated with three-state buffers
- Unidirectional, from processor to memory or I/O ports
Control Lines:
- Provide signals for controlling the information transfer between components.
- These include:
Timing signals indicate the validity of data and address information.
Command signals specify operations to be performed.
Transfer Signals such as
Memory read and write.
Acknowledge of a transfer.
Interrupt requests.
Bus control signals – Bus request, Bus grant & Arbitration.
In a synchronous bus,
- Each data item is transferred from the source to destination units during a pre-known time slice.
- Both source and destination are driven by a common clock.
- In the case of separate clocks, synchronization signals are transmitted periodically to keep all clocks in step
with each other.
In an asynchronous bus, the transfer of each data item is accomplished by using handshaking control signals.
The following figures list the 86 lines available in the IEEE standard 796 multi bus.
Arbitration Procedures:
All the requests from the processor(s) are serviced by arbitration procedures based on the priorities. There are
three kinds of arbitration procedures:
- Serial
- Parallel
- Dynamic
Serial Arbitration Procedure:
- This is a hardware bus priority resolving technique.
- The units requesting the control of the system bus are connected in series as a daisy-chain connection.
- The processors/units are assigned priority based on their position along the priority control line.
The daisy-chain connection of four arbiters is shown below:
In case of a higher-priority processor requests the bus; the lower-priority processor must complete its bus
operation before relinquishing the control.
Polling Algorithm:
- The poll lines connected to all units are used by the bus controller to define an address for each device.
- The bus controller sequences through the addresses in a prescribed manner.
- A processor that requires access, recognizes its address, activates the bus busy line and accesses the bus.
- The polling process continues by choosing a different processor, after a number of bus cycles.
- This process is programmable and the selection priority can be altered under program control.
LRU Algorithm:
- The highest priority is given to the unit that has not used the bus for the longest interval.
- The priorities are adjusted after a number of bus cycles based on the usage.
- No unit is favored as the priorities are dynamically changed and every unit will get an opportunity.
FCFS Algorithm:
- All the requests are served in the order received.
- A queue is established by the bus controller to maintain the arrived bus requests.
- Each unit must wait for its turn to use the bus on a First-In, First-Out (FIFO) basis.
To prevent conflicting use of shared resources by several processors there must be a provision in the operating
system for assigning resources to processors.
There are three organizations that have been used in the design of operating system for multiprocessors:
- Master-slave configuration
- Separate operating system
- Distributed operating system.
Master-Slave Configuration:
- One processor, designated the master, always executes the operating system functions.
- The remaining processors, denoted as slaves, do not perform operating system functions, but requests the
service by interrupting the master.
- A message is then sent with a header and various data objects used to communicate between nodes.
- In the case of availability of multiple paths, the operating system in each node contains routing information.
- The communication efficiency of the inter processor network depends on:
Communication routing protocol.
Processor speed.
Data link speed.
Topology of the network.
Interprocessor Synchronization
The multiprocessor’s instruction set contains basic instructions to implement communication and
synchronization between cooperating processes.
- Communication refers to the exchange of data between different processes.
- Synchronization refers to the control information, needed to enforce the correct sequence of processes and
to ensure mutually exclusive access to shared writable data.
Multiprocessor systems include various mechanisms to deal with the synchronization of resources.
- Low-level primitives to enforce mutual exclusion are implemented directly by the hardware.
- A number of hardware mechanisms for mutual exclusion have been developed. One of the most popular
methods is through the use of a binary semaphore.
A critical section is a program sequence that, once begun, must complete execution before another processor
accesses the same shared resource.
A binary variable called a semaphore is often used to indicate whether or
not a processor is executing a critical section. A semaphore is a software controlled
flag that is stored in a memory location that all processors can access.
When the semaphore is equal to 1, it means that a processor is executing a
critical program, so that the shared memory is not available to other processors.
When the semaphore is equal to 0, the shared memory is available to any
requesting processor. Processors that share the same memory segment agree
by convention not to use the memory segment unless the semaphore is equal
to 0, indicating that memory is available . They also agree to set the semaphore
to 1 when they are executing a critical section and to clear it to 0 when they
are finished.
Testing and setting the semaphore is itself a critical operation and must
be performed as a single indivisible operation. If it is not, two or more processors
may test the semaphore simultaneously and then each set it, allowing
them to enter a critical section at the same time. This action would allow
simultaneous execution of critical section, which can result in erroneous initialization
of control parameters and a loss of essential information.
A semaphore can be initialized by means of a test and set instruction in
Cache Coherence
• Advantage of cache memory
• Problem in multiprocessor systems as each processor maintains its own cache and updated data might
not be reflected in all local cache memory
• Cache inherence occurs in multiprocessor systems with share writable data.
• Share readable data doesnt result in any issue
• Another configuration that may cause cache coherence is DMA in conjunction with IOP connected to
system bus.
• Do not allow private cache for each processor instead maintain shared cache but increases avg access
time
• Attach private cache but the data stored in cache should be non sharable – such data is called cachable.
– Shared writable data are non cachable
– But the cost increases because of extra software overhead
– Centralized global table in compiler– which allows writable data to be in at least one cache.
– Status of memory blocks is maintained in CGT.
– Blocks are identified as RO(read only) RAW(read & write)
– All cache will have blocks named as ROs and 1 cache will have RAW block. If any updations in
cache that has RAW block will not affect other caches as they don’t have that block.
• Centralized global table in compiler– which allows writable data to be in atleast one cache.
– Status of memory blocks is maintained in CGT.
– Blocks are identified as RO(read only) RAW(read & write)
– All cache will have blocks named as ROs and 1 cache will have RAW block. If any updations in
cache that has RAW block will not affect other caches as they don’t have that block.
Hardware solution:
– A hardware is used to monitor bus watching mechanism over all caches attached to the bus
called Snoopy cache controller
– Several schemes are proposed using snoopy controller.
– One such easy approach is using write through policy.
Steps:
– All snoopy controllers watch bus for memory write operations
– When a word is updated in cache, main memory is updated.
– Local snoopy controllers of all the caches check their memory if they have copy of the data that is
overwritten
– If the copy exists in remote cache, that location is marked invalid if not updated.
– If it is invalid in future reference it may lead cache miss