Unit 3 Part

Computer Organization and
Architecture
KCS302
LECTURE 38
Unit-3: Control Unit
•Instruction types
•Instruction formats
•Instruction cycles and sub cycles (Fetch and Execute etc)
•Micro operations
•Execution of a complete instruction
•Program Control
•Reduced Instruction Set Computer

•Pipelining
•Hardwire and micro programmed control:

•Micro programme sequencing
•Concept of horizontal and vertical microprogramming.
Program Control
1. DATA TRANSFER INSTRUCTIONS
2. DATA MANIPULATION INSTRUCTIONS
3. PROGRAM CONTROL INSTRUCTIONS
– changes the address value in the program counter and cause the flow of control to be
altered
• Processor Status Word (PSW):
• In Basic Computer, the processor had several (status) flags – 1 bit value that
indicated various information about the processor’s state – E, FGI, FGO, I, IEN,
R
• In some processors, flags like these are often combined into a register – the
processor status register (PSR); sometimes called a processor status word
(PSW)
• Common flags in PSW are
– C (Carry): Set to 1 if the carry out of the ALU is 1
– S (Sign): The MSB bit of the ALU’s output
– Z (Zero): Set to 1 if the ALU’s output is all 0’s
– V (Overflow): Set to 1 if there is an overflow
• Conditional Branch Instructions:
• Subroutine Call and Retun:
• The instruction known by different names:
• call subroutine
• jump to subroutine
• branch to subroutine
• branch and save address
✔ Operation code + Address that specifies the beginning of the subroutine
• Executed by performing two operations:

CALL
1. The return address is stored in a temporary location SP ← SP - 1
– In a fixed Location in memory M[SP] ← PC
– Or in a processor Register PC ← EA
– Or in memory stack (most efficient way)
RTN
2. Control is transferred to the beginning of the subroutine PC ← M[SP]
SP ← SP + 1
❑ Program Interrupt
• Types of Interrupt:
• External interrupts
✔ Initiated from the outside of CPU and Memory
– I/O Device
– Timing Device → Timeout
– Power Failure
– Operator
• Internal interrupts (traps)

✔ Caused by the currently running program
– Register, Stack Overflow
– Divide by zero
– OP-code Violation
– Protection Violation
• Software Interrupts
✔ Initiated by the executing an instruction.
– Supervisor Call
• Switching from a user mode to the supervisor mode
• Allows to execute a certain class of operations which are not allowed in the user mode
• Interrupt Procedure:
• Quite similar to a subroutine call except for three variations:

1. Usually initiated by an internal or an external signal rather than from the
execution of an instruction (except for the software interrupt)
2. The address of the interrupt service program is determined by the
hardware rather than from the address field of an instruction
3. An interrupt procedure usually stores all the information necessary to
define the state of CPU rather than storing only the PC.
• The state of the CPU is determined from:

– Content of the PC
– Content of all processor registers
– Content of status bits
• Many ways of saving the CPU state depending on the CPU architectures
RISC and CISC
• RISC (emerged in early 1980s):
– Relatively few instructions
– Relatively few addressing modes
– Memory access limited to load and store instructions
– All operations done within the registers of the CPU
– Fixed-length, easily decoded instruction format
– Single-cycle instruction execution
– Hardwired rather than microprogrammed control
– Efficient instruction pipeline
– LOAD/STORE Instructions
– Examples: ARC, Alpha, ARC, ARM, AVR, PA-RISC, and SPARC
• CISC (the original microprocessor ISA):

– A large number of instructions-typically from 100 to 250 instructions
– Some instructions that perform specialized tasks and are used infrequently
– A large variety of addressing modes-typically from 5 to 20 different modes
– Variable-length instruction formats
– Instructions that manipulate operands in memory
– Examples: System/360, VAX, AMD, and Intel x86 CPUs
Pipelining
• To improve the performance of a CPU we have two options:
1. Improve the hardware by introducing faster circuits.
2. Arrange the hardware such that more than one operation can be performed at the same
time.
• Pipelining:
– Pipelining is a process of arrangement of hardware elements of the CPU such that its
overall performance is increased.
– Simultaneous execution of more than one instruction takes place in a pipelined
processor.
• RISC processor has 5 stage instruction
pipeline:
1.Stage 1 (Instruction Fetch)
2.Stage 2 (Instruction Decode)
3.Stage 3 (Instruction Execute)
4.Stage 4 (Memory Access)
5.Stage 5 (Write Back)
• Performance:
• Non-pipelined execution time
= Total number of instructions x Time taken to execute one
instruction
= n x k clock cycles
• Pipelined execution time
= (k + n – 1) clock cycles
■ Speed Up
= Non-pipelined execution time / Pipelined execution time
= k / { 1 + (k – 1)/n }
• For very large number of instructions, n→∞.
Thus, speed up = k.
• Practically, total number of instructions never tend to
infinity. Therefore, speed up is always less than k.
• Dependencies and Data Hazard:
1. Structural Dependency
– due to the resource conflict in the pipeline
2. Data Dependency
– whenever there are two instructions one of which depends on the data obtained from the
other
i. RAW (Read after Writing) [Flow dependency]
ii. WAR (Write after reading) [Anti-Data dependency]
iii. WAW (Write after Writing) [Output dependency]
4. Control Dependency
– when one of the instructions is a branching instruction
– "branch penalty" (for the invalid fetches that are needed to be removed)
• These dependencies may introduce stalls in the pipeline.

• Stall: A stall is a cycle in the pipeline without new input.
• Other possible solutions:

✔ operand forwarding (for data dependency)
1. adding more hardware - Code Memory and Data Memory (for structural dependency)
• Ideal CPI of the pipelined processor is ‘1’. But due to stalls, it becomes greater
than ‘1’.
• Speed Up (S) = CPInon-pipeline / (1 + Number of stalls per instruction)
• Types of pipeline:
• Uniform delay pipeline
– all the stages will take same time
– Cycle Time (Tp) = Stage Delay
– If buffers are included between the stages then,
Cycle Time (Tp) = Stage Delay + Buffer Delay
• Non-Uniform delay pipeline
– Different stages take different time to complete an operation
– Cycle Time (Tp) = Maximum(Stage Delay)
• Example: If there are 4 stages with delays, (2 ns, 8 ns, 3 ns, 10 ns), find the
time taken to execute 100 tasks in the above pipeline.
• Solution: As the above pipeline is a non-linear pipeline,
Tp = max(2, 8, 3, 10) = 10 ns
We know that ETpipeline = (k + n – 1) Tp = (4 + 100 – 1) 10 ns = 1030 ns

Unit 3 Part

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Unit 3 Part

Uploaded by

Copyright:

Available Formats

Computer Organization and

•Reduced Instruction Set Computer

•Hardwire and micro programmed control:

✔ Operation code + Address that specifies the beginning of the subroutine

• Executed by performing two operations:

• Internal interrupts (traps)

• Quite similar to a subroutine call except for three variations:

• The state of the CPU is determined from:

• CISC (the original microprocessor ISA):

• These dependencies may introduce stalls in the pipeline.

• Other possible solutions:

• Speed Up (S) = CPInon-pipeline / (1 + Number of stalls per instruction)

You might also like