
von Neumann Architecture Processor Organization Instruction Cycle Instruction Pipelining

CSC 213: Computer Architecture


Lecture 2: Processor Structure and Function

October 22, 2021


Agenda

1 von Neumann Architecture

2 Processor Organization

3 Instruction Cycle

4 Instruction Pipelining


von Neumann Architecture

Modern computers are based on the von Neumann architecture.


Key concepts
Data and instructions are stored in a single read-write memory.
The contents of this memory are addressable by location, without regard to the type of data contained there.
Execution occurs in a sequential fashion (unless explicitly modified) from one instruction to the next.
The basic function performed by a computer is execution of a program, which consists of a set of instructions stored in memory.


von Neumann Architecture Internals

[Figure: block diagram of the CPU, showing the AC, ALU, MBR, MAR, PC, IR, CU, I/O AR and I/O BR with the internal bus, together with main memory and the I/O devices.]


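Instruction format

Instruction word (16 bits):
Bits 0-3    Operation code
Bits 4-15   Address

Instruction argument (16 bits):
Bit 0       Sign
Bits 1-15   Size

For example, Move 104 is encoded as 0101000001101000 (operation code 0101, address 000001101000 = 104).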
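
As a minimal sketch of the 16-bit instruction word described above, the following Python packs a 4-bit operation code and a 12-bit address into one word and unpacks it again. The function names `encode` and `decode` are hypothetical, and the value 0101 for Move is taken only from the slide's example.

```python
OPCODE_BITS = 4     # bits 0-3 of the instruction word (leftmost bits)
ADDRESS_BITS = 12   # bits 4-15 of the instruction word


def encode(opcode: int, address: int) -> int:
    """Pack an operation code and an address into a 16-bit instruction word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("operation code does not fit in 4 bits")
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address does not fit in 12 bits")
    return (opcode << ADDRESS_BITS) | address


def decode(word: int) -> tuple[int, int]:
    """Split a 16-bit instruction word into (operation code, address)."""
    return word >> ADDRESS_BITS, word & ((1 << ADDRESS_BITS) - 1)


word = encode(0b0101, 104)        # Move 104, using the opcode from the example
print(format(word, "016b"))       # 0101000001101000
print(decode(word))               # (5, 104)
```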


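Operation

[Figure: instruction cycle flowchart. START leads to the fetch cycle (instruction fetching), then to the execution cycle (instruction execution, which may end at STOP), then to a check for valid interrupts: NO returns to the fetch cycle, YES enters the interrupt cycle (interrupt execution).]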


Instruction fetching cycle

The program counter (PC) stores the address of the next instruction to fetch (at the beginning this is the so-called entry point)
The processor fetches the instruction from the address pointed to by the PC
The value of the PC is increased by 1 (unless something else is required, for example a jump)
The instruction is loaded into the instruction register (IR)
The processor decodes the instruction and executes the operation it specifies
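
The fetch steps listed above can be sketched as register transfers. This is an illustrative model only: the `CPU` class, the `fetch` function and the tuple representation of instructions are assumptions made for the sketch, with the MAR and MBR taken from the CPU block diagram.

```python
from dataclasses import dataclass


@dataclass
class CPU:
    pc: int = 0          # program counter
    mar: int = 0         # memory address register
    mbr: object = None   # memory buffer register
    ir: object = None    # instruction register


def fetch(cpu: CPU, memory: dict) -> object:
    """Perform one fetch cycle and return the fetched instruction."""
    cpu.mar = cpu.pc             # MAR <- PC
    cpu.mbr = memory[cpu.mar]    # MBR <- M(MAR)
    cpu.pc += 1                  # PC <- PC + 1
    cpu.ir = cpu.mbr             # IR <- MBR
    return cpu.ir


memory = {100: ("Move", 104), 101: ("Add", 105)}
cpu = CPU(pc=100)                # 100 is the entry point
print(fetch(cpu, memory))        # ('Move', 104)
print(cpu.pc, cpu.mar)           # 101 100
```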


instruction fetching cycle (1)

Micro-operation: MAR ← PC (over the address bus)
Registers: PC = 100 (entry point)

Address  Content
100      Move 104
101      Add 105
102      Store 106
103      Stop
104      3
105      5
106      (empty)


instruction fetching cycle (2)

Micro-operation: MBR ← M(MAR)
Registers: MAR = 100, MBR = Move 104
Buses: address bus carries 100, control bus carries READ, data bus carries Move 104


instruction fetching cycle (3)

Micro-operation: PC ← PC + 1
Registers: PC = 101, MAR = 100, MBR = Move 104


instruction fetching cycle (4)

Micro-operation: IR ← MBR
Registers: PC = 101, MAR = 100, MBR = Move 104, IR = Move 104


instruction fetching cycle (5)

Micro-operation: CU ← IR
Registers: PC = 101, MAR = 100, MBR = Move 104, IR = Move 104 (passed to the CU)


Instruction execution cycle

Processor-memory: data transfer between CPU and memory
Processor-input/output: data transfer between CPU and input/output module
Data processing: arithmetic or logical operations on the data
Control: change of the instruction execution order (for example, jump)
Combination of the above


instruction execution cycle (1)

Micro-operations: MAR ← IR(104), MBR ← M(104), AC ← MBR
Registers: PC = 101, IR = Move 104, MAR = 104, MBR = 3 (then copied to AC)
Buses: address bus carries 104, control bus carries READ, data bus carries 3
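
The fetch and execute steps traced in the preceding slides can be combined into a small simulator for the sample program at addresses 100 to 103. This is a sketch under stated assumptions: the slides show only that Move loads the accumulator from memory; the meanings given here to Add (AC ← AC + M(address)) and Store (M(address) ← AC) are assumed, and the `run` function is hypothetical.

```python
def run(memory: dict, entry_point: int) -> int:
    """Execute the program in memory and return the final accumulator value."""
    pc, ac = entry_point, 0
    while True:
        instruction = memory[pc]           # fetch
        pc += 1                            # PC <- PC + 1
        op, *args = instruction.split()    # decode
        if op == "Move":                   # AC <- M(address), as in the slides
            ac = memory[int(args[0])]
        elif op == "Add":                  # assumed: AC <- AC + M(address)
            ac += memory[int(args[0])]
        elif op == "Store":                # assumed: M(address) <- AC
            memory[int(args[0])] = ac
        elif op == "Stop":
            return ac
        else:
            raise ValueError(f"unknown operation: {op}")


memory = {
    100: "Move 104", 101: "Add 105", 102: "Store 106", 103: "Stop",
    104: 3, 105: 5, 106: None,
}
run(memory, entry_point=100)
print(memory[106])   # 8
```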


Next instruction fetching cycle

Micro-operations: MAR ← PC, MBR ← M(101)
Registers: PC = 101, MAR = 101, MBR = Add 105
Buses: address bus carries 101, control bus carries READ, data bus carries Add 105


Next instruction fetching cycle (2)

Micro-operations: IR ← MBR, CU ← IR
Registers: MAR = 101, MBR = Add 105, IR = Add 105 (passed to the CU)


Interrupts

Mechanism that allows other system components to interrupt the normal order of execution
Types of interrupts
Program: raised by a condition resulting from instruction execution, for example, overflow or divide by zero
Clock-generated: generated by the internal processor clock; used for process scheduling
Input/Output: generated by the I/O controller
Hardware failure: for example, a memory parity error


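Examples of interrupts

[Figure: flow of control between a user program and an I/O program for WRITE I/O instructions, shown in two panels. The user program is split into segments 1, 2, 3 in one panel and 1, 2a, 2b, 3a, 3b in the other; the I/O program consists of segments 4 and 5; the second panel includes the interrupt execution program.]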


Interrupt cycle

The processor periodically checks whether an interrupt has occurred
A pending interrupt is indicated by the interrupt signal
If no interrupt has occurred, the next instruction is fetched
If an interrupt has occurred:
The executing program is suspended
Its context is saved
The program counter is set to the address of the first instruction of the interrupt execution program
The interrupt is processed
Afterwards, the previous context is restored to the CPU and the user program resumes from the point at which it was suspended
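
The interrupt cycle above can be sketched as a check performed after each executed instruction. The model is deliberately simplified and every name in it is hypothetical: the saved context is reduced to the program counter, and `interrupt_after` is an assumed way to say after which user instructions the interrupt signal is raised.

```python
def run_with_interrupts(program, handler, interrupt_after):
    """Return the order in which user and handler instructions execute.

    program         -- list of user instructions
    handler         -- list of instructions of the interrupt execution program
    interrupt_after -- set of user instruction indexes after which an
                       interrupt signal is pending (an assumption of this sketch)
    """
    trace = []
    pc = 0
    while pc < len(program):
        trace.append(program[pc])        # fetch and execute one instruction
        pc += 1
        if pc - 1 in interrupt_after:    # interrupt check
            saved_pc = pc                # save context (only the PC here)
            trace.extend(handler)        # run the interrupt execution program
            pc = saved_pc                # restore context and resume
    return trace


print(run_with_interrupts(["u1", "u2", "u3"], ["h1", "h2"], {0}))
# ['u1', 'h1', 'h2', 'u2', 'u3']
```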


Multiple interrupts

Two ways of handling multiple interrupts exist:
Blocked (disabled) interrupts
The processor ignores other interrupts while the current interrupt is being processed
Interrupts are queued; after the current interrupt has been processed, the next one (if any) is processed
Interrupts are handled in the order in which they occurred
Priorities
Handling of a low-priority interrupt can be suspended by a higher-priority interrupt
After the higher-priority interrupt has been handled, handling of the low-priority interrupt continues
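
The difference between the two policies can be illustrated with a small time-step simulation. This is a sketch, not a model of any particular processor: the tuple layout `(arrival_time, name, priority, duration)` and the function `simulate` are assumptions, and a larger number is taken to mean a higher priority.

```python
def simulate(interrupts, use_priorities):
    """Return the name of the handler running in each time unit ('user' if none)."""
    arrivals = sorted(interrupts)          # ordered by arrival time
    pending, stack, timeline = [], [], []  # stack[-1] is the running handler
    t = 0
    while arrivals or pending or stack:
        while arrivals and arrivals[0][0] <= t:
            _, name, priority, duration = arrivals.pop(0)
            pending.append([name, priority, duration])
        if pending:
            if use_priorities:
                # Highest priority first; it may preempt a lower-priority handler.
                candidate = max(pending, key=lambda h: h[1])
                can_start = not stack or candidate[1] > stack[-1][1]
            else:
                # Blocked interrupts: strictly first come, first served.
                candidate = pending[0]
                can_start = not stack
            if can_start:
                pending.remove(candidate)
                stack.append(candidate)
        if stack:
            handler = stack[-1]
            timeline.append(handler[0])
            handler[2] -= 1                # one time unit of handler work
            if handler[2] == 0:
                stack.pop()                # handler finished
        else:
            timeline.append("user")
        t += 1
    return timeline


events = [(0, "int1", 1, 3), (1, "int2", 2, 2)]
print(simulate(events, use_priorities=False))  # ['int1', 'int1', 'int1', 'int2', 'int2']
print(simulate(events, use_priorities=True))   # ['int1', 'int2', 'int2', 'int1', 'int1']
```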


Multiple interrupts execution

[Figure: two panels, "Sequential execution" and "Priority execution", each showing the flow of control between the user program, interrupt nr 1 and interrupt nr 2.]


Processor Organization
Processor Requirements:
Fetch instruction
The processor reads an instruction from memory (register,
cache, main memory)
Interpret instruction
The instruction is decoded to determine what action is required
Fetch data
The execution of an instruction may require reading data from
memory or an I/O module
Process data
The execution of an instruction may require performing some
arithmetic or logical operation on data
Write data
The results of an execution may require writing data to
memory or an I/O module


Basic Elements of Processor (CPU)


CPU internal Organization


Registers

The CPU must have some working space (temporary storage): registers
Number and function vary between processor designs
One of the major design decisions
Top level of memory hierarchy
Registers in CPU perform two roles:
User-visible registers: used by the machine or assembly
language programmer to minimize main memory references
Control and status registers: used by the control unit to control
the operation of the processor and by privileged operating
system programs to control the execution of programs


1. User-visible Registers

General Purpose
Data
Address
Condition Codes


General Purpose Registers (1)

May be true general purpose


May be used for data or addressing: e.g. register indirect,
displacement
Data: e.g. accumulator
Addressing: e.g. segment pointers, index registers


General Purpose Registers (2)

Make them general purpose


Increase flexibility and programmer options
Increase instruction size and complexity
Make them specialized
Smaller (faster) instructions
Less flexibility


General Purpose Registers (3)

How Many?
Between 8 - 32
Fewer = more memory references
More does not reduce memory references and takes up processor real estate

How Big?
Large enough to hold full address
Large enough to hold full word
Often possible to combine two data registers


Condition Code Registers

Sets of individual bits


e.g. result of last operation was zero
Can be read (implicitly) by programs
e.g. Jump if zero
Cannot (usually) be set by programs


2. Control and Status Registers

Four registers are essential to instruction execution:


Program counter (PC)
Contains the address of an instruction to be fetched
Instruction register (IR)
Contains the instruction most recently fetched
Memory address register (MAR)
Contains the address of a location in memory
Memory buffer register (MBR)
Contains a word of data to be written to memory or the word
most recently read


Program Status Word (PSW)

Includes Condition Codes


Sign of last result
Zero
Carry
Equal
Overflow
Interrupt enable/disable
Supervisor
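
To make the condition codes concrete, the sketch below computes the sign, zero, carry and overflow bits for an addition. The 8-bit word width and the function name `add_with_flags` are assumptions chosen for illustration; the remaining PSW fields (equal, interrupt enable/disable, supervisor) are not modeled.

```python
def add_with_flags(a: int, b: int, bits: int = 8):
    """Add two words and return (result, condition codes)."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    total = a + b
    result = total & mask
    flags = {
        "sign": bool(result & sign_bit),   # most significant bit of the result
        "zero": result == 0,
        "carry": total > mask,             # carry out of the top bit
        # Signed overflow: operands share a sign that the result does not.
        "overflow": bool(~(a ^ b) & (a ^ result) & sign_bit),
    }
    return result, flags


print(add_with_flags(0x7F, 0x01))
# (128, {'sign': True, 'zero': False, 'carry': False, 'overflow': True})
print(add_with_flags(0xFF, 0x01))
# (0, {'sign': False, 'zero': True, 'carry': True, 'overflow': False})
```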


Example Register Organizations


Instruction Cycle


Instruction Cycle State Diagram


The Fetch Cycle


The Indirect Cycle


The Interrupt Cycle


Pipelining

As in manufacturing, pipelining in a CPU means:
breaking the process into smaller steps, each step handled by a sub-process
as soon as one sub-process finishes its task, it passes its result to the next sub-process and then attempts to begin the next task
operating on multiple tasks simultaneously improves performance


Breaking an Instruction into Cycles

A simple approach is to divide instruction into two stages:


Fetch instruction
Execute instruction
There are times when execution of instruction doesn’t use
main memory
In these cases, use idle bus to fetch next instruction in parallel
with execution.
This is called instruction prefetch


Instruction Prefetch


Improved Performance of Prefetch

Without prefetch:
Instruction 1  Instruction 2  Instruction 3  Instruction 4
fetch exec     fetch exec     fetch exec     fetch exec

With prefetch:
Instruction 1  fetch exec
Instruction 2        fetch exec
Instruction 3              fetch exec
Instruction 4                    fetch exec


Improved Performance of Prefetch (2)

With prefetch, a long sequence of instructions appears to take about half as many cycles as it does without prefetch
Performance, however, is not doubled:
Fetch is usually shorter than execution
Prefetch more than one instruction?
Any jump or branch means that the prefetched instructions are not the required instructions
Add more stages to improve performance
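
The claim that prefetch approaches, but does not reach, a doubling of performance can be checked with a small timing model. The sketch assumes no branches and that every fetch can overlap the previous execution; the function names and the unit stage times are illustrative assumptions.

```python
def time_without_prefetch(n: int, fetch: int = 1, execute: int = 1) -> int:
    """Total time when each instruction is fetched and executed strictly in turn."""
    return n * (fetch + execute)


def time_with_prefetch(n: int, fetch: int = 1, execute: int = 1) -> int:
    """Total time when the next fetch overlaps the current execution."""
    if n == 0:
        return 0
    # The first fetch and the last execute cannot be overlapped with anything;
    # in between, the slower of the two stages sets the pace.
    return fetch + (n - 1) * max(fetch, execute) + execute


print(time_without_prefetch(4), time_with_prefetch(4))    # 8 5
print(time_without_prefetch(1000) / time_with_prefetch(1000))
print(time_without_prefetch(1000, 1, 3) / time_with_prefetch(1000, 1, 3))
```

With equal stage times the ratio tends toward 2 as the number of instructions grows; when execution takes three times as long as fetch, the same model gives a ratio of only about 1.33, which is one reason the speedup falls short of doubling.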


Three Cycle Instruction

Over a long sequence of instructions, the average number of cycles per instruction is further reduced (to approximately a third of the sequential count) if we break an instruction into three cycles (fetch/decode/execute). The time to complete any single instruction is not shortened.

Without pipelining:
Instruction 1  Instruction 2  Instruction 3  Instruction 4
F D E          F D E          F D E          F D E

With pipelining:
Instruction 1  F D E
Instruction 2    F D E
Instruction 3      F D E
Instruction 4        F D E


Pipelining Strategy

If instruction execution could be broken into more pieces, we could realize even better performance
Fetch instruction (FI) – Read next instruction into buffer
Decode instruction (DI) – Determine the opcode
Calculate operands (CO) – Find effective address of source operands
Fetch operands (FO) – Get source operands from memory
Execute instruction (EI) – Perform indicated operation
Write operands (WO) – Store the result
This decomposition produces nearly equal durations
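
Under the idealized assumptions that every stage takes one time unit and nothing stalls, the six-stage decomposition above can be laid out as a timing diagram, and the speedup over unpipelined execution computed. The functions `timing_diagram` and `speedup` are hypothetical helpers, and the choice of nine instructions is only an example.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def timing_diagram(n_instructions: int, stages=STAGES):
    """Return one row per instruction; column t holds the stage active at time t."""
    total_time = len(stages) + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        row = ["  "] * total_time
        for offset, name in enumerate(stages):
            row[i + offset] = name       # instruction i starts at time unit i
        rows.append(row)
    return rows


def speedup(n: int, k: int) -> float:
    """Ideal speedup of a k-stage pipeline over sequential execution of n instructions."""
    return (n * k) / (k + n - 1)


for number, row in enumerate(timing_diagram(9), start=1):
    print(f"Instruction {number}: " + " ".join(row))
print(speedup(9, 6))   # 54 time units sequentially versus 14 pipelined
```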


Sample Timing Diagram for Pipeline


Problems Associated with Previous Timing Diagram

Assumes that each instruction goes through all six stages of the pipeline
Assumes that FI, FO, and WO can happen at the same time, although each may need access to memory
Even with the more detailed decomposition, some stages will still take more time than others
Conditional branches cause even greater disruption to the pipeline than with prefetch
Interrupts, like conditional branches, will disrupt the pipeline
CO and FO stages may depend on results of a previous instruction at a point before the WO stage writes the results


Effect of a Conditional Branch on Instruction Pipeline


Flow of a Six Stage Pipeline


Alternative Pipeline Depiction


More Roadblocks to Realizing Full Speedup

There are two additional factors that frustrate improving performance using pipelining
Overhead required between stages such as buffer-to-buffer transfers
The amount of control logic required to handle memory and register dependencies and to control the pipeline itself
With each added stage, the hardware needed to support pipelining requires careful consideration and design


Pipeline Hazards

Resource Hazards
A resource hazard occurs when two or more instructions that
are already in the pipeline need the same resource
The result is that the instructions must be executed in serial
rather than parallel for a portion of the pipeline
A resource hazard is sometimes referred to as a structural
hazard
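
As one illustration of a resource hazard, suppose main memory has a single port and that every instruction uses memory in its FI, FO and WO stages. Both of these are simplifying assumptions of the sketch, as is the helper name `memory_conflicts`. The code lists the time units in which two instructions in a six-stage pipeline would need memory at once.

```python
from collections import defaultdict

STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]
MEMORY_STAGES = {"FI", "FO", "WO"}   # assumed to access the single memory port


def memory_conflicts(n_instructions: int) -> dict:
    """Map each conflicting time unit to the (instruction, stage) pairs using memory."""
    users = defaultdict(list)
    for i in range(n_instructions):
        for offset, stage in enumerate(STAGES):
            if stage in MEMORY_STAGES:
                # Instruction i (numbered from 1 in the output) enters at time i.
                users[i + offset].append((i + 1, stage))
    return {time: who for time, who in users.items() if len(who) > 1}


for time, who in sorted(memory_conflicts(4).items()):
    print(time, who)
# 3 [(1, 'FO'), (4, 'FI')]
# 5 [(1, 'WO'), (3, 'FO')]
# 6 [(2, 'WO'), (4, 'FO')]
```

In each listed time unit one of the two instructions would have to wait, so that part of the pipeline runs serially rather than in parallel.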


Resource Hazard


Pipeline Hazards (3)

Data Hazard
A data hazard occurs when there is a conflict in the access of an operand location
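
One common kind of operand conflict is a read of a location that the immediately preceding instruction has not yet written. The sketch below detects that case only, for adjacent instructions; the `(destination, sources)` representation, the register names and the function `read_after_write_hazards` are all assumptions made for illustration.

```python
def read_after_write_hazards(instructions):
    """Return (writer index, reader index, location) for adjacent dependencies.

    instructions -- list of (destination, sources) pairs, where sources is a
                    set of operand locations read by the instruction
    """
    hazards = []
    for i in range(1, len(instructions)):
        previous_destination, _ = instructions[i - 1]
        _, sources = instructions[i]
        if previous_destination in sources:
            hazards.append((i - 1, i, previous_destination))
    return hazards


program = [
    ("EAX", {"EAX", "EBX"}),   # EAX <- EAX + EBX
    ("ECX", {"ECX", "EAX"}),   # ECX <- ECX - EAX, reads EAX before it is written back
    ("EDX", {"EBX"}),          # EDX <- EBX, independent of the previous result
]
print(read_after_write_hazards(program))   # [(0, 1, 'EAX')]
```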


Pipeline Hazards (4)

Control Hazard
Also known as a branch hazard
Occurs when the pipeline makes the wrong decision on a
branch prediction
Brings instructions into the pipeline that must subsequently
be discarded


Dealing with Branches

A variety of approaches have been used to reduce the consequences of branches encountered in a pipelined system:
Multiple Streams
Prefetch Branch Target
Loop buffer
Branch prediction
Delayed branching


Multiple Streams

Branch penalty is a result of having two possible paths of execution
Solution: Have two pipelines
Prefetch each branch into a separate pipeline
Once outcome of conditional branch is determined, use
appropriate pipeline
Competing for resources – this method leads to bus and
register contention
More streams than pipes – multiple branches lead to further
pipelines being needed


Prefetch Branch Target

When a conditional branch is recognized, the target of the branch is prefetched, in addition to the instruction following the branch
The target is then saved until the branch instruction is executed
If the branch is taken, the target has already been prefetched


Loop Buffer

Add a small, very fast memory


Maintained by fetch stage of pipeline
Use it to contain the n most recently fetched instructions in
sequence.
Before taking a branch, see if branch target is in buffer
Similar in concept to a cache dedicated to instructions while
maintaining an order of execution
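
A loop buffer as described above can be sketched as a small ordered store of the most recently fetched instructions, consulted before a branch is taken. The class `LoopBuffer` and its method names are hypothetical, and real hardware would index the buffer by low-order address bits rather than use a dictionary.

```python
from collections import OrderedDict


class LoopBuffer:
    """Holds the n most recently fetched instructions, oldest first."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()     # address -> instruction

    def record_fetch(self, address: int, instruction: str) -> None:
        """Called by the fetch stage for every instruction it fetches."""
        self.entries[address] = instruction
        self.entries.move_to_end(address)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # discard the oldest entry

    def lookup(self, target: int):
        """Return the buffered instruction at a branch target, or None on a miss."""
        return self.entries.get(target)


buffer = LoopBuffer(capacity=4)
for address in range(100, 106):
    buffer.record_fetch(address, f"instr@{address}")

print(buffer.lookup(104))   # instr@104 (branch target still in the buffer)
print(buffer.lookup(100))   # None (too old, must be fetched from memory)
```

A short backward branch, as at the end of a small loop, hits in the buffer, so the loop body can be re-executed without going back to memory.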


Loop Buffer (2)


Branch Prediction

There are a number of methods that processors employ to make an educated guess as to the direction a branch may take.
Static
Predict never taken
Predict always taken
Predict by opcode
Dynamic – depend on execution history
Taken/not taken switch
Branch history table
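
A taken/not taken switch can be modeled as a small saturating counter kept per branch. The sketch below compares a one-bit switch with a two-bit counter on a loop branch that is taken nine times and then falls through, repeated three times. The function name, the initial "taken" state and the test pattern are assumptions chosen for illustration.

```python
def mispredictions(outcomes, bits: int) -> int:
    """Count wrong predictions made by a saturating counter of the given width."""
    max_state = (1 << bits) - 1
    threshold = 1 << (bits - 1)      # states at or above this predict "taken"
    state = max_state                # assumed initial state: strongly taken
    misses = 0
    for taken in outcomes:
        predicted_taken = state >= threshold
        if predicted_taken != taken:
            misses += 1
        # Move toward "taken" or "not taken", saturating at both ends.
        state = min(state + 1, max_state) if taken else max(state - 1, 0)
    return misses


loop_branch = ([True] * 9 + [False]) * 3
print(mispredictions(loop_branch, bits=1))   # 5
print(mispredictions(loop_branch, bits=2))   # 3
```

The one-bit switch mispredicts twice per loop after the first pass (once on exit and once on re-entry), while the two-bit counter mispredicts only on exit.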
