
P3 - Chapter 4 - Processors and Computer Architecture
Related past paper questions are attached below:

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/60a22082-9fc1-40d5-a4e7-fbcb494bb5a1/Processors_and_flipflops_a2-compressed.pdf

There are two types of processor, RISC and CISC. Separately, architectures are classified (in Flynn's taxonomy) as SISD, SIMD, MISD or MIMD according to how they carry out parallel processing.

Processor Types

| RISC (Reduced Instruction Set Computer) | CISC (Complex Instruction Set Computer) |
| --- | --- |
| Has fewer addressing modes | Has more addressing modes |
| Instructions are simpler | Instructions are more complex |
| Has better pipelineability | Has poorer pipelineability |
| Requires less complex circuits | Requires more complex circuits |
| Has fewer instructions | Has more instructions |
| Uses fixed-length instructions | Uses variable-length instructions |
| Has many registers | Has fewer registers |
| Has fewer instruction formats | Has more instruction formats |
| Makes more use of RAM | Makes more use of cache |
| Uses only load and store instructions to access memory | Has many types of instructions that access memory |
| Has a hardwired control unit | Has a programmable (microprogrammed) control unit |
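One row of the table - fixed- versus variable-length instructions - can be sketched in code. This is a hypothetical illustration (the byte encodings are invented, not a real ISA): with fixed-length instructions the fetch stage can find every instruction boundary by simple arithmetic, whereas variable-length instructions must be partly decoded before the next one can even be located, which is one reason RISC pipelines more easily.

```python
# Sketch: locating instruction boundaries in fixed- vs variable-length code.
# The encodings here are invented for illustration only.

def fixed_boundaries(code: bytes, width: int = 4) -> list[int]:
    """With fixed-length instructions every boundary is a multiple of the
    width - the fetch stage needs no decoding at all to find it."""
    return list(range(0, len(code), width))

def variable_boundaries(code: bytes) -> list[int]:
    """With variable-length instructions (here: the first byte encodes the
    length) each instruction must be partially decoded before the next
    one can be located."""
    offsets, pc = [], 0
    while pc < len(code):
        offsets.append(pc)
        pc += code[pc]  # length field read during decode
    return offsets

fixed = bytes(16)                                  # four 4-byte instructions
variable = bytes([2, 0, 5, 0, 0, 0, 0, 3, 0, 0])   # lengths 2, 5, 3
print(fixed_boundaries(fixed))        # [0, 4, 8, 12]
print(variable_boundaries(variable))  # [0, 2, 7]
```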


Pipelining
Pipelining is instruction-level parallelism in which the execution of an instruction is split into a number of stages. When the first stage of an instruction completes, the first stage of the next instruction can start; that is, an instruction can begin executing before the previous one has finished, so several instructions are processed simultaneously.

Consider a simple processor using the von Neumann architecture: it runs the fetch-decode-execute (FDE) cycle. An instruction is fetched, then decoded, then executed, and only then is the next instruction fetched. Fetch, decode and execute each take one cycle, so, for example, 5 instructions would take 15 cycles.
In modern computers, instead of 3 stages (Fetch, Decode, Execute) we have 5 stages:

IF - Instruction Fetch

ID - Instruction Decode

OF - Operand Fetch

IE - Instruction Execute

WB - Write Back

Here is a representation of how pipelining with these 5 stages works, considering 4 instructions A, B, C and D.


Clock cycles

| Stage | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Instruction Fetch | A | B | C | D | x | x | x | x |
| Instruction Decode | x | A | B | C | D | x | x | x |
| Operand Fetch | x | x | A | B | C | D | x | x |
| Instruction Execute | x | x | x | A | B | C | D | x |
| Write Back | x | x | x | x | A | B | C | D |

It can be seen that with pipelining only 8 cycles are required, whereas without it 4 × 5 = 20 cycles would be needed. Pipelining therefore saves 12 cycles here.
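The arithmetic above generalises: for s stages and n instructions, an ideal pipeline needs s + (n − 1) cycles instead of n × s. A minimal Python sketch (the function names are my own, not from the syllabus):

```python
def sequential_cycles(n_instructions: int, n_stages: int = 5) -> int:
    """Without pipelining, every instruction passes through all stages
    before the next one starts."""
    return n_instructions * n_stages

def pipelined_cycles(n_instructions: int, n_stages: int = 5) -> int:
    """With an ideal pipeline a new instruction enters every cycle, so
    only the first instruction pays the full stage latency."""
    return n_stages + (n_instructions - 1)

print(sequential_cycles(4))  # 20
print(pipelined_cycles(4))   # 8 -> 12 cycles saved, as in the chart
```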

While A, B, C and D are being processed, other instructions may also be in the pipeline; since we are only interested in A, B, C and D, the others can be treated as "garbage" (marked x in the chart). As soon as D passes through a stage, another garbage instruction may reach that stage (write back, in this case). Once D has completed write back, the whole pipeline is cleared so the next set of instructions can execute.

The system flushes the pipeline when:

1) There is garbage in the pipeline.

2) There is a jump instruction.

3) There is an interrupt.


💡 Data dependency issues:
Consider the instruction ADD <dest> <op1> <op2>, which adds the integers in registers op1 and op2 and places the result in dest.
A program contains the following three instructions:
> ADD r3, r2, r1
> ADD r5, r4, r3
> ADD r10, r9, r8
>
Why does pipelining fail for the first two instructions?
The result of the first addition has not been written back to r3 before the next instruction needs to read the value from r3, so there is a data dependency issue: r3 would be read and written on the same clock pulse.
How can the code above be reordered to overcome the problem?
The third instruction does not depend on the first two, so swapping instructions 2 and 3 solves the problem.
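The hazard above can be sketched in code. This is a hypothetical model (the tuple encoding of instructions is invented, not a real ISA) of detecting the read-after-write dependency and the reordering fix:

```python
# Sketch of the read-after-write (RAW) hazard from the example above.
# Each instruction is modelled as a tuple (dest, src1, src2).

def raw_hazard(first, second) -> bool:
    """True if `second` reads a register that `first` has not yet
    written back - the data dependency that stalls the pipeline."""
    dest, _, _ = first
    _, s1, s2 = second
    return dest in (s1, s2)

prog = [("r3", "r2", "r1"),    # ADD r3, r2, r1
        ("r5", "r4", "r3"),    # ADD r5, r4, r3  <- needs r3
        ("r10", "r9", "r8")]   # ADD r10, r9, r8 <- independent

print(raw_hazard(prog[0], prog[1]))  # True: pipeline would stall

# Swapping instructions 2 and 3 removes the adjacent dependency:
reordered = [prog[0], prog[2], prog[1]]
print(raw_hazard(reordered[0], reordered[1]))  # False
```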


💡 Branch instruction issues:
Consider the following sequence of instructions:
> ADD R3, R2, R1 // add R1 to R2 and store in R3
> ADD R6, R5, R4 // add R4 to R5 and store in R6
> JPE R3, R6, LOOP // compare R3 and R6 - if equal, jump to address LOOP
>
The issue is similar: the jump instruction needs the values in registers R3 and R6, but these are not yet known because neither instruction 1 nor instruction 2 has written its result back. This can cause the pipeline to stall.
One strategy a pipelined processor can use to deal with this is branch prediction: the processor makes a guess at the outcome of the condition. Research has shown that if the branch instruction is at the bottom of a loop, execution goes back to the start of the loop in around 90% of cases, while conditions at the start of a loop are true in about 50% of cases. The strategy is therefore to assume the condition is true in the first case and not true in the second. If the guess proves wrong, the processor must reinstate the register contents and restart the pipeline with the correct instruction.
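The 90%/50% observation above amounts to a static prediction rule: predict a backward (loop-closing) branch taken, and a forward branch not taken. A minimal sketch, with invented function and parameter names:

```python
# Static branch prediction rule sketched from the heuristic above.

def predict_taken(branch_addr: int, target_addr: int) -> bool:
    """Backward branches (target before the branch) usually close loops
    and are taken ~90% of the time, so predict taken; predict forward
    branches (conditions at the start of a loop) not taken."""
    return target_addr < branch_addr

print(predict_taken(branch_addr=120, target_addr=100))  # True: loop back
print(predict_taken(branch_addr=120, target_addr=160))  # False: skip ahead
```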

Interrupt management in CISC and RISC (pipelined) processors

When an interrupt occurs in a pipelined processor there may be 5 instructions in the pipeline. One option is to clear the pipeline of the latest 4 instructions to have entered, then apply the normal interrupt service routine (ISR) to the remaining instruction.
The other option is to build the individual units in the processor with individual program counter registers. This allows the current state to be stored for all the instructions in the pipeline while the interrupt is handled.


Parallelism
SISD : Single Instruction Single Data stream

There is only one processor executing one set of instructions on a single data set.

The only class with no capability for parallel processing (because it has only 1 processor).

Found in early computers.

E.g. used to control washing machines.

SIMD : Single Instruction Multiple Data stream

Contains multiple processors, each with its own dedicated memory.

The processor has several ALUs; each ALU executes the same instruction but on different data.

Each processor executes the same instruction using the data in its dedicated memory.

Widely used to process 3D graphics in video games (array processors, which use SIMD, are employed here).
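The SISD/SIMD distinction can be sketched conceptually in plain Python. This is an illustration only: the list comprehension stands in for several ALUs applying one ADD instruction across a whole lane of data, whereas real SIMD hardware uses vector registers and parallel ALUs.

```python
# SISD vs SIMD, sketched conceptually: the same ADD instruction applied
# to one data item at a time versus to a whole lane of data.

def sisd_add(a: list[int], b: list[int]) -> list[int]:
    """One instruction, one data item per step (scalar loop)."""
    out = []
    for x, y in zip(a, b):
        out.append(x + y)          # one scalar ADD per iteration
    return out

def simd_add(a: list[int], b: list[int]) -> list[int]:
    """One instruction applied to every element 'at once' - the
    comprehension stands in for the parallel ALUs."""
    return [x + y for x, y in zip(a, b)]

print(simd_add([1, 2, 3, 4], [10, 20, 30, 40]))  # [11, 22, 33, 44]
```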

MISD : Multiple Instruction Single Data stream

Contains multiple processors that apply different instructions to the same data.

Used to sort large quantities of data.

MIMD : Multiple Instruction Multiple Data stream

There are several processors. Each processor executes different instructions drawn from a common pool and operates on different data drawn from a common pool.

Each processor typically has its own partition within a shared memory.

Most parallel computer systems use this architecture.

Used in modern computers.

Massively Parallel Computers

Massive : has a very large number of processors.

Parallel : the processors perform a set of coordinated computations in parallel.

💡 NOTE : A massively parallel computer has multiple separate processors.

Processing units, by contrast, are part of a single processor: for example, a processor with 4 processing units is called a quad-core system, and the processing units share the same bus. There is only one processor there, not multiple separate processors.

Hardware issue: communication between the different processors, as each processor needs a link to every other processor. With many processors the number of links grows rapidly (n fully connected processors need n(n-1)/2 direct links), so the topology is challenging.

Software issue: a suitable programming language is needed that allows data to be processed by multiple processors simultaneously.

Changes are required to normal program code when it is transferred to a massively parallel computer:

1. The code must be split into blocks that can be processed simultaneously rather than sequentially.

2. Each block is processed by a different processor, which allows the many processors to process the different blocks of data independently at the same time.

3. This requires both parallelism and coordination.
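The three steps above can be sketched with Python's standard library. This is an illustrative model only: `ThreadPoolExecutor` stands in for the separate processors of a massively parallel machine, and the block split and per-block work are invented for the example.

```python
# Sketch of the three steps above: split the work into independent
# blocks, process each block on its own worker, then coordinate by
# combining the partial results.
from concurrent.futures import ThreadPoolExecutor

def split_into_blocks(data, n_blocks):
    """Step 1: partition the data so each 'processor' gets its own block."""
    size = -(-len(data) // n_blocks)   # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

def process_block(block):
    """Step 2: the per-processor work - here simply summing its block."""
    return sum(block)

data = list(range(1, 101))
blocks = split_into_blocks(data, 4)
with ThreadPoolExecutor(max_workers=4) as pool:
    partial = list(pool.map(process_block, blocks))  # blocks run independently
print(sum(partial))   # 5050 - step 3: coordination combines the results
```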

