RISC Instruction Set:: I) Data Manipulation Instructions

Development of RISC (Reduced Instruction Set Computer) architecture started as a "fresh
look at existing ideas" and as a result of examining how instructions are actually used in
real programs. A RISC architecture starts with a small set of the most frequently used instructions,
which determines the pipeline structure of the machine and enables those instructions to
execute in one cycle. One cycle per instruction is achieved by exploiting parallelism
through pipelining. In short, RISC can be characterized as a performance-oriented
architecture based on the exploitation of parallelism through pipelining.

Examples of RISC architectures include:
• SPARC (used by Sun Microsystems workstations, an outgrowth of Berkeley RISC),
• MIPS (an outgrowth of the Stanford MIPS project, used by Silicon Graphics), and
• IBM RS/6000, a superscalar implementation of RISC architecture (built on IBM's POWER
architecture, from which PowerPC was derived).

Each instruction of a RISC machine is simple and straightforward. Thus, the time required to
execute each instruction can be shortened and the number of cycles reduced. Typically the
instruction execution time is divided into five stages (machine cycles), and as soon as one stage
finishes processing an instruction, the instruction moves on to the next stage. The stage that has
just become free is then used to perform the same operation for the next instruction.
Instructions are thus executed in pipeline fashion, similar to an assembly line
in a factory. Typically those five pipeline stages are:

IF – Instruction Fetch
ID – Instruction Decode
EX – Execute
MA – Memory Access
WB – Write Back
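
As a rough illustration of why overlapping the five stages pays off, the sketch below compares cycle counts with and without pipelining. It assumes an idealized pipeline with one cycle per stage and no stalls; the function names are hypothetical and do not come from any particular RISC design.

```python
def pipelined_cycles(num_instructions: int, num_stages: int = 5) -> int:
    """Cycles needed by an ideal pipeline: fill it once, then finish one instruction per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


def unpipelined_cycles(num_instructions: int, num_stages: int = 5) -> int:
    """Cycles needed when each instruction completes every stage before the next one starts."""
    return num_stages * num_instructions


n = 100
speedup = unpipelined_cycles(n) / pipelined_cycles(n)
print(pipelined_cycles(n), unpipelined_cycles(n), round(speedup, 2))  # 104 500 4.81
```

For long instruction sequences the speedup approaches the number of stages, which is the sense in which the pipeline approaches one instruction per cycle.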

RISC Instruction Set:


The instruction set has been grouped into five categories, and the discussion of a particular
category applies to all instructions in that category.

i) Data Movement Instructions:


The only instruction in this category is “mov”. In this instruction the second source operand is
redundant and is ignored. No flag is changed, because moving the content of one register to
another register involves no arithmetic, logic or shift operation.

ii) Data Manipulation Instructions:


This category includes instructions for addition, subtraction, logical operations and shift
operations. All of these support Register-addressing and Short-immediate-addressing. All the
instructions except “inv” are two-operand instructions; “inv” takes only one operand.
For shift instructions the shift count is provided by the second operand. As shifts of more than
31 bits are meaningless, only 5 bits (<4:0>) are used as the shift count.
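
The shift-count rule can be sketched as follows. This is a minimal model assuming 32-bit registers; the function names are hypothetical, since the text does not give the shift mnemonics.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit register width


def shift_left(value: int, operand2: int) -> int:
    """Logical left shift that uses only bits <4:0> of the second operand as the count."""
    count = operand2 & 0x1F
    return (value << count) & MASK32


def shift_right_logical(value: int, operand2: int) -> int:
    """Logical right shift that uses only bits <4:0> of the second operand as the count."""
    count = operand2 & 0x1F
    return (value & MASK32) >> count


print(hex(shift_left(1, 31)))  # 0x80000000
print(hex(shift_left(1, 33)))  # 0x2, because 33 & 0x1F == 1
```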

iii) LOAD/STORE Instructions

This set of instructions is used to access memory for read and write operations. Data can be
accessed as a byte, a half word, or a full word. The width of the data being accessed is indicated
by two data-width lines: W0 and W1.

In case of load instructions the CPU always reads a full word internally. Separate load instructions are
provided for loading an unsigned byte, half word, and full word, and likewise for loading a signed byte,
half word, and word. If the data read is a byte or half word, it is
aligned and converted to either a signed or an unsigned 32-bit value, depending on whether the
access is signed or unsigned, before being written into the register. A full word, however,
is written into the specified register without any modification.
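
The conversion of a loaded byte or half word to a 32-bit register value can be sketched as below. This is an illustrative model only; the function name and the width-in-bytes parameter are assumptions, not the encoding used on the W0/W1 lines.

```python
def extend_loaded_value(raw: int, width_bytes: int, signed: bool) -> int:
    """Return the 32-bit register image for a loaded byte (1), half word (2) or word (4)."""
    if width_bytes not in (1, 2, 4):
        raise ValueError("width must be 1, 2 or 4 bytes")
    bits = width_bytes * 8
    low_mask = (1 << bits) - 1
    value = raw & low_mask                      # keep only the addressed item
    if signed and width_bytes < 4 and value & (1 << (bits - 1)):
        value |= 0xFFFFFFFF & ~low_mask         # replicate the sign bit into the upper bits
    return value


print(hex(extend_loaded_value(0x80, 1, signed=True)))   # 0xffffff80
print(hex(extend_loaded_value(0x80, 1, signed=False)))  # 0x80
```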

For write access to memory, separate instructions are provided for storing a byte, a half word,
and a full word. The number of bytes to be written is determined by the instruction being
executed. For example, the storb instruction writes only one byte. The bank(s) in
which the byte(s) are to be written are selected by the W0 and W1 lines together with the A0 and A1
lines of the address bus. The width code lines indicate the width of the item to be written. The store
instructions support Register-Relative Addressing mode, where the effective address is calculated as:
eff-address = [RS1] + 0.
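
One way to picture the bank selection is the sketch below. It assumes four byte-wide memory banks indexed by address bits A1 and A0, naturally aligned accesses, and a width given in bytes rather than as the actual W0/W1 code; all of these are assumptions made for illustration.

```python
def selected_banks(address: int, width_bytes: int) -> list[int]:
    """Return the byte banks (0..3) written by a store of the given width."""
    if width_bytes not in (1, 2, 4):
        raise ValueError("width must be 1, 2 or 4 bytes")
    if address % width_bytes != 0:
        raise ValueError("access must be naturally aligned")
    first_bank = address & 0b11                 # address bits A1 and A0
    return list(range(first_bank, first_bank + width_bytes))


print(selected_banks(0x1003, 1))  # [3]
print(selected_banks(0x1002, 2))  # [2, 3]
print(selected_banks(0x1000, 4))  # [0, 1, 2, 3]
```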

iv) Control Transfer Instructions


Conditional jumps, unconditional jumps, and procedure calls constitute this group. This set of
instructions is used to translate the decision boxes of a program's flow chart. Two types of
control transfer instructions are provided: the first type supports Register-Relative
Addressing mode, whereas the second type supports PC-Relative Addressing. If the condition
evaluates to TRUE, the jump takes place delayed by one cycle. The instruction slot following
the control transfer instruction is called the delay slot of that instruction, and the
instruction in the delay slot is always executed because the control transfer is delayed.
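
The effect of the delay slot can be seen in a toy interpreter. This is a sketch under simplifying assumptions: instructions are addressed by list index, only an unconditional "jump" and a register "set" exist, and both operation names are hypothetical.

```python
def run(program, max_steps=100):
    """Execute a list of ("set", reg, value) / ("jump", target) tuples with a one-slot delayed jump."""
    regs = {}
    trace = []              # indices of the instructions actually executed
    pc = 0
    pending_target = None   # target of a jump whose delay slot has not run yet
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # The current instruction is the delay slot; the jump takes effect after it.
            next_pc = pending_target
            pending_target = None
        if op[0] == "set":
            regs[op[1]] = op[2]
        elif op[0] == "jump":
            pending_target = op[1]
        pc = next_pc
    return regs, trace


program = [
    ("jump", 3),        # 0: control transfer
    ("set", "r1", 1),   # 1: delay slot, always executed
    ("set", "r2", 2),   # 2: skipped by the jump
    ("set", "r3", 3),   # 3: jump target
]
regs, trace = run(program)
print(trace)  # [0, 1, 3]
print(regs)   # {'r1': 1, 'r3': 3}
```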

The call instruction differs from the other instructions of this group in one respect: it saves the
address of the instruction executed in the delay slot into the specified register, to be
used as a return address. Thus, to get the correct return address, the address saved by “call”
should be incremented by 04h. This can be done in the control transfer instruction
used to return from the procedure or function call.
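
The return-address arithmetic can be written out as a small sketch. It assumes 4-byte instructions (consistent with the 04h increment) and that the delay slot sits immediately after the call; the function names are hypothetical.

```python
INSTRUCTION_BYTES = 0x04  # the increment named in the text


def address_saved_by_call(call_address: int) -> int:
    """Address written to the link register: that of the delay-slot instruction (assumed at call + 4)."""
    return call_address + INSTRUCTION_BYTES


def return_target(saved_address: int) -> int:
    """Address at which execution resumes: the instruction after the delay slot."""
    return saved_address + INSTRUCTION_BYTES


saved = address_saved_by_call(0x100)
print(hex(saved), hex(return_target(saved)))  # 0x104 0x108
```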

v) Miscellaneous Instructions

Two types of instructions are included in this group. The instructions “reti”, “getlpc” and “putpsw”
are privileged instructions and can be executed only if the P flag is 0. The non-privileged group
comprises the “getpsw” and “nop” instructions.

The “reti” instruction is used to return from an interrupt handling routine. It restores the previous
system operation mode (P flag) and loads the PC with the content of the specified register. The
control transfer is delayed by one cycle.

The getlpc instruction must be the first instruction of any interrupt handling routine. It moves the
content of LSTPC (the address of the interrupted instruction) to the specified register, which will
be used to restart the interrupted instruction.

The putpsw instruction is used to change the content of the flags. Changing the flags C, S, Z, and V
is meaningless because these flags change dynamically and their content cannot be predicted
without actually analyzing the program. The result of the instruction takes effect only after the
end of the execution cycle.

The getpsw instruction copies the content of the flags into the specified register. The nop instruction
does nothing and is generally used to fill the delay slot if the compiler is unable to fill it with a
meaningful instruction. Sometimes it is used to introduce a calculated amount of delay in a program.
No instruction of this group except putpsw changes the flags.

VLSI Implementation of RISC Instructions:

A RISC processor is designed with a load/store architecture, meaning that all operations are performed on
operands held in the processor registers, and main memory can only be accessed through the load and
store instructions.
RISC instructions are executed using the technique called pipelining. The pipeline of this
implementation has four stages:

1) Instruction fetch cycle (IF): Send the program counter (PC) to memory and fetch the
current instruction from memory. Update the PC to the next sequential PC by adding one to the PC.

2) Instruction decode/register fetch cycle (ID): Decode the instruction and read the registers
corresponding to the register source specifiers from the register file.

3) Execution (EX): The ALU operates on the operands prepared in the prior cycle, performing one
of three functions depending on the instruction type.

4) Store result (ST): Write the result into the register file, whether it comes from the memory system
(for a load) or from the ALU (for an ALU instruction).

Main modules
Let us consider the modules of the RISC processor. The main parts of the processor are shown in the
figure and are explained below:


Control Unit:
It manages the sequence and timing of events carried out within the processor. The control unit of the
RISC processor examines the instruction opcode bits and decodes the instruction to generate eight control
signals.

Registers:
Registers hold the values used in internal operations, such as the address of the instruction being
executed and the data being processed, e.g. the Program Counter register and the Status Register.

Separate program and data memory:


The program memory, also called ROM, contains the instructions of the processor. Each instruction
is 9 bits wide and there are 8 instructions. The data memory, also called RAM, is
temporary memory used to store the data values needed by the processor and the data values produced
by the processor after processing.

Load and store:


The processor communicates with memory only through load and store instructions. A load instruction
loads a data value from memory into a register, and a store instruction stores a value from a
register into RAM.

Arithmetic Logic Unit (ALU):


The arithmetic/logic unit (ALU) executes all arithmetic and logical operations. Arithmetic operations
take two registers as operands, and the result is stored in a third register.
The ALU can perform arithmetic operations such as addition and subtraction, and Boolean logical
operations such as AND, OR, XOR, NAND, NOR and NOT.
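
The operations listed above can be modelled with a small dispatch table. This is a behavioural sketch only: the 32-bit word width and the operation names used as dictionary keys are assumptions, not the processor's actual opcodes.

```python
MASK32 = 0xFFFFFFFF  # assumed word width

ALU_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
    "nand": lambda a, b: ~(a & b),
    "nor": lambda a, b: ~(a | b),
    "not": lambda a, b: ~a,       # single-operand: b is ignored
}


def alu(op: str, a: int, b: int = 0) -> int:
    """Apply the named operation and truncate the result to the word width."""
    return ALU_OPS[op](a, b) & MASK32


print(hex(alu("add", 0xFFFFFFFF, 1)))  # 0x0 (wraps around)
print(hex(alu("nand", 0xF0, 0x3C)))    # 0xffffffcf
```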

Barrel Shifter:
A barrel shifter is a digital circuit that can shift a data word by a specified number of bits in one clock
cycle. It can be implemented as a sequence of multiplexers; in such an implementation the output of
one multiplexer is connected to the input of the next in a way that depends on the shift distance. For
example, take a four-bit barrel shifter with inputs A, B, C and D. The shifter can cycle the order of the
bits ABCD as DABC, CDAB, or BCDA; in this case, no bits are lost. That is, it can rotate all of the outputs
up to three positions to the right. The barrel shifter has a variety of applications, including being a useful
component in microprocessors (alongside the ALU). A barrel shifter is a combinational logic circuit with
n data inputs, n data outputs, and a set of control inputs that specify how to shift the data between input
and output. A barrel shifter that is part of a microprocessor CPU can typically specify the direction of
shift, the type of shift and the amount of shift.
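
The multiplexer-stage structure can be sketched in software as follows. Each stage rotates by a power of two and is enabled by one bit of the shift amount; this models a rotate-right barrel shifter whose width is a power of two, and the function names are hypothetical.

```python
def rotate_right_stage(bits, distance):
    """One multiplexer layer: each output takes the input `distance` positions to its left."""
    n = len(bits)
    return [bits[(i - distance) % n] for i in range(n)]


def barrel_rotate_right(bits, amount):
    """Chain log2(n) stages; stage k rotates by 2**k when bit k of `amount` is set."""
    n = len(bits)
    stage = 0
    while (1 << stage) < n:
        if amount & (1 << stage):
            bits = rotate_right_stage(bits, 1 << stage)
        stage += 1
    return bits


for amount in range(4):
    print(amount, "".join(barrel_rotate_right(list("ABCD"), amount)))
# 0 ABCD
# 1 DABC
# 2 CDAB
# 3 BCDA
```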

Booth’s Multiplier:
Booth's multiplication algorithm is a multiplication algorithm that multiplies two signed binary numbers
in two's complement notation. Booth used desk calculators that were faster at shifting than adding and
created the algorithm to increase their speed. The area and speed of a multiplier are important design
issues: an increase in speed results in larger area consumption, and vice versa. Multipliers play a vital
role in most high-performance systems. The performance of a system depends to a great extent on the
performance of its multiplier, so multipliers should be fast and consume little area and hardware. For
this reason a single multiplier based on Booth's algorithm is used. The two main advantages of this
algorithm are speed and the ability to do signed multiplication (using two's complement) without any
extra conversions.
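
A minimal software sketch of radix-2 Booth multiplication is given below. It follows the textbook accumulator / multiplier / extra-bit formulation rather than any specific hardware described here; the accumulator is modelled as an unbounded signed Python integer, which corresponds to giving the hardware accumulator a guard bit, and the function name is hypothetical.

```python
def booth_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Multiply two signed `bits`-wide integers with Booth's radix-2 recoding."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not (low <= multiplicand <= high and low <= multiplier <= high):
        raise ValueError("operands must fit in the given signed width")
    mask = (1 << bits) - 1
    a = 0                      # accumulator: upper half of the product (signed)
    q = multiplier & mask      # multiplier register: becomes the lower half of the product
    q_minus1 = 0               # extra bit to the right of q
    for _ in range(bits):
        pair = (q & 1, q_minus1)
        if pair == (1, 0):     # start of a run of ones: subtract the multiplicand
            a -= multiplicand
        elif pair == (0, 1):   # end of a run of ones: add the multiplicand
            a += multiplicand
        # Arithmetic shift right of the combined a:q:q_minus1 register.
        q_minus1 = q & 1
        q = (q >> 1) | ((a & 1) << (bits - 1))
        a >>= 1                # Python's >> on negative ints is an arithmetic shift
    return (a << bits) | q


print(booth_multiply(3, -4, bits=4))   # -12
print(booth_multiply(-25, 13))         # -325
```

Only additions, subtractions and shifts are used, and negative operands need no separate sign handling.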

Pipelining

The basic concept of pipelining is to break up instruction execution activities into stages that can
operate independently. Every instruction passes through the same stages much like an assembly
line.

For example, we could set up the following stages for a MIPS pipeline.

IF - instruction fetch and PC increment
ID - source register fetch and instruction decode
EX - ALU source selection, ALU operation, and branch target calculation
MEM - data memory access
WB - write back to destination register

With these pipeline stages, a sequence of instructions can be executed as shown below. Time
progresses from left to right. Each horizontal division represents one clock period.
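
A timing diagram of this kind can be generated with a short script. This is a sketch assuming an ideal pipeline with no stalls; the function name and the text layout are illustrative choices.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(num_instructions: int) -> list[list[str]]:
    """Return one row per instruction; column c holds the stage occupied in clock period c + 1."""
    total_cycles = len(STAGES) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        cells = [""] * total_cycles
        for s, name in enumerate(STAGES):
            cells[i + s] = name        # instruction i enters the pipeline one cycle after i - 1
        rows.append(cells)
    return rows


for number, row in enumerate(pipeline_diagram(3), start=1):
    print(f"instr {number}: " + " ".join(f"{cell:<3}" for cell in row))
```

Each row is shifted one clock period to the right of the row above it, so in the middle columns several instructions occupy different stages at the same time.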

Advantages of Pipelining

1. The cycle time of the processor is reduced.
2. It increases the throughput of the system.
3. It makes the system reliable.

Disadvantages of Pipelining

1. The design of a pipelined processor is complex, and it is costly to manufacture.
2. Instruction latency increases.

Pipeline Hazards:

• Structural hazards:

• Structural hazards are those that occur because of resource conflicts.

• Example 1:
o For cost-saving reasons, a CPU may be designed with a single interface to
memory.

o This interface is always used during IF.

o It is also used during MEM for Load or Store operations.

o When a Load or Store gets to the MEM stage, the instruction in the IF stage must
be stalled.
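
The cost of the single memory interface in Example 1 can be estimated with a small scheduling sketch. It assumes the five-stage pipeline above (MEM three cycles after IF), no other hazards, and a program described only by hypothetical operation kinds such as "load", "store" and "add".

```python
MEM_OFFSET = 3        # MEM is reached three cycles after IF
PIPELINE_DEPTH = 5


def fetch_cycles(program):
    """Return the clock cycle in which each instruction is fetched, stalling IF when MEM uses the port."""
    mem_busy = set()          # cycles in which a load/store occupies the memory interface
    cycles = []
    cycle = 0
    for op in program:
        cycle += 1
        while cycle in mem_busy:
            cycle += 1        # bubble: the fetch waits one cycle
        cycles.append(cycle)
        if op in ("load", "store"):
            mem_busy.add(cycle + MEM_OFFSET)
    return cycles


def total_cycles(program):
    """Cycle in which the last instruction leaves the pipeline."""
    if not program:
        return 0
    return fetch_cycles(program)[-1] + PIPELINE_DEPTH - 1


program = ["load", "add", "add", "add", "add"]
print(fetch_cycles(program))   # [1, 2, 3, 5, 6]
print(total_cycles(program))   # 10, versus 9 with no structural hazard
```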

Data Hazards

• Pipelining changes the relative timing of instructions by overlapping them in time.

• This introduces possible hazards by reordering accesses:
• to the register file (data hazards);
• to the program counter (control hazards).
o Consider the code:

o All of the instructions after ADD use the result of the ADD instruction.

o Since the standard DLX pipeline waits until WB to write the value back, the SUB,
AND and OR instructions read the wrong value.

o Also, the error may not be deterministic if an interrupt occurs between the ADD
and the AND, which would allow the ADD to write its result.
• For example, consider:

• Stalling is necessary in this case for proper execution.

o This is done with a pipeline interlock, which stalls the pipeline until the hazard is
cleared.

o This inserts a bubble into the pipeline just as the structural hazard did.

o Just as with structural hazards, no instructions are started during the cycle in
which the bubble is inserted.
o This increases the number of cycles required and thus the CPI.
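
The read-after-write hazard described above can be detected mechanically. The sketch below assumes a five-stage pipeline without forwarding in which a result written in WB is visible only to instructions at least four positions later, which matches the statement that SUB, AND and OR read the wrong value. The instruction sequence and register numbers are assumed for illustration, since the original listing is not reproduced here.

```python
HAZARD_WINDOW = 3   # instructions 1, 2 and 3 positions after the producer read too early


def raw_hazards(program):
    """Return (producer_index, consumer_index, register) for every read-after-write hazard.

    `program` is a list of (mnemonic, destination, source1, source2) tuples.
    """
    hazards = []
    for j, (_, _, *sources) in enumerate(program):
        for i in range(max(0, j - HAZARD_WINDOW), j):
            dest = program[i][1]
            if dest in sources:
                hazards.append((i, j, dest))
    return hazards


# Assumed example sequence: every later instruction reads R1, which ADD produces.
program = [
    ("ADD", "R1", "R2", "R3"),
    ("SUB", "R4", "R1", "R5"),
    ("AND", "R6", "R1", "R7"),
    ("OR", "R8", "R1", "R9"),
    ("XOR", "R10", "R1", "R11"),
]
for producer, consumer, reg in raw_hazards(program):
    print(f"{program[consumer][0]} reads {reg} before {program[producer][0]} writes it")
```

An interlock would stall each flagged consumer until the producer has completed WB; XOR is far enough behind the ADD that it is not flagged.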
