
Chapter 12

Instructions sets: Characteristics and functions.

Machine instruction characteristics:


• The operation of the CPU is determined by the instructions it executes, referred to as
machine instructions or computer instructions.
• The collection of different instructions that the CPU can execute is referred to as the
CPU’s instruction set.
• Each instruction must contain the information required by the CPU for execution.

Elements of a machine instruction:

• Operation code: Specifies the operation to be performed. The operation is specified by a
binary code, known as the operation code, or opcode.

• Source operand reference: The operation may involve one or more source operands,
that is, operands that are inputs for the operation.

• Result operand reference: The operation may produce a result.

• Next instruction reference: This tells the processor where to fetch the next instruction
after the execution of this instruction is complete.

Source and result operands can be in one of four areas:

• Main or virtual memory: As with the next instruction references, the main or virtual
memory address must be supplied.

• I/O device: The instruction must specify the I/O module and device for the operation. If
memory mapped I/O is used, this is just another main or virtual memory address.

• Processor register: A processor contains one or more registers that may be referenced
by a machine instruction. If more than one register exists, each register is assigned a unique
name or number, and the instruction must contain the number of the desired register.

• Immediate: the value of the operand is contained in a field in the instruction being
executed.

Instruction representation:

• Within the computer each instruction is represented by a sequence of bits.


• The instruction is divided into fields, corresponding to the constituent elements of the
instruction.
• Opcodes are represented by abbreviations called mnemonics.
• Operands are also represented symbolically.
• Each symbolic opcode has a fixed binary representation, and the programmer specifies the
location of each symbolic operand.
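As a concrete illustration, the field layout can be sketched in Python. The 16-bit format, the opcode values, and the mnemonic table below are all made up for the example; real instruction formats vary widely.

```python
# A hypothetical 16-bit format: 4-bit opcode, then two 6-bit operand fields.
MNEMONICS = {0b0001: "ADD", 0b0010: "SUB"}   # assumed opcode assignments

def decode(word):
    """Split a 16-bit instruction word into its constituent fields."""
    opcode = (word >> 12) & 0xF
    op1 = (word >> 6) & 0x3F
    op2 = word & 0x3F
    return MNEMONICS[opcode], op1, op2

word = (0b0001 << 12) | (3 << 6) | 5   # "ADD R3, R5" under this format
print(decode(word))  # -> ('ADD', 3, 5)
```

An assembler performs the reverse mapping, from mnemonic and symbolic operands to the binary fields.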

1. Number of Addresses per Instruction: This refers to how many operand references
(addresses) an instruction must carry for the CPU to know what to do next.

2. Fewer Addresses, Simpler Instructions: If instructions need fewer addresses, they're simpler
and the CPU doesn't need to be as complex. Plus, shorter instructions take up less space in
memory.

3. More Addresses, More Complex Programs: However, if instructions need more addresses,
it might mean the program needs more instructions overall. This can make programs longer and
more complex, potentially taking more time to run.

4. The Accumulator vs. Multiple Registers: With fewer addresses (like one-address
instructions), programmers typically only have one main place to store data called the
accumulator. But with more addresses (like two- or three-address instructions), there are usually
multiple places to store data called registers. Using registers is faster than using memory, so
programs run quicker when they use registers more often.

5. Modern Machines Use a Mix: Most modern computers use a mix of instructions that need
two or three addresses. This gives programmers more flexibility to use multiple registers, which
speeds up execution.

6. Other Factors Add Complexity: There are other things to consider, like whether an address
points to memory or a register. Since there are fewer registers, it takes fewer bits to say which
register we're talking about. Also, different types of instructions might need different ways of
pointing to data, which adds complexity to the design.

So, the decision about how many addresses an instruction needs is a balancing act between
simplicity, speed, and flexibility.

Example:

Explanation:
1. Maximum Addresses in an Instruction: When designing a processor, one thing to
consider is how many pieces of information an instruction needs to work correctly. For
arithmetic and logic operations, we typically need information about the source operands (the
numbers we're working with) and the destination operand (where we'll put the result). After
completing the instruction, we also need to know where to find the next instruction to run. So,
theoretically, we could need up to four addresses in an instruction.
2. Address Counts in Real Instructions: In most processors, instructions usually have
fewer addresses: one, two, or three. The address of the next instruction is usually implicit,
supplied by the program counter rather than stated in the instruction. Some special
instructions, such as the ARM load/store multiple instructions, can reference many more
operands (up to 17 registers).
3. Comparison of Different Address Counts: In a visual comparison, you can see how
instructions with different address counts handle a simple calculation like Y = (A – B)/[C + (D *
E)]. With more addresses, you can specify separate source and destination locations for each
operation, which simplifies things but requires longer instructions. With fewer addresses,
instructions need to share addresses, which can lead to more instructions to accomplish the same
task.
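The effect of the address count on program length can be sketched in Python. The mnemonics in the comments (SUB, MPY, LOAD, STOR) and the sample operand values are illustrative, in the spirit of the comparison above, not a real instruction set:

```python
A, B, C, D, E = 20.0, 4.0, 2.0, 3.0, 2.0   # made-up sample values

# Three-address style: two sources and a destination per instruction.
R1 = A - B        # SUB R1, A, B
R2 = D * E        # MPY R2, D, E
R2 = C + R2       # ADD R2, C, R2
Y3 = R1 / R2      # DIV Y, R1, R2

# One-address (accumulator) style: one explicit address per instruction;
# the accumulator AC is the implicit second operand and destination.
AC = D            # LOAD D
AC = AC * E       # MPY E
AC = AC + C       # ADD C
T = AC            # STOR T   (extra temporary needed -> more instructions)
AC = A            # LOAD A
AC = AC - B       # SUB B
AC = AC / T       # DIV T
Y1 = AC           # STOR Y

print(Y3, Y1)  # -> 2.0 2.0
```

Four instructions suffice with three addresses; the accumulator version needs eight, including a store and reload of a temporary.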
4. Zero-Address Instructions: Some instructions don't need any addresses at all. These are
useful for special memory setups called stacks, where data is organized in a last-in-first-out
manner. Instead of explicitly stating addresses, these instructions typically work with the top
elements of the stack.
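A zero-address stack machine can be sketched in Python; the instruction tuples and memory layout below are invented for the illustration:

```python
def run(program, mem):
    """Tiny stack machine: zero-address arithmetic, one-address PUSH/POP."""
    stack = []
    for op, *arg in program:
        if op == "PUSH":
            stack.append(mem[arg[0]])
        elif op == "POP":
            mem[arg[0]] = stack.pop()
        else:                          # operands come from the top of the stack
            b, a = stack.pop(), stack.pop()
            stack.append({"ADD": a + b, "SUB": a - b,
                          "MUL": a * b, "DIV": a / b}[op])
    return mem

# Y = (A - B) / (C + D * E), written out in postfix order
prog = [("PUSH", "A"), ("PUSH", "B"), ("SUB",),
        ("PUSH", "C"), ("PUSH", "D"), ("PUSH", "E"), ("MUL",),
        ("ADD",), ("DIV",), ("POP", "Y")]
mem = {"A": 20.0, "B": 4.0, "C": 2.0, "D": 3.0, "E": 2.0}
print(run(prog, mem)["Y"])  # -> 2.0
```

Only PUSH and POP carry an address; the arithmetic instructions carry none, because their operands are implicitly the top two stack elements.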

Types of operands:

• Addresses.
• Numbers.
• Characters.
• Logical data.

Numbers:
All machine languages include numeric data types.

Numbers stored in a computer are limited:


• Limit to the magnitude of numbers representable on a machine.
• In the case of floating-point numbers, a limit to their precision.

Three types of numerical data are common in computers:


• Binary integer or binary fixed point.
• Binary floating point
• Decimal: Although all internal computer operations are binary in nature, the human users
of the system deal with decimal numbers. Thus, there is a necessity to convert from decimal to
binary on input and from binary to decimal on output. For applications in which there is a great
deal of I/O and comparatively little, comparatively simple computation, it is preferable to store
and operate on the numbers in decimal form. The most common representation for this purpose
is packed decimal.
Packed decimal:
• Each decimal digit is represented by a 4-bit code in the obvious way, with two digits stored
per byte. Thus, 0 = 0000, 1 = 0001, …, 8 = 1000, and 9 = 1001. Note that this is a rather
inefficient code, because only 10 of the 16 possible 4-bit values are used.

• To form numbers, 4-bit codes are strung together, usually in multiples of 8 bits. Thus, the
code for 246 is 0000 0010 0100 0110. This code is clearly less compact than a straight binary
representation, but it avoids the conversion overhead. Negative numbers can be represented by
including a 4-bit sign digit at either the left or right end of a string of packed decimal digits.
Standard sign values are 1100 for positive (+) and 1101 for negative (-).
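The packing rule above can be sketched in Python (the function name and the space-separated output format are choices made for readability):

```python
def pack_decimal(n, signed=False):
    """Pack a decimal integer: one 4-bit code per digit, two per byte."""
    nibbles = [int(d) for d in str(abs(n))]
    if signed:                        # trailing sign digit: 1100 = +, 1101 = -
        nibbles.append(0b1100 if n >= 0 else 0b1101)
    if len(nibbles) % 2:              # pad on the left to fill whole bytes
        nibbles.insert(0, 0)
    return " ".join(f"{d:04b}" for d in nibbles)

print(pack_decimal(246))              # -> 0000 0010 0100 0110
print(pack_decimal(-7, signed=True))  # -> 0111 1101
```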

Characters:

• A common form of data is text or character strings.

• Textual data in character form cannot be easily stored or transmitted by data processing
and communications systems because they are designed for binary data.

• The most commonly used character code is the International Reference Alphabet (IRA),
referred to in the United States as ASCII (American Standard Code for Information Interchange).

• Another code used to encode characters is the Extended Binary Coded Decimal
Interchange Code (EBCDIC) and it is used on IBM mainframes.

1. Character Representation in Computers: Computers primarily deal with binary data,
but we often need to work with text, which humans understand better. To bridge this gap,
character codes are used to represent characters as sequences of bits.
2. ASCII - International Reference Alphabet (IRA): ASCII is one of the most common
character codes. Each character is represented by a unique 7-bit pattern, allowing for 128
different characters. However, ASCII characters are usually stored and transmitted using 8 bits
per character. The extra bit can be used for error detection (parity bit).
3. Control Characters in ASCII: Some of the patterns in ASCII represent control
characters, which are used for things like controlling printing or communication procedures.
These characters don't have a visible representation but are crucial for managing data.
4. Packed Decimal Representation: In ASCII, the digits 0 through 9 are represented by
their binary equivalents (0000 through 1001) in the rightmost 4 bits of the pattern. This same
pattern is used in packed decimal representation, simplifying conversion between the two.
5. EBCDIC - Extended Binary Coded Decimal Interchange Code: EBCDIC is another
character code used on IBM mainframes. It's an 8-bit code, and like ASCII, it's compatible with
packed decimal. In EBCDIC, the codes 11110000 through 11111001 represent the digits 0
through 9.
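The shared low-nibble layout can be verified in Python; the digit codes checked below follow directly from the ASCII and EBCDIC assignments described above:

```python
for ch in "0123456789":
    ira = ord(ch)                       # IRA/ASCII code, e.g. '6' = 0110110
    ebcdic = 0b11110000 | (ira & 0x0F)  # EBCDIC digits are 1111xxxx
    assert ira & 0x0F == int(ch)        # rightmost 4 bits = BCD digit value
    assert ebcdic - 0b11110000 == int(ch)

print(f"{ord('6'):07b} -> BCD digit {ord('6') & 0x0F}")  # 0110110 -> BCD digit 6
```

Because the digit value sits in the low nibble of both codes, converting character digits to packed decimal is a simple masking operation.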

Logical data:
An n-bit unit consisting of n 1-bit items, each item having the value 0 or 1.
Two advantages to bit-oriented view:

• Memory can be used most efficiently for storing an array of Boolean or binary data items
in which each item can take on only the values 1(true) or 0(false).

• Bits can be manipulated directly: if floating-point operations are implemented in software,
we need to be able to shift significant bits in some operations, and to convert from IRA to
packed decimal, we need to extract the rightmost 4 bits of each byte.

Note that, in the preceding examples, the same data are treated sometimes as logical and other
times as numerical or text. The “type” of a unit of data is determined by the operation being
performed on it. While this is not normally the case in high-level languages, it is almost always
the case with machine language.
General-Purpose Registers:
EAX (Extended Accumulator): Used for arithmetic operations, function return values, and
storing results.
EBX (Extended Base Register): Often used as a pointer to data in the data section of memory.
ECX (Extended Counter Register): Used as a loop counter in string and loop operations.
EDX (Extended Data Register): Used in arithmetic operations, I/O operations, and storing
results.
ESI (Extended Source Index): Used as a source index for string operations.
EDI (Extended Destination Index): Used as a destination index for string operations.
ESP (Extended Stack Pointer): Points to the top of the stack. Used for stack operations, function
calls, and managing local variables.
EBP (Extended Base Pointer): Often used as a base pointer for accessing parameters and local
variables on the stack.
Segment Registers:
CS (Code Segment): Points to the segment of memory containing the currently executing code.
DS (Data Segment): Points to the segment of memory containing data accessed by the program.
ES (Extra Segment): Additional segment register for data operations.
SS (Stack Segment): Points to the segment of memory containing the stack.
FS, GS: Additional segment registers introduced in later x86 architectures for additional memory
addressing capabilities.
Index Registers:
ESI, EDI: These registers are often used as index registers for array operations, string operations,
and memory copying.
Pointer Registers:
EBP, ESP: These registers are used for managing the stack and accessing parameters and local
variables within procedures.
Instruction Pointer Register:
EIP (Extended Instruction Pointer): Points to the memory address of the next instruction to be
executed.
Flags Register:
EFLAGS: Contains status flags that reflect the outcome of arithmetic and logical operations,
control the execution flow, and manage system behavior.
Control Registers (in Protected Mode):
CR0, CR2, CR3: Control registers used for managing paging, protection, and other system-level
features in protected mode.
Debug Registers (in Debugging Mode):
DR0, DR1, DR2, DR3: Debug registers used to set breakpoints when debugging.
Figure 12.4 illustrates the x86 numerical data types. The signed integers are in two's complement
representation and may be 16, 32, or 64 bits long. The floating-point type actually refers to a set
of types that are used by the floating-point unit and operated on by floating-point instructions.
The three floating-point representations conform to the IEEE 754 standard.
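Python's standard struct module can expose the IEEE 754 single-precision bit layout; the value -0.75 is an arbitrary example:

```python
import struct

# Pack -0.75 as a little-endian 32-bit float, then reinterpret as an integer
# to see the raw bits: 1 sign bit, 8 exponent bits, 23 fraction bits.
bits = struct.unpack("<I", struct.pack("<f", -0.75))[0]
print(f"{bits:032b}")  # -> 10111111010000000000000000000000
# sign = 1, exponent = 01111110 (126, i.e. -1 biased by 127),
# fraction = 1000... (significand 1.5), so the value is -1.5 * 2**-1 = -0.75
```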
Data Transfer Instructions:
These instructions are used to move data between registers, memory locations, or between
registers and memory.
Examples include mov, push, pop, lea, etc.
Arithmetic Instructions:
Arithmetic instructions perform basic arithmetic operations such as addition, subtraction,
multiplication, and division.
Examples include add, sub, mul, div, inc, dec, etc.
Logical Instructions:
Logical instructions perform bitwise logical operations such as AND, OR, XOR, and NOT.
Examples include and, or, xor, not, etc.
Control Transfer Instructions:
Control transfer instructions are used to change the flow of program execution. They include
unconditional and conditional jumps, calls, and returns.
Examples include jmp, call, ret, je, jne, jg, ja, etc.
Comparison Instructions:
These instructions are used to perform comparisons between data values.
Examples include cmp, test, cmov, etc.
String Instructions:
String instructions are used for manipulating strings of characters or bytes in memory.
Examples include movs, cmps, scas, lods, stos, etc.
Input/Output Instructions:
Input/output instructions facilitate communication between the processor and external devices
such as keyboards, displays, disks, etc.
Examples include in, out, int, etc.
Floating-Point Instructions:
These instructions perform floating-point arithmetic operations and manipulations.
Examples include fadd, fsub, fmul, fdiv, fcomp, fsqrt, etc.
System Instructions:
System instructions interact with the operating system or manage system resources.
Examples include halt, iret, cli, sti, etc.

Data transfer:

The most fundamental type of machine instruction, for which we must specify:


• The location of the source and destination operands.
• The length of the data to be transferred.
• The mode of addressing for each operand.
Each location could be memory, a register, or the top of the stack.

Arithmetic:

• Most machines provide the basic arithmetic operations of add, subtract, multiply and
divide.
• These are provided for signed integer (fixed point) numbers.
• Often, they are also provided for floating-point and packed decimal numbers.
• Other possible operations include a variety of single-operand instructions: Absolute,
Negate, Increment, Decrement.

The execution of an arithmetic instruction may involve data transfer operations to position
operands for input to the ALU, and to deliver the output of the ALU. Figure 3.5 illustrates the
movements involved in both data transfer and arithmetic operations. In addition, of course, the
ALU portion of the processor performs the desired operation.
1. Logical Shifts: Machines have functions for shifting and rotating bits within a word. In a
logical shift, the bits of a word are moved left or right. When a bit is shifted out, it's lost, and a 0
is shifted in from the other end. Logical shifts are often used to isolate specific parts (fields) of a
word.
2. Example: Sending Characters to an I/O Device: Imagine we want to send characters
from memory to an I/O device, one character at a time. If each memory word is 16 bits and holds
two characters, we need to unpack them for transmission.
• Sending the Left Character:
1. Load the word into a register.
2. Shift the bits to the right eight times. This moves the left character to the right half of the
register.
3. Send the data through I/O. The I/O module reads the lower 8 bits from the register,
containing the left character.
• Sending the Right Character:
1. Load the word into the register again.
2. Perform a bitwise AND operation with 0000000011111111. This masks out (keeps) the
right character, ignoring the left.
3. Send the data through I/O.
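The unpacking sequence above can be sketched in Python; the send function is a stand-in for the I/O transfer, and the packed word is invented:

```python
def send(byte):
    """Stand-in for handing one 8-bit character to the I/O module."""
    print(chr(byte), end="")

word = (ord("H") << 8) | ord("i")  # a 16-bit word packing two characters

send((word >> 8) & 0xFF)  # left character: shift right eight bit positions
send(word & 0xFF)         # right character: AND with 0000000011111111
print()                   # prints "Hi"
```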

1. Arithmetic Shift: This operation treats data as signed integers and preserves the sign bit
while shifting other bits. When you perform a right arithmetic shift, the sign bit is copied into the
bit position to its right. In a left arithmetic shift, all bits except the sign bit are shifted left, and
the sign bit remains unchanged. These operations are useful for speeding up certain arithmetic
operations.
2. Effects of Arithmetic Shifts: In two's complement notation, a right arithmetic shift is
like dividing by 2, rounding down for odd numbers. Both types of left shifts (arithmetic and
logical) are like multiplying by 2 when there's no overflow. However, if overflow happens,
arithmetic and logical left shifts produce different results. The arithmetic left shift keeps the sign
of the number.
3. Processor Implementations: Some processors include arithmetic shift instructions,
while others don't due to the potential for overflow. For example, PowerPC and Itanium don't
include this instruction, while the IBM ESA/390 does. Interestingly, x86 includes an arithmetic
left shift, but it's defined to be identical to a logical left shift.
4. Rotate Operations: Rotate, or cyclic shift, operations keep all bits intact. One common
use is to bring each bit successively into the leftmost bit, where it can be identified by testing the
sign of the data treated as a number.
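Arithmetic shifts and rotates can be sketched in Python for an assumed 8-bit word (the helper names asr and rol are invented):

```python
WIDTH, MASK = 8, 0xFF

def asr(x, n=1):
    """Arithmetic shift right: the sign bit is copied into the vacated bit."""
    sign = x & (1 << (WIDTH - 1))
    for _ in range(n):
        x = ((x >> 1) | sign) & MASK
    return x

def rol(x, n=1):
    """Rotate left (cyclic shift): no bits are lost."""
    n %= WIDTH
    return ((x << n) | (x >> (WIDTH - n))) & MASK

# 11111010 is -6 in 8-bit two's complement; one ASR gives -3 (divide by 2)
print(f"{asr(0b11111010):08b}")  # -> 11111101
# 11111001 is -7; ASR gives -4, i.e. division by 2 rounded downward
print(f"{asr(0b11111001):08b}")  # -> 11111100
print(f"{rol(0b10000001):08b}")  # -> 00000011
```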

Input/Output:

Variety of approaches taken:


• Isolated programmed I/O
• Memory-mapped programmed I/O
• DMA
• Use of an I/O processor

Many implementations provide only a few I/O instructions with the specific actions specified by
parameters, codes, or command words.

System control:

Instructions that can be executed only while the processor is in a certain privileged state or is
executing a program in a special privileged area of memory.

Typically, these instructions are reserved for the use of the operating system.

Examples of system control operations:


• A system control instruction may read or alter a control register.
• An instruction to read or modify a storage protection key.
• Access to process control blocks in a multiprogramming system.

Transfer of control:

Reasons why transfer of control operations are required:


• It is essential to be able to execute some instructions more than once, as in loops.
• Virtually all programs involve some decision making.
• It helps if there are mechanisms for breaking the task up into smaller pieces that can be
worked on one at a time.

Most common transfer of control operations found in instruction sets:


Branch, skip, procedure call.
1. Branch Instruction Overview: A branch instruction, also known as a jump instruction, is a
type of instruction in a computer program. It's used to change the flow of execution by
transferring control to a different part of the program. Most branch instructions are conditional,
meaning they only branch if a certain condition is met. An unconditional branch always jumps to
the specified location.

2. Conditional Branches: Conditional branch instructions only change the program counter
(which keeps track of the instruction being executed) if a specific condition is satisfied.
Otherwise, the program continues executing the next instruction in sequence. For example, a
conditional branch might only jump if a result is positive or if an overflow occurs.

3. Generating Conditions: There are two common ways to determine the condition for a
conditional branch instruction:

- Condition Codes: Many machines have a special condition code register that stores
information about the result of certain operations, like addition or subtraction. These codes
indicate things like whether the result was positive, negative, zero, or if there was an overflow.
Conditional branch instructions can then check these condition codes to decide whether to jump.
- Comparison and Branch: Another approach is to directly compare two values and branch
based on the result. For example, a branch instruction might compare the contents of two
registers and jump if they are equal.

4. Examples: Figure 12.7 provides examples of both types of conditional branch operations. It
shows how unconditional and conditional branches can be used to create loops in a program. In
the example, a loop is created that repeats until the result of subtracting Y from X is zero.

In summary, branch instructions are used to change the flow of execution in a program.
Conditional branches depend on certain conditions being met, while unconditional branches
always jump to a specified location. These instructions are essential for controlling the flow of
programs and implementing loops and conditional statements.
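The loop pattern can be sketched with a toy fetch-execute interpreter in Python; the instruction tuples and the convention that only SUB sets the zero flag are simplifications invented for the example:

```python
def execute(program, mem):
    """Toy fetch-execute loop: the program counter (pc) implicitly supplies
    the next-instruction address, and branch instructions overwrite it."""
    pc, zero_flag = 0, False
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "SUB":                    # SUB sets the condition code
            mem[args[0]] -= mem[args[1]]
            zero_flag = mem[args[0]] == 0
        elif op == "ADD":
            mem[args[0]] += mem[args[1]]
        elif op == "BRZ":                  # conditional branch: taken on zero
            if zero_flag:
                pc = args[0]
        elif op == "BR":                   # unconditional branch
            pc = args[0]
        elif op == "HALT":
            break
    return mem

# Repeatedly subtract Y from X, counting passes, until X reaches zero.
prog = [("SUB", "X", "Y"),   # 0: X <- X - Y, sets the zero flag
        ("ADD", "N", "ONE"), # 1: count one pass through the loop
        ("BRZ", 4),          # 2: leave the loop once X hit zero
        ("BR", 0),           # 3: otherwise go around again
        ("HALT",)]           # 4
mem = {"X": 12, "Y": 3, "N": 0, "ONE": 1}
execute(prog, mem)
print(mem["N"], mem["X"])  # -> 4 0
```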

Skip instructions:

A skip instruction includes an implied address; typically the implication is that one instruction
is to be skipped, so the implied address equals the address of the next instruction plus one
instruction length. Because the skip instruction does not require a destination address field, it
is free to perform other functions; a classic example is increment-and-skip-if-zero (ISZ).
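A sketch of increment-and-skip-if-zero (ISZ), a classic skip instruction, assuming a 16-bit word and the convention that the skip target is pc + 2:

```python
def isz(mem, addr, pc):
    """Increment the word at addr; skip the next instruction if it became 0."""
    mem[addr] = (mem[addr] + 1) & 0xFFFF
    return pc + 2 if mem[addr] == 0 else pc + 1  # implied address: pc + 2

counter = {"COUNT": 0xFFFF}
print(isz(counter, "COUNT", pc=100))  # -> 102: wrapped to zero, so skip
print(isz(counter, "COUNT", pc=100))  # -> 101: nonzero, fall through
```

Seeding the counter with the negative of the desired loop count and placing a branch right after the ISZ yields a compact counted loop.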

Procedure Call Instructions:

A procedure is a self-contained computer program that is incorporated into a larger program.


• At any point in the program the procedure may be invoked, or called.
• The processor is instructed to go and execute the entire procedure and then return to the point
from which the call took place.

Two principal reasons for use of procedures:


• Economy: a procedure allows the same piece of code to be used many times.
• Modularity: Procedures also allow large programming tasks to be subdivided into
smaller units. This use of modularity greatly eases the programming task.

The procedure mechanism involves two basic instructions: a call instruction that branches from
the present location to the procedure, and a return instruction that returns from the procedure to
the place from which it was called. Both are forms of branching instructions.
1. Procedures in Programs: Figure 12.8a demonstrates how procedures are used to
organize a program. In this example, there's a main program starting at location 4000. It calls a
procedure named PROC1, located at 4500. When the call instruction is reached, the main
program pauses, and PROC1 begins execution at 4500. Within PROC1, there are calls to another
procedure, PROC2, located at 4800. Each call suspends PROC1 and executes PROC2. The
RETURN statement brings control back to the calling program after the procedure finishes.
2. Key Points about Procedures:
• A procedure can be called from multiple locations in the program.
• Procedures can call other procedures, allowing for nesting to any depth.
• Each procedure call must be matched by a return statement in the called program.
3. Storing Return Addresses: Since procedures can be called from different points in a
program, the processor needs to store the return address to know where to continue after the
procedure finishes. Common places to store the return address are:
• Registers
• Start of the called procedure.
• Top of the stack
4. Reentrant Procedures: A reentrant procedure allows multiple calls to be open
simultaneously. Recursive procedures, which call themselves, are an example. If parameters are
passed to a reentrant procedure via registers or memory, the parameters need to be saved to free
up space for other procedure calls.

1. Using a Stack for Procedure Calls: A stack is a data structure that follows the
Last-In-First-Out (LIFO) principle, meaning the last item placed on the stack is the first one to be
removed. When a processor executes a procedure call (like a CALL instruction), it places the
return address on the stack. Later, when a RETURN instruction is executed, the processor
retrieves the return address from the stack to know where to continue execution.
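The LIFO behavior can be sketched in Python; the return addresses 4101 and 4801 are made-up values near the Figure 12.8 locations:

```python
stack = []   # the processor stack: grows by append, shrinks by pop
trace = []   # records the order in which return points are consumed

def call(ret_addr, proc):
    stack.append(ret_addr)      # CALL pushes the return point
    proc()                      # execute the procedure body
    trace.append(stack.pop())   # RETURN pops it: innermost call first

def proc2(): pass
def proc1(): call(4801, proc2)  # PROC1 calls PROC2; resume at 4801

call(4101, proc1)               # main program calls PROC1; resume at 4101
print(trace)  # -> [4801, 4101]: PROC2's return point comes off first
```

Nested calls return in exactly the reverse order of their calls, which is why a LIFO stack is the natural place to keep return addresses.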
2. Parameter Passing: Besides return addresses, procedures often need parameters to work
with. These parameters can be passed in several ways:
• Using registers.
• Storing parameters in memory just after the CALL instruction.
3. Drawbacks of Different Approaches:
• Using registers for parameter passing requires careful programming to ensure proper
usage of the registers by both the calling and called programs.
• Storing parameters in memory makes it challenging to handle a variable number of
parameters and prevents the use of reentrant procedures (which allow multiple simultaneous
calls).
In summary, using a stack for procedure calls is a powerful and flexible approach. It simplifies
managing return addresses and allows for parameter passing. However, different methods for
passing parameters have their drawbacks, such as complexity in register usage or limitations in
handling variable numbers of parameters and reentrant procedures.
1. Flexible Parameter Passing with the Stack: When a processor executes a procedure
call, it not only saves the return address on the stack but also stores the parameters to be passed
to the called procedure. The called procedure can access these parameters from the stack.
Similarly, return parameters can also be placed on the stack when the procedure returns. The
entire set of parameters, along with the return address, stored for a procedure invocation is called
a stack frame.
2. Example in Figure 12.10: The figure illustrates an example with two procedures, P and
Q. Procedure P declares local variables x1 and x2, and it can call procedure Q. Procedure Q
declares local variables y1 and y2. In the figure, the return point for each procedure is the first
item stored in the corresponding stack frame. Following the return point is a pointer to the
beginning of the previous frame. This pointer is necessary if the number or length of parameters
to be stacked is variable.
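A stack frame in the style of Figure 12.10 can be sketched in Python; the return points 5000 and 5120 and the list-based stack are invented for the illustration:

```python
stack, FP = [], None   # FP indexes the saved-frame-pointer slot of the
                       # current frame; None marks "no frame yet"

def push_frame(return_point, local_vars):
    global FP
    stack.append(return_point)  # the return point is stored first
    stack.append(FP)            # then a pointer to the previous frame
    FP = len(stack) - 1
    stack.extend(local_vars)    # locals live above the saved pointer

def pop_frame():
    global FP
    ret, prev = stack[FP - 1], stack[FP]
    del stack[FP - 1:]          # discard locals, saved FP, return point
    FP = prev                   # previous frame becomes current again
    return ret

push_frame(5000, ["x1", "x2"])  # P's frame
push_frame(5120, ["y1", "y2"])  # Q's frame, pushed while P is active
print(stack)        # -> [5000, None, 'x1', 'x2', 5120, 1, 'y1', 'y2']
print(pop_frame())  # -> 5120: returning from Q exposes P's frame again
```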

x86 Operation types:

• The x86 provides a complex array of operation types including a number of specialized
instructions.
• The intent was to provide tools for the compiler writer to produce optimized machine
language translation of high-level language programs.
• Provides four instructions to support procedure call/return: CALL, ENTER, LEAVE, and
RETURN.

When a new procedure is called the following must be performed upon entry to the new
procedure:
• Push the return point on the stack.
• Push the current frame pointer on the stack.
• Copy the stack pointer as the new value of the frame pointer.
• Adjust the stack pointer to allocate a frame.
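The four entry steps can be sketched in Python; the register names ESP/EBP are real x86 registers, but the initial addresses, the return point, and the frame size below are made up:

```python
mem = {}
esp, ebp = 0x1000, 0x1100   # made-up initial register values

def push(value):
    global esp
    esp -= 4                # the x86 stack grows toward lower addresses
    mem[esp] = value

def proc_entry(return_point, frame_size):
    global esp, ebp
    push(return_point)      # 1. push the return point (done by CALL)
    push(ebp)               # 2. push the current frame pointer
    ebp = esp               # 3. MOV EBP, ESP
    esp -= frame_size       # 4. SUB ESP, n: allocate the frame

proc_entry(0x401234, 16)    # hypothetical return address and frame size
print(hex(ebp), hex(esp))   # -> 0xff8 0xfe8
```

Locals are then addressed at negative offsets from EBP, and the caller's frame is recoverable because the old EBP sits right at the new EBP.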

1. x86 Instruction Types: The x86 architecture offers a wide range of instructions,
including specialized ones tailored to optimize high-level language programs. Table 12.8 lists
these types with examples.
2. Procedure Call/Return Support: x86 provides four instructions for procedure
call/return: CALL, ENTER, LEAVE, and RETURN. These instructions support the mechanism
of stack frames. When entering a new procedure, several steps are typically performed:
• Push the return point on the stack.
• Push the current frame pointer on the stack.
• Set the frame pointer to the current stack pointer.
• Adjust the stack pointer to allocate a frame.
• The CALL instruction pushes the return address onto the stack and jumps to the
procedure's entry point.
3. Use of ENTER Instruction: The ENTER instruction was added to the x86 instruction
set to simplify procedure entry, especially for nested procedures in languages like Pascal and
Ada. However, it's not widely used due to its slower execution time compared to the traditional
sequence of PUSH, MOV, SUB instructions.
4. Segmentation Instructions: x86 includes privileged instructions for managing memory
segmentation. These instructions, typically used by the operating system, allow loading and
reading of local and global segment tables, as well as checking and altering the privilege level of
segments.
5. Cache Management Instructions: Chapter 4 discusses specialized instructions for
managing the on-chip cache.
Status flags are bits in special registers that may be set by certain operations and used in
conditional branch instructions.
The term condition code refers to the settings of one or more status flags. In the x86 and many
other architectures, status flags are set by arithmetic and compare operations. The compare
operation in most languages subtracts two operands, as does a subtract operation. The difference
is that a compare operation only sets status flags, whereas a subtract operation also stores the
result of the subtraction in the destination operand. Some architectures also set status flags for
data transfer instructions.
1. Condition Code Table: Table 12.9 displays the condition codes (combinations of status
flag values) for which conditional jump opcodes are defined. These opcodes allow branching in a
program based on the state of the CPU's status flags.
2. Comparison of Numbers: When comparing two operands, whether they're considered
bigger or smaller depends on whether they're interpreted as signed or unsigned integers. For
example:
• 11111111 is greater than 00000000 when interpreted as an unsigned integer (255 > 0).
• But it's less than 00000000 when interpreted as a signed two's complement number (-1 <
0).
• To handle this difference, assembly languages often use different terms:
• "Less than" and "greater than" for signed integers.
• "Below" and "above" for unsigned integers.
3. Complexity of Comparing Signed Integers: Comparing signed integers involves
checking for conditions like:
• If the sign bit is zero and there's no overflow (S = 0 AND O = 0), or
• If the sign bit is one and there's an overflow (S = 1 AND O = 1).
• These conditions ensure proper comparison of signed integers and prevent errors.
In summary, understanding the interpretation of numbers as signed or unsigned is crucial for
comparing them accurately. Assembly languages use different terms and conditions for
comparing signed and unsigned integers. Additionally, comparing signed integers involves
checking for specific conditions to ensure correctness.
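The flag-based tests can be sketched in Python for an assumed 8-bit word; the flag formulas follow the usual two's complement conventions, and the helper names below and less are invented:

```python
def flags_after_cmp(a, b, width=8):
    """CMP subtracts b from a and sets flags without storing the result."""
    result = (a - b) & ((1 << width) - 1)
    S = result >> (width - 1)                        # sign flag
    Z = int(result == 0)                             # zero flag
    C = int(a < b)                                   # borrow sets carry
    O = ((a ^ b) & (a ^ result)) >> (width - 1) & 1  # signed overflow
    return S, Z, C, O

def below(a, b):   # unsigned a < b: carry flag set
    return flags_after_cmp(a, b)[2] == 1

def less(a, b):    # signed a < b: sign flag differs from overflow flag
    S, Z, C, O = flags_after_cmp(a, b)
    return S != O

a, b = 0b11111111, 0b00000000
print(below(a, b))  # -> False: as unsigned, 255 is not below 0
print(less(a, b))   # -> True:  as signed two's complement, -1 < 0
```

The same CMP sets all the flags at once; "below" and "less" are just different conditional-jump opcodes reading different flag combinations.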
