
Assembly Language Basics

WHAT IS ASSEMBLY LANGUAGE?


Assembly language is a low-level programming language that serves as an
intermediary between human-readable source code and machine code. It is
specific to a computer architecture and consists of a set of symbolic
names and mnemonics representing the computer's basic operations.
Purpose:
Human-Readable Representation: Assembly language provides a
human-readable representation of machine code instructions, making it
easier for programmers to write and understand low-level code.
Direct Hardware Interaction: It allows direct interaction with a
computer's hardware components, such as registers, memory, and I/O
ports, providing fine-grained control over a system's resources.
Efficiency: Carefully written assembly language programs can be more
efficient in execution speed and memory usage than equivalent programs
in high-level languages, which makes assembly useful for tasks where
performance is critical.
Comparisons:
High-Level Languages: In contrast to high-level programming languages
like Python, C++, or Java, assembly language is much closer to the
computer's architecture, providing more control but requiring more
detailed coding.
Abstraction Level: While high-level languages offer a higher level of
abstraction, assembly language operates at a lower level, requiring
programmers to have a deeper understanding of the underlying hardware.
Machine Language Fundamentals
Machine Code Definition: Machine code, also known as machine language,
is the lowest-level programming language that a computer can
understand. It consists of binary instructions that the CPU decodes and
executes directly.
Binary Representation: Machine code instructions are represented in
binary format, which means they are composed of 0s and 1s. Each binary
sequence corresponds to a specific operation or instruction that the
CPU can perform.
How Instructions are Encoded in Machine Language:

Machine language instructions are encoded in binary, which is a base-2
numeral system using 0s and 1s. This binary encoding is used to
represent various aspects of an instruction, including the operation to
be performed (the opcode) and any data or memory addresses associated
with that operation (operands).
Opcode (Operation Code): The opcode is the most critical part of a
machine language instruction. It specifies the type of operation or
action that the CPU should perform. Common opcodes represent operations
like addition, subtraction, loading data from memory, storing data to
memory, and more.

Example: In a hypothetical machine language, an opcode of "0001" might
represent addition, while "0010" represents subtraction.
Operands: In many machine language instructions, there are one or more
operands. Operands provide the data or addresses needed to carry out
the specified operation. The way operands are encoded can vary, but
there are two primary methods:

a. Immediate Operand: The value of the operand is directly included
within the instruction itself. For example, if you want to add the
value 42 to a register, the immediate operand might be "00101010"
(binary for 42).

b. Memory Address Operand: The operand specifies a memory address where
the data to be used in the operation is located. The CPU will fetch the
data from this memory address during execution.
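The difference between the two operand kinds can be sketched in Python. This is only an illustration: the dictionary standing in for memory, the address 0xCC, and the value stored there are made up for the example.

```python
# Immediate operand: the value itself is stored in the instruction bits.
immediate_bits = format(42, "08b")  # 8-bit binary string for 42

# Memory address operand: the instruction stores an address, and the
# CPU fetches the value from that address at execution time.
memory = {0xCC: 7}        # hypothetical memory: address -> value
address_operand = 0xCC    # hypothetical address encoded in the instruction
fetched_value = memory[address_operand]

print(immediate_bits)  # 00101010
print(fetched_value)   # 7
```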
Operand Addressing Modes:
In addition to specifying the type of operand (immediate or memory
address), machine language instructions often include information about
the addressing mode. The addressing mode defines how the CPU should
interpret the operand's value or address. Common addressing modes
include:

Direct Addressing: The operand directly specifies a memory address
where data is located.
Register Addressing: The operand is a reference to a CPU register where
data is stored.
Indirect Addressing: The operand points to a memory address that
contains the actual memory address of the data to be used (a level of
indirection).
Indexed Addressing: The operand includes an offset or index value that
is added to a base address to calculate the final memory address.
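The four modes listed above can be compared side by side in a small Python sketch. The register name R1, the addresses, and the stored values are all hypothetical; a real CPU resolves these modes in hardware.

```python
# Toy machine state (all names and values are hypothetical).
registers = {"R1": 0x20}
memory = {0x10: 111, 0x20: 0x30, 0x30: 333, 0x24: 444}

def direct(addr):
    """Direct: the operand is the address of the data."""
    return memory[addr]

def register(name):
    """Register: the operand names a register that holds the data."""
    return registers[name]

def indirect(addr):
    """Indirect: the operand is the address of a cell that holds the data's address."""
    return memory[memory[addr]]

def indexed(base, offset):
    """Indexed: the data lives at base + offset."""
    return memory[base + offset]

print(direct(0x10))       # 111
print(register("R1"))     # 32
print(indirect(0x20))     # 333
print(indexed(0x20, 4))   # 444
```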
Example
Let's say we have a hypothetical machine language instruction for
adding two values from memory addresses A and B and storing the result
in memory address C. Here's a simplified binary representation:

- Opcode for addition: "0001"
- Operand 1 (Memory Address A): "11001100"
- Operand 2 (Memory Address B): "10101010"
- Operand 3 (Memory Address C): "11110000"

The machine code instruction might look like this:

"0001110011001010101011110000"

This binary representation tells the CPU to execute an addition
operation, fetching values from memory addresses A and B, and storing
the result in memory address C.
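The same hypothetical encoding can be reproduced in a few lines of Python. The function names encode_add and decode are invented for this sketch, and the 4-bit opcode plus three 8-bit addresses layout is the made-up format from the example, not a real instruction set.

```python
def encode_add(addr_a, addr_b, addr_c):
    """Build the 28-bit string: 4-bit opcode followed by three 8-bit addresses."""
    opcode = "0001"  # hypothetical opcode for addition
    operands = "".join(format(a, "08b") for a in (addr_a, addr_b, addr_c))
    return opcode + operands

def decode(word):
    """Split a 28-bit string back into its opcode and three addresses."""
    opcode = word[:4]
    operands = [int(word[i:i + 8], 2) for i in range(4, 28, 8)]
    return opcode, operands

word = encode_add(0b11001100, 0b10101010, 0b11110000)
print(word)          # 0001110011001010101011110000
print(decode(word))  # ('0001', [204, 170, 240])
```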
Syntax and Structure of Assembly Language Programs:
Assembly Language as a Low-Level Language: Assembly language is a
low-level programming language that serves as an interface between
human-readable code and machine code. It provides a symbolic
representation of machine instructions.

Unlike high-level languages such as Python or C++, assembly language is
specific to a particular computer architecture and closely reflects the
hardware's capabilities.
Mnemonics and Instructions:
In assembly language programming, mnemonics are short, symbolic
abbreviations used to represent machine code operations or
instructions. They serve as a bridge between human-readable code and
the actual binary instructions that a computer's CPU can execute.

Each mnemonic corresponds to a specific operation that the CPU can
perform, such as data movement, arithmetic operations, branching, and
more.
Common Mnemonics:
MOV: Stands for "move" and is used to transfer data from one location
to another. For example, MOV AX, BX copies the contents of register BX
into register AX.
ADD and SUB: Represent addition and subtraction operations. For
instance, ADD AX, 10 adds the value 10 to the contents of register AX.
JMP: Represents a jump instruction, used for altering the flow of
program execution. JMP label transfers control to a labeled section of
code.
CMP: Used for comparison operations. It compares two values and sets
the CPU's flags based on the result, often used in conditional
branching.
CALL and RET: Used for subroutine calls and returns. CALL transfers
control to a subroutine, and RET returns control to the calling
program.
NOP: Stands for "no operation" and represents an instruction that does
nothing. It is often used for padding or alignment in code.
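To see how these mnemonics behave together, here is a toy interpreter in Python. It is a simplified sketch, not an 8086 emulator: the tuple-based program format and the LABEL pseudo-instruction are invented for the example, only the zero flag (ZF) is modeled, only CMP updates it (on a real CPU, ADD and SUB also set flags), and CALL and RET are left out.

```python
def run(program, max_steps=100):
    """Run a toy program and return (registers, flags)."""
    regs = {"AX": 0, "BX": 0, "CX": 0, "DX": 0}
    flags = {"ZF": 0}
    # Map each label name to the index of its LABEL entry.
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "LABEL"}

    def value(operand):
        # A register name yields its content; anything else is an immediate.
        return regs[operand] if operand in regs else operand

    pc = 0  # index of the next instruction, like an instruction pointer
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, *args = program[pc]
        pc += 1
        if op == "MOV":
            regs[args[0]] = value(args[1]) & 0xFFFF
        elif op == "ADD":
            regs[args[0]] = (regs[args[0]] + value(args[1])) & 0xFFFF
        elif op == "SUB":
            regs[args[0]] = (regs[args[0]] - value(args[1])) & 0xFFFF
        elif op == "CMP":
            # Compare without storing a result; only the flag changes.
            flags["ZF"] = int(regs[args[0]] == (value(args[1]) & 0xFFFF))
        elif op == "JMP":
            pc = labels[args[0]]
        # NOP and LABEL do nothing.
    return regs, flags

program = [
    ("MOV", "BX", 7),
    ("MOV", "AX", "BX"),  # AX = 7
    ("ADD", "AX", 10),    # AX = 17
    ("JMP", "done"),
    ("SUB", "AX", 1),     # skipped by the jump
    ("LABEL", "done"),
    ("NOP",),
    ("CMP", "AX", 17),    # equal, so ZF = 1
]

regs, flags = run(program)
print(regs["AX"], regs["BX"], flags["ZF"])  # 17 7 1
```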
Operands:
Assembly language instructions typically have one or more operands.
Operands provide data or specify memory addresses that the instruction
operates on. Operands can be registers, memory addresses, or immediate
values (constants).
Register Operand: Refers to a CPU register. For example, in MOV AX, BX,
AX and BX are registers.
Memory Operand: Refers to a memory location. For example, in
MOV AL, [1234], [1234] represents a memory location.
Immediate Operand: Represents a constant value embedded in the
instruction itself. For instance, MOV CX, 5 sets the CX register to the
value 5.
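An assembler has to tell these three operand kinds apart when it reads source text. The Python sketch below shows one simplified way to do that. The classify function and its rules are assumptions made for illustration; real assemblers accept far richer syntax (hex literals, labels, expressions).

```python
# Register names used in this deck; a real assembler knows many more.
REGISTERS = {"AX", "BX", "CX", "DX", "AL", "SI", "DI", "BP", "SP"}

def classify(operand):
    """Return 'register', 'memory', or 'immediate' for an operand string."""
    text = operand.strip().upper()
    if text in REGISTERS:
        return "register"
    if text.startswith("[") and text.endswith("]"):
        return "memory"
    if text.isdigit():
        return "immediate"
    raise ValueError(f"unrecognized operand: {operand!r}")

print(classify("AX"))      # register
print(classify("[1234]"))  # memory
print(classify("5"))       # immediate
```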
Encoding and
Machine Code:
While mnemonics make assembly code human-readable, they are translated into
binary machine code for execution by the CPU.

The assembler, a program that converts assembly code into machine code, maps each
mnemonic to a specific binary opcode (operation code) and encodes the operands
accordingly.

For example, MOV AX, BX might be translated into a sequence of 0s and 1s that
represent the appropriate opcode and operands for the given CPU architecture.
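As a concrete case, the sketch below encodes a register-to-register MOV the way the 8086 does: an opcode byte followed by a ModR/M byte that packs a mode field and two 3-bit register codes. It uses the opcode 0x89 form (MOV r/m16, r16); an assembler may equally choose the 0x8B form, which yields different bytes for the same instruction. The function name encode_mov_reg_reg is invented for the example.

```python
# 3-bit register codes for the 16-bit registers of the 8086.
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}

def encode_mov_reg_reg(dest, src):
    """Encode MOV dest, src for two 16-bit registers as two bytes."""
    opcode = 0x89  # MOV r/m16, r16
    # ModR/M: mode 11 (register operand), source in bits 5-3, destination in bits 2-0.
    modrm = (0b11 << 6) | (REG16[src] << 3) | REG16[dest]
    return bytes([opcode, modrm])

encoded = encode_mov_reg_reg("AX", "BX")
print(encoded.hex())                                # 89d8
print(" ".join(format(b, "08b") for b in encoded))  # 10001001 11011000
```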
What Are
Registers
Registers are small, high-speed storage locations within
a CPU (Central Processing Unit) that are used to hold
data temporarily during program execution. Think of
registers as the CPU's internal scratchpad or working
memory.
Key Characteristics
of Registers:
Speed: Registers are the fastest storage locations in a computer. Accessing data from registers is
significantly quicker than accessing data from memory.

Limited in Number: A CPU typically has a limited number of registers, which can vary depending on
the architecture. Common register sets include general-purpose registers, special-purpose registers,
and floating-point registers.

Usage: Registers are used for a variety of purposes, including data storage, arithmetic operations,
addressing memory, and control flow operations.
3 Types of
Registers:
General-Purpose Registers: These registers can be used for a wide range of tasks, such as
holding data, performing arithmetic operations, and serving as temporary storage for
intermediate results. Common examples include AX, BX, CX, and DX.
General-Purpose
Registers (8086):
The 8086 architecture includes four 16-bit general-purpose registers, each with a conventional role:

AX (Accumulator): Often used for arithmetic operations and data manipulation. For example, you might
perform addition or subtraction with data in AX.
BX (Base Register): Typically used for addressing data in memory, especially in conjunction with the DI or SI
registers for effective address calculations.
CX (Count Register): Often used for loop control and string operations. It can also serve as a general-
purpose register.
DX (Data Register): Used in arithmetic operations and for I/O operations, such as transferring data
between the CPU and peripheral devices.
3 Types of
Registers:
Special-Purpose Registers: These registers serve specific
functions within the CPU.
Special-Purpose
Registers (8086):
IP (Instruction Pointer): Keeps track of the memory address of the next instruction to be executed. It is
crucial for program control flow.
SP (Stack Pointer): Points to the top of the stack, which is essential for managing the program's call
stack during function calls and subroutine execution.
BP (Base Pointer): Often used as a reference point for accessing data on the stack within function calls.
It helps maintain a stable frame for the current function.
SI (Source Index) and DI (Destination Index): These registers are commonly used for string
manipulation instructions like copying, comparing, or searching strings in memory.
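The role of SP can be illustrated with a small Python model of the stack. On the 8086 the stack grows toward lower addresses: a push decrements SP by 2 and stores a 16-bit word, and a pop reads the word and increments SP by 2. The class name Stack8086 and the 256-byte memory size are assumptions for this sketch, and the SS segment register is ignored.

```python
class Stack8086:
    """Toy model of 16-bit push and pop (segment register ignored)."""

    def __init__(self, size=0x100):
        self.memory = bytearray(size)
        self.sp = size  # empty stack: SP sits just past the end of the stack area

    def push(self, value):
        self.sp -= 2  # the stack grows toward lower addresses
        self.memory[self.sp:self.sp + 2] = (value & 0xFFFF).to_bytes(2, "little")

    def pop(self):
        value = int.from_bytes(self.memory[self.sp:self.sp + 2], "little")
        self.sp += 2
        return value

stack = Stack8086()
stack.push(0x1234)
stack.push(0xABCD)
print(hex(stack.sp))     # 0xfc
print(hex(stack.pop()))  # 0xabcd
print(hex(stack.pop()))  # 0x1234
```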
Segment
Registers (8086):
In addition to general-purpose and special-purpose registers, the 8086
architecture features segment registers, which are used for memory segmentation:
CS (Code Segment): Points to the segment in memory containing executable code.
DS (Data Segment): Points to the segment in memory containing data.
ES (Extra Segment): Often used as an additional data segment in some operations,
like string manipulations.
SS (Stack Segment): Points to the segment in memory used for the stack.
FS and GS (Additional Segments): These registers are not part of the 8086. They were
introduced with the 80386 as additional data segment registers.
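On the 8086, a segment register and a 16-bit offset are combined into a 20-bit physical address: the segment value is shifted left by 4 bits (multiplied by 16) and the offset is added. A minimal Python sketch, with the function name physical_address and the sample values chosen for illustration:

```python
def physical_address(segment, offset):
    """8086 real-mode address: segment * 16 + offset, kept to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF

print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
# Different segment:offset pairs can name the same physical byte.
print(hex(physical_address(0x1235, 0x0000)))  # 0x12350
```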
Memory and Addressing in Assembly
Language

Memory and addressing are fundamental concepts in assembly language
programming, as they enable the manipulation of data and instructions.
In this slide, we'll explore how assembly language interacts with
memory and various addressing modes.
Memory Access in Assembly:

Storage and Retrieval of Data:


Memory access in assembly language involves the storage
and retrieval of data from memory locations.
Memory is essential for holding program instructions,
variables, and other data structures that a program uses
during execution.
Memory Access in Assembly Language:

Memory Addresses:
Memory locations are identified by numerical addresses. Each address corresponds to a specific
location in memory where data or instructions are stored.

Addresses are typically represented within square brackets, such as [1234], indicating the data or
instruction stored at memory address 1234.
Memory Access in Assembly Language:

Data Movement Instructions:


Assembly language provides instructions for moving data between CPU registers and memory
locations. One of the most common instructions for this purpose is MOV (move).
The MOV instruction allows you to copy data from a memory location to a register or from a
register to a memory location.
Example 1: MOV AX, [1234] copies the data stored at memory address 1234 into the AX register.
Example 2: MOV [5678], BX stores the value in the BX register into memory address 5678.
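The two examples can be traced in Python with a bytearray standing in for memory. The sketch assumes the addresses are decimal and ignores segmentation; the helper names and the sample values are made up. One detail it does reflect is that x86 stores 16-bit words in little-endian order, low byte first.

```python
memory = bytearray(0x10000)          # 64 KB of toy memory
regs = {"AX": 0, "BX": 0xBEEF}       # sample register contents

def load_word(addr):
    """Read a 16-bit little-endian word (low byte at the lower address)."""
    return memory[addr] | (memory[addr + 1] << 8)

def store_word(addr, value):
    """Write a 16-bit little-endian word."""
    memory[addr] = value & 0xFF
    memory[addr + 1] = (value >> 8) & 0xFF

# Prepare some data, then mimic MOV AX, [1234].
memory[1234] = 0x34
memory[1235] = 0x12
regs["AX"] = load_word(1234)

# Mimic MOV [5678], BX.
store_word(5678, regs["BX"])

print(hex(regs["AX"]))                         # 0x1234
print(hex(memory[5678]), hex(memory[5679]))    # 0xef 0xbe
```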
Direct Addressing:

Direct addressing is a simple memory access mode where the memory address is explicitly
provided within the instruction.

This mode is suitable for accessing data at specific, known memory addresses.

Example: MOV AL, [4567] loads the byte at memory address 4567 into the AL register.
Register Indirect Addressing:

In register indirect addressing, a register holds the memory address of the data, so the content
of the register serves as the address. This differs from plain register addressing, where the
operand is the register's own content, as in MOV AX, BX.

It's often used when the memory address is computed at run time and kept in a register.

Example: MOV [BX], AX stores the content of the AX register at the memory address held in
the BX register.
Indexed Addressing:

Indexed addressing mode is especially useful when working with data structures like arrays. It
allows you to add an offset or index to a base memory address to access elements efficiently.

This mode simplifies data structure traversal.

Example: MOV CX, [SI+10] loads the value at the memory address SI+10 into the CX register,
where SI serves as the base address.
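Array traversal with a base register plus an offset can be sketched in Python as follows. The base address 16, the array contents, and the helper load_word are assumptions for the example; each element is a 16-bit word, so consecutive elements are 2 bytes apart.

```python
memory = bytearray(64)   # toy memory
si = 16                  # hypothetical base address held in SI
values = [3, 5, 8, 13]   # sample array of 16-bit words

# Lay the array out in memory, 2 bytes per element, little-endian.
for i, v in enumerate(values):
    memory[si + 2 * i:si + 2 * i + 2] = v.to_bytes(2, "little")

def load_word(addr):
    """Read a 16-bit little-endian word from toy memory."""
    return int.from_bytes(memory[addr:addr + 2], "little")

# Like MOV CX, [SI+4]: the element at byte offset 4 is the third word.
third = load_word(si + 4)

# Walk the whole array by stepping the offset.
total = sum(load_word(si + 2 * i) for i in range(len(values)))

print(third)  # 8
print(total)  # 29
```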
Memory Operations:

Apart from data movement, assembly instructions like ADD, SUB, and CMP can directly operate
on data stored in memory locations.

These instructions allow for arithmetic operations and comparisons with memory-resident data.

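Example: ADD WORD PTR [BX], 5 adds 5 to the 16-bit value stored at the memory address held in
the BX register. The size specifier (WORD PTR in MASM-style syntax) is needed because neither
operand implies an operand size.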
Importance in Assembly Programming:

Memory Models:
In some CPU architectures like x86, memory is organized into
segments, and different memory models define how memory is
accessed. These models dictate the addressing schemes used in
programs.
Addressing Range:
Different CPU architectures have varying addressing ranges,
determined by the number of address bits available. For example, the
8086 CPU has a 20-bit address bus, allowing access to a maximum of 1
MB of memory.
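The relationship between address bits and addressable memory is a power of two, which a short Python check makes concrete. The function name addressable_bytes is invented for the example.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses an address of this width can express."""
    return 2 ** address_bits

print(addressable_bytes(20))          # 1048576 bytes, the 8086's 1 MB
print(addressable_bytes(20) // 1024)  # 1024 KB
print(addressable_bytes(16))          # 65536 bytes, the reach of one 16-bit offset
```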
What Are Instruction Sets?

Instruction sets are collections of machine-level instructions that a CPU can execute.
Each instruction corresponds to a specific operation, such as
data manipulation, arithmetic, branching, or input/output.
Types of Instruction Sets:
CISC (Complex Instruction Set Computer):
CISC architectures feature a rich set of instructions, some of which can be complex and take
multiple clock cycles to execute.
They are designed to handle a wide range of tasks with a single instruction.
Examples include the x86 architecture used in many personal computers.

RISC (Reduced Instruction Set Computer):


RISC architectures have a smaller, simpler set of instructions, most of which are designed to
execute in a single clock cycle.
They aim for higher instruction throughput and reduced complexity.
Examples include ARM and MIPS architectures, commonly used in mobile devices and embedded
systems.
Types of Instruction Sets:
VLIW (Very Long Instruction Word):
VLIW architectures provide parallelism by allowing multiple instructions to execute
simultaneously.
They rely on the compiler to schedule instructions for parallel execution.
Example: Intel's Itanium processors are often cited as a VLIW-style design.

EPIC (Explicitly Parallel Instruction Computing):

EPIC architectures, like IA-64 (the instruction set of Intel's Itanium processors), build on VLIW
ideas: the compiler explicitly marks which instructions can execute in parallel.
Instruction Categories:
Assembly instructions can be categorized into several groups:
Data Movement: Instructions for copying data between registers and memory.
Arithmetic/Logical: Operations like addition, subtraction, AND, OR, and XOR.
Control Flow: Instructions for branching and changing the flow of program execution.
Stack Operations: Pushing and popping data from the stack.
String Operations: Manipulating strings of data.
Input/Output: Instructions for interacting with input and output devices.
