Chapter Four
Basics of Assembly Language Programs
& Assembly Instructions
Data Movement Instructions
PUSH/POP
Arithmetic and Logic instructions
Various Data Transfer Instructions
String Data Transfers
String Comparisons
Middle-level language is a computer language in which the instructions are
created using symbols such as letters, digits and special characters.
Assembly language is an example of middle-level language. In assembly
language, we use predefined words called mnemonics. Binary code instructions in
low-level language are replaced with mnemonics and operands in middle-level
language.
The computer cannot understand mnemonics directly, so a translator called an
assembler is used. An assembler takes assembly code (middle-level language) as
input and produces machine code (low-level language) as output.
Assembly Language Syntax
An assembly language program consists of statements.
The syntax of an assembly language program statement obeys
the following rules:
Only one statement is written per line.
Each instruction has an opcode and possibly one or more
operands.
An opcode is known as a mnemonic.
Each mnemonic represents a single machine instruction.
Operands provide the data to work with.
Comments
• Comments are an important way for the writer of a program to
communicate information about the program’s design to a person
reading the source code.
The following rules have to be strictly followed in order to
write correct code.
1 - Both operands have to be of the same size:
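A minimal sketch of the same-size rule, assuming it refers to two-operand instructions such as MOV. It is written for 32-bit MASM with the flat memory model and exits through the Windows ExitProcess function; the variable names are hypothetical.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
byteVal  BYTE  12h            ; 8-bit variable (hypothetical name)
wordVal  WORD  1234h          ; 16-bit variable
dwordVal DWORD 12345678h      ; 32-bit variable

.code
main PROC
    mov al, byteVal           ; 8-bit register  <- 8-bit variable
    mov bx, wordVal           ; 16-bit register <- 16-bit variable
    mov ecx, dwordVal         ; 32-bit register <- 32-bit variable
    ; mov ax, byteVal         ; rejected by the assembler: 16-bit destination, 8-bit source
    ; mov eax, wordVal        ; rejected by the assembler: 32-bit destination, 16-bit source
    INVOKE ExitProcess, 0
main ENDP
END main
```

The two commented-out lines show the kind of size mismatch the rule forbids.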
Arithmetic and Logic instructions
The DEC Instruction
The DEC instruction is used for decrementing an operand by one. It works on a
single operand that can be either in a register or in memory.
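As an illustration, the sketch below applies DEC once to a register and once to a memory variable. It assumes 32-bit MASM with the flat memory model and the Windows ExitProcess function; the variable name counter is hypothetical.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
counter DWORD 10              ; memory operand (hypothetical name)

.code
main PROC
    mov ecx, 5
    dec ecx                   ; register operand: ECX = 4
    dec counter               ; memory operand: counter = 9
    INVOKE ExitProcess, 0
main ENDP
END main
```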
Addition (ADD) and Subtraction (SUB)
ADD adds the source operand to the destination operand and stores
the result in the destination.
Both operands must be the same size (both words or both bytes);
otherwise, the assembler generates an error.
SUB subtracts the source operand from the destination operand
and stores the result in the destination.
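The following sketch shows ADD and SUB with word-sized operands. It assumes 32-bit MASM with the flat memory model and the Windows ExitProcess function; the variable name total is hypothetical.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
total WORD 100                ; 16-bit variable (hypothetical name)

.code
main PROC
    mov ax, 20
    add total, ax             ; total = 100 + 20 = 120
    mov bx, 50
    sub bx, ax                ; BX = 50 - 20 = 30
    sub total, 20             ; total = 120 - 20 = 100
    ; add total, eax          ; rejected by the assembler: word destination, doubleword source
    INVOKE ExitProcess, 0
main ENDP
END main
```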
The MUL/IMUL Instruction
• There are two instructions for multiplying binary data. The MUL (Multiply) instruction
handles unsigned data and the IMUL (Integer Multiply) handles signed data. Both
instructions affect the Carry and Overflow flag.
Syntax
• The syntax for the MUL/IMUL instructions is as follows:
MUL/IMUL multiplier
The multiplicand in both cases is held in the accumulator (AL, AX, or EAX), chosen
according to the size of the multiplier. The product is twice as wide as the operands, so
it is stored in AX for byte operands and in a register pair (DX:AX or EDX:EAX) for
larger operands.
The following section explains the MUL instruction with three different cases.
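A sketch of the three cases, assuming they correspond to byte, word, and doubleword multipliers. It uses 32-bit MASM with the flat memory model and the Windows ExitProcess function; the variable names and values are chosen only for illustration.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
val8  BYTE  10h               ; hypothetical 8-bit multiplier
val16 WORD  100h              ; hypothetical 16-bit multiplier
val32 DWORD 1000h             ; hypothetical 32-bit multiplier

.code
main PROC
    ; Case 1: byte multiplier, AL * operand -> AX
    mov al, 5h
    mul val8                  ; AX = 0050h, upper half is zero so CF = OF = 0

    ; Case 2: word multiplier, AX * operand -> DX:AX
    mov ax, 2000h
    mul val16                 ; DX:AX = 0020h:0000h, DX is nonzero so CF = OF = 1

    ; Case 3: doubleword multiplier, EAX * operand -> EDX:EAX
    mov eax, 12345h
    mul val32                 ; EDX:EAX = 00000000h:12345000h, CF = OF = 0

    INVOKE ExitProcess, 0
main ENDP
END main
```

For MUL, the Carry and Overflow flags are set when the upper half of the product is nonzero, which is how a program can tell whether the result fits in the lower half alone.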
Comparison of assembly and high-level languages
Assembly language (AL) is a low-level language; a high-level language (HLL) is further removed from the hardware.
AL can use both registers and main memory directly; HLL uses main memory.
AL is machine oriented; HLL is human oriented.
Assembly language has a nearly one-to-one correspondence between
symbolic instructions and executable machine code.
Typical applications of Assembly language
• A stand-alone binary executable of compact size is required, i.e. one that must
execute without recourse to the run-time components or libraries associated
with a high-level language; this is perhaps the most common situation.
What is wrong with Assembly Language?
Here are the reasons people give for not using assembly:
• Improved compiler technology has eliminated the need for assembly language.
• Today, machines are so fast and have so much memory that we no longer need to
use assembly.
• If you need more speed, you should use a better algorithm rather than switch to
assembly language.
What is right with Assembly language?
Assembly language has several benefits:
• Speed. Assembly language programs are generally the fastest programs around.
• Knowledge. Your knowledge of assembly language will help you write better
programs, even when using HLLs.
Tools of Assembly language programming
• Assembly language programming tools are available as free software packages under the
GNU licensing agreement and as commercial software available for purchase.
• To program in assembly, you will need some software, namely an assembler and an
editor.
Assemblers
1. MASM: Originally by Microsoft; it is now included in the MASM32v8 package, which
includes other tools as well.
2. TASM: Another popular assembler. It was made by Borland and is still a commercial product, so
you cannot get it for free.
3. NASM: A free, open-source assembler, which is also available for other platforms.
Editors:
Emulator software (EMU8086), Notepad, Notepad++, TextPad, VS Code, jEdit
Basic Elements of Assembly Language
Integer Constants
• An integer constant (or integer literal) is made up of an optional leading
sign, one or more digits, and an optional suffix character (called a radix)
indicating the number’s base: [{+ | −}] digits [ radix]
• Radix may be one of the following (uppercase or lowercase):
If no radix is given, the integer constant is assumed to be decimal. Here are some
examples using different radixes:
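A minimal sketch of integer constants written with different radix suffixes, assuming the MASM suffixes h (hexadecimal), q or o (octal), d (decimal), and b (binary). It uses 32-bit MASM with the flat memory model and the Windows ExitProcess function; the variable names are hypothetical.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
decVal1 BYTE 26               ; decimal (no radix given)
decVal2 BYTE 26d              ; decimal, explicit radix
binVal  BYTE 11010011b        ; binary
octVal1 BYTE 42q              ; octal
octVal2 BYTE 42o              ; octal, alternate suffix
hexVal1 BYTE 1Ah              ; hexadecimal
hexVal2 BYTE 0A3h             ; hexadecimal beginning with a letter, so a leading zero is required

.code
main PROC
    mov al, hexVal2           ; AL = A3h
    INVOKE ExitProcess, 0
main ENDP
END main
```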
• A hexadecimal constant beginning with a letter must have a leading zero to prevent
the assembler from interpreting it as an identifier. Example: 0A3h rather than A3h.
• Integer expressions: an integer expression is a mathematical expression involving integer values and
arithmetic operators.
• The expression must evaluate to an integer that can be stored in 32 bits (00000000h through FFFFFFFFh).
• The arithmetic operators are listed in the following table according to their precedence
order, from highest (1) to lowest (4).
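The sketch below shows how operator precedence affects a few integer expressions. It assumes the usual MASM order: parentheses first, then unary plus and minus, then *, / and MOD, then binary + and -. The assembler evaluates each expression at assembly time and stores only the result. It uses 32-bit MASM with the flat memory model and the Windows ExitProcess function; the variable names are hypothetical.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
expr1 DWORD 16 / 5                ; integer division: 3
expr2 DWORD -(3 + 4) * (6 - 1)    ; parentheses, then unary minus, then multiply: -35
expr3 DWORD -3 + 4 * 6 - 1        ; multiply before add and subtract: 20
expr4 DWORD 25 MOD 3              ; remainder: 1

.code
main PROC
    mov eax, expr3                ; EAX = 20, computed by the assembler, not at runtime
    INVOKE ExitProcess, 0
main ENDP
END main
```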
• Real number constants: represented as decimal reals or encoded (hexadecimal) reals.
• A decimal real contains an optional sign followed by an integer, a decimal point, an optional integer
that expresses a fraction, and an optional exponent:
[sign] integer.[integer][exponent]
sign {+,-}
exponent E[{+,-}] integer
Examples:
• +3.0
• -44.2E+05
• 26.E5
=> At least one digit and a decimal point are required.
Character Constants:
A single character enclosed in single or double quotes.
Example: 'A', "d"
String Constants:
A sequence of characters (including spaces) enclosed in single or double quotes.
Examples:
'ABC'
'X'
"Good night, Gracie"
'4096'
Embedded quotes:
"This isn't a test"
'Say "Good night," Helen'
Reserved Words
Reserved words have a special meaning in the assembler and can only be used in their correct context.
There are different types of reserved words:
Instruction mnemonics, such as MOV, ADD, MUL, INC, and JMP
Register names, such as AX, BX, CX, SI, and DI
Directives, which tell the assembler how to assemble programs, such
as .code, .data, and .stack
Attributes, which provide size and usage information for variables and operands.
Examples are BYTE and WORD
Operators, used in constant expressions, such as +, -, *, and /
Identifiers
• It is a programmer-chosen name. It might identify a variable, a constant, a procedure, or a
code label.
• Keep the following in mind when creating identifiers:
They may contain between 1 and 247 characters.
They are not case sensitive.
The first character must be a letter (A..Z, a..z), underscore (_), @ , ?, or $.
Subsequent characters may also be digits.
An identifier cannot be the same as an assembler reserved word.
• The @ symbol is used extensively by the assembler as a prefix for predefined symbols, so
avoid it in your own identifiers. Make identifier names descriptive and easy to understand.
Here are some valid identifiers:
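A short sketch using identifiers that follow the rules above as data labels. The names are illustrative only, and the program assumes 32-bit MASM with the flat memory model and the Windows ExitProcess function.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
var1      DWORD 0             ; letters followed by a digit
Count     DWORD 0             ; mixed case
$first    DWORD 0             ; begins with $
open_file DWORD 0             ; contains an underscore
myFile    DWORD 0
xVal      DWORD 0
_12345    DWORD 0             ; begins with an underscore, digits afterwards
; 2ndVal  DWORD 0             ; invalid: the first character is a digit
; mov     DWORD 0             ; invalid: mov is a reserved word

.code
main PROC
    mov eax, Count
    mov xVal, eax
    INVOKE ExitProcess, 0
main ENDP
END main
```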
Assembly Language Directives
A directive is a command embedded in the source code that is recognized and acted upon by the assembler.
Directives do not execute at runtime.
They can define variables, macros, and procedures.
They can assign names to memory segments and perform many other housekeeping tasks
related to the assembler.
Directives are case insensitive. For example, the assembler recognizes .data, .DATA, and .Data as
equivalent.
The following example helps to show the difference between directives and instructions.
The DWORD directive tells the assembler to reserve space in the program for a doubleword
variable.
The MOV instruction, on the other hand, executes at runtime, copying the contents of myVar
to the EAX register:
myVar DWORD 26 ; DWORD directive
MOV EAX, myVar ; MOV instruction
• Defining Segments:
• One important function of assembler directives is to define program
sections, or segments.
• The .DATA directive identifies the area of a program containing
variables: .data
• The .CODE directive identifies the area of a program containing executable
instructions: .code
• The .STACK directive identifies the area of a program holding the runtime
stack, setting its size: .stack 100h
• Instructions: an instruction is a statement that becomes executable when a program is
assembled. It is translated by the assembler into machine language bytes,
which are loaded and executed by the CPU at runtime. An instruction
contains four basic parts:
1. Label (optional)
2. Mnemonic (required)
3. Operand(s) (usually required)
4. Comment (optional)
• [label:] mnemonic [operands] [; comment]
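The sketch below puts the three segment directives and the four-part instruction format together in one small program. It assumes 32-bit MASM with the flat memory model and the Windows ExitProcess function; the names sum and start are hypothetical.

```asm
.386
.model flat, stdcall
.stack 4096                   ; stack segment and its size
ExitProcess PROTO, dwExitCode:DWORD

.data                         ; data segment: variables
sum DWORD 0

.code                         ; code segment: executable instructions
main PROC
start:  mov eax, 5            ; label, mnemonic, operands, comment
        add eax, 6            ; no label: mnemonic, operands, comment
        mov sum, eax          ; sum = 11
        INVOKE ExitProcess, 0
main ENDP
END main
```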
• Label: an identifier that acts as a place marker for instructions and data.
• A label placed just before an instruction implies the instruction’s address. Similarly, a label placed just before a
variable implies the variable’s address.
• Data Labels: A data label identifies the location of a variable, providing a convenient way to reference the variable in code.
Example:
Count DWORD 100
array DWORD 1024, 2048, 4096, 8192 ; multiple data items can follow one label
• Code Labels: A label in the code area of a program (where instructions are located) must end with a colon (:).
Code labels are used as targets of jumping and looping instructions.
• For example, the following JMP (jump) instruction transfers control to the location marked by the label named target, creating a
loop:
target:
    mov ax,bx
    ...
    jmp target
A code label can share the same line with an instruction, or it can be on a line by itself:
L1: mov ax,bx
L2:
• Operands: Assembly language instructions can have between zero and three
operands, each of which can be a register, a memory operand, or a constant
expression.
• The imul instruction can take three operands: imul eax,ebx,5 ; in this case, EBX is
multiplied by 5, and the product is stored in the EAX register.
• Comments can be specified in two ways:
1. Single-line comments, beginning with a
semicolon character (;).
2. Block comments, beginning with the
COMMENT directive and a user-specified
symbol. The assembler ignores all text up to the next occurrence of that symbol.
COMMENT ! This is comment line !
or
COMMENT & comment line &
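As an illustration of operand counts, the sketch below uses one instruction each with zero, one, two, and three operands. It assumes 32-bit MASM with the flat memory model and the Windows ExitProcess function; the variable name count is taken from the data definition examples in this chapter.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
count DWORD 3

.code
main PROC
    mov eax, 0
    mov ebx, 4
    stc                       ; zero operands: sets the Carry flag
    inc eax                   ; one operand: EAX = 1
    mov count, ebx            ; two operands: count = 4
    imul eax, ebx, 5          ; three operands: EAX = EBX * 5 = 20
    INVOKE ExitProcess, 0
main ENDP
END main
```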
Assembling, Linking, and Running Programs
• A source program written in assembly language cannot be executed directly
on its target computer. It must be translated, or assembled into executable
code.
• The assembler produces a file containing machine language called an object
file. This file isn’t quite ready to execute. It must be passed to another
program called a linker, which in turn produces an executable file.
Defining Data Types
• A data type describes a set of values that can be assigned to variables and expressions
of the given type.
• The essential characteristic of each type is its size in bits: 8, 16, 32, 48, or 64.
• A variable declared as DWORD, for example, logically holds an unsigned
32-bit integer. In fact, it could hold a signed 32-bit integer, a 32-bit single
precision real, or a 32-bit pointer. The assembler is not case sensitive, so a
directive such as DWORD can be written as dword, Dword, dWord, and so on.
Instruction Mnemonic
• An instruction mnemonic is a short word that identifies an instruction. Assembly language instruction
mnemonics such as mov, add, and sub provide hints about the type of
operation they perform.
Data Definition Statement
• Syntax: [name] directive initializer [, initializer]...
count DWORD 12345
Directive: The directive in a data definition statement can be BYTE,
WORD, DWORD, SBYTE, SWORD, or DB, DW, DD, DQ.
Initializer: At least one initializer is required in a data definition,
even if it is zero. Additional initializers, if any, are separated by
commas. For integer data types, initializer is an integer constant or
expression matching the size of the variable’s type, such as BYTE or
WORD. If you prefer to leave the variable uninitialized (assigned a
random value), the ? symbol can be used as the initializer. All
initializers, regardless of their format, are converted to binary data
by the assembler.
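A small sketch of data definition statements that combines the points above: a single initializer, the ? initializer, several comma-separated initializers, and the DB, DW and DD forms. It assumes 32-bit MASM with the flat memory model and the Windows ExitProcess function; all names except count are hypothetical.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
count   DWORD 12345           ; one initializer
letter  BYTE  ?               ; uninitialized, assigned at runtime
scores  WORD  90, 85, 70      ; three initializers after one name
oldByte DB    0FFh            ; 8-bit value
oldWord DW    1234h           ; 16-bit value
oldDbl  DD    12345678h       ; 32-bit value

.code
main PROC
    mov letter, 'A'           ; give the uninitialized variable a value
    mov ax, scores            ; AX = 90, the first initializer
    mov ebx, count            ; EBX = 12345
    INVOKE ExitProcess, 0
main ENDP
END main
```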
Defining BYTE and SBYTE Data
• The BYTE (define byte) and SBYTE (define signed byte) directives allocate
storage for one or more unsigned or signed values. Each initializer must fit
into 8 bits of storage. For example:
value1 BYTE 'A' ; character constant
value2 BYTE 0 ; smallest unsigned byte
value3 BYTE 255 ; largest unsigned byte
value4 SBYTE -128 ; smallest signed byte
value5 SBYTE +127 ; largest signed byte
• A question mark (?) initializer leaves the
variable uninitialized, implying it will be
assigned a value at runtime:
value6 BYTE ?
• The optional name is a label marking the
variable’s offset from the beginning of its
enclosing segment. For example, if value1 is
located at offset 0000 in the data segment
and consumes 1 byte of storage, value2 is
automatically located at offset 0001:
value1 BYTE 10h
value2 BYTE 20h
• The DB directive can also define an 8-bit
variable, signed or unsigned:
val1 DB 255 ; unsigned byte
val2 DB -128 ; signed byte
Multiple Initializers
• If multiple initializers are used in the same data definition, its label refers
only to the offset of the first initializer. In the following example, assume list
is located at offset 0000. If so, the value 10 is at offset 0000, 20 is at offset
0001, 30 is at offset 0002, and 40 is at offset 0003:
list BYTE 10,20,30,40
• Not every data definition needs a label, so the array begun with list can be continued on the following lines:
list BYTE 10,20,30,40
     BYTE 50,60,70,80
     BYTE 81,82,83,84 ; multiple declaration
• Within a single data definition, its initializers can use
different radixes. Character and string constants can be
freely mixed.
In the following example, list1 and list2 have the same
contents:
list1 BYTE 10, 32, 41h, 00100010b
list2 BYTE 0Ah, 20h, 'A', 22h
• A string is defined by enclosing its characters in quotes; the examples below end with a null byte (0):
greeting1 BYTE "Good afternoon",0
greeting2 BYTE 'Good night',0
• Each character uses a byte of storage. Strings are an exception to the rule that byte
values must be separated by commas.
DUP Operator
• The DUP operator allocates storage for multiple data items, using a constant
expression as a counter. It is particularly useful when allocating space for a string
or array, and can be used with initialized or uninitialized data:
• BYTE 20 DUP(0) ; 20 bytes, all equal to zero
• BYTE 20 DUP(?) ; 20 bytes, uninitialized
• BYTE 4 DUP("STACK") ; 20 bytes: "STACKSTACKSTACKSTACK"
Uninitialized data declared with .data?:
– .data
– smallArray DWORD 10 DUP(0) ; 40 bytes
– .data?
– bigArray DWORD 5000 DUP(?) ; 20,000 bytes, not initialized
The same arrays declared entirely in .data:
– .data
– smallArray DWORD 10 DUP(0) ; 40 bytes
– bigArray DWORD 5000 DUP(?) ; 20,000 bytes
Data with instruction:
– .code
– mov eax,ebx
– .data
– temp DWORD ?
– .code
– mov temp,eax