Chapter Four
Basics of Assembly Language Programs
& Assembly Instructions
Data Movement Instructions
PUSH/POP
Arithmetic and Logic instructions
Various Data Transfer Instructions
String Data Transfers
String Comparisons

A middle-level language is a computer language in which the instructions are
written using symbols such as letters, digits and special characters.
Assembly language is an example of a middle-level language. In assembly
language, we use predefined words called mnemonics. The binary code instructions of
low-level (machine) language are replaced with mnemonics and operands.
The computer cannot understand mnemonics directly, so we use a translator
called an assembler.
An assembler takes assembly code as input and produces machine code as output.
In other words, it translates the middle-level language into the low-level
language that the computer can execute.

Assembly Language Syntax
An assembly language program consists of statements.
The syntax of an assembly language program statement obeys
the following rules:
• Only one statement is written per line.
• Each instruction has an opcode and possibly one or more operands.
• The opcode is written as a mnemonic.
• Each mnemonic represents a single machine instruction.
• Operands provide the data to work with.

Comments
• Comments are an important way for the writer of a program to
communicate information about the program’s design to a person
reading the source code.

• The following information is typically included at the top of a
program listing:
  • Description of the program’s purpose
  • Names of persons who created and/or revised the program
  • Program creation and revision dates
Comments (cont..)
• Comments can be specified in two ways:
1. Single-line comments
• Begin with a semicolon character (;).
All characters following the semicolon on the same line are ignored
by the assembler.
  mov eax, 5 ; I am a comment.
2. Block comments
• Begin with the COMMENT directive and a user-specified symbol.
All subsequent lines of text are ignored by the assembler until the
same user-specified symbol appears.
For example:
  COMMENT !
  This line is a comment.
  This line is also a comment.
  !
Assembly language instructions
A total of 116 instructions are available for the Intel 8086
microprocessor.
Copy data (MOV): This instruction copies a byte (8-bit) or a word (16-bit)
from source to destination. Both operands should be of the same
type (byte or word).
The syntax of this instruction is:
  MOV destination, source
The destination operand can be a register or a memory location,
whereas the source operand can be a register, a memory location, or a
constant (immediate) value. The two operands cannot both be memory locations.
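As an illustration of the operand combinations described above, the following fragment is a minimal sketch in MASM-style syntax. It is not a complete program (no memory model or entry point is shown), and the variable name count is a hypothetical name chosen for this example.

```asm
.data
count   DW 0            ; hypothetical 16-bit (word) variable

.code
mov ax, 1234h           ; immediate -> register (word)
mov bl, 5               ; immediate -> register (byte)
mov cx, ax              ; register  -> register
mov count, ax           ; register  -> memory
mov dx, count           ; memory    -> register
```

Each line pairs a destination and a source of the same size; a memory-to-memory copy would need two MOV instructions through a register.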

The following rules have to be strictly followed in order to
write correct code.
1 - Both operands have to be of the same size:
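The size rule can be sketched with a few register-to-register moves. This is an illustrative fragment using the standard 8086 register names; the two commented-out lines are examples of statements an assembler would be expected to reject because the operand sizes differ.

```asm
mov ax, bx      ; valid: both operands are 16 bits
mov al, bl      ; valid: both operands are 8 bits
; mov ax, bl    ; rejected: 16-bit destination, 8-bit source
; mov al, bx    ; rejected: 8-bit destination, 16-bit source
```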

Arithmetic and Logic instructions

The INC Instruction
• The INC instruction is used for incrementing an operand by one. It
works on a single operand that can be either in a register or in
memory.
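A short sketch of INC on a register and on a memory operand follows. It assumes MASM-style syntax, and counter is a hypothetical byte variable introduced only for this example.

```asm
.data
counter DB 9        ; hypothetical byte variable

.code
mov ax, 0FFFFh
inc ax              ; AX wraps around to 0000h; the Zero flag is set
inc counter         ; memory operand: counter becomes 10
```

Note that INC leaves the Carry flag unchanged, even when the value wraps around.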

The DEC Instruction
• The DEC instruction is used for decrementing an operand by one. It works on a
single operand that can be either in a register or in memory.
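One common use of DEC is counting a loop down to zero. The fragment below is a sketch under that assumption; the label name again is hypothetical, and JNZ (jump if not zero) tests the Zero flag that DEC updates.

```asm
        mov cx, 3       ; loop counter
again:
        dec cx          ; CX becomes 2, then 1, then 0
        jnz again       ; repeat while the result is not zero
```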

Addition (ADD) and Subtraction (SUB)
• ADD adds the source operand to the destination operand and stores
the result in the destination.
• Both operands should be of the same type (words or bytes);
otherwise, the assembler will generate an error.
• SUB subtracts the source from the destination
and stores the result in the destination.
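A minimal sketch of ADD and SUB with 16-bit operands, using small values so the results are easy to follow. The commented-out line shows a mismatched pair that an assembler would be expected to reject.

```asm
mov ax, 10
mov bx, 3
add ax, bx      ; AX = 10 + 3 = 13
sub ax, 5       ; AX = 13 - 5 = 8
; add ax, bl    ; rejected: word destination, byte source
```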

The MUL/IMUL Instruction
• There are two instructions for multiplying binary data. The MUL (Multiply) instruction
handles unsigned data and the IMUL (Integer Multiply) instruction handles signed data. Both
instructions affect the Carry and Overflow flags.
Syntax
• The syntax for the MUL/IMUL instructions is as follows:
  MUL/IMUL multiplier

• In both cases the multiplicand is in the accumulator (AL, AX or EAX), chosen to match the
size of the multiplier. The product is twice as wide as the operands, so it is stored in
AX, DX:AX or EDX:EAX depending upon the size of the operands.
• The following section explains the MUL instruction with three different cases.
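The three operand sizes can be sketched as follows. This is an illustrative fragment, not taken from the original slides; the third case uses 32-bit registers, which exist only on the 80386 and later, so it assumes a suitable processor directive such as .386.

```asm
; Case 1: byte operands, AL * operand -> AX
mov al, 5
mov bl, 10h
mul bl          ; AX = 0050h, CF = OF = 0 (upper half AH is zero)

; Case 2: word operands, AX * operand -> DX:AX
mov ax, 2000h
mov bx, 0100h
mul bx          ; DX:AX = 0020h:0000h, CF = OF = 1 (DX is nonzero)

; Case 3: doubleword operands (80386+), EAX * operand -> EDX:EAX
mov eax, 12345h
mov ebx, 1000h
mul ebx         ; EDX:EAX = 00000000h:12345000h, CF = OF = 0
```

After MUL, the Carry and Overflow flags are set when the upper half of the product is nonzero, which tells the program whether the upper half can be ignored.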

Comparison of assembly and high-level languages
• Assembly language (AL) is a low-level language, close to the machine; a high-level language (HLL) is further from the hardware.
• AL can use registers and main memory directly, whereas an HLL works with variables in main memory.
• AL is machine oriented and HLL is human oriented.
• Assembly language has close to a one-to-one correspondence between
symbolic instructions and executable machine code.
• Assembly languages also include directives to the assembler, directives to
the linker, directives for organizing data space, and macros.

Typical applications of Assembly language

• Hard-coded assembly language is typically used in a system's boot
ROM (BIOS on IBM-compatible PC systems).

• It is used when a stand-alone binary executable of compact size is required, i.e. one that must
execute without recourse to the run-time components or libraries associated
with a high-level language; this is perhaps the most common situation.

• Assembly language is also valuable in reverse engineering, since many
programs are distributed only in machine code form, and machine code is
usually easy to translate into assembly language and carefully examine in this
form, but very difficult to translate into a higher-level language.
Cont…
• Time-critical (real-time) applications, where a task must complete within a specified time period:
a) Aircraft navigation systems
b) Process control systems
c) Robot control software
d) Communication software
e) Target acquisition (missile tracking) software
• System software often requires direct control over the system hardware, so
it needs assembly code to communicate with the hardware. Examples of such
system software:
a) Operating systems (the low-level part of the operating system)
b) Assemblers and compilers
c) Linkers and loaders
d) Device drivers and network interfaces

What is wrong with Assembly Language?

Here are the reasons people give for not using assembly:

• Assembly is hard to learn.

• Assembly language programming is time consuming and not portable.

• Improved compiler technology has eliminated the need for assembly language.

• Today, machines are so fast and have so much memory that we no longer need to
use assembly.

• If you need more speed, you should use a better algorithm rather than switch to
assembly language.
What is right with Assembly language?
Assembly language has several benefits:

• Speed. Assembly language programs are generally the fastest programs around.

• Space. Assembly language programs are often the smallest.

• Capability. You can do things in assembly which are difficult or impossible
in HLLs.

• Knowledge. Your knowledge of assembly language will help you write better
programs, even when using HLLs.

Tools of Assembly language programming
• Assembly language programming tools are available as free software packages under the
GNU licensing agreement and as commercial software available for purchase.
• To program in assembly, you will need some software, namely an assembler and an
editor.

Assemblers
1. MASM – Originally by Microsoft, it's now included in the MASM32v8 package, which
includes other tools as well.
2. TASM – Another popular assembler. Made by Borland, it is still a commercial product, so
you cannot get it for free.
3. NASM – A free, open source assembler, which is also available for other platforms.
4. Emulator software (EMU8086)
Editors
• Notepad / Notepad++
• TextPad, VS Code, jEdit
Basic Elements of Assembly Language
Integer Constants
• An integer constant (or integer literal) is made up of an optional leading
sign, one or more digits, and an optional suffix character (called a radix)
indicating the number’s base: [{+ | -}] digits [radix]
• Radix may be one of the following (uppercase or lowercase):

If no radix is given, the integer constant is assumed to be decimal. Here are some
examples using different radixes:
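The fragment below sketches integer constants written with different radix suffixes. It assumes MASM, whose usual suffixes are h (hexadecimal), q or o (octal), d (decimal) and b (binary); the symbol names val1 to val7 are hypothetical and are defined with the = directive only so that each constant appears in a complete statement.

```asm
val1 = 26           ; decimal (no radix given)
val2 = 26d          ; decimal
val3 = 11010011b    ; binary
val4 = 42q          ; octal
val5 = 42o          ; octal
val6 = 1Ah          ; hexadecimal
val7 = 0A3h         ; hexadecimal; leading zero because it starts with a letter
```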
Continued …
• Example:-

• A hexadecimal constant beginning with a letter must have a leading zero to prevent
the assembler from interpreting it as an identifier.
• Integer expressions: an integer expression is a mathematical expression involving integer values and
arithmetic operators.
• The expression must evaluate to an integer that can be stored in 32 bits (00000000h through FFFFFFFFh).
• The arithmetic operators are listed in the following table according to their precedence
order, from highest (1) to lowest (4).
Continued …

The following are valid expressions and values

• Precedence refers to the implied order of operations when an expression contains two
or more operators.
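To make the precedence order concrete, here is a sketch assuming MASM's usual ordering: parentheses (1), unary + and - (2), *, / and MOD (3), then binary + and - (4). The symbol names e1 to e4 are hypothetical; the value each expression evaluates to is given in the comment.

```asm
e1 = 16 / 5                 ; 3   (integer division discards the remainder)
e2 = -(3 + 4) * (6 - 1)     ; -35 (parentheses are evaluated first)
e3 = -3 + 4 * 6 - 1         ; 20  (* is applied before + and -)
e4 = 25 mod 3               ; 1   (remainder of 25 / 3)
```

When the intended order is not obvious, adding parentheses makes it explicit.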

Continued …
• Real number constants: are represented as decimal reals or encoded (hexadecimal) reals.
• A decimal real contains an optional sign followed by an integer, a decimal point,
an optional integer that expresses a fraction, and an optional exponent:
• [sign] integer.[integer][exponent]
  sign      {+,-}
  exponent  E[{+,-}] integer
• Examples:
  +3.0
  -44.2E+05
  26.E5
=> At least one digit and a decimal point are required.

Character Constants:
• A single character which is enclosed in single or double quotes.
  Example: 'A', "d"

String Constants:
• A sequence of characters (including spaces) enclosed in single or double quotes.
• Examples:
  'ABC'
  'X'
  "Good night, Gracie"
  '4096'
• Embedded quotes:
  "This isn't a test"
  'Say "Good night," Helen'
Reserved Words
Reserved words have special meaning in the assembler and can only be used in their correct context.
There are different types of reserved words:
• Instruction mnemonics, such as MOV, ADD, MUL, INC, JMP, etc.
• Register names, such as AX, BX, CX, SI, DI
• Directives, which tell the assembler how to assemble programs, such
as .code, .data, .stack
• Attributes, which provide size and usage information for variables and operands.
Examples are BYTE and WORD
• Operators, used in constant expressions, such as +, -, *, /

Identifiers
• An identifier is a programmer-chosen name. It might identify a variable, a constant, a procedure, or a
code label.
• Keep the following in mind when creating identifiers:
  • They may contain between 1 and 247 characters.
  • They are not case sensitive.
  • The first character must be a letter (A..Z, a..z), underscore (_), @, ?, or $.
    Subsequent characters may also be digits.
  • An identifier cannot be the same as an assembler reserved word.
• The @ symbol is used extensively by the assembler as a prefix for predefined symbols, so
avoid it in your own identifiers. Make identifier names descriptive and easy to understand.
Here are some valid identifiers:
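The rules above can be sketched with a few data definitions. The names are hypothetical examples chosen for this illustration (the original list was not preserved), and the two commented-out lines show names that break the rules.

```asm
.data
var1        WORD 0      ; letter first, digit afterwards
Count       WORD 0      ; same identifier as count or COUNT (not case sensitive)
$first      WORD 0      ; may begin with $
_main       WORD 0      ; may begin with an underscore
open_file   WORD 0      ; underscores may appear inside the name
_12345      WORD 0      ; digits are allowed after the first character
; 1stValue  WORD 0      ; invalid: begins with a digit
; mov       WORD 0      ; invalid: mov is a reserved word
```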

Assembly Language Directives
• A directive is a command embedded in the source code that is recognized and acted upon by the assembler.
• Directives do not execute at runtime.
• They can define variables, macros, and procedures.
• They can assign names to memory segments and perform many other housekeeping tasks
related to the assembler.
• Directives are case insensitive. For example, the assembler recognizes .data, .DATA, and .Data as
equivalent.
The following example helps to show the difference between directives and instructions.
The DWORD directive tells the assembler to reserve space in the program for a doubleword
variable.
The MOV instruction, on the other hand, executes at runtime, copying the contents of myVar
to the EAX register:
  myVar DWORD 26 ; DWORD directive
  MOV EAX, myVar ; MOV instruction
Continued …
• Defining Segments:
• One important function of assembler directives is to define program
sections, or segments.
• The .DATA directive identifies the area of a program containing
variables: .data
• The .CODE directive identifies the area of a program containing executable
instructions: .code
• The .STACK directive identifies the area of a program holding the runtime
stack, setting its size: .stack 100h
• Instructions: an instruction is a statement that becomes executable when a program is
assembled. It is translated by the assembler into machine language bytes,
which are loaded and executed by the CPU at runtime. An instruction
contains four basic parts:
  1. Label (optional)
  2. Instruction mnemonic (required)
  3. Operand(s) (usually required)
  4. Comment (optional)
• [label:] mnemonic [operands] [; comment]
Continued …

• Label: an identifier that acts as a place marker for instructions and data.
• A label placed just before an instruction implies the instruction’s address. Similarly, a label placed just before a
variable implies the variable’s address.
• Data labels: A data label identifies the location of a variable, providing a convenient way to reference the variable in code.
Example:
  Count DWORD 100
  array DWORD 1024, 2048, 4096, 8192 ; It is possible to define multiple data items following a label.
• Code labels: A label in the code area of a program (where instructions are located) must end with a colon (:).
Code labels are used as targets of jumping and looping instructions.
• For example, the following JMP (jump) instruction transfers control to the location marked by the label named target, creating a
loop:
  target:
      mov ax,bx
      ...
      jmp target
• A code label can share the same line with an instruction, or it can be on a line by itself:
  L1: mov ax,bx
  L2:
Continued …

• Operands: Assembly language instructions can have between zero and three
operands, each of which can be a register, a memory operand, or a constant
expression.
• The STC instruction, for example, has no operands:
  stc ; set Carry flag
• The INC instruction has one operand:
  inc eax ; add 1 to EAX
• The MOV instruction has two operands:
  mov count,ebx ; move EBX to count
• The IMUL instruction can have three operands:
  imul eax,ebx,5
In this case, EBX is multiplied by 5, and the product is stored in the EAX register.

• Comments: Descriptions
1. Single-line comments, beginning with a
semicolon character (;).
2. Block comments, beginning with the
COMMENT directive and a user-specified
symbol.
  COMMENT ! This is comment line !
  or
  COMMENT & comment line &
Assembling, Linking, and Running Programs
• A source program written in assembly language cannot be executed directly
on its target computer. It must be translated, or assembled into executable
code.
• The assembler produces a file containing machine language called an object
file. This file isn’t quite ready to execute. It must be passed to another
program called a linker, which in turn produces an executable file.

Defining Data Types
• A data type describes a set of values that can be assigned to variables and expressions
of the given type.
• The essential characteristic of each type is its size in bits: 8, 16, 32, 48, 64.
• A variable declared as DWORD, for example, logically holds an unsigned
32-bit integer. In fact, it could hold a signed 32-bit integer, a 32-bit single
precision real, or a 32-bit pointer. The assembler is not case sensitive, so a
directive such as DWORD can be written as dword, Dword, dWord, and so on.
Instruction Mnemonic
• A mnemonic is a short word that identifies an instruction. Assembly language instruction
mnemonics such as mov, add, and sub provide hints about the type of
operation they perform.

Data Definition Statement
• Syntax: [name] directive initializer [, initializer]...
  count DWORD 12345
• Directive: The directive in a data definition statement can be BYTE,
WORD, DWORD, SBYTE, SWORD, or DB, DW, DD, DQ.
• Initializer: At least one initializer is required in a data definition,
even if it is zero. Additional initializers, if any, are separated by
commas. For integer data types, the initializer is an integer constant or
expression matching the size of the variable’s type, such as BYTE or
WORD. If you prefer to leave the variable uninitialized (assigned a
random value), the ? symbol can be used as the initializer. All
initializers, regardless of their format, are converted to binary data
by the assembler.
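The initializer rules can be sketched with a few MASM-style definitions. Apart from count, which appears in the syntax example, the variable names are hypothetical and were chosen only for this illustration.

```asm
.data
count   DWORD 12345         ; one initializer
flags   BYTE  0             ; an initializer is required, even if it is zero
list    WORD  10, 20, 30    ; several initializers separated by commas
limit   WORD  2 * 50        ; a constant expression that fits in 16 bits
total   DWORD ?             ; ? leaves the variable uninitialized
```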
Defining BYTE and SBYTE Data
• The BYTE (define byte) and SBYTE (define signed byte) directives allocate
storage for one or more unsigned or signed values. Each initializer must fit
into 8 bits of storage. For example:
  value1 BYTE 'A'   ; character constant
  value2 BYTE 0     ; smallest unsigned byte
  value3 BYTE 255   ; largest unsigned byte
  value4 SBYTE -128 ; smallest signed byte
  value5 SBYTE +127 ; largest signed byte
• A question mark (?) initializer leaves the variable uninitialized, implying it will be
assigned a value at runtime:
  value6 BYTE ?
• The optional name is a label marking the variable’s offset from the beginning of its
enclosing segment. For example, if value1 is located at offset 0000 in the data segment
and consumes 1 byte of storage, value2 is automatically located at offset 0001:
  value1 BYTE 10h
  value2 BYTE 20h
• The DB directive can also define an 8-bit variable, signed or unsigned:
  val1 DB 255  ; unsigned byte
  val2 DB -128 ; signed byte
Multiple Initializers
• If multiple initializers are used in the same data definition, its label refers
only to the offset of the first initializer. In the following example, assume list
is located at offset 0000. If so, the value 10 is at offset 0000, 20 is at offset
0001, 30 is at offset 0002, and 40 is at offset 0003:
  list BYTE 10,20,30,40
• A definition can continue on following lines without a label:
  list BYTE 10,20,30,40
       BYTE 50,60,70,80
       BYTE 81,82,83,84 ; multiple declaration
• Within a single data definition, its initializers can use
different radixes. Character and string constants can be
freely mixed.
In the following example, list1 and list2 have the same
contents:
  list1 BYTE 10, 32, 41h, 00100010b
  list2 BYTE 0Ah, 20h, 'A', 22h
• Strings are defined by enclosing the characters in quotes:
  greeting1 BYTE "Good afternoon",0
  greeting2 BYTE 'Good night',0
• Each character uses a byte of storage. Strings are an exception to the rule that byte
values must be separated by commas.
DUP Operator
• The DUP operator allocates storage for multiple data items, using a constant
expression as a counter. It is particularly useful when allocating space for a string
or array, and can be used with initialized or uninitialized data:
  BYTE 20 DUP(0)      ; 20 bytes, all equal to zero
  BYTE 20 DUP(?)      ; 20 bytes, uninitialized
  BYTE 4 DUP("STACK") ; 20 bytes: "STACKSTACKSTACKSTACK"
• Uninitialized data can be declared after the .data? directive:
  .data
  smallArray DWORD 10 DUP(0)  ; 40 bytes
  .data?
  bigArray DWORD 5000 DUP(?)  ; 20,000 bytes, not initialized
• The same arrays declared in a single .data segment:
  .data
  smallArray DWORD 10 DUP(0)  ; 40 bytes
  bigArray DWORD 5000 DUP(?)  ; 20,000 bytes
• Data with instruction (switching between code and data):
  .code
  mov eax,ebx
  .data
  temp DWORD ?
  .code
  mov temp,eax