Professional Documents
Culture Documents
Pcasm Book K2opt
Pcasm Book K2opt
Paul A.Carter
July 23, 2006
This
may
be reproduced
made
for
notice)
the
and distributed
au-thorship,
provided
document
authors
consent.
excerpts
like
This
reviews
in its
copyright and
that
no
charge
itself,
without
includes
fair
and
is
the
use
advertising,
and
or
case.
Contents
Preface
1 Introduction
1.1
1.2
.........
..........
...........
.......
.....
..........
.........
...
..
........
.
..
.........
Number Systems
1.1.1
1.1.2
Decimal
Binary
1.1.3
Hexadecimal
Computer Organization
1.2.1
Memory
1.2.2
The CPU
1.2.3
1.2.4
1.2.5
1.2.6
Real Mode
1.2.7
1.2.8
1.2.9
Interrupts
1.3
1.4
.......
....
....
...
....
.........
....
.........
.......
.......
..
...
...
Assembly Language
1.3.1
Machine language
1.3.2
Assembly language
1.3.3
Instruction operands
1.3.4
Basic instructions
1.3.5
Directives
1.3.6
1.3.7
Debugging
1.4.1
a Program
First program
1.4.2
Compiler dependencies
1.4.3
1.4.4
1.4.5
Creating
...............
1
1
...............
...............
...............
...............
...............
...............
...............
...............
...............
...............
...............
...............
...............
...............
...............
...............
...............
...............
1
1
3
4
4
5
6
7
8
8
9
10
10
11
11
11
12
12
...............
...............
...............
...............
...............
...............
...............
...............
............... .
.......
1.4.6
13
16
16
18
18
22
22
23
23
Understanding
listing file
23
ii
1.5
an assembly
Skeleton File
..............
.........
.....
.........
2.1
2.1.1
Integer representation
2.1.2
Sign extension
2.2
.......
...........
..........
......
.....
2.1.3
2.1.4
Example
2.1.5
program
Control Structures
2.2.1
Comparisons
2.2.2
Branch instructions
2.2.3
.............
.............
.............
.............
CONTENTS
25
27
27
27
30
.............
.............
.............
.............
.............
.............
.............
.
.......... ..........
33
35
36
37
37
38
41
2.3
2.4
3
If statements
2.3.2
While loops
2.3.3
Do while loops
Bit Operations
3.1
..........
.........
..
............
..........
...........
........
..........
2.3.1
Shift Operations
3.1.1
Logical shifts
3.1.2
Use of shifts
3.1.3
Arithmetic shifts
3.1.4
Rotate shifts
3.1.5
3.2
.......
......
......
.......
......
......
......
.....
....
.........
...
Simple application
3.2.1
3.2.2
The OR operation
3.2.3
3.2.4
3.2.5
3.2.6
3.3
3.4
3.4.1
3.4.2
.............
.............
.............
.............
3.5
42
43
43
43
.............
.............
.............
.............
.............
.............
.............
.............
.............
.............
.............
.............
.............
.............
47
47
47
48
48
49
49
50
50
50
51
51
51
52
53
.............
.............
.............
.............
......
....
.......................
56
56
56
57
3.5.1
When
to Care About
Little
and Big
.
.......................
....................
...
.................
...........
.....
................
...
..........
59 3.6
Endian
Counting Bits
60 3.6.1
Method
one
60 3.6.2
Method two
61 3.6.3
Method three
123
CONTENTS
Subprograms
4.1
Indirect Addressing
4.2
4.3
The Stack
4.4
4.5
Calling Conventions
4.5.1
Passing parameters
on the stack
4.5.2
Local variables
..
........
......
.........
.......
.......
on the stack
4.6
Multi-Module Programs
4.7
4.7.1
Saving registers
4.7.2
Labels of functions
4.7.3
Passing parameters
............
............
............
............
............
............
iii
65
65
66
68
69
70
70
............
............
............
............
............
............
.
.......
....
................
75
77
80
81
82
82
4.7.4
...........................
............
.............
.......... ..............
...
.......
...
82 4.7.5
Returning values
83 4.7.6
83 4.7.7
Examples
85 4.7.8
Calling C
88
4.8
89 4.8.1
Recursive subprograms
89 4.8.2
91
Arrays
............ ...............
............ ..........
............
..
.......
...
............
...
......
..
............
....
95
5.1
Introduction
95
5.1.1
Defining
arrays
95
5.1.2
Accessing elements of
arrays
96
1 3
98
5.2
5.1.4
Example
5.1.5
Multidimensional Arrays
Array/String Instructions
memory
5.2.1
5.2.2
5.2.3
5.2.4
5.2.5
Floating Point
6.1
6.1.1
.............
.............
.............
.............
.............
.............
.............
.............
.............
.............
..........
...................
99
103
106
106
108
109
109
111
117
117
117
6.1.2
119 6.2
Arithmetic
122
Floating Point
..............
.........................
....
........
.............
............
.............
........
.....
..........
.................
...........
..........
.............
...........
6.2.1
Addition
122
6.2.2
Subtraction
iv
6.3
6.2.3
6.2.4
6.3.1
Hardware
6.3.2
6.3.3
Instructions
Examples
6.3.4
Quadratic formula
6.3.5
Reading
6.3.6
Finding primes
array
from file
7.1
7.2
Structures
7.1.1
Introduction
7.1.2
7.1.3
Memory alignment
Bit Fields
7.1.4
7.2.1
.............
..........
..............
........
.....
.......
7.2.2
References
7.2.3
Inline functions
7.2.4
Classes
7.2.5
7.2.6
A 80x86 Instructions
Index
............
............
............
............
CONTENTS
123
124
124
124
............
............
............
............
............
............
............
............
............
............
............
............
125
130
130
133
135
143
143
143
145
146
150
150
151
............
............
............
............
............
............
............
153
154
156
166
171
173
173
179
181
Preface
Purpose
purpose
The
better understanding
at
work
languages
lower
like
Pascal.
than
By
often be much
gaining
more
productive
assembly
program
used
language
supported
real
books
The
mode.
address
the computer.
multitasking
discusses how to
processors
teach
to
how
8086
processor
In this
mode,
only
any
device in
instead
C and
any memory or
secure,
later
as
in assembly language
still
processor
1981!
program may
program
way to achieve
the 8086
in
deeper
developing
is an excellent
really
in programming
understanding
can
how computers
of
level
for
program
Windows
and
supports
the
systems
Linux
runs
in).
features
that
modern
expect,
such
memory protection.
use protected mode:
as
There
virtual
are
This
mode
operating
memory and
reasons to
several
use.
run in
protected mode.
runs in
this mode.
for protected
assembly
is the
programming
main
mode PC
reason
that
assembler
Both
Source software:
of these
the Internet.
use
NASM
are
use
of
available
C/C++
compiler.
to download
from
assembly
code
under
the Linux
operating
sys-tem
found
on
my
drpaulcarter
the example
many
with
C/C++ compilers
Microsofts
Examples
and
web
Borlands
and
under Win-dows
platforms
can
be
site: http://www.
run
v
vi
PREFACE
Acknowledgements
The author would like to thank the many
programmers
were
for developing
on which DJGPP
is based
gcc
on; Donald
were used to
Jeremiah Lawrence
Ed Beroset
Jerry Gembarowski
Ziqiang Peng
Eno Compton
Josh I
Cates
Mik Miffl in
Luke Wallis
Gaku Ueda
Brian Heward
vii
Chad Gorshing
F. Gotti
Bob Wilkinson
Markus Koegel
Louis Taber
Dave Kiddell
Eduardo Horowitz
S ebastien Le Ray
Nehal Mistry
Jianyue Wang
Jeremias Kleer
Marc Janicki
Resources
Authors
page
NASM SourceForge
DJGPP
on the Internet
http://www.drpaulcarter.com/
page
http://sourceforge.net/projects/nasm/
Linux Assembly
The Art of Assembly
http://www.delorie.com/djgpp
http://www.linuxassembly.org/
http://webster.cs.ucr.edu/
USENET
Intel documentation
comp.lang.asm.x86
http://developer.intel.com/design/Pentium4/documentation.htm
Feedback
The author welcomes
work.
E-mail:
pacman128@gmail.com
WWW:
http://www.
drpaulcarter.com/pcasm
viii
PREFACE
Chapter 1
Introduction
1.1
Number Systems
memory
decimal
simplifies
(base
the
10).
hardware,
Because
it
computers
greatly
store
all
1.1.1
Decimal
234
=2
10
+3
10
+4
10
1.1.2
Binary
110012
+1 2 + 0
16 + 8 + 1
1 2
=
=
+0
+1
25
may
be converted to
are represented
inbinary.
are added.
Heres
1
2
CHAPTER 1.INTRODUCTION
an example:
Decimal
Binary
Decimal
Binary
0000
1000
0001
1001
0010
10
1010
0011
11
1011
0100
12
1100
0101
13
1101
0110
14
1110
0111
15
1111
+0
+1
carry
1
+0
1
Previous
+1
+0
0
+1
carry
1
+0
+1
110112
+100012
1011002
If one considers the following decimal division:
1234
can see
he
that
10
= 123 r 4
this division
strips
off the
by two performs
11012
This
fact
can
102
= 1102
1.2
shows.
rightmost digit
significant
r1
be used to convert
Figure
Consider the
This
decimal
representation
method
finds
as
the
memory
a byte.
Decimal
25
= 12 r 1
2 =6 r0
2 =3 r0
2 =1
r1
2 =0 r1
12
6
3
Binary
= 1100 r 1
= 110 r 0
10 = 11r 0
10 = 1
r1
10 = 0 r 1
11001
10
1100
10
110
11
=110012
Thus 2510
589
16
36
16
16
=
=
=
Thus 589
36 r13
2 r4
0 r2
=24D16
Figure 1.3:
1.1.3
Hexadecimal
Hexadecimal
Hexadecimal
numbers
use
can
base
16.
be used
as a
shorthand
for
binary
numbers.
Hex
a problem
has
16
since there
has
a power
Example:
=
=
=
2BD16
To convert
idea that
Each digit of
of 16 associated
+ 11
512 + 176 + 13
2
16
16
+ 13
16
use
the
701
was
a hex
with it.
conversion
same
except
is that there is
4
CHAPTER 1.INTRODUCTION
quickly.
Hex provides
much
is converted
to 0010 0100
are
ensures
that the
process uses
segments 2.Example:
110
0000
0101
1010
0111
11102
E16
A 4-bit number
is called
nibble
Thus
make
represented
value
by
ranges
a
so a
to
corresponds
a
a
byte and
nibble. Two
can
byte
be
1.2
1.2.1
Computer Organization
Memory
Memory
is
memory
megabytes
can hold
measured
is
computer
units of kilobytes ( 2
10
32
with
memory
of
20
and
labeled
in
byte.
by
address
as
Address
bytes),
megabytes
=1,048,576
gigabytes
unique
2 30
number
as
known
is
shows.
1,073,741,824
1.4
Figure
memory
bytes)
its
bytes).
Memory
2A
45
B8
20
8F
CD
12
2E
memory
Often
single
have
bytes. On the PC
been
memory as
given
larger
sections
stored by
memory is numeric.
using a char-acter code
to characters.
numbers
common
character
(American
is supplanting
One
codes
Standard
Interchange)
difference
of
All data in
are
names
architecture,
to these
Code
is
for
that
maps
the
most
as
is known
new, more
ASCII
of
Characters
ASCII
Informa- tion
Unicode.
One
key
uses
2
If
it is not
clear
why
the
starting
point
makes
word
2 bytes
double word
4 bytes
quad word
8 bytes
paragraph
16 bytes
one
a character, but
per character.
byte to encode
maps
ASCII
character
004116
the
capital
byte
4116
important
for representing
The CPU
the
the word
limited to
Unicode extends
and allows
to be represented.
1.2.2
(6510 ) to
maps
uses a byte, it is
3
uses
For example,
A; Unicode
Since ASCII
more
Unicode
characters
many
This
is
on to
can access
data in registers
care to keep
take
only currently
The instructions
make
language. Machine
have
a much more
an instructions
purpose very quickly to run effi ciently.
to decode
Machine
is
(known
as the clock
speed). When
you or one
4.
The clock
does not keep track of minutes and seconds. It simply beats at a constant
second.
per second.
3
uses
so only
use.
are used
in many different
components of
a computer.
The
use different
6
CHAPTER 1.INTRODUCTION
a metronome
beats of
help
one
play music at
are
requires
usually
depends
model. The
called
on
number
cycles)
an
1.2.3
of cycles depends
as
on
and
the
well.
IBM-type
PCs contain
instruction
as
family
including
more recent
all have
the features.
are
identical.
They
were
the CPUs
16-bit registers: AX, BX, CX, DX, SI, DI, BP, SP,
CS, DS, SS, ES, IP, FLAGS. They only support
up to one
megabyte of
any memory
programs!
very
address,
This makes
divided
debugging
and security
program memory
has to be
can not
be
some new
was
instructions
new
many
of
new
new
16-bit registers
32-bit protected
mode.
In this mode, it
can access up to
again
divided
segment
into
gigabytes.
segments,
Programs
80486/Pentium/Pentium
now
but
are
each
insize!
Pro:
These
very
few
new
up the execution
of the
instructions.
Pentium MMX: This
processor
(MultiMedia eXtensions)
instructions
instructions
to
the
Pentium.
graphics operations.
These
AX
AH
AL
a faster
Pentium II.)
1.2.4
purpose
Each of these
into
could be decomposed
as
register contains
Figure
the
upper
into
the AH and
AH and AL
are
used
as
independent
one
byte
they
are not
independent
to realize that
purpose
registers
are
versa.
The
DI. They
are
ma-chine
are
used to
are
memory is
a program. CS
segment registers.
used
stands
for different
parts
of
1.2.7.
to
keep
track
of the
by the
as an
CPU. Normally,
executed, IP is advanced
instruction
is
instruction inmemory.
The
information
instruction
FLAGS
about
stores
register
the
These
results
results
of
are
important
previous
stored
as
was zero or
0 if not
zero.
instruction
to
see
8
CHAPTER 1.INTRODUCTION
1.2.5
80386
The
and
processors
later
have
the 16-bit AX
compatible,
32-bit
as
no way to access
the
upper
16-bits
registers
of EAX
are
EBX,
well.
BP
becomes
EBP;
SP
are
extended
becomes
as
ESP;
However
registers,
below)
only
the
registers
are used.
protected
extended
are
mode
versions
purpose
(discussed
of
these
80386. There
FS and
are
also two
GS. Their
anything
They
segment
One of definitions
sees
was
80386
was
was
developed, it
released.
first
was
When
the
of word unchanged,
definition
confusing.
now a little
even
though the
1.2.6
Real Mode
where
So
memory
did
is limited
bytes). Valid
range
mous
from?
required
The BIOS
some of the 1M
one
to only
DOS
megabyte
640K
limit
(2
20
address
to FFFFF. These
a 20-
a 20-bit
16-bit registers. Intel solved this problem, by using two 16-bit values to
ware devices
screen.
In real mode,
infa-
addresses require
come
the
determine
an address.
values must be stored in segment registers. The second 16-bit value is called
the offset. The physical address referenced by
a 32-bit
selector:offset
pair is
16 selector
+ offset
a 0 to
+0048
04808
In effect, the selector value is a paragraph
number (see Table 1.2).
Real segmented addresses have disadvantages:
large amounts
This
of data
Each byte in
memory
unique
can
be referenced
by
047C:0048, 047D:0038,
047E:0028
or
047B:0058.
the comparison of
This
can
seg-
mented addresses.
1.2.7
complicate
are interpreted
a selector
In
value
memory. In
a selector
value is an index into a descriptor table.
In
both modes, programs are
protected mode,
segments
inphysical
value
are not
memory.
In fact,
memory at all!
Protected mode uses a technique called virtual
memory The basic idea
of a virtual memory system is to only keep the
data and code in memory that
programs are currently using. Other data and
code are stored temporarily
on disk until they are needed again. In16-bit
are
program
memory to work.
an entry
ina descriptor
are still
One
well-known
segment sizes
the 286
arrays
dead.
columnist called
problematic!
use of large
CPU brain
10
CHAPTER 1.INTRODUCTION
1.2.8
1.Offsets
allows
up to 4 gigabytes.
2. Segments
units called
it
is. This
is not
practical
with
the
larger
NT/2000/XP,
1.2.9
Interrupts
the ordinary
Sometimes
flow
of
a program
response
The hardware of a computer
a mechanism called interrupts to handle
these events. For example, when a mouse is
moved
the mouse hardware
interrupts
the
current program to handle the mouse movement
(to move the mouse cursor, etc. )Interrupts
cause control to be passed to an interrupt
handler. Interrupt handlers are routines that
process the interrupt. Each type of interrupt is
assigned an integer number.
At the beginning
of physical memory, a table of inter-rupt vectors
prompt
provides
an index
External
addresses of
interrupts
mouse
are
Many
keyboard,
cards)
devices
I/O
raise
interrupts
(e.g.,
Internal interrupts
generated
are
instruction
uses
called
(Application
modern
UNIX)
Interface)
interface.
Many interrupt
handlers
program
restore
values they
More
as Windows
and
to the
DOS
Programming
use a C based
interrupt
interrupts.
to the interrupted
They
the
from
software
finish.
same
occurred.
the program.
5
However,
kernel level.
they
may use a
at the
11
1.3
Assembly Language
1.3.1
Machine language
type
Every
machine
language
of
language.
are
understands
CPU
Instructions
its
in
own
machine
as bytes in memory.
own unique numeric
its operation code or opcode
for
80x86 processors instructions vary
numbers stored
called
short. The
Many instructions
or
addresses)
also include
used by the
instruction.
Machine
program
language
is
very
in directly. Deciphering
of the numerical-coded
for humans.
says to add
diffi cult
instructions
For example,
to
the meanings
is tedious
the instruction
that
together
C3
an assembler can do
a program
the programmer.
1.3.2
Assembly language
An assembly
text (just
as a
language
program
is stored
program).
ma-chine
as
one
as:
inassembly language
add
eax, ebx
of the instruction
is much
for
the
addition
instruction.
The
a text
file with assembly instructions and converts the assembly into machine
code. Compilers
are programs
a compiler.
years
Every
It took several
a single
machine
computer
may
and
ure
that since
every
a compiler!
is
Porting
12
CHAPTER 1.INTRODUCTION
computer
different
architectures
books
com-mon
More
examples
or NASM
Assembler
Assembler
is much
more
language.
uses
the
Netwide
are Microsofts
or Borlands Assembler
are some differences in the
assemblers
(MASM)
There
(TASM).
1.3.3
Instruction operands
Machine
number
and
code
type
instructions
of
oper-ands
operands;
have
varying
however,
a fixed
can have
(0 to 3). Operands
ters.
in
memory:
may
a constant hardcoded
or may be computed using
values
registers.
of
memory.
The
be
into the instruction
are
Address
always
segment.
immediate:
These
are
values
fixed
that
are
are
These
operands
are not
explicitly
or memory.
The
one to a register
one is
implied.
1.3.4
The
Basic instructions
most
instruction.
It
basic
moves
instruction
data from
is
one
the
MOV
location to
another
(like
the
assignment
operator
in
mov
dest,
src
by
src
is copied to dest
memory
operands.
This
quirk
assembly.
There
of
points
are
may not
out
be
another
often somewhat
are
is
an
example
(semicolons
same
into BL.
start
13
mov
eax, 3
;
store
3 into EAX
mov
bx, ax
;
store
the value of AX
eax, 4
add
al,ah
= eax + 4
;
al= al+ ah
;
eax
-10
ebx, edi ;
ebx = ebx -edi
bx, 10
;
bx = bx
or
decrement values by
Since the
one is an implicit
one.
operand, the
ecx
;
ecx++
dec
dl
;
dl--
1.3.5
Directives
are gen-erally
used to either
or inform
are not
are:
define constants
define
equ directive
equ directive can be used to define a
symbol. Symbols are named constants that can be
used inthe assembly program. The format is:
The
The
symbol
equ value
Symbol values
can not
be redefined later.
14
CHAPTER 1.INTRODUCTION
Unit
Letter
byte
word
double word
quad word
ten bytes
macros
just
as in C.
mov
eax, SIZE
use
a macro
named SIZE
symbols
and
can
in two
be
more
ways
than
Data directives
an
initial
value, too)
DX directives. The X
letters
those
in the
RESX
directives.
It is very
db
;
byte
;
word
L3
dw
labeled L1
1000
db
110101b
;
byte
initialized
;
byte
12h
db
indecimal) L5
;
byte
17o
db
dd
;
double
1A92h
to hex 1A92 L7
word initialized
;
1
resb
uninitialized byte
L8
db
;
byte
"A"
initialized
are
treated the
in
Sequences
1.3. ASSEMBLY
inC.
LANGUAGE
15
L9
db
d, 0
db
0,1,2,3
db
;
defines a C string
= "word"
L11
;
same as
word, 0
L10
The DD directive
can
integer
and
single
precision
point
floating
can
only be used to
For large
sequences,
NASMs
TIMES directive
a specified
L12
times 100 db 0
L13
resw
;
reserves room
100
for 100 words
that labels
as
square
one
and
should think of
convention.)
mov
as
a
follow
examples:
al,[L1]
;
copy
byte
at L1into AL
mov
;
EAX
eax, L1
mov
address of byte at L1 3
;
copy
ah
mov
[L1],
AH into byte at L1
;
copy
eax, [L6]
eax, [L6]
;
EAX
at L6
add
add
= EAX +double
[L6], eax
word
mov
al,[L6]
;
copy
first
shows an important
assem-bler does not keep
track of the type of data that a label refers to. It
is up to the programmer to make sure that he (or
she) uses a label correctly. Later
it will be
common to store addresses of data in registers
and use the register like a pointer variable in C.
Again, no checking is made that a pointer is
used correctly. In this way, assembly is much
more error prone than even C.
Line
7 of the
examples
mov
[L6], 1
;
store a 1
at
L6
This
not
statement
specified
produces
error.
an
operation
Why?
Because
size
the
a byte,
word
or
size specifier:
mov
dword [L6], 1
;
store a 1
at L6
6
variable in C.
16
CHAPTER 1.INTRODUCTION
an 1
at the
specifiers
1.3.6
and TWORD
7.
It
involves
in-terfacing
dependent
with
the
a simple,
uniform programming
no standard libraries
access hardware
operation in pro-tected mode)
must
They
either
(which is a privileged
or use
whatever
directly
low
level
routines
that
the
very common
be interfaced
the
code
I/O routines
rules
for assembly
with C. One
for
can use
However,
passing
routines to
advantage
of this is
the standard C
one must
information
know
between
complicated
uses.
These rules are too
to cover here. (They are covered
later!)
simplify
routines
that
To
developed
C
rules
the
I/O,
author
has
and
1 .4
Table
interface.
provide
more
much
describes
the
rou-tines preserve
simple
routines
the value of
them.
that
one must
the
a
preprocessor
To include
%include
a file with
to use
NASM, use the
include
assembler
file in
needs
directive.
The
following
To
use one
one
uses a
value
and
loads
CALL
to
function
call in
high level
section
program
below
shows
several
examples
routines.
1.3.7
Debugging
authors
library also contains some
for debugging programs
These
The
useful routines
defines
TWORD
floating point
8
that
code
The
..
asm io
asm io
a ten
byte
coprocessor uses
downloads
(and
inc
requires)
inc
on
the
area
of
memory.
The
the
are
asm io
in
web
object
the
page
file
example
for
this
tutorial, http://www.drpaulcarter.com/pcasm
17
print int
inEAX
print char
print string
screen
the
screen a new
an integer
reads
reads
a single
then make
regs
This
coprocessor,
macro
respectively.
is displayed
that is printed
distinguish
the
If it is 0,it is not
a single integer
out as well. This can
It takes
output
of
argument
be used to
dump
different
regs
commands.
dump
of
as
mem
a region
macro
of memory (in
This
.
.
ASCII characters.
arguments
delimited
that
dump
is used
regs
It takes
to label
argument)
the
paragraphs
memory
output
integer
(just
is
can
be
the number
as
will
start
on
is
label.)
of 16-byte
displayed
comma
an
three
The first is
the
The
first
macro
in Chapter
4.) The
as double
way.
9
It
takes
three
comma
18
CHAPTER 1.INTRODUCTION
arguments
delimited
The first is
an
integer
that
macro
coprocessor.
It
takes
single
integer
1.4
the
output
regs
does.
Creating
just
as
the
of
a Program
program
argument
written completely
stand alone
in assembly language
Assembly
rou-tines.
assembly
makes
other platforms.
at all.
So, why should
anyone
can be
access to direct
2.Assembly allows
hardware
one
one
last
two
learning assembly
programs
points
demonstrate
that
programs
in assembly, but he
uses
the ideas he
1.4.1
First
The early
program
programs
simply
asm main.
calls
program
another
This is really
in Figure 1.6
function
named
protected
mode.
corresponding
worry
about
19
in
their
by C. The assembly
any
and
code. The
int main()
ret status
=asm main();
use Cs
I/O
shows
a simple
assembly
program.
first.asm
;
file: first.asm
;
First assembly program. This program
;
input and prints out their sum.
;
To create executable using djgpp:
;
nasm -f coff first.asm
;
gcc -o first first.o driver.c asm_io.
89
%include "asm_io.inc"
10
12
;
initialized
;
13
segment .data
11
14
as
segment
15
;
These
for output
16
17
prompt1 db
;
dont
"Enter
a number:
18
",0
prompt2
db
19
outmsg1 db
20
outmsg2 db
"and ",0
21
outmsg3 db
",the
sum of these
is
",0
22
23
24
;
;
uninitialized
.bss segment
25
26
segment .bss
27
28
;
;
These
29
20
30
input1
resd 1
31
input2
resd 1
32
35
;
;
code
;
36
segment .text
33
34
37
38
global
_asm_main
_asm_main:
39
enter
40
pusha
0,0
41
42
mov
eax, prompt1
43
call
print_string
45
call
read_int
46
mov
[input1],
48
mov
eax, prompt2
49
call
print_string
44
eax
;
;
47
50
51
call
read_int
52
mov
[input2],
eax
;
;
54
mov
55
add
56
mov
eax, [input1]
eax, [input2]
ebx, eax
;
;
;
53
57
dump_regs 1
58
dump_mem
59
60
61
62
2,outmsg1, 1
;
;
next
;
message
63
mov
eax, outmsg1
64
call
print_string
65
mov
eax, [input1]
66
call
print_int
67
mov
eax, outmsg2
68
call
print_string
69
mov
eax, [input2]
70
call
print_int
71
mov
eax, outmsg3
CHAPTER 1.INTRODUCTION
setup routine
print out prompt
read integer
eax
= dword at input1
= eax
;
print
;
print
as series
;
print
;
print
;
print
;
print
memory
of steps
out first
message
out input1
out second
out input2
message
72
call
print_string
73
mov
eax, ebx
74
call
print_int
75
call
print_nl
76
77
popa
78
mov
79
leave
eax, 0
21
;
print
out third
;
print
out
;
print
new-line
;
return
80
message
sum (ebx)
back to C
ret
first.asm
initialized
Only
They
will be printed
so must
library and
in this
be terminated
are
strings
with
the
with
null
there is
Uninitialized
segment
(named
.bss
code
historically.
segment
is
.text
named
are
It is where instructions
placed.
an
underscore
of the C calling
very
convention.
important
uses
now, one
will be presented;
(i.e., functions
and
conven-tion
This
global
Later the
en-
however,
for
have
underscore
compiler.
to them by the C
prefix appended
(This
DOS/Windows,
rule
is
specifically
for
does not
assembler
global
directive
to make the
first.asm module.
22
CHAPTER 1.INTRODUCTION
1.4.2
Compiler dependencies
can
be freely downloaded
ternet. It requires
runs
386-based PC
compiler
uses
95/98
or better and
or NT. This
format
use
with
nasm
(as
remove
uses
and
an
object
The compiler specific
another
Microsoft
popular
the authors
for object
Borland C/C++ is
compiler.
files. Use
uses
It
the
OMF format
the -f obj
switch
for
Borland compilers.
already been modified to
work with the appropriate
compiler.
The extension of the object file willbe obj. The OMF format
uses differ-
ent segment directives than the other object formats. The data segment
(line 13) must be changed to:
segment
use32
use32
use32
Inaddition
line 36:
a new
group
The
DGROUP
Microsoft
BSS
DATA
compiler
C/C++
or the
given a
can use
object
OMF
files.
converts
(If
the
information
to
format,
Win32
internally.)
defined just
as
format
to be
1.4.3
nasm
10
it
-f object-format first.asm
(http://www.fsf.org)
11
http://www.delorie.com/djgpp
23
where object-format
win32 depending
on
used. (Remember
that the
what
C compiler
source
1.4.4
or
obj
will be
file must be
as well.)
For DJGPP,
a C compiler.
use:
gcc -c driver.c
1.4.5
Linking
is
the
process
of
combining
the
an executable
file. As will
correct
pa-rameters,
run.
C library and
It is much easier to
first
program
using DJGPP,
gcc -o first
This creates
with the
use:
driver.o first.o
an executable
asm io.o
Borland
asm io.obj
gcc -o first
Now
driver.c first.o
asm io.o
1.4.6
Understanding
an assembly
listing file
The -llisting-file
tell
how the
can be used to
file of a given name.
code was assembled.
switch
appear
are
source
same as
24
CHAPTER 1.INTRODUCTION
48 00000000 456E7465722061206Eprompt1 db
"Enter
a number:
",0 49
00000009 756D6265723A2000
50 00000011 456E74657220616E6Fprompt2 db
raw
case
the
on the line
column are very
source
file is displayed
may
program.
Each module
section
these
1.4.5),
all
definitions
are
segment.
The
combined
new
final
data
to
step (see
segment
form
offsets
one
are
label
data
then
source
file:
94
0000002C
eax,
mov
A1 [00000000]
95 00000031 0305[04000000]
[input1]
add
eax,
mov
ebx, eax
[input2] 96
00000037
89C3
The
third
generated
code for
column
shows
the
by the assembly.
machine
Often
code
the complete
be computed yet.
can compute
mov
instruction
but
it writes
the
In this
in
offset
can not
case, a temporary
square
brackets
be computed yet.
offset
of 0 is used
does
beginning
program.
not
of
mean
the
that
final
it will be
bss
segment
at the
of
the
any
can compute
the
offset that
appears
but 04000000.
multibyte
in
memory
is not 00000004,
Why? Different
processors store
Endian is pronounced
like
methods of
storing integers
Big endian
:
big
is the
25
that
seems
most significant)
next
biggest,
00000004
etc. For
would be stored
processors
method. However,
example,
as
the
dword
processors
use this big endian
Intel-based processors use the
most RISC
all
byte
or through a network).
as a multibyte
integer
as individual
bytes
or vice
versa.
Endianness
array
always
strings
elements.
at the lowest
(which
are
address.
just
an array
This applies
character
1.5
arrays)
is
to
elements
arrays.
Skeleton File
as a starting
programs.
used
a skeleton
file that
can be
26
skel.
1
%include "asm_io.inc"
segment .data
CHAPTER 1.INTRODUCTION
;
initialized data
;
segment here
segment .bss
8
9
;
;
uninitialized
segment
10
11
12
13
14
segment .text
global
_asm_main
_asm_main:
15
enter
16
pusha
0,0
17
18
19
20
21
;
;
code is put inthe text
;
or after this comment.
;
22
23
popa
segment.
24
mov
25
leave
;
setup
eax, 0
routine
;
return
26
back to C
ret
skel.asm
Chapter 2
Basic Assembly
Language
2.1
2.1.1
Integer representation
Integers
signed
come
Unsigned
non-negative)
are
straightforward
binary
integers
(which
represented
manner.
in
are
very
this be represented
memory.
in
There
are
three
general
computer
memory
techniques
that
signed integers in
use
as a sign
Signed magnitude
rep-resents
the integer
as two
parts. The first part is the sign bit and the second
is the magnitude
be represented
as
bit is underlined)
The
largest
+127
the byte
or
(the sign
11111111
00111000
byte
127. To negate
bit is reversed.
value
a value,
or
be
the sign
would
Since
same.
the
This
complicates
the
should act
logic
of
27
28
CHAPTER 2.BASIC
ASSEMBLY
LANGUAGE
as 10
Ones complement
The
second
method
is
known
as
ones
a number
number.
new
(Another
way to
bit value is 1
oldbitvalue.) For example, the
bit
value.
)For
ones
complement
In ones
ones
com-plement
complement
is
notation,
equivalent
computing
to
the
nega-tion.
for 56.
was
automatically
and that
as one
are two
(+0) and
representations
ex-
zero:
00000000
with ones
numbers is complicated.
There is
complement
would
twice yields
complement
changed
a
of
a number
in hexadecimal without
is
is
agrees
Twos complement
Modern
twos complement
of
were used on
use a third
computers
representation.
number is found by
2.Add
Heres
an example
of step 1
one is added:
11000111
1
11001000
In two complements
twos
complement
number.
Thus,
complement
negations
should reproduce
WITH INTEGERS
the
to negating
is equiv-alent
11001000
is
rep-resentation
Surprising
twos
The C language
29
notation, computing
of
the
twos
56.
Two
complement
does
meet
2.1. WORKING
this
Number
Hex Representation
00
01
127
7F
-128
80
-127
81
-2
FE
-1
FF
Representation
one to the
ones complement.
00110111
1
00111000
When performing
complement
leftmost bit
not
used.
computer
number
may
the
produce
Remember
is of
of
operation,
some
bits).
addition
a carry.
that
fixed
Adding
This
all data
of
two
bytes
the
carry is
on the
of
always
produces
byte
words produces
important
twos
for
example,
complement
complement
notation.
zero as a one
consider
number (00000000)
byte
For
twos
Computing
its
c
where
c represents
00000000
a carry.
carry,
the
result.)
Thus,
arithmetic
in
(Later
it will be
twos
complement
makes twos
methods.
twos
Using
byte
can
complement
notation,
be used to represent
are
to +32, 767
signed
some
selected values
can
be represented.
+32, 767 is
represented
be
re prese ntby
ed.7FFF, 32, 768 by 8000, -128
FF80 and -1 as
numbers
range
approximately.
as
billion
to +2 billion
no
or double
idea what
particular byte
Whether
on what
instruction is used
data
FF is considered
a signed 1 or a unsigned
on the programmer. The C language
represent
on the
to
+255 depends
30
CHAPTER 2.BASIC
ASSEMBLY
LANGUAGE
This allows
a C compiler to
determine the
2.1.2
Sign extension
a specified size. It
uncommon to need to change the size of
to use it with other data. Decreasing size is
the easiest.
the
more
remove
a
trivial example:
mov
;
ax =52 (stored
ax, 0034h
in16 bits)
mov
cl,al
lower 8-bits of
Of
course,
if
;
cl=
ax
the
number
can not
be
represented
correctly
in
the
smaller
size,
were
above
code
method
would
still set
CL to 34h. This
numbers.
Consider
FFFFh (1
(1
0134h (or
as a
as a
byte).
signed numbers,
was
if AX
or
removed
must
have
the first
the
same
new
as
as the original
the
sign bit of
same
sign bit!
to
more
complicated
a unsigned
If FF is
on how
FF is interpreted.
an
In general, to extend
one
makes
new
all the
unsigned
bits
number,
of the expanded
00FF. However,
to extend
sign
is 1, the
new
ones, to
decimal)
was extended,
31
are
There
provides
that
for extension
the computer
number
is
signed
of
does
or
numbers.
not know
unsigned.
numbers,
It is
Remember
whether
up to
the
instruction.
one can
simply
put
zeros
in the
upper
bits using
MOV instruction.
an
mov
;
zero
ah, 0
it is not
However,
out
upper
possible
an
unsigned
double
no way to
EAX in a MOV. The
providing
a new
There is
instruction
word
use a
to
8-bits
MOV
word in AX to
in EAX. Why
specify the
upper
not?
16 bits of
MOVZX
by
This
The destination
a 16 or 32 bit register.
source (second operand) may be an 8 or 16 bit
register or a byte or word of memory. The other
(first operand) must be
The
source.
(Most
must be larger
instructions
require
the
size.) Here
movzx
eax, ax
;
extends ax into eax
movzx
eax, al
;
extends
alinto
eax
movzx
ax, al
;
extends
alinto
ax
movzx
ebx, ax
;
extends ax into
ebx
no easy way to
any case. The 8086
use
provided
several
numbers.
The
for
instructions
CBW
to extend
are
Word to Double
signed
to Word)
Byte
(Convert
into AX.
word) instruction
sign extends
means to
as one 32 bit
upper
any
new
not have
several
instructions.
Word
to Double
sign extends
Double
word
word
The
(Convert
instruction
CWDE
Extended)
sign
Application to C programming
Extending
also
occurs
of unsigned
and signed
in C. Variables in
integers
define
may
be declared
as
either signed
Consider
a is extended
for unsigned values (using MOVZX), but inline 4,the signed rules
are used
signed
or not, it is up to
or
whether the
That is why
32
CHAPTER 2.BASIC
ASSEMBLY
LANGUAGE
=0xFF;
=0xFF;
signed char
schar
a = (int) uchar;
int
int b
= (int) schar;
a = 255
/ b
= 1
(0x000000FF) /
(0xFFFFFFFF) /
Figure 2.1:
char ch;
while( (ch
= fgetc(fp))
!= EOF ){
/ do something with ch /
}
Figure 2.2:
back
an int since
reason
an char
is that it normally
(ex-tended to
However, there is
basic
problem
the program in
re-turns an int, but
with
higher
000000FF
truncated
can not
and
FFFFFFFF
both
will be
case,
or unsigned.
on whether
char is signed
an int value
ch will be
extended to an int so that two values being
2.
compared are of the same size
As Figure 2.1
showed, where the variable is signed or unsigned
is very important.
EOF.
Since
EOF is
If char is unsigned,
000000FF.
to be
FF is extended
This is compared
to EOF (FFFFFFFF)
never
ends!
1
It is
a common
an EOF
The
reason
33
char
If
is
signed,
FF
is
compare as
extended
to
ch variable
is
done,
2.
Inside
the value
2.1.3
As
was seen
performs
performs
addition
earlier,
and
subtraction.
the
the
Two
add
instruction
sub
instruction
of the bits
overflow and
carry
in the
are the
for signed
arithmetic.
shortly.
subtraction
integers.
signed
or
of the great
One
complement
advantages
are
same as
may
exactly the
FFFF
on
43
a carry generated,
answer.
There are two different
There is
multiply
be used
(1)
(1)
002B
used
for unsigned
44
002C
or
2s
and
unsigned integers.
the
of
for addition
instruction.
and divide
use either
the MUL
is
used to multiply
different
signed integers.
instructions
multiplication
complement
are
needed?
different
Why
The
are two
rules
for
the multiplication
yielding
word
result.
Using
unsigned
or 65025
or 1
(or 0001 in hex).
There
or 1
(or
are
0001
several
in hex).
forms of the
(or
this is
1 times 1
1
multiplication
The
source
source
reference.
Exactly
depends
is either
It
what
on the
can not
register
be
an
multiplication
size of the
or a memory
immediate
source
is
value.
performed
operand. If the
16 bits of AX. If
multiplied
the
source
is 16-bit, it is
34
BASIC ASSEMBLY
CHAPTER 2
LANGUAGE
dest
source1
source2
Action
reg/mem32
= AL*source1
DX:AX =AX*source1
EDX:EAX =EAX*source1
reg16
reg/mem16
dest *= source1
reg32
reg/mem32
dest *= source1
reg16
immed8
dest *= immed8
reg32
immed8
dest *= immed8
reg16
immed16
reg32
immed32
reg16
reg/mem16
immed8
dest
reg32
reg/mem32
immed8
dest
reg16
reg/mem16
immed16
reg32
reg/mem32
immed32
reg/mem8
AX
reg/mem16
dest *= immed16
dest *= immed32
=source1*source2
=source1*source2
dest =source1*source2
dest =source1*source2
source
is 32-bit, it is
into EDX:EAX.
The IMUL instruction has the same formats
some
as
imul
dest, source1
imul
formats:
div
source
If the
source
DX:AX
in AH. If the
source
source
is 16-bit, then
the operand
is divided by
same way.
like
There
the special
are no
IMUL
special IDIV
ones.
If the
or the
zero, the program is interrupted and
terminates. A very common error is to forget to
initialize DX or EDX before division.
quotient
divisor is
The
NEG
instruction
negates
its
may
be
any
8-bit, 16-bit,
or
single
Its
32-bit
register
or memory
location.
2.1.4
Example programmath.asm
%include "asm_io.inc"
segment .data
prompt
db
"Enter
square_msg
db
"Square of input
;
Output
strings
a number:
35
",0
is ",0
cube_msg
is ",0
db
"Cube of input
cube25_msg
db
db
8
"Cube of
quot_msg
rem_msg
db
"Remainder of
neg_msg
db
"The negation
10
11
segment .bss
12
input
resd 1
13
14
15
16
segment .text
global
_asm_main
_asm_main:
17
enter
18
pusha
0,0
19
20
mov
eax, prompt
21
call
print_string
23
call
read_int
24
mov
[input],
imul
eax
ebx, eax
28
mov
mov
29
call
print_string
30
mov
eax, ebx
31
call
print_int
32
call
print_nl
22
eax
25
26
27
33
eax, square_msg
;
;
34
mov
ebx, eax
35
imul
ebx, [input]
36
mov
eax, cube_msg
37
call
print_string
38
mov
eax, ebx
39
call
print_int
40
call
print_nl
cube/100 is ",0
of the remainder is ",0
setup routine
edx:eax
=eax *eax
save answer
inebx
ebx *= [input]
36
ASSEMBLY
CHAPTER 2.BASIC
LANGUAGE
41
42
imul
ecx, ebx, 25
43
mov
eax, cube25_msg
44
call
print_string
45
mov
eax, ecx
46
call
print_int
47
call
print_nl
49
mov
eax, ebx
50
cdq
51
mov
ecx, 100
52
idiv
54
mov
mov
ecx
ecx, eax
eax, quot_msg
55
call
print_string
56
mov
eax, ecx
57
call
print_int
58
call
print_nl
59
mov
eax, rem_msg
48
53
60
call
print_string
61
mov
eax, edx
62
call
print_int
63
call
print_nl
neg
mov
edx
66
67
call
print_string
68
mov
eax, edx
69
call
print_int
70
call
print_nl
64
65
eax, neg_msg
71
72
popa
73
mov
74
leave
;
ecx
eax, 0
=ebx*25
;
initialize
;
cant
;
edx:eax / ecx
;
save quotient
into
ecx
;
negate
the remainder
;
return
back to C
ret
75
2.1.5
Assembly
.
that
math.asm
allow
one
to
perform
instructions
addition
and
These in-structions
use
the
carry
flag. As stated
the
carry
flag if
respectively.
generated,
37
or subtract
carry
flag
can be
large numbers by
up the operation
use this
= operand1 + carry
+ operand2
= operand1 -carry
flag
operand2
How
add
eax, ecx
;
add lower
adc
edx, ebx
;
add
32-bits
2
upper
32-bits and
carry
from previous
sum
lower 32-bits
sbb
;
subtract upper
For really
to
use
ADC
can be done
instruction
right
initialize the
carry
there is
a loop could be
a sum loop, it would
instruction for every
large numbers,
be convenient
This
edx, ebx
iteration
;
subtract
eax, ecx
sub
no
instructions.
before
loop
difference
The
starts
carry
to
flag is 0,
same
idea
can
be used
for
subtraction, too.
2.2
Control Structures
High
control
level
languages
structures
(e.g.,
provide
high
the
and
if
level
while
goto
and
used
It instead
uses the
can
inappropriately
infamous
result
in
assembly
basic procedure
language
programs
program
is to design the
high level
The
logic
control structures
language
(much like
compiler
would
do).
2.2.1
Comparisons
38
CHAPTER 2.BASIC
ASSEMBLY
used
LANGUAGE
later.
The
80x86
are
of
on the
provides
the
CMP
The FLAGS
The operands
are set
based
on
result
that
comparison like:
cmp
vleft, vright
The difference
of vleft
- vright
is computed
> vright,
<
borrow). If vleft
is set (borrow).
= OF if
Why does SF
flag vleft
> vright?
an operation
If there
flag
The sign
difference
have
the
is
set
is
set
(just
If vleft
as for unsigned
integers). If vleft
= vright, the ZF
correct
value
be non-negative.
SF
and
must
Thus,
SF
=OF =0.However,
and
= OF. If vleft < vright, ZF is unset and SF 6= OF.
Do not forget that other instructions
correct
value
instructions
(and
will be negative).
SF
in fact
transfer
Branch
Thus,
= OF = 1
can
2.2.2
execution
Branch instructions
points of a
to arbitrary
program. In other
are two types
conditional. An unconditional
a goto,
it
conditional
always
branch depending
JMP
(short
unconditional
usually
and
makes
the
branch.
conditional
There
branch
register. If
a goto.
unconditional
for
jump)
branches
Its
instruction.
instruction
single
makes
argument
is
to.The assembler
or linker
one
another
easier.
that the
assembler
It
statement
is
important
immediately
to realize
that
life
the
are
several
variations
of
the
jump
instruction:
SHORT This jump is
very
range. It
in memory.
limited in
128 bytes
39
JZ
JNZ
JO
JNO
JS
JNS
JC
JNC
JP
JNP
memory
move
ahead
or
behind.
keyword
It
uses a
displacement
is how
(The
immediately
many
single
of the
bytes to
displacement
is
use
the
short jump,
unconditional
it
can
near
This allows
The
four
protected
byte
mode.
type
The
is the
two
default
byte
type
in 386
can
be
before the
in the
code
are
segment
the
same
rules
as
defined by placing
in front
of
the
are many
instructions.
whether to branch
of these
which
indicates
number
or not.
instructions.
the
flag
or evenness
the
odd
of
of the
result.)
The following pseudo-code:
40
CHAPTER 2.BASIC
ASSEMBLY
LANGUAGE
== 0 )
= 1;
if (EAX
EBX
else
EBX
= 2;
cmp
as:
eax, 0
jz
thenblock
mov
ebx, 2
jmp
next
thenblock:
mov
ebx, 1
next:
Other comparisons
;
set
;
if ZF is set
of IF
;
jump over
THEN part of IF
;
THEN part
of IF
so easy
-0 =0)
branch to thenblock
;
ELSE part
not
EBX
>= 5 )
= 1;
are
else
EBX
= 2;
ZF
cmp
eax, 5
js
signon if SF
=1
jo
;
goto
elseblock if OF
jmp
thenblock
if SF
jo
elseblock
=1
and SF =0
;
goto
= 0 and OF =0
;
goto
signon
signon:
;
goto
thenblock
thenblock if SF
thenblock
=1
and OF =1
elseblock:
8
mov
ebx, 2
jmp
next
10
thenblock:
mov
11
12
ebx, 1
next:
are signed
Table
are
the
same
for
41
Unsigned
Signed
= vright
JE
branches if vleft
JNE
JL, JNGE
branches if vleft
JLE, JNG
JG, JNLE
branches if vleft
JGE, JNL
< vright
> vright
=vright
JE
branches if vleft
JNE
JB, JNAE
branches if vleft
JBE, JNA
JA, JNBE
branches if vleft
JAE, JNB
<vright
>vright
instructions
two
have
synonyms.
For
same instruction
because:
x < y =
The unsigned branches
not(x y)
use
pseudo-code above
much easier.
cmp
eax, 5
jge
thenblock
ebx, 2
jmp
next
thenblock:
mov
mov
ebx, 1
next:2.2.3
operand.
LOOP Decrements ECX, if ECX 6= 0, branches
to label
LOOPE, LOOPZ Decrements ECX (FLAGS
= 1, branches
0 and ZF
= 0,branches
are useful
for
pseudo-code:
42
CHAPTER 2
sum =0;
for( i=10; i
>0; i )
sum += i;
could be translated into assembly
as:
mov
eax, 0
;
eax
is
mov
ecx, 10
;
ecx
is
loop_start:
add
eax, ecx
loop
loop_start
BASIC ASSEMBLY
LANGUAGE
sum
i
2.3
Translating Standard
Control Structures
can be
2.3.1
If statements
else
else block
could be implemented
so that
;
code
to set FLAGS
jxx
else_block
;
select xx
3
jmp
endif
else_block:
;
code
;
code
as:
endif:
branch
can be replaced
by a branch to endif.
;
code
to set FLAGS
jxx
endif
xx so that
;
code
4
endif:
;
select
43
2.3.2
While loops
}This
could be translated into:
1
while:
;
code
jxx
endwhile
on
condition
3
xx so that
branches if false
;
select
4
body of loop
jmp
while
endwhile:
2.3.3
Do while loops
}while( condition );
do:
;
body
of loop
;
code
jxx
do
on
condition
4
;
select xx
so that
branches if true
2.4
Example:
Finding
Prime
Numbers
This section looks at
a program
that finds
prime numbers
are
this
C.
shows the basic algorithm written in
2 is the only
44
even prime
number.
CHAPTER
BASIC ASSEMBLY
guess;
LANGUAGE
/ current
guess
unsigned
unsigned factor;
/ possible factor of
unsigned limit;
/ find primes
for prime
guess
up to this
value /
45
up to:);
&limit);
printf (2\n);
/ treat
printf (3\n);
/ special
case
/ initial
guess
guess
=5;
while (guess
10
/ look for
11
factor
12
<= limit
){
a factor
of
guess
14
factor
15
guess
as
/
/
= 3;
while (factorfactor
13
< guess
&&
% factor != 0 )
+= 2;
if (guess % factor != 0)
16
printf (%d\n,
17
guess += 2;
18
guess);
19
Figure 2.3:
prime.asm
1
%include "asm_io.inc"
segment .data
Message
db
"Find
resd
45
segment .bss
6
Limit
resd
Guess
89
segment .text
global
10
11
_asm_main
_asm_main:
12
enter
13
pusha
0,0
14
15
mov
eax, Message
16
call
print_string
17
call
read_int
18
mov
[Limit],
eax
19
primes
up to:",0
;
find primes up to this
;
the current guess
;
setup routine
;
scanf("%u",
limit
for prime
& limit );
45
20
mov
eax, 2
21
call
print_int
22
call
print_nl
23
mov
eax, 3
24
call
print_int
25
call
print_nl
mov
dword [Guess], 5
26
27
28
while_limit:
29
mov
eax,[Guess]
30
cmp
eax, [Limit]
31
jnbe
end_while_limit
mov
ebx, 3
32
33
34
while_factor:
35
mov
eax,ebx
36
mul
eax
37
jo
end_while_factor
38
cmp
eax, [Guess]
39
jnb
end_while_factor
40
mov
eax,[Guess]
41
mov
edx,0
42
div
ebx
43
cmp
edx, 0
44
je
end_while_factor
46
add
ebx,2
47
jmp
while_factor
45
48
end_while_factor:
49
je
end_if
50
mov
eax,[Guess]
51
call
print_int
call
print_nl
54
add
dword [Guess], 2
55
jmp
while_limit
52
53
56
end_if:
end_while_limit:
57
58
popa
59
mov
60
leave
eax, 0
;
printf("2\n");
;
printf("3\n");
;
Guess
=5;
;
while
(Guess
<= Limit
;
use
;
ebx
is factor
=3;
are
unsigned
;
edx:eax
= eax*eax
;
if answer
;
if !(factor*factor < guess)
;
edx
=edx:eax
;
if !(guess
% ebx
% factor != 0)
;
factor += 2;
;
if !(guess
% factor != 0)
;
printf("%u\n")
;
guess += 2
return back to C
61
ret
46
prime.asm
CHAPTER 2.BASIC
ASSEMBLY
LANGUAGE
Chapter 3
Bit Operations
3.1
Shift Operations
common
the
individual
bits
programmer to
of
data.
One
shift. A shift
3.1.1
Logical shifts
shows
an example
of
a shifted
manner.
Figure 3.1
Original
Left shifted
Right shifted
new,
incoming bits
are
allow
one to shift
any
by
These
number of
can
positions to shift
a constant or can
be stored in the CL
register.
stored
zero.
always
in the
carry
flag. Here
are some
code
examples:
mov
to
bit
right,
ax,
ax
ax
ax
0C123H
8246H,
2091H,
CF
;
shift
4123H,
CF
;
shift
;
shift
ax
left,
ax,
shr
ax, 0C123H
shl
CF
15
=0
13
1
bit to
shr
1
bit to right,
mov
ax
47
48
,
,
ax
shl
to
bits
mov
cl,3
ax
shr
bits to right,
3.1.2
048CH,
CF
17
;
shift
cl
ax =0091H,
CF
=1
Use of shifts
Fast
most
;
shift
ax
left,
multiplication
common uses
of
and
a shift
division
are
operations. Recall
the
and
For
example,
to double
the
binary
once to
the left to get 101102 (or 22). The quotient of a
division by a power of two is the result of a right
shift. To divide by just 2, use a single right shift;
number 10112
3),
instructions
are very
basic and
are
much faster
logical
can
shifts
be
used
to
val-ues. They do
logically
value FFFF
(signed
once,
right shifted
If value
it is
1).
can
be
3.1.3
Arithmetic shifts
These
shifts
are
designed
to allow
signed
powers
treated correctly.
SAL Shift Arithmetic
just
a synonym
is assembled
machine code
as
Left
- This instruction
is
for SHL. It
into
the exactly
the
same
as SHL. As long
instruction
Right
- This
is
a new
as
are copies
The other
new
bits
are
bits that
new
bits
are
also 1).
are shifted.
carry
flag.
8246H,
mov
ax,0C123H
ax,
sal
CF
13
1
sal
;
ax = 048CH, CF = 1
;
ax = 0123H, CF = 0
sar
;
ax =
ax, 1
ax, 2
49
3.1.4
Rotate shifts
shifts except that bits lost off one end of the data
are
is treated
simplest
Just
as
mov
8247H,
CF
ax, 0C123H
ax
rol
13
are two
;
ax =
ax, 1
ax, 1
ax, 2
ax, 1
;
ax = 048FH, CF = 1
;
ax = 091EH, CF = 0
;
ax = 8247H, CF = 1
;
ax = C123H, CF = 1
There
rol
rol
ror
ror
carry
flag
if the AX
register
is rotated
17-bits made
up
of AX and the
carry
flag
the
are
rotated.
mov
clc
the
carry
ax, 0C123H
flag (CF
= 0)
;
ax =8246H,
15
rcl
091BH, CF
=0
=1
rcr
CF
;
ax = C123H,
CF
CF
;
ax =
ax, 1
6
;
ax = 8246H,
3.1.5
CF
;
ax = 048DH,
ax, 1
rcl
;
clear
rcl
ax,
ax, 2
=1
=0
rcr
ax, 1
Simple application
are on
register.
1
mov
;
blwill
bl,0
;
ecx
ecx, 32
mov
count_loop:
4
bit into
shl
carry
;
shift
eax, 1
flag
jnc
;
if CF == 0,goto
skip_inc
skip_inc
7
inc
bl
skip_inc:
loop
count_loop
50
CHAPTER 3.BIT OPERATIONS
X AND Y
AND
a byte
be
3.2
are
There
four
common
boolean
operators:
of each operation
3.2.1
support
these
operations
bits of data
in parallel.
contents of AL and BL
are
as
on
all the
For example,
if the
as Figure
mov
ax,0C123H
and
ax, 82F6H
8022H
;
ax =
3.2.2
The OR operation
mov
ax,0C123H
or
ax, 0E831H
= E933H
;
ax
51
X OR Y
X XOR Y
3.2.3
example:
mov
ax, 0C123H
xor
ax, 0E831H
;
ax
= 2912H
3.2.4
The NOT
operation
a unary
is
operation
(i.e. it acts
operations such
mov
ax, 0C123H
not
ax
;
ax =
3EDCH
Note
that
complement.
the
NOT
finds
the
ones
operations,
any
of the
an
A ND
3.2.5
The
TEST
instruction
performs
performs
a subtraction
on what
set.
the result
zero, ZF would
be
52
Turn
on bit i
Turn offbit i
Complement
bit i
NOT X
called
a mask
3.2.6
Bit operations
for
common uses
mov
xor
ax,
C12BH
410BH
0FFF0H
4F00H
or
invert
ax,
;
turn
nibbles
0FFFFH
ax
off nibble,
ax
= BF0FH
;
1s
bit
ax,
= 4F0BH
xor
ax = C10BH
bit 5,
;
invert
ax
ax
and
8000H
;
turn on nibble,
ax,
;
turn on
;
turn off
0FFDFH
ax
ax
ax
3,
bit
ax,0C123H
or
0F00H
and
ax
complement,
0F00FH
8
31,
xor
ax
40F0H
of
can
number with
a power of two
a division by 2 i,AND
division
by
It
from
is just
bit 0 these bits that contain the remainder.
The result
zero out
mov
64H
ebx, 0000000FH
= 16 -1
= 15 or F
;
ebx = remainder =4
Using the CL register
it is possible
sets (turns
on)
an
arbitrary
an
;
mask
ebx, eax
and
arbitrary
;
100 =
eax, 100
mov
to modify
example that
53
mov
;
blwill
bl,0
;
ecx
ecx, 32
mov
count_loop:
bit into
carry
;
add just
loop
;
shift
eax, 1
shl
the
adc
carry
flag to bl
flag
bl,0
6
count_loop
mov
;
first
cl,bh
mov
ebx, 1
3
shl
left cltimes
;
shift
ebx, cl
or
eax, ebx
;
turn on bit
Turning
1
a bit offis
mov
just
a little
harder.
;
first
cl,bh
mov
ebx, 1
shl
left cltimes
;
invert
not
ebx
bits
;
turn off
eax, ebx
and
;
shift
ebx, cl
4
bit
Code to complement
an arbitrary
bit is left
as an
xor
;
eax
eax, eax
=0
3.3
Avoiding Conditional
Branches
Modern
processors
common
technique
execu-tion
This
processing
multiple
capabilities
processor,
the
parallel
of the CPU to
execute
once.
at
instructions
branches present
uses
technique
Conditional
or not.
If it is taken,
try to predict
whether
taken.
the
branch
will
be
If
has wasted
the
its
54
CHAPTER 3.BIT OPERATIONS
One
using
The
way to
conditional
sample
provides
code
a simple
branches
in
when
possible.
3.1.5
example of where
one
could do
EAX register
are counted.
can
be
a branch to skip
removed
It uses
carry
instructions
by
how the
using
the
ADC
flag directly.
provide
a way to
remove
branches
to
location
These
a byte register or
zero or one based on the
cases.
certain
instructions
memory
in
are
conditional
same
the
branches.
If
The characters
characters
the
used for
corresponding
al
set, else 0
techniques
that
one can
cal-cuate
develop
values
some
without
branches.
maximum
of
two
values.
The
standard
use a
use a conditional branch to act on which
was larger. The example program below
how the maximum can be found without
value
shows
any branches.
1
;
file: max.asm
%include "asm_io.inc"
segment .data
message1 db "Enter
a number:
",0
0
message3 db "The larger number is:",
0
89
segment .bss
10
11
input1
resd
;
first
12
13
14
15
segment .text
global
_asm_main
_asm_main:
16
enter
17
pusha
0,0
18
19
mov
eax, message1
20
call
print_string
21
call
read_int
number entered
;
setup
;
print
;
input
routine
out first
message
first number
55
mov
[input1],
24
mov
eax, message2
25
call
print_string
26
call
read_int
28
xor
ebx, ebx
29
cmp
eax, [input1]
30
setg
bl
;
;
;
neg
mov
ebx
32
33
and
ecx, ebx
ecx, eax
34
not
ebx
35
and
ebx, [input1]
22
eax
23
27
31
;
;
;
;
;
or
ecx, ebx
38
mov
eax, message3
39
call
print_string
40
mov
eax, ecx
41
call
print_int
42
call
print_nl
36
37
43
44
popa
45
mov
46
leave
47
ret
eax, 0
message
ebx
=0
compare
= (input2
ebx = (input2
ecx = (input2
ecx = (input2
ebx = (input2
ebx = (input2
ecx = (input2
ebx
>input1)
>input1)
>input1)
>input1)
>input1)
>input1)
>input1)
1
:
0
?0xFFFFFFFF
:
0
?0xFFFFFFFF
:
0
input2
:
0
?
?
?
?
0:
0xFFFFFFFF
0:
input1
input2
:
input1
return back to C
a bit
mask that
can be
or
0 otherwise.
the required
instruction
that EBX
if EBX is 1, the
or
0xFFFFFFFF.
This
is just
the
bit
mask
mask
An alternative
trick is to use the DEC
statement. Inthe above code, if the NEG is replaced
with
0xFFFFFFFF.
are
or
reversed
56
CHAPTER 3.BIT OPERATIONS
3.4
3.4.1
Unlike
some
~operator.
are performed
by Cs
s;
short int
int is 16bit /
3
s = 1;
complement) /
assume
short unsigned
that short
u;
s = 0xFFFF
u= 100;
(2s
/
u=
0x0064 /
5
u = u|
0x0100;
u = 0x0164
s =s & 0xFFF0;
s = 0xFFF0
s = s u;
s = 0xFE94
u= u<< 3;
u = 0x0B20
(logical
shift )/
(arithmetic
3.4.2
s = s >> 2;
s = 0xFFA5
shift )/
are used
same purposes as they are used
language.
They
allow
one to
individual bits of data and can be
The bitwise operators
multiplication
in C for the
in assembly
manipulate
a smart
compiler will
use a shift
for
a multiplication
like,
x *= 2,automatically.
Many
POSIX
operating
system
API
2 s (such
operands
example,
systems
POSIX
as
(a better
others.
name
Each
would be owner),
user can
type of
bits. For
maintain
programmer
to
be granted
a file requires
manipulate
file
users: user
group and
as
use
individual
macros to help
a file.
the C
bits
(see Table
3.6). The
1
This
unary
2
operator
is different
from
the
binary
&&
and
& operators!
Computer
for Portable
Environments.
IEEE based
on UNIX.
Operating
System
stan-dard
Interface
developed
by
for
the
57
REPRESENTATIONS
Macro
Meaning
S IRUSR
S IWUSR
S IXUSR
S IRGRP
S IWGRP
S IXGRP
S IROTH
S IWOTH
S IXOTH
a string with
on and an integer 4
two parameters,
file to act
the
set
name
the
of the
to allow the
owner
group to read
the file
no access.
S IRUSR |
S IWUSR |
S IRGRP );
can
be used to
permissions
access to
some
without
changing
others.
removes
access
struct stat
are not
file stats
altered.
/ struct
used by
stat () /
2
stat(foo,
//
read
read
filefil info
e
file sta
ts
..
chmod(foo,
3.5
Representations
Chapter 1introduced
little endian
representations
of multibyte
data.
many
confuses
covers
the
refers
a multibyte
method.
so on. In other
words
are
the
in
the
bytes
Actually
typedef to
opposite
a parameter
of
type
mode t
which
an integral type.
58
CHAPTER 3.BIT OPERATIONS
if (p[0]
= 0x1234;
p = (unsigned
assumes
sizeof(short)
char ) &word;
== 0x12 )
else
printf ( Little Endian Machine\n);
== 2 /
is
representing
In
big
56 78. In little
would be stored
endian represenation,
representations
would
seem
the bytes
for
asking himself
representation?
sadists
as 12 34
as 78 56 34 12.
endian
word
endian
use
right
little
inflicting
on multitudes
this
of
at
confusing
program-mers?
It
(and to
that
are not
of
many
electronic
in any
Consider
can
be
are
AL. There
are not
in any
not before
instruction
or
are
mov
to memory
is not any
a CPU can be
determined. The
array.
as a two
element
byte of word in
memory
which
depends
on
the
59
REPRESENTATIONS
unsigned invert;
unsigned char ip
x)
67
= xp [3];
= xp[2];
ip[2] = xp[1];
ip[3] = xp[0];
ip[0]
reverse
ip[1]
10
11
return invert
12
13
3.5.1
Big Endian
common
some
or a
an issue
for
byte character
it.
sets,
like
important
even
for
con-
text
a double
a mechanism
for specifying
system, the two functions just return their input unchanged. This allows
anness
see W.
a C function
that inverts
processor
introduced
a new
reverses
the bytes of
any
example,
bswap
;
swap
edx
bytes of edx
reverses
little
to big is the
functions
do the
same
con-verting
same
thing.
on
an integer
16-bit
simply
operation.
So both
of
or
these
60
CHAPTER 3.BIT OPERATIONS
int cnt
=0;
45
while( data != 0 ){
=data
data
cnt++;
return cnt;
10
instruction
3.6
ah,al
;
swap
bytes of ax
Counting Bits
Earlier
a straightforward
was given
are on in a
technique
of bits that
double
word. This
section
looks
at other less
as an exercise
using
3.6.1
Method
The first
one
method
very
is
How
does
this
work? In every
one bit is turned off in
bits are off (i.e. when data
method
zero is equal to
the number
Line 6 is where
a bit of data
binary representation
zero.
every
bit
of data
- 1? The bits to
same as for
be the
complement
xxxxx10000
of
the
original
bits
data
-1 =
xxxxx01111
61
1
/ lookup table /
23
void
4
67
for( i
cnt
while( data != 0 ){
10
data
11
/ method
one
cnt++;
12
13
14
=cnt;
15
16
= data
17
18
19
20
= (unsigned
21
22
23
24
[byte[1]]
[byte [3]];
where the xs
are
the
same
will
zero
the rightmost 1
in data and leave all the
Method two
3.6.2
A lookup table
bits
of
an
straightforward
precompute
can
arbitrary
approach
the number
two related
would
be
The
to
are
word.
problems
array.
However, there
with this
approach.
62
CHAPTER 3.BIT OPERATIONS
can be
construction
like:
(data
to find
similar
most
the
ones
significant
byte
value
and
constructions
will
be
slower
an array
than
reference.
Computing
the
sum as
the explicit
In fact,
the index.
sum
a smart
of four
compiler
sum.
This
loop
iterations
process
technique known
3.6.3
or
compiler
eliminating
optimization
as loop unrolling.
Method three
There
counting
of reducing
is
is
yet
another
clever
are on
in
method
of
data. This
sum must
byte stored in
counting
variable
named
operation:
data
0x55 is
are
bit positions
bits
position
same
>>
operand ((data
the
at the
contains
the
even bits
of data. When these two operands are added
together, the even and odd bits of data are
odd bits and the second operand the
added
together.
101100112
For
then:
63
example,
if
data
is
x)
= {0x55555555,
0x33333333,
0x0F0F0F0F,
0x00FF00FF,
0x0000FFFF };
int i;
int shift
10
11
12
return
13
14
x;
or
00
01
00
01
01
01
00
01
01
10
00
10
would be:
data
>> 2) &
0x33);
data
now
is 011000102 ):
data & 001100112
or
fields to that
0010
0010
0001
0000
0011
0010
are
independently added.
The next step is to add these two bit
sums
001100102 ):
data & 000011112
00000010
or
00000011
00000101
64
CHAPTER 3.BIT OPERATIONS
Now
data
Figure
is 5 which
3.8 shows
an
is the correct
implemen- tation
result.
of
this
uses a for
sum.
It would be
Chapter 4
Subprograms
This chapter
make modular
programs
are
to
will
to
interface
programs
assembly
that
can be
subprograms
used
with
pass
4.1
Indirect Addressing
as a pointer,
be used indirectly
square
is to
direct
mov
memory
mov
ebx, Data
a register
it is enclosed in
mov
ax, [Data]
addressing of
;
ebx
;
normal
a word
2
= & Data
ax, [ebx]
;
ax =
*ebx
Because AX holds
word
a single
was
byte would be
determined
65
66
CHAPTER 4.SUBPROGRAMS
general
purpose
(EAX, EBX,
can be
4.2
Simple Subprogram
Example
A subprogram
an
independent
unit of
a
a subprogram is like a
function in C. A jump can be used to invoke the
subprogram, but returning presents a problem. If
code that
program.
can
is
In other words,
can not
be hard coded to
a label.
uses
the value of
This
a register
a
program
the first
from chapter 1
rewritten to
is
use
a subprogram.
sub1.asm
;
file: sub1.asm
;
Subprogram example program
%include "asm_io.inc"
45
segment .data
a number:
prompt1 db
"Enter
prompt2 db
",0
outmsg1 db
outmsg2 db
"and ",0
10
outmsg3 db
",the
sum of these
is ",0
11
12
segment .bss
13
input1
resd 1
14
input2
resd 1
15
16
17
18
segment .text
global
_asm_main
_asm_main:
19
enter
20
pusha
0,0
;
setup
eax, prompt1
;
print
21
22
mov
23
call
print_string
mov
ebx, input1
24
25
;
dont
;
store
routine
out prompt
address of input1 into ebx
67
26
mov
ecx, ret1
27
jmp
short get_int
;
;
29
mov
eax, prompt2
30
call
print_string
32
mov
ebx, input2
33
mov
ecx, $ + 7
34
jmp
short get_int
mov
eax, [input1]
28
ret1:
31
35
36
;
;
37
add
38
mov
eax, [input2]
ebx, eax
40
mov
eax, outmsg1
41
call
print_string
42
mov
eax, [input1]
43
call
print_int
44
mov
eax, outmsg2
45
call
print_string
46
mov
eax, [input2]
47
call
print_int
48
mov
eax, outmsg3
49
call
print_string
50
mov
eax, ebx
51
call
print_int
52
call
print_nl
;
;
eax, 0
39
53
54
popa
55
mov
56
leave
57
ret
58
;
subprogram
59
;
Parameters:
60
ebx
get_int
-address
of dword to store
= this address + 7
eax = dword at input1
ecx
= eax
message
message
print out
message
sum (ebx)
print new-line
return back to C
ecx
integer into
61
ecx
-address
of instruction to
return to
63
;
Notes:
; value
64
get_int:
62
of eax is destroyed
65
call
66
mov
read_int
[ebx],
eax
67
;
jump
memory
ecx
sub1.
back to caller
68
CHAPTER 4.SUBPROGRAMS
get int
The
register-based
EBX
register
uses a simple,
conven-tion. It expects the
subprogram
calling
to
hold
the
address
of
the
ECX register
instruction
of the
to compute
this return
on. The
on line 36.
are
awkward.
to be defined
second
instead of
simpler
a label, but
If a near jump was
would
a label
call. The
not
a short
does
used
be 7! Fortunately,
way to invoke
there is
much
uses
the stack.
4.3
The Stack
Many
stack.
CPUs
have built-in
A stack is
support
a Last-In First-Out
an area of memory
for
(LIFO)
that is
removes
is always the
a last-in
The
SS
segment
register
specifies
the
same segment
ESP register
contains
This
the
instruction
double
word
at [ESP]
The
POP
work and
assumes
that
push
= 0FFCh
;
2 stored
at 0FF8h, ESP
push
dword 3
;
1
stored
dword 1
0FFCh, ESP
= 0FF8h
;
3 stored
= 0FF4h
pop
EAX =3,ESP = 0FF8h
;
EBX = 2,ESP = 0FFCh
;
ECX = 1,ESP = 1000h
ESP
Actually words
words
on the stack.
at 0FF4h,
eax
pop
ebx
pop
at
dword 2
push
in32-bit
ecx
69
The stack
can
be used
as a convenient
subprogram
place
calls, passing
parameters
that pushes
the values
of
a PUSHA
instruction
can
be used to
pop
4.4
Instructions
The 80x86 provides two instructions
the stack to make calling subprograms
easy.
The
CALL
uncondi-tional jump to
instruction
subprogram
pops
of f
use
makes
an
and pushes
on the stack.
an address and
that
quick and
jumps
to
that
address.
When
using
so
these
one man-age
new
instructions
by changing lines 25 to 34
to be:
mov
ebx, input1
call
get_int
mov
ebx, input2
call
get_int
call
read_int
mov
[ebx],
eax
ret
There
are several
It is simpler!
al lows
ogram
sca
It allows subprogramsIt calls
to subpr
be nested
easily.
Notice
that
get int
on the
a RET
pops
off the
pops off
asm main.
70
CHAPTER 4.SUBPROGRAMS
on the
get_int:
call
read_int
mov
[ebx],
push
eax
ret
eax
;
pops
off
4.5
Calling Conventions
When
subprogram
on
how to
languages
as
the calling
agree
have standard
calling
interface
pass
is invoked,
conventions.
with assembly
ways to pass
For
data known
high-level
language,
code
to
the assembly
can
use
language.
the
same
on
are on or
optimizations
convention
a CALL
All
or may vary
is compiled
not). One
(e.g. if
universal
compilers
support
as
depending
RET.
one
calling
allow one
are reentrant. A
may be called at any point
to create
subprograms
reentrant subprogram
of
conventions
a program
that
itself).
4.5.1
Passing parameters
Parameters
on the
stack.
the
to
They
parameter
subprogram,
on the stack
is
to
be
changed
by
less than
a double
the
must be
size is
parameters
on
the
stack
are not
are
instead they
Since
they
have
to be pushed
on
the stack
on again).
can not
be kept in a
chapter
CONVENTIONS
71
4.5. CALLING
ESP + 4
Parameter
Return address
ESP
Figure 4.1:
ESP + 8
Parameter
ESP + 4
Return address
ESP
subprogram data
Figure 4.2:
single parameter
proces-
pa-
dressing, the
rameter
can be accessed
sor accesses
2).
If the stack is also used inside the subprogram to store data, the number
needed to be added to ESP will change. For example, Figure 4.2 shows what
the stack looks like if a DWORD is pushed the stack. Now the parameter is
registers
indirect
sion.
EAX,
is to reference data
ments depending
on the
EDX
EBX,
while
ECX
and
purpose
segment.
stack. The C calling convention mandates that
a subprogram
first
save the
value of EBP on the stack and then set EBP to be equal to ESP. This allows
ESP to change
as data
is pushed
or popped
EBP. At the end of the subprogram, the original value of EBP must be
restored (this is why it is saved at the start of the subprogram.) Figure 4.3
However,
this is usually
proprograms, be-
cause
are the
same.
up the general
prologue of a subprogram.
Lines 5 and 6 make
up the epilogue.
Figure 4.4
parameter
can be access
any place
[EBP + 8] at
with
inthe subprogram
are
must
2
remove
the parame-
It is legal to add
a constant to a register
complicated expressions
when
More
are
72
CHAPTER 4.SUBPROGRAMS
subprogram_label:
push
ebp
;
save original
mov
ebp, esp
;
new
;
subprogram
pop
ret
EBP
EBP value
on stack
=ESP
code
ebp
;
restore
ESP + 8
EBP + 8
Parameter
ESP + 4
EBP + 4
Return address
ESP
EBP
Figure 4.4:
saved EBP
ters.
(There
is
another
form
support
compilers
pascal keyword
is
of
easy to
the
RET
do.) Some C
too.
The
and
this
convention
use
this
convention.
In
fact,
use
the
stdcall
API C functions
types of
the
operation
remove the
will vary from one call
to
allows
the instructions
Pascal
convention
and
stdcall
makes
operation
very
convention
diffi cult.
Thus,
the
does not
this convention
since
none
this
Pascal
can use
removes
convention
subprogram
would
using the
be called.
Line
The compiler
because
the
that
use
program
ECXs
with two
4.5.
may
be
CALLING
CONVENTIONS
73
1
push
dword 1
call
fun
add
esp, 4
;
pass 1
as parameter
;
remove
source
into single
up
in the linking
process.
a subprogram uses to
%include "asm_io.inc"
23
segment .data
4
sum
dd
56
segment .bss
7
input
resd 1
89
;
10
11
12
13
14
15
;
pseudo-code
;
i
=1;
;
sum = 0;
algorithm
;
while( get_int(i,
; sum += input;
; i++;
&input),
16
;
}
17
;
print_sum(num);
18
segment .text
global
19
20
_asm_main
_asm_main:
21
enter
22
pusha
0,0
23
mov
24
25
edx, 1
while_loop:
26
push
edx
27
push
dword input
28
call
get_int
29
add
esp, 8
sub3.asm
input != 0 ){
;
setup routine
;
edx is i inpseudo-code
;
save
i
on stack
;
push address on input on stack
;
remove i
and &input from stack
74
30
32
mov
cmp
eax, [input]
eax, 0
33
je
end_while
add
[sum],
37
inc
edx
38
jmp
short while_loop
31
34
35
eax
36
39
40
end_while:
41
push
dword [sum]
42
call
print_sum
43
pop
ecx
44
45
popa
46
leave
ret
47
48
49
50
;
subprogram
;
Parameters
get_int
(inorder pushed
on
CHAPTER 4.SUBPROGRAMS
;
sum += input
;
push value
;
remove
of
sum onto
stack
stack)
+12])
51
52
+ 8])
53
;
Notes:
are
destroyed
55
segment .data
56
prompt
db
")Enter
segment .text
59
get_int:
an integer
57
60
push
ebp
61
mov
ebp, esp
mov
62
63
call
print_int
66
mov
eax, prompt
67
call
print_string
69
call
read_int
70
mov
ebx, [ebp
71
mov
[ebx],
64
65
68
memory
eax
+ 8]
72
73
pop
74
ret
ebp
75
81
;
subprogram print_sum
;
prints out the sum
;
Parameter:
; sum to print out (at [ebp+8])
;
Note: destroys value of eax
;
82
segment .data
83
result
76
77
78
79
80
db
"The
sum is ",0
84
85
segment .text
86
print_sum:
87
push
ebp
88
mov
ebp,
90
mov
eax, result
91
call
print_string
esp
89
92
93
mov
eax, [ebp+8]
94
call
print_int
95
call
print_nl
97
pop
ebp
98
ret
96
;
jump
4.5.2
sub3.
75
back to caller
Local variables
The stack
on the stack
location
subprograms
to
be
reentrant
one
any
place,
reentrant subprograms
Using the stack
wishes
reentrant
program
global
is active.
Local
variables
are
stored
right
by subtracting
the number
after
the
are
allocated
of bytes
required
76
CHAPTER 4.SUBPROGRAMS
subprogram_label:
push
ebp
;
save original
mov
ebp, esp
;
new
sub
esp, LOCAL_BYTES
;
=# bytes
;
subprogram
EBP
EBP value
onstack
=ESP
needed by locals
code
mov
esp, ebp
;
deallocate
pop
ebp
;
restore
ret
locals
int i
n,int
sump
sum =0;
45
for( i=1; i
<= n;i++ )
sum += i;
sump
= sum;
new
is
subprogram
to
used
sum
Figure 4.6
access
local
variables.
pro-gram
return information
Despite
the
fact
function creates
and
LEAVE
that
ENTER
a new
simplify
a stack
the
of a
on the stack.
invocation
stack frame
storage is
frame. Every
often.
Why
performs
performs
The
Because
the
they
prologue
can be simplified
instruction
For the
are
ENTER
code
takes
simplier
by using
and
instruction
the
ENTER
operands
are designed
LEAVE
epilogue
two immediate
instructions!
This
struction
one.
77
1
cal_sum:
push
ebp
mov
ebp, esp
sub
esp, 4
;
make room for local sum
56
-4],0
;
sum =0
mov
dword [ebp
mov
ebx, 1
;
ebx (i)= 1
for_loop:
cmp
ebx, [ebp+8]
;
is i
<= n?
jnle
end_for
12
add
[ebp-4], ebx
13
inc
ebx
14
jmp
short for_loop
10
11
;
sum += i
15
16
end_for:
17
mov
ebx, [ebp+12]
;
ebx =sump
18
mov
eax, [ebp-4]
;
eax =sum
19
mov
[ebx],
21
mov
esp, ebp
22
pop
ebp
23
ret
eax
;
*sump
=sum;
20
4.6
Multi-Module Programs
program
A multi-module
more
than
presented
programs.
one
object
here
is
have
one
composed of
programs
All the
file.
been
multi-module
object
library
Recall
that
object
files).
program.
the
linker
a single executable
up references
in another
the
use a
module.
In
label defined in
extern
directive
comes
a comma
as external to the
are labels that can be
but are defined in another.
routines
In assembly,
defines the
labels
can not
78
CHAPTER 4.SUBPROGRAMS
be
accessed
ESP
+ 16
+ 12
ESP + 8
ESP + 4
EBP + 12
ESP
EBP + 8
EBP + 4
Return address
ESP
EBP
EBP
sump
saved EBP
-4
sum
Figure 4.9:
1
subprogram_label:
enter
leave
ret
;
=# bytes
LOCAL_BYTES, 0
;
subprogram
needed by locals
code
13
Figure
defined
of
The
global
does
this
program listing in
asm main label being
the skeleton
1.7 shows
as
directive
global.
there would be
the
Without
this
declaration,
Because the
Next
rewritten
separate
use two
to
subprograms
modules.
two
The
source
example,
are
in
main4.asm
1
%include "asm_io.inc"
segment .data
4
sum
dd
segment .bss
7
input
resd 1
segment .text
10
global
_asm_main
11
extern
get_int, print_sum
12
13
_asm_main:
enter
0,0
setup routine
14
pusha
4.6. MULTI-MODULE PROGRAMS
79
15
16
17
mov
edx, 1
while_loop:
18
push
edx
19
push
dword input
20
call
get_int
21
add
esp, 8
23
mov
eax, [input]
24
cmp
eax, 0
25
je
end_while
add
[sum],
29
inc
edx
30
jmp
short while_loop
22
26
27
eax
28
31
32
end_while:
33
push
dword [sum]
34
call
print_sum
35
pop
ecx
36
37
popa
leave
38
;
edx is i inpseudo-code
;
save i
on stack
;
push address on input on stack
;
remove
i
and &input from stack
;
sum += input
;
push value
;
remove
of
sum onto
39
stack
main4.asm
sub4.asm
1
%include "asm_io.inc"
segment .data
4
prompt
db
")Enter
an integer
segment .text
7
global
get_int, print_sum
get_int:
enter
0,0
10
11
mov
12
call
print_int
14
mov
eax, prompt
15
call
print_string
13
16
80
17
call
read_int
18
mov
ebx, [ebp
19
mov
[ebx],
+ 8]
eax
20
21
leave
22
ret
23
24
segment .data
25
result
db
"The
sum is ",0
26
27
segment .text
28
print_sum:
enter
0,0
31
mov
eax, result
32
call
print_string
34
mov
eax, [ebp+8]
35
call
print_int
36
call
print_nl
29
30
33
37
38
leave
39
ret
sub4.
CHAPTER 4.SUBPROGRAMS
;
store input into memory
;
jump back to caller
same way.
the
4.7
Today,
very
few
programs
are written
are very good at
code. Since
high
language,
level
it
is
more popular. In
more portable
than assembly!
When assembly
can be done
in
two
ways:
uses.
supports
NASMs
require different
No
compiler
format.
formats
at
the
Different
moment
compilers
uses
gcc
format. The
that all GNU compilers
use. It
WITH C
81
1
segment .data
dd
format
db
"x
45
= %d\n", 0
...
segment .text
6
push
dword [x]
;
push xs
push
dword format
;
push address
call
_printf
;
note underscore!
add
esp, 8
;
remove
10
value
of format string
an assembly
more standardized on
technique of calling
much
subroutine is
the PC.
Assembly routines
the following
Direct
are usually
reasons:
access
are
or impossible to access
diffi cult
from C.
as fast as possible
and the
can.
reason
The last
is not
Compiler technology
and
code
turned
on)
routines
are: reduced
Most
has improved
The
of
the
disadvantages
of
assembly
calling
conventions
there
have
are a
few
4.7.1
Saving registers
First, C
assumes
that
subroutine
maintains
can
mean
that
be used in
C variable
the subroutine
it does change their values, it must restore their original values before the
subroutine returns.
because C uses these registers for register variables. Usually the stack is
used to save the original values of these registers.
ern
compilers
relatively
similar
syntaxes
tion. These
register variables.
different
from
Mod-
the
any suggestions.
82
CHAPTER 4.SUBPROGRAMS
EBP + 12
value of
EBP + 8
EBP + 4
Return address
EBP
saved EBP
Figure 4.12: Stack inside printf
4.7.2
Labels of functions
Most
compilers
prepend
single
names
of functions
For example,
and global/static
a function
variables.
an
assembly
gcc
compiler
does
not
use
any character.
one simply would
prepend
an underscore. Note
that inthe assembly skeleton program (Figure 1
.7), the label for the main routine is asm main.
DJGPP s
gcc
does prepend
Passing parameters
4.7.3
of arguments.
It is not
on the stack
will
always be at
bitrary
number
of
argu-
macros
that can be used to process
them portably.
See any
printf code
+12]. However,
not be xs value!
4.7.4
this will
variables
or
does
this.
a label defined
However,
inthe
Basically, the
calculating
the
WITH C
83
x is located at EBP
one cannot just use:
mov
eax, ebp
8 on
-8
that MOV
as-sembler
a constant).
(that is, it
However, there
would calculate
the address
of
and
eax, [ebp
lea
-8]
Do
on the
not
be
instruction
however,
is
x and
could be
confused,
reading
it
the
looks
data
like
at
this
[EBP8];
instruction
not
never
true reads
The memory!
LEA in stru
It only
ction computes
the
address
that
instruction
would
be
read
by
another
in its first
( e.g.
size designation
Returning values
Non-void
conventions
are
types
integral
returned
return back
C functions
The C calling
(char,
in the
value
how this is
specify
enum,
int,
EAX
register
etc.)
If they
are
are
on if they
are
values
returned
in the EDX:EAX
4.7.6
The
rules
above
describe
the
standard
calling
support
Often compilers
conventions
as
well.
other
When interfacing
very
convention
your
important
to
the compiler
is
use
that
use
multiple
command
conventions
line switches
can
that
often
have
to
be used
change
4
The Watcom
does not
example
use
C compiler
the standard
source
is
an
example
conven-tion
of
one
that
84
CHAPTER 4.SUBPROGRAMS
the
calling
They
also
provide
conventions
However,
and
convention.
default
extensions
these
may vary
to
individual
extensions
are not
functions
standardized
The GCC
compiler
allows
of
different
calling
a function can be
explicitly
declared
by using the
following
a void
function
parameter,
use
the
void f(int )
attribute
supports
also
GCC
attribute
the
((cdecl));
standard
call
calling
could be declared
convention
convention
take
can
a fixed
the subroutine
to
remove
does)
number of arguments
(i.e.
ones not
an additional attribute
regparm that tells the compiler to use
registers to pass up to 3 integer arguments to a
function instead of using the stack.
This is a
common type of optimization
that
many
GCC also supports
called
compilers support.
to declare
calling
cdecl and
immediately
prototype.
would
conven-tions.
stdcall
as
act
keywords
function
the
before
They
add the
to C. These
keywords
modifiers
appear
in a
and
name
function
be defined
as
follows
for Borland
and
Microsoft:
void
cdecl f(int );
There
each
are
advantages
the
of
calling
and disadvantages
conven-tions.
The
to
main
advantages
simple and
very
flexible. It
can
be used for
any
can
limit
the
portability
of
the
can
use more
is that it
be slower than
memory
(since
requires
code
stack).
uses
less
memory
main
disadvantage
with functions
is that it
can not
be used
numbers
of
arguments.
The advantage
uses
registers to
of using
pass
convention
that
more
parameters
complex.
Some
on the stack.
may
be
is
in
WITH C
85
4.7.7
Examples
example
that shows how an
can be interfaced to a C program.
(Note that this program does not use the assembly
skeleton
program (Figure 1.7) or the driver.c
Next
is
an
assembly routine
module.)
main5.c
#include <stdio.h>
/ prototype
int
n,sum;
int )
((cdecl));
up to:);
&n);
10
scanf(%d,
11
printf (S
(Sum is %d\n, sum);
12
return 0;
13
14
main5.c
sub5.asm
;
subroutine _calc_sum
;
finds the sum of the integers
through n
1
;
Parameters:
; n
what
+ 8])
5
sump
-pointer
;
void
;
{
10
11
;
;
;
+ 12])
to int to store
;
pseudo
calc_sum( int
int i,
sum
n,int
= 0;
for( i=1; i
<= n;i++ )
sum += i;
sum
C code:
*sump )
; *sump
;
}
12
13
=sum;
14
segment .text
15
global
16
_calc_sum
17
86
CHAPTER 4.SUBPROGRAMS
Sum integers
up to:10
Stack Dump # 1
EBP
= BFFFFB70
ESP
= BFFFFB68
+16
BFFFFB80
080499EC
+12
BFFFFB7C
BFFFFB80
+8
BFFFFB78
0000000A
+4
BFFFFB74
08048501
+0
BFFFFB70
BFFFFB88
-4
BFFFFB6C
00000000
-8
BFFFFB68
4010648C
Sum is 55
Figure 4.13:
18
;
local variable:
19
sum at [ebp-4]
20
_calc_sum:
21
enter
4,0
22
push
ebx
23
24
mov
25
dump_stack 1,2,4
;
;
26
mov
ecx, 1
;
;
27
dword [ebp-4],0
for_loop:
28
cmp
ecx, [ebp+8]
29
jnle
end_for
31
add
[ebp-4],
32
inc
ecx
33
jmp
short for_loop
36
mov
ebx, [ebp+12]
37
mov
eax, [ebp-4]
38
mov
[ebx],
40
pop
ebx
41
leave
30
ecx
34
35
end_for:
eax
39
Sample
;
make room for sum on stack
;
IMPORTANT!
sum =0
print out stack from ebp-8 to ebp+16
ecx
is i
inpseudocode
cmp
i
and
if not i
<= n,quit
sum += i
ebx
eax
=sump
=sum
restore ebx
42
ret
sub5.asm
WITH C
87
so
important?
Because
the C calling
con-vention
program
Line
macro
just a
requires
the
by the function
very
25 demonstrates
how
first parameter
is
EBP
determine how
+ 8); the
EBP value is
variable is 0 at (EBP
- 4); and
of the local
-8).
changed to:
sum = calc
sum(n);
sum would
need be
sub6.asm
;
subroutine _calc_sum
;
finds the sum of the integers
through n
1
2
;
Parameters:
;
Return value:
; value of sum
;
pseudo C code:
;
int calc_sum( int n)
;
{
=0;
10
int i,
sum
11
for( i=1; i
<= n;i++ )
13
;
;
return
14
;
}
15
segment .text
12
16
sum += i;
sum;
global
_calc_sum
17
18
;
local variable:
19
20
_calc_sum:
21
[ebp
sum at [ebp-4]
enter
4,0
+ 8])
;
make room for sum on stack
88
CHAPTER 4.SUBPROGRAMS
segment .data
format
34
db "%d", 0
...
segment .text
5
lea
eax, [ebp-16]
push
eax
push
dword format
call
_scanf
add
esp, 8
10
11
...
22
23
mov
dword [ebp-4],0
24
mov
ecx, 1
25
for_loop:
26
cmp
ecx, [ebp+8]
27
jnle
end_for
29
add
[ebp-4],
30
inc
ecx
28
ecx
31
jmp
short for_loop
mov
eax, [ebp-4]
32
33
end_for:
34
35
leave
36
;
sum = 0
;
ecx is i
inpseudocode
;
cmp
i
and
;
if not
i
<= n,quit
;
sum += i
;
eax
37
= sum
ret
sub6.asm
4.7.8
assembly
of interfacing
as-sembly
code to
C and
access
functions
important
preserves
EDI registers;
registers
definitely
very
may
means
be changed,
as
In fact, EAX
will
it will contain
the
89
SUBPROGRAMS
asm io.asm
which
Subprograms
any
code instructions.
In
mov
word [cs:$+7], 5
;
previous
;
copy
add
ax, 2
statement changes 2 to 5!
work
protected
mode
operating
segment is marked
as
systems
the
code
maintain
below).
as data
in
tasegments).
in
All variables
stack.
There
are several
code.
A reentrant subprogram
can be called
recursively.
program can
processes. On many
reentrant
multiple
multi-tasking
operating
multiple instances of
program
be
shared
by
systems, if there
are
running, only
one copy
of the code
is inmemory. Shared
libraries and DLLs (Dynamic
Link Libraries)
Reentrant
subprograms
multi-threaded
grams
pro-
Linux,
support
etc.)
multi-threaded
programs.
4.8.1
Recursive subprograms
can be
occurs
Direct recursion
foo,
calls
recursion
itself
occurs
either
direct
when
inside
when
call themselves.
foos
or
indirect.
subprogram,
body.
a subprogram
say
Indirect
is not called
it
A multi-threaded
program
program
has multiple
threads
itself is multi-tasked.
90
CHAPTER 4.SUBPROGRAMS
of
;
finds
segment .text
global _fact
n!
_fact:
enter
0,0
67
mov
eax, [ebp+8]
cmp
eax, 1
;
eax =n
;
if n<= 1,terminate
jbe
term_cond
10
dec
eax
11
push
eax
12
call
_fact
;
eax =fact(n-1)
13
pop
ecx
;
answer
14
mul
dword [ebp+8]
;
edx:eax
15
jmp
short end_fact
16
=eax *[ebp+8]
term_cond:
mov
17
18
ineax
eax, 1
end_fact:
19
leave
20
ret
Recursive
termination
true,
subprograms
condition.
no more
recursive
condition
recursion
routine
must
When this
recursive
calls
have
condition
is
are made. If a
a termination
loop).
Figure 4.15 shows
a function
that calculates
factorials recursively.
with:
x =fact
/ find 3!/
(3);
4.17
and 4.18
show
another
on the
recursive
more
in C and assembly,
variable i. Defining
i
as
a new
own
independent
n=3 frame
n=2 frame
n=1frame
91
n(3)
Return address
Saved EBP
n(2)
Return address
Saved EBP
n(1)
Return address
Saved EBP
Figure 4.16: Stack frames for
factorial function
1
void f(int
x)
int i;
for( i=0; i
< x;i++ ){
printf (%d\n,
i);
f(i);
4.8.2
types
are
any
function and
locations
are
defined outside of
stored
or
memory
at fixed
bss segments)
program
and
until the
as
declared
same
can access
module
in the
are
are
local variables
the keyword
static
function
purposes!)
These variables
accessed
of
92
CHAPTER 4.SUBPROGRAMS
%define i
ebp-4
%define
segment .data
format
segment .text
;
useful macros
;
10 =\n
db "%d", 10,0
global _f
extern _printf
x ebp+8
_f:
enter
4,0
;
allocate room on stack
mov
dword [i],0
;
i
=0
13
mov
eax, [i]
;
is i
<x?
14
cmp
eax, [x]
15
jnl
quit
17
push
eax
18
push
format
19
call
_printf
20
add
esp, 8
22
push
dword [i]
23
call
_f
24
pop
eax
26
inc
dword [i]
27
jmp
short lp
for i
10
11
12
lp:
16
;
call printf
21
;
call f
25
28
;
i++
quit:
29
leave
30
ret
93
automatic
type for
a funcvariables are
allocated
on
the
are
are
deallocated
Thus,
they
not
do
have
fixed
memory
locations.
register This keyword asks the compiler to
use
just
a request.
of the variable
anywhere in the
program
(since registers
only
values.
simple
integral
Structured types
not fit in
automatically
types
can
be
Also,
register
register! C compilers
make normal
is used
automatic
without
any
would
will often
variables
programmer.
volatile This keyword tells the compiler that the
value of the variable
may
any moment.
can not make any
change
compiler
assumptions
modified. Often
about
This
means
that the
is
a compiler
in
variable
use
these
types
of
optimizations
with
one could
of
volatile
variable
be altered by two
threads
of
program
multi-threaded
If
would be
x =10;
y =20;
z = x;
could be altered by another
thread, it is
x between
so that z would not be 10.
if the x was not declared volatile, the
might assume that x is unchanged and
lines 1
and 3
However,
compiler
set z to 10.
the compiler
94
CHAPTER 4.SUBPROGRAMS
Chapter 5
Arrays
5.1
Introduction
array.
It is possible
5.1.1
arrays
Defining
arrays
Defining
segments
To define
segment,
NASM
use
initialized
also provides
TIMES that
many
an
can
times
statements
array
in the data
useful
directive
be used to repeat
without
having
by hand. Figure
named
a statement
to duplicate
the
examples of these.
segment,
use
an
uninitialized
the
resb,
types of definitions.
95
96
CHAPTER 5.ARRAYS
segment .data
;
define array
a1
;
define array
a2
;
same as before
a3
;
define array
a4
1,2,3,4,5,6,7,8,9,10
dd
of 10 words initialized to 0
0,0,0,0,0,0,0,0,0,0
dw
using TIMES
times 10 dw 0
of bytes with 200 0s and then a 100 1s
times 200 db 0
times 100 db 1
10
11
12
segment .bss
13
;
define anarray
14
a5
15
;
define anarray
16
a6
resd
resw
10
of 100 uninitialized words
100
Defining
arrays
stack
including
arrays,
directly
and subtracts
or
using the
this
ENTER
a function
needed
a 50 element
4
+ 50
word
array, one
= 109 bytes.
would need 1+ 2
a multiple of four
(112 in this case) to keep ESP on a double word
boundary.
One could arrange
the variables
inside this 109 bytes in several ways. Figure 5.2
shows two possible ways
The unused part of
subtracted
the first
on double
memory accesses.
words
5.1.2
ordering
word boundaries
Accessing elements of
There
language
array,
is
as
its address
the following
array1
array
no
[ ] operator
in C. To
two
access an
arrays
in assembly
element of
an
definitions:
5, 4, 3, 2, 1
db
;
array
Here
up
array
of bytes array2
2,1
to speed
are some
dw
of words
5.1. INTRODUCTION
5,4,3
arrays:
97
EBP
-1
-8
EBP -12
EBP
char
unused
dword 1
dword 2
word
array
word
-100
-104
EBP -108
EBP -109
array
EBP
EBP
EBP
-112
dword 1
dword 2
char
unused
mov
al,[array1]
mov
al,[array1 + 1]
mov
[array1 + 3],al
mov
ax, [array2]
mov
ax, [array2 + 2]
mov
[array2
mov
ax, [array2 + 1]
+ 6],ax
Inline 5,element 1
of the word
array
;
al=array1[0]
;
al=array1[1]
;
array1[3]
=al
;
ax =array2[0]
;
ax =array2[1]
;
array2[3]
(NOT array2[2]!)
=ax
;
ax =??
is referenced, not element 2.Why?
elements
example code.
of array1
in the
previous
instruction
must be the
inline 3.
to calculate the
sum.
ways
Setting AH to
an
unsigned
zero
number.
is implicitly
If
it is
a CBW
assuming that AL is
signed,
the
appropriate
98
CHAPTER 5.ARRAYS
mov
ebx, array1
;
ebx =address
mov
dx,0
;
dx will hold sum
mov
ah,0
;
?
mov
ecx, 5
mov
al,[ebx]
;
al=*ebx
add
dx,ax
;
dx += ax (not
inc
ebx
;
bx++
loop
lp
of array1
lp:
al!)
mov
ebx, array1
;
ebx =address
mov
dx,0
;
dx will hold sum
mov
ecx, 5
add
dl,[ebx]
;
dl+= *ebx
jnc
next
;
if no carry
inc
dh
;
inc dh
inc
ebx
;
bx++
loop
lp
10
of array1
lp:
goto next
next:
addressing
memory
reference is:
[base
EAX,
EDI.
EBX, ECX, EDX, EBP, ESP, ESI or
factor is either 1,2,4 or 8.(If 1,
factor is omitted.)
index
EAX,
5.1. INTRODUCTION
99
1
mov
ebx, array1
;
ebx =address
mov
dx,0
;
dx will hold sum
mov
ecx, 5
add
dl,[ebx]
;
dl+= *ebx
adc
dh,0
;
dh+= carry
inc
ebx
;
bx++
loop
lp
of array1
lp:
flag
+0
The constant
5.1.4
Example
%define NEW_LINE 10
segment .data
5
FirstMsg
db
"First 10
elements of array", 0
6
Prompt
db
"Enter index of
SecondMsg
db
"Element %d is
db
"Elements 20
%d", NEW_LINE, 0
8
ThirdMsg
through 29 of array", 0
9
InputFormat
db
"%d", 0
10
11
segment .bss
12
array
resd ARRAY_SIZE
13
14
segment .text
15
extern
global
_asm_main
_dump_line
16
17
_asm_main:
enter
18
21
ebx
push
4,0
esi
-4
19
100
22
;
initialize array
to 100, 99,98
23
24
25
26
mov
ecx, ARRAY_SIZE
mov
ebx, array
init_loop:
27
mov
[ebx],
28
add
ebx, 4
ecx
29
loop
init_loop
31
push
dword FirstMsg
32
call
_puts
33
pop
ecx
35
push
dword 10
36
push
dword
37
call
_print_array
38
add
esp, 8
30
34
array
39
40
;
prompt user
41
Prompt_loop:
42
push
dword Prompt
43
call
_printf
44
pop
ecx
45
97,
...
CHAPTER 5.ARRAYS
;
print
out FirstMsg
;
print
first 10 elements of
lea
46
eax
= address
47
eax
push
48
push
49
call
_scanf
50
add
esp, 8
cmp
eax, 1
51
eax
eax, [ebp-4]
of local dword
array
dword InputFormat
;
je
52
InputOK
53
call
54
_dump_line
InputOK:
over
;
dump
jmp
55
;
if input
invalid
56
58
mov
esi, [ebp-4]
59
push
dword [array
60
push
esi
61
push
dword SecondMsg
+ 4*esi]
;
call
62
_printf
63
add
esp, 12
5.1. INTRODUCTION
64
65
push
dword ThirdMsg
66
call
_puts
67
pop
ecx
69
push
dword 10
70
push
dword
71
call
_print_array
72
add
esp, 8
74
pop
esi
75
pop
ebx
76
mov
eax, 0
77
leave
68
array + 20*4
73
ret
78
79
80
81
;
;
routine
_print_array
101
;
print
;
address
of array[20]
return back to C
82
;
C-callable
array as
83
signed integers.
84
85
;
C prototype:
;
void print_array(
n);
;
Parameters:
; a
pointer
(at ebp+8 on stack)
86
87
-number
stack)
89
90
segment .data
on
of
91
OutputFormat
db
"%-5d %5d"
92
93
segment .text
global
94
95
_print_array
_print_array:
96
enter
0,0
97
push
esi
98
push
ebx
100
xor
esi, esi
101
mov
ecx, [ebp+12]
102
mov
ebx, [ebp+8]
99
103
print_loop:
push
104
ecx
105
NEW_LINE, 0
;
esi = 0
;
ecx = n
;
ebx = address of array
;
printf might change ecx!
102
+ 4*esi]
106
push
dword [ebx
107
push
esi
108
push
dword OutputFormat
109
call
_printf
110
add
esp, 12
112
inc
esi
113
pop
ecx
114
loop
print_loop
pop
pop
ebx
117
118
leave
119
ret
111
115
116
esi
array1.
CHAPTER 5.ARRAYS
;
push array[esi]
;
remove
array1c.c
#include <stdio.h>
int
4
asm main(
void );
int main()
7
{
{
=asm main();
ret status
10
11
12
13
14
15
input buffer
16
17
18
int ch;
19
20
while( (ch
21
\n)
23
22
= getchar())
/ null body/
!= EOF && ch !=
array1c.c
5.1. INTRODUCTION
103
ebx, [4*eax
This effectively
+eax]
for other
A fairly
Consider
faster
than using MUL. However,
do this
one must
realize
can not
be used to multiple
quickly.
5.1.5
Multidimensional Arrays
Multidimensional
different
by 6
mem-ory as
array.
just that,
Not
are represented in
a plain one dimensional
surprisingly,
the
simplest
with the
identified
row
of the element
and the
rows
and two
an array
as:
with three
columns defined
int
a [3][2];
reserve
The C
room
compiler
for awould
6 (= 2r
map the elements as follows:
array
Index
Element
What
and
0
a[0][0]
1
a[0][1]
element
referenced
a[1][0]
a[1][1]
a[2][0]
a[2][1]
is
(index
1) and
[j]
appears
in the rowwise
where a[i]
representation?
hard to
see
mov
2
eax, [ebp
sal
-44]
eax, 1
;
ebp
multiple i
by 2
3
-48]
add
eax, [ebp
mov
add j
4
-40]
mov
= a[i ][j]
on the
rows.
As an example, let
following code (using
x = a[i ][ j];
Figure 5.6 shows the assembly this is translated
into. Thus, the compiler
code to:
x = (&a[0][0] + 2i + j);
and infact, the programmer could write this way
with the
same result.
of the
array. A
as
well:
0
Index
Element
a[0][0]
1
a[1][0]
a[2][0]
a[0][1]
a[1][1]
a[2][1]
at position i
+ 3j.Other languages
for example)
(FORTRAN,
three dimensional
array:
int b[4][3][2];
This
array
dimensional
consecutively
arrays each of
in memory. The
105
was
size
four two
[3]
[2]
Index
Element
b[0][0][1]
b[0][1][0]
b[0][1][1]
b[0][2][0]
b[0][2][1]
10
b[1][0][0]
b[1][0][1]
b[1][1][0]
b[1][1][1]
b[1][2][0]
Index
Element
b[0][0][0]
11
b[1][2][1]
anar-ray dimensioned
+ k.Notice
does not
i+ N
appear
inthe formula.
same process
anndimen-sional array of
generalized. For
dimensions D1 to Dn
is
D3
+ Dn
Dn
in1
i1
+D3
D4
Dn
i2
+ in
Dk
j=1
ij
k=j+1
does not
appear
This is where
in the
you can
tell
For
the
columnwise
representation,
the author
the
was
a physics
major. (Or
+ D1
i2 + + D1
i1
D2
D2 Dn1 in
Dn2
in1 + D1
ence to FORTRAN a give- or
away?)
j1
Dk
j=1
ij
k=1
not
appear
in the formula.
that does
as
Parameters inC
The
rowwise
representation
of
apparent
when considering
This becomes
a
array as a
the prototype
a multidimensional
of
f(
int
information /
[][]);
no dimension
106
CHAPTER 5.ARRAYS
a [][2] );
array
can
a function
with this
prototype:
void f( int
This defines
a []);
a single
array of integer
can be used to create an
acts much like a two
dimensional
array
of arrays that
dimensional
array).
arrays,
array parameter
void f(int
5.2
a [][4][3][2]
);
Array/String Instructions
for
The
80x86
family
of
These
instructions.
provide
arrays
processors
instructions
They
use
(ESI
increment
are incremented or
are two instructions
that
index registers
There
decremented
modify
the
direction flag:
CLD clears the direction flag. Inthis state, the
index registers
are incre-
mented.
STD sets the direction flag. Inthis state, the
index registers
are decre-
mented.
very common
is
to code that
5.2.1
memory
A size
can be specified
compiler.
107
=[DS:ESI]
= ESI 1
AX = [DS:ESI]
ESI = ESI 2
EAX = [DS:ESI]
ESI = ESI 4
AL
LODSB
STOSB
LODSW
LODSD
= AL
= EDI 1
[ES:EDI] = AX
EDI = EDI 2
[ES:EDI] = EAX
EDI = EDI 4
[ES:EDI]
EDI
ESI
STOSW
STOSD
segment .data
array1
dd
1,2,3,4,5,6,7,8,9,10
34
segment .bss
5
array2
resd 10
67
segment .text
;
dont
cld
mov
esi, array1
10
mov
edi, array2
11
mov
ecx, 10
12
forget this!
lp:
13
lodsd
14
stosd
15
loop
lp
shows
these
pseudo-code
are several
instructions
description
with
short
of what they
do. There
for reading
and
easy to
remember this if
one
or
as
DS
programming,
programmer
ES to detemine
there is only
one
should be automatically
(just
use
mode programming
since
is).
it
is
a problem,
data segment
initialized
However,
very
to initialize
3.
in real
important
ES
and ES
to reference it
to the
mode
for
the
correct
segment
selector
use
example
3
Another
value
of
Figure
these
complication
5.8 shows
instructions
is that
an
that
the
copied
MOV instruction.
to
general
Instead,
purpose
register
be
108
CHAPTER 5.ARRAYS
MOVSB
byte [ES:EDI]
ESI
EDI
MOVSW
= byte [DS:ESI]
= ESI 1
= EDI 1
word [ES:EDI]
= word [DS:ESI]
= ESI 2
EDI = EDI 2
ESI
MOVSD
dword [ES:EDI]
= dword [DS:ESI]
= ESI 4
EDI = EDI 4
ESI
segment .bss
array
move
string instructions
resd 10
34
segment .text
;
dont
cld
mov
edi, array
mov
ecx, 10
xor
eax, eax
rep stosd
forget this!
array
example
copies
an array
The
into another.
combination
of
and
LODSx
STOSx
very common.
by
performed
instructions
5.8
single
5.9 describes
Figure
could
difference
be
string instruction.
MOVSx
the operations
that
these
be
instruction
can
replaced
with
the
with
same
single
MOVSD
The
effect.
only
5.2.2
The
80x86
family
provides
.
a
special
instruction prefix
called REP that can be used
with the above string instructions
This prefix
tells the CPU to repeat the next string instruction
a specified number of times. The ECX
4
MOV instructions.
4
A instruction
prefix
is not
modifies
the instructions
an
a
instruction,
it is
are
accesses
also
109
CMPSB
compares
= ESI 1
EDI = EDI 1
ESI
CMPSW
compares
= ESI 2
EDI = EDI 2
ESI
CMPSD
compares
= ESI 4
EDI = EDI 4
ESI
SCASB
compares
EDI
SCASW
compares
EDI
SCASD
AX and [ES:EDI]
compares
EDI
AL and [ES:EDI]
1
EAX and [ES:EDI]
as
rep movsd
Figure 5.10 shows another example that
zeroes
5.2.3
Figure
5.11
shows
new
several
string
the
CMP instruction.
compare
corresponding
scan memory
SCASx
The
instructions
CMPSx
memory
locations
for
specific
value.
Figure 5.12 shows
searches
array.
12
a double word
on line 10 always
in
adds 4 to EDI,
found. Thus, if
the 12 found
5.2.4
There
prefixes that
string instructions.
bel.
110
CHAPTER 5.ARRAYS
segment .bss
array
resd 100
34
segment .text
5
cld
mov
edi, array
;
pointer
mov
ecx, 100
;
number
of elements
mov
eax, 12
;
number
to scan for
lp:
10
scasd
11
je
found
12
loop
lp
13
;
code to perform
14
15
16
17
18
to start of array
jmp
onward
sub
edi, 4
if not found
found:
;
code to perform
;
edi now
points to 12 inarray
if found
onward:
REPE, REPZ
REPNE, REPNZ
prefixes
and
are
(as
describes
are
and REPZ
just
REPNE
comparison
their
synonyms
and
operation.
REPE
same
prefix
for the
REPNZ).
If the
repeated
stops because
string instruction
are
registers
decremented;
still
however,
incremented
and
of
or
ECX
use
the
because of
flag
to
comparisons
just look
Thus, it is possible to
to determine
see
if ECX is
if the
zero
a comparison or ECX
repeated
after
stopped
becoming
zero.
an
example
code snippet
that determines
if two
blocks
of
memory are
equal. The JE
example checks to
see
the result
on
line 7 of the
of the previous
stopped
if the
ECX became
and
no
comparisons
zero, the
branch is
stopped
made;
because
111
1
segment .text
cld
mov
esi, block1
;
address
of first block
mov
edi, block2
;
address
of second block
mov
ecx, size
;
size of blocks
repe
cmpsb
;
repeat
je
equal
;
code to perform
jmp
inbytes
;
if Z set, blocks
if blocks
equal
onward
equal:
10
11
;
code to perform
if equal
onward:
12
5.2.5
memory
blocks
Example
an assembly source
array
file
functions
global
_asm_copy,
_asm_find,
memory.asm
_asm_strlen, _asm_strcpy
2
segment .text
;
function _asm_copy
;
copies blocks of memory
;
C prototype
;
void asm_copy( void *dest, const
void *src, unsigned sz);
;
parameters:
; dest
pointer to buffer to copy
4
to
10
src
-pointer
sz
-number
to buffer to copy
from
11
of bytes to copy
12
13
;
next, some
helpful symbols
defined
14
15
16
%define
src
[ebp+12]
17
%define
sz
[ebp+16]
18
_asm_copy:
19
enter
0,0
20
push
esi
112
are
21
push
edi
mov
mov
mov
esi, src
22
23
24
25
edi, dest
ecx, sz
26
27
cld
28
rep
movsb
pop
pop
edi
31
32
leave
33
ret
29
30
esi
34
35
36
37
38
39
;
function _asm_find
;
searches memory for a given
;
void *asm_find( const void
;
parameters:
CHAPTER 5.ARRAYS
;
esi =address
;
edi =address
of buffer to copy to
;
ecx
=number
;
clear
of bytes to copy
direction flag
;
execute
byte
40
src
-pointer
to buffer to
search
41
42
;
;
target
sz
;
return value:
; if target is found, pointer to
first occurrence of target inbuffer
43
44
45
is returned
46
else
;
NULL is returned
;
NOTE: target is a byte value,
pushed on stack as a dword value.
47
48
49
but is
8-bits.
50
51
%define
52
53
%define
src
sz
[ebp+8]
[ebp+16]
54
55
_asm_find:
56
enter
0,0
57
push
edi
mov
eax, target
58
59
;
al
mov
60
edi, src
61
mov
62
cld
ecx, sz
113
63
repne
64
until ECX
je
66
zero
mov
;
scan
scasb
== 0 or [ES:EDI] == AL
67
;
if not
eax, 0
short quit
;
if
found_it
65
68
found,
jmp
69
found_it:
70
mov
eax, edi
71
dec
eax
pop
74
leave
75
ret
-1)
72
;
if
quit:
edi
76
77
78
79
80
81
82
83
84
;
function _asm_strlen
;
returns the size of a string
;
unsigned asm_strlen( const char
;
parameter:
; src
pointer to string
*);
;
return value:
; number of chars
instring (not
86
%define
87
_asm_strlen:
88
enter
0,0
89
push
edi
91
mov
edi, src
92
mov
ecx, 0FFFFFFFFh ;
90
93
xor
94
cld
al,al
scasb
95
repnz
96
97
98
99
;
;
repnz
100
;
not
101
FFFFFFFF
-ECX
102
mov
eax,0FFFFFFFEh
103
sub
eax, ecx
104
edi
=pointer
use largest
to string
possible ECX
=0
al
-ECX,
-ecx
is FFFFFFFE
=0FFFFFFFEh
114
105
pop
106
leave
107
ret
edi
108
109
;
function
_asm_strcpy
114
;
copies a string
;
void asm_strcpy(
;
parameters:
; dest
pointer
; src
pointer
115
116
117
%define
118
_asm_strcpy:
110
111
112
113
src
char *dest,
to string to
to string to
+ 8]
[ebp + 12]
119
enter
0,0
120
push
esi
121
push
edi
mov
edi, dest
122
123
124
mov
125
cld
126
esi, src
cpy_loop:
127
lodsb
128
stosb
129
or
al,al
130
jnz
cpy_loop
pop
pop
edi
133
134
leave
131
132
esi
CHAPTER 5.ARRAYS
copy to
copy from
;
load
AL & inc si
;
store
;
set
AL & inc di
condition flags
;
if not
ret
135
memory.
memex.c
#include <stdio.h>
void
asm copy(
unsigned )
attribute
((cdecl));
void ,
char target
void
unsigned )
5.2. ARRAY/STRING
INSTRUCTIONS
115
unsigned
attribute
, const char )
10
char )
void
attribute
asm strcpy(
((cdecl));
11
char
12
int main()
13
14
15
16
char st;
17
char
= test
string;
ch;
18
19
asm copy(st2,
30 chars of string /
copy
printf (%s\n,
20
all
st2);
21
22
printf (Enter
in string /
24
25
26
27
28
st
23
a char: );
scanf(%c%[\n],
= asm find(st2,
&ch);
if (st )
printf
pr
else
printf (Not found\n);
29
30
31
st1[0]
= 0;
32
scanf(%s,
33
printf (len
st1);
= %u\n,
asm strlen(st1));
34
35
asm strcpy(
st2, st1);
copy
36
printf (%s\n,
st2 );
37
return 0;
38
39
memex.c
116
CHAPTER 5.ARRAYS
Chapter 6
Floating Point
6.1
Floating Point
Representation
6.1.1
it must
non-integral
numbers
decimal. In decimal,
were
discussed in the
were
values
to represent
be possible
in other bases
digits to the
0.123
=1
10
+2
10
as
well
+3
powers
10
0.1012
=1
+0
+1
as
right of the
ten:
discussed.
= 0.625
of
This idea
can be combined
methods of Chapter 1
to convert
a general
number:
very
a binary
..
new
...
a.bcdef
117
118
CHAPTER 6.FLOATING POINT
0.5625
0.125
0.25
0.5
2
2
2
2
=
=
=
=
1.125
0.25
first bit
second bit
0.5
third bit
1.0
fourth bit
=
=
=
=
1
0
0
1
0.85
0.7
0.4
0.8
0.6
0.2
0.4
0.8
2
2
2
2
2
2
2
2
=
=
=
=
=
=
=
=
1.7
1.4
0.8
1.6
1.2
0.4
0.8
1.6
..
now
...
are
needed
fractional
part of
zero
is
reached.
As
another
example,
consider
converting
part
(23
fractional
=
part
101112 ), but
(0.85)?
what
Figure
about
6.2 shows
the
the
119
means
to
opposed
There
that 0.85 is
repeating
a pattern to
is
decimal
the
in base 10)
numbers
that
0.85
10111.1101102
0.1101102
in the
23.85
Thus,
important
One
exactly in binary
(Just
13
as
a finite
can not
into
these
approximation of 23.85
To
simplify
the
are
can not
variables.
Only
an
hardware,
floating
point
format
23.85
be stored
can be stored.
numbers
using
stored in
..
stored
as:
1.011111011001100110...
100
1.ssssssssssssssss
eeeeeee
eeeeeeee
6.1.2
is the exponent.
representation
The
IEEE
Electronic
(Institute
Engineers)
organization
is
on most
made
Often
binary
specific
it
is
of the computer
This
com-puters
supported
by
IEEE
different
defines
the
coprocessors
two different
are
use it.
(which
The
and
inter-national
format is used
today.
Electrical
an
hardware
of
formats
with
double
precision
precision
uses a
automatically.
slightly
different
Extended
general format
so
not
It should
repeat
in
repeats
0.13
2
type
one
be
in decimal,
Some
compilers
this
use
double. (This
surprising
that
not another.
but in ternary
compilers
uses
so
base, but
(such
extended
as
number
Think
about
might
1
3
(base 3) it would
Borland)
precision.
long
However,
it
be
double
other
120
CHAPTER 6.FLOATING POINT
31
30
23
s
s
e
22
e
sign bit
-0 = positive, 1= negative
= true exponent
+ 7F (127 decimal).
The
fraction
uses 32 bits to
It is usually accurate to 7
are stored
in a much
more
complicated
a IEEE single
precision number.
There
Instead,
assumes a normalized
significand
representation.
How would 23.85 be
= 8316
Finally, the
CC CD can be interpreted
ways depending
on what aprogram does
with them!
As as single
different
precision
one is hidden).
Putting this all together (to help clarify the different sections of the floating
point format, the sign bit and the faction have been underlined and the bits
floating
point
4-bit nibbles):
number,
they
23.850000381,
represent
but
as
0100 00011
= 41BECCCC16
represent
1,103,023,309!
decimal,
23.849998474.
is
correct
the
This
as above. Since
was
up to 1. So 23.85 would be
as 41BE CC CD inhex using single
represented
121
e =0
and
=0
e =0
and
f 6= 0
denotes
a denormalized
number. These
are dis-
e =FF
and
=0
are
both positive
e =FF
and
f 6= 0
an undefined
a Number).
denotes
(Not
result, known
as NaN
63
62
52
0
s
51
How
would
-23.85
be
represented?
Just
Certain combinations
meanings
f have special
An infinity is produced
operation
a negative
overflow
undefined
Normalized
or
e and
zero.
An
result is produced by an invalid
such as trying to find the square root of
by
an
of
by division
single
precision
by
numbers
N
orm alizedcan
sin
range
in magnitude
1.0
2in magnitude
( 1.1755
gle
precision
numbersfrom
can range
35)
127
35).
10
to 1.11111...
2
( 3.4028
10
126
Denormalized numbers
Denormalized
numbers
can be used to
represent numbers with magni-tudes too small to
126).
normalize tud
(i.e.
es too
below
sm all
1.0 to 2n ormal ize
For (i.e.
example,
be lo
129
consider
1.02 the number 1.0012
2
( 1.6530
39).
Inthe given normalized form, the exponent
can
be represented in
127.
the unnormal- ized form: 0.010012
2
To
significand
with 2
the
of the
127
one to
number written
left
of
are
the
as a
stored
decimal
the one
point)
129
product
including
is then:
122
CHAPTER 6.FLOATING POINT
uses 64 bits to
biased
range
magnitudes
308.
to 10
can range
from approximately 10
308
= 403 inhex.
or 40 37 D9 99 99 99 99 9A inhex. If one
converts this back to decimal, one finds
23.8500000000000014 (there are 12 zeros!) which
is a much better approximation of 23.85.
The double precision has the same special
3.
values as single precision
Denormalized
numbers are also very similar. The only main
difference is that double denormalized numbers
use 2 1023 instead of 2 127.
6.2
Floating
different
point arithmetic
than in continuous
mathematics,
all numbers
on a computer
mathematics
can
is
In
be considered
on a
with
6.2.1
Addition
are not
6.34375
= 16.71875
or inbinary:
1.0100110
1.0100110 2 2
1.1001011
1.1001011
123
so shift
exponent
the significand
to make the
1.0100110
0.1100110
0.1100110
2
2
10.0001100
results
and
ift
in
drops off
after
ing of 1.100101
3.
0.11001102
nding
n0.11
2
The
result of the (or
ad
The result
result s iof
the 00110
addition,
10.00011002
4)
1.00001100
2
is equal to 10000.1102 or 16.75.
3
answer (16.71875)!
an approximation due to the round off
errors of the addition process.
This is not equal to the exact
It is only
It is important
always
mathematics
point
on a computer
an approximation
arithmetic
(or
calculator)
The
laws
is
of
numbers
on a computer.
Mathemat- ics
assumes
no computer can
+ b)
exactly
6.2.2
= a; however,
this
teaches that (a
may not
hold true
ona computer!
Subtraction
Shifting 1.1111111
1.0000000
As
an example,
= 0.8125:
1.0000110
1.1111111
1.1111111
3
22
0.0000110
24
1.0000110
24
1.0000000
1.0000000
2
2
0.0000110
24
exactly correct.
6.2.3
are
10.375
2.5
added. Consider
= 25.9375:
1.0100110
1.0100000
1.0100000
2
2
10100110
10100110
1.10011111000000
are
124
Of
course,
8-bits to give:
1.1010000
= 11010.0002 = 26
6.2.4
errors.
calculations
are not exact.
The
programmer needs to be aware of this.
A
common mistake that programmers make with
floating point numbers is to compare them
assuming that a calculation is exact. For example,
consider a function named f(x) that makes a
complex calculation and a program is trying to
point
4.
One might be tempted
if (f(x)
== 0.0 )
means
that
30?
x is a very good
This
very
approximation
if (fabs(f(x))
< EPS )
to compare
another
a floating
(y) use:
if (fabs(x y)/fabs(y)
6.3
< EPS
6.3.1
Hardware
processors
had
mean
that
they
could
not
no
hardware
This does
perform
float
operations.
It just
by
performed
non-floating
systems
procedures
composed
point instructions.
Intel did
math
provide
coprocessor.
an
many
of
chip
coprocessor
has machine
instructions
that perform many
floating point operations much faster than using a
software procedure (on early processors, at least
called
means
A math
10 times
4
x such that
f(x)
=0
125
faster! )
The
coprocessor
and
the
for
processor
80386,
80387.
80486 itself.
was
was a 80287
The
80486DX
coprocessor
into the
coprocessor;
however, it is still programmed as if it was a
separate unit. Even earlier systems without
a
coprocessor can install software that emulates a
math coprocessor. These emulator packages are
automatically
activated
when
a program
executes a coprocessor instruction and run a
software procedure that produces the same result
as the coprocessor would have (though much
80x86
processors
have
a builtin
math
slower, of course).
The numeric
point registers.
coprocessor
has eight
Each register
holds
.
as
80-bit
extended
precision
The
floating
are
point
..
always stored
numbers
are named
floating
80 bits of
in these
registers
are
used
differently
are organized as
a Last-In First-Out
a stack.
Recall that
a stack
is
new
numbers
the
pushed down
new
on the
are
added to
numbers
stack to make
room
are
for the
number.
register in the numeric
6.3.2
The
,,
a status
It has several
There is also
coprocessor.
will be covered: C0
use of these
C1
C2
is discussed later.
Instructions
coprocessor
source may
be a single, double
or extended
or a coprocessor register.
reads an integer from memory,
precision number
FILD
source
either
FLD1
stack.
stores
FLDZ
stack.
from
5
integrated
coprocessor.
There
was a separate
an
80487SX
126
CHAPTER 6.FLOATING POINT
the stack
as it stores
FST dest
it.
into
memory.
The destina-
may either be a
number or a
coprocessor register.
tion
double precision
just
or
FSTP dest
memory
single
as FST; however,
after the number
is stored, its
may
destination
either
or a coprocessor
register.
FIST dest
memory.
or a double
into
may
either
a word
word.
The
The
destination
stack
itself
is
to
an
is
converted
onsome bits in
the
coprocessors
control
word.
so that
it rounds
FSTCW
(Store
Control
can
be used
dest
Same
as
may
that
can
stack itself.
FXCH STn
up a register on the
as
unused or empty.
frees
stack
may
src
ST0
coprocessor register
or a single or double
be any
precision number in
memory.
dest
dest
may
be any
FADDP dest
coprocessor reg-ister.
or
dest
may
be any
coprocessor
+= ST0. The
src
ST0
register.
+= (float) src.
Adds
+b = b + a,
but
but
a b 6=there
b a!).
For
each instruction,
instruction,
is an
alternate
one that t
subtracts inthe reverse order. These
6.3.
COPROCESSOR
127
THE
reverse
NUMERIC
segment .bss
array
resq
SIZE
sum
resq
45
segment .text
6
mov
ecx, SIZE
mov
esi, array
fldz
;
ST0 =0
lp:
10
fadd
qword [esi]
;
ST0 += *(esi)
11
add
esi, 8
;
move
12
loop
lp
13
fstp
qword
to next double
;
store
sum
result into
sum
sum example
up the elements
one must specify the size
memory operand. Otherwise the assembler
that adds
of the
memory
operand
FSUB
src
FSUBR
src
was a float
FSUBP dest
or
FSUBRP dest
or
FISUB
src
FISUBR
(dword)
ST0
or a double
src
(qword).
register
dest
ister.
dest
cessor
dest
reg-
coprocessor
register.
dest
be any
coprocessor register.
ST0 -= (float) src.
Subtracts an integer from
ST0. The src must be a word or double word in memory.
ST0 = (float) src
ST0. Subtracts ST0 from an
integer. The src must be a word or double word in
memory.
128
CHAPTER 6.FLOATING POINT
are completely
may
src
coprocessor register
or a single or double
src
be any
precision number in
memory.
dest
may
be any
FMULP dest
coprocessor reg-ister.
or
may
be any
coprocessor
register.
src
Multiplies
are
Division by
FDIV
may
zero results
inan infinity.
ST0 /= src. The
src
coprocessor register
or a single or double
src
be any
precision number in
memory.
FDIVR
The
src
ST0
= src / ST0.
precision
number in
memory.
FDIV dest, ST0
dest
may
be any
coprocessor reg-ister.
dest
= ST0 / dest.
The dest
register.
FIDIV
src
ST0 /= (float)
src.
FIDIVR
ST0.
src must
src
Divides
ST0. The
be a word
or double word in
memory.
ST0 = (float) src /
an integer by
src must be a word or double
word in
memory.
Comparisons
The
coprocessor
129
1
if (
x >y )
fld
qword [x]
;
ST0 =x
fcomp
qword [y]
;
compare
fstsw
ax
;
move
sahf
else_part
;
if x not above y,goto
10
11
12
13
jna
STO and
else_part
then_part:
;
code for then part
jmp
end_if
else_part:
;
code for else part
end_if:
FCOM src
compares ST0 and src. The src
can be a coprocessor register
or a float or double in memory.
FCOMP src
compares ST0 and src, then
pops stack. The src can be a
coprocessor register or a float or
double in memory.
FCOMPP
compares ST0 and ST1, then
pops stack twice.
FICOM src
compares ST0 and (float)
src
inmemory.
compares
FTST
,,
ST0 and 0.
C1 C2 and
status register.
access
some new
FSTSW dest
SAHF
instructions:
FLAGS register.
LAHF
a short
example code
the C0
,,
C1 C2
a JNA
instruction.
130
CHAPTER 6.FLOATING POINT
processors
new comparison
src must
src
be a coprocessor
register.
an example
Miscellaneous instructions
This section
instructions
FCHS
ST0
ST0
= -ST0 Changes
the sign of
= |ST0|
Takes the absolute
STO
= |ST0|
Takes the absolut
STO
ST0 =
Takes the square
ST0
FABS
value of ST0
ST0
value
FSQRT
of S
root of ST0
= ST02
ST0 by
a power
of 2 quickly. ST1
ST0
= ST02
multiples
bST1c
ST0
FSCALE
coprocessor
an example
instruction.
6.3.3
Examples
6.3.4
Quadratic formula
can be encoded
ax
+bx +c = 0
and x2
x1,x2
b
2
2a
4ac
square root
(b
4ac)
determining
the
possibilities
which
1.There is only
4ac
of
following
three
solution. b
=0
2.There
3.There
4ac
solutions. b
uses
>0
4ac
<0
the assembly
131
quadt.c
#include <stdio.h>
2
int quadratic(
int main()
6
printf (Enter
scanf(%lf
10
if (quadratic(
11
a,b,c,&root1,
&root2) )
printf (roots:
pr
%.10g %.10g\n,
12
root2);
13
root1,
else
printf (
(No real roots\n);
14
return 0;
15
16
a,b,c:);
quadt.c
;
function
;
finds
quadratic
equation:
3
a*x^2
+ b*x +c = 0
;
C prototype:
double
6
a,double
b,
c,
double *root1,
double *root2 )
7
;
Parameters:
a,b,c
-coefficients
-pointer
in
10
root1
-pointer
of powers of
11
to double to
;
Return
value:
returns 1
if real roots found,
else 0
13
qword [ebp+8]
14
%define
15
%define b
qword [ebp+16]
16
%define
qword [ebp+24]
17
%define root1
dword [ebp+32]
18
%define root2
dword [ebp+36]
19
%define disc
qword [ebp-8]
132
20
%define one_over_2a
qword
21
22
segment .data
23
MinusFour
dw
-4
24
segment .text
25
global
26
_quadratic
_quadratic:
27
28
push
ebp
29
mov
ebp, esp
30
sub
esp, 16
31
push
ebx
32
[ebp-16]
;
allocate
;
must save
original ebx
33
fild
word [MinusFour];
34
fld
35
fld
36
fmulp
st1
37
fmulp
st1
38
fld
39
fld
40
fmulp
st1
41
faddp
st1
42
ftst
;
;
43
fstsw
44
sahf
45
jb
46
fsqrt
47
fstp
48
fld1
49
fld
50
fscale
51
fdivp
st1
52
fst
one_over_2a
;
;
53
fld
54
fld
disc
55
fsubrp
st1
56
fmulp
st1
57
mov
ebx, root1
58
fstp
qword [ebx]
ax
no_real_solutions
;
disc
;
a
59
fld
60
fld
disc
61
fchs
stack -4
stack:
a,-4
stack:
c,a,-4
stack: a*c, -4
stack: -4*a*c
stack: b,b,-4*a*c
stack: b*b, -4*a*c
stack: b*b
-4*a*c
test with 0
;
if disc < 0,no real solutions
stack: sqrt(b*b
store and
-4*a*c)
pop stack
stack: 1.0
stack:
a,1.0
stack:
a *2^(1.0)
=2*a, 1
stack: 1/(2*a)
stack: 1/(2*a)
stack: b,1/(2*a)
stack: disc, b,1/(2*a)
stack: disc
stack: (-b
-b,1/(2*a)
+ disc)/(2*a)
store in*root1
stack: b
stack: disc, b
stack: -disc, b
133
62
fsubrp
st1
63
fmul
one_over_2a
64
mov
ebx, root2
65
fstp
qword [ebx]
66
mov
eax, 1
67
jmp
short quit
68
69
no_real_solutions:
mov
eax, 0
ebx
74
pop
mov
75
pop
ebp
76
ret
70
71
72
quit:
73
esp, ebp
-b
-disc)/(2*a)
;
stack:
-disc
;
stack:
(-b
;
store
in*root2
;
return
value is 1
;
return
value is 0
quad.asm
6.3.5
Reading
array
from file
Inthis example,
doubles from
program:
readt.c
This
assembly
from stdin
program tests
procedure.
the
32bit
read doubles
This
program
tests th
/
5
#include <stdio.h>
int );
int main()
10
11
int i,n;
12
double a[MAX];
13
n= read
14
doubles(stdin
a,MAX);
15
for( i=0; i
< n;i++ )
16
17
return 0;
18
19
1 34
readt
r e
1
segment .data
format
34
db
ad
"%lf", 0
asm
segment .text
5
global
_read_doubles
extern
_fscanf
78
%define SIZEOF_DOUBLE
%define FP
dword
10
%define ARRAYP
dword
11
%define ARRAY_SIZE
dword
;
format
for fscanf()
+ 8]
[ebp + 12]
[ebp + 16]
[ebp
12
%define TEMP_DOUBLE
13
14
15
;
function
16
;
C prototype:
_read_doubles
[ebp
-8]
17
array
anarray, until
a text
;
EOF or
19
is full.
20
;
Parameters:
21
-FILE pointer
fp
-pointer
arrayp
read into
23
22
to double
array_size
to read
;
array to
-number of
elements inarray
24
;
Return
25
value:
26
27
_read_doubles:
28
push
ebp
29
mov
ebp,esp
30
sub
esp, SIZEOF_DOUBLE
32
push
esi
33
mov
esi, ARRAYP
34
xor
edx, edx
31
35
36
while_loop:
37
cmp
edx, ARRAY_SIZE
38
jnl
short quit
array
(in EAX)
;
define one double on stack
;
save
esi
;
esi =ARRAYP
;
edx =array
index (initially 0)
;
is edx < ARRAY_SIZE?
;
if not, quit
loop
135
39
40
;
call fscanf()
to read
a double
into
TEMP_DOUBLE
41
;
fscanf()
42
so save
43
push
edx
44
lea
eax, TEMP_DOUBLE
45
push
eax
46
push
dword format
47
push
FP
48
call
_fscanf
49
add
esp, 12
50
pop
edx
51
cmp
eax, 1
52
jne
short quit
it
53
54
56
;
copy
;
(The
57
55
are copied
-8]
58
mov
eax, [ebp
59
mov
[esi +8*edx],
eax
-4]
60
mov
eax, [ebp
61
mov
[esi +8*edx
63
inc
edx
64
jmp
while_loop
pop
esi
mov
eax, edx
71
mov
esp, ebp
72
pop
ebp
62
65
66
quit:
67
68
69
70
;
save
edx
;
push &TEMP_DOUBLE
;
push &format
;
push file pointer
;
restore
edx
;
did fscanf
return 1?
;
if not, quit
loop
+4],eax
;
first copy
;
next copy
lowest 4 bytes
highest 4 bytes
;
restore esi
;
store return
value into
ret
73
6.3.6
eax
read.asm
Finding primes
looks at finding
prime
is more
one. It stores the
in an array and only divides
every
new
primes.
136
CHAPTER 6.FLOATING POINT
instead of
is that it computes
square root
of the
guess
for
the
that when
it
truncates
instead
of
rounding.
This
is
integer
conversions.
careful to
save
the original
control
word and
program:
fprime.c
#include <stdio.h>
#include <stdlib.h>
Parameters:
a array to hold
10
how
many
primes
primes to find
a,unsigned n
);11
12
int main()
13
14
int status;
15
unsigned i;
16
unsigned
17
int
max;
a;
18
19
printf (How
find? );
20
many
scanf(%u,
&max);
21
22
a =calloc(
sizeof(int ),max);
23
24
if (a ){
25
26
find primes(a,max);
27
28
for(i= (max
29
? max 20
> 20 )
:0; i
< max;
i++ )
30
3132
free(a);
status
33
34
35
else {
=0;
fprintf (stderr
36
status
37
=1;
array
38
39
return status;
40
41
137
of %u ints\n, max);
fprime.c
prime2.asm
1
segment .text
global
_find_primes
;
;
function find_primes
;
finds the indicated
number of primes
;
Parameters:
; array
array to hold primes
; n_find
how many primes to find
;
C Prototype:
10
11
12
%define
13
%define n_find
14
%define
15
%define isqrt
16
%define orig_cntl_wd
17
%define new_cntl_wd
array
n
18
19
_find_primes:
ebp
ebp
+8
+12
-4
-8
ebp -10
ebp -12
ebp
ebp
enter
12,0
22
push
ebx
23
push
esi
25
fstcw
word [orig_cntl_wd]
26
mov
ax, [orig_cntl_wd]
20
21
24
unsigned n_find )
;
number of primes found so far
;
floor of sqrt of guess
;
original control word
;
new
control word
;
make room for local variables
;
save possible register variables
;
get current control word
138
27
or
ax, 0C00h
28
mov
[new_cntl_wd],
29
fldcw
word [new_cntl_wd]
30
ax
31
mov
esi,[array]
32
mov
dword [esi], 2
33
mov
34
mov
ebx, 5
35
mov
dword [n], 2
36
37
;
This
a new
prime
;
set
;
esi points
;
array[0]
;
array[1]
;
ebx
to array
=2
=3
= guess = 5
;
n= 2
each iteration, which it adds to the
38
;
end of the array. Unlike
prime finding
program,
the earlier
this function
39
;
divides
41
;
are stored
42
43
while_limit:
inthe array.)
44
mov
eax, [n]
45
cmp
eax, [n_find]
46
jnb
short quit_limit
48
mov
ecx, 1
49
push
ebx
50
fild
dword [esp]
51
pop
ebx
52
fsqrt
47
fistp
53
54
dword [isqrt]
;
while
(n< n_find )
;
ecx is used as array
;
store guess on stack
;
load guess
;
get guess
onto
index
coprocessor
off stack
stack
;
find sqrt(guess)
;
isqrt
55
;
This
=floor(sqrt(quess))
inner loop divides
guess
(ebx) by
58
;
until it finds a prime factor of guess
;
or until the prime number to divide is
;
59
while_factor:
56
57
[esi + 4*ecx]
60
mov
eax, dword
61
cmp
eax, [isqrt]
62
jnbe
short quit_factor_prime
63
mov
eax, ebx
64
xor
edx, edx
65
div
66
or
edx, edx
means guess
is not prime)
;
eax
= array[ecx]
;
while
(isqrt
;
&& guess
67
jz
<array[ecx]
% array[ecx] != 0 )
short
quit_factor_not_prime
inc
68
ecx
jmp
69
short while_factor
70
71
72
;
found anew
73
74
quit_factor_prime:
prime !
mov
mov
eax, [n]
76
77
inc
eax
78
mov
[n], eax
75
79
80
quit_factor_not_prime:
81
add
ebx, 2
82
jmp
short while_limit
83
84
quit_limit:
85
86
fldcw
word [orig_cntl_wd]
87
pop
esi
88
pop
ebx
89
90
leave
139
;
add guess
;
inc n
;
try next
;
restore
;
restore
to end of
odd number
control word
register variables
ret
91
array
140
CHAPTER 6.FLOATING POINT
prime2.
global _dmax
23
segment .text
4
;
function
;
returns
;
C prototype
;
double
;
Parameters:
_dmax
-first double
-second double
d1
10
d2
11
;
Return
12
13
%define d1
ebp+8
14
%define d2
ebp+16
15
_dmax:
value:
larger of d1
and d2 (inST0)
enter
0,0
18
fld
qword [d2]
19
fld
qword [d1]
;
ST0 =d1,ST1= d2
20
fcomip
st1
;
ST0 =d2
21
jna
short d2_bigger
22
fcomp
st0
;
pop d2 from stack
23
fld
qword [d1]
;
ST0 =d1
24
jmp
short exit
16
17
25
d2_bigger:
26
exit:
;
if d2 is max, nothing
27
leave
28
ret
141
THE
NUMERIC
to do
segment .data
dq
2.75
five
dw
;
converted
to double format
45
segment .text
6
fild
dword [five]
;
ST0 =5
fld
qword [x]
;
ST0 =2.75, ST1=5
fscale
;
ST0 =2.75 *32,ST1= 5
142
CHAPTER 6.FLOATING POINT
Chapter 7
Structures and
C++
7.1
7.1.1
Structures
Introduction
Structures
are intimately
related.
can be passed as a
as an array
be considered
of the code.
a structure can
size of the
elements
elements
index.
A structures
same
each element
specified
of
numerical index.
a structure will
way as an element of an
array. To access an element, one must know the
In assembly, the element of
can
be calculated
an array
where this
by the index
a structure
of the
is assigned
an
memory management
section of
143
144
CHAPTER 7
Offset
Element
y
6
Offset
Element
any
unused
y
8
struct S {
short int
x;
/ 2byte integer /
int
y;
/ 4byte integer /
double
z;
/ 8byte float
};
standard
that
variable
of type S
memory.
The ANSI
the
elements
a
same
of
structure
order
states
very
is at the
macro
inthe stddef
This
macro
(S
y)
7.1. STRUCTURES
145
struct S {
short int
x;
/ 2byte integer /
int
y;
/ 4byte integer /
double
z;
attribute
/ 8byte float
((packed));
7.1.2
gcc
Memory alignment
y (and
may
give different
a structure
that
ways
differently.
The
gcc compiler
has
a flexible
and
one to specify
a special syntax.
following line:
typedef short int unaligned int
attribute
((
aligned (1)));
defines
aligned
a new type
on byte
parenthesis
after
(Yes,
attribute
The 1
inthe aligned parameter
with
other
alignments.
powers
type,
gcc
the
are required!)
can be replaced
two to specify
other
word alignment,
structure
of
all
was
etc.)
changed
would put
If the
to be
y at
y element of
an unaligned
offset 2.
the
int
However,
are
also
put at offset 6.
as well for it to
146
CHAPTER 7
#pragma pack(push) /
save
alignment
state /
#pragma pack(1)
struct S {
short int
x;
/ 2byte integer /
int
y;
/ 4byte integer /
double
z;
/ 8byte float
};
#pragma pack(pop)
/ restore
original alignment
or
Borland
The
gcc
structure.
minimum
possible
space
one to pack a
to use the
compiler
for
the
structure.
way.
use
the
minimum
support the
using
and
same
a #pragma
Borlands
compilers
both
directive.
#pragma pack(1)
on
elements of structures
with
no extra
The
replaced
on word,
alignment
paragraph
stays
directive
these directives
in
effect
This
are
differently
lead
to
respectively.
until
The
overridden
by
problems since
can
be
or sixteen to specify
can cause
laid out
(i.e.,
one can
boundaries,
another directive
This
byte boundaries
padding).
these structures
may
be
modules of
Different
is
a way to
avoid
current
alignment
state
and
this
problem.
a way to save
restore
the
it later
7.1.3
Bit Fields
one to
specify members of
use a spec-ified
number of bits.
a multiple
an
or int
This defines
member
a 32-bit
with
147
struct S {
unsigned f1
:
3;
/ 3bit field
unsigned f2 :
10;
/ 10bit field /
unsigned f3 :
11;
/ 11bit field /
unsigned f4 :
8;
/ 8bit field
};
Byte \ Bit
Logical Unit #
msb of LBA
Transfer Length
Control
8 bits
11bits
10 bits
3 bits
f4
The
f3
first
bitfield
f2
is
to the
assigned
f1
least
so simple if one
are actually stored in
memory. The diffi culty occurs when bitfields
span byte boundaries. Because the bytes on a
little endian processor will be reversed in memory.
However, the format is not
looks
5 bits
3 bits
3 bits
5 bits
8 bits
8 bits
f2l
f1
f3l
f2m
f3m
f4
The f2l label refers to the last five bits (i.e., the
five least significant bits) of the f2 bit field. The
bits
of
boundaries.
If
one reverses
all
memory
layout
is not usually
out of the
common
program
(which
is actually
It is
quite
common
for
some
flexibility
However,
in exactly
common
the
compilers
bits
(gcc,
are
laid
Microsoft
148
STRUCTURES AND C++
CHAPTER 7
out
and
)\
||
defined( MSC VER))
34
#if MS OR BORLAND
5
pragma
pack(push)
pragma
pack(1)
#endif
89
:
8;
10
unsigned opcode
11
12
13
14
15
16
unsigned control
:
5;
:
3;
:
8;
18
20
attribute
21
:
8;
:
8;
17
19
/ middle bits /
((packed))
#endif
22
23
#if MS OR BORLAND
24
25
#endif
pragma
pack(pop)
3.
A direct read command
message to the
a six byte
this using
7.6,
one sees
spans
endian format.
Figure
From Figure
7.7 shows
in big
definition
a macro
or
The potentially
one
as a
are
Borland
parts
confusing
are
defined separately
reason
is
fields
3
Next,
the
lba msb
appear to be reversed;
and
logical unit
however,
an industry
7.1. STRUCTURES
149
8 bits
3 bits
control
8 bits
5 bits
transfer length
8 bits
8 bits
8 bits
lba lsb
lba mid
logical unit
lba msb
opcode
10
:
5;
:
3;
/ middle bits /
attribute
11
12
#endif
13
((packed))
Format Structure
case.
are again
as a
are
are arranged
To complicate
for
the
correctly
matters
for
Microsoft
more,
does
C.
the definition
not
If
quite
the
work
ex-pression
is evalutated,
the
uses
compiler
Microsoft
the
type of the
are
defined
as
unsigned
types,
the compiler
structure to make it
words. This
can
an integral
be remedied
number of double
compiler
any
pad bytes
The other
avoids
fields by using
unsigned char.
The reader should not be discouraged
if he
confusing.
It is
found
the previous
confusing!
The
discussion
author
often
it
finds
Mixing
to examine
different
types
and
of
bit
modify
fields
use bit
the
leads
less
to
bits
very
150
CHAPTER 7
manually.
7.1.4
As discussed
assembly
is
above, accessing
very
much like
a structure in
an array
how one would
would zero out
accessing
Assuming
the
zero
y( S
s p );
%define
_zero_y:
y_offset
enter
0,0
mov
eax, [ebp + 8]
s_p (struct
mov
6
;
get
+ y_offset],
dword [eax
leave
ret
C allows
a function;
by value to
a bad
structure
the
more
a pointer to a structure instead.
C also allows a structure type to be used as
the return value of a func-tion. Obviously
a
structure can not be returned in the EAX register.
then retrieved
compilers
Different
handle
common
this
situation
com-pilers
use is to internally rewrite the function as one
that takes a structure
pointer as a parameter.
differently.
solution that
a structure
Most
built-in support
assembly
(including
for defining
code. Consult
your
NASM)
structures
documentation
details.
7.2
in
have
your
for
The
programming
C++
language
is
an
are
However,
Also,
some
easier to understand
assembly
language
knowledge of C++.
some
rules need to be
with
This section
a knowledge of
assumes a basic
7.2. ASSEMBLY
AND C++
151
1
#include <stdio.h>
23
void f(int
4
printf (%d\n,
x)
{
x);
78
void f(double
9
printf (%g\n,
10
11
x)
{
x);
7.2.1
C++
allows
different
functions
(and
class
same name to be
defined
When more than one function share
the same name, the functions are said to be
overloaded. If two functions are defined with the
same name in C, the linker will produce an
error because it will find two definitions for the
same symbol in the object files it is linking. For
member
functions)
with the
equivalent
C++
avoids
In a
creating
the
label
mangling, too. It
of the C function
for
the
function.
name of both
functions in Figure 7.10 the same way and
produce
an error. C++ uses a more
sophisticated mangling process that produces two
However,
C will
func-tion
first
assigned
the
different
the
mangle
in Figure
by DJGPP
For example,
7.10
would
be
any
linker
errors.
Unfortunately,
to manage
mangle
Borland
@f$qd
names
names
there is
no
differently.
C++ would
use
example
are not
in Fig-ure 7.10
completely
arbitrary
name
The mangled
function is defined
a single
int
has an i
at the end of its mangled
name (for both DJGPP and Borland) and that
the one that takes a double argument has a d at
the end of its mangled name. If there was a
argument
152
CHAPTER 7
void f(int
x,int y,double
z);
name to
be f Fiid
functions
mangled
overloading
signature
name.
in
C++.
fact
explains
unique
Only
may
functions
rule
of
whose
one
can see, if two functions with the same name and
signature are defined in C++, they will produce
the same mangled name and will create a linker
error. By default, all C++ functions are name
signatures
are
and
This
be overloaded. As
even ones
mangled,
no way
are
that
When it is compiling
not
overloaded
a particular function
is overloaded or not, so it mangles all names. In
fact, it also mangles the names of global variables
by encoding the type of the variable in a similar
way as function signatures. Thus, if one defines a
global variable in one file as a certain type and
then tries to use it in another file as the wrong
type, a linker error will be produced. This
characteristic
of C++ is known as typesafe
linking. It also exposes another type of error,
inconsistent prototypes
This occurs when the
definition of a function in one module does not
agree with the prototype used by another module.
In C, this can be a very diffi cult problem to
debug. C does not catch this error. The program
of knowing whether
will compile
behavior
different
as
types
on
code
will be pushing
the function
a linker error.
a function
types
of
a matching
the
function by looking at
arguments
passed
to the
If it finds
function
name
mangling rules.
use
compilers
Since different
name
different
may not
compilers
This
fact is important
when considering
using
function
that
a
a
name
mangling
student
may
question
whether
as
expected.
name
function
compiler
printf.
be
This
is
mangled
valid
was
and
CALL to the
concern!
If
the
label
the
compiler
will
arguments.
scope
consider
The
rules
an exact
matches
made
for this
process
a C++
match, the
by
casting
the
are
beyond
the
7.2. ASSEMBLY
AND C++
153
for pointer, C for const,
ellipsis.)
This
would
for
the regular
not call
librarys
be
assembly
code using
the normal
C mangling
conventions.
uses
terminology,
or
global variable it
or
global variable
uses
...
);
For
linkage
function
convenience,
of
block
C++
of
also
functions
allows
and
the
global
prototypes
variables
and function
cplusplus
extern C {
#endif
containing
an extern
"C" block if
7.2.2
References
References
They allow
consider
the
code
reference parameters
in Figure
are pretty
For example,
7.11.
Actually,
154
CHAPTER 7
{x++; }
a reference parameter
34
int main()
5
y =5;
int
f(y);
no & here!
return 0;
10
printf (%d\n,
simple,
they
really
are
just
pointers.
The
programmer (j
implement
var
ust
as
Pascal
parameters
as
compilers
pointers)
When
the compiler
passes
the address
function f in assembly,
6:
prototype
was
of
are
References
just
convenience
are
that
another
define
feature
meanings
structure
use
or
is to
of C++
for
that
the
(+) operator
plus
the
operators
define
one to
on
a common
allows
common
a and
to
b were
strings
one
of the string
as operat or
pass
would like to
objects
in-stead
of
could be done
would require
7.2.3
Inline functions
C++
error-prone,
preprocessor-based
macros
that take
Of
course,
name
mangling
as discussed
in
Section 7.2.1
7
as an extension
155
1
inline int
inline f (int
x)
{return xx; }
34
int f(int
5
x)
{return xx; }
67
int main()
8
int
y =inline
11
f (x);
return 0;
12
13
y,x = 5;
y =f(x);
10
Because the
C
and
parenthesis
preprocessor
does
are
simple
sub-stitutions,
the
answer
in most
cases.
However,
answer
are
As
demonstrated,
version
of making
function.
even this
for SQR(x++).
the
on
chapter
performing
a very
the
simple
subprograms
may
be
more
way to
write
normal
function,
but
common
block
functions
are
function.
code
that
of code. Instead,
calls to inline
C++ allows
function
to be
made
declared
y is at ebp-4):
push
dword [ebp-8]
a normal
x is at
to
function call
3
4
call
_f
pop
mov
ecx
[ebp-4],
eax
156
CHAPTER 7
mov
eax, [ebp-8]
imul
eax, eax
mov
[ebp-4],
In this
case,
there
eax
are two
advantages
to
parameters
are
pushed
on
no stack
no branch is
func-tion call uses less
the stack,
of inlining
so
is that
the code of
an
function
only
name
of
is available
However,
using
knowledge
This
means
the
that
inline
of the function.
non-inline
if
function is changed,
use
does not
the function
header
requires
function
are usually
reasons,
placed in
is contrary
to the
are never
code statements
7.2.4
Classes
A C++ class
describes
a type
of object. An
normal C
struct with a
Actually,
C++
uses
the
functions
are not
stored in
memory
member
functions.
passed
structure
to
However,
functions
are different from other
pointer to the object acted
They are
a hidden parameter.
pointer to the
on from inside
This parameter is
the member
method
object
on.
For
consider
assigned
example,
of the Simple
class of Fig-
ure
7.13.
like
If it
was
function
that
was
passed a
on as the code
-S switch on the
gcc and Borland
explicitly
compilers
an
compiler
as well)
assembly
file
(and
the
containing
the
equivalent
gcc
the assembly
.s extension
8
For
file ends in an
in C++
and unfortu-
or more
generally
methods.
7.2. ASSEMBLY
AND C++
157
1
class Simple {
public:
Simple();
// default constructor
Simple();
// destructor
// member functions
private:
int data;
// member data
};
10
11
12
Simple::Simple()
{data
= 0;}
13
14
15
Simple::Simple()
{/ null body / }
16
17
int Simple::get
18
{return data; }
data() const
19
20
21
x)
= x;}
nately
uses
AT&T
assembly
language
syntax
a file
7.15
Figure
converted
shows
very
DJGPP
the
purpose
of the statements. On
is assigned
name
of
added to clarify
the
output
the
mangled
the
parameters.
because
The
other
name
classes
might
have
method
assigned
different
labels.
encoded
so
the
that
parameters
The
can
class
overload
different
are
the
just
as
as
before,
Next
prologue
9
The
called
the
gcc
compiler
On
are
differences
outputs
several
5,
system includes
compiler
There
free
on lines 2
appears.
the code
pages on
its
in the
the
web
named
a2i
format
that
program
own
assembler
discuss
gas.
the
There is also
(http://www.multimania.com/placr/a2i.html),
that
158
CHAPTER 7
int
x)
{
object>data
= x;
_set_data__6Simplei:
mangled
name
push
ebp
mov
ebp, esp
mov
eax, [ebp + 8]
parameter
;
data
+12]
is at offset 0
leave
10
;
edx
mov
ret
;
eax
mov
= integer
[eax], edx
10
the
x parameter
into
on,
which being
Example
This section
a C++
uses
an unsigned
can
be any size, it will be stored in an array
of unsigned integers (double words)
It can be
made any size by using dynamical allocation.
11
The double words are stored in reverse order
create
integer of arbitrary
class
array
that
is used to store
10
11
Why?
Because
start processing
addition
operations
at the beginning
of the
array
move
and
forward.
12
source
159
some
code
of the
Parameters:
size
as a normal
unsigned int
=0);
Parameters:
13
14
15
16
18
initial
initial value
12
17
of
10
11
as number
size
of
initial
as a string
holding
/
Big int (size t
const char
19
as number
size
size
initial value );
20
21
22
23
24
25
size () const;
26
27
28
= (const
+(const
29
30
31
32
== (const
35
36
37
33
34
< (const
<< (ostream
&
os,
38
private:
size t
39
41
size
unsigned number
40
// size of unsigned
;
//
array
pointer to unsigned
array
holding value
};
160
STRUCTURES AND C++
CHAPTER
extern C {
res,
res,
10
11
12
+(const
13
14
int
op1, op2);
16
if (res
17
== 2)
18
return result
19
20
res
if (res
15
21
22
23
inline Big int operator (const Big int & op1, const Big int & op2)
{
24
25
int
op1, op2);
27
if (res
28
== 2)
29
return result
30
31
res
if (res
26
AND C++
161
assigned offset
zero
and the
number member is
assigned offset 4.
To
these
simplify
same
example,
size
only
object
unsigned
the class
integer;
initializes
the instance
contains
the second
by using
hexadecimal
the first
instance by using
value.
(line 18)
string
The
that
third
subtraction
where
operators
the assembly
on
work
language
since
this
is
is used. Figure
set
up to
call
different compilers
rules
for
functions
the
use radically
operator
are
assembly
up
are
Since
different mangling
functions,
used to set
routines.
inline
operator
calls to C linkage
easy to
need to throw
an exception
as fast as
also eliminates
the
from assembly!
to perform
carry must
multiple
be
CPUs
only
carry
flag. Performing
be done
recalculate the
by
carry
programmer to access
having
C++
can
be accessed
independently
the
more
where the
add it
effi cient to
carry
flag
which automatically
adds the
carry
flag in makes
a lot of sense.
For brevity, only the add big ints assembly
routine will be discussed
big math.
1
segment .text
global
add_big_ints,
%define size_offset 0
%define number_offset 4
56
%define EXIT_OK 0
7
%define EXIT_OVERFLOW 1
%define EXIT_SIZE_MISMATCH 2
10
;
Parameters
11
%define
12
%define op1ebp+12
13
res ebp+8
14
sub_big_ints
sub routines
162
15
add_big_ints:
16
push
ebp
17
mov
ebp, esp
18
push
ebx
19
push
esi
20
push
edi
21
;
first
;
22
23
set
up esi to
edi to
point to op1
point to op2
24
25
26
27
28
;
mov
mov
mov
;
ebx, [res]
30
;
make sure
;
31
mov
32
cmp
33
jne
sizes_not_equal
34
cmp
35
jne
sizes_not_equal
29
have
36
37
38
39
40
mov
ecx, eax
;
;
now, set registers to point
;
esi = op1.number_
to their
= op2.number_
= res.number_
edi
ebx
43
;
;
44
mov
ebx, [ebx
45
mov
46
mov
41
42
+number_offset]
47
clc
48
xor
edx, edx
;
;
addition loop
49
50
51
52
add_loop:
53
mov
eax, [edi+4*edx]
54
adc
eax, [esi+4*edx]
55
mov
[ebx
56
inc
edx
the
same
+4*edx], eax
size
;
op1.size_
!= op2.size_
;
op1.size_
!= res.size_
;
ecx
=size of Big_ints
respective
arrays
;
clear carry
;
edx
=0
flag
;
does
not alter
carry
flag
57
loop
add_loop
jc
overflow
xor
eax, eax
jmp
done
58
59
60
ok_done:
61
62
63
overflow:
64
mov
eax, EXIT_OVERFLOW
65
jmp
done
66
sizes_not_equal:
mov
eax, EXIT_SIZE_MISMATCH
69
pop
edi
70
pop
esi
71
pop
ebx
72
leave
73
ret
67
68
done:
big math.asm
163
;
return
value
= EXIT_OK
most
Hopefully,
straightforward
of
this
code
to the reader by
to 27 store pointers
should
now.
be
Lines 25
really
to point to the
objects
instead
array
of
objects
themselves
arrays
together
dword
significant
in this
first, then
the
next
least
significant
sequence
on overflow
the
carry
array are
in the
moves
array
a short
necessary
constructor
conversion
unsigned
as on line
reasons. First,
explicitly
for two
int
to
that
will
a Big int.
same size can
problematic
16. This
is
there is
no
an
convert
Secondly,
since it would be
any
author
implementation
only
be added. This
of the
more
class would
size to be added to
did not
wish to
164
CHAPTER
#include <iostream>
using
namespace
std;
45
int main()
6
try {
10
cout
11
c =a + b;
12
c =c + a;
13
cout
14
15
16
cout
<< c1
18
cout
19
cout
<< c
20
cout
21
22
23
24
25
){
26
27
28
29
mismatch
<< endl;
30
return 0;
31
32
<< endl;
17
165
1
#include <cstddef>
#include <iostream>
using
namespace
std;
45
class A {
6
public:
cdecl m() {cout
void
int ad;
};
10
11
12
class B :
public A {
public:
cdecl m() {cout
13
void
14
int bd;
15
};
16
17
void f( A
18
= 5;
19
p>ad
20
p>m();
21
p)
22
23
int main()
24
25
A a;
26
Bb;
27
cout
28
29
cout
30
<< Offset
31
<< Offset
32
f(&a);
33
f(&b);
<< endl;
return 0;
34
35
166
_f__FP1A:
push
ebp
mov
ebp, esp
mov
eax, [ebp+8]
mov
dword [eax], 5
mov
eax, [ebp+8]
push
eax
call
_m__1A
add
esp, 4
10
leave
11
ret
;
mangled function name
;
eax points to object
;
using offset 0 for ad
;
passing address of object
to A::m()
;
mangled
method
name
for A::m()
7.2.5
Inheritance
data
and
consider
allows
methods
one
of
class
another.
to inherit
For
the
example,
Size of
a:4 Offset
of ad: 0
pointer to either
an A object
the (edited)
asm
code
(generated
by gcc).
a and
one can see that
m method was
assembly,
hard-coded
into
object-oriented
the
function.
programming,
should depend
on what type
the function.
This
For
true
the method
called
of object is passed to
is known
as
polymorphism.
uses
would be changed.
transition
can
be implemented
gccs
becoming significantly
initial
cover
the
more
implementation.
simplifying
many ways.
implementation
complicated
In
this discussion,
the
is
in
and is
than its
interest
of
Windows
based
Microsoft
and
Borland
167
class A {
public:
virtual void
int ad;
};
67
class B :
public A {
8
public:
virtual void
11
int bd;
10
};
years
program
changes:
Size of
a:8 Offset
of ad: 4
a B object.
This is
an A is
now
answer to these
are
questions
related
to how polymorphism
is
implemented.
any
virtual methods is
given
an extra
array
of method pointers
called
pointer
is stored
compilers
always
at offset
put
0. The
this
Windows
pointer
at
the
can see
of the
program, one
m is not to a
on the
For classes
always
without
virtual
methods
with
C++
compilers
normal C struct
with the
14
It
Of
same
data members.
course,
was put
very
effi cient
because
optimizations turned
it
was
generated
without
compiler
on.
168
?f@@YAXPAVA@@@Z:
push
ebp
mov
ebp, esp
mov
eax, [ebp+8]
mov
dword [eax+4], 5
mov
mov
ecx, [ebp + 8]
10
mov
eax, [ebp + 8]
11
push
eax
12
call
dword [edx]
13
add
esp, 4
45
78
edx, [ecx]
14
15
pop
16
ret
ebp
;
p->ad
=5;
=p
;
edx = pointer
;
eax = p
;
ecx
;
push "this"
to vtable
pointer
;
call first function
;
clean up stack
invtable
This
allows
code
to
call
the
case
call to
certain
why
declared
to
using the
use
uses a different
methods
passes
the C calling
convention
calling
conven-tion for
C++ class
C convention.
It
on by
the
The
stack
explicit parameters
modifier
is still used
of using the
tells it to
convention.
use
Borland
the standard
uses
C++
lets
complicated
look
example
at
(Figure
cdecl
C calling
the C calling
convention by default.
Next
by
slightly
7.23)
more
In it, the
m2. Remember
its
own
method.
m2 method,
Figure
7.24
it inherits
the
shows
the b object
how
A classs
vtable
for each
object.
The
two B objectss
addresses
look
at the addresses
looking at assembly
in the
output,
169
vtables.
one can
From
determine
class A {
public:
virtual void
virtual void
int ad;
};
78
class B :
public A {
9
virtual void
10
int bd;
11
12
// Binherits As m2()
public:
};
13
14
15
pa
psees pa as an array
16
//
17
unsigned
18
// vt
sees
of dwords
p = reinterpret
vtable
= reinterpret
19
void vt
20
cout
21
cout
22
cast<unsigned
as an array
<< dword
>(pa);
of pointers
cast<void
address
>(p[0]);
<< i
<< :
23
functions in EXTREMELY
24
// call
25
void (m1func
virtual
pointer)(A
26
m1func pointer
= reinterpret
27
m1func pointer(pa);
);
nonportable
way!
cast<void ()(A)>(vt[0]);
// call method m1via function pointer
28
29
void (m2func
pointer)(A
30
m2func pointer
= reinterpret
m2func pointer(pa);
31
32
);
cast<void ()(A)>(vt[1]);
// call method m2 via function pointer
33
34
int main()
35
36
A a;
Bb1;
37
cout
<< a:
Bb2;
<< endl;
38
cout
<< b1:
39
cout
<< b2:
40
return 0;
41
170
CHAPTER 7
048
vtablep
ad
bd
&B::m1()
&A::m2()
vtable
b1
a:
vtable address
=004120E8
dword 0:
00401320
dword 1:
00401350
A::m1()
A::m2()
b1:
vtable address
=004120F0
dword 0:
004013A0
dword 1:
00401350
B::m1()
A::m2()
b2:
vtable address
=004120F0
dword 0:
004013A0
dword 1:
00401350
B::m1()
A::m2()
7.2. ASSEMBLY
AND C++
171
pointers
because
class B
one
could call
stored into
C-type
function
pointer
with
an
7.25,
use
the vtable.
There
are some
to be
very
one
use a binary
and writing
read
would have
implement
use
class objects
16
interfaces
vtables
compilers
Only
virtual method
vtables
as
Object Model)
to implement
that
COM
implement
does
can
uses
one of
the
Microsoft
the
COM
classes.
The
exactly
code
like
for
the
virtual
non-virtual
one.
method
it
sure
can
ignore
the vtable
can be
will be
and call
7.2.6
looks
the
The workings
RunTime
Type Information,
exception
are
good
starting
Reference
point
is
The
handling
beyond the
go
(e.g.,
scope
a
further,
Anno-tated
C++
and
15
172
gcc.
use the
CHAPTER 7
Appendix A
80x86 Instructions
A.1
Non-floating Point
Instructions
This section lists and describes the actions
abbreviations:
general register
R8
8-bit register
R16
16-bit register
R32
32-bit register
SR
segment register
These
can
memory
M8
byte
M16
word
M32
double word
immediate value
instructions.
means
For
example,
the
format
R, R
operands.
allow the
same
a 8-bit
register
an operand,
or memory can
the abbreviation,
be
R/M8 is
used.
The table also shows how various bits of the
FLAGS register
are
affected
by each instruction
bit is
particular
value,
1or 0 is
depends
placed
on the
in the
modified in
shown in the
to
value that
some
Finally,
undefined
way a
if the
a C is
bit
is
? appears in
173
174
APPENDIX
A.
80X86 INSTRUCTIONS
Flags
Name
Description
ADC
O2
ADD
Add Integers
O2
AND
Bitwise AND
O2
BSWAP
Byte Swap
R32
CALL
Call Routine
RMI
CBW
CDQ
Convert
Dword
Formats
to
Qword
CLC
Clear Carry
CLD
CMC
Complement
CMP
Compare Integers
CMPSB
Carry
C
C
Compare Bytes
CMPSW
Compare Words
CMPSD
Compare Dwords
CWD
Convert
Word
O2
to
Convert
Word
to
Decrement Integer
RM
DIV
Unsigned Divide
RM
ENTER
I,0
IDIV
Signed Divide
RM
IMUL
Signed Multiply
R16,R/M16
R32,R/M32
R16,I
R32,I
R16,R/M16,I
R32,R/M32,I
INC
Increment Integer
RM
INT
Generate Interrupt
JA
Jump Above
JAE
Jump Above
JB
Jump Below
JBE
Jump Below
JC
Jump Carry
or Equal
I
I
or Equal
I
I
175
INSTRUCTIONS
Flags
Name
Description
Formats
=0
JCXZ
Jump if CX
JE
Jump Equal
JG
Jump Greater
JGE
Jump
I
I
I
Greater
or
Equal
JL
Jump Less
JLE
Jump Less
JMP
Unconditional Jump
JNA
JNAE
or Equal
I
RMI
I
or
Equal
JNB
JNBE
or
Equal
JNC
Jump No Carry
JNE
JNG
JNGE
or
Equal
JNL
JNLE
Jump
Not
Less
or
Equal
JNO
Jump No Overflow
JNS
Jump No Sign
JNZ
JO
Jump Overflow
JPE
JPO
JS
Jump Sign
JZ
Jump Zero
LAHF
LEA
LEAVE
LODSB
Load Byte
LODSW
Load Word
LODSD
Load Dword
LOOP
Loop
LOOPE/LOOPZ
Loop If Equal
LOOPNE/LOOPNZ
176
80X86 INSTRUCTIONS
R32,M
APPENDIX
A.
Flags
Name
MOV
Description
Formats
Move Data
O2
SR,R/M16
R/M16,SR
MOVSB
Move Byte
MOVSW
Move Word
MOVSD
Move Dword
MOVSX
Move Signed
R16,R/M8
R32,R/M8
R32,R/M16
MOVZX
Move Unsigned
R16,R/M8
R32,R/M8
R32,R/M16
MUL
Unsigned Multiply
RM
NEG
Negate
RM
NOP
No Operation
NOT
1s Complement
RM
OR
Bitwise OR
O2
POP
R/M16
POPA
Pop All
POPF
Pop FLAGS
PUSH
Push to Stack
PUSHA
Push All
PUSHF
Push FLAGS
RCL
R/M32
R/M16
R/M32 I
R/M,I
R/M,CL
RCR
Rotate
Right
with
Carry
REP
Repeat
REPE/REPZ
Repeat If Equal
REPNE/REPNZ
RET
Return
ROL
Rotate Left
R/M,I
R/M,CL
R/M,I
R/M,CL
ROR
Rotate Right
R/M,I
R/M,CL
AH
Copies
SAHF
into
FLAGS
177
INSTRUCTIONS
Flags
Name
SAL
Description
Formats
Shifts to Left
R/M,I
R/M, CL
SBB
SCASB
SCASW
SCASD
SETA
Set Above
SETAE
Set Above
SETB
Set Below
SETBE
Set Below
SETC
Set Carry
R/M8
SETE
Set Equal
R/M8
SETG
Set Greater
SETGE
Set Greater
SETL
Set Less
SETLE
Set Less
SETNA
SETNAE
Set
O2
R/M8
or Equal
R/M8
R/M8
or Equal
R/M8
R/M8
or Equal
R/M8
R/M8
or Equal
Not
R/M8
R/M8
Above
or
R/M8
Equal
SETNB
SETNBE
Set
Not
R/M8
Below
or
R/M8
Equal
SETNC
Set No Carry
R/M8
SETNE
R/M8
SETNG
SETNGE
Set
Not
Greater
R/M8
or
R/M8
Equal
SETNL
SETNLE
R/M8
or Equal
R/M8
SETNO
Set No Overflow
R/M8
SETNS
Set No Sign
R/M8
SETNZ
R/M8
SETO
Set Overflow
R/M8
SETPE
R/M8
SETPO
R/M8
SETS
Set Sign
R/M8
SETZ
Set Zero
SAR
Arithmetic
R/M8
Shift
to
Right
R/M,I
R/M, CL
178
APPENDIX
A.
80X86 INSTRUCTIONS
Flags
Name
SHR
Description
Formats
R/M,I
R/M, CL
SHL
R/M,I
R/M, CL
STC
Set Carry
STD
STOSB
Store Btye
STOSW
Store Word
STOSD
Store Dword
SUB
Subtract
O2
TEST
Logical Compare
R/M,R
R/M,I
XCHG
Exchange
XOR
Bitwise XOR
R/M,R
R,R/M
O2
179
A.2
In this section,
coprocessor
section
of
information
the
of the 80x86
are
instructions
description
operation
many
described.
about whether
the
The
describes
briefly
instruction.
math
the
save space,
instruction pops
To
format
operands
column
can be used
what
type
are used:
STn
register
I32
I64
Instructions requiring
terisk(
of
following abbreviations
coprocessor
shows
).
a Pentium
Pro
with anas-
Description
ST0
FABS
FADDP dest[,ST0]
+= src
+= STO
dest += ST0
FCHS
ST0
FADD
src
FCOM
src
FCOMP
ST0
STn FD
dest
STn
STn
= ST0
src
Format
= |ST0|
src
src
STn FD
STn FD
src
src
STn
STn
ST0 /= src
STn FD
dest /= STO
STn
FDIVP dest[,ST0]
dest /= ST0
STn
FCOMPP
FCOMI
FCOMIP
FDIV
src
src
FDIVRP dest[,ST0]
= src/ST0
dest = ST0/dest
dest = ST0/dest
FFREE dest
Marks
FDIVR
src
STn FD
ST0
STn
STn
as empty
STn
+= src
FIADD
src
ST0
FICOM
src
I16 I32
Compare
I16 I32
FICOMP
FIDIV
src
src
FIDIVR
src
STO /= src
STO
= src/ST0
180
80X86 INSTRUCTIONS
I16 I32
src
ST0 and src
I16 I32
I16 I32
APPENDIX
A.
Instruction
FILD
src
FIMUL
src
Description
ST0 *= src
I16 I32
FINIT
Initialize Coprocessor
FIST dest
Store ST0
FISTP dest
Store ST0
FISUB
src
FISUBR
src
-= src
ST0 =src -ST0
ST0
FLD src
FLD1
FLDCW
src
Format
I16 I32
I16 I32 I64
I16 I32
I16 I32
STn FDE
I16
Push on Stack
FLDPI
FLDZ
ST0 *= src
STn FD
dest *= STO
STn
FMULP dest[,ST0]
dest *= ST0
STn
FRNDINT
Round ST0
FMUL
src
= ST0
=
bST1c
FSCALE
ST0
FSQRT
ST0
FST dest
Store ST0
FSTP dest
Store ST0
FSTCW dest
I16
FSTSW dest
I16 AX
STO
FSUBP dest[,ST0]
-= src
dest -= STO
dest -= ST0
ST0 =src-ST0
dest = ST0-dest
dest = ST0-dest
FTST
FXCH dest
FSUB
src
src
ST0
STn FD
STn FDE
STn FD
STn
STn
STn FD
STn
STn
STn
Index
ADC, 37, 54
ADD, 13, 36
AND, 50
array1.asm, 99102
arrays,
95115
accessing, 96102
defining, 9596
local variable, 96
static, 95
multidimensional, 103106
parameters, 105106
binary, 12
addition, 2
bit operations
AND, 50
assembly, 5253
C,5657
NOT, 51
OR, 50
shifts, 4750
arithmetic shifts, 48
logical shifts, 4748
rotates, 49
XOR, 51
branch prediction, 53
bss segment, 21
BSWAP, 59
BYTE, 16
byte, 4
C driver, 19
181
C++, 150171
copy constructor,
161
extern C, 153
inheritance, 166171
inline functions, 154156
see meth-
ods
name
mangling, 151153
polymorphism, 166171
references, 153154
84
cdecl, 84
stdcall, 84
C, 21, 71, 8084
labels, 82
parameters, 82
registers, 81
return values, 83
Pascal, 71
register, 84
standard call, 84
stdcall, 72, 84, 171
CBW, 31
CDQ, 31
CLC, 37
182
CLD, 106
clock, 5
CMP, 3738
CMPSB, 109
CMPSD, 109
CMPSW, 109
code segment, 21
COM, 171
comment, 12
compiler, 5,11
Borland, 22, 23
DJGPP, 22, 23
gcc, 22
attribute
149
Microsoft, 22
pragma
Watcom, 83
method
one, 6061
80x86, 6
CWD, 31
CWDE, 31
data segment, 21
debugging, 1618
DEC, 13
decimal, 1
directive, 1315
%define, 14
DX, 14, 95
data, 1415
DD, 15
DQ, 15
equ, 13
extern, 77
global, 21, 78, 80
RESX, 14, 95
TIMES, 15, 95
INDEX
DIV, 34, 48
do while loop, 43
DWORD, 16
FADD, 126
FADDP, 126
FCHS, 130
FCOM, 129
FCOMI, 130
FCOMIP, 130, 140
FCOMP, 129
FCOMPP, 129
FDIV, 128
FDIVP, 128
FDIVR, 128
FDIVRP, 128
FFREE, 126
FIADD, 126
FICOM, 129
FICOMP, 129
FIDIV, 128
FIDIVR, 128
FILD, 125
FIST, 126
FISUB, 127
FISUBR, 127
FLD, 125
FLD1, 125
FLDCW, 126
FLDZ, 125
one, 120
IEEE, 119122
floating point
INDEX
coprocessor,
124139
127
comparisons, 128130
hardware, 124125
loading and storing data, 125
126
multiplication and division, 128
FMUL, 128
FMULP, 128
FSCALE, 130, 141
FSQRT, 130
FST, 126
FSTCW, 126
FSTP, 126
FSTSW, 129
FSUB, 127
FSUBP, 127
FSUBR, 127
FSUBRP, 127
FTST, 129
FXCH, 126
gas, 157
hexadecimal, 34
I/O, 1618
asm io library,
1618
dump math, 18
dump
dump
mem, 17
regs, 17
dump stack, 17
print char, 17
print int, 17
print nl, 17
print string, 17
read char, 17
read int, 17
IDIV, 34
if statment, 42
immediate, 12
IMUL, 3334
INC, 13
183
arrays,
98102
integer, 2738
comparisons, 3738
division, 34
representation, 2733
ones complement, 28
signed magnitude, 27
twos complement, 2830
interrupt, 10
JC, 39
JE, 41
JG, 41
JGE, 41
JL, 41
JLE, 41
JMP, 3839
JNC, 39
JNE, 41
JNG, 41
JNGE, 41
JNL, 41
JNLE, 41
JNO, 39
JNP, 39
JNS, 39
JNZ, 39
JO, 39
JP, 39
JS, 39
JZ, 39
label, 1416
LAHF, 129
LEA, 83, 103
184
linking, 23
LOOPE, 41
LOOPNE, 41
LOOPNZ, 41
LOOPZ, 41
memory, 45
pages, 10
segments, 9,10
virtual, 9,10
memory.asm,
111115
memory:segments, 9
methods, 156
mnemonic, 11
MOV, 12
MOVSB, 108
MOVSD, 108
MOVSW, 108
MOVSX, 31
MOVZX, 31
multi-module
NASM, 12
NEG, 34, 55
nibble, 4
NOT, 52
programs,
7780
opcode, 11
OR, 51
prime.asm, 4345
prime2.asm, 135139
protected mode
16-bit, 9
INDEX
32-bit, 10
quad.asm, 130133
QWORD, 16
RCL, 49
RCR, 49
read.asm, 133135
real mode, 89
recursion, 8990
register, 5,78
32-bit, 8
EDI, 107
EDX:EAX, 31, 34, 37, 83
EFLAGS, 8
EIP, 8
ESI, 107
FLAGS, 7,3738
CF, 38
DF, 106
OF, 38
PF, 39
SF, 38
ZF, 38
index, 7
IP, 7
segment, 7,8,107
stack pointer, 7,8
REP, 108109
REPE, 110
REPNE, 110
see REPNE
see REPE
REPNZ,
REPZ,
RET, 6970, 72
ROL, 49
ROR, 49
SAHF, 129
SAL, 48
SAR, 48
SBB, 37
SCASB, 109
INDEX
185
SCASD, 109
16 SCASW, 109
WORD,
word,
8 SCSI, 147149
SETxx, 54
60 SETG, 55
51 SHL, 47
SHR, 47
skeleton file, 25
speculative execution, 53
stack, 6876
local variables, 7576, 8283
parameters, 7073
startup code, 23
STD, 106
storage types
XCHG,
XOR,
automatic, 91
global, 91
register, 93
static, 91
volatile, 93
STOSB, 107
STOSD, 107
structures, 143150
alignment, 145146
bit fields, 146150
offsetof(), 144
SUB, 13, 36
subprogram, 6693
calling, 6976
reentrant, 89
subroutine,
see subprogram
TASM, 12
TCP/IP, 59
TEST, 51
text segment,
arithmetic, 3337
TWORD, 16
UNICODE, 59
while loop, 43