What Is Assembly Language?: Introduction To The GNU/Linux Assembler and Linker For Intel Pentium Processors

High-Level Language
Most programming nowadays is done using
so-called high-level languages (such as
FORTRAN, BASIC, COBOL, Pascal, C,
C++, Java, Scheme, Lisp, Ada, etc.)
These languages deliberately hide from
the programmer many details concerning
HOW the problem will actually be solved
by the underlying computing machinery
The BASIC language

Some languages allow programmers to
forget about the computer completely!
The language can express a computing
problem with a few words of English, plus
formulas familiar from high-school algebra
EXAMPLE PROBLEM: Compute 4 plus 5
The example in BASIC

    LET X = 4
    LET Y = 5
    LET Z = X + Y
    PRINT X; "+"; Y; "="; Z
    END
Output:
4+5=9
The C language
Other high-level languages do require a small
amount of awareness by the program-author of
how a computation is going to be processed
For example, that:
- the main program will get linked with a
library of other special-purpose subroutines
- instructions and data will get placed into
separate sections of the machine's memory
- variables and constants get treated differently
- data items have specific space requirements
Same example: rewritten in C

    #include <stdio.h>      // needed for printf()

    int x = 4, y = 5;       // initialized variables
    int z;                  // uninitialized variable

    int main()
    {
        z = x + y;
        printf( "%d + %d = %d \n", x, y, z );
    }
ends versus means

Key point: high-level languages let programmers
focus attention on the problem to be solved, and
not spend effort thinking about details of how a
particular piece of electrical machinery is going to
carry out the pieces of a desired computation
Key benefit: their problem gets solved sooner
(because their program can be written faster)
Programmers don't have to know very much
about how a digital computer actually works
computer scientist vs. programmer

But computer scientists DO want to know
how computers actually work:
-- so we can fix computers if they break
-- so we can use the optimum algorithm
-- so we can predict computer behavior
-- so we can devise faster computers
-- so we can build cheaper computers
-- so we can pick one suited to a problem
A machine's own language

For understanding how computers work,
we need familiarity with the computer's
own language (called machine language)
It's a LOW-LEVEL language (very detailed)
It is specific to a machine's architecture
It is a language spoken using voltages
Humans represent it with zeros and ones
Example of machine-language
Here's what a program-fragment looks like:
10100001 10111100 10010011 00000100
00001000 00000011 00000101 11000000
10010011 00000100 00001000 10100011
11000000 10010100 00000100 00001000
It means:
z = x + y;
Incomprehensible?
Though possible, it is extremely difficult,
tedious (and error-prone) for humans to
read and write raw machine-language
When unavoidable, a special notation can
help (called hexadecimal representation):
A1 BC 93 04 08
03 05 C0 93 04 08
A3 C0 94 04 08
But still this looks rather meaningless!
Hence: assembly language

There are two key ideas:
-- mnemonic opcodes: we employ abbreviations
of English language words to denote operations
-- symbolic addresses: we invent meaningful
names for memory storage locations we need
These make machine-language understandable
to humans if they know their machine's design
Let's see our example-program, rewritten using
actual assembly language for Intel's Pentium
Simplified Block Diagram

[Figure: the Central Processing Unit, Main Memory, and four I/O devices, all connected by the system bus]
Pentium's visible registers

Four general-purpose registers:
eax, ebx, ecx, edx
Four memory-addressing registers:
esp, ebp, esi, edi
Six memory-segment registers:
cs, ds, es, fs, gs, ss
An instruction-pointer and a flags register:
eip, eflags
The Fetch-Execute Cycle

[Figure: main memory holds Temporary Storage (STACK), Program Variables (DATA), and Program Instructions (TEXT); the central processor holds the ESP, EAX, and EIP registers; the two are connected by the system bus]
Define symbolic constants

        .equ    device_id, 1    # standard output
        .equ    sys_write, 4    # Linux system-call number for write
        .equ    sys_exit, 1     # Linux system-call number for exit
our program's data section

        .section        .data
x:      .int    4
y:      .int    5
z:      .int    0
fmt:    .asciz  "%d + %d = %d \n"
buf:    .space  80
len:    .int    0
our program's text section

        .section        .text
_start:
        # comment: assign z = x + y
        movl    x, %eax
        addl    y, %eax
        movl    %eax, z
text section (continued)

        # comment: prepare program's output
        pushl   z               # arg 5
        pushl   y               # arg 4
        pushl   x               # arg 3
        pushl   $fmt            # arg 2
        pushl   $buf            # arg 1
        call    sprintf         # function-call
        addl    $20, %esp       # discard args
        movl    %eax, len       # save return-value
text section (continued)

# comment: request kernel assistance
        movl    $sys_write, %eax
        movl    $device_id, %ebx
        movl    $buf, %ecx
        movl    len, %edx
        int     $0x80
text section (concluded)

# comment: request kernel assistance
        movl    $sys_exit, %eax
        movl    $0, %ebx
        int     $0x80

        # comment: make label visible to linker
        .global _start
        .end
program translation steps

[Figure: demo.s (program source module) -> assembly -> demo.o (program object module); demo.o, an object module library, and other object modules -> linking -> demo (the executable program)]
The GNU Assembler and Linker

With Linux you get free software tools for
compiling your own computer programs
An assembler (named "as"): it translates
assembly language (called the source code)
into machine language (called the object code)
$ as demo.s -o demo.o
A linker (named "ld"): it combines object files
with function libraries (if you know which ones)
What must programmer know?
Needed to use CPU register-names (eax)
Needed to know space requirements (int)
Needed to know how stack works (pushl)
Needed to make symbol global (for linker)
Needed to understand how to quit (ret)
And of course how to use system tools:
(e.g., text-editor, assembler, and linker)
Summary
High-level programming (offers easy and
speedy real-world problem-solving)
Low-level programming (offers knowledge
and power in utilizing machine capabilities)
High-level language hides lots of details
Low-level language reveals the workings
High-level programs: readily portable
Low-level programs: tied to specific CPU
In-class exercise #1
Download the source-file for demo1, and
compile it using the GNU C compiler gcc:
$ gcc demo1.c -o demo1
Website: http://cs.usfca.edu/~cruse/cs210/
Execute this compiled application using:
$ ./demo1
Download the two source-files needed for our
demo2 application (i.e., demo2.s and
sprintf.s), and assemble them using:
$ as demo2.s -o demo2.o
$ as sprintf.s -o sprintf.o
Link them using:
$ ld demo2.o sprintf.o -o demo2
And execute this application using: $ ./demo2
Use your favorite text-editor (e.g., vi) to
modify the demo2.s source-file, by using
different initialization-values for x and y
Reassemble your modified demo2.s file,
and re-link it with the sprintf.o object-file
Run the modified demo2 application, and
see if it prints out a result that is correct
