
Kotebe Metropolitan University (KMU)

Microprocessor and Assembly Programming


Detailed Demonstration of the CPU Cycle with Assembly Programming
Objectives
• To give an introductory concept of the CPU cycle and assembly language.
• To show how to use the ELASS assembler (Elass & Elink), the DEBUG debugger and an editor.
• To demonstrate the phases (life cycle) of assembly language programs.
PART I: CPU CYCLE

Fig: State diagram showing the CPU cycle.


The fetch cycle involves reading the next instruction from memory into the CPU and
updating the contents of the program counter. In the execution phase, the CPU
interprets the opcode and performs the indicated operation. The instruction fetch and
execution phases (including decode) together are known as the instruction cycle.
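
The loop described above can be sketched in a few lines of Python. This is only a toy model, not the 8086: the opcodes LOAD, ADD and HALT, the accumulator `acc` and the function `run` are invented for illustration. It shows the three things the paragraph names: fetching the instruction, updating the program counter, and interpreting the opcode.

```python
# Toy machine: each instruction is a tuple (opcode, operand).
# The opcodes LOAD, ADD and HALT are invented for this sketch.
PROGRAM = [("LOAD", 5), ("ADD", 3), ("HALT", None)]

def run(program):
    pc = 0       # program counter: index of the next instruction
    acc = 0      # accumulator: the only data register of the toy CPU
    trace = []   # (pc, opcode) of every instruction that was fetched
    while True:
        opcode, operand = program[pc]   # fetch into the "instruction register"
        trace.append((pc, opcode))
        pc += 1                         # update the program counter
        if opcode == "LOAD":            # decode and execute
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            return acc, pc, trace
        else:
            raise ValueError(f"unknown opcode {opcode!r}")

acc, pc, trace = run(PROGRAM)
print(acc, pc, trace)
```

Note that the program counter is advanced during the fetch, before the instruction is executed, which is the same order the 8086 uses for IP.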

In cases where an instruction occupies more than one word, step 1 and step 2 can be
repeated as many times as necessary to fetch the complete instruction. In addition, the
execution of an instruction may involve one or more operands in memory, each of which
requires a memory access. Further, if indirect addressing is used, then an additional memory
access is required.

The fetched instruction is loaded into the instruction register. The instruction contains bits
that specify the action to be performed by the processor. The processor interprets (decodes)
the instruction and performs the required action. In general, the actions fall into four
categories:
• Processor-memory: Data may be transferred from processor to memory or from
memory to processor. Called Data Storage.
• Processor-I/O: Data may be transferred to or from a peripheral device by transferring
between the processor and an I/O module. Called Data Movement.
• Data processing: The processor may perform some arithmetic or logic operation on
data. Called Data Processing.
• Control: An instruction may specify that the sequence of execution be altered. Called
Control.
The main line of activity consists of alternating instruction fetch and instruction execution
activities. After an instruction is fetched, it is examined to determine whether any indirect
addressing is involved. If so, the required operands are fetched using indirect addressing.

The execution cycle of a particular instruction may involve more than one reference to
memory. Also, instead of memory references, an instruction may specify an I/O operation.
With these additional considerations, the basic instruction cycle can be expanded into the
more detailed view shown in the figure above. The figure is in the form of a state diagram.

PART II: CPU CYCLE WITH ASSEMBLY PROGRAMMING


Introductory note
As programming languages become lower and lower level, the correspondence between the
language and the machine approaches one to one, as it does for assembly languages. So we
have to understand the low level abstraction of the computer in order to understand a low
level programming language like assembly.
First of all: Registers
• For your experiment, an 8086/8088 computer (CPU) is used.
• Registers are high speed storage locations inside the microprocessor (made of flip
flops).
• In the 8086/8088 machine, 14 registers exist – each having 16 bits.

16 bit register: HSB (left) ... LSB (right)
LSB = Least Significant Bit; HSB = Highest Significant Bit

AX BX CX DX     General Purpose
ES CS DS SS     Segment Registers
SI DI BP        Index Registers
IP SP           Special Purpose
F               Flag Register

Fig: Intel 8086 register set.


All the registers are 16 bit registers, drawn with the highest significant bit on the left and
the least significant bit on the right. More than 90% of the time we store data inside
registers. We rarely use (declare) variables in RAM (why?). So it is important to study the
behavior of registers, specifically their values and possible operations. Note that
registers are core to assembly programming.
A debugger is a program that allows us to examine the contents of registers and memory and to
step through a program one statement at a time to see what is going on. It allows us to
• See the current contents of registers
• See the data stored in memory
• Trace the program (like F8 in C++)
Examples of debuggers are: the CodeView debugger, the Borland Turbo Debugger, and the DEBUG
debugger that is part of the operating system. You can use DEBUG for your experiment. In the
DEBUG debugger, the following commands are commonly in use:
• debug – opens the DEBUG debugger. Used at the MS-DOS command line interpreter (CLI).
• r – displays the registers and the next instruction to be executed.
• d – dumps memory, i.e. shows the contents of memory locations.
• t/p – trace through the program statements. t and p are the same except that p (proceed)
runs a call or interrupt instruction as a single step instead of tracing into it, for
example at the last instruction.
• q – quits and returns to MS-DOS.
Use of registers
Each data register (general purpose register) can be divided into two registers. For example,
the AX register is divided into two registers called AH and AL. This does not apply to the
other registers.
AX register divided into AH and AL
AH AL
AH – Higher A, AL – Lower A
Data registers are used for data movement and arithmetic operations. Segment registers in
particular cannot be used as operands of arithmetic instructions. If an 8 bit operation is
required, AH and AL are used; if a 16 bit operation is required, AX is used; if a 32 bit
operation is required, two data registers are merged, like DX:AX.
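
The relationship between a 16 bit register and its two halves is plain bit arithmetic, sketched below in Python. The helper names `split16`, `join16` and `join32` are made up for this illustration. The last helper shows how the 8086 pairs DX (high word) with AX (low word) to hold a 32 bit value.

```python
def split16(value):
    """Return the (high byte, low byte) pair of a 16 bit value, like (AH, AL) for AX."""
    value &= 0xFFFF
    return value >> 8, value & 0xFF

def join16(high, low):
    """Rebuild a 16 bit value from its two 8 bit halves."""
    return ((high & 0xFF) << 8) | (low & 0xFF)

def join32(dx, ax):
    """Combine two 16 bit registers into one 32 bit value, DX as the high word."""
    return ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)

ah, al = split16(0x1234)
print(hex(ah), hex(al))              # 0x12 0x34
print(hex(join16(0x12, 0x34)))       # 0x1234
print(join32(0x0001, 0x86A0))        # 100000
```

Writing to AH or AL therefore changes AX as well, because they are the same storage seen at two widths.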
Example: write a program that adds two numbers and stores the sum in a variable named
result.
First of all we need an editor (a place where we write code). Any plain text (ASCII) editor can
be used. For example: Windows editors like Notepad and WordPad, the MS-DOS editor, or the
Turbo C++ editor.
Structure of an assembly language program as compared to a C++ program.

C++ program structure              Assembly language structure
Preprocessor directives            directives
Global variable declarations       code segment
void main()                          instructions to be executed by the
{                                    microprocessor at run time
  local variable declarations      code ends
  list of statements               data segment
}                                    global variable declaration
                                   data ends
                                   stack segment
                                     stack size definition
                                   stack ends

In a C++ program, code and data are mixed. When a C++ program is compiled, however, the code
and data segments are separated and have a structure similar to that of assembly. Although
there is no explicit definition of a stack segment in a C++ program, no program can run
without a stack. The stack of a C++ program is controlled by the operating system, while that
of an assembly language program is controlled by the programmer. As programs become lower
level, code and data are separated. Program = Code + Data. Now it is time to begin coding. One
thing that you should remember is that, unlike C++, assembly language is CASE INSENSITIVE. The
other point is that, unlike in C++, a semicolon starts a COMMENT.

hex $ ; directive which says treat numbers written after $ as hexadecimal


code segment
mov ax, data ; these two lines make DS point to the data segment.
mov ds, ax ; the two lines should always exist in any assembly program.

mov ah, 5 ; these three lines add 5 and 3 and store the sum in result
add ah, 3
mov result, ah

mov ax, $4c00 ; these two lines quit the program. If omitted, the program never returns to DOS
int $21 ; the two lines should always exist in any assembly program.
code ends
data segment
result db 0 ; declaration/initialization of a variable in memory (pseudo-opcode)
data ends
stack segment ; these three lines are for the stack segment. (of 16 bytes – how?)
stack $10 ; the three lines should always exist in any assembly program.
stack ends
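
The numbers in the listing can be checked with a few lines of Python. This sketch only redoes the arithmetic; the variable names are made up. It converts the two hexadecimal literals, splits $4c00 into its AH and AL halves (function 4Ch of DOS interrupt 21h is "terminate program", and AL carries the return code), and repeats the three arithmetic lines with the result held to 8 bits as AH would hold it.

```python
stack_size = int("10", 16)        # the literal $10 in the stack directive
ax = int("4c00", 16)              # the literal $4c00 moved into AX
ah, al = ax >> 8, ax & 0xFF       # AH = DOS function number, AL = return code

# mov ah, 5 / add ah, 3 / mov result, ah, with AH limited to 8 bits
work = 5
work = (work + 3) & 0xFF
result = work

print(stack_size, hex(ah), al, result)   # 16 0x4c 0 8
```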

1) Write the above assembly program in Notepad (your editor) and save it as sample.asm.
Before you save it, put your ELASS folder in the C: drive, then save sample.asm in that
ELASS folder. Write an equivalent C++ program on paper and compare your code with the
above assembly program. Also identify the directives, instructions and pseudo-opcodes in
the programs. This phase is the CODING PHASE – the output is an .asm file.
2) The above assembly program is an ASCII file, but the processor cannot execute ASCII text.
Hence, it should be converted to machine code (an object file). To do so, we need an assembler
(like the compiler in C++). An assembler is a program that converts source code written
in assembly language into an object file (machine code). Some examples of assemblers are
TASM (Turbo Assembler), MASM (Microsoft Macro Assembler), and ELASS. You can use
ELASS for your experiment. To assemble sample.asm, first open the MS-DOS CLI, then change
the working directory to ELASS by writing ‘CD C:\ELASS’ at the MS-DOS prompt. Then
assemble sample.asm by writing ‘Elass sample.asm’. At this point you should find
sample.obj in the same location as sample.asm. Just as you may face compile time errors
when you compile a C++ file, here you may see assemble time errors. You (the programmer)
are expected to fix all the errors before you run the program. This phase is the ASSEMBLING
PHASE – the result is an .obj file.

3) The processor can understand the machine code in an .obj file, but the file cannot yet be
run to give a correct result.
• This is because other internal and external functions/files should be
merged with sample.obj to form a single executable file. This is done by
the linker.
• The program also gets a program segment prefix (PSP), which is 256 bytes – an extra
segment (ES) at the top. It holds information about the program (sample.exe).
Hence, an executable file has four segments: ES, CS, DS, SS.
• To link sample.obj, write ‘elink sample.obj’. Then you should find sample.exe at
the same location as sample.obj.
This phase is the LINKING PHASE – the result is an .exe file.
4) Now the exe file (sample.exe) is ready. It is time to run the program by writing
‘sample.exe’ at the MS-DOS prompt. When you run the program you will see additional
information about sample.exe, like the size of each segment and the full file size. Don’t
be surprised if you don’t see any output when you run it; this is because your program,
sample.asm, is not programmed to output a result.

Figure: The phases of an assembly program. Compare them with the phases of a C++ program.

5) Running the program is not enough for an assembly programmer. He/she should dig into
the program and diagnose the effect of each line of code in the program. Now let’s see what
sample.asm, sample.obj and sample.exe look like in memory.

(Figure label: Address Increase In This Direction)

Figure: sample.asm, sample.obj and sample.exe in memory respectively.


In the above figure, notice the following points
• Addresses increase in the downward direction (for humans), but upward for the computer.
• Sample.asm contains ASCII text. It is a text file where text is represented by its ASCII
value in binary. Note that only binary numbers can be stored in RAM.
• Sample.obj and sample.exe are binary files in which the text of sample.asm is encoded into
a binary format that can be interpreted or executed by the machine. It is common to
represent the binary numbers in hexadecimal in object and executable files. (why?)
• Sample.obj contains three segments (CS, DS and SS) and function calls. Sample.exe
contains four segments (ES, CS, DS, and SS) and resolves/expands some internal function
calls and addresses. Still, some addresses are not resolved/expanded. (identify them)
• As we go from sample.asm to sample.obj, instruction length decreases. For example,
mov ax,data occupies 11 bytes in sample.asm (one byte per character) but is converted to
B807E2 in the sample.obj file, which occupies 3 bytes.
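
The byte counts in the last point can be reproduced directly. The Python sketch below encodes the source text as ASCII and compares it with the three machine code bytes quoted above; the variable names are made up for this illustration.

```python
source_line = "mov ax,data"                    # the text as it sits in sample.asm
source_bytes = source_line.encode("ascii")     # one byte per character
machine_bytes = bytes.fromhex("B807E2")        # the encoding quoted in the handout

print(len(source_bytes), source_bytes.hex(" ").upper())
print(len(machine_bytes), machine_bytes.hex(" ").upper())
```

The first line of output lists the ASCII code of every character of the source text, and the second lists the opcode byte B8 followed by the two bytes of the 16 bit operand.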

Note that before sample.exe runs, the loader (which is part of the operating system) first
searches for sample.exe in the working directory (C:\Elass). If it finds the file, it loads
sample.exe into memory (RAM); if it does not, it reports an error. During loading, the first
thing the loader does is load the segment registers (ES, CS, DS, SS) in the CPU with the
beginning address of each segment (ES, CS, DS, SS) of sample.exe in memory. However, it loads
the DS register with the beginning address of the ES segment, not with the beginning address
of the data segment. The first two lines of code in sample.asm make the DS register point to
the beginning of the data segment. This phase is the RUNNING PHASE – its result is the actual
effect of the program. In one exe file, there is always one ES and one SS. But depending on
the complexity of the program, there may be more than one CS and DS. At this point the CPU
and RAM look like the following:

RAM: Extra Segment; Code Segment (B8 07 E2 E8 D8 ... A3 5D B7 ...); Data Segment (00 ...);
Stack Segment.
CPU registers: AX BX CX DX; ES CS DS SS; SI DI BP; IP SP; F.

Figure: RAM and CPU after the loader leaves and the FIRST TWO INSTRUCTIONS (why?)
are executed.
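
The register state before and after the first two instructions can be modelled as below. This is only a sketch of the behaviour described in the text: the segment values 1000h, 1010h, 1012h and 1013h are hypothetical, and the functions `load` and `first_two_instructions` are invented names.

```python
def load(psp_segment, code_segment, stack_segment):
    """Segment registers as the text describes them right after loading."""
    return {"ES": psp_segment, "DS": psp_segment,   # DS starts out equal to ES
            "CS": code_segment, "SS": stack_segment, "AX": 0}

def first_two_instructions(regs, data_segment):
    """Effect of 'mov ax, data' followed by 'mov ds, ax'."""
    regs = dict(regs)             # work on a copy so the 'before' state is kept
    regs["AX"] = data_segment     # mov ax, data
    regs["DS"] = regs["AX"]       # mov ds, ax
    return regs

before = load(0x1000, 0x1010, 0x1013)            # hypothetical segment values
after = first_two_instructions(before, 0x1012)   # hypothetical data segment
print({name: hex(value) for name, value in before.items()})
print({name: hex(value) for name, value in after.items()})
```

The detour through AX is needed because the 8086 has no instruction that moves an immediate value directly into a segment register.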

Then the DOS loader finishes and the CPU takes over. The CPU begins executing instructions
from the first offset of the code segment in RAM, whose address is found in the CS register
already loaded by the DOS loader. Why does it begin from CS and not from another segment?
IP (instruction pointer) is a special purpose register which contains the OFFSET of the next
instruction to be executed by the microprocessor. Anything in the code segment is accessed
by adding the offset in the IP register to the segment given by the CS register, written
CS:IP. IP controls the code segment. Likewise, SP controls the stack segment as SS:SP. What
controls DS? Note that the IP is always loaded with 0000 after the loader leaves. Why?
Remember the CPU cycle: Fetch, Decode, Execute and Return Value (optional). During fetch, the
CPU does the following tasks:
• Locate the next instruction to be executed at CS:IP, that is, at physical address
CS × 16 + IP (the segment value is shifted left by four bits before the offset is added).
• Fetch the instruction at the located position.
• Adjust the IP by the size of the fetched instruction.
o IP = IP + x, where x is the size of the fetched instruction in bytes, which depends on
the complexity of the instruction.
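
The two calculations above can be sketched in Python. The instruction sizes in `SIZES` are an assumption: they are the sizes of the standard 8086 encodings of the instructions in sample.asm, and the bytes ELASS actually emits should be checked in DEBUG. The segment value 1010h and the function names are hypothetical.

```python
def physical_address(segment, offset):
    """20 bit address the 8086 forms from a segment:offset pair."""
    return ((segment << 4) + offset) & 0xFFFFF

# Assumed sizes in bytes (standard 8086 encodings); verify them in DEBUG.
SIZES = [("mov ax, data", 3), ("mov ds, ax", 2), ("mov ah, 5", 2),
         ("add ah, 3", 3), ("mov result, ah", 4),
         ("mov ax, $4c00", 3), ("int $21", 2)]

def ip_trace(sizes, ip=0):
    """Return the IP value at which each instruction is fetched, plus the final IP."""
    trace = []
    for text, size in sizes:
        trace.append((ip, text))
        ip = (ip + size) & 0xFFFF    # IP = IP + x, kept to 16 bits
    return trace, ip

trace, final_ip = ip_trace(SIZES)
for ip, text in trace:
    print(f"{ip:04X}  {physical_address(0x1010, ip):05X}  {text}")
print(f"{final_ip:04X}")
```

Each printed row gives the offset in IP, the physical address for the hypothetical CS value, and the instruction fetched there.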
Now it is time to diagnose/trace how the CPU executes sample.exe.
1) In MS-DOS, write ‘debug sample.exe’. This LOADS THE PROGRAM WITHOUT EXECUTING ANY
INSTRUCTION and displays the DEBUG prompt, the hyphen symbol.
2) Look at the segment registers and IP at this point (by using the r command) and
write their values on paper. The DS register points to ES in RAM. This is shown by ES=DS.
3) Also look at the data segment of the program with the d command by writing ‘d DS:0’. This
means dump memory beginning from DS offset 0. Note the value of the result variable.
4) Then use the t command to trace (and execute) the next instruction. You also see
additional information when you enter the t command: the address of the next
instruction (in RAM) to be executed in CS:IP format, the instruction itself (in this case
mov ds,ax) and the same instruction in binary (hexadecimal) format. You may practice
your understanding by guessing the value of IP at each step. Don’t forget to use
the p command (not the t command) for tracing interrupt (int) instructions.
5) In case you need to stop execution (or tracing), use the q command.
6) Repeat steps 2 to 5 until the program terminates. You can observe the following points as
the instructions are executed.
a. IP = 0000 when the program begins.
b. At first DS points to the extra segment, but later it points to the data segment.
c. You should be able to correctly guess the value of IP at each step.
d. You should always watch how the value of the result variable changes as the
instructions go on.
e. Practice with the fetch, decode, execute and return value cycle.
7) Also try the u (unassemble) command at the ‘-’ prompt and see the output.

FINAL COMMENT: As with any programming language, practice makes perfect. But when it
comes to assembly programming, you need to AT LEAST DOUBLE THE PRACTICE you put into
C++. As usual, the final comment for students is BEGIN READING TODAY.
