Download as pdf or txt
Download as pdf or txt
You are on page 1of 23

Chapter 2 Assemblers

-- Machine-Independent Assembler Features

Tzong-Jye Liu
Department of Information Engineering and Computer Science
Feng Chia University
tjliu@fcu.edu.tw

Outline

n Literals
n Symbol Defining Statement
n Expressions
n Program Blocks
n Control Sections and Program Linking

System Programming 2
Literals
n Consider the following example
:
LDA FIVE
:
FIVE WORD 5
:
n It is convenient to write the value of a constant operand as a
part of instruction
:
LDA =X’05’
:

n If the same constant operands (strings, floating points, large


integers) used in more than one place, it is better to store them
in only one copy of data area

System Programming 3

Literals

n A literal is identified with the prefix =, followed


by a specification of the literal value
n Examples: (Figure 2.10, pp.68)
45 001A ENDFIL LDA =C’EOF’ 032010
nixbpe disp
000000 110010 010
93 LTORG
002D * =C’EOF’ 454F46
215 1062 WLOOP TD =X’05’ E32011
230 106B WD =X’05’ DF2008
1076 * =X’05’ 05

System Programming 4
Literals vs. Immediate Operands

n Literals
n The assembler generates the specified value as a
constant at some other memory location
45 001A ENDFIL LDA =C’EOF’ 032010
n Immediate Operands
n The operand value is assembled as part of the machine
instruction
55 0020 LDA #3 010003

n We can have literals in SIC, but immediate


operand is only valid in SIC/XE
System Programming 5

Literal Pools
n Normally literals are placed into a pool at the
end of the program
n see Fig. 2.10 (END statement)
n In some cases, it is desirable to place literals
into a pool at some other location in the object
program
n Assembler directive LTORG
n When the assembler encounters a LTORG
statement, it creates a literal pool
(containing all literal operands used since
previous LTORG)
n Reason: keep the literal operand close to the instruction
n Otherwise PC-relative addressing may not be
allowed
System Programming 6
Duplicate literals
n The same literal used more than once in the program
n Only one copy of the specified value needs to be stored
n For example, =X’05’ in Figure 2.10 (pp. 68)
n How to recognize the duplicate literals
n Compare the character strings defining them
n Easier to implement, but has potential problem
(see next)
n Ex: =X’05’
n Compare the generated data value
n Better, but will increase the complexity of the
assembler
n Ex: =C’EOF’ and =X’454F46’

System Programming 7

Problem of duplicate-literal recognition


n ‘*’ denotes a literal refer to the current value
of program counter
BUFEND EQU * ( P.68 Fig. 2.10 )
n There may be some literals that have the
same name, but different values
BASE *
LDB =* (cf. P.58 #LENGTH)
n The literal =* repeatedly used in the program has the
same name, but different values
n The literal “=*” represents an “address” in the
program, the assembler must generate the
appropriate “modification records”.
System Programming 8
Literal table

n LITTAB
n Content
n Literal name
n Operand value and length
n Address
n LITTAB is often organized as a hash table, using the
literal name or value as the key

System Programming 9

Implementation of Literal
n Pass 1
n Build LITTAB with literal name, operand value and
length, leaving the address unassigned
n When LTORG or END statement is encountered, assign
an address to each literal not yet assigned an address
n The location counter is updated to reflect
the number of bytes occupied by each
literal
n Pass 2
n Search LITTAB for each literal operand encountered
n Generate data values using BYTE or WORD statements
n Generate modification record for literals that represent
an address in the program
System Programming 10
Example: (pp. 67, Figure 2.9)
SYMTAB & LITTAB
SYMTAB LITTAB

System Programming 11

Symbol-Defining Statements
n Assembler directive EQU
n Allows the programmer to define symbols and specify their values
Ex: symbol EQU value
n To improve the program readability, avoid using the magic
numbers, make it easier to find and change constant values
n Replace
+LDT #4096
n with
MAXLEN EQU 4096
+LDT #MAXLEN
n Define mnemonic names for registers
n A EQU 0
n X EQU 1
n Expression
n MAXLEN EQU BUFEND-BUFFER
System Programming 12
Symbol-Defining Statements
n Assembler directive ORG
n Allow the assembler to reset the PC to values
Ex: ORG value
n When ORG is encountered, the assembler resets its
LOCCTR to the specified value
n ORG will affect the values of all labels defined until the
next ORG
n If the previous value of LOCCTR can be automatically
remembered, we can return to the normal use of
LOCCTR by simply write
ORG

System Programming 13

Example: using ORG (1/2)


n In the data structure
n SYMBOL: 6 bytes
n VALUE: 3 bytes (one word)
n FLAGS: 2 bytes

n We want to refer to every field of each entry


n If EQU statements are used
STAB RESB 1100
SYMBOL EQU STAB
Offset from STAB
VALUE EQU STAB+6
FLAG EQU STAB+9
System Programming 14
Example: using ORG (2/2)
n If ORG statements are used
STAB RESB 1100
ORG STAB Set LOCCTR to STAB
SYMBOL RESB 6
VALUE RESW 1 Size of each field
FLAGS RESB 2
ORG STAB+1100 Restore LOCCTR

n We can fetch the VALUE field by


LDA VALUE,X
n X = 0, 11, 22, … for each entry
System Programming 15

Forward-Reference Problem

n Forward reference is not allowed for both


EQU and ORG.
n All terms in the value field must have been defined
previously in the program.
n The reason is that all symbols must have been defined
during Pass 1 in a two-pass assembler.

• Allowed: ALPHA RESW 1


BETA EQU ALPHA
• Not allowed: BETA EQU ALPHA
ALPHA RESW 1

System Programming 16
Expression
n The assemblers allow “the use of expressions as
operand”
n The assembler calculates the expressions and products a single
operand address or value
n Expressions consist of
n Operator
n +,-,*,/ (division is usually defined to produce an
integer result)
n Individual terms
n Constants
n User-defined symbols
n Special terms, e.g., *, the current value of LOCCTR
n Examples
n MAXLEN EQU BUFEND-BUFFER
n STAB RESB (6+3+2)*MAXENTRIES
System Programming 17

Relocation Problem in Expressions


n Values of terms can be
n Absolute (independent of program location)
n constants
n Relative (to the beginning of the program)
n Address labels
n * (value of LOCCTR)
n Expressions can be
n Absolute
n Only absolute terms
n MAXLEN EQU 1000
n Relative terms in pairs with opposite signs for each
pair BUFEND +
n MAXLEN EQU BUFEND-BUFFER BUFFER -
n Relative
n All the relative terms except one can be paired as
described in “absolute”. The remaining unpaired
relative term must have a positive sign.
n STAB EQU OPTAB + (BUFEND – BUFFER)
System Programming 18
Restriction of Relative Expressions

n No relative terms may enter into a


multiplication or division operation
n 3 * BUFFER
n Expressions that do not meet the conditions
of either “absolute” or “relative” should be
flagged as errors.
n BUFEND + BUFFER
n 100 - BUFFER

System Programming 19

Handling Relative Symbols in SYMTAB

n To determine the type of an expression, we


must keep track of the types of all symbols
defined in the program.
n We need a “flag” in the SYMTAB for indication.

Symbol Type Value • Absolute value


RETARD R 0030 BUFEND - BUFFER
BUFFER R 0036 • Illegal
BUFEND R 1036 BUFEND + BUFFER
MAXLEN A 1000 100 - BUFFER
3 * BUFFER

System Programming 20
Program Blocks
n Allow the generated machine instructions and
data to appear in the object program in a
different order
n Separating blocks for storing code, data, stack, and
larger data block
n Program blocks v.s. Control sections
n Program blocks
n Segments of code that are rearranged
within a single object program unit
n Control sections
n Segments of code that are translated into
independent object program units

System Programming 21

Program Blocks

n Assembler directive: USE


n USE [blockname]
n At the beginning, statements are assumed to be part of
the unnamed (default) block
n If no USE statements are included, the entire program
belongs to this single block
n Each program block may actually contain several
separate segments of the source program
n Example: pp. 79, Figure 2.11

System Programming 22
Program Blocks
n Assembler rearrange these segments to gather
together the pieces of each block and assign address
n Separate the program into blocks in a particular order
n Large buffer area is moved to the end of the object program
n Program readability is better if data areas are placed in the source
program close to the statements that reference them.
n Example: pp, 81, Figure 2.12
n Three blocks are used
n default: executable instructions
CDATA n CDATA: all data areas that are less in length
n CBLKS: all data areas that consists of larger
CBLKS
blocks of memory

System Programming 23

Example: pp. 81, Figure 2.12


(default) block Block number

CDATA block

CBLKS block

System Programming 24
Example: pp. 81, Figure 2.12

(default) block

CDATA block

System Programming 25

Example: pp. 81, Figure 2.12

(default) block

CDATA block

System Programming 26
Rearrange Codes into Program Blocks
n Pass 1
n A separate location counter for each program block
n Save and restore LOCCTR when switch between blocks
n At the beginning of a block, LOCCTR is set to 0
n Assign each label an address relative to the start of the block
n Store the block name or number in the SYMTAB along with the
assigned relative address of the label
n Indicate the block length as the latest value of LOCCTR for each block
at the end of Pass1
n Assign to each block a starting address in the object program by
concatenating the program blocks in a particular order
Block name Block number Address Length
(default) 0 0000 0066
CDATA 1 0066 000B
CBLKS 2 0071 1000
System Programming 27

Rearrange Codes into Program Blocks

n Pass 2
n Calculate the address for each symbol relative to the
start of the object program by adding
n The location of the symbol relative to the
start of its block
n The starting address of this block

System Programming 28
Example of Address Calculation
20 0006 0 LDA LENGTH 032060
n The value of the operand (LENGTH)
n Address 0003 relative to Block 1 (CDATA)
n Address 0003+0066=0069 relative to program
n When this instruction is executed
n PC = 0000 (starting addr. Of default block) + 0009
n disp = 0069 – 0009 = 0060
n op nixbpe disp
000000 110010 060 => 032060

SYMTAB
label name block num addr. Flag
LENGTH 1 0003
…. …. …. ….
System Programming 29

Program Blocks Loaded in Memory


(P.84 Fig 2.14)
Program loaded Relative
Line Addr.
Source program Object program in memory
address
5 0000 0000
Default(1) Default(1) Default(1)

0027
70 0024 Default(2) Default(2)
95 0000
Not present CDATA(1)
in object 004D
100 0003 CDATA(2) Default(3)
program 105 0000
CBLKS(1) Default(3)
125 0027 0066
Default(2) CDATA(1)
006C
CDATA(3) CDATA(2)
006D
180 0047 CDATA(3)
185 0006 0071
CDATA(2) CBLKS(1)
210 004D
Default(3)

245 0063
253 0007
CDATA(3)
1071
System Programming 30
Object Program

n It is not necessary to physically rearrange the


generated code in the object program
n The assembler just simply insert the proper load address in
each Text record.
n The loader will load these codes into correct place

Default(1)

Default(2)

CDATA(2)
Default(3)
CDATA(3)

System Programming 31

Control Sections and Program Linking


n Control sections
n can be loaded and relocated independently of the other
control sections
n are most often used for subroutines or other logical
subdivisions of a program
n the programmer can assemble, load, and manipulate
each of these control sections separately
n because of this, there should be some means for linking
control sections together
n assembler directive: CSECT
secname CSECT
n separate location counter for each control section

System Programming 32
External Definition and Reference
n Instructions in one control section may need to refer to
instructions or data located in another section
n External definition
EXTDEF name [, name]
n EXTDEF names symbols that are defined in this control section
and may be used by other sections
n Ex: EXTDEF BUFFER, BUFEND, LENGTH
n External reference
EXTREF name [,name]
n EXTREF names symbols that are used in this control section and
are defined elsewhere
n Ex: EXTREF RDREC, WRREC
n To reference an external symbol, extended format instruction is
needed (why?)

System Programming 33

Example: pp. 86, Figure 2.15


Implicitly defined as an external symbol
first control section

System Programming 34
Example: pp. 86, Figure 2.15
Implicitly defined as an external symbol
second control section

System Programming 35

Example: pp. 86, Figure 2.15


Implicitly defined as an external symbol
second control section

System Programming 36
External Reference Handling

n Case 1 (P.87)
15 0003 CLOOP +JSUB RDREC 4B100000
n The operand RDREC is an external reference.
n The assembler
n has no idea where RDREC is
n inserts an address of zero
n can only use extended format to provide
enough room (that is, relative addressing
for external reference is invalid)
n The assembler generates information for each external
reference that will allow the loader to perform the
required linking

System Programming 37

External Reference Handling


n Case 2
190 0028 MAXLEN WORD BUFEND-BUFFER 000000
n There are two external references in the expression, BUFEND and
BUFFER.
n The assembler
n inserts a value of zero
n passes information to the loader
n Add to this data area the address of BUFEND
n Subtract from this data area the address of BUFFER

n Case 3
n On line 107, BUFEND and BUFFER are defined in the same control
section and the expression can be calculated immediately.
107 1000 MAXLEN EQU BUFEND-BUFFER

System Programming 38
Object Code of Figure 2.15

Case 1

System Programming 39

Object Code of Figure 2.15

Case 2

System Programming 40
Object Code of Figure 2.15

System Programming 41

Records for Object Program


n The assembler must include information in the object
program that will cause the loader to insert proper
values where they are required
n Define record (EXTDEF)
n Col. 1 D
n Col. 2-7 Name of external symbol defined in this control section
n Col. 8-13 Relative address within this control section (hexadecimal)
n Col.14-73 Repeat information in Col. 2-13 for other external symbols
n Refer record (EXTREF)
n Col. 1 R
n Col. 2-7 Name of external symbol referred to in this control section
n Col. 8-73 Name of other external reference symbols

System Programming 42
Records for Object Program

n Modification record
n Col. 1 M
n Col. 2-7 Starting address of the field to be modified (hexadecimal)
n Col. 8-9 Length of the field to be modified, in half-bytes (hexadecimal)
n Col.11-16 External symbol whose value is to be added to or subtracted
from the indicated field
n Control section name is automatically an external symbol, i.e.
it is available for use in Modification records.

System Programming 43

Object Program of Figure 2.15


COPY

System Programming 44
Object Program of Figure 2.15
RDREC

BUFEND - BUFFER

WRREC

System Programming 45

Expressions in
Multiple Control Sections
n Extended restriction
n Both terms in each pair of an expression must be within the same
control section
n Legal: BUFEND-BUFFER
n Illegal: RDREC-COPY
n How to enforce this restriction
n When an expression involves external references, the assembler
cannot determine whether or not the expression is legal.
n The assembler evaluates all of the terms it can, combines these to
form an initial expression value, and generates Modification
records.
n The loader checks the expression for errors and finishes the
evaluation.

System Programming 46

You might also like