
4. Familiarity with MASM, Introduction to Memory Segments
Part I: Background
The Microsoft Assembler package, MASM, is a programming environment that contains four major tools:
the assembler and linker; QuickHelp, a complete online help system; the Programmer's WorkBench (PWB);
and CodeView. PWB serves as an editor with a series of options that guide you through assembly
language program development. QuickHelp gives the developer detailed online help on assembly
language instructions, DOS INT 21H function calls, and BIOS function calls. CodeView is an enhanced
version of DEBUG with a graphical interface that also handles 32-bit instructions.
Processors in the 80x86 family divide memory into segments. In real mode, each segment is at most
64 kB long. There are six segment registers: CS (code), SS (stack), DS (data), ES (extra), and, on the
80386 and later, FS and GS. Instructions are stored in the code segment, the stack segment is reserved
for the stack, and data is stored in the data segment. The physical location of each segment is
determined by the value in its segment register: in real mode, a segment starts at 16 times that value.
Segments may be separate or may overlap fully or partially.
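As a small illustration of how a segment register and an offset combine into a physical address, the
sketch below loads DS and BX with arbitrary values and reads one byte. The values 1234H and 0010H are
chosen only for illustration, and the STACK combine type on the stack segment is our addition to keep
the linker from warning about a missing stack.

```asm
STSEG   SEGMENT STACK
        DB 64 DUP (?)
STSEG   ENDS
CDSEG   SEGMENT
MAIN    PROC FAR
        ASSUME CS:CDSEG,SS:STSEG
        MOV AX,1234H
        MOV DS,AX       ;DS = 1234H, so the segment starts at 12340H
        MOV BX,0010H    ;offset within the segment
        MOV AL,[BX]     ;reads physical address 12340H + 0010H = 12350H
        MOV AH,4CH      ;DOS function 4CH: terminate program
        INT 21H
MAIN    ENDP
CDSEG   ENDS
        END MAIN
```

The byte that is read has no particular meaning here; the point is only the address arithmetic in the
comments.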
DOS on the 80x86 family supports two types of executable files, COM and EXE. Each makes use of
segments differently. The simplest form of an x86 program is the COM file, which uses only a single
real-mode memory segment. Thus, COM programs are limited to 64 kB in length. When we write COM
files, we must ensure that the code, data, and stack are all stored in the same memory
segment. This can be accomplished in MASM by including the following directives in the assembly
language source file.
cseg segment 'code'
assume cs:cseg, ds:cseg, ss:cseg, es:cseg
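A minimal sketch of a complete COM-style source file built around these two directives might look
like the following. The labels START and MAIN_CODE and the variable VALUE are hypothetical names, and
the ORG 100H line reflects the fact that DOS loads a COM program at offset 100H of its segment.

```asm
CSEG    SEGMENT 'CODE'
        ASSUME CS:CSEG, DS:CSEG, SS:CSEG, ES:CSEG
        ORG 100H                ;COM programs begin at offset 100H
START:  JMP MAIN_CODE           ;jump over the data stored in the same segment
VALUE   DB 12H                  ;data lives in the code segment
MAIN_CODE:
        MOV AL,VALUE            ;DS already equals CS when a COM file starts
        MOV AH,4CH              ;DOS function 4CH: terminate program
        INT 21H
CSEG    ENDS
        END START
```

Note that no MOV DS,AX setup is needed here, because code, data, and stack all share the one segment.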
An EXE program is not limited to 64 kB and may contain several segments. Modular programs are
often written using several different segments. However, every segment must be aligned on a 16-byte
(paragraph) boundary, since all segments begin at addresses that end with four binary 0s. The linker
ensures that the segments are placed on paragraph boundaries. The figure below is an example of four
different segments and the addresses where they are stored.
[Figure: memory map showing four segments (CSEG1, CSEG2, DSEG, SSEG) with address labels 10000, 20000, 20030, 20040, and 20060.]
Objectives:
Learn to:
A. Use the Programmer's WorkBench to create, assemble, and link a program.
B. Use CodeView to debug and execute an assembly language program.
C. Use QuickHelp to access help on instructions and the assembler.
D. Understand the meaning of the code and data segments.
Pre-Lab
Read sections 3.2, 6.1, 6.2, and 6.3 in the Uffenbeck text. What is the physical address corresponding to
DS:BX if DS=103FH and BX=94D0H? Explain why segments must be
located on paragraph boundaries when they are loaded into memory (Hint: think about how logical
addresses are converted to physical addresses).
Lab
A.1 The Assembly Language Process Using the Command line
The following section explains how to assemble and link a file using the command line from a DOS
window. The steps are:
1. Create or edit the source code (.asm file) using any ASCII text editor. Warning -- the file must be
saved in ASCII format. Some editors, such as 'winword' or 'word', store the file in a binary
format by default. To save in ASCII format in some of the Microsoft editors, select the output type *.TXT but
specify the full file name as myfile.asm (the .asm extension should be used for assembly language
files).
2. Invoke the masm program to assemble the file and produce a .obj file and optionally, a .lst file.
3. Invoke the link program to produce a .exe program (or a .com program via a command line argument).
Assume we have an assembly language file called test.asm that has been saved in ASCII format. Open a
DOS window. To assemble the file, change to the directory where the file is via the 'cd' command, and
type:
C:\> masm test
If assembly is successful, this will produce a file called test.obj. If errors are present, you will be given the
line numbers where the syntax errors occurred. You can also produce a listing file (.lst), which shows the
opcodes for all instructions, via:
C:\> masm test,test, test
It is a good idea to always create a .lst file output.
A .exe file must be created from the .obj file via the link program. Type:
C:\> link test
You will be prompted for file names for the Run file, List file, libraries, and Definitions file. Just hitting
<enter> for each choice will use the defaults. This will produce a test.exe file which can then be executed.
You can also produce the .exe file with no prompting from the link program via:
C:\> link test,,,,,
Use 5 commas after filename (test) to provide defaults for all other choices.
Using the command line for masm/link is probably the easiest thing to do if you are only
assembling/linking one source file. If your program is composed of multiple source files, the PWB
program (next section) is probably a better choice. Most of your labs will only consist of one source file.
Section B discusses how to use a debugger called 'codeview'. In order to view the source code of your
program within the 'codeview' debugger, you need to pass command line switches to the masm and
link programs so that they include debugging information. The switches are '/zi' for masm and '/co' for
link, as shown below:
C:\> masm /zi test,test, test
C:\> link /co test,,,,,
A.2 The Assembly Language Process Using PWB
The following section discusses a program called Programmer's WorkBench (PWB) for editing your
assembly language file and invoking the assembler program, MASM. You DO NOT HAVE TO
USE the pwb program if you do not wish to. An alternative is to edit your file with any ASCII text editor,
and invoke MASM via the DOS command line to produce .obj files. You can then invoke the 'link' program
to produce either an .exe or .com program. PWB offers a somewhat pushbutton approach to assembling
your program, and lets you create a 'project' that assembles and links multiple files with
one pushbutton.
PWB can be used for the development of the assembler language program. The development procedure
follows a four-step process.
1. Create or edit the source code. (.asm)
2. Assemble the program to create the object code. (.obj, and optionally a .lst listing)
3. Link the program to create the executable code. (.exe or .com)
4. Test and debug the program.
MASM is located in c:\Program Files\MASM611.
Activate the PWB program by typing PWB at the MS-DOS command line (or Start -> Run -> pwb). Use the
File pulldown menu to create a new file. Type Example 4.1 into the new file using the editor. Save the
program from the File pulldown menu or with ALT F3. Use .asm as the extension for the filename.
Once the filename and the path are selected, choose OK to accept the new filename.
Example 4.1 uses the basic shell of an assembly language program. The shell includes the stack
segment, the data segment, and the code segment. We will discuss the meanings of the various segments
and definitions later in this lab. The section identified as the code segment is what CodeView
displays as code in symbolic form; the section identified as the data segment is displayed
as data in symbolic form.
Example 4.1
TITLE EX 4-1 (EXE) Purpose: Adds 4 bytes of data
STSEG   SEGMENT
        DB 32 DUP (?)
STSEG   ENDS
;--------------------------
DTSEG   SEGMENT
FOUR_NO DB 12H,0B5H,6CH,78H
SUM     DB ?
DTSEG   ENDS
;----------------------------
CDSEG   SEGMENT
MAIN    PROC FAR
        ASSUME CS:CDSEG,DS:DTSEG,SS:STSEG
        MOV AX,DTSEG
        MOV DS,AX
        MOV BX,OFFSET FOUR_NO   ;set up BX as data ptr
        MOV AL,0                ;initialize AL
        ADD AL,[BX]             ;add next item to AL (AL=AL+[BX])
        INC BX                  ;point to next item (BX=BX+1)
        ADD AL,[BX]             ;add next item to AL (AL=AL+[BX])
        INC BX                  ;point to next item
        ADD AL,[BX]             ;add next item to AL (AL=AL+[BX])
        INC BX                  ;point to next item
        ADD AL,[BX]             ;add next item to AL (AL=AL+[BX])
        INC BX                  ;point to next item
        MOV SUM,AL              ;store result in SUM
        MOV AH,4CH              ;set up return
        INT 21H                 ;invoke interrupt
        INT 20H                 ;breakpoint, exit
MAIN    ENDP
CDSEG   ENDS
        END MAIN
Now let's configure the program for the desired assembly format. Use Options -> Projects Template
-> Set Project Template from the pulldown menu. In this window, set the runtime support choice
to NONE, because most programs don't require runtime support from a separate library such as C,
C++, etc.
Select the DOS.exe entry to generate a DOS.exe (executable) file as the target for the assembler and linker.
Once NONE and DOS.exe have been selected, choose OK at the bottom of the dialog box. Next, use
Project -> Edit Project, select your .asm file from the list, and use the Add choice to add it to your
project file list.
Now that the project is defined, select Options -> Build. This determines the type of program produced
by the assembler and builder program. Choose the DEBUG option in the build dialog. After debugging is
complete, choose the release option for the final program.
Next, use Options -> Language Options -> MASM Options from the pulldown menu. In the popup
window, deselect Warnings Treated as Errors. Then select <Set Debug Options...>. In the new popup
window, select Generate Listing File from the Listing section if it is not already selected. With this
option set, a build produces both the .exe file and the .lst file. The listing file shows the source and
object code in one file.
Now that PWB has been configured, the project can be built. Select Project -> Build. You may ignore
warnings about the stack unless the program uses more than 128 bytes of stack space. The final product is
now in the form of an .exe file and a .lst file. Choose CANCEL to return to PWB. Choose FILE -> OPEN
and view the .lst file.
How are the opcodes displayed in the .lst file?
Do the source code and/or the comments display in the file?
Are the other segments, such as stack or code, displayed?
Run this program if it is free from errors. If not, debug it using instructions from the following section.
B. Debugging Assembly Language Programs Using CodeView
Codeview (cv.exe) is an external debugger that offers many more features than the 'debug.exe' program.
You can debug programs simply by using debug.exe, but Codeview lets you easily track both memory
and register changes. It is recommended that you use Codeview for debugging your programs. The
program typed in PWB should be error-free; however, we will use it to demonstrate the CodeView
program.
Codeview (cv.exe) can be executed from the DOS command line, or from within PWB from the Run menu.
To execute codeview from the DOS command line for a .exe file, type:
C:\> cv myfile.exe
This will bring up codeview for the file myfile.exe.
Codeview can also be run from within PWB. If Codeview is not available from the PWB Run menu,
select Run -> Customize Run Menu from the pulldown menu. In the popup window, select <Add...>.
In the new popup window, enter CodeView in the Menu Text field. In the second field, Path Name,
enter the full path to cv.exe. It should be: c:\masm611\binr\cv.exe
Let's configure CodeView. Choose Option -> CodeView. The configuration will vary depending on
your monitor. Select a 50 line display and the default CodeView configuration.
Now that CodeView is configured, select Run -> CodeView. Codeview dynamically displays the contents
of all registers and of the various memory locations.
What is the logical address of the code segment?
What are the contents of registers CS and IP?
What is the logical address of the data?
Step through the program with F10.
Restart the program. Execute the entire program using F5.
What is the result of the addition? In which register(s) is the result stored?
Segments
Program segments are declared using a standard form. The form consists of
label SEGMENT [options]
;statements belonging to the segment
label ENDS
The options field can be used to give information to the assembler for organizing the segment, but is not
required. The label for ENDS must be the same as the label for SEGMENT.
C. Data Segments
The data segment is the portion of the memory used to store static data. The data is accessed in the code
segment by the labels given in the data directives and types in the data segment definition portion of the
assembler language source file. The x86 supports various data types and directives. MASM assembler
directives are used to allocate space and names to data values and/or locations.
ORG is a MASM directive that is used to indicate the origin of an offset address. (A directive is an
instruction to the assembler program; it is NOT an x86 instruction.) The number must end in H to indicate
hexadecimal; otherwise the assembler will assume decimal and convert the number to hexadecimal.
DB is the define byte directive, which is used to allocate memory in byte-size chunks. The assembler
default is decimal; for hexadecimal, the number must end with an H, and for binary, the number
must end with a B. DB is also the only directive used to define ASCII strings longer than 2 characters.
ORG 0020H
DATA1 DB 37 ;decimal
DATA2 DB 37H ;hexadecimal
DATA3 DB 100101B ;binary
DATA4 DB 0110111B ;binary
DATA5 DB 'My name is Amy$'
Assemble the data above. Dump the contents of memory at the respective address. Observe that the data
storage is at the offset, 0020H.
What is the logical address (segment and offset) of each of the values listed above in Example
3.1? How are the numbers represented: decimal, hex, or binary? What is stored in
memory that corresponds to the string above?
DUP is a MASM operator that repeats a given value a specified number of times.
Assemble the instruction DATA6 DB 6 DUP(0FH) at origin 0030H. What are the memory contents
at that offset? What is an alternate way to duplicate 0FH?
DUP is also used to set aside or reserve space for variables. For example,
DATA7 DB 32 DUP (?) ;set aside 32 bytes
DATA8 DW 32 DUP (?) ;set aside 32 words
DW is used to define words or allocate memory 2 bytes at a time.
ORG 0070H
DATA9 DW 253FH ;store 2 bytes
DATA10 DW 7,6,5,4,3,2,1 ;store various data words
DATA11 DW 8 DUP (?) ;set aside 8 words
If we use DW to store DATA9, then use DB as stated below ...
ORG 0090H
DATA12 DB 25H
DATA13 DB 3FH
to store 253FH, will the memory appear the same? Why or why not?
EQU is used to define a constant but does not reserve memory storage for the value. As an example,
consider the following segment definition directives.
ORG 0060H
VALUE EQU 25 ; sets a constant 25
MOV CX, VALUE
Assemble the above example using EQU, then assemble the following. Check the value of
the internal registers. Does CX appear differently in the two examples?
ORG 0080H
VALUE2 DB 25
MOV CX, VALUE2
Equate also makes changing constants throughout the program easier. The value can be changed in the
equate line, rather than at each instance in the program.
DD (define doubleword) is used to allocate memory for a double word (4 bytes). The data is converted to hex,
then placed in the memory location.
The low byte goes to the low address and the high byte goes to the high address (the x86 is a little endian
architecture).
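As a small illustration of this byte ordering, the data-only sketch below defines one doubleword and
then the same four bytes written out by hand with DB. The names DSEG, DATA_DD, and DATA_DB and the
offset 00A0H are hypothetical choices made only for this example.

```asm
DSEG    SEGMENT
        ORG 00A0H
DATA_DD DD 12345678H            ;offsets 00A0H-00A3H hold 78H,56H,34H,12H
DATA_DB DB 78H,56H,34H,12H      ;offsets 00A4H-00A7H hold the same four bytes
DSEG    ENDS
        END
```

If these two lines are placed in the data segment of a program and memory is dumped in CodeView, the
two 4-byte groups should look identical, with the low byte 78H at the lowest address.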
DQ, define quadword, is used to allocate memory 8 bytes (4 words) in size. This directive will store up to
64 bits of data at a time.
ORG 0080H
DATA14 DQ 'HI'
DATA15 DQ 7,6,5,4
DATA16 DQ 65534H
What is the hexadecimal equivalent stored for HI? How many bytes are allocated for each character?
Does that differ from the numbers in DATA15?
DT, define ten bytes, is useful for allocating memory for packed BCD.
ORG 0090H
DATA17 DT 36768
DATA18 DT 36768H
Do the values differ in memory? If so, explain why.
D. The Stack Segment
The stack is an area of memory reserved for temporary storage of program data and subroutine return
addresses. We will talk more about the stack segment in a future lab. For now, in your programs include a
stack segment declaration as shown below (allocates 64 bytes of memory for stack storage):
SSEG SEGMENT
DB 64 DUP (?)
SSEG ENDS
For now, make sure that any program you write has a stack segment.
E. Code Segments
The code segment contains the x86 instructions that make up your program. Example 4.2 shows the shell
of a program (repeated here for convenience).
Example 4.2
The form of an assembly language Program
SSEG    SEGMENT
        DB 64 DUP (?)
SSEG    ENDS
;
DSEG    SEGMENT
        ; all data goes here
DSEG    ENDS
;
CSEG    SEGMENT 'CODE'
MAIN    PROC FAR                ; program entry pt
        ASSUME CS:CSEG, DS:DSEG, SS:SSEG
        MOV AX, DSEG            ; bring in segment for data
        MOV DS, AX              ; assign the DS value
        ; place code here
        ;
        MOV AH, 4CH             ; set up to
        INT 21H                 ; return to DOS
        INT 20H
MAIN    ENDP
CSEG    ENDS
        END MAIN                ; argument for 'end' directive specifies
                                ; program ENTRY point
The segment directive precedes the program entry point, which is defined as a procedure labeled MAIN. A
procedure is a group of instructions designed to accomplish a specific function.
A code segment is usually organized into several small procedures. Each procedure must begin with a PROC
directive and is closed by an ENDP directive.
The procedure may carry the option FAR or NEAR. In this shell the program entry point is declared FAR. NEAR
refers to procedures that are called only from within the current code segment. The next line contains an
ASSUME statement. The ASSUME statement tells the assembler which segment each segment register is
expected to point to; it does not load the registers, which is why DS is set explicitly with MOV.
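To make the FAR/NEAR distinction more concrete, here is a minimal sketch of a code segment with a FAR
entry procedure that calls one NEAR procedure. DOUBLE_IT is a hypothetical procedure invented for this
example, and the STACK combine type on SSEG is our addition so that the CALL has a properly declared
stack to push its return address on.

```asm
SSEG    SEGMENT STACK
        DB 64 DUP (?)
SSEG    ENDS
;
CSEG    SEGMENT 'CODE'
        ASSUME CS:CSEG, SS:SSEG
MAIN    PROC FAR                ; program entry point
        MOV AL, 5
        CALL DOUBLE_IT          ; near call: only IP is pushed on the stack
        MOV AH, 4CH             ; AL now holds 10, passed to DOS as the exit code
        INT 21H
MAIN    ENDP
;
DOUBLE_IT PROC NEAR             ; lives in the same code segment as MAIN
        ADD AL, AL              ; AL = AL * 2
        RET                     ; near return: pops IP only
DOUBLE_IT ENDP
CSEG    ENDS
        END MAIN
```

Because both procedures are in CSEG, the call never needs to change CS, which is what NEAR expresses.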

Write an assembly language program that includes a code segment named
Cod_Seg, a data segment named Dat_Seg, a stack segment named Sta_Seg. The
data segment should have data items named BIG_DAT, SMAL_DAT and SUM.
The program should add the two values in BIG_DAT and SMAL_DAT and then
store the result in SUM. What is actually assigned to the CS, DS, and SS registers? What is the
value of SUM? Where did you find it, that is, what is its address? You MUST use MASM to
assemble your code and produce a listing file.
Lab Report
A. Describing What You Learned
Answer all of the Think About It questions above.
B. Applying What You Learned
Discuss your experiences using PWB, Codeview, and MASM.
Discuss the small program that you wrote. Be sure to include the listing file of your program in your
report.
Appendix (Short method of specifying segments)
There is a shorthand method for specifying segments using the '.MODEL' directive.
Example 4.3
.model small
.stack 100h
.data
four_no db 12h,0b5h,6ch,78h
sum db ?
.code
main proc near
mov ax,@data
mov ds,ax
;; your statements here
mov ah,4ch
int 21h
int 20h
main endp
end main
The program above uses the .MODEL directive to specify a 'small' memory model. A memory model
causes the assembler to make assumptions about the size and number of the program segments. The small
memory model allows one code segment (<64K), one data segment (<64K), and one stack segment (<64K).
There are other models such as medium (data <= 64k, multiple code segments, any size), compact (code <=
64k, multiple data segments, any size), large (multiple code, multiple data segments, any size), and flat (no
segments, protected mode only, all 32-bit addresses). The memory model determines whether code/data
addresses require only offset information, or both segment and offset information.
The small memory model will be sufficient for EE 3724 programs.
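As one possible way to fill in the "your statements here" placeholder, the sketch below redoes the
addition from Example 4.1 using the shorthand directives and a LOOP instruction in place of the four
repeated ADD/INC pairs. The label add_loop is a hypothetical name, and the use of LOOP is our
substitution rather than something the lab prescribes.

```asm
.model small
.stack 100h
.data
four_no db 12h,0b5h,6ch,78h
sum     db ?
.code
main proc near
        mov ax,@data
        mov ds,ax               ;DS must still be loaded by hand
        mov bx,offset four_no   ;BX points at the first byte
        mov cx,4                ;four bytes to add
        mov al,0                ;clear the running total
add_loop:
        add al,[bx]             ;AL = AL + next byte
        inc bx                  ;advance the pointer
        loop add_loop           ;decrement CX, repeat until CX = 0
        mov sum,al              ;store the 8-bit result
        mov ah,4ch
        int 21h                 ;return to DOS
main endp
end main
```

Stepping through this version in CodeView should show the same sequence of values in AL as Example 4.1.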