

CHAPTER 10 - TEMPLATES


Do you remember when you were younger and you needed to look up a
word in the dictionary? It would define the word in terms of a
second word which you didn't know so you would look that up too.
Most likely that second word was either defined in terms of a
third word you didn't know or it referred you back to the first
word.

This chapter is something like that. The items in the template
file are interdependent. If you're lucky, everything will be
clear by the time you have finished the chapter. If not, you'll
have to reread it.

There are four different things which operate on the assembler
instructions which you write - the ASSEMBLER, the LINKER, the
LOADER and the 8086.

1) The ASSEMBLER takes your text and turns it into the machine
code that is used by the 8086. It is complete except that the
addresses of data and subroutines might change during linking and
loading. The assembler also writes information called HEADERS
into the object file. These give the LINKER and LOADER what they
need to update those addresses in the machine code, which means
that the code can be placed anywhere in memory.

2) If your program is made up of more than one file, the LINKER
links them together. Whether there is one file or several, it
then makes the program ready for running by updating the
addresses of anything it has moved. It still leaves the HEADER
information which contains the segment addresses.

3) At run time, the LOADER, which is part of the operating
system, decides where to put your program in memory. It loads the
program, and adjusts any segment addresses in the program to
reflect where the program actually is in memory. It then gives
control to the program.

4) The code is fixed at the time the 8086 takes over. Any
addresses are constants and are unchangeable.

Keep this in mind as we work through the template file.


THE .LST FILE

The first thing we need to look at is segments. Let's look at a
slightly modified version of the template file called segs.asm.
Here it is.

;***********************************
; segs.asm

______________________

The PC Assembler Tutor - Copyright (C) 1989 Chuck Nelson

; - - - - - - - - - - - - -
STACKSEG SEGMENT STACK 'STACK'

variable4 dw 4444h
dw 100h dup (?)

STACKSEG ENDS
; - - - - - - - - - - - - -
MORESTUFF SEGMENT PUBLIC 'HOHUM'

variable2 dw 2222h

MORESTUFF ENDS
; - - - - - - - - - - - - -
DATASTUFF SEGMENT PUBLIC 'DATA'

variable1 dw 1111h

DATASTUFF ENDS
; - - - - - - - - - - - - -
CODESTUFF SEGMENT PUBLIC 'CODE'

EXTRN print_num:NEAR , get_num:NEAR

ASSUME cs:CODESTUFF,ds:DATASTUFF
ASSUME es:MORESTUFF,ss:STACKSEG

variable3 dw 3333h

main proc far
start: push ds
sub ax,ax
push ax

mov ax, DATASTUFF
mov ds,ax
mov ax, MORESTUFF
mov es,ax

mov cx, variable1
mov variable1, cx

ret

main endp


CODESTUFF ENDS
; - - - - - - - - - - - -

END start
;***************************

There is an extra segment put in that has the definition

MORESTUFF SEGMENT PUBLIC 'HOHUM'

There is a variable defined in each segment including the stack
segment. These variables all have numbers in them, and the
numbers are in hex so they will be easy to read. There are only
two external subroutines (neither of which is called). It is time
to take a look at an assembler listing.

----- THIS IS FROM THE SCREEN -----

C>masm segs
Microsoft (R) Macro Assembler Version 5.10
Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.

Object filename [segs.OBJ]:
Source listing [NUL.LST]: segs
Cross-reference [NUL.CRF]:

-----------

If you don't put a semicolon after the filename with masm, you
get some prompts. The first asks you if you want the object file
name to be different from the asm file name. You may change
either the name or the name and the extension. If you don't want
to change either, just press ENTER. The second asks if you want a
listing. Normally you don't, so you just press ENTER. This time
we do, so we give it the same name as the assembler file. The
assembler will generate a file SEGS.LST. Finally, it asks if you
want the information needed to create a cross-reference file. We
won't cover that. Once again, press ENTER. The assembler
generates an object file and a listing. Here's the complete
listing.

**********************
Microsoft (R) Macro Assembler Version 5.10
9/2/89 09:50:54

Page 1-1



; segs.asm

; - - - - - - - - - - - - -
0000 STACKSEG SEGMENT STACK 'STACK'

0000 4444 variable4 dw 4444h
0002 0100[ dw 100h dup (?)
????
]


0202 STACKSEG ENDS
; - - - - - - - - - - - - -
0000 MORESTUFF SEGMENT PUBLIC 'HOHUM'

0000 2222 variable2 dw 2222h

0002 MORESTUFF ENDS
; - - - - - - - - - - - - -
0000 DATASTUFF SEGMENT PUBLIC 'DATA'

0000 1111 variable1 dw 1111h


0002 DATASTUFF ENDS
; - - - - - - - - - - - - -
0000 CODESTUFF SEGMENT PUBLIC 'CODE'

EXTRN print_num:NEAR , get_num:NEAR

ASSUME cs:CODESTUFF,ds:DATASTUFF
ASSUME es:MORESTUFF,ss:STACKSEG

0000 3333 variable3 dw 3333h

0002 main proc far
0002 1E start: push ds
0003 2B C0 sub ax,ax
0005 50 push ax

0006 B8 ---- R mov ax, DATASTUFF
0009 8E D8 mov ds,ax
000B B8 ---- R mov ax, MORESTUFF
000E 8E C0 mov es,ax

0010 8B 0E 0000 R mov cx, variable1
0014 89 0E 0000 R mov variable1, cx

0018 CB ret

0019 main endp


0019 CODESTUFF ENDS

Microsoft (R) Macro Assembler Version 5.10
9/2/89 09:50:54

Page 1-2


; - - - - - - - - - - - -

END start


Microsoft (R) Macro Assembler Version 5.10

9/2/89 09:50:54


Symbols-1


Segments and Groups:

N a m e Length Align Combine Class

CODESTUFF . . . . . . . . . . . 0019 PARA PUBLIC 'CODE'
DATASTUFF . . . . . . . . . . . 0002 PARA PUBLIC 'DATA'
MORESTUFF . . . . . . . . . . . 0002 PARA PUBLIC 'HOHUM'
STACKSEG . . . . . . . . . . . . 0202 PARA STACK 'STACK'

Symbols:

N a m e Type Value Attr

GET_NUM . . . . . L NEAR 0000 CODESTUFF External

MAIN . . . . . . . . F PROC 0002 CODESTUFF Length = 0017

PRINT_NUM . . . . L NEAR 0000 CODESTUFF External

START . . . . . . . . L NEAR 0002 CODESTUFF

VARIABLE1 . . . . . . L WORD 0000 DATASTUFF
VARIABLE2 . . . . . . L WORD 0000 MORESTUFF
VARIABLE3 . . . . . . L WORD 0000 CODESTUFF
VARIABLE4 . . . . . . L WORD 0000 STACKSEG

@CPU . . . . . . . . . . . . . . TEXT 0101h
@FILENAME . . . . . . . . . . . TEXT segs
@VERSION . . . . . . . . . . . . TEXT 510


54 Source Lines
54 Total Lines
21 Symbols

48006 + 428261 Bytes symbol space free

0 Warning Errors
0 Severe Errors

**********************

As you can see, the listing, even for a short program, is very
long. Let's take it apart section by section. The first large
section is a copy of the text file except that there is
information on the left. The number on the far left tells the
offset address (in hex) from the beginning of the segment for
each label, variable or instruction. In this section:


0000 3333 variable3 dw 3333h

0002 main proc far
0002 1E start: push ds
0003 2B C0 sub ax,ax
0005 50 push ax

0006 B8 ---- R mov ax, DATASTUFF
0009 8E D8 mov ds,ax

000B B8 ---- R mov ax, MORESTUFF
000E 8E C0 mov es,ax

0010 8B 0E 0000 R mov cx, variable1
0014 89 0E 0000 R mov variable1, cx

0018 CB ret

0019 main endp

"start" is at 0002h, "mov cx, variable1" is at 0010h and "ret" is
at 0018h.
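
If you want to convince yourself that the offset column is nothing more than a running total of instruction lengths, here is a minimal sketch in Python (Python is not part of the tutorial's toolchain; it is used here only as a calculator). The byte counts are read off the machine code column of the listing above, and the helper name assign_offsets is made up for this illustration.

```python
# Byte counts are copied from the machine-code column of SEGS.LST.
# variable3 occupies offsets 0000-0001, so the code starts at 0002.
instructions = [
    ("push ds",           1),   # 1E
    ("sub ax,ax",         2),   # 2B C0
    ("push ax",           1),   # 50
    ("mov ax, DATASTUFF", 3),   # B8 ---- (opcode + 16-bit segment)
    ("mov ds,ax",         2),   # 8E D8
    ("mov ax, MORESTUFF", 3),   # B8 ----
    ("mov es,ax",         2),   # 8E C0
    ("mov cx, variable1", 4),   # 8B 0E 0000
    ("mov variable1, cx", 4),   # 89 0E 0000
    ("ret",               1),   # CB
]

def assign_offsets(start, items):
    """Return ({text: offset}, offset just past the last instruction)."""
    offsets = {}
    location = start
    for text, size in items:
        offsets[text] = location   # this instruction starts here
        location += size           # the next one starts right after it
    return offsets, location

offsets, end = assign_offsets(0x0002, instructions)
for text, offset in offsets.items():
    print(f"{offset:04X}  {text}")
print(f"{end:04X}  main endp   (length of main = {end - 0x0002:04X})")
```

The last line works out to 0019, which is where the listing puts "main endp", and the difference 0019h - 0002h = 0017h is the "Length = 0017" that the symbol table reports for MAIN.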

The second set of numbers is the actual machine instructions in
hex. These are what the 8086 operates on. "push ds" is 1E,
"mov ds, ax" is 8E D8, and "ret" is CB. The instructions can be
from 1 - 6 bytes long. Notice the "R" after some of the
instructions. The "R" stands for relocatable. This means that it
is an address that might be changed by either the linker or the
loader. We'll talk about that later. In any case, the object file
keeps track of these so they can be changed if necessary. Also,
go back to the complete listing and look at the four variables;
you will see that the values have been put in the object code;
that is, 1111h, 2222h, 3333h and 4444h.

If we had had an error, the assembler would have placed an error
message at the spot of the error in this part of the file.


The next part of the .LST file is the segment listing. It tells
how the segments are defined.

N a m e Length Align Combine Class

CODESTUFF . . . . . . . . 0019 PARA PUBLIC 'CODE'
DATASTUFF . . . . . . . . 0002 PARA PUBLIC 'DATA'
MORESTUFF . . . . . . . . 0002 PARA PUBLIC 'HOHUM'
STACKSEG . . . . . . . . . 0202 PARA STACK 'STACK'


We have the segment name, length, and some other information
we'll talk about later. Notice that 'HOHUM', which is an
artificial class, is dutifully listed with no complaints.


Then comes the listing of all labels, variables, and procedure
names.

Symbols:

N a m e Type Value Attr

GET_NUM . . . . L NEAR 0000 CODESTUFF External
MAIN . . . . . F PROC 0002 CODESTUFF Length = 0017
PRINT_NUM . . L NEAR 0000 CODESTUFF External
START . . . . . L NEAR 0002 CODESTUFF
VARIABLE1 . . . L WORD 0000 DATASTUFF
VARIABLE2 . . . L WORD 0000 MORESTUFF
VARIABLE3 . . . L WORD 0000 CODESTUFF
VARIABLE4 . . . L WORD 0000 STACKSEG


It shows the segment and offset of each symbol and whether it is
a byte, word, procedure, etc. The "L" stands for label, and the
"F" in front of PROC marks main as a FAR procedure. The variables
and procedures which are in an external file are so marked.
Neither print_num nor get_num was called, but the assembler
maintains a listing for them.

Finally, some internal info for the assembler.

@CPU . . . . . . . . . . . . . . TEXT 0101h
@FILENAME . . . . . . . . . . . TEXT segs
@VERSION . . . . . . . . . . . . TEXT 510

We will come back to parts of the .LST file, so make yourself
comfortable with it.



SEGMENTS

It is now time for the nitty-gritty. We need to know what all
those statements in the template file mean. Remember that there
are four players in the game - (1) MASM, the Microsoft assembler,
(2) LINK, the Microsoft linker, (3) the program loader and (4)
the 8086 chip itself. Who does what to whom is the subject of
this chapter.

You will notice that there are three segments in all the template
files, one for data, one for code, and one for the stack. How
many segments can a program have? An unlimited number for code,
an unlimited number for data, and one for the stack.{1} Although
you can have an unlimited number of segments, you can use only
four at any one time - two for regular data (referenced by the DS
and ES registers), one for code (referenced by the CS register),
and one for temporary data (referenced by the SS register).

You don't have direct control over CS. You should NEVER change
the value in SS. This means that you can only change which
segments ES and DS refer to. How do you do that? The 8086 does
not allow you to move a constant directly into a segment
register, so it is a two step process: put the constant into an
arithmetic register (AX, BX, CX, DX, SI, DI or BP) and move it
from there to the segment register. Suppose we have 327 different
data segments in our file (named SEG1, SEG2, SEG3 ... SEG327) and
we wanted to reference data in SEG27. The code would be:

mov ax, SEG27
mov ds, ax

____________________

1 Although if you REALLY need more space for a stack it is
possible, if a little arcane.
____________________

This is the standard way to do it, and this is the same as the
fourth and fifth instructions in the code segment of the template
files where we are putting the address of DATASTUFF in ds.

What is that SEG27 in the instruction (mov ax, SEG27)? It is a
constant. When the assembler assembles the program, it makes note
of the fact that you want to have the starting address of SEG27
in that instruction (you saw the "R" in the listing for the
instruction 'mov ax, DATASTUFF'). Later the linker makes sure
there is a SEG27 segment in the complete program, gives it a
temporary segment address, and puts this temporary address in
every place that references that segment address. This address is
guaranteed to be adjusted. You will see why when we look at the
linker .MAP file.

Finally, the loader (which is the program that puts your program
into memory) puts the segment where it wants and updates all
references to the segment address to reflect where it now is.
Thus, the program is complete only when this information is put
in at run time. Each time you run the program SEG27 might be in a
different place, but the loader will always update the references
correctly.

We named the segments SEG1, SEG2, etc. Does SEG have to be part
of the segment name? Not on your life. Here are three perfectly
acceptable segment definitions:

CURLY SEGMENT
LARRY SEGMENT
MOE SEGMENT

It is good practice to have 'SEG' as part of the segment name to
remind you that these are segments, not variables, but this is a
practice only; it is not a law. Any name you could use for a
variable or a label you could use as a segment name. The reserved
word SEGMENT after the name tells the assembler that this is the
beginning of a segment with that name. You tell the assembler
that you are starting a segment with 'SEGMENT':
CURLY SEGMENT

and you tell the assembler that you are finished with that
segment with the reserved word ENDS (END [of] Segment):

CURLY ENDS

You need to put the name of the segment before the ENDS
directive.

In the template file, the data segment definition reads:

DATASTUFF SEGMENT PUBLIC 'DATA'

DATASTUFF is the segment name, but what are PUBLIC and 'DATA'
there for? To understand this, we need to look at the linker.
First, let's assemble temp1.asm (our first template file) just
the way it is.

---------- FROM THE SCREEN ----------
C>masm temp1.asm
Microsoft (R) Macro Assembler Version 5.10
Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.


Object filename [temp1.OBJ]:
Source listing [NUL.LST]: temp1
Cross-reference [NUL.CRF]:

----------
We have made the listing file so let's look at the segment
information.


N a m e Length Align Combine Class

CODESTUFF . . . . . . . . . . . 000A PARA PUBLIC 'CODE'
DATASTUFF . . . . . . . . . . . 0000 PARA PUBLIC 'DATA'
STACKSEG . . . . . . . . . . . . 00C8 PARA STACK 'STACK'

You will see that CODESTUFF is Ah (10d) bytes long, DATASTUFF has
no data and is 0 bytes long, and STACKSEG is C8h (200d) bytes
long.

Now let's link temp1.obj and asmhelp.obj.

---------- FROM THE SCREEN -----

C>link temp1+asmhelp

Microsoft (R) Overlay Linker Version 3.61
Copyright (C) Microsoft Corp 1983-1987. All rights reserved.


Run File [TEMP1.EXE]:
List File [NUL.MAP]: temp
Libraries [.LIB]:

----------

This time we have made a listing file for the link process. It is
called TEMP.MAP. Let's look at it.

 Start  Stop   Length Name                   Class
 00000H 000C7H 000C8H STACKSEG               STACK
 000D0H 00540H 00471H DATASTUFF              DATA
 00550H 01944H 013F5H CODESTUFF              CODE

Program entry point at 0055:0000

This is what the map file looks like. There are still only three
segments in the final executable file, STACKSEG, DATASTUFF and
CODESTUFF. You will notice that the class name is still there,
but the PUBLIC is missing. Its job is finished. "Start" says
where the segment starts in the executable file, "Stop" says
where the segment ends in the executable file, and "Length" says
the length in bytes of the segment. These numbers are 5 digit hex
numbers instead of 4. That means that they are showing the total
(20-bit) address. The segment number is the left 4 digits of
'Start'.
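
The relationship between the 5 digit addresses and the 4 digit segment numbers is just a shift by one hex digit. Here is a small Python sketch of it (the function names are invented for this example; the numbers come from TEMP.MAP above):

```python
def segment_of(start):
    """Segment number for a paragraph-aligned 20-bit start address."""
    assert start % 16 == 0, "segments start on 16-byte boundaries"
    return start >> 4              # drop the last hex digit

def physical(segment, offset):
    """20-bit address the 8086 forms from segment:offset."""
    return ((segment << 4) + offset) & 0xFFFFF   # the 8086 wraps at 1 MB

# Start addresses from TEMP.MAP
starts = {"STACKSEG": 0x00000, "DATASTUFF": 0x000D0, "CODESTUFF": 0x00550}
segments = {name: segment_of(addr) for name, addr in starts.items()}
for name, seg in segments.items():
    print(f"{name:<10} segment {seg:04X}h")

# "Program entry point at 0055:0000"
entry = physical(0x0055, 0x0000)
print(f"entry point = {entry:05X}h")
```

The entry point 0055:0000 comes out as 00550h, which is the start of CODESTUFF in the map.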

STACKSEG is C8h (200d) bytes long like before. Although DATASTUFF
had no data, it is now 471h (1137d) bytes long, and CODESTUFF was
Ah (10d) bytes long before but now it is a whopping 13F5h (5109d)
bytes long. What happened? The linker did its work.

One of the things the linker does is combine things that we want
to be in the same segment. It took the DATASTUFF segment from
temp1.obj and appended the DATASTUFF segment from asmhelp.obj,
combining them into one larger segment.{2} It took the CODESTUFF
segment from temp1.obj and appended the CODESTUFF segment from
asmhelp.obj, making them one large segment. Why did it do that?
Because we put the word "PUBLIC" in the segment definition. When
the assembler sees "PUBLIC" in the segment definition, it passes
that information along to the linker in a header.{3} When
the linker has a segment which is "PUBLIC", it will append any
other segment which (1) is "PUBLIC", (2) has the same name (i.e.
CODESTUFF or DATASTUFF or CURLY etc.), and (3) has the same class
name{4}. All three things must be true for the linker to combine
them. We will actually check this out a little later to make sure
you believe it.
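
Those three conditions can be written down as a small Python sketch. This is only a model of the rule described above, not the actual LINK algorithm. The lengths for temp1.obj come from its listing. The lengths for asmhelp.obj are not given in the text; they are assumed here so that the totals agree with the map file, on the further assumption that each piece starts on a 16-byte boundary. The module "other" with class 'HOHUM' is hypothetical.

```python
from collections import namedtuple

Piece = namedtuple("Piece", "module name combine cls length")

def para_align(n):
    """Round n up to the next multiple of 16 (one paragraph)."""
    return (n + 15) & ~15

def combine(pieces):
    """Merge PUBLIC pieces that share both segment name and class name.

    Pieces are taken in link order. Returns {key: combined length}.
    """
    totals = {}
    for p in pieces:
        if p.combine == "PUBLIC":
            key = (p.name, p.cls)             # same name AND same class
        else:
            key = (p.name, p.cls, p.module)   # simplification: kept separate
        # Assumption: each appended piece starts on a paragraph boundary.
        totals[key] = para_align(totals.get(key, 0)) + p.length
    return totals

pieces = [
    Piece("temp1",   "DATASTUFF", "PUBLIC", "DATA",  0x0000),
    Piece("temp1",   "CODESTUFF", "PUBLIC", "CODE",  0x000A),
    Piece("asmhelp", "DATASTUFF", "PUBLIC", "DATA",  0x0471),  # assumed length
    Piece("asmhelp", "CODESTUFF", "PUBLIC", "CODE",  0x13E5),  # assumed length
    Piece("other",   "DATASTUFF", "PUBLIC", "HOHUM", 0x0002),  # hypothetical
]

totals = combine(pieces)
for key, length in totals.items():
    print(key, f"{length:04X}h")
```

The 'HOHUM' piece has the same segment name as the others and is PUBLIC, but its class name is different, so it stays a segment of its own.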

One other thing to notice is that the linker is allocating only
as much space as is needed. It could allocate 65536 bytes for
each segment defined, but it uses only as much as the program
needs and then starts the next segment at the next possible
segment starting address, which is the next multiple of 16 bytes
(STACKSEG stops at 000C7h, so DATASTUFF starts at 000D0h). This
is efficient management of memory.
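
The Start and Stop columns of the map file can be rebuilt from the lengths alone. A minimal Python sketch, assuming every segment is aligned on a 16-byte (paragraph) boundary as the PARA entries in the listing say; the function names are made up for the illustration:

```python
def para_align(address):
    """Round address up to the next multiple of 16."""
    return (address + 15) & ~15

def lay_out(segments):
    """segments: list of (name, length) in map order, lengths non-zero.

    Returns a list of (start, stop, length, name) rows.
    """
    rows = []
    next_free = 0
    for name, length in segments:
        start = para_align(next_free)   # next segment starting address
        stop = start + length - 1       # address of the last byte used
        rows.append((start, stop, length, name))
        next_free = start + length
    return rows

rows = lay_out([("STACKSEG", 0x00C8),
                ("DATASTUFF", 0x0471),
                ("CODESTUFF", 0x13F5)])
for start, stop, length, name in rows:
    print(f"{start:05X}H {stop:05X}H {length:05X}H {name}")
```

The three rows it prints have the same Start, Stop and Length values as TEMP.MAP.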

What is the advantage of combining the smaller segments into one
larger segment? For code, there is no big advantage. But for
data, remember that every time we want to access data, we need to
have the starting address of that particular segment in register
ds. We do this by using:

mov ax, DATASTUFF
mov ds, ax

If we have a number of data segments, every time we access data
we need to (1) make sure that ds contains the address of the
correct data segment, and (2) if not, we need to write the code
to change ds. This entails using a lot of code, can be confusing
and is certainly error prone. With one data segment, you simply
load ds with the correct address at the beginning of the program
and then forget about it. This should be a rule for you. Unless
you have truly humongous amounts of data (more than 65536 bytes),
ALWAYS put all your data in the same segment.

____________________

2 The linker always works from left to right through the object
files named on the command line. For each different type of
segment, it starts with the first one it finds and then appends
each succeeding one it finds.

3 A header is information for the linker or loader which is
put in front of the machine code in an object file or an
executable file. There are typically a number of headers in front
of the machine code.

4 Remember that class names are somewhat arbitrary. I use
'CODE', 'DATA' and 'STACK' for clarity and because they are the
standard Microsoft class names, but if you are not linking with
anyone else's programs, you can use any class name you want.
____________________

Do you remember those dashes '----' in the assembler listing?
That was because the assembler didn't have a segment address to
put there.

0004 B8 ---- R mov ax, DATASTUFF
0007 8E D8 mov ds,ax

0009 8B 0E 0000 R mov cx, variable1
000D 89 0E 0000 R mov variable1, cx

The linker now has a temporary address for the start of DATASTUFF
(000D0h) so it will put the segment address (the left four hex
digits, 000Dh) in this spot. This is temporary, and will be
updated by the loader. If variable1 has been moved, the linker
will update its offset too.

Why am I sure that these temporary segments will be moved? The
segment address of STACKSEG is 0000h. The segment address of
DATASTUFF is 000Dh (13d) and the segment address of CODESTUFF is
0055h (85d). But the operating system owns the first several
THOUSAND segments. The loader will load your program in much
higher memory. They must move.
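
Here is a simplified Python model of what the loader does to one of those temporary segment numbers. It assumes the loader has a list of the places in the image that hold segment values and adds the load segment to each one. The load segment 1A2Bh is made up, the function name is invented, and the details of the real .EXE header are left out.

```python
import struct

def relocate(image, fixups, load_segment):
    """Add load_segment to each 16-bit little-endian word listed in fixups."""
    patched = bytearray(image)
    for offset in fixups:
        (link_time,) = struct.unpack_from("<H", patched, offset)
        struct.pack_into("<H", patched, offset,
                         (link_time + load_segment) & 0xFFFF)
    return bytes(patched)

# "mov ax, DATASTUFF" / "mov ds,ax" as the linker leaves them:
# B8 followed by the temporary segment 000Dh (low byte first), then 8E D8.
image = bytes([0xB8, 0x0D, 0x00, 0x8E, 0xD8])
fixups = [1]              # offset of the segment word inside the image
LOAD_SEGMENT = 0x1A2B     # hypothetical: wherever the loader found room

loaded = relocate(image, fixups, LOAD_SEGMENT)
print(" ".join(f"{b:02X}" for b in loaded))
```

Only the two bytes after B8 change. The instruction itself and the "mov ds,ax" that follows are untouched, which is why the program can run no matter where the loader puts it.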

So the linker combines all the segments we want to combine, and
then it looks at the machine code and modifies every reference to
the segments and to the variables which have been moved. That is
a lot of work. For instance, when the linker appends asmhelp.obj,
there are a hundred or so variables which it moves and a thousand
or so references to those variables which it modifies. The linker
does that every time you link a file with ASMHELP.OBJ. That's not
too shabby.