Category : Assembly Language Source Code
Archive   : ASMTUT3.ZIP
Filename : CHAP15-1.DOC

 
Output of file : CHAP15-1.DOC contained in archive : ASMTUT3.ZIP




137

CHAPTER 15 - SUBROUTINES


It is now time to talk about subroutines. If you have only used
BASIC this may be difficult for you. It is assumed that you are
familiar with subroutines and use them constantly in your
programming.

You have been using subroutines since the very first program in
this manual. When you wrote:

call get_num

you called a subroutine in ASMHELP.OBJ. Now you are going to
write subroutines yourself and have them call each other. There
are different template files for programs with subroutines. They
are SUBTEMP1.ASM and SUBTEMP2.ASM. We will start with SUBTEMP1.
It has the entry subroutine and a space for additional
subroutines. The entry subroutine is the subroutine where the
operating system starts the program; it does the necessary
initialization and has special code for that.

You will see some additions to the normal template file. At the
top is the line:

INCLUDE \pushregs.mac

What this is will be explained later, but you must put the file
PUSHREGS.MAC in the root directory of your current drive. You
will find it in the \TEMPLATE subdirectory.

At the end of the SUBTEMP1.ASM is:

; + + + + + + + + + + + + START SUBROUTINES BELOW THIS LINE

; + + + + + + + + + + + + END SUBROUTINES ABOVE THIS LINE

This is where you will write all the subroutines except the entry
subroutine which is still the same as before. All data for all
subroutines still goes in the DATASTUFF segment.


Our first program will just call subroutines which will print out
messages. Using SUBTEMP1.ASM, it looks like this:

;prog1.asm

; + + + + + + + + + + + + + + + START DATA BELOW THIS LINE
main_message db "This is the entry routine.", 0
sub1_message db "This is subroutine1.", 0
sub2_message db "This is subroutine2.", 0
sub3_message db "This is subroutine 3.", 0
; + + + + + + + + + + + + + + + END DATA ABOVE THIS LINE


______________________

The PC Assembler Tutor - Copyright (C) 1989 Chuck Nelson




The PC Assembler Tutor 138
______________________

; + + + + + + + + + + + + + + + START CODE BELOW THIS LINE
mov ax, offset main_message
call print_string
call sub1
mov ax, offset main_message
call print_string
; + + + + + + + + + + + + + + + END CODE ABOVE THIS LINE

; + + + + + + + + + + + + START SUBROUTINES BELOW THIS LINE
;------------
sub1 proc near

push ax
mov ax, offset sub1_message
call print_string
call sub2
mov ax, offset sub1_message
call print_string
pop ax

ret

sub1 endp
;------------
sub2 proc near

push ax
mov ax, offset sub2_message
call print_string
call sub3
mov ax, offset sub2_message
call print_string
pop ax

ret

sub2 endp
;------------
sub3 proc near

push ax
mov ax, offset sub3_message
call print_string
pop ax

ret

sub3 endp
; ----------
; + + + + + + + + + + + + END SUBROUTINES ABOVE THIS LINE

The data consists of messages to be printed by print_string.
Print_string prints a zero terminated string (the number zero,
not the character '0'), so there must be a zero after each
message in the data segment. The entry subroutine prints a
message and then calls sub1, the first subroutine, which prints a
message and calls sub2 which prints a message and calls sub3.




Chapter 15 - Subroutines 139
________________________

Sub3 prints a message and then returns to sub2 which prints a
message and returns to sub1 which prints a message and returns to
the entry routine which prints a message and then exits. This
program should print 7 messages in all. You will notice that the
first thing that each subroutine does is save the value in AX,
since it uses the AX register. This is the cardinal rule of
robustness at the assembler level.

IF YOU USE A REGISTER, YOU MUST SAVE ITS VALUE BY PUSHING IT
ON THE STACK; YOU MUST THEN RESTORE THE VALUE JUST BEFORE
EXITING.

It is impossible to overstress this. The routines which call your
routine might rely on the registers remaining unaltered. If you
disobey this rule and alter the registers, you'll be sorry.

Why doesn't the entry routine push and pop the registers it uses?
Well, the operating system assumes the registers will contain
trash upon return from the program, so it uses nothing in the
data registers.

All the subroutines except the entry routine are near routines.
We will only use near routines. Assemble this program, link it
and run it. If it works ok, it is then time for program 2, which
is the same as program1, but is in two files.

Often, we want parts of a program in different files. Perhaps
parts are standard subprograms which you have already written and
assembled, perhaps the total program is too large to be handled
comfortably in one file, perhaps different people are writing
different parts of the program. Not only must we write the
programs, but we must be able to connect them. We will put the
entry routine, sub2 and the associated data in subtemp1.asm. We
will put sub1, sub3, and the associated data in subtemp2.asm.

Take a look at SUBTEMP2.ASM. It is slightly different. First, it
does not have the variables that you need for set_reg_style
(ax_byte, bx_byte, etc.) but it does have EXTRN statements for
them. This means that you can change the register style from this
file. SUBTEMP1.ASM has these variables declared PUBLIC so the
linker can join them correctly.{1} We will talk about the correct
way to declare external data later.

SUBTEMP2.ASM has no stack segment, though there could be one.
There is no entry subroutine. Therefore at the very end, you have
the line:

END

with nothing after it. In SUBTEMP1.ASM, you have
____________________

1. The reason for having only one set of variables for the
style is so that every time you change one of the style
variables, the array is updated. If you had two different arrays
you could have two different sets of information for
set_reg_style.




The PC Assembler Tutor 140
______________________


END start

so the assembler and linker know that the program begins at the
label "start".

Let's do the two programs. Here are the data, the entry code and
the subroutine code from the first file.

;prog1.asm
; + + + + + + + + + + + + + + + START DATA BELOW THIS LINE
main_message db "This is the entry routine.", 0
sub2_message db "This is subroutine2.", 0
; + + + + + + + + + + + + + + + END DATA ABOVE THIS LINE

; + + + + + + + + + + + + + + + START CODE BELOW THIS LINE
PUBLIC sub2
EXTRN sub1:NEAR, sub3:NEAR

mov ax, offset main_message
call print_string
call sub1
mov ax, offset main_message
call print_string
; + + + + + + + + + + + + + + + END CODE ABOVE THIS LINE

; + + + + + + + + + + + + START SUBROUTINES BELOW THIS LINE
sub2 proc near

push ax
mov ax, offset sub2_message
call print_string
call sub3
mov ax, offset sub2_message
call print_string
pop ax

ret

sub2 endp
; + + + + + + + + + + + + END SUBROUTINES ABOVE THIS LINE

Notice that sub1 and sub3 have been declared EXTRN before they
were referenced, and the EXTRN statement tells the assembler that
they are both near procedures. sub2 has been declared PUBLIC so
the assembler will give the address of sub2 to the linker.

Here's the data and code for the other file.

;prog2.asm

; + + + + + + + + + + + + + + + START DATA BELOW THIS LINE
sub1_message db "This is subroutine1.", 0
sub3_message db "This is subroutine 3.", 0
; + + + + + + + + + + + + + + + END DATA ABOVE THIS LINE

; + + + + + + + + + + + + START SUBROUTINES BELOW THIS LINE




Chapter 15 - Subroutines 141
________________________

PUBLIC sub1, sub3
EXTRN sub2:NEAR
;------------
sub1 proc near

push ax
mov ax, offset sub1_message
call print_string
call sub2
mov ax, offset sub1_message
call print_string
pop ax

ret

sub1 endp
;------------
sub3 proc near

push ax
mov ax, offset sub3_message
call print_string
pop ax

ret

sub3 endp
; ----------
; + + + + + + + + + + + + END SUBROUTINES ABOVE THIS LINE

Here sub1 and sub3 have been declared PUBLIC and sub2 has been
declared EXTRN.

Assemble both programs and then link all three.

link prog1+prog2+\asmhelp.obj

assuming that asmhelp is in the root directory. Run it. You
should have the same results as before.

We are going to do one more thing with the same two files.
Without changing any of the code, we are going to put the data
for prog1 in prog2 and the data for prog2 in prog1 like this.

;prog1
; + + + + + + + + + + + + + + + START DATA BELOW THIS LINE
sub1_message db "This is subroutine1.", 0
sub3_message db "This is subroutine 3.", 0
; + + + + + + + + + + + + + + + END DATA ABOVE THIS LINE

;prog2
; + + + + + + + + + + + + + + + START DATA BELOW THIS LINE
main_message db "This is the entry routine.", 0
sub2_message db "This is subroutine2.", 0
; + + + + + + + + + + + + + + + END DATA ABOVE THIS LINE

So far, so good. Obviously we are going to need some more PUBLIC




The PC Assembler Tutor 142
______________________

statements and some EXTRN statements so the linker can link the
four messages, but where do they go and what do they look like?
The PUBLIC statements are the easiest. Put them in the segment
where the message data appears, either before or after the data
declaration.

The EXTRN statement is a little more complicated. First, all data
is declared EXTRN by giving the variable name followed by a colon
followed by its data type. The data types are BYTE, WORD, DWORD
(4bytes), QWORD (quadword or 8 bytes), and TBYTE (10 bytes).
These are the standard 8086/7 data sizes. Therefore we have:

EXTRN sub1_message:BYTE, sub3_message:BYTE

in prog2.asm and:

EXTRN main_message:BYTE, sub2_message:BYTE

in prog1.asm. Where do they go? In order to know that, we need to
talk about segment overrides again.

You will remember from our discussion of the ASSUME statement
that every time the assembler writes code with a variable, it
checks the ASSUME statements to see which segment register(s)
have the address of the segment that that variable is in. If we
have:

ASSUME cs:SEG1, ds:SEG2, es:SEG3, ss:SEG4

then if variable1 is in SEG2, the assembler will write no
override in the code since DS is the 8086 default segment.

MACHINE CODE ASSEMBLER INSTRUCTION
A1 0000 mov ax, variable1

If variable1 is in SEG1 or SEG3 or SEG4, the assembler will write
the appropriate segment override in the code.

MACHINE CODE ASSEMBLER INSTRUCTION

2E: A1 0000 mov ax, variable1
26: A1 0000 mov ax, variable1
36: A1 0000 mov ax, variable1

(By the way, those zeros just mean that the variable is at 0000
offset from the beginning of the segment).

The same thing happens when you have an EXTRN statement. The
assembler associates the externally declared variable with the
segment it is declared in. When the variable is used, it then
goes through the same actions as if the variable were actually in
that segment. Let's declare variable5 external with:

EXTRN variable5:WORD

If we have:





Chapter 15 - Subroutines 143
________________________

ASSUME cs:SEG1, ds:SEG2, es:SEG3, ss:SEG4

then if variable5 is declared external in SEG2, the assembler
will write no override in the code since DS is the 8086 default
segment.

MACHINE CODE ASSEMBLER INSTRUCTION
A1 0000 E mov ax, variable5

If variable5 is declared external in SEG1 or SEG3 or SEG4, the
assembler will write the appropriate segment override in the
code.

MACHINE CODE ASSEMBLER INSTRUCTION

2E: A1 0000 E mov ax, variable5
26: A1 0000 E mov ax, variable5
36: A1 0000 E mov ax, variable5

The "E" after the machine code means that the assembler knows
that the variable is external and it will tell the linker so the
linker can put the correct offset address at that point in the
machine code.

Remember, as always, that it is your responsibility to have the
correct segment address in the segment register before using a
variable.

Now we know where it goes. When you declare a variable external,
you must put the EXTRN statement in a segment which uses the same
segment register as the EXTRN variable is going to use. If the
EXTRN variable will use DS, then the segment where the EXTRN
statement is must use DS. If the variable uses ES, then the
segment the EXTRN statement is in must use ES. In other words,
the ASSUME statement for the segment the variable is in must
match EXACTLY the ASSUME statement you would write if the
variable were internal, not external.{2} Normally, this is DS,
but in special circumstances you might want something else. Also,
if there is no segment that exactly matches what you want, then
you need to create a dummy segment:

DUMMY_SEG SEGMENT
EXTRN variable7:QWORD
DUMMY_SEG ENDS

and make the assume statement that you want:

____________________

2. This means that if the segment with the EXTRN statement has
more than one segment register in the assume statement:

ASSUME ds:MORESTUFF, es:MORESTUFF

then both those registers must be set to the segment of the
external variable when using it or your results may be
unreliable.




The PC Assembler Tutor 144
______________________

ASSUME es:DUMMY_SEG



What segment has DS in an ASSUME statement? DATASTUFF in both
files, so that is where the EXTRN declaration goes - in the
DATASTUFF segment.

;prog1
; + + + + + + + + + + + + + + + START DATA BELOW THIS LINE
PUBLIC sub1_message, sub3_message
EXTRN main_message:BYTE, sub2_message:BYTE
sub1_message db "This is subroutine1.", 0
sub3_message db "This is subroutine 3.", 0
; + + + + + + + + + + + + + + + END DATA ABOVE THIS LINE

;prog2
; + + + + + + + + + + + + + + + START DATA BELOW THIS LINE
PUBLIC main_message, sub2_message
EXTRN sub1_message:BYTE, sub3_message:BYTE
main_message db "This is the entry routine.", 0
sub2_message db "This is subroutine2.", 0
; + + + + + + + + + + + + + + + END DATA ABOVE THIS LINE

Change the data in the two files, assemble them again and link
them again:

link prog1+prog2+\asmhelp.obj

You should get the same results as before. We are now through
with these programs. Make sure you understand how to define
PUBLIC and EXTRN procedures and PUBLIC and EXTRN data before
going on, since we are not going to cover it again. Everything
else in this chapter will be done with a single file in order to
make life easier.


PASSING DATA

When you pass data to the routines in ASMHELP.OBJ, you always
pass it through the AX register. The reason for this is that you
needed to use these routines before you knew much about 8086
assembler language. It is solely for the convenience of beginners
and is totally non-standard. In the real world, when you call a
subroutine you ALWAYS pass the data on the stack, no matter which
language you are using.

If you have the C statement:

my_procedure (variable1, variable2, variable3) ;

then the C compiler will generate the following code:

push variable3
push variable2
push variable1





Chapter 15 - Subroutines 145
________________________

call my_procedure{3}

The C language pushes these variables in right to left order.
Before the call instruction is executed variable1 is on the top
of the stack, variable2 is the next down, and variable3 is third
on the stack. Is variable1 still on the stack top after the call
instruction is executed? No. The call instruction pushes either
one or two words on the stack. Before you go any farther with
subroutines you need to know how the call and return instruction
operate.

Every time you have used show_regs, both CS the code segment
address and IP the instruction pointer have been displayed. What
does IP do? When the 8086 is ready to execute an instruction, it
takes IP, adds it to CS to calculate the total address, and gets
the instruction at that address. It then immediately figures out
how long the instruction is going to be and adds that amount to
IP.{4} What this means is that at any time, IP points to the NEXT
instruction, not the current instruction. When you execute a
call, the 8086 changes IP to point to the first byte of the
called subroutine, so the next instruction executed is the first
byte of the called subroutine.

There are two different types of procedures, near procedures and
far procedures. In a near procedure, you keep CS, the code
segment register, the same. In a far procedure you change CS. So,
when you call a near procedure you change one thing (IP) and in a
far procedure you change two things (IP and CS).

When you want to get back from the subroutine, you need to have
CS with the segment of the calling routine and IP with the
address of the instruction after the call. What are the mechanics
of all this? Let's take a near procedure first.

In a near call, the 8086 first changes the instruction pointer to
point to the next instruction. It then pushes IP on the stack,
and puts the address of the called subroutine (which is in bytes
2 and 3 of the call instruction) in IP. IP now points to the
called subroutine. There is one more word (2 bytes) on the stack.
At the end of the called subroutine, a NEAR return (ret) pops the
top word off the stack into IP. IP then points to the instruction
after the call instruction.

In a far call, the 8086 first changes the instruction pointer
(IP) to point to the next instruction. It then pushes CS on the
stack, followed by IP. It then loads the offset address of the
called subroutine in IP and the segment address of the called
subroutine in CS. This new IP is in bytes 2 and 3 of the call
instruction and the new CS is in bytes 4 and 5 of the call
____________________

3. You C fanatics will notice that there are some initial
underscores missing. Let's not confuse the issue.

4. Instructions can vary from one byte long to six bytes long,
and the 8086 can tell from the first (or first and second)
byte(s) how long the total instruction will be.




The PC Assembler Tutor 146
______________________

instruction. IP and CS now have the address of the called
subroutine. The stack has two words (4 bytes) more on the stack.
The old IP is the stack top and the old CS is next on the stack.
At the end of the subroutine, a FAR return (ret) pops the stack
top into IP, then pops the next stack item into CS. Now IP and CS
point to the instruction after the call instruction.

These are two different types of call and they have two different
machine codes. These are two different types of returns and they
have two different machine codes.


MACHINE CODE ASSEMBLER INSTRUCTIONS

; a far routine
;-----
far_routine proc far
CB ret
far_routine endp
;-----

; a near routine
;-----
near_routine proc near
C3 ret
near_routine endp
;-----


; a near and far call
E8 0A43 R call near_routine
9A 015C ---- R call far_routine


The machine code for a near return is C3; for a far return it's
CB. The machine code for a near call is E8; for a far call it's
9A. The near call has the address of the called routine (0A43h)
in the following two bytes. The far call has the address of the
the called routine (015Ch) in the next two bytes followed by the
segment of the called routine. The segment address isn't there
yet. It will be put there by the linker and loader, but the
assembler has saved the space for the address. That's why the

dashes are there. Remember, the R is there because those
addresses might be relocated by the linker or the loader.

You tell the assembler whether to code a near return or far
return by telling it whether it is a near or a far procedure.

routine1 proc near
routine2 proc far

How does the assembler know whether to code a near or far call?
If it has already seen the procedure, it knows what type it is.








Chapter 15 - Subroutines 147
________________________

If it hasn't seen it yet, it uses the default type.{5} If it is
an external subroutine, the assembler knows because you have
written an EXTRN statement.

EXTRN routine3:NEAR, routine4:FAR

This EXTRN statement should appear before the call.

What if the routine appears after the call in the source file but
it isn't the default type? You can override the default type.

call NEAR PTR routine5
call FAR PTR routine6

This is the same cumbersome syntax that we had with pointers to
data, but it's the only game in town. Normally, if the subroutine
appears after the call, you don't need to do anything if it is a
near call but you need to put a FAR PTR override if it is a far
call.

































____________________

5. The default is near for what we are doing. However,
Microsoft has something called "simplified" directives and the
default changes in these cases.