IBM Personal Computer Assembly Language Tutorial
Joshua Auerbach, Yale University
Yale Computer Center
175 Whitney Avenue
P. O. Box 2112
New Haven, Connecticut 06520
This talk is for people just getting started with
the PC MACRO Assembler. Maybe you are just contem-
plating doing some coding in assembler, maybe you
have tried it with mixed success. If you are here
to get aimed in the right direction, to get off to
a good start with the assembler, then you have come
for the right reason. I can't promise you'll get
what you want, but I'll do my best.
On the other hand, if you have already turned out
some working assembler code, then this talk is
likely to be on the elementary side for you. If
you want to review a few basics and have nowhere
else pressing to go, then by all means stay.
Why Learn Assembler?
The reasons for LEARNING assembler are not the same
as the reasons for USING it in a particular appli-
cation. But, we have to start with some of the
reasons for using it and then I think the reasons
for learning it will become clear.
First, let's dispose of a bad reason for using it.
Don't use it just because you think it is going to
execute faster. A particular sequence of ordinary
bread-and-butter computations written in PASCAL, C,
FORTRAN, or compiled BASIC can do the job about as
fast as the same algorithm coded in assembler. Of
course, interpretive BASIC is slower, but if you
have a BASIC application which runs too slow you
probably want to try compiling it before you think
too much about translating parts of it to another
language.
On the other hand, high level languages do tend to
isolate you from the machine. That is both their
strength and their weakness. Usually, when imple-
mented on a micro, a high level language provides
an escape mechanism to the underlying operating
system or to the bare machine. So, for example,
BASIC has its PEEK and POKE. But, the route to the
bare machine is often a circuitous one, leading to
tricky programming which is hard to follow.
For those of us working on PC's connected to SHARE-
class mainframes, we are generally concerned with
three interfaces: the keyboard, the screen, and
the communication line or lines. All three of
these entities raise machine dependent issues which
are imperfectly addressed by the underlying operat-
ing system or by high level languages.
Sometimes, the system or the language does too
little for you. For example, with the asynch
adapter, the system provides no interrupt handler,
no buffer, and no flow control. The application is
stuck with the responsibility for monitoring that
port and not missing any characters, then deciding
what to do with all errors. BASIC does a
reasonable job on some of this, but that is only
BASIC. Most other languages do less.
Sometimes, the system may do too much for you.
System support for the keyboard is an example. At
the hardware level, all 83 keys on the keyboard
send unique codes when they are pressed, held down,
and released. But someone has decided that certain
keys, like Num Lock and Scroll Lock, are going to do
certain things before the application even sees
them, and therefore can't be used as ordinary keys.
Sometimes, the system does about the right amount
of stuff but does it less efficiently than it
should. System support for the screen is in this
class. If you use only the official interface to
the screen you sometimes slow your application down
unacceptably. I said before, don't use assembler
just to speed things up, but there I was talking
about mainline code, which generally can't be
speeded up much by assembler coding. A critical
system interface is a different matter: sometimes
we may have to use assembler to bypass a hopelessly
inefficient implementation. We don't want to do
this if we can avoid it, but sometimes we can't.
Assembly language code can overcome these deficien-
cies. In some cases, you can also overcome these
deficiencies by judicious use of the escape valves
which your high level language provides. In BASIC,
you can PEEK and POKE and INP and OUT your way
around a great many issues. In other languages you
can issue system calls and interrupts and usually
manage, one way or other, to modify system memory.
Writing handlers to take real-time hardware inter-
rupts from the keyboard or asynch port, though, is
still going to be a problem in most languages.
Some languages claim to let you do it but I have
yet to see an acceptably clean implementation done
that way.
The real reason why assembler is better than
"tricky POKEs" for writing machine-dependent code,
though, is the same reason why PASCAL is better
than assembler for writing a payroll package: it
is easier to maintain.
Let the high level language do what it does best,
but recognize that there are some things which are
best done in assembler code. The assembler, unlike
the tricky POKE, can make judicious use of equates,
macros, labels, and appropriately placed comments
to show what is really going on in this machine-
dependent realm where it thrives.
So, there are times when it becomes appropriate to
write in assembler; given that, if you are a
responsible programmer or manager, you will want to
be "assembler-literate" so you can decide when
assembler code should be written.
What do I mean by "assembler-literate?" I don't
just mean understanding the 8086 architecture; I
think, even if you don't write much assembler code
yourself, you ought to understand the actual
process of turning out assembler code and the
various ways to incorporate it into an application.
You ought to be able to tell good assembler code
from bad, and appropriate assembler code from
inappropriate.
Steps to becoming ASSEMBLER-LITERATE
1. Learn the 8086 architecture and most of the
instruction set. Learn what you need to know
and ignore what you don't. Reading: The 8086
Primer by Stephen Morse, published by Hayden.
You need to read only two chapters, the one on
machine organization and the one on the
instruction set.
2. Learn about a few simple DOS function calls.
Know what services the operating system
provides. If appropriate, learn a little about
other systems too. It will aid portability
later on. Reading: appendices D and E of the
PC DOS manual.
3. Learn enough about the MACRO assembler and the
LINKer to write some simple things that really
work. Here, too, the main thing is figuring
out what you don't need to know. Whatever you
do, don't study the sample programs distributed
with the assembler unless you have nothing
better to do.
4. At the same time as you are learning the
assembler itself, you will need to learn a few
tools and concepts to properly combine your
assembler code with the other things you do.
If you plan to call assembler subroutines from
a high level language, you will need to study
the interface notes provided in your language
manual. Usually, this forms an appendix of
some sort. If you plan to package your
assembler routines as .COM programs you will
need to learn to do this. You should also
learn to use DEBUG.
5. Read the Technical Reference, but selectively.
The most important things to know are the
header comments in the BIOS listing. Next, you
will want to learn about the RS 232 port and
maybe about the video adapters.
Notice that the key thing in all five phases is
being selective. It is easy to conclude that there
is too much to learn unless you can throw away what
you don't need. Most of the rest of this talk is
going to deal with this very important question of
what you need and don't need to learn in each
phase. In some cases, I will have to leave you to
do almost all of the learning, in others, I will
teach a few salient points, enough, I hope, to get
you started. I hope you understand that all I can
do in an hour is get you started on the way.
Phase 1: Learn the architecture and instruction set
The Morse book might seem like a lot of book to buy
for just two really important chapters; other books
devote a lot more space to the instruction set and
give you a big beautiful reference page on each
instruction. And, some of the other things in the
Morse book, although interesting, really aren't
very vital and are covered too sketchily to be of
any real help. The reason I like the Morse book is
that you can just read it; it has a very conversa-
tional style, it is very lucid, it tells you what
you really need to know, and a little bit more
which is by way of background; because nothing
really gets belabored too much, you can gracefully
forget the things you don't use. And, I very much
recommend READING Morse rather than studying it.
Get the big picture at this point.
Now, you want to concentrate on those things which
are worth fixing in memory. After you read Morse,
you should relate what you have learned to this
outline:
1. You want to fix in your mind the idea of the
four segment registers CODE, DATA, STACK, and
EXTRA. This part is pretty easy to grasp. The
8086 and the 8088 use 20 bit addresses for
memory, meaning that they can address up to 1
megabyte of memory. But, the registers and the
address fields in all the instructions are no
more than 16 bits long. So, how to address all
of that memory? Their solution is to put
together two 16 bit quantities like this:
   calculation     SSSS0   ---- value in the segment
   depicted in                  register SHL 4
   hexadecimal   +  AAAA   ---- apparent address from
                   -----        register or instruction
                   RRRRR   ---- real address placed
                                on address bus
In other words, any time memory is accessed,
your program will supply a sixteen bit address.
Another sixteen bit address is acquired from a
segment register, left shifted four bits (one
nibble) and added to it to form the real
address. You can control the values in the
segment registers and thus access any part
of memory you want. But the segment registers
are specialized: one for code, one for most
data accesses, one for the stack (which we'll
mention again) and one "extra" one for
additional data accesses.
Most people, when they first learn about this
addressing scheme become obsessed with convert-
ing everything to real 20 bit addresses. After
a while, though, you get used to thinking in
segment/offset form. You tend to get your
segment registers set up at the beginning of
the program, change them as little as possible,
and think just in terms of symbolic locations
in your program, as with any assembly language.
     MOV  AX,DATASEG      ;Get segment value of DATASEG
     MOV  DS,AX           ;Set value of Data segment
     ASSUME DS:DATASEG    ;Tell assembler DS is usable
     MOV  AX,PLACE        ;Access storage symbolically
                          ;by 16 bit address
In the above example, the assembler knows that
no special issues are involved because the
machine generally uses the DS register to
complete a normal data reference.
If you had used ES instead of DS in the above
example, the assembler would have known what to
do, also. In front of the MOV instruction
which accessed the location PLACE, it would
have placed the ES segment prefix. This would
tell the machine that ES should be used,
instead of DS, to complete the address.
Some conventions make it especially easy to
forget about segment registers. For example,
any program of the COM type gets control with
all four segment registers containing the same
value. This program executes in a simplified
64K address space. You can go outside this
address space if you want but you don't have to.
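To make the segment arithmetic in point 1 concrete, here is a minimal sketch written in Python rather than assembler. It models only the calculation, not the machine; the function name real_address and the sample segment values are invented for the illustration.

```python
# Model of 8086 address formation. Names here are illustrative only.

def real_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left one hex digit (4 bits), add the offset,
    # and keep 20 bits, since the 8086/8088 has only 20 address lines.
    return ((segment << 4) + offset) & 0xFFFFF


# Different segment/offset pairs can name the same real address.
a = real_address(0x1234, 0x0010)   # 12340 + 0010
b = real_address(0x1235, 0x0000)   # 12350 + 0000
print(hex(a), hex(b))              # 0x12350 0x12350

# A COM-style program: with all segment registers equal, every 16-bit
# offset lands inside one 64K window that starts at segment * 16.
window_start = real_address(0x2000, 0x0000)
window_end = real_address(0x2000, 0xFFFF)
print(hex(window_start), hex(window_end))   # 0x20000 0x2ffff
```

The sketch also shows why you rarely need the 20 bit form in practice: once the segment value is fixed, only the 16 bit offset varies.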
2. You will want to learn what other registers are
available and learn their personalities:
- AX and DX are general purpose registers.
  They become special only when accessing
  machine and system interfaces.
- CX is a general purpose register which is
  slightly specialized for counting.
- BX is a general purpose register which is
  slightly specialized for forming
  base-displacement addresses.
- AX-DX can be divided in half, forming AH, AL,
  BH, BL, CH, CL, DH, DL.
- SI and DI are strictly 16 bit. They can be
  used to form indexed addresses (like BX) and
  they are also used to point to strings.
- SP is hardly ever manipulated. It is there
  to provide a stack.
- BP is a manipulable cousin to SP. Use it to
  access data which has been pushed onto the
  stack.
- Most sixteen bit operations are legal (even
  if unusual) when performed in SI, DI, SP, or
  BP.
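The way AX through DX divide into halves can be pictured with a small Python model. This is only a sketch of the idea; the class name Reg16 and its attribute names are made up for the example and correspond to nothing in the assembler.

```python
class Reg16:
    """A 16-bit register whose two halves can be read and written alone."""

    def __init__(self, value=0):
        self.value = value & 0xFFFF

    @property
    def high(self):                 # plays the part of AH, BH, CH or DH
        return self.value >> 8

    @high.setter
    def high(self, byte):
        self.value = ((byte & 0xFF) << 8) | (self.value & 0x00FF)

    @property
    def low(self):                  # plays the part of AL, BL, CL or DL
        return self.value & 0xFF

    @low.setter
    def low(self, byte):
        self.value = (self.value & 0xFF00) | (byte & 0xFF)


ax = Reg16(0x1234)
ax.low = 0xFF       # like MOV AL,0FFH : AH is untouched
ax.high = 0x09      # like MOV AH,9    : AL is untouched
print(hex(ax.value))   # 0x9ff
```

Writing one half never disturbs the other, which is why a function code can sit in AH while a data byte sits in AL.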
3. You will want to learn the classifications of
operations available WITHOUT getting hung up in
the details of how 8086 opcodes are constructed.
8086 opcodes are complex. Fortunately, the
assembler opcodes used to assemble them are
simple. When you read a book like Morse, you
will learn some things which are worth knowing
but NOT worth dwelling on.
a. 8086 and 8088 instructions can be broken up
into subfields and bits with names like R/M,
MOD, S and W. These parts of the instruction
modify the basic operation in such ways as
whether it is 8 bit or 16 bit, and, if 16 bit,
whether all 16 bits of the data are given;
whether the instruction is register to
register, register to memory, or memory to
register; for operands which are registers,
which register; for operands which are memory,
what base and index registers should be used in
finding the data.
b. Also, some instructions are actually repre-
sented by several different machine opcodes
depending on whether they deal with immediate
data or not, or on other issues, and there are
some expedited forms which assume that one of
the arguments is the most commonly used ope-
rand, like AX in the case of arithmetic.
There is no point in memorizing this detail;
just distill the bottom line, which is, what
kinds of operand combinations EXIST in the
instruction set and what kinds don't. If you
ask the assembler to ADD two things and they
are things for which there is a legal ADD
instruction somewhere in the instruction set,
the assembler will find the right instruction
and fill in all the modifier fields for you.
I guess if you memorized all the opcode con-
struction rules you might have a crack at being
able to disassemble hex dumps by eye, like you
may have learned to do somewhat with 370 assem-
bler. I submit to you that this feat, if ever
mastered by anyone, would be in the same class
as playing the "Minute Waltz" in a minute.
Here is the basic matrix you should remember:
Two operands:            One operand:
  R <-- M                  R
  M <-- R                  M
  R <-- R                  S *
  R or M <-- I
  R or M <-- S *
  S <-- R or M *

  * -- data moving instructions (MOV, PUSH, POP) only
  S -- segment register (CS, DS, ES, SS)
  R -- ordinary register (AX, BX, CX, DX, SI, DI,
       BP, SP, AH, AL, BH, BL, CH, CL, DH, DL)
  M -- one of the following
       pure address
       [BX]+offset
       [BP]+offset
       any of the above indexed by SI
       any of the first three indexed by DI
4. Of course, you want to learn the operations
themselves. As I've suggested, you want to
learn the op codes as the assembler presents
them, not as the CPU machine language presents
them. So, even though there are many MOV op
codes you don't need to learn them. Basically,
here is the instruction set:
a. Ordinary two operand instructions. These
instructions perform an operation and leave the
result in place of one of the operands. They
include:
1) ADD and ADC -- addition, with or without
including a carry from a previous addition
2) SUB and SBB -- subtraction, with or without
including a borrow from a previous subtraction
3) CMP -- compare. It is useful to think of
this as a subtraction with the answer thrown
away and neither operand actually changed
4) AND, OR, XOR -- typical boolean operations
5) TEST -- like an AND, except the answer is
thrown away and neither operand is changed.
6) MOV -- move data from source to target
7) LDS, LES, LEA -- some specialized forms of
MOV with side effects
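The pairing of ADD with ADC (and SUB with SBB) is easiest to appreciate in multi-word arithmetic. The following Python sketch models 16-bit operations and chains them into a 32-bit addition; it also treats CMP as a subtraction whose answer is thrown away. The function names are invented for the illustration and only the carry and zero flags are modeled.

```python
MASK16 = 0xFFFF


def add16(a, b, carry_in=0):
    """Return (16-bit result, carry out), as ADD (carry_in=0) or ADC would."""
    total = a + b + carry_in
    return total & MASK16, total >> 16


def sub16(a, b, borrow_in=0):
    """Return (16-bit result, borrow out), as SUB or SBB would."""
    diff = a - b - borrow_in
    return diff & MASK16, 1 if diff < 0 else 0


def add32(a, b):
    """Add two 32-bit values one 16-bit word at a time."""
    low, carry = add16(a & MASK16, b & MASK16)      # ADD on the low words
    high, carry = add16(a >> 16, b >> 16, carry)    # ADC on the high words
    return (high << 16) | low, carry


def cmp16(a, b):
    """CMP: subtract, discard the answer, keep only the flags."""
    result, borrow = sub16(a, b)
    return {"zero": result == 0, "borrow": bool(borrow)}


print(hex(add32(0x0001FFFF, 0x00000001)[0]))   # 0x20000
print(cmp16(5, 5), cmp16(3, 5))
```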
b. Ordinary one operand instructions. These
can take any of the operand forms described
above. Usually, they perform the operation and
leave the result in the stated place:
1) INC -- increment contents
2) DEC -- decrement contents
3) NEG -- twos complement
4) NOT -- ones complement
5) PUSH -- value goes on stack (operand
location itself unchanged)
6) POP -- value taken from stack, replaces
current value
c. Now you touch on some instructions which do
not follow the general operand rules but which
require the use of certain registers. The
important ones are
1) The multiply and divide instructions
2) The "adjust" instructions which help in
performing arithmetic on ASCII or packed
decimal data
3) The shift and rotate instructions. These
have a restriction on the second operand:
it must either be the immediate value 1 or
the contents of the CL register.
4) IN and OUT which receive data from or send
data to one of the 1024 hardware ports.
5) CBW and CWD -- convert byte to word or word
to doubleword by sign extension
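Sign extension, the job done by CBW and CWD, can be modeled in a few lines of Python. This is a sketch of the arithmetic only; the lower-case function names are chosen for the example, and register contents are passed and returned as plain integers.

```python
def cbw(al):
    """Sign-extend the byte in AL to a word in AX."""
    al &= 0xFF
    return (al | 0xFF00) if (al & 0x80) else al


def cwd(ax):
    """Sign-extend the word in AX to a doubleword; returns (DX, AX)."""
    ax &= 0xFFFF
    dx = 0xFFFF if (ax & 0x8000) else 0x0000
    return dx, ax


print(hex(cbw(0xFE)))    # 0xfffe : -2 as a byte becomes -2 as a word
print(hex(cbw(0x7F)))    # 0x7f   : positive values are unchanged
print(cwd(0x8000))       # (65535, 32768)
```

Note that both instructions work only on AL/AX and DX, which is exactly the kind of register restriction this group of instructions carries.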
d. Flow of control instructions. These
deserve study in themselves and we will discuss
them a little more. They include
1) CALL, RET -- call and return
2) INT, IRET -- interrupt and
return-from-interrupt
3) JMP -- jump or "branch"
4) LOOP, LOOPNZ, LOOPZ -- special (and useful)
instructions which implement a counted loop
similar to the 370 BCT instruction
5) various conditional jump instructions
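The counting behavior of LOOP is worth seeing once in slow motion. The Python sketch below models a block of code that ends in a LOOP back to its own top; run_loop and its arguments are names invented for the illustration.

```python
def run_loop(cx, body):
    """Model of   TOP: <body> ... LOOP TOP.   Returns the iteration count."""
    iterations = 0
    while True:
        body()
        iterations += 1
        cx = (cx - 1) & 0xFFFF    # LOOP decrements CX first...
        if cx == 0:               # ...and falls through once CX is zero
            return iterations


print(run_loop(5, lambda: None))   # 5
print(run_loop(0, lambda: None))   # 65536
```

Because the decrement comes before the test, entering the loop with CX equal to zero runs the body 65536 times rather than not at all, so a count that might be zero needs to be checked before the loop is entered.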
e. String instructions. These implement a
limited storage-to-storage instruction subset
and are quite powerful. All of them have the
following properties:
1) The source of data is described by the
combination DS and SI.
2) The destination of data is described by the
combination ES and DI.
3) As part of the operation, the SI and/or DI
register(s) is(are) incremented or decremented
so the operation can be repeated.
The string instructions are:
1) CMPSB/CMPSW -- compare byte or word
2) LODSB/LODSW -- load byte or word
into AL or AX
3) STOSB/STOSW -- store byte or word
from AL or AX
4) MOVSB/MOVSW -- move byte or word
5) SCASB/SCASW -- compare byte or word with
contents of AL or AX
6) REP/REPE/REPNE -- a prefix which can be
combined with any of the above instructions
to make them execute repeatedly across a
string of data whose length is held in CX.
f. Flag instructions: CLI, STI, CLD, STD, CLC,
STC. These can set or clear the interrupt
(enabled), direction (for string operations), or
carry flags.
The addressing summary and the instruction
summary given above mask a lot of annoying
little exceptions. For example, you can't
POP CS, and although the R <-- M form of LES
is legal, the M <-- R form isn't, and so on. My
advice:
a. Go for the general rules
b. Don't try to memorize the exceptions
c. Rely on common sense and the assembler to
teach you about exceptions over time. A lot of
the exceptions cover things you wouldn't want
to do anyway.
5. A few instructions are rich enough and useful
enough to warrant careful study. Here are a
few final study guidelines:
a. It is well worth the time learning to use
the string instruction set effectively. Among
the most useful are
REP MOVSB ;moves a string
REP STOSB ;initializes memory
REPNE SCASB ;look up occurrence of
            ;character in string
REPE CMPSB ;compare two strings
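What the REP prefix does with SI, DI and CX can be sketched in Python. This is a simplified model under stated assumptions: memory is one flat bytearray, so the DS and ES halves of the addresses are ignored, and the function names simply echo the instructions they imitate.

```python
def rep_movsb(mem, si, di, cx, df=0):
    """REP MOVSB: copy CX bytes from [SI] to [DI]; returns (SI, DI, CX)."""
    step = -1 if df else 1          # direction flag: CLD counts up, STD down
    while cx != 0:
        mem[di] = mem[si]
        si += step
        di += step
        cx -= 1
    return si, di, cx


def rep_stosb(mem, di, cx, al):
    """REP STOSB with the direction flag clear: fill CX bytes with AL."""
    while cx != 0:
        mem[di] = al
        di += 1
        cx -= 1
    return di, cx


def repne_scasb(mem, di, cx, al):
    """REPNE SCASB, direction flag clear: scan up to CX bytes for AL.

    Returns (DI, CX, found). On a match DI is one PAST the matching byte.
    """
    found = False
    while cx != 0:
        found = mem[di] == al
        di += 1
        cx -= 1
        if found:
            break
    return di, cx, found


mem = bytearray(b"HELLO, WORLD" + bytes(20))
rep_movsb(mem, si=0, di=16, cx=5)            # copy "HELLO" to offset 16
di, cx, found = repne_scasb(mem, di=0, cx=12, al=ord(","))
print(bytes(mem[16:21]), di - 1, found)      # b'HELLO' 5 True
```

The detail most worth remembering from the scan is that DI has already moved past the byte that matched, so the position of the character is DI minus one.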
b. Similarly, if you have never written for a
stack machine before, you will need to exercise
PUSH and POP and get very comfortable with them
because they are going to be good friends. If
you are used to the 370, with lots of general
purpose registers, you may find yourself
feeling cramped at first, with many fewer
registers and many instructions having register
restrictions. But, you have a hidden ally:
you need a register and you don't want to throw
away what's in it? Just PUSH it, and when you
are done, POP it back. This can lead to abuse.
Never have more than two "expedient" PUSHes in
effect and never leave something PUSHed across
a major header comment or for more than 15
instructions or so. An exception is the saving
and restoring of registers at entrance to and
exit from a subroutine; here, if the subroutine
is long, you should probably PUSH everything
which the caller may need saved, whether you
will use the register or not, and POP it in
reverse order at the end.
Be aware that CALL and INT push return address
information on the stack and RET and IRET pop
it off. It is a good idea to become familiar
with the structure of the stack.
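The structure of the stack is simple enough to model in a few lines of Python. The sketch below is a toy, not a description of any particular program: the class name Stack8086, the 256-byte size and the register values are all invented for the illustration. It shows that SP moves down by two on each PUSH and that registers come back in the reverse of the order in which they were saved.

```python
class Stack8086:
    """A toy stack: SP moves down by 2 per push; words are stored low byte first."""

    def __init__(self, size=0x100):
        self.mem = bytearray(size)
        self.sp = size                  # empty stack: SP just past the top

    def push(self, word):
        self.sp -= 2                    # PUSH decrements SP first...
        self.mem[self.sp] = word & 0xFF             # ...then stores low byte
        self.mem[self.sp + 1] = (word >> 8) & 0xFF  # and high byte

    def pop(self):
        word = self.mem[self.sp] | (self.mem[self.sp + 1] << 8)
        self.sp += 2                    # POP reads the word, then bumps SP
        return word


stack = Stack8086()
ax, bx = 0x1111, 0x2222
stack.push(ax)          # PUSH AX
stack.push(bx)          # PUSH BX
ax, bx = 0, 0           # both registers are now free for other work
bx = stack.pop()        # POP BX   (reverse order)
ax = stack.pop()        # POP AX
print(hex(ax), hex(bx), hex(stack.sp))   # 0x1111 0x2222 0x100
```

A near CALL behaves like a PUSH of the offset of the next instruction, and a near RET like a POP of that offset, so an unbalanced PUSH inside a subroutine sends RET to the wrong place.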
c. In practice, to invoke system services you
will use the INT instruction. It is quite
possible to use this instruction effectively in
a cookbook fashion without knowing precisely
how it works.
d. The transfer of control instructions (CALL,
RET, JMP) deserve careful study to avoid
confusion. You will learn that these can be
classified as follows:
1) all three have the capability of being
either NEAR (CS register unchanged)
or FAR (CS register changed)
2) JMPs and CALLs can be DIRECT (target is
assembled into instruction) or INDIRECT
(target fetched from memory or register)
3) if NEAR and DIRECT, a JMP can be SHORT
(less than 128 bytes away) or LONG
In general, the third issue is not worth
worrying about. On a forward jump which is
clearly VERY short, you can tell the assembler
it is short and save one byte of code:
JMP SHORT CLOSEBY
On a backward jump, the assembler can figure it
out for you. On a forward jump of dubious
length, let the assembler default to a LONG
form; at worst you waste one byte.
Also leave the assembler to worry about how the
target address is to be represented, in
absolute form or relative form.
e. The conditional jump set can be confusing
when studied apart from the assembler, but you
do need to get a feeling for it. The inter-
actions of the sign, carry, and overflow flags
can get your mind stuttering pretty fast if you
worry about it too much. What it boils down
to, though, is
JZ          means what it says
JNZ         means what it says
JG reater   this means "if the SIGNED
            difference is positive"
JA bove     this means "if the UNSIGNED
            difference is positive"
JL ess      this means "if the SIGNED
            difference is negative"
JB elow     this means "if the UNSIGNED
            difference is negative"
JC arry     assembles the same as JB; it's an
            alternate mnemonic
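For anyone who does want to see how the flags drive these jumps, the following Python sketch models a 16-bit CMP followed by each test. The flag conditions used are the standard 8086 definitions; the function names cmp_flags and jumps_taken are invented for the illustration.

```python
def cmp_flags(a, b):
    """Flags left by CMP a,b on 16-bit operands (a - b, result discarded)."""
    a &= 0xFFFF
    b &= 0xFFFF
    result = (a - b) & 0xFFFF
    zf = result == 0
    cf = a < b                            # unsigned borrow
    sf = bool(result & 0x8000)            # sign bit of the 16-bit result
    # Signed overflow: the operands had different signs and the sign of
    # the result differs from the sign of the first operand.
    of = bool((a ^ b) & (a ^ result) & 0x8000)
    return zf, cf, sf, of


def jumps_taken(a, b):
    """Which conditional jumps would be taken after CMP a,b."""
    zf, cf, sf, of = cmp_flags(a, b)
    return {
        "JZ": zf,
        "JNZ": not zf,
        "JG": (not zf) and sf == of,      # signed greater
        "JL": sf != of,                   # signed less
        "JA": (not cf) and (not zf),      # unsigned above
        "JB": cf,                         # unsigned below; JC is the same test
    }


# 0FFFFH is 65535 unsigned but -1 signed, so compared with 1 it is
# "above" and "less" at the same time.
taken = jumps_taken(0xFFFF, 0x0001)
print(taken["JA"], taken["JL"], taken["JG"], taken["JB"])   # True True False False
```

The example is the whole point of having two families of jumps: the same bit pattern compares differently depending on whether you regard it as signed or unsigned.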
You should understand that all conditional
jumps are inherently DIRECT, NEAR, and "short";
the "short" part means that they can't go more
than 128 bytes in either direction. Again,
this is something you could easily imagine to
be more of a problem than it is. I follow this
approach:
1) When taking an abnormal exit from a block
of code, I always use an unconditional jump.
Who knows how far you are going to end up
jumping by the time the program is finished.
For example, I wouldn't code this:
TEST AL,IDIBIT ;Is the idiot bit on?
JNZ OYVEY ;Yes. Go to cleanup
Rather, I would probably code this:
TEST AL,IDIBIT ;Is the idiot bit on?
JZ NOIDIOCY ;No. I am saved.
JMP OYVEY ;Yes. What can we say...
The latter, of course, is a jump around a
jump. Some would say it is evil, but I
submit it is hard to avoid in this language.
2) Otherwise, within a block of code, I use
conditional jumps freely. If the block
eventually grows so long that the assembler
starts complaining that my conditional jumps
are too long, I
a) consider reorganizing the block but
b) also consider changing some conditional
jumps to their opposite and use the "jump
around a jump" approach as shown above.
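The range limit on conditional jumps comes from a one-byte signed displacement that is measured from the instruction following the jump. The Python sketch below shows the arithmetic the assembler has to do; the function name short_displacement and the sample addresses are invented for the illustration.

```python
def short_displacement(jump_address, target_address, length=2):
    """Displacement byte for a 2-byte short or conditional jump, or None.

    The displacement is counted from the instruction FOLLOWING the jump,
    which is why the length of the jump itself enters the calculation.
    """
    displacement = target_address - (jump_address + length)
    if -128 <= displacement <= 127:
        return displacement & 0xFF     # stored as one signed byte
    return None                        # too far: use a jump around a jump


print(short_displacement(0x0100, 0x0181))   # 127, the farthest forward
print(short_displacement(0x0100, 0x0182))   # None
```

When the function returns None, the assembler is in the position of complaining about the jump, and reversing the condition around an unconditional JMP is the usual cure.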
Enough about specific instructions!
6. Finally, in order to use the assembler
effectively, you need to know the default rules
for which segment registers are used to
complete addresses in which situations.
a. CS is used to complete an address which is
the target of a NEAR DIRECT jump. On a NEAR
INDIRECT jump, DS is used to fetch the address
from memory but then CS is used to complete the
address thus fetched. On FAR jumps, of course,
CS is itself altered. The instruction counter
is always implicitly pointing in the code
segment.
b. SS is used to complete an address if BP is
used in its formation. Otherwise, DS is always
used to complete a data address.
c. On the string instructions, the target is
always formed from ES and DI. The source is
normally formed from DS and SI. If there is a
segment prefix, it overrides the source, not the
target.
Learning about DOS
I think the best way to learn about DOS internals
is to read the technical appendices in the manual.
These are not as complete as we might wish, but
they really aren't bad; I certainly have learned a
lot from them. What you don't learn from them you
might eventually learn via judicious disassembly of
parts of DOS, but that shouldn't really be necessary.
From reading the technical appendices, you learn
that interrupts 20H through 27H are used to
communicate with DOS. Mostly, you will use
interrupt 21H, the DOS function manager.
The function manager implements a great many
services. You request the individual services by
means of a function code in the AH register. For
example, by putting a nine in the AH register and
issuing interrupt 21H you tell DOS to print a
message on the console screen.
Usually, but by no means always, the DX register is
used to pass data for the service being requested.
For example, on the print message service just
mentioned, you would put the 16 bit address of the
message in the DX register. The DS register is
also implicitly part of this argument, in keeping
with the universal segmentation rules.
In understanding DOS functions, it is useful to
understand some history and also some of the
philosophy of MS-DOS with regard to portability.
Generally, you will find, once you read the
technical information on DOS and also the IBM
technical reference, you will know more than one
way to do almost anything. Which is best? For
example, to do asynch adapter I/O, you can use the
DOS calls (pretty incomplete), you can use BIOS, or
you can go directly to the hardware. The same
thing is true for most of the other primitive I/O
(keyboard or screen) although DOS is more likely to
give you added value in these areas. When it comes
to file I/O, DOS itself offers more than one
interface. For example, there are four calls which
read data from a file.
The way to decide rationally among the alternatives
is by understanding the tradeoffs of functionality
versus portability. Three kinds of portability
need to be considered: machine portability, oper-
ating system portability (for example, the ability
to assemble and run code under CP/M 86) and DOS
version portability (the ability for a program to
run under older versions of DOS).
Most of the functions originally offered in DOS 1.0
were direct descendants of CP/M functions; there is
even a compatibility interface so that programs
which have been translated instruction for instruc-
tion from 8080 assembler to 8086 assembler might
have a reasonable chance of running if they use
only the core CP/M function set. Among the most
generally useful in this original compatibility set
are:
09 - print a full message on the screen
0A - get a console input line with full DOS editing
0F - open a file
10 - close a file (really needed only when writing)
11 - find first file matching a pattern
12 - find next file matching a pattern
13 - erase a file
16 - create a file
17 - rename a file
1A - set disk transfer address
The next set provides no function above what you can
get with BIOS calls or more specialized DOS calls.
However, they are preferable to BIOS calls when
portability is an issue.
00 - terminate execution
01 - read keyboard character
02 - write screen character
03 - read COM port character
04 - write COM port character
05 - print a character
06 - read keyboard or write screen with no editing
The standard file I/O calls are inferior to the
specialized DOS calls but have the advantage of
making the program easier to port to CP/M style
systems. Thus they are worth mentioning:
14 - sequential read from file
15 - sequential write to file
21 - random read from file
22 - random write to file
23 - determine file size
24 - set random record
In addition to the CP/M compatible services, DOS
also offers some specialized services which have
been available in all releases of DOS. These
include:
27 - multi-record random read.
28 - multi-record random write.
29 - parse filename
2A-2D - get and set date and time
All of the calls mentioned above which have any-
thing to do with files make use of a data area
called the "FILE CONTROL BLOCK" (FCB). The FCB is
anywhere from 33 to 37 bytes long depending on how
it is used. You are responsible for creating an
FCB and filling in the first 12 bytes, which
contain a drive code, a file name, and an
extension.
When you open the FCB, the system fills in the next
20 bytes, which include a logical record length.
The initial lrecl is always 128 bytes, to achieve
CP/M compatibility. The system also provides other
useful information such as the file size.
After you have opened the FCB, you can change the
logical record length. If you do this, your prog-
ram is no longer CP/M compatible, but that doesn't
make it a bad thing to do. DOS documentation
suggests you use a logical record length of one for
maximum flexibility. This is usually a good
idea.
To perform actual I/O to a file, you eventually
need to fill in byte 33 or possibly bytes 34-37 of
the FCB. Here you supply information about the
record you are interested in reading or writing.
For the most part, this part of the interface is
compatible with CP/M.
In general, you do not need to (and should not)
modify other parts of the FCB.
The FCB is pretty well described in appendix E of
the DOS manual.
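As a rough picture of the part of the FCB you fill in yourself, here is a Python sketch that lays out the first 12 bytes and leaves the rest zero for the system. The helper name make_fcb is invented for the example. The field details it relies on (drive code 0 for the default drive and 1 for A:, an 8 character name and a 3 character extension, both upper case and blank padded) are the usual FCB conventions, stated here as assumptions that should be checked against appendix E.

```python
FCB_SIZE = 37          # 33 bytes for sequential use, 37 with the random field


def make_fcb(filename, drive=0):
    """Build an unopened FCB for a name such as "REPORT.TXT".

    drive: 0 = default drive, 1 = A:, 2 = B:, and so on (assumed convention).
    """
    name, _, ext = filename.upper().partition(".")
    if not 1 <= len(name) <= 8 or len(ext) > 3:
        raise ValueError("expected a name of the form NNNNNNNN.EEE")
    fcb = bytearray(FCB_SIZE)                   # everything else starts zero
    fcb[0] = drive                              # byte 1: drive code
    fcb[1:9] = name.ljust(8).encode("ascii")    # bytes 2-9: name, blank padded
    fcb[9:12] = ext.ljust(3).encode("ascii")    # bytes 10-12: extension
    return fcb


fcb = make_fcb("report.txt", drive=1)
print(bytes(fcb[:12]))     # b'\x01REPORT  TXT'
```

In an assembler program the same layout would normally be reserved with DB directives in the data segment and its address passed in DX on the open call.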
Beginning with DOS 2.0, there is a whole new system
of calls for managing files which don't require
that you build an FCB at all. These calls are
quite incompatible with CP/M and also mean that
your program cannot run under older releases of
DOS. However, these calls are very nice and easy
to use. They have these characteristics
1. To open, create, delete, or rename a file, you
need only a character string representing its
name.
2. The open and create calls return a 16 bit value
which is simply placed in the BX register on
subsequent calls to refer to the file.
3. There is not a separate call required to
specify the data buffer.
4. Any number of bytes can be transferred on a
single call; no data area must be manipulated
to do this.
The "new" DOS calls also include comprehensive
functions to manipulate the new chained directory
structure and to allocate and free memory.
[We'll conclude this superb article next month]
[when Joshua Auerbach tells us about ]
[ Learning the assembler ]
[ What about subroutines? ]
[ Learning about BIOS and the hardware ]
[ A final example Ed.]