Dec 07 2017
Randy Hyde’s Standard Library for 8086 Assembly Language Programmers.

File Name | File Size | Zip Size | Zip Type |
---|---|---|---|
ADDCHAR.ASM | 477 | 265 | deflated |
ADDSTR.ASM | 698 | 361 | deflated |
ADDSTRL.ASM | 892 | 438 | deflated |
AT.BAT | 85 | 79 | deflated |
ATOH.ASM | 2162 | 738 | deflated |
ATOI.ASM | 1703 | 660 | deflated |
ATOL.ASM | 1872 | 711 | deflated |
COPYSET.ASM | 863 | 394 | deflated |
CRSETS.ASM | 988 | 463 | deflated |
CTYPES.ASM | 841 | 273 | deflated |
DIFFSET.ASM | 731 | 366 | deflated |
EMPTYSET.ASM | 0 | 0 | stored |
GETC.ASM | 2822 | 937 | deflated |
GETS.ASM | 1600 | 713 | deflated |
HTOA.ASM | 2312 | 819 | deflated |
INTERSET.ASM | 726 | 357 | deflated |
ISIZE.ASM | 1111 | 407 | deflated |
ITOA.ASM | 2250 | 842 | deflated |
LSIZE.ASM | 2106 | 610 | deflated |
LTOA.ASM | 2588 | 936 | deflated |
LWRTBL.ASM | 417 | 216 | deflated |
MEMBER.ASM | 607 | 308 | deflated |
MEMORY.ASM | 24095 | 6385 | deflated |
ML.BAT | 82 | 78 | deflated |
NEXTITEM.ASM | 547 | 298 | deflated |
PRINT.ASM | 416 | 244 | deflated |
PRINTF.ASM | 14719 | 3758 | deflated |
PROBE.CFG | 46 | 43 | deflated |
PUTC.ASM | 3244 | 1022 | deflated |
PUTH.ASM | 630 | 268 | deflated |
PUTI.ASM | 819 | 360 | deflated |
PUTISIZE.ASM | 1492 | 524 | deflated |
PUTL.ASM | 1224 | 487 | deflated |
PUTLSIZE.ASM | 1686 | 607 | deflated |
PUTS.ASM | 446 | 241 | deflated |
RANGESET.ASM | 676 | 339 | deflated |
RMVCHAR.ASM | 490 | 273 | deflated |
RMVITEM.ASM | 614 | 333 | deflated |
RMVSTR.ASM | 698 | 371 | deflated |
RMVSTRL.ASM | 0 | 0 | stored |
SCANF.ASM | 10637 | 2774 | deflated |
SHELL.A | 1094 | 561 | deflated |
SPRINTF.ASM | 16024 | 4003 | deflated |
STDIO.A | 5037 | 491 | deflated |
STDLIB.A | 20959 | 2438 | deflated |
STDLIB.LIB | 37376 | 13427 | deflated |
STDLIB.TXT | 106402 | 23429 | deflated |
STRCAT.ASM | 893 | 415 | deflated |
STRCAT2.ASM | 1983 | 740 | deflated |
STRCAT2L.ASM | 2204 | 817 | deflated |
STRCATL.ASM | 999 | 473 | deflated |
STRCHR.ASM | 856 | 387 | deflated |
STRCMP.ASM | 1906 | 828 | deflated |
STRCMPL.ASM | 2058 | 810 | deflated |
STRCPY.ASM | 956 | 457 | deflated |
STRCPYL.ASM | 940 | 437 | deflated |
STRCSPAN.ASM | 1297 | 552 | deflated |
STRCSPN2.ASM | 1360 | 588 | deflated |
STRCSPNL.ASM | 1425 | 608 | deflated |
STRDUP.ASM | 1046 | 459 | deflated |
STRDUPL.ASM | 1162 | 510 | deflated |
STRICMP.ASM | 2069 | 856 | deflated |
STRICMPL.ASM | 2541 | 940 | deflated |
STRLEN.ASM | 511 | 254 | deflated |
STRLWR.ASM | 1037 | 474 | deflated |
STRSET.ASM | 654 | 348 | deflated |
STRSET2.ASM | 1113 | 532 | deflated |
STRSPAN.ASM | 1251 | 536 | deflated |
STRSPANL.ASM | 1402 | 594 | deflated |
STRSTR.ASM | 1492 | 601 | deflated |
STRSTRL.ASM | 1595 | 630 | deflated |
STRUPR.ASM | 1017 | 460 | deflated |
TEST.ASM | 27960 | 5893 | deflated |
TPCREAD.ME | 199 | 165 | deflated |
UNIONSET.ASM | 683 | 347 | deflated |
UPRTBL.ASM | 417 | 213 | deflated |
X.BAT | 26 | 25 | deflated |
Download File RHSTDLIB.ZIP Here
Contents of the STDLIB.TXT file
1 Randy Hyde's Standard Library for 8086 Assembly Language Programmers
This software is ... ShareWare (drawn as a large ASCII-art banner in the
original file) 'cuz I'm sharing it with you!
I do not want any registrations or fees for the use of this software. I
thank God and Jesus Christ (my personal saviour) for giving me the ability to
write such software. God wants all of us to use our talents to glorify him,
therefore I offer this software as such.
Now for the catch... It is more blessed to give than to receive. If this
software saves you time and effort and you enjoy using it, my life will be
enriched knowing that others have appreciated my work. I would like to share
this wonderful feeling with you. If you like this software and use it, I would
like you to contribute at least one routine to the library. Perhaps you think
this library has some neat-o routines in it. Imagine how nice it would become
if everyone used their imagination to contribute something useful to it.
I hereby release this software to the public domain. You can use it in
any way you see fit. However, I would appreciate it if you share this software
with others, much as I've shared it with you. I'm not suggesting that you give
away software you've written with this package (I'm not quite as crazy as
Richard Stallman, bless his heart), but if someone else would like a copy of
this library, please help them out. Naturally, I'd be tickled pink to receive
credit in software that uses these routines (which is the honorable thing to
do) but I understand the way many corporations operate and won't be terribly
put off if you use it without giving due credit. Enjoy!
If you have comments, bug reports, new code to contribute, etc., you can
reach me at:
rhyde (On BIX).
[email protected] (On Internet).
[email protected] (On Internet, this one may go away).
or
Randy Hyde
Dept of Computer Science
2208 Sproul Hall
University of California
Riverside, Ca. 92521-0135
or
Randy Hyde
c/o Braintec Corporation
10 Corporate Park Way, ste 110
Irvine, Ca. 92714
1.1 Comments about the code
This code has received very little testing. C'mon, whadda expect for
free? I've been cranking this stuff out as fast as possible without going back
and reworking anything I've done. The only exception has been modification of
the routines to use the es:di/dx:si register pairs rather than es:si/ds:di
register pairs. I expect those modifications introduced more bugs. Please
don't expect super optimal code here. I haven't had any time to study and improve
this code. Most of it is fairly mediocre (from a size/speed point of view).
Hopefully, you'll agree, it's the idea that counts. If you don't like
something I've done, you've got the sources -- have at it. (Of course, I'd
appreciate it if you would send me any modifications.)
1.2 Wish List
Next, I'll be working on FILE I/O versions of the I/O routines in this
package. Sooner or later I'll get around to adding floating point routines to
this package. If you're interested in adding some routines to this package,
GREAT!
Routines I'd like to have but am too busy to work on now:
1) Routines which manipulate directories (read/write/etc.)
2) A regular expression interpreter.
3) Length-prefixed strings package.
4) A windowing package.
5) A graphics package.
6) An object-oriented programming class library.
7) Just about anything else appearing in a HLL "standard" library.
If you've got any ideas, I'd love to discuss them with you. Best bet is
to reach me electronically at the E-MAIL addresses above.
1.3 Missing Routines to Supply RSN
String package:
strins Inserts one string into the middle of another
strdel Deletes a sequence of characters from the middle of a string.
Character Set Package:
span- Skips through a sequence of characters in a string which belong to
a character set.
break- Skips through a sequence of characters in a string which do not
belong to a character set.
Memory Manager Package
Memavail- Largest block of free memory available on the heap.
Memfree- Total amount of free space on the heap.
2 Character Output Routines
2.1 Putc
* Outputs character in AL register to the standard output device.
* Output is redirectable to user-written routine.
Inputs: AL- character to print.
Outputs: None.
Include: stdlib.a
Putc is the primitive character output routine. Most other output routines in
the standard library output data through this procedure. Prints the
ASCII character in AL. Processing of control codes is undefined
although most output routines this guy links to should be able to
handle return, line feed, back space, and tab. By default, this
routine calls DOS to print the character to the standard output
device.
Example:
mov al, 'C'
putc ;Prints "C" to std output.
2.2 PutCR
* Easy way of printing a newline to the stdlib standard output.
Inputs: None.
Outputs: None.
Include: stdlib.a
Prints a newline (carriage return/line feed) to the current standard
output device.
Example:
PutCR
2.3 PutcStdOut
* Outputs character in AL to the DOS standard output device.
* Sends a character directly to the DOS std output device.
* Output is redirectable via DOS I/O redirection.
* Bypasses redirection through the standard library Putc routine.
Inputs: AL- character to output.
Outputs: None.
Include: stdlib.a
PutcStdOut calls DOS to print the character in AL to the standard output
device. Although processing of non-ASCII characters and control characters is
undefined, most output devices handle these characters properly. In
particular, most output devices properly handle return, line feed, back space,
and tab.
Example:
mov al, 'C'
PutcStdOut ;Writes "C" to std output.
2.4 PutcBIOS
* Prints character in AL to the display device by calling BIOS.
* Cannot be redirected by stdlib or by DOS.
* Uses INT 10h/AH=14 (0Eh) for teletype-like output.
* Handles return, line feed, back space, and tab. Prints other
control characters using the IBM Character set.
Inputs: AL- Character to print.
Outputs: None.
Include: stdlib.a
PutcBIOS prints the character in AL using the BIOS routines. Output
through this routine cannot be redirected; such output is always sent to the
video display on the PC (unless, of course, someone has patched INT 10h).
Example:
mov al, "C"
PutcBIOS
2.5 GetOutAdrs
* Retrieves address of the current output routine.
Inputs: None.
Outputs: es:di - address of current output routine (called by Putc).
Include: stdlib.a
You can use this function to get the address of the current output
routine, perhaps so you can save it or see if it is currently pointing at some
particular piece of code. If you want to temporarily redirect the output and
then restore the original output routine, consider using PushOutAdrs/PopOutAdrs
described later.
Example:
GetOutAdrs
mov word ptr SaveOutAdrs, di
mov word ptr SaveOutAdrs+2, es
2.6 SetOutAdrs
* Lets you set the address of the current output routine.
Inputs: es:di- Address of new output routine.
Outputs: None.
Include: stdlib.a
This routine redirects the stdlib standard output so that it calls the
routine whose address you pass in es:di. This routine should expect the
character in AL and must preserve all registers. At a bare minimum, it should
handle the printable ASCII characters and the four control characters return,
line feed, back space, and tab (unless, of course, the main purpose of this
routine is to handle these codes in a different fashion).
Example:
mov di, seg NewOutputRoutine ;The 8086 cannot load ES with an immediate,
mov es, di ;so pass the segment through DI first.
mov di, offset NewOutputRoutine
SetOutAdrs
.
.
.
les di, RoutinePtr
SetOutAdrs
2.7 PushOutAdrs
* Lets you redirect the standard output device and preserve the
previous address.
* Saves up to 16 old output routine addresses on an internal stack.
* Restoration is possible using PopOutAdrs.
Inputs: es:di- Address of new output routine.
Outputs: Carry=0 if operation successful.
Carry=1 if there were already 16 items on the stack.
Include: stdlib.a
This routine "pushes" the current output address onto an internal stack
and then stores the value in es:di into the current output routine pointer.
The PushOutAdrs and PopOutAdrs routines let you easily save and redirect the
standard output and then restore the original output routine address later on.
If you attempt to push more than 16 items on the stack, PushOutAdrs will
ignore your request and return with the carry flag set. If PushOutAdrs is
successful, it will return with the carry flag clear.
Example:
mov di, seg NewOutputRoutine ;Segment goes through DI (no immediate
mov es, di ;loads into segment registers).
mov di, offset NewOutputRoutine
PushOutAdrs
.
.
.
les di, RoutinePtr
PushOutAdrs
2.8 PopOutAdrs
* Restores output routine addresses saved by PushOutAdrs.
* Defaults to PutcStdOut if you attempt to pop too many items off the
stack.
Inputs: None.
Outputs: es:di- Points at the previous stdout routine before the pop.
Include: stdlib.a
PopOutAdrs undoes the effects of PushOutAdrs. It pops an item off the
internal stack and stores it into the output routine pointer. The previous
value in the output pointer is returned in es:di.
Example:
mov di, seg NewOutputRoutine ;Segment goes through DI (no immediate
mov es, di ;loads into segment registers).
mov di, offset NewOutputRoutine
PushOutAdrs
.
.
.
PopOutAdrs
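The push/pop behaviour described in 2.7 and 2.8 can be sketched in C. This is only a model of the documented rules (16 saved addresses, carry set when the stack is full, fall back to PutcStdOut when popping an empty stack); the names push_out_adrs, pop_out_adrs, putc_std_out, and discard_out are hypothetical and do not come from stdlib, and the real routines are 8086 code that may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C model of PushOutAdrs/PopOutAdrs; not the library's code. */
typedef void (*out_fn)(char c);

enum { OUT_STACK_MAX = 16 };

static void putc_std_out(char c) { (void)c; }  /* stands in for PutcStdOut */
static void discard_out(char c)  { (void)c; }  /* sample replacement routine */

static out_fn cur_out = putc_std_out;          /* routine that Putc calls */
static out_fn out_stack[OUT_STACK_MAX];
static size_t out_depth = 0;

/* Returns 0 (carry clear) on success, 1 (carry set) if 16 entries are saved. */
static int push_out_adrs(out_fn new_out)
{
    assert(new_out != NULL);
    if (out_depth == OUT_STACK_MAX)
        return 1;                              /* request is ignored */
    out_stack[out_depth++] = cur_out;
    cur_out = new_out;
    return 0;
}

/* Restores the most recently saved routine and returns the routine that was
   active before the pop. An empty stack falls back to the default routine. */
static out_fn pop_out_adrs(void)
{
    out_fn previous = cur_out;
    cur_out = (out_depth > 0) ? out_stack[--out_depth] : putc_std_out;
    return previous;
}
```

In this model a push followed by a pop leaves cur_out exactly where it started, which is the property the manual relies on for temporary redirection.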
2.9 Puts
* Outputs a string of characters to the stdlib standard output device.
* Calls putc for each character in the string thereby sending each
character out to the standard output device.
Inputs: es:di- Contains the address of the string to print.
Outputs: None.
Include: stdlib.a
Puts prints a zero-terminated string whose address appears in es:di. Each
character appearing in the string is printed verbatim. There are no special
escape characters. Unlike the "C" routine by the same name, puts does not
print a newline after printing the string. Use putcr if you want to print the
newline after printing a string with puts.
Example:
les di, StrToPrt
puts
putcr
2.10 Puth
* Outputs the byte in AL as two hex digits (including leading zero if
necessary).
* Calls stdlib putc routine to print both characters to the stdlib
standard output device.
Inputs: AL- Value to print.
Outputs: None.
Include: Stdlib.a
Prints the value in the AL register as two hexadecimal digits. If the
value in AL is between 0 and 0Fh, puth will print a leading zero. This routine
calls the stdlib standard output routine (putc) to print all characters.
Example:
mov al, 1fh
puth
2.11 Putw
* Outputs the word in AX as four hex digits (including leading zeros
if necessary).
* Calls stdlib putc routine to print characters to the stdlib standard
output device.
Inputs: AX- Value to print.
Outputs: None.
Include: Stdlib.a
Prints the value in the AX register as four hexadecimal digits. If the
value in AX is between 0 and 0FFFh, putw will print leading zeros. This routine
calls the stdlib standard output routine (putc) to print all characters.
Example:
mov ax, 0f1fh
putw
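The fixed-width hex conversion behind Puth and Putw can be modelled in C as follows. This is a sketch under assumptions: hex_digits is a hypothetical helper, and printing upper-case letters is assumed here because the manual does not state the case of the digits A through F.

```c
#include <assert.h>
#include <stdint.h>

/* Writes `value` as exactly `digits` hex digits (2 models Puth, 4 models
   Putw) followed by a NUL. Upper-case output is an assumption. */
static void hex_digits(uint16_t value, int digits, char *out)
{
    static const char tab[] = "0123456789ABCDEF";
    assert(out != 0 && digits >= 1 && digits <= 4);
    for (int i = digits - 1; i >= 0; i--) {
        out[i] = tab[value & 0xF];   /* lowest nibble goes in the last slot */
        value >>= 4;
    }
    out[digits] = '\0';
}
```

Because the loop always fills every position, small values pick up their leading zeros without any special case.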
2.12 Puti
* Outputs the word in AX as a signed decimal number (including minus
sign, if necessary).
* Calls stdlib putc routine to print characters to the stdlib standard
output device.
Inputs: AX- Value to print.
Outputs: None.
Include: Stdlib.a
Prints the value in the AX register as a decimal integer. This routine
uses the exact number of screen positions required to print the number
(including a position for the minus sign, if the number is negative). This
routine calls the stdlib standard output routine (putc) to print all
characters.
Example:
mov ax, -1234
puti
2.13 Putu
* Outputs the word in AX as an unsigned decimal number.
* Calls stdlib putc routine to print characters to the stdlib
standard output device.
Inputs: AX- Value to print.
Outputs: None.
Include: Stdlib.a
Prints the value in the AX register as a decimal integer. This routine
uses the exact number of screen positions required to print the number. This
routine calls the stdlib standard output routine (putc) to print all
characters.
Example:
mov ax, 1234
putu
2.14 Putl
* Outputs the double word in DX:AX as a signed decimal number
(including minus sign, if necessary).
* Calls stdlib putc routine to print characters to the stdlib standard
output device.
Inputs: DX:AX- Value to print.
Outputs: None.
Include: Stdlib.a
Prints the value in the DX:AX registers as a decimal integer. This
routine uses the exact number of screen positions required to print the number
(including a position for the minus sign, if the number is negative). This
routine calls the stdlib standard output routine (putc) to print all
characters.
Example:
mov dx, 0ffffh
mov ax, -1234
putl
2.15 Putul
* Outputs the double word in DX:AX as an unsigned decimal number.
* Calls stdlib putc routine to print characters to the stdlib standard
output device.
Inputs: DX:AX- Value to print.
Outputs: None.
Include: Stdlib.a
Prints the value in the DX:AX registers as a decimal integer. This
routine uses the exact number of screen positions required to print the
number. This routine calls the stdlib standard output routine (putc) to print
all characters.
Example:
mov dx, 12h
mov ax, 1234
putul
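The Putl and Putul examples depend on how DX:AX forms one 32-bit value: DX holds the high word and AX the low word. A small C sketch of that arithmetic follows; the function names are hypothetical, and the signed conversion assumes a two's-complement machine, as on the 8086.

```c
#include <assert.h>
#include <stdint.h>

/* Combines the two 16-bit halves into the unsigned value Putul would print. */
static uint32_t dxax_unsigned(uint16_t dx, uint16_t ax)
{
    return ((uint32_t)dx << 16) | ax;
}

/* Same bits read as a signed value, as Putl would print them
   (assumes two's-complement conversion). */
static int32_t dxax_signed(uint16_t dx, uint16_t ax)
{
    return (int32_t)dxax_unsigned(dx, ax);
}

/* Splits a 32-bit constant into the words to load into DX and AX. */
static void split_dxax(int32_t value, uint16_t *dx, uint16_t *ax)
{
    assert(dx != 0 && ax != 0);
    uint32_t bits = (uint32_t)value;
    *dx = (uint16_t)(bits >> 16);
    *ax = (uint16_t)(bits & 0xFFFFu);
}
```

With DX=0FFFFh and AX=-1234 the signed reading is -1234, and with DX=12h and AX=1234 the unsigned reading is 18*65536+1234 = 1180882.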
2.16 PutISize
* Prints the value in AX as a signed decimal integer.
* Prints the number in a minimum field width specified by the value in
CX.
Inputs: AX- Value to print.
CX- Minimum number of print positions to use.
Outputs: None.
Include: Stdlib.a
PutISize prints the signed integer value in AX to the stdlib standard
output device using a minimum of n print positions. CX contains n, the minimum
field width for the output value. The number (including any necessary minus
sign) is printed right justified in the output field.
If the number in AX requires more print positions than specified by CX,
PutISize uses however many print positions are necessary to actually print the
number. If you specify zero in CX, PutISize uses the minimum number of print
positions required. Of course, PutI will also use the minimum number of print
positions without disturbing the value in the CX register.
Note that under no circumstances will the number in AX ever require more
than six print positions (e.g., -32,768 needs five digits plus a minus sign).
Examples:
mov cx, 5
mov ax, I
PutISize
.
.
.
mov cx, 12
mov ax, J
PutISize
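The minimum-field-width rule for PutISize can be sketched in C. The helper fmt_isize is hypothetical; it writes into a buffer instead of calling putc, and it models only the documented behaviour (right justified, blank padded, never truncated).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats `value` right justified in at least `width` positions.
   `width` plays the role of CX and `value` the role of AX. */
static void fmt_isize(int16_t value, unsigned width, char *out, size_t out_size)
{
    char digits[16];
    assert(out != NULL && out_size > 0);
    int len = snprintf(digits, sizeof digits, "%d", value);
    size_t pad = (width > (unsigned)len) ? width - (unsigned)len : 0;
    assert(pad + (size_t)len < out_size);        /* room for the NUL as well */
    memset(out, ' ', pad);                       /* leading blanks */
    memcpy(out + pad, digits, (size_t)len + 1);  /* digits plus NUL */
}
```

A width smaller than the number's own length yields zero padding, so the full number is always produced.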
2.17 PutUSize
* Prints the value in AX as an unsigned decimal integer.
* Prints the number in a minimum field width specified by the value in
CX.
Inputs: AX- Value to print.
CX- Minimum number of print positions to use.
Outputs: None.
Include: Stdlib.a
Like PutISize above except this guy prints unsigned values. Note that the
maximum number of print positions required by any number (e.g., 65,535) is
five.
Example:
mov cx, 8
mov ax, U
PutUSize
2.18 PutLSize
* Prints the value in DX:AX as a long signed decimal integer.
* Prints the number in a minimum field width specified by the value in
CX.
Inputs: DX:AX- Value to print.
CX- Minimum number of print positions to use.
Outputs: None.
Include: Stdlib.a
Like PutISize above, except this guy prints the long integer value in
DX:AX. Note that the output may require as many as 11 print positions (e.g.,
-1,000,000,000).
Example:
mov cx, 16
mov dx, word ptr L+2
mov ax, word ptr L
PutLSize
2.19 PutULSize
* Prints the value in DX:AX as a long unsigned decimal integer.
* Prints the number in a minimum field width specified by the value in
CX.
Inputs: DX:AX- Value to print.
CX- Minimum number of print positions to use.
Outputs: None.
Include: Stdlib.a
Just like PutLSize above except this guy prints unsigned numbers rather
than signed long integers. The largest field width for such a value is 10
print positions.
Example:
mov cx, 8
mov dx, word ptr UL+2
mov ax, word ptr UL
PutULSize
2.20 Print
* Prints a string literal.
* Very convenient to use.
* Calls stdlib putc routine to print characters to the stdlib standard
output device.
Inputs: CS:RET - Return address points at the string to print.
Outputs: None.
Include: Stdlib.a
Print lets you print string literals in a convenient fashion. The string
to print immediately follows the call to the print routine. The string must
contain a zero terminating byte and may not contain any intervening zero
bytes. Since the print routine returns to the address immediately following
the zero terminating byte, forgetting this byte or attempting to print a zero
byte in the middle of a literal string will cause print to return to an
unexpected instruction. This usually hangs up the machine. Be very careful
when using this routine!
Example:
print
db "Print this string to the display device"
db 13,10
db "This appears on a new line"
db 13,10
db 0
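Why a missing zero byte is so dangerous becomes clearer with a model of the return-address trick. In this C sketch the code stream is a byte array and print_inline is a hypothetical stand-in for Print: it emits bytes up to the terminator and returns the address where execution would resume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* `ret` models the return address, which points at the first byte after the
   call. Returns the address of the byte following the zero terminator. */
static const unsigned char *print_inline(const unsigned char *ret, FILE *out)
{
    assert(ret != NULL && out != NULL);
    while (*ret != 0)
        fputc(*ret++, out);
    return ret + 1;          /* skip the terminator itself */
}
```

If the terminator were absent, the loop would run on into whatever bytes follow the string, and the resume address would land wherever the next zero byte happened to be.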
2.21 Printf
* Formatted output routine.
* Very similar to the "C" function of the same name.
* Prints integers (normal, long, unsigned, etc.), characters, strings,
and other data types (this routine, however, does not support
floating point output).
* Calls stdlib putc routine to print characters to the stdlib standard
output device.
Inputs: CS:RET - Return address points at the format string.
Outputs: None.
Include: Stdlib.a
Printf, like its "C" namesake, provides formatted output capabilities for
the stdlib package. A typical call to printf always takes the following form:
printf
db "format string",0
dd operand1, operand2, ..., operandn
The format string is comparable to the one provided in the "C" programming
language. For most characters, printf simply prints the characters in the
format string up to the terminating zero byte. The two exceptions are
characters prefixed by a backslash ("\") and characters prefixed by a percent
sign ("%"). Like C's printf, stdlib's printf uses the backslash as an escape
character and the percent sign as a lead-in to a format item.
Printf uses the escape character ("\") to print special characters in a
fashion similar to, but not identical to C's printf. Stdlib's printf routine
supports the following special characters:
* \r Print a carriage return (but no line feed)
* \n Print a new line character (carriage return/line feed).
* \b Print a backspace character.
* \t Print a tab character.
* \l Print a line feed character (but no carriage return).
* \f Print a form feed character.
* \\ Print the backslash character.
* \% Print the percent sign character.
* \0xhh Print ASCII code hh, represented by two hex digits.
C users should note a couple of differences between stdlib's escape
sequences and C's. First, use "\%" to print a percent sign within a format
string, not "%%". C doesn't allow the use of "\%" because the C compiler
processes "\%" at compile time (leaving a single "%" in the object code)
whereas printf processes the format string at run-time. It would see a single
"%" and treat it as a format lead-in character. Stdlib's printf, on the other
hand, processes both the "\" and "%" at run-time; therefore it can distinguish
"\%".
Strings of the form "\0xhh" must contain exactly two hex digits. The
current printf routine isn't robust enough to handle sequences of the form
"\0xh" which contain only a single hex digit. Keep this in mind if you find
printf chopping off characters after you print a value.
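The escape sequences listed above can be modelled as a small C decoder. This is a sketch, not stdlib's implementation: expand_escape and hex_val are hypothetical names, accepting hex digits in either case is an assumption, and the model simply rejects a "\0xh" sequence with one hex digit instead of reproducing whatever the real routine does with it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static int hex_val(int c)
{
    return isdigit(c) ? c - '0' : toupper(c) - 'A' + 10;
}

/* Decodes the escape starting at s[0] == '\\'. Stores the bytes to print in
   out[] and their count in *n; returns the number of source characters
   consumed, or 0 (with *n == 0) for a sequence the manual does not list. */
static size_t expand_escape(const char *s, unsigned char out[2], size_t *n)
{
    assert(s != NULL && s[0] == '\\' && n != NULL);
    *n = 1;
    switch (s[1]) {
    case 'r':  out[0] = 13; return 2;
    case 'n':  out[0] = 13; out[1] = 10; *n = 2; return 2;  /* CR + LF */
    case 'b':  out[0] = 8;  return 2;
    case 't':  out[0] = 9;  return 2;
    case 'l':  out[0] = 10; return 2;
    case 'f':  out[0] = 12; return 2;
    case '\\': out[0] = '\\'; return 2;
    case '%':  out[0] = '%'; return 2;
    case '0':
        if (s[2] == 'x' && isxdigit((unsigned char)s[3])
                        && isxdigit((unsigned char)s[4])) {
            out[0] = (unsigned char)(hex_val((unsigned char)s[3]) * 16
                                     + hex_val((unsigned char)s[4]));
            return 5;
        }
        break;
    }
    *n = 0;
    return 0;
}
```

Note that "\n" is the only sequence in the list that produces two output bytes.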
There is absolutely no reason to use any escape character sequences except
"\0x00". Printf grabs all characters following the call to printf up to the
terminating zero byte (which is why you'd need to use "\0x00" if you want to
print the null character; printf cannot print an embedded zero byte). Stdlib's printf
routine doesn't care how those characters got there. In particular, you are
not limited to using a single string after the printf call. The following is
perfectly legal:
printf
db "This is a string",13,10
db "This is on a new line",13,10
db "Print a backspace at the end of this line:"
db 8,13,10,0
Your code will run a tiny amount faster if you avoid the use of the escape
character sequences. More importantly, the escape character sequences take at
least two bytes. You can encode most of them as a single byte by simply
embedding the ASCII code for that byte directly into the code stream. Don't
forget, you cannot embed a zero byte into the code stream. A zero byte
terminates the format string. Instead, use the "\0x00" escape sequence.
Format sequences always begin with "%". For each format sequence you
must provide a far pointer to the associated data immediately following the
format string, e.g.,
printf
db "%i %i",0
dd i,j
Format sequences take the general form "%s\cn^f" where:
* "%" is always the "%" character. Use "\%" if you actually want
to print a percent sign.
* s is either nothing or a minus sign ("-").
* "\c" is also optional, it may or may not appear in the format
item. "c" represents any printable character.
* "n" represents a string of 1 or more decimal digits.
* "^" is just the caret (up-arrow) character.
* "f" represents one of the format characters: i, d, x, h, u, c,
s, ld, li, lx, or lu.
The "s", "\c", "n", and "^" items are optional, the "%" and "f" items must
be present. Furthermore, the order of these items in the format item is very
important. The "\c" entry, for example, cannot precede the "s" entry.
Likewise, the "^" character, if present, must follow everything except the "f"
character(s).
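The ordering rules for "%s\cn^f" are easiest to check against a parser. The C sketch below follows the layout as described here; struct fmt_item and parse_item are hypothetical names, and details the manual leaves open (such as an upper limit on the width) are not modelled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct fmt_item {
    bool left;       /* "-" present: left justify          */
    char pad;        /* padding character, ' ' by default  */
    unsigned width;  /* minimum field width, 0 if absent   */
    bool indirect;   /* "^" present                        */
    bool is_long;    /* ld, li, lx or lu                   */
    char type;       /* i, d, x, h, u, c or s              */
};

/* Parses one item starting at the '%'. Returns the number of characters
   consumed, or 0 if the text does not follow the %s\cn^f layout. */
static size_t parse_item(const char *p, struct fmt_item *it)
{
    const char *start = p;
    assert(p != NULL && it != NULL);
    if (*p++ != '%')
        return 0;
    it->left = false; it->pad = ' '; it->width = 0;
    it->indirect = false; it->is_long = false;
    if (*p == '-') { it->left = true; p++; }                    /* s  */
    if (*p == '\\' && p[1] != '\0') { it->pad = p[1]; p += 2; } /* \c */
    while (isdigit((unsigned char)*p))                          /* n  */
        it->width = it->width * 10 + (unsigned)(*p++ - '0');
    if (*p == '^') { it->indirect = true; p++; }                /* ^  */
    if (*p == 'l') { it->is_long = true; p++; }                 /* f  */
    if (*p == '\0' || strchr(it->is_long ? "dixu" : "idxhucs", *p) == NULL)
        return 0;
    it->type = *p++;
    return (size_t)(p - start);
}
```

Because each optional part is tested once and in order, an item such as "%\*-10d" (pad character before the minus sign) fails to parse, which matches the ordering restriction stated above.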
The format characters i, d, x, h, u, c, s, ld, li, lx, and lu control the
output format for the data. The i and d format characters perform identical
functions, they tell printf to print the following value as a 16-bit signed
decimal integer. The x and h format characters instruct printf to print the
specified value as a 16-bit or 8-bit hexadecimal value (respectively). If you
specify u, printf prints the value as a 16-bit unsigned decimal integer. Using
c tells printf to print the value as a single character. The s format tells
printf that you're supplying the address of a zero-terminated character string;
printf prints that string. The ld, li, lx, and lu entries are long (32-bit) versions
of d/i, x, and u. The corresponding address points at a 32-bit value which
printf will format and print to the standard output. The following example
demonstrates these format items:
printf
db "I= %i, U= %u, HexC= %h, HexI= %x, C= %c, "
db "S= %s",13,10
db "L= %ld",13,10,0
dd i,u,c,i,c,s,l
The number of far addresses (specified by operands to the "dd"
pseudo-opcode) must match the number of "%" format items in the format string.
Printf counts the number of "%" format items in the format string and skips
over this many far addresses following the format string. If the number of
items does not match, the return address for printf will be incorrect and the
program will probably hang or otherwise malfunction. Likewise (as for the
print routine), the format string must end with a zero byte. The addresses of
the items following the format string must point directly at the memory
locations where the specified data lies.
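The bookkeeping that decides where printf returns can be sketched in C: count the "%" items, skipping any character that follows a backslash, and allow four bytes (one far pointer) per item. The names count_items and operand_bytes are hypothetical, and the scan is a simplification of whatever the real routine does.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the "%" format items in a format string. A backslash escapes the
   character after it, so "\%" is not counted. */
static size_t count_items(const char *fmt)
{
    size_t count = 0;
    assert(fmt != NULL);
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p == '\\' && p[1] != '\0')
            p++;                 /* skip the escaped character */
        else if (*p == '%')
            count++;
    }
    return count;
}

/* Size of the "dd" operand area that must follow the terminating zero. */
static size_t operand_bytes(const char *fmt)
{
    return 4 * count_items(fmt);
}
```

For the seven-item example above the operand area is 28 bytes, so a missing or extra "dd" entry moves the return address by four bytes in one direction or the other.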
When used in the format above, printf always prints the values using the
minimum number of print positions for each operand. If you want to specify a
minimum field width, you can do so using the "n" format option. A format item
of the format "%10d" prints a decimal integer using at least ten print
positions. Likewise, "%16s" prints a string using at least 16 print
positions. If the value to print requires more than the specified number of
print positions, printf will use however many are necessary. If the value to
print requires fewer, printf will always print the specified number, padding
the value with blanks. Printf will print the value right justified in the
print field (regardless of the data's type). If you want to print the value
left justified in the output field, use the "-" format character as a prefix to
the field width, e.g.,
printf
db "%-17s",0
dd string
In this example, printf prints the string using a 17 character long field with
the string left justified in the output field.
By default, printf blank fills the output field if the value to print
requires fewer print positions than specified by the format item. The "\c"
format item allows you to change the padding character. For example, to print
a value, right justified, using "*" as the padding character you would use the
format item "%\*10d". To print it left justified you would use the format item
"%-\*10d". Note that the "-" must precede the "\*". This is a limitation of
the current version of the software. The operands must appear in this order.
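The padding and justification rules can be modelled with one small C helper. This is a sketch only: pad_field is a hypothetical name and it works on an already converted string, whereas the real printf sends characters to putc as it goes.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Places `text` in a field of at least `width` positions, filling with `pad`.
   `left` selects left justification ("-"); the default is right justified. */
static void pad_field(const char *text, size_t width, bool left, char pad,
                      char *out, size_t out_size)
{
    assert(text != NULL && out != NULL);
    size_t len = strlen(text);
    size_t fill = (width > len) ? width - len : 0;
    assert(len + fill < out_size);
    if (left) {
        memcpy(out, text, len);
        memset(out + len, pad, fill);    /* padding after the value */
    } else {
        memset(out, pad, fill);          /* padding before the value */
        memcpy(out + fill, text, len);
    }
    out[len + fill] = '\0';
}
```

Under this model "%\*10d" applied to 123 gives "*******123" and "%-\*10d" gives "123*******".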
Normally, the address(es) following the printf format string must be far
pointers to the actual data to print. On occasion, especially when allocating
storage on the heap (using malloc), you may not know (at assembly time) the
address of the object you want to print. You may have only a pointer to the
data you want to print. The "^" format option tells printf that the far
pointer following the format string is the address of a pointer to the data
rather than the address of the data itself. This option lets you access the
data indirectly.
Examples:
printf
db "Indirect access to i: %^d",13,10,0
dd IPtr
;
printf
db "A string allocated on the heap: %-\.32^s"
db 13,10,0
dd SPtr
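The difference between a plain operand and a "^" operand is one extra pointer dereference. A C sketch of that idea follows; fetch_int is a hypothetical name, and C's flat pointers stand in for the segment:offset far pointers the library actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Without "^" the operand is the address of the data. With "^" the operand
   is the address of a pointer variable (a `const void *` here) that holds
   the address of the data. */
static int fetch_int(const void *operand, bool indirect)
{
    assert(operand != NULL);
    if (indirect)
        operand = *(const void *const *)operand;  /* follow the stored pointer */
    return *(const int *)operand;
}
```

This is why "%^d" suits heap data: the pointer variable has a fixed address known at assembly time, while the block it points at is only known once malloc has run.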
Note: unlike C, stdlib's printf routine does not support floating point
output. There are two reasons for this: first, stdlib does not (yet) have a
floating point library associated with it; second, adding floating point
support would increase the size of printf by a tremendous amount, even if you
don't use its floating point capabilities. Since most assembly language
programmers don't use floating point arithmetic, I've intentionally left out
floating point output. As soon as I add a floating point package to stdlib I
will include floating point output. However, I will create a new routine,
printff which includes floating point output. This will allow those who never
use floating point I/O to keep their programs much smaller.
3 Character Input Routines
3.1 Getc
* Reads a character from the standard input device and returns the
character in the AL register.
* Redirectable under program control.
Inputs: None.
Outputs: AL- Character from input device.
AH- Undefined. However, if AL contains zero, AH should contain a
keyboard scan code.
Include: Stdlib.a
This routine reads a character from the standard input device. This call
is synchronous, that is, it does not return until a character is available.
The default input device is the DOS standard input.
Example:
getc
mov KbdChar, al
putc
3.2 GetcStdIn
* Reads a character from the DOS standard input device and returns the
character in the AL register.
* Redirectable from DOS command line.
Inputs: None.
Outputs: AL- Character from input device.
AH- Scan code if AL=0.
Include: Stdlib.a
This routine reads a character from the DOS standard input device. This
call is synchronous, that is, it does not return until a character is
available.
Example:
GetcStdIn
mov InputChr, al
putc
3.3 GetcBIOS
* Reads a character from the keyboard and returns the character in the
AL register and the scan code in the AH register.
Inputs: None.
Outputs: AL- Character from the keyboard.
AH- Scan code from the keyboard.
Include: Stdlib.a
This routine reads a character from the keyboard. This call is
synchronous, that is, it does not return until a character is available.
Example:
GetcBIOS
mov CharRead, al
mov ScanCode, ah
putc
3.4 SetInAdrs
* Lets you set the address of the current input routine.
Inputs: es:di- Address of new input routine.
Outputs: None.
Include: stdlib.a
This routine redirects the stdlib standard input so that it calls the
routine whose address you pass in es:di. This routine should obtain a
character (from anywhere) and return the character in AL. If it makes sense to
do so, it should also return a "scan code" in the AH register. It must
preserve all other registers.
Example:
mov di, seg NewInputRoutine ;Segment goes through DI (no immediate
mov es, di ;loads into segment registers).
mov di, offset NewInputRoutine
SetInAdrs
.
.
.
les di, RoutinePtr
SetInAdrs
3.5 GetInAdrs
* Retrieves address of the current input routine.
Inputs: None.
Outputs: es:di - address of current input routine (called by Getc).
Include: stdlib.a
You can use this function to get the address of the current input routine,
perhaps so you can save it or see if it is currently pointing at some
particular piece of code. If you want to temporarily redirect the input and
then restore the original input routine, consider using PushInAdrs/PopInAdrs
described later.
Example:
GetInAdrs
mov word ptr SaveInAdrs, di
mov word ptr SaveInAdrs+2, es
3.6 PushInAdrs
* Lets you redirect the standard input device and preserve the
previous address.
* Saves up to 16 old input routine addresses on an internal stack.
* Restoration is possible using PopInAdrs.
Inputs: es:di- Address of new input routine.
Outputs: Carry=0 if operation successful.
Carry=1 if there were already 16 items on the stack.
Include: stdlib.a
This routine "pushes" the current input address onto an internal stack and
then stores the value in es:di into the current input routine pointer. The
PushInAdrs and PopInAdrs routines let you easily save and redirect the standard
input and then restore the original input routine address later on.
If you attempt to push more than 16 items on the stack, PushInAdrs will
ignore your request and return with the carry flag set. If PushInAdrs is
successful, it will return with the carry flag clear.
Example:
mov ax, seg NewInputRoutine ;ES cannot be loaded with an immediate,
mov es, ax ; so go through AX.
mov di, offset NewInputRoutine
PushInAdrs
.
.
.
les di, RoutinePtr
PushInAdrs
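The carry result described above can be tested immediately after the call.
The following is only a minimal sketch: NewInputRoutine and StackFull are
hypothetical labels in your own program, and AX serves as a scratch register
for loading ES:
mov ax, seg NewInputRoutine
mov es, ax
mov di, offset NewInputRoutine
PushInAdrs
jc StackFull ;Carry=1: 16 addresses already saved, request ignored.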
3.7 PopInAdrs
* Restores input routine addresses saved by PushInAdrs.
* Defaults to GetcStdIn if you attempt to pop too many items off the
stack.
Inputs: None.
Outputs: es:di- Points at the previous stdin routine before the pop.
Include: stdlib.a
PopInAdrs undoes the effects of PushInAdrs. It pops an item off the
internal stack and stores it into the input routine pointer. The previous
value in the input pointer is returned in es:di.
Example:
mov ax, seg NewInputRoutine ;ES cannot be loaded with an immediate,
mov es, ax ; so go through AX.
mov di, offset NewInputRoutine
PushInAdrs
.
.
.
PopInAdrs
3.8 Gets
* Reads a line of text from the stdlib standard input device.
* Automatically allocates storage for the input string on the heap.
* Handles input lines up to 256 characters long.
Inputs: None.
Outputs: es:di - address of the input line of text.
Include: stdlib.a
Gets reads a line of text from the stdlib standard input. It returns a
pointer to a string containing each character read in the ES:DI registers.
Gets calls malloc to allocate 256 bytes on the heap (plus any overhead bytes
required by the memory manager system). If the user enters fewer than 256
bytes, gets calls realloc to free any unnecessary bytes. Gets returns all
characters typed by the user except for the carriage return (ENTER) key code.
Gets always returns a zero-terminated string. The action of various keys to
gets depends upon where input has been directed. Generally, you can count on
gets properly handling the backspace (erase previous character), escape (erase
entire line), and ENTER (accept line) keys. Other keys may be active as well.
For example, by default gets calls getc which calls DOS' standard input
routine. If you type a control-C or break key while reading from DOS' standard
input it will abort the program. If this bothers you, you can always redirect
stdlib's getc routine so it calls BIOS directly rather than reading data
through DOS' keyboard input routine.
Example:
gets ;Read a string from the keyboard
puts ;Print it
putcr ;Print a new line
free ;Deallocate storage for string.
3.9 Scanf
* Formatted input from stdlib standard input.
* Similar to C's scanf routine.
* Converts ASCII to integer, unsigned, character, string, hex, and
long values of the above.
Inputs: None.
Outputs: None.
Include: stdlib.a
Scanf provides formatted input in a fashion analogous to printf's output
facilities. Actually, it turns out that scanf is considerably less useful than
printf because it doesn't provide reasonable error checking facilities (neither
does C's version of this routine). But for quick and dirty programs whose
input can be controlled in a rigid fashion (or if you're willing to live by
"garbage in, garbage out") scanf provides a convenient way to get input from
the user.
Like printf, the scanf routine expects you to follow the call with a
format string and then a list of (far pointer) memory addresses. The items in
the scanf format string take the following form:
%^f
where f represents d, i, x, h, u, c, s, ld, li, lx, or lu. Like printf, the
"^" symbol tells scanf that the address following the format string is the
address of a (far) pointer to the data rather than the address of the data
location itself.
By default, scanf automatically skips any leading whitespace before
attempting to read a numeric value. You can instruct scanf to skip other
characters by placing that character in the format string. For example, the
following call instructs scanf to read three integers separated by commas
(and/or whitespace):
scanf
db "%i,%i,%i",0
dd i1,i2,i3
Whenever scanf encounters a non-blank character in the format string, it
will skip that character (including multiple occurrences of that character) if
it appears next in the input stream.
Scanf always calls gets to read a new line of text from stdlib's standard
input. If scanf exhausts the format list, it ignores any remaining characters
on the line. If scanf exhausts the input line before processing all of the
format items, it leaves the remaining variables unchanged. Scanf always
deallocates the storage allocated by gets.
Example:
scanf
db "%i %h %^s",0
dd i, x, sptr
4 Conversion Routines
4.1 ATOL/ATOL2
* Converts an ASCII string of digits to long integer format.
Inputs: ES:DI- Points at string to convert.
Outputs: DX:AX- Long integer converted from string.
Carry flag- Error status
DI (ATOL2)- First character beyond string of digits.
Include: stdlib.a
ATOL converts the string of digits that ES:DI points at to a long integer
(signed) value and returns this value in DX:AX. ATOL2 works in a similar
fashion except it doesn't preserve the DI register. That is, it leaves DI
pointing at the first character beyond the string of digits. This routine
returns the carry flag clear if it translated the string of digits without
error. It returns the carry flag set if overflow occurred. Note that this
routine stops on the first non-digit. If the string does not begin with a
digit, this routine returns zero. The only exception to the "string of digits"
rule is that the number can have a preceding minus sign to denote a negative
number. In particular, note that this routine does not allow leading spaces.
Example:
gets ;Get a string from user
atol ;Convert to a value in DX:AX
4.2 ATOUL/ATOUL2
Just like ATOL above, except this guy handles unsigned long integers.
4.3 ATOI/ATOI2
* Converts an ASCII string of digits to integer format.
Inputs: ES:DI- Points at string to convert.
Outputs: AX- Integer converted from string.
Carry flag- Error status
DI (ATOI2)- First character beyond string of digits.
Include: stdlib.a
Works just like ATOL except it translates the string to a signed 16-bit
integer rather than a 32-bit long integer.
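A minimal sketch of typical use, modeled on the ATOL example. It assumes
that ATOI, like ATOL, leaves ES:DI pointing at the start of the string so the
same pointer can be handed to free. Overflow is a hypothetical label in your
own program.
Example:
gets ;Get a string from user
atoi ;Convert to a value in AX
jc Overflow ;Carry=1: value did not fit in 16 bits.
puti ;Print the converted value.
putcr
free ;Deallocate the string allocated by gets.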
4.4 ATOU/ATOU2
* Converts an ASCII string of digits to unsigned integer format.
Inputs: ES:DI- Points at string to convert.
Outputs: AX- Unsigned 16-bit integer converted from string.
Carry flag- Error status
DI (ATOU2)- First character beyond string of digits.
Include: stdlib.a
Like ATOI except it handles unsigned 16-bit integers in the range 0..65535.
4.5 ATOH/ATOH2
* Converts an ASCII string of hex digits to a value in AX.
Inputs: ES:DI- Points at string to convert.
Outputs: AX- Unsigned 16-bit integer converted from hex string.
Carry flag- Error status
DI (ATOH2)- First character beyond string of hex digits.
Include: stdlib.a
This routine converts a string of hexadecimal digits into numeric form and
returns that value in the AX register.
Example:
les di, Str2Convrt
atoh ;Convert to value in AX.
putw ;Print word in AX.
4.6 ATOLH/ATOLH2
Like ATOH above, except it handles 32-bit values and returns the result in
DX:AX.
4.7 ITOA
* Converts a 16-bit signed integer value in AX to a string of
characters.
* Automatically allocates storage for string on the heap.
Inputs: AX- Signed 16-bit value to convert to a string.
Outputs: ES:DI- Pointer to string containing converted characters.
Include: stdlib.a
ITOA converts the signed integer value in AX to a string of characters
which represent that value. It allocates storage for this string on the heap
via a call to the malloc routine and returns a pointer to that string in
ES:DI. The string contains the minimum number of characters required to hold
the character representation of the value and is always between one and six
characters long.
Example:
mov ax, -1234
itoa ;Convert to string.
puts ;Print it.
free ;Deallocate string.
4.8 UTOA
* Converts a 16-bit unsigned integer value in AX to a string of
characters.
* Automatically allocates storage for string on the heap.
Inputs: AX- Unsigned 16-bit value to convert to a string.
Outputs: ES:DI- Pointer to string containing converted characters.
Include: stdlib.a
Like ITOA above, except it converts the unsigned value in AX to a string
of characters. The string returned by UTOA is always one to five characters
long.
Example:
mov ax, 65000
utoa
puts
free
4.9 HTOA
* Converts an 8-bit value in AL to the two-character hexadecimal
representation of that byte.
* Automatically allocates storage for string on the heap.
Inputs: AL- 8-bit value to convert to a string.
Outputs: ES:DI- Pointer to string containing converted characters.
Include: stdlib.a
Converts a byte to a string containing the hexadecimal representation of
that byte. Otherwise, it's just like ITOA above. This routine always outputs
exactly two hexadecimal digits, including a leading zero (if necessary).
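A minimal sketch in the style of the ITOA example. It assumes the heap has
already been initialized (see MemInit) and that puts leaves ES:DI unchanged,
as the ITOA example implies.
Example:
mov al, 0Fh
htoa ;Convert to a two-digit hex string.
puts ;Print it.
free ;Deallocate string.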
4.10 WTOA
* Converts a 16-bit value in AX to hexadecimal representation.
* Automatically allocates storage for string on the heap.
Inputs: AX- 16-bit value to convert to a string.
Outputs: ES:DI- Pointer to string containing converted characters.
Include: stdlib.a
Like HTOA above, except it converts the 16-bit value in AX to a string of
four hexadecimal digits. Outputs exactly four digits including leading zeros
if necessary.
4.11 LTOA
* Converts a 32-bit signed integer value in DX:AX to a string of
characters.
* Automatically allocates storage for string on the heap.
Inputs: DX:AX- Signed 32-bit value to convert to a string.
Outputs: ES:DI- Pointer to string containing converted characters.
Include: stdlib.a
Like ITOA except it converts a long integer value in DX:AX to a string of
one to eleven characters.
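A minimal sketch, following the ITOA example. L is a hypothetical dword
variable in your own program holding the value to convert.
Example:
mov ax, word ptr L
mov dx, word ptr L+2
ltoa ;Convert DX:AX to string.
puts ;Print it.
free ;Deallocate string.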
4.12 ULTOA
* Converts a 32-bit unsigned integer value in DX:AX to a string of
characters.
* Automatically allocates storage for string on the heap.
Inputs: DX:AX- Unsigned 32-bit value to convert to a string.
Outputs: ES:DI- Pointer to string containing converted characters.
Include: stdlib.a
Like LTOA except this guy handles unsigned integer values.
4.13 SPrintf
* In-memory formatting routine.
* Just like C's sprintf routine.
* Automatically allocates storage for the string on the heap.
* Programmer selectable maximum length for the output string.
Inputs: CS:RET- Pointer to format string and operands of the sprintf
routine.
Outputs: ES:DI- Pointer to string containing output data.
Include: stdlib.a
Works in a manner quite similar to printf except sprintf writes its output
to a string variable rather than to the stdlib standard output. Sprintf
returns a pointer to the string (which it allocates on the heap) in the ES:DI
registers. SPrintf, by default, allocates 2048 characters for this string and
then deallocates any unnecessary storage. An external variable, sp_MaxBuf,
holds the number of bytes to allocate upon entry into sprintf. If you wish to
allocate more or less than 2048 bytes when calling sprintf, simply change the
value of this public variable (type is word). Sprintf calls malloc to allocate
the storage dynamically. You should call free to return this buffer to the
heap when you are through with it.
Example:
sprintf
db "I=%i, U=%u, S=%s",13,10,0
dd i,u,s
puts
free
4.14 SBPrintf
* In-memory formatting routine.
* Programmer-supplied output buffer for string
Inputs: CS:RET- Pointer to format string and operands of the sbprintf
routine.
ES:DI- Pointer to buffer area to store string data.
Outputs: None.
Include: stdlib.a
Works just like sprintf except it does not automatically allocate storage
for the output string. Instead, you must supply the address of an output
buffer in the ES:DI registers.
Example:
les di, BufferAdrs
sbprintf
db "I=%i, U=%u, S=%s",13,10,0
dd i,u,s
puts
4.15 SScanf
* Formatted in-memory conversions.
* Similar to C's sscanf routine.
* Converts ASCII to integer, unsigned, character, string, hex, and
long values of the above.
Inputs: ES:DI- Points at string containing values to convert.
CS:RET- Points at format string and variable parameter list.
Outputs: None.
Include: stdlib.a
Sscanf provides formatted input in a fashion analogous to scanf. The
difference is that scanf reads a line of text from the stdlib standard input
whereas you pass the address of a sequence of characters to sscanf in es:di.
Example:
;
; This code reads the values for i, j, and s from the characters
; starting at memory location Buffer.
;
les di, Buffer
sscanf
db "%i %i %s",0
dd i,j,s
4.16 ToLower
* Converts uppercase characters in AL to lower case.
* Macro implementation for high performance.
* Leaves characters other than uppercase unchanged.
Inputs: AL- Character to (possibly) convert to lower case.
Outputs: AL- Converted character.
Include: stdlib.a
ToLower checks the character in the AL register. If it is upper case it
converts it to lower case. If it is anything else, ToLower leaves the value in
AL unchanged. Note: this routine is implemented as a macro rather than as a
procedure call. This routine is so short you would spend more time actually
calling the routine than executing the code inside. However, the code is
definitely longer than a (far) procedure call, so if space is critical and
you're invoking this code several times, you may want to convert it to a
procedure call to save a little space.
Example:
mov al, char
ToLower
4.17 ToUpper
* Converts lowercase characters in AL to upper case.
* Macro implementation for high performance.
* Leaves characters other than lowercase unchanged.
Inputs: AL- Character to (possibly) convert to upper case.
Outputs: AL- Converted character.
Include: stdlib.a
This is just like the ToLower routine except it converts lower case to
uppercase rather than vice versa.
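Usage mirrors the ToLower example; char is a hypothetical byte variable in
your own program.
Example:
mov al, char
ToUpper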
5 Utility Routines
5.1 ISize
* Computes the number of print positions required by a 16-bit signed
integer value.
Inputs: AX- 16-bit value to compute the output size for.
Outputs: AX- Number of print positions required by this number (including the
minus sign, if necessary).
Include: stdlib.a
ISize computes the minimum number of character positions it will take to
print the signed decimal value in the AX register. If the number is negative,
it will include space for the minus sign in the count.
Example:
mov ax, I
ISize
puti ;Prints positions req'd by I.
5.2 USize
Just like ISize above, except this guy returns the number of print
positions required by a 16-bit unsigned value.
5.3 LSize
* Computes the number of print positions required by a 32-bit signed
integer value.
Inputs: DX:AX- 32-bit value to compute the output size for.
Outputs: AX- Number of print positions required by this number (including the
minus sign, if necessary).
Include: stdlib.a
LSize computes the minimum number of character positions it will take to
print the signed decimal value in the DX:AX registers. If the number is
negative, it will include space for the minus sign in the count.
Example:
mov ax, word ptr L
mov dx, word ptr L+2
LSize
puti ;Prints positions req'd by L.
5.4 ULSize
As with LSize, except ULSize treats the value in DX:AX as an unsigned long
integer.
5.5 IsAlNum
* Checks character in AL to see if it is alphanumeric.
Inputs: AL- Character to check.
Outputs: Zero flag- Set if character is alphanumeric, clear if not.
Include: stdlib.a
This procedure checks the character in the AL register to see if it is in
the range A-Z, a-z, or 0-9. Upon return, you can use the JE instruction to
check to see if the character was in this range (or, conversely, you can use
jne to see if it is not in the range).
Example:
mov al, char
IsAlNum
je IsAlNumChar
5.6 IsXDigit
* Checks character in AL to see if it is a hexadecimal digit.
Inputs: AL- Character to check.
Outputs: Zero flag- Set if character is a hex digit, clear if not.
Include: stdlib.a
This procedure checks the character in the AL register to see if it is in
the range A-F, a-f, or 0-9. Upon return, you can use the JE instruction to
check to see if the character was in this range (or, conversely, you can use
jne to see if it is not in the range).
Example:
mov al, char
IsXDigit
je IsXDigitChar
5.7 IsDigit
* Checks character in AL to see if it is numeric.
* Macro implementation for high performance.
Inputs: AL- Character to check.
Outputs: Zero flag- Set if character is numeric, clear if not.
Include: stdlib.a
This procedure checks the character in the AL register to see if it is in
the range 0-9. Upon return, you can use the JE instruction to check to see if
the character was in this range (or, conversely, you can use jne to see if it
is not in the range).
Example:
mov al, char
IsDigit
je IsDecChar
5.8 IsAlpha
* Checks character in AL to see if it is alphabetic.
* Macro implementation for high performance.
Inputs: AL- Character to check.
Outputs: Zero flag- Set if character is alphabetic, clear if not.
Include: stdlib.a
This procedure checks the character in the AL register to see if it is in
the range A-Z, or a-z. Upon return, you can use the JE instruction to check to
see if the character was in this range (or, conversely, you can use jne to see
if it is not in the range).
Example:
mov al, char
IsAlpha
je IsAlChar
5.9 IsLower
* Checks character in AL to see if it is a lower case alphabetic
character.
* Macro implementation for high performance.
Inputs: AL- Character to check.
Outputs: Zero flag- Set if character is lower case alphabetic, clear if not.
Include: stdlib.a
This procedure checks the character in the AL register to see if it is in
the range a-z. Upon return, you can use the JE instruction to check to see if
the character was in this range (or, conversely, you can use jne to see if it
is not in the range).
Example:
mov al, char
IsLower
je IsLowerChar
5.10 IsUpper
* Checks character in AL to see if it is uppercase alphabetic.
* Macro implementation for high performance.
Inputs: AL- Character to check.
Outputs: Zero flag- Set if character is uppercase alpha, clear if not.
Include: stdlib.a
This procedure checks the character in the AL register to see if it is in
the range A-Z. Upon return, you can use the JE instruction to check to see if
the character was in this range (or, conversely, you can use jne to see if it
is not in the range).
Example:
mov al, char
IsUpper
je IsUpperChar
6 Memory Management
The stdlib memory management routines let you dynamically allocate storage
on the heap. These routines are somewhat similar to those provided by the "C"
programming language. These routines do not perform garbage collection. Doing
so would introduce too many restrictions. Of course, feel free to add your own
garbage collection if you like...
The allocation/deallocation routines should be fairly fast. Malloc and
free use a modified first/next fit algorithm which lets the system quickly find
a memory block of the desired size without undue fragmentation problems
(average case). The overhead (eight bytes) per allocated block may seem rather
high, but that is part of the price to pay for faster malloc and free routines.
The memory manager data structure has an overhead of eight bytes (meaning
each malloc operation requires at least eight more bytes than you ask for) and
a granularity of 16 bytes. All pointers are far pointers and I allocate each
new item on a paragraph boundary. The current memory manager routines always
allocate (n+8) bytes, rounding up to the next multiple of 16 if the result is
not evenly divisible by sixteen. The first eight bytes of the structure are
used by the memory management routines; the remaining bytes are available for
use by the caller (malloc, et al., return a pointer to the first byte beyond
the memory management overhead structure). Of course, you should never count
on any of this stuff. I could rewrite the memory manager tomorrow and if you
use the interface which follows your code will still work properly. If you
make assumptions about the structure of the memory management record, your code
may go up in flames on the next revision.
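To make the current arithmetic concrete: a request for 100 bytes becomes 108
bytes with the overhead, which rounds up to 112 bytes (seven paragraphs). The
fragment below only illustrates that rounding rule; it is not code from the
library. It assumes the requested size n is in CX, leaves the resulting block
size in AX, and does not guard against wrap-around for requests near 64K. As
noted above, do not write code that depends on these numbers.
mov ax, cx ;AX = n, the requested size.
add ax, 8+15 ;Add the overhead, plus 15 so the AND rounds up.
and ax, 0FFF0h ;Truncate to a multiple of 16.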
6.1 MemInit
* Initializes memory manager system.
Inputs: DX- Number of paragraphs to reserve.
zzzzzzseg- Segment name of last segment in your program.
PSP- Public word variable which holds the PSP value for your
program.
Outputs: CX- Number of paragraphs actually reserved by MemInit
Carry=0 if no error. If carry=1, AX contains DOS error code.
Include: stdlib.a
This routine initializes the memory manager system. You must call it
before using any routines which call any of the memory manager procedures
(since a good number of the stdlib routines call the memory manager, you should
get in the habit of always calling this routine). The system will die a
horrible death if you call a memory manager routine (like malloc) without first
calling MemInit.
This routine expects you to define (and set up) two global names:
zzzzzzseg and PSP. "zzzzzzseg" is a dummy segment which must be the name of
the very last segment defined in your program. MemInit uses the name of this
segment to determine the address of the last byte in your program. If you do
not declare this segment last, the memory manager will happily wipe out
anything which follows zzzzzzseg. The "shell.asm" file provides you with a
template for your programs which properly defines this segment.
PSP should be a word variable which contains the program segment prefix
value for your program. MS-DOS passes the PSP value to your program in the DS
and ES registers. You should save this value in the PSP variable. Don't
forget to make PSP a public symbol in your main program's source file. The
"shell.asm" file demonstrates how to properly set up this value.
The DX register contains the number of 16-byte paragraphs you want to
reserve for the heap. If DX contains zero, MemInit will allocate all of the
available memory to the heap. If your program is going to allow the user to
run a copy of the command interpreter, or if your program is going to EXEC some
other program, you should not allocate all storage to the heap. Instead, you
should reserve some memory for those programs. By setting DX to some value
other than zero, you can tell MemInit how much memory you want to reserve for
the heap. All left over memory will be available for other system (or program)
use.
If the value in DX is larger than the amount of available RAM, MemInit
will split the available memory in half and reserve half for the heap leaving
the other half unallocated. If you want to force this situation (to leave half
of available memory for other purposes), simply load DX with 0FFFFh before
calling MemInit. There will never be this much memory available, so this will
force MemInit to split the available RAM between the heap and unallocated
storage.
On return from MemInit, the CX register contains the number of paragraphs
actually allocated. You can use this value to see if MemInit has actually
allocated the number of paragraphs you requested. You can also use this value
to determine how much space is available when you elect to split the free space
between the heap and the unallocated portions.
If all goes well, this routine returns the carry flag clear. If a DOS
memory manager error occurs, this routine returns the carry flag set and the
DOS error code in the AX register.
Example:
;
; Don't forget to set up PSP and zzzzzzseg before calling MemInit.
;
mov dx, 0 ;Allocate all available RAM
MemInit
jc MemoryError
;
; cx contains the number of paragraphs actually allocated.
;
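The half-and-half split described above can also be requested explicitly. A
minimal sketch: HeapParas is a hypothetical word variable and MemoryError a
hypothetical label, both in your own program.
mov dx, 0FFFFh ;More than can ever be available,
MemInit ; so MemInit splits the free RAM.
jc MemoryError
mov HeapParas, cx ;Paragraphs actually given to the heap.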
6.2 Malloc
* Allocates storage from the heap.
* Allocates blocks up to 64K long.
* Very fast combination first/next fit allocation strategy
Inputs: CX- Number of bytes to reserve.
Outputs: CX- Number of bytes actually reserved by malloc.
ES:DI- Pointer to first byte of memory allocated by malloc.
Carry=0 if no error. Carry=1 if insufficient memory
Include: stdlib.a
Malloc is the workhorse routine you use to allocate a block of memory.
You give it the number of bytes you need and if it finds a block large enough,
it will allocate the requested amount and return a pointer to that block.
Most memory managers require a small amount of overhead for each block
they allocate. Stdlib's (current) memory manager requires an overhead of eight
bytes. Furthermore, the granularity is 16 bytes. This means that malloc
always allocates blocks of memory in paragraph multiples. Malloc may
therefore reserve more storage than you specify, and the value
returned in CX may be somewhat greater than the requested value. By setting
the minimum allocation size to a paragraph, I was able to reduce the overhead
and improve the speed of malloc by a considerable amount.
Stdlib's memory management system does not do any garbage collection.
Doing so would place too many demands on malloc's users. Therefore, it is
quite possible for you to fragment memory with multiple calls to malloc,
realloc, and free. You could wind up in a situation where there is enough free
memory to satisfy your request, but there isn't a single contiguous block large
enough for the request. Malloc treats this as an insufficient memory error and
returns with the carry flag set.
If malloc cannot allocate a block of the requested size, it returns with
the carry flag set. In this situation, the contents of ES:DI are undefined.
Attempting to dereference this pointer will produce erratic and, perhaps,
disastrous results.
Example:
mov cx, 256
malloc
jnc GoodMalloc
print
db "Insufficient memory to continue.",cr,lf,0
jmp Quit
GoodMalloc: mov byte ptr es:[di], 0 ;Init string to NULL.
6.3 Realloc
* Reallocates a block of memory on the heap.
* Allocates blocks up to 64K long.
* Allows you to make the new block smaller or larger than the original
block.
* Automatically copies the data from the original block to the new
block if the new block is larger than the old block.
Inputs: CX- Number of bytes to reserve.
ES:DI- Pointer to block to reallocate.
Outputs: CX- Number of bytes actually reserved by realloc.
ES:DI- Pointer to first byte of memory allocated by realloc.
Carry=0 if no error. Carry=1 if insufficient memory
Include: stdlib.a
Realloc lets you change the size of an allocated block in the heap. It
allows you to make the block larger or smaller. If you make the block smaller,
realloc simply frees (returns to the heap) any leftover bytes at the end of the
block. If you make the block larger, realloc goes out and allocates a block of
the requested size, copies the bytes from the old block to the beginning of the
new block (leaving the bytes at the end of the new block uninitialized), and
then frees the old block.
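The following sketch grows a block. HeapPtr is a hypothetical dword variable
holding a pointer previously returned by malloc, and ReallocError is a
hypothetical label. Because a larger block may be allocated at a different
address, the sketch stores the returned pointer back into HeapPtr.
Example:
les di, HeapPtr ;Block to resize.
mov cx, 512 ;New size in bytes.
realloc
jc ReallocError ;Carry=1: insufficient memory.
mov word ptr HeapPtr, di ;Save the (possibly new) address.
mov word ptr HeapPtr+2, es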
6.4 Free
* Deallocates a block of memory on the heap.
* Automatically coalesces all contiguous, unused, blocks on the heap.
* Very fast algorithm.
* Handles the situation where several active pointers may still point
at the specified block.
Inputs: ES:DI- Pointer to block to deallocate.
Outputs: Carry=0 if no error. Carry=1 if es:di doesn't point at an allocated
block.
Include: stdlib.a
Free (possibly) deallocates storage allocated on the heap by malloc or
realloc. Free returns this storage to the heap so other code can reuse it
later. Note, however, that free doesn't always return storage to the heap.
The memory manager data structure keeps track of the number of pointers
currently pointing at a block on the heap (see DupPtr, below). If you've set
up several pointers such that they point at the same block, free will not
deallocate the storage until you've freed all of the pointers which point at
that block.
Free usually returns an error code (carry flag = 1) if you attempt to free
a block which is not currently allocated or if you pass it a memory address
which was not returned by malloc (or realloc). By no means is this routine
totally robust. If you start calling free with arbitrary pointers in es:di
(which happen to be pointing into the heap) it is possible, under certain
circumstances, to confuse free and it will attempt to free a block it really
shouldn't. I could fix this problem by adding a lot of (slow) code to the free
routine. However, this library is for assembly language programmers, people
who are supposed to know what they are doing. Therefore, I opted to sacrifice
a little safety for a lot of speed.
Example:
les di, HeapPtr
free
6.5 DupPtr
* Informs the memory manager that you have more than one active
pointer pointing at a block of memory.
* Prevents free from deallocating storage to a block while there are
still some active pointers to that block.
Inputs: ES:DI- Pointer to block.
Outputs: Carry=0 if no error. Carry=1 if es:di doesn't point at an allocated
block.
Include: stdlib.a
DupPtr increments the pointer count for the block at the specified
address. Malloc sets this counter to one. Free decrements it by one. If free
decrements the value and it becomes zero, free will release the storage to the
heap for other use. By using DupPtr you can tell the memory manager that you
have several pointers pointing at the same block and that it shouldn't
deallocate the storage until you free all of those pointers.
Example:
les di, Ptr
DupPtr
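The counting described above can be followed through a sequence of calls. In
this sketch Ptr is a hypothetical dword variable holding a pointer returned by
malloc; the comments track the pointer count as the text describes it, and the
pointer is reloaded before each call rather than assuming ES:DI is preserved:
les di, Ptr
DupPtr ;Count goes from 1 to 2.
.
.
.
les di, Ptr
free ;Count goes from 2 to 1, storage is kept.
.
.
.
les di, Ptr
free ;Count goes from 1 to 0, storage is released.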
6.6 IsInHeap
* Tells you if a pointer contains the address of a byte in the heap.
Inputs: ES:DI- Pointer to block.
Outputs: Carry=0 if es:di points into the heap. Carry=1 if not.
Include: stdlib.a
This routine lets you know if es:di contains the address of a byte in the
heap somewhere. It does not tell you if es:di contains a valid pointer
returned by malloc (see IsPtr, below). For example, if es:di contains the
address of some particular element of an array (not necessarily the first
element) allocated on the heap, IsInHeap will return with the carry clear
denoting that es:di points somewhere in the heap. Keep in mind that
calling this routine does not validate the pointer. It could be pointing at a
byte which is part of the memory manager data structure rather than at actual
data (since the memory manager maintains that information within the bounds of
the heap). This routine is mainly useful for seeing if something is allocated
on the heap as opposed to somewhere else (like your code, data, or stack
segment).
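A minimal sketch; Ptr is a hypothetical dword pointer variable and NotInHeap
a hypothetical label in your own program.
Example:
les di, Ptr
IsInHeap
jc NotInHeap ;Carry=1: es:di points outside the heap.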
6.7 IsPtr
* Tells you if a pointer contains the address of the start of a block
in the heap.
Inputs: ES:DI- Pointer to block.
Outputs: Carry=0 if es:di is a valid pointer. Carry=1 if not.
Include: stdlib.a
IsPtr is much more specific than IsInHeap. This guy returns the carry
flag clear if and only if es:di contains the address of a properly allocated
(and currently allocated) block on the heap. This pointer must be a value
returned by malloc, realloc, or DupPtr and that block must be currently
allocated for IsPtr to return the carry flag clear.
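A minimal sketch of guarding a call to free with IsPtr. Ptr is a hypothetical
dword pointer variable and NotAPtr a hypothetical label; the sketch assumes
IsPtr leaves ES:DI unchanged.
Example:
les di, Ptr
IsPtr
jc NotAPtr ;Carry=1: not a currently allocated block.
free ;es:di is a valid block pointer.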
7 String Routines
The stdlib string package supports "C" style zero-terminated strings.
Most of these routines mirror their "C" counterpart. Of course, I've added a
few additional routines which seem useful to me.
7.1 Strcpy, Strcpyl
* Copies a zero terminated string from one buffer to another.
* Does not require the use of the DS segment register.
Inputs: ES:DI- Pointer to source string (Strcpy only).
CS:RET- Pointer to source string (Strcpyl only).
DX:SI- Pointer to destination string.
Outputs: ES:DI- Points at the destination string.
Include: stdlib.a
Strcpy is used to copy a zero-terminated string from one location to
another. ES:DI points at the source string, DX:SI points at the destination
address. Strcpy copies all bytes, up to and including the zero byte, from the
source address to the destination address. The target buffer must be large
enough to hold the string. Strcpy performs no error checking on the size of
the destination buffer.
Strcpyl copies the zero-terminated string immediately following the call
instruction to the destination address specified by DX:SI. Again, this routine
expects you to ensure that the target buffer is large enough to hold the
result.
Examples:
mov dx, seg target
mov si, offset target
Strcpyl
db "String for Strcpyl",0
;
; Copy that string to Target2 as well, note that ES:DI already points
; at "Target".
;
mov dx, seg Target2
mov si, offset Target2
Strcpy
7.2 StrDup, StrDupl
* Duplicates a string by copying a zero-terminated string from one
location to a newly allocated spot on the heap.
* Automatically allocates sufficient storage for destination string on
the heap.
* Does not require the use of the DS segment register.
Inputs: ES:DI- Pointer to source string (Strdup only).
CS:RET- Pointer to source string (Strdupl only).
Outputs: ES:DI- Points at the destination string allocated on heap.
Carry=0 if operation successful. Carry=1 if insufficient memory for
new string.
Include: stdlib.a
Strdup and strdupl duplicate strings. You pass them a pointer to the
string (in es:di for strdup, via the return address for strdupl) and they
allocate sufficient storage on the heap for a copy of this string. Then these
two routines copy their source strings to the newly allocated storage and
return a pointer to the new string in ES:DI.
Examples:
Strdupl
db "String for Strdupl",0
jc MallocError
mov word ptr Dest1, di
mov word ptr Dest1+2, es
;
; Create another copy of this string. Note that es:di points at
; Dest1 upon entry to Strdup, but it points at the new string on
; exit.
;
Strdup
jc MallocError
mov word ptr Dest2, di
mov word ptr Dest2+2, es
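For readers who think in C, the allocate-then-copy behaviour described above can be modelled as follows. This is an illustrative sketch only, not the library source: the name model_strdup is hypothetical, and a NULL return stands in for the carry flag being set.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical C model of Strdup: allocate strlen(src)+1 bytes on the
   heap and copy the string, zero byte included. Returns NULL where the
   assembly routine would return with the carry flag set. */
char *model_strdup(const char *src)
{
    size_t n = strlen(src) + 1;      /* room for the zero byte */
    char *copy = malloc(n);
    if (copy == NULL)
        return NULL;                 /* models "carry set" */
    memcpy(copy, src, n);
    return copy;                     /* caller releases it with free() */
}
```

As in the assembly examples, the caller owns the returned block and must eventually free it.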
7.3 Strlen
* Computes the length of a zero terminated string.
Inputs: ES:DI- Pointer to source string.
Outputs: CX- Length of specified string.
Include: stdlib.a
Strlen computes the length of the string whose address appears in ES:DI.
It returns the number of characters up to, but not including, the zero
terminating byte.
Example:
les di, String
strlen
mov sl, cx
printf
db "Length of '%s' is %d\n",0
dd String, sl
7.4 Strcat, Strcat2, Strcatl, Strcat2l
* Concatenates one string to the end of another.
* Strcatl and Strcat2l allow literal string operands.
* Strcat2 and Strcat2l automatically allocate storage for destination
string.
Inputs: ES:DI- Pointer to first string.
DX:SI- Pointer to second string (Strcat & Strcat2 only).
CS:RET- Pointer to second string (Strcatl & Strcat2l only).
Outputs: ES:DI- Pointer to new string (Strcat2 & StrCat2l only).
Carry=0 No error. Carry=1 Insufficient memory (Strcat2 & StrCat2l
only).
Include: stdlib.a
These routines concatenate two strings together. They differ mainly in
the location of their source and destination operands.
Strcat concatenates the string pointed at by DX:SI to the end of the
string pointed at by ES:DI in memory (both strings must be zero-terminated).
The buffer pointed at by ES:DI must be large enough to hold the resulting
string. Strcat performs no bounds checking on the data.
Strcat2 works just like strcat except it does not append the second string
on to the end of the first. Instead, Strcat2 computes the length of the two
strings and attempts to allocate this much storage on the heap. If it is
unsuccessful, Strcat2 returns with the carry flag set. If it successfully
allocates this storage on the heap, it copies the string pointed at by es:di to
the heap and then concatenates the string dx:si points at to the end of this
string on the heap and returns with the carry flag clear and es:di pointing at
the new string on the heap.
Strcatl and Strcat2l work just like Strcat and Strcat2 except you supply
the second string as a literal constant immediately after the call rather than
pointing dx:si at it.
Examples:
les di, String1
mov dx, seg String2
lea si, String2
Strcat ;String1 <- String1 + String2
;
les di, String1
Strcatl
db "Appended String",0
;
les di, String1
mov dx, seg String2
lea si, String2
Strcat2 ;NewString<-String1+String2
puts
free
;
les di, String1
Strcat2l
db "Appended String",0
puts
free
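The difference between Strcat and Strcat2 is easiest to see in a C model of Strcat2. The sketch below is an assumption-laden illustration (model_strcat2 is a hypothetical name, and NULL models the carry flag being set); it only mirrors the steps the text describes: measure both strings, allocate, copy the first, append the second.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of Strcat2: the result lives in a new heap block,
   and neither input string is modified. NULL models "carry set". */
char *model_strcat2(const char *first, const char *second)
{
    size_t n1 = strlen(first);
    size_t n2 = strlen(second);
    char *result = malloc(n1 + n2 + 1);   /* both strings + zero byte */
    if (result == NULL)
        return NULL;
    memcpy(result, first, n1);            /* copy the first string */
    memcpy(result + n1, second, n2 + 1);  /* append second + terminator */
    return result;
}
```

Plain Strcat, by contrast, corresponds to appending in place, which is why it needs a destination buffer that is already large enough.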
7.5 Strchr
* Searches for a single character inside a string.
Inputs: ES:DI- Pointer to string.
AL- Character to search for.
Outputs: CX- Position (starting at zero) where Strchr found the character.
Carry=0 if Strchr found the character.
Carry=1 if the character wasn't present in the string.
Include: stdlib.a
Strchr locates the first occurrence of a character within a string. It
searches through the zero-terminated string pointed at by es:di for the
character passed in AL. If it locates the character, it returns the position
of that character in the CX register. The first character in the string
corresponds to location zero. If the character is not in the string, Strchr
returns the carry flag set. CX's value is undefined in that case. If Strchr
locates the character in the string, it returns with the carry flag clear.
Example:
les di, String
mov al, Char2Find
strchr
jc NotPresent
mov CharPosn, cx
7.6 Strstr, Strstrl
* Searches for a substring inside another string.
Inputs: ES:DI- Pointer to string.
DX:SI- Pointer to substring (strstr).
CS:RET- Pointer to substring (strstrl).
Outputs: CX- Position (starting at zero) where Strstr/Strstrl found the
substring.
Carry=0 if Strstr/Strstrl found the substring.
Carry=1 if the substring wasn't present in the string.
Include: stdlib.a
Strstr searches for the position of a substring within another string.
ES:DI points at the string to search through, DX:SI points at the substring.
Strstr returns the index into ES:DI's string where DX:SI's string is found. If
the string is found, Strstr returns with the carry flag clear and CX contains
the (zero-based) index into the string. If Strstr cannot locate the substring
within the string ES:DI points at, it returns the carry flag set.
Strstrl works just like Strstr except it expects the substring to search
for immediately after the call instruction (rather than passing this address in
DX:SI).
Examples:
les di, MainString
lea si, Substring
mov dx, seg Substring
strstr
jc NoMatch
mov i, cx
printf
db "Found the substring '%s' at location %i\n",0
dd Substring, i
jmp Done
;
NoMatch: print
db "Could not find the substring.",cr,lf,0
Done: les di, MainString
strstrl
db "test",0
jc NoMatch2
print
db "Found 'test' in the string",cr,lf,0
jmp Done2
;
NoMatch2: print
db "Did not find 'test' in the string",cr,lf,0
Done2:
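The index-plus-flag result of Strstr can be modelled in C on top of the standard strstr function. This is only a sketch: model_strstr is a hypothetical name, the return value stands in for the carry flag (1 = found, carry clear), and *pos stands in for CX.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of Strstr: returns 1 and stores the zero-based
   index in *pos when the substring is present (carry clear); returns 0
   and leaves *pos untouched otherwise (carry set). */
int model_strstr(const char *s, const char *sub, size_t *pos)
{
    const char *hit = strstr(s, sub);
    if (hit == NULL)
        return 0;
    *pos = (size_t)(hit - s);   /* pointer difference = index */
    return 1;
}
```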
7.7 Strcmp, Strcmpl
* Compares two strings.
* Reflects comparison in 8086 condition code flags.
Inputs: ES:DI- Pointer to first string.
DX:SI- Pointer to second string (strcmp).
CS:RET- Pointer to second string (strcmpl).
Outputs: CX- Position (starting at zero) where the two strings differ.
Flags- hold the result of the comparison (should use unsigned
branches).
Include: stdlib.a
Strcmp and strcmpl compare two strings. Strcmp compares the string which
es:di points at to the string which dx:si points at. Strcmpl compares the
string which es:di points at to the string immediately following the call
instruction in the code stream. Strcmp(l) reflects the status of this
comparison in the flags register. Immediately upon return from strcmp(l) you
can use the unsigned jump instructions to test the comparison between the two
strings. Also (upon return), the CX register contains the index into the
strings where they are different (if the two strings are equal, Strcmp(l)
returns with CX containing the offset of the zero byte in the two strings).
Examples:
les di, String1
mov dx, seg String2
lea si, String2
strcmp
jae s1GEs2
mov i, cx
printf
db "String1 is less than String2 and they "
db "differ at position %i\n",0
dd i
;
les di, String3
strcmpl
db "Hello",0
jbe S3BEHello
;
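A C model helps show why the unsigned branches are the right ones to use after Strcmp. The sketch below is an illustration under stated assumptions, not the library code: model_strcmp is a hypothetical name, the sign of the return value stands in for the flags, and *pos stands in for CX.

```c
#include <stddef.h>

/* Hypothetical model of Strcmp. Returns <0, 0, or >0, comparing bytes
   as unsigned values, and stores in *pos the index where the strings
   first differ (or the index of the zero byte when they are equal). */
int model_strcmp(const char *s1, const char *s2, size_t *pos)
{
    size_t i = 0;
    while (s1[i] != '\0' && s1[i] == s2[i])
        i++;
    *pos = i;
    return (int)(unsigned char)s1[i] - (int)(unsigned char)s2[i];
}
```

Because the bytes are compared as unsigned values, a character with code 128 or above sorts after every 7-bit ASCII character; a signed comparison would sort it first.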
7.8 Stricmp, Stricmpl
* Compares two strings ignoring differences in alphabetic case.
* Reflects comparison in 8086 condition code flags.
Inputs: ES:DI- Pointer to first string.
DX:SI- Pointer to second string (stricmp).
CS:RET- Pointer to second string (stricmpl).
Outputs: CX- Position (starting at zero) where the two strings differ.
Flags- hold the result of the comparison (should use unsigned
branches).
Include: stdlib.a
Stricmp and Stricmpl work just like Strcmp and Strcmpl except that these
two routines are case insensitive. Strcmp and Strcmpl treat "GETS" and "gets"
as different strings. Stricmp and Stricmpl treat these two strings as equal.
7.9 Strupr, Strupr2
* Converts all of the lower case characters in a string to upper case.
* Converts the characters in place (Strupr) or creates a new string on
the heap for the converted string (Strupr2).
Inputs: ES:DI- Pointer to string.
Outputs: ES:DI- Pointer to new string on heap (Strupr2 only).
Carry=1 if memory allocation error (Strupr2 only).
Include: stdlib.a
Strupr and Strupr2 convert the alphabetic characters in a string to upper
case. You pass the address of the string containing the characters you want to
convert in ES:DI. Strupr converts the characters in place. That is, it will
actually modify the string you pass to it. Strupr2 first calls strdup to
duplicate the string (on the heap) and then it converts the characters in this
duplicate string to upper case, returning the pointer to the new string in
ES:DI.
Examples:
les di, Str2Cnvrt
strupr
les di, Str2Cnvrt
puts
les di, Str2Cnvrt2
strupr2
puts
free
7.10 Strlwr, Strlwr2
* Converts all of the upper case characters in a string to lower case.
* Converts the characters in place (Strlwr) or creates a new string on
the heap for the converted string (Strlwr2).
Inputs: ES:DI- Pointer to string.
Outputs: ES:DI- Pointer to new string on heap (Strlwr2 only).
Carry=1 if memory allocation error (Strlwr2 only).
Include: stdlib.a
Strlwr and Strlwr2 convert the alphabetic characters in a string to lower
case. You pass the address of the string containing the characters you want to
convert in ES:DI. Strlwr converts the characters in place. That is, it will
actually modify the string you pass to it. Strlwr2 first calls strdup to
duplicate the string (on the heap) and then it converts the characters in this
duplicate string to lower case, returning the pointer to the new string in
ES:DI.
Examples:
les di, Str2Cnvrt
strlwr
les di, Str2Cnvrt
puts
les di, Str2Cnvrt2
strlwr2
puts
free
7.11 Strset, Strset2
* Initializes all the characters in a string to a single value.
* Automatically allocates storage on the heap for the string (Strset2
only).
Inputs: ES:DI- Pointer to string (Strset only)
AL- Character to initialize the string with.
CX- Length of string (Strset2 only).
Outputs: ES:DI- Pointer to new string on heap (Strset2 only).
Carry=1 if memory allocation error (Strset2 only).
Include: stdlib.a
Strset and Strset2 initialize strings such that each element of the string
contains the same value (passed in AL). Strset overwrites the data in an
existing string, replacing the characters previously in the string. To use
Strset, simply load ES:DI with the address of a string, load AL with the
character you want to overwrite the string with, and then call Strset. Strset
will replace each existing character (up to the zero terminating byte) of the
string with the character in AL.
Strset2 lets you create a brand-new string. You pass the initialization
character in AL and the length of the string in CX. Strset2 allocates CX+1
bytes on the heap and initializes the first CX bytes to the value in AL. It
stores a zero in the last memory location.
Examples:
les di, Str2Cnvrt
mov al, '*'
Strset
;
mov al, '#'
mov cx, 32
Strset2
puts
free
;
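The CX+1 allocation performed by Strset2 can be modelled in C as follows. This is a sketch only: model_strset2 is a hypothetical name, and a NULL return stands in for the carry flag being set.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of Strset2: allocate len+1 bytes, fill the first
   len bytes with 'fill', and store a zero in the last byte.
   NULL models "carry set" (memory allocation error). */
char *model_strset2(char fill, size_t len)
{
    char *s = malloc(len + 1);
    if (s == NULL)
        return NULL;
    memset(s, fill, len);
    s[len] = '\0';
    return s;
}
```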
7.12 Strspan, Strspanl
* Allows you to skip over successive characters in a string.
* Very compact implementation.
Inputs: ES:DI- Pointer to string to scan.
DX:SI- Pointer to character set (Strspan only).
CS:RET- Pointer to character set (Strspanl only).
Outputs: CX- First position where Strspan(l) could not find a character in the
attendant character set. CX holds the index of the zero terminating byte of
the string if all of the characters in the string were present in the
set.
Include: stdlib.a
Strspan(l) scans a string counting the number of characters which are
present in a second string (which represents a character set). While each
successive character in the source string is present in the character set,
Strspan(l) advances past it. ES:DI points at a zero-terminated string of
characters to check. DX:SI (strspan) or CS:RET (strspanl) points at another
zero-terminated string containing the set of characters to compare against.
While the character that ES:DI points at is present (anywhere) in the character
set string, the routine advances to the next character and bumps a counter by
one. Upon encountering a character which is not in the character set string,
the routine terminates and returns the number of characters (i.e., an index
into the string) where the mismatch occurred.
Although strspan (and, especially, strspanl) is very compact and
convenient to use, it is not particularly efficient. The character set
routines described in the next section provide a much faster alternative at the
expense of a little more space.
Examples:
les di, String
mov dx, seg CharSet
lea si, CharSet
strspan
mov i, cx
printf
db "The first char which is not in CharSet "
db "occurs at position %d in String.\n",0
dd i
;
les di, String
strspanl
db "aeiou",0
mov j, cx
printf
db "The first char which is not a vowel "
db "occurs at position %d in String.\n",0
dd j
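The scan performed by Strspan corresponds closely to C's strspn. The sketch below spells the loop out to show why the routine is compact but slow: every character of the string triggers a linear search of the set string. The name model_strspan is hypothetical, and the return value stands in for the count the assembly example reads from CX.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of Strspan: count leading characters of s that
   appear anywhere in the zero-terminated string 'set'. */
size_t model_strspan(const char *s, const char *set)
{
    size_t i = 0;
    /* Test for the terminator first: strchr would "find" '\0' in set. */
    while (s[i] != '\0' && strchr(set, s[i]) != NULL)
        i++;
    return i;
}
```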
7.13 Strcspan, Strcspanl
* Allows you to skip past characters in a string which are not members
of a particular character set.
Inputs: ES:DI- Pointer to string to scan.
DX:SI- Pointer to character set (Strcspan only).
CS:RET- Pointer to character set (Strcspanl only).
Outputs: CX- First position where Strcspan(l) found a character in the attendant
character set. CX holds the index of the zero terminating byte of the string
if none of the characters in the string were in the set.
Include: stdlib.a
These two routines work just like strspan and strspanl except they skip
over characters which are not in the set rather than skipping over characters
that are in the associated character set.
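In C terms, Strcspan behaves like strcspn: the loop condition is simply inverted. The following is a sketch with a hypothetical name (model_strcspan), not the library's implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of Strcspan: count leading characters of s that
   do NOT appear in the zero-terminated string 'set'. */
size_t model_strcspan(const char *s, const char *set)
{
    size_t i = 0;
    while (s[i] != '\0' && strchr(set, s[i]) == NULL)
        i++;
    return i;
}
```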
8 Character Set Routines
The character set routines let you deal with groups of characters as a set
rather than a string. A set is an unordered collection of objects where
membership (presence or absence) is the only important quality. I designed the
stdlib set routines to let you quickly check to see if an ASCII character is in
a set, to quickly add characters to a set or remove characters from a set.
These operations are the ones most commonly used on character sets. The other
operations (like union, intersection, difference, etc.) are useful, but don't
enjoy the popularity of use as the former routines. Therefore, I've optimized
the data structure for sets to handle the membership and add/delete operations
at the slight expense of the others.
Character sets are implemented via bit vectors. A "1" bit means that an
item is present in the set and a "0" bit means that the item is absent from the
set. The most common implementation of a character set is to use 32
consecutive bytes, eight bits per, giving 256 bits (one bit for each character
in the character set). This makes certain operations (like assignment,
union, intersection, etc.) fast and convenient. Other operations (membership,
add/remove items), however, run much slower. Since these are the more
important operations, I've chosen a different data structure to represent
sets. A faster approach is to simply use a byte value for each item in the
set. This offers a major advantage over the 32-byte scheme: for operations like
membership it's very fast (since all you've got to do is index into an array
and test the resulting value). It has two drawbacks: first, operations like
set assignment, union, difference, etc., require 256 operations rather than
32. Second, it takes eight times as much memory.
The first drawback, speed, matters little. You'll rarely use the
operations so affected, so the fact that they run a little slower is of
little consequence. Wasting 224 bytes is a problem, however, especially if
you have a lot of character sets.
The approach I've used is to allocate 272 bytes. The first eight bytes
contain bit masks, 1, 2, 4, 8, 16, 32, 64, and 128. These masks tell you which
bit in the following 264 bytes is associated with the set. This lets me pack
eight sets into 272 bytes (34 bytes per character set). This provides almost
the speed of the 256-byte set with only a two byte overhead.
In the stdlib.a file there is a macro that lets you define a group of
character sets: set. You use the macro as follows:
set set1, set2, set3, ..., set8
You must supply between one and eight labels in the operand field. These
are the names of the sets you want to create. The set macro automatically
attaches these labels to the appropriate mask bytes in the set. The actual bit
patterns for the set begin eight bytes later (from each label). Therefore, the
byte corresponding to chr(0) is staggered by one byte for each set (which
explains the other eight bytes needed above and beyond the 256 required for the
set).
When using the set manipulation routines, you should always pass the
address of the mask byte (i.e., the seg/offset of one of the labels above) to
the particular set manipulation routine you're using. Passing the address of
the structure created with the macro above will reference only the first set in
the group.
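The packed layout described above can be modelled in C. This sketch is inferred from the description only (eight mask bytes, data starting eight bytes past each label, one byte of stagger per set); the function names are hypothetical and the real routines may differ in detail. The pointer "set" plays the role of ES:DI and must point at a mask byte.

```c
#include <string.h>

enum { SET_GROUP_BYTES = 272, SET_DATA_OFFSET = 8 };

/* Initialise a group of eight sets: mask bytes 1, 2, 4, ..., 128
   followed by cleared data bytes (all eight sets empty). */
void group_init(unsigned char group[SET_GROUP_BYTES])
{
    memset(group, 0, SET_GROUP_BYTES);
    for (int k = 0; k < 8; k++)
        group[k] = (unsigned char)(1u << k);
}

/* The flag byte for character c sits eight bytes past the mask byte,
   plus c. Membership is one index and one bit test. */
int set_member(const unsigned char *set, unsigned char c)
{
    return (set[SET_DATA_OFFSET + c] & set[0]) != 0;
}

void set_add(unsigned char *set, unsigned char c)
{
    set[SET_DATA_OFFSET + c] |= set[0];
}

void set_remove(unsigned char *set, unsigned char c)
{
    set[SET_DATA_OFFSET + c] &= (unsigned char)~set[0];
}
```

Set k in a group is addressed as group + k. The highest byte touched is 7 + 8 + 255 = 270, which fits inside the 272-byte block, and two sets never share a bit because each uses a different mask.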
Note that you can use the set operations for fast pattern matching
applications. The set membership operation, for example, is much faster than
the strspan routine found in the string package. Proper use of character sets
can produce a program which runs much faster than some of the equivalent string
operations.
8.1 Createsets
* Allocates storage for eight character sets on the heap.
Inputs: None.
Outputs: ES:DI- Pointer to eight sets.
Carry=0 if no error.
Carry=1 if insufficient memory to allocate storage for sets.
Include: stdlib.a
Createsets allocates 272 bytes on the heap. This is sufficient room for
eight character sets. It then initializes the first eight bytes of this
storage with the proper mask values for each set. Location es:0[di] gets set
to 1, location es:1[di] gets 2, location es:2[di] gets 4, etc. The createsets
routine also initializes all of the sets to the empty set by clearing all the
bits to zero.
Example:
createsets
jc NoMemory
mov word ptr SetPtr, di
mov word ptr SetPtr+2, es
;
8.2 EmptySet
* Clears all of the bits for a particular set to zero.
Inputs: ES:DI- pointer to first byte of desired set.
Outputs: None.
Include: stdlib.a
Emptyset clears out the bits in a character set to zero (thereby setting
it to the empty set). Upon entry, es:di must point at the first byte of the
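character set you want to clear. Note that this is not necessarily the address
returned by createsets (that address selects only the first set in the group).
Each of the first eight bytes of a character set structure is the mask byte for
one of eight different sets, and the address of that byte identifies the set.
ES:DI must point at one of these bytes upon entry into emptyset.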
Example:
les di, SetPtr
add di, 3 ;Point at 4th set in group.
emptyset
;
8.3 RangeSet
* Adds all of the elements between two values to a set.
Inputs: ES:DI- pointer to first byte of desired set.
AL- Lower bound for range of items.
AH- Upper bound for range (must be greater than AL).
Outputs: None.
Include: stdlib.a
Rangeset adds the range of values from AL to AH to a set (via a union
operation), leaving any items already in the set in place.
Example:
les di, SetPtr
add di, 4 ;Point at 5th set in group.
mov al, 'A' ;Add in the upper case letters
mov ah, 'Z'
rangeset
;
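A C model of Rangeset over the packed set layout might look like this. It is a sketch under two assumptions that the text does not state outright: that both bounds are included in the range, and that the flag byte for character c lies at offset 8 + c from the mask byte. The name model_rangeset is hypothetical.

```c
/* Hypothetical model of Rangeset. 'set' points at a mask byte (the
   analogue of ES:DI); lo and hi stand in for AL and AH. Bounds are
   assumed to be inclusive. */
void model_rangeset(unsigned char *set, unsigned char lo, unsigned char hi)
{
    /* An unsigned counter avoids wrap-around when hi is 255. */
    for (unsigned c = lo; c <= hi; c++)
        set[8 + c] |= set[0];   /* union the character into the set */
}
```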
8.4 Addstr, Addstrl
* Adds all of the characters from a string to a set.
Inputs: ES:DI- pointer to first byte of desired set.
DX:SI- pointer to string to add to set (Addstr only).
CS:RET-pointer to string to add to set (Addstrl only).
Outputs: None.
Include: stdlib.a
Addstr lets you add a group of characters to a set by specifying a string
containing the characters you want in the set. To Addstr you pass a pointer to
a zero-terminated string in dx:si. Addstr will add (union) each character from
this string into the set. Addstrl lets you specify the string as a literal
constant immediately after the call to addstrl.
Example:
les di, SetPtr
add di, 1 ;Point at 2nd set in group.
mov dx, seg CharStr ;Pointer to string containing
lea si, CharStr ; chars to add to set.
addstr ;Union in these characters.
;
les di, SetPtr ;Point at first set in group.
addstrl
db "AaBbCcDdEeFf0123456789",0
;
8.5 Rmvstr, Rmvstrl
* Removes all of the characters in a string from a set.
Inputs: ES:DI- pointer to first byte of desired set.
DX:SI- pointer to string to remove from set (Rmvstr only).
CS:RET-pointer to string to remove from set (Rmvstrl only).
Outputs: None.
Include: stdlib.a
Rmvstr is the converse operation to Addstr. It removes from a set the
characters appearing in the associated character string. Rmvstrl works the
same way except you pass the string of characters immediately after the call
rather than via a pointer in DX:SI.
Example:
les di, SetPtr
add di, 1 ;Point at 2nd set in group.
mov dx, seg CharStr ;Pointer to string containing
lea si, CharStr ; chars to remove from set.
rmvstr ;Remove these characters.
;
les di, SetPtr ;Point at first set in group.
rmvstrl
db "AaBbCcDdEeFf0123456789",0
;
8.6 AddChar
* Adds a single character to a set.
Inputs: ES:DI- pointer to first byte of desired set.
AL- character to add to the set.
Outputs: None.
Include: stdlib.a
AddChar lets you add a single character (passed in AL) to a set.
Example:
les di, SetPtr
add di, 1 ;Point at 2nd set in group.
mov al, Ch2Add ;Character to add to set.
addchar
8.7 RmvChar
* Removes a single character from a set.
Inputs: ES:DI- pointer to first byte of desired set.
AL- character to remove from the set.
Outputs: None.
Include: stdlib.a
RmvChar lets you remove a single character (passed in AL) from a set.
Example:
les di, SetPtr
add di, 1 ;Point at 2nd set in group.
mov al, Ch2Rmv ;Character to remove from set.
rmvchar
8.8 Member
* Checks a character value to see if it is in the set.
Inputs: ES:DI- pointer to first byte of desired set.
AL- character to check.
Outputs: Zero flag=1 if character is in the set.
Zero flag=0 if character is not in the set.
Include: stdlib.a
Member lets you check for set membership, that is, it lets you see if a
character value is present in some set. This routine is probably the
most-often called routine in the collection of set routines.
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
mov al, Ch2Chk ;Character to check.
member
je IsInSet
;
8.9 CopySet
* Copies one set to another.
Inputs: ES:DI- pointer to first byte of destination set.
DX:SI- pointer to first byte of source set.
Outputs: None.
Include: stdlib.a
CopySet copies the items from one set to another. This is a straight
assignment, not a union operation. After the operation the destination set is
identical to the source set, both in terms of the elements present in the set
and those absent from it.
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
mov dx, seg SetPtr2 ;Point at first set in group.
lea si, SetPtr2
copyset
;
8.10 SetUnion
* Unions (adds) the members of one set into another.
Inputs: ES:DI- pointer to first byte of destination set.
DX:SI- pointer to first byte of source set.
Outputs: None.
Include: stdlib.a
The SetUnion routine computes the union of two sets. That is, it adds all
of the items present in a source set to a destination set. This operation
preserves items present in the destination set before the SetUnion operation.
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
mov dx, seg SetPtr2 ;Point at first set in group.
lea si, SetPtr2
unionset
;
8.11 SetIntersect
* Computes the intersection of two sets.
Inputs: ES:DI- pointer to first byte of destination set.
DX:SI- pointer to first byte of source set.
Outputs: None.
Include: stdlib.a
Setintersect computes the intersection of two sets, leaving the result in
the destination set. The new set consists only of those items which previously
appeared in both the source and destination sets.
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
mov dx, seg SetPtr2 ;Point at first set in group.
lea si, SetPtr2
setintersect
;
8.12 SetDifference
* Computes the difference of two sets.
Inputs: ES:DI- pointer to first byte of destination set.
DX:SI- pointer to first byte of source set.
Outputs: None.
Include: stdlib.a
SetDifference computes the result of (ES:DI) := (ES:DI) - (DX:SI). The
destination set is left with its original items minus those items which are
also in the source set.
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
mov dx, seg SetPtr2 ;Point at first set in group.
lea si, SetPtr2
setdifference
;
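The four whole-set operations (CopySet, SetUnion, SetIntersect, SetDifference) can be modelled in C over the packed layout. The sketch below assumes the flag byte for character c lies at offset 8 + c from a set's mask byte; all names are hypothetical. It also shows the cost mentioned earlier: each operation visits all 256 characters.

```c
/* dst and src each point at a mask byte (analogues of ES:DI, DX:SI). */
int set_has(const unsigned char *set, unsigned c)
{
    return (set[8 + c] & set[0]) != 0;
}

void set_put(unsigned char *set, unsigned c, int present)
{
    if (present)
        set[8 + c] |= set[0];
    else
        set[8 + c] &= (unsigned char)~set[0];
}

void model_copyset(unsigned char *dst, const unsigned char *src)
{
    for (unsigned c = 0; c < 256; c++)
        set_put(dst, c, set_has(src, c));
}

void model_setunion(unsigned char *dst, const unsigned char *src)
{
    for (unsigned c = 0; c < 256; c++)
        set_put(dst, c, set_has(dst, c) || set_has(src, c));
}

void model_setintersect(unsigned char *dst, const unsigned char *src)
{
    for (unsigned c = 0; c < 256; c++)
        set_put(dst, c, set_has(dst, c) && set_has(src, c));
}

void model_setdifference(unsigned char *dst, const unsigned char *src)
{
    for (unsigned c = 0; c < 256; c++)
        set_put(dst, c, set_has(dst, c) && !set_has(src, c));
}
```

Only the destination's mask bit is ever written, so the source set and the other sets sharing the same bytes are left untouched.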
8.13 NextItem
* Locates the next (first) available item in a set.
* Searches for items in ascending order using the ASCII collating
sequence.
Inputs: ES:DI- pointer to first byte of set.
Outputs: AL- Contains first item found in set (zero if the set is empty).
Include: stdlib.a
NextItem searches for the next available item in a set. It returns the
ASCII code of the character it finds in the AL register. If the set is empty,
NextItem returns zero (since chr(0) is illegal). This call does not affect the
set in any way. In particular, after the call the character located will still
be present in the set.
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
nextitem
mov ch2, al
;
8.14 RmvItem
* Locates the next (first) available item in a set and then removes
that item from the set.
* Searches for items in ascending order using the ASCII collating
sequence.
Inputs: ES:DI- pointer to first byte of set.
Outputs: AL- Contains first item found in set (zero if the set is empty).
Include: stdlib.a
RmvItem searches for the next available item in a set. It returns the
ASCII code of the character it finds in the AL register and removes that item
from the set. If the set is empty, RmvItem returns zero (since chr(0) is
illegal).
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
rmvitem
mov ch3, al
;
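The relationship between NextItem and RmvItem can be modelled in C. This is a sketch with hypothetical names; it assumes the packed layout (flag byte for character c at offset 8 + c from the mask byte) and assumes the scan skips chr(0), since zero is reserved to mean "empty set".

```c
/* Hypothetical model of NextItem: return the lowest character code in
   the set, or 0 if the set is empty. The set is not modified. */
unsigned char model_nextitem(const unsigned char *set)
{
    for (unsigned c = 1; c < 256; c++)
        if (set[8 + c] & set[0])
            return (unsigned char)c;
    return 0;
}

/* Hypothetical model of RmvItem: same search, but the item found is
   also removed from the set. */
unsigned char model_rmvitem(unsigned char *set)
{
    unsigned char c = model_nextitem(set);
    if (c != 0)
        set[8 + c] &= (unsigned char)~set[0];
    return c;
}
```

Calling model_rmvitem repeatedly until it returns zero enumerates the set in ascending ASCII order, emptying it along the way.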
This software is ...
SHARE WARE
'cuz I'm sharing it with you!
I do not want any registrations or fees for the use of this software. I
thank God and Jesus Christ (my personal saviour) for giving me the ability to
write such software. God wants all of us to use our talents to glorify him,
therefore I offer this software as such.
Now for the catch... It is more blessed to give than to receive. If this
software saves you time and effort and you enjoy using it, my life will be
enriched knowing that others have appreciated my work. I would like to share
this wonderful feeling with you. If you like this software and use it, I would
like you to contribute at least one routine to the library. Perhaps you think
this library has some neat-o routines in it. Imagine how nice it would become
if everyone used their imagination to contribute something useful to it.
I hereby release this software to the public domain. You can use it in
any way you see fit. However, I would appreciate it if you share this software
with others much as I've shared it with you. I'm not suggesting that you give
away software you've written with this package (I'm not quite as crazy as
Richard Stallman, bless his heart), but if someone else would like a copy of
this library, please help them out. Naturally, I'd be tickled pink to receive
credit in software that uses these routines (which is the honorable thing to
do) but I understand the way many corporations operate and won't be terribly
put off if you use it without giving due credit. Enjoy!
If you have comments, bug reports, new code to contribute, etc., you can
reach me at:
rhyde (On BIX).
[email protected] (On Internet).
[email protected] (On Internet, this one may go away).
or
Randy Hyde
Dept of Computer Science
2208 Sproul Hall
University of California
Riverside, Ca. 92521-0135
or
Randy Hyde
c/o Braintec Corporation
10 Corporate Park Way, ste 110
Irvine, Ca. 92714
1.1 Comments about the code
This code has received very little testing. C'mon, whadda expect for
free? I've been cranking this stuff out as fast as possible without going back
and reworking anything I've done. The only exception has been modification of
the routines to use the es:di/dx:si register pairs rather than es:si/ds:di
register pairs. I expect those modifications introduced more bugs. Please
don't expect super optimal code here. I haven't had any time to study and improve
this code. Most of it is fairly mediocre (from a size/speed point of view).
Hopefully, you'll agree, it's the idea that counts. If you don't like
something I've done, you've got the sources -- have at it. (Of course, I'd
appreciate it if you would send me any modifications.)
1.2 Wish List
Next, I'll be working on FILE I/O versions of the I/O routines in this
package. Sooner or later I'll get around to adding floating point routines to
this package. If you're interested in adding some routines to this package,
GREAT!
Routines I'd like to have but am too busy to work on now:
1) Routines which manipulate directories (read/write/etc.)
2) A regular expression interpreter.
3) Length-prefixed strings package.
4) A windowing package.
5) A graphics package.
6) An object-oriented programming class library.
7) Just about anything else appearing in a HLL "standard" library.
If you've got any ideas, I'd love to discuss them with you. Best bet is
to reach me electronically at the E-MAIL addresses above.
1.3 Missing Routines to Supply RSN
String package:
strins- Inserts one string into the middle of another.
strdel- Deletes a sequence of characters from the middle of a string.
Character Set Package:
span- Skips through a sequence of characters in a string which belong to
a character set.
break- Skips through a sequence of characters in a string which do not
belong to a character set.
Memory Manager Package
Memavail- Largest block of free memory available on the heap.
Memfree- Total amount of free space on the heap.
2 Character Output Routines
2.1 Putc
* Outputs character in AL register to the standard output device.
* Output is redirectable to user-written routine.
Inputs: AL- character to print.
Outputs: None.
Include: stdlib.a
Putc is the primitive character output routine. Most other output routines in
the standard library output data through this procedure. It prints the
ASCII character in AL. Processing of control codes is undefined
although most output routines this guy links to should be able to
handle return, line feed, back space, and tab. By default, this
routine calls DOS to print the character to the standard output
device.
Example:
mov al, 'C'
putc ;Prints "C" to std output.
2.2 PutCR
* Easy way of printing a newline to the stdlib standard output.
Inputs: None.
Outputs: None.
Include: stdlib.a
Prints a newline (carriage return/line feed) to the current standard
output device.
Example:
PutCR
2.3 PutcStdOut
* Outputs character in AL to the DOS standard output device.
* Sends a character directly to the DOS std output device.
* Output is redirectable via DOS I/O redirection.
* Bypasses redirection through the standard library Putc routine.
Inputs: AL- character to output.
Outputs: None.
Include: stdlib.a
PutcStdOut calls DOS to print the character in AL to the standard output
device. Although processing of non-ASCII characters and control characters is
undefined, most output devices handle these characters properly. In
particular, most output devices properly handle return, line feed, back space,
and tab.
Example:
mov al, 'C'
PutcStdOut ;Writes "C" to std output.
2.4 PutcBIOS
* Prints character in AL to the display device by calling BIOS.
* Cannot be redirected by stdlib or by DOS.
* Uses INT 10H/AH=14 for teletype-like output.
* Handles return, line feed, back space, and tab. Prints other
control characters using the IBM Character set.
Inputs: AL- Character to print.
Outputs: None.
Include: stdlib.a
PutcBIOS prints the character in AL using the BIOS routines. Output
through this routine cannot be redirected; such output is always sent to the
video display on the PC (unless, of course, someone has patched INT 10h).
Example:
mov al, "C"
PutcBIOS
2.5 GetOutAdrs
* Retrieves address of the current output routine.
Inputs: None.
Outputs: es:di - address of current output routine (called by Putc).
Include: stdlib.a
You can use this function to get the address of the current output
routine, perhaps so you can save it or see if it is currently pointing at some
particular piece of code. If you want to temporarily redirect the output and
then restore the original output routine, consider using PushOutAdrs/PopOutAdrs
described later.
Example:
GetOutAdrs
mov word ptr SaveOutAdrs, di
mov word ptr SaveOutAdrs+2, es
2.6 SetOutAdrs
* Lets you set the address of the current output routine.
Inputs: es:di- Address of new output routine.
Outputs: None.
Include: stdlib.a
This routine redirects the stdlib standard output so that it calls the
routine whose address you pass in es:di. Your output routine should expect the
character in AL and must preserve all registers. At a bare minimum, it should
handle the printable ASCII characters and the four control characters return,
line feed, back space, and tab (unless, of course, the main purpose of this
routine is to handle these codes in a different fashion).
Example:
mov ax, seg NewOutputRoutine ;A segment register cannot be loaded
mov es, ax ;with an immediate value.
mov di, offset NewOutputRoutine
SetOutAdrs
.
.
.
les di, RoutinePtr
SetOutAdrs
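As a rough sketch of an output routine that meets these requirements, the
following procedure converts lower case letters to upper case and passes every
character on to PutcStdOut. The names UpperOut and NotLower are made up for
this illustration. The code assumes that stdlib reaches the routine with a far
call (the address is passed as a segment:offset pair) and that PutcStdOut
itself preserves all registers.
        UpperOut proc far
                pushf                   ;Preserve the flags too, to be safe.
                push ax                 ;AL may be changed below.
                cmp al, 'a'
                jb NotLower
                cmp al, 'z'
                ja NotLower
                and al, 5fh             ;Convert 'a'..'z' to 'A'..'Z'.
        NotLower:
                PutcStdOut              ;Pass the character on to DOS.
                pop ax
                popf
                ret                     ;Far return because of "proc far".
        UpperOut endp
Installing it follows the example above, with UpperOut in place of
NewOutputRoutine.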
2.7 PushOutAdrs
* Lets you redirect the standard output device and preserve the
previous address.
* Saves up to 16 old output routine addresses on an internal stack.
* Restoration is possible using PopOutAdrs.
Inputs: es:di- Address of new output routine.
Outputs: Carry=0 if operation successful.
Carry=1 if there were already 16 items on the stack.
Include: stdlib.a
This routine "pushes" the current output address onto an internal stack
and then stores the value in es:di into the current output routine pointer.
The PushOutAdrs and PopOutAdrs routines let you easily save and redirect the
standard output and then restore the original output routine address later on.
If you attempt to push more than 16 items on the stack, PushOutAdrs will
ignore your request and return with the carry flag set. If PushOutAdrs is
successful, it will return with the carry flag clear.
Example:
mov ax, seg NewOutputRoutine ;A segment register cannot be loaded
mov es, ax ;with an immediate value.
mov di, offset NewOutputRoutine
PushOutAdrs
.
.
.
les di, RoutinePtr
PushOutAdrs
2.8 PopOutAdrs
* Restores output routine addresses saved by PushOutAdrs.
* Defaults to PutcStdOut if you attempt to pop too many items off the
stack.
Inputs: None.
Outputs: es:di- Points at the previous stdout routine before the pop.
Include: stdlib.a
PopOutAdrs undoes the effects of PushOutAdrs. It pops an item off the
internal stack and stores it into the output routine pointer. The previous
value in the output pointer is returned in es:di.
Example:
mov ax, seg NewOutputRoutine ;A segment register cannot be loaded
mov es, ax ;with an immediate value.
mov di, offset NewOutputRoutine
PushOutAdrs
.
.
.
PopOutAdrs
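The carry flag returned by PushOutAdrs is easy to overlook. A minimal sketch
of a temporary redirection that checks it might look like the following.
NewOutputRoutine is the same placeholder used in the examples above, SkipOutput
is a hypothetical label, and AX serves as a scratch register because the 8086
cannot load a segment register with an immediate value.
                mov ax, seg NewOutputRoutine
                mov es, ax
                mov di, offset NewOutputRoutine
                PushOutAdrs
                jc SkipOutput           ;16 addresses already saved; nothing
                                        ;was pushed, so do not pop either.
                print                   ;This text goes to NewOutputRoutine.
                db "Redirected output",13,10,0
                PopOutAdrs              ;Back to the previous output routine.
        SkipOutput:
Skipping the pop when the push fails keeps the pushes and pops balanced, so an
address saved by some other part of the program is not removed by accident.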
2.9 Puts
* Outputs a string of characters to the stdlib standard output device.
* Calls putc for each character in the string thereby sending each
character out to the standard output device.
Inputs: es:di- Contains the address of the string to print.
Outputs: None.
Include: stdlib.a
Puts prints a zero-terminated string whose address appears in es:di. Each
character appearing in the string is printed verbatim. There are no special
escape characters. Unlike the "C" routine by the same name, puts does not
print a newline after printing the string. Use putcr if you want to print the
newline after printing a string with puts.
Example:
les di, StrToPrt
puts
putcr
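Because the example loads ES:DI with LES, StrToPrt must be a doubleword
variable holding the far address of the string, not the string itself. A
sketch of declarations that would satisfy this follows; the name MyString and
its text are hypothetical, and DS is assumed to address the segment that
contains StrToPrt.
        MyString db "Hello there",0     ;Zero-terminated string.
        StrToPrt dd MyString            ;Far pointer to the string.
If you have only the string and no pointer variable, you can load the address
directly:
                mov ax, seg MyString
                mov es, ax
                mov di, offset MyString
                puts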
2.10 Puth
* Outputs the byte in AL as two hex digits (including leading zero if
necessary).
* Calls stdlib putc routine to print both characters to the stdlib
standard output device.
Inputs: AL- Value to print.
Outputs: None.
Include: Stdlib.a
Prints the value in the AL register as two hexadecimal digits. If the
value in AL is between 0 and 0Fh, puth will print a leading zero. This routine
calls the stdlib standard output routine (putc) to print all characters.
Example:
mov al, 1fh
puth
2.11 Putw
* Outputs the word in AX as four hex digits (including leading zeros
if necessary).
* Calls stdlib putc routine to print characters to the stdlib standard
output device.
Inputs: AX- Value to print.
Outputs: None.
Include: Stdlib.a
Prints the value in the AX register as four hexadecimal digits. If the
value in AX is less than 1000h, putw will print leading zeros. This routine
calls the stdlib standard output routine (putc) to print all characters.
Example:
mov ax, 0f1fh
putw
2.12 Puti
* Outputs the word in AX as a signed decimal number (including minus
sign, if necessary).
* Calls stdlib putc routine to print characters to the stdlib standard
output device.
Inputs: AX- Value to print.
Outputs: None.
Include: Stdlib.a
Prints the value in the AX register as a decimal integer. This routine
uses the exact number of screen positions required to print the number
(including a position for the minus sign, if the number is negative). This
routine calls the stdlib standard output routine (putc) to print all
characters.
Example:
mov ax, -1234
puti
2.13 Putu
* Outputs the word in AX as an unsigned decimal number.
* Calls stdlib putc routine to print characters to the stdlib
standard output device.
Inputs: AX- Value to print.
Outputs: None.
Include: Stdlib.a
Prints the value in the AX register as an unsigned decimal integer. This routine
uses the exact number of screen positions required to print the number. This
routine calls the stdlib standard output routine (putc) to print all
characters.
Example:
mov ax, 1234
putu
2.14 Putl
* Outputs the double word in DX:AX as a signed decimal number
(including minus sign, if necessary).
* Calls stdlib putc routine to print characters to the stdlib standard
output device.
Inputs: DX:AX- Value to print.
Outputs: None.
Include: Stdlib.a
Prints the value in the DX:AX registers as a decimal integer. This
routine uses the exact number of screen positions required to print the number
(including a position for the minus sign, if the number is negative). This
routine calls the stdlib standard output routine (putc) to print all
characters.
Example:
mov dx, 0ffffh
mov ax, -1234
putl
2.15 Putul
* Outputs the double word in DX:AX as an unsigned decimal number.
* Calls stdlib putc routine to print characters to the stdlib standard
output device.
Inputs: DX:AX- Value to print.
Outputs: None.
Include: Stdlib.a
Prints the value in the DX:AX registers as an unsigned decimal integer. This
routine uses the exact number of screen positions required to print the
number. This routine calls the stdlib standard output routine (putc) to print
all characters.
Example:
mov dx, 12h
mov ax, 1234
putul
2.16 PutISize
* Prints the value in AX as a signed decimal integer.
* Prints the number in a minimum field width specified by the value in
CX.
Inputs: AX- Value to print.
CX- Minimum number of print positions to use.
Outputs: None.
Include: Stdlib.a
PutISize prints the signed integer value in AX to the stdlib standard
output device using a minimum of n print positions. CX contains n, the minimum
field width for the output value. The number (including any necessary minus
sign) is printed right justified in the output field.
If the number in AX requires more print positions than specified by CX,
PutISize uses however many print positions are necessary to actually print the
number. If you specify zero in CX, PutISize uses the minimum number of print
positions required. Of course, PutI will also use the minimum number of print
positions without disturbing the value in the CX register.
Note that under no circumstances will the number in AX ever require more
than six print positions (-32,768, for example, requires five digits plus the
minus sign).
Examples:
mov cx, 5
mov ax, I
PutISize
.
.
.
mov cx, 12
mov ax, J
PutISize
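A typical use of a minimum field width is lining up a column of numbers. The
loop below is only a sketch: Values, NumValues, and PrtLoop are hypothetical
names, DS is assumed to address the segment containing Values, and the code
assumes that PutISize and PutCR preserve SI and BX. CX is reloaded on every
pass so the loop does not depend on PutISize preserving it.
        Values dw 7, -42, 1234, -32768
        NumValues equ 4
                .
                .
                mov si, offset Values
                mov bx, NumValues
        PrtLoop:
                mov ax, [si]            ;Next value to print.
                mov cx, 8               ;Minimum field width.
                PutISize
                putcr
                add si, 2               ;Advance to the next word.
                dec bx
                jnz PrtLoop
According to the description above, each value should appear right justified
in an eight-character field:
               7
             -42
            1234
          -32768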
2.17 PutUSize
* Prints the value in AX as an unsigned decimal integer.
* Prints the number in a minimum field width specified by the value in
CX.
Inputs: AX- Value to print.
CX- Minimum number of print positions to use.
Outputs: None.
Include: Stdlib.a
Like PutISize above except this guy prints unsigned values. Note that the
maximum number of print positions required by any number (e.g., 65,535) is
five.
Example:
mov cx, 8
mov ax, U
PutUSize
2.18 PutLSize
* Prints the value in DX:AX as a long signed decimal integer.
* Prints the number in a minimum field width specified by the value in
CX.
Inputs: DX:AX- Value to print.
CX- Minimum number of print positions to use.
Outputs: None.
Include: Stdlib.a
Like PutISize above, except this guy prints the long integer value in
DX:AX. Note that there may be as many as 11 print positions (e.g.,
-1,000,000,000).
Example:
mov cx, 16
mov dx, word ptr L+2
mov ax, word ptr L
PutLSize
2.19 PutULSize
* Prints the value in DX:AX as a long unsigned decimal integer.
* Prints the number in a minimum field width specified by the value in
CX.
Inputs: DX:AX- Value to print.
CX- Minimum number of print positions to use.
Outputs: None.
Include: Stdlib.a
Just like PutLSize above except this guy prints unsigned numbers rather
than signed long integers. The largest field width for such a value is 10
print positions.
Example:
mov cx, 8
mov dx, word ptr UL+2
mov ax, word ptr UL
PutULSize
2.20 Print
* Prints a string literal.
* Very convenient to use.
* Calls stdlib putc routine to print characters to the stdlib standard
output device.
Inputs: CS:RET - Return address points at the string to print.
Outputs: None.
Include: Stdlib.a
Print lets you print string literals in a convenient fashion. The string
to print immediately follows the call to the print routine. The string must
contain a zero terminating byte and may not contain any intervening zero
bytes. Since the print routine returns to the address immediately following
the zero terminating byte, forgetting this byte or attempting to print a zero
byte in the middle of a literal string will cause print to return to an
unexpected instruction. This usually hangs up the machine. Be very careful
when using this routine!
Example:
print
db "Print this string to the display device"
db 13,10
db "This appears on a new line"
db 13,10
db 0
2.21 Printf
* Formatted output routine.
* Very similar to the "C" function of the same name.
* Prints integers (normal, long, unsigned, etc.), characters, strings,
and other data types (this routine, however, does not support
floating point output).
* Calls stdlib putc routine to print characters to the stdlib standard
output device.
Inputs: CS:RET - Return address points at the format string.
Outputs: None.
Include: Stdlib.a
Printf, like its "C" namesake, provides formatted output capabilities for
the stdlib package. A call to printf takes the following form:
printf
db "format string",0
dd operand1, operand2, ..., operandn
The format string is comparable to the one provided in the "C" programming
language. For most characters, printf simply prints the characters in the
format string up to the terminating zero byte. The two exceptions are
characters prefixed by a backslash ("\") and characters prefixed by a percent
sign ("%"). Like C's printf, stdlib's printf uses the backslash as an escape
character and the percent sign as a lead-in to a format item.
Printf uses the escape character ("\") to print special characters in a
fashion similar to, but not identical to C's printf. Stdlib's printf routine
supports the following special characters:
* \r Print a carriage return (but no line feed).
* \n Print a new line (carriage return/line feed).
* \b Print a backspace character.
* \t Print a tab character.
* \l Print a line feed character (but no carriage return).
* \f Print a form feed character.
* \\ Print the backslash character.
* \% Print the percent sign character.
* \0xhh Print ASCII code hh, represented by two hex digits.
C users should note a couple of differences between stdlib's escape
sequences and C's. First, use "\%" to print a percent sign within a format
string, not "%%". C doesn't allow the use of "\%" because the C compiler
processes "\%" at compile time (leaving a single "%" in the object code)
whereas printf processes the format string at run-time. It would see a single
"%" and treat it as a format lead-in character. Stdlib's printf, on the other
hand, processes both the "\" and "%" at run-time; therefore, it can distinguish
"\%".
Strings of the form "\0xhh" must contain exactly two hex digits. The
current printf routine isn't robust enough to handle sequences of the form
"\0xh" which contain only a single hex digit. Keep this in mind if you find
printf chopping off characters after you print a value.
There is absolutely no reason to use any escape character sequences except
"\0x00". Printf grabs all characters following the call to printf up to the
terminating zero byte (which is why you'd need to use "\0x00" if you want to
print the null character, printf will not print such values). Stdlib's printf
routine doesn't care how those characters got there. In particular, you are
not limited to using a single string after the printf call. The following is
perfectly legal:
printf
db "This is a string",13,10
db "This is on a new line",13,10
db "Print a backspace at the end of this line:"
db 8,13,10,0
Your code will run a tiny amount faster if you avoid the use of the escape
character sequences. More importantly, the escape character sequences take at
least two bytes. You can encode most of them as a single byte by simply
embedding the ASCII code for that byte directly into the code stream. Don't
forget, you cannot embed a zero byte into the code stream. A zero byte
terminates the format string. Instead, use the "\0x00" escape sequence.
Format sequences always begin with "%". For each format sequence you
must provide a far pointer to the associated data immediately following the
format string, e.g.,
printf
db "%i %i",0
dd i,j
Format sequences take the general form "%s\cn^f" where:
* "%" is always the "%" character. Use "\%" if you actually want
to print a percent sign.
* s is either nothing or a minus sign ("-").
* "\c" is also optional, it may or may not appear in the format
item. "c" represents any printable character.
* "n" represents a string of 1 or more decimal digits.
* "^" is just the caret (up-arrow) character.
* "f" represents one of the format characters: i, d, x, h, u, c,
s, ld, li, lx, or lu.
The "s", "\c", "n", and "^" items are optional; the "%" and "f" items must
be present. Furthermore, the order of these items in the format item is very
important. The "\c" entry, for example, cannot precede the "s" entry.
Likewise, the "^" character, if present, must follow everything except the "f"
character(s).
The format characters i, d, x, h, u, c, s, ld, li, lx, and lu control the
output format for the data. The i and d format characters perform identical
functions: they tell printf to print the following value as a 16-bit signed
decimal integer. The x and h format characters instruct printf to print the
specified value as a 16-bit or 8-bit hexadecimal value (respectively). If you
specify u, printf prints the value as a 16-bit unsigned decimal integer. Using
c tells printf to print the value as a single character. The s format
character tells printf that you're supplying the address of a zero-terminated
character string; printf prints that string. The ld, li, lx, and lu entries
are long (32-bit) versions of d, i, x, and u. The corresponding address points
at a 32-bit value which printf will format and print to the standard output.
The following example demonstrates these format items:
printf
db "I= %i, U= %u, HexC= %h, HexI= %x, C= %c, "
db "S= %s",13,10
db "L= %ld",13,10,0
dd i,u,c,i,c,s,l
The number of far addresses (specified by operands to the "dd"
pseudo-opcode) must match the number of "%" format items in the format string.
Printf counts the number of "%" format items in the format string and skips
over this many far addresses following the format string. If the two counts
do not match, the return address for printf will be incorrect and the
program will probably hang or otherwise malfunction. Likewise (as for the
print routine), the format string must end with a zero byte. The addresses of
the items following the format string must point directly at the memory
locations where the specified data lies.
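For reference, here is one way the operands for such a call might be declared.
The variable names and values are hypothetical. What matters is that each dd
entry after the format string is the address of the data (or of the string
itself for %s) and that there is exactly one entry per format item. The
variables belong in a data segment; only the format string and the dd list
follow the call.
        IVar dw -1234                   ;For %i or %d.
        UVar dw 60000                   ;For %u.
        CVar db 'A'                     ;For %c.
        SVar db "a string",0            ;For %s.
        LVar dd -100000                 ;For %ld.
                .
                .
                printf
                db "I= %i, U= %u, C= %c, S= %s, L= %ld",13,10,0
                dd IVar,UVar,CVar,SVar,LVar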
When used in the format above, printf always prints the values using the
minimum number of print positions for each operand. If you want to specify a
minimum field width, you can do so using the "n" format option. A format item
of the format "%10d" prints a decimal integer using at least ten print
positions. Likewise, "%16s" prints a string using at least 16 print
positions. If the value to print requires more than the specified number of
print positions, printf will use however many are necessary. If the value to
print requires fewer, printf will always print the specified number, padding
the value with blanks. Printf will print the value right justified in the
print field (regardless of the data's type). If you want to print the value
left justified in the output field, use the "-" format character as a prefix to
the field width, e.g.,
printf
db "%-17s",0
dd string
In this example, printf prints the string using a 17 character long field with
the string left justified in the output field.
By default, printf blank fills the output field if the value to print
requires fewer print positions than specified by the format item. The "\c"
format item allows you to change the padding character. For example, to print
a value, right justified, using "*" as the padding character you would use the
format item "%\*10d". To print it left justified you would use the format item
"%-\*10d". Note that the "-" must precede the "\*". This is a limitation of
the current version of the software. The operands must appear in this order.
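The following sketch puts the width, justification, and padding options side
by side. IVar is a hypothetical word variable. Note that every format item
needs its own far address, even when the same variable is printed four times.
        IVar dw 1234
                .
                .
                printf
                db "[%10d] [%-10d] [%\*10d] [%-\*10d]",13,10,0
                dd IVar,IVar,IVar,IVar
If printf follows the rules described above, the output should look like this:
        [      1234] [1234      ] [******1234] [1234******]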
Normally, the address(es) following the printf format string must be far
pointers to the actual data to print. On occasion, especially when allocating
storage on the heap (using malloc), you may not know (at assembly time) the
address of the object you want to print. You may have only a pointer to the
data you want to print. The "^" format option tells printf that the far
pointer following the format string is the address of a pointer to the data
rather than the address of the data itself. This option lets you access the
data indirectly.
Examples:
printf
db "Indirect access to i: %^d",13,10,0
dd IPtr
;
printf
db "A string allocated on the heap: %-\.32^s"
db 13,10,0
dd SPtr
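The pointer variables in these examples could be set up along the following
lines. This is only a sketch: the declarations and the "You typed" message are
hypothetical, DS is assumed to address the segment containing SPtr, and gets
(described in the character input section) is used as a convenient source of a
string whose address is not known until run time.
        i dw 25
        IPtr dd i                       ;Far pointer to i, set at assembly time.
        SPtr dd ?                       ;Filled in at run time.
                .
                .
                gets                    ;Returns a heap string in ES:DI.
                mov word ptr SPtr, di   ;Save the far pointer.
                mov word ptr SPtr+2, es
                printf
                db "You typed: %^s",13,10,0
                dd SPtr                 ;Address of the pointer, not the string.
                les di, SPtr            ;Reload the pointer for free.
                free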
Note: unlike C, stdlib's printf routine does not support floating point
output. There are two reasons for this: first, stdlib does not (yet) have a
floating point library associated with it; second, adding floating point
support would increase the size of printf by a tremendous amount, even if you
don't use its floating point capabilities. Since most assembly language
programmers don't use floating point arithmetic, I've intentionally left out
floating point output. As soon as I add a floating point package to stdlib I
will include floating point output. However, I will create a new routine,
printff, which includes floating point output. This will allow those who never
use floating point I/O to keep their programs much smaller.
3 Character Input Routines
3.1 Getc
* Reads a character from the standard input device and returns the
character in the AL register.
* Redirectable under program control.
Inputs: None.
Outputs: AL- Character from input device.
AH- Undefined. However, if AL contains zero, AH should contain a
keyboard scan code.
Include: Stdlib.a
This routine reads a character from the standard input device. This call
is synchronous, that is, it does not return until a character is available.
Default input device is DOS standard input.
Example:
getc
mov KbdChar, al
putc
3.2 GetcStdIn
* Reads a character from the DOS standard input device and returns the
character in the AL register.
* Redirectable from DOS command line.
Inputs: None.
Outputs: AL- Character from input device.
AH- Scan code if AL=0.
Include: Stdlib.a
This routine reads a character from the DOS standard input device. This
call is synchronous, that is, it does not return until a character is
available.
Example:
GetcStdIn
mov InputChr, al
putc
3.3 GetcBIOS
* Reads a character from the keyboard and returns the character in the
AL register and the scan code in the AH register.
Inputs: None.
Outputs: AL- Character from the keyboard.
AH- Scan code from the keyboard.
Include: Stdlib.a
This routine reads a character from the keyboard. This call is
synchronous, that is, it does not return until a character is available.
Example:
GetcBIOS
mov CharRead, al
mov ScanCode, ah
putc
3.4 SetInAdrs
* Lets you set the address of the current input routine.
Inputs: es:di- Address of new input routine.
Outputs: None.
Include: stdlib.a
This routine redirects the stdlib standard input so that it calls the
routine whose address you pass in es:di. Your input routine should obtain a
character (from anywhere) and return the character in AL. If it makes sense to
do so, it should also return a "scan code" in the AH register. It must
preserve all other registers.
Example:
mov ax, seg NewInputRoutine ;A segment register cannot be loaded
mov es, ax ;with an immediate value.
mov di, offset NewInputRoutine
SetInAdrs
.
.
.
les di, RoutinePtr
SetInAdrs
3.5 GetInAdrs
* Retrieves address of the current input routine.
Inputs: None.
Outputs: es:di - address of current input routine (called by Getc).
Include: stdlib.a
You can use this function to get the address of the current input routine,
perhaps so you can save it or see if it is currently pointing at some
particular piece of code. If you want to temporarily redirect the input and
then restore the original input routine, consider using PushInAdrs/PopInAdrs
described later.
Example:
GetInAdrs
mov word ptr SaveInAdrs, di
mov word ptr SaveInAdrs+2, es
3.6 PushInAdrs
* Lets you redirect the standard input device and preserve the
previous address.
* Saves up to 16 old input routine addresses on an internal stack.
* Restoration is possible using PopInAdrs.
Inputs: es:di- Address of new input routine.
Outputs: Carry=0 if operation successful.
Carry=1 if there were already 16 items on the stack.
Include: stdlib.a
This routine "pushes" the current input address onto an internal stack and
then stores the value in es:di into the current input routine pointer. The
PushInAdrs and PopInAdrs routines let you easily save and redirect the standard
input and then restore the original input routine address later on.
If you attempt to push more than 16 items on the stack, PushInAdrs will
ignore your request and return with the carry flag set. If PushInAdrs is
successful, it will return with the carry flag clear.
Example:
mov ax, seg NewInputRoutine ;A segment register cannot be loaded
mov es, ax ;with an immediate value.
mov di, offset NewInputRoutine
PushInAdrs
.
.
.
les di, RoutinePtr
PushInAdrs
3.7 PopInAdrs
* Restores input routine addresses saved by PushInAdrs.
* Defaults to GetcStdIn if you attempt to pop too many items off the
stack.
Inputs: None.
Outputs: es:di- Points at the previous stdin routine before the pop.
Include: stdlib.a
PopInAdrs undoes the effects of PushInAdrs. It pops an item off the
internal stack and stores it into the input routine pointer. The previous
value in the input pointer is returned in es:di.
Example:
mov ax, seg NewInputRoutine ;A segment register cannot be loaded
mov es, ax ;with an immediate value.
mov di, offset NewInputRoutine
PushInAdrs
.
.
.
PopInAdrs
3.8 Gets
* Reads a line of text from the stdlib standard input device.
* Automatically allocates storage for the input string on the heap.
* Handles input lines up to 256 characters long.
Inputs: None.
Outputs: es:di - address of the input line of text.
Include: stdlib.a
Gets reads a line of text from the stdlib standard input. It returns a
pointer to a string containing each character read in the ES:DI registers.
Gets calls malloc to allocate 256 bytes on the heap (plus any overhead bytes
required by the memory manager system). If the user enters fewer than 256
bytes, gets calls realloc to free any unnecessary bytes. Gets returns all
characters typed by the user except for the carriage return (ENTER) key code.
Gets always returns a zero-terminated string. The action of various keys to
gets depends upon where input has been directed. Generally, you can count on
gets properly handling the backspace (erase previous character), escape (erase
entire line), and ENTER (accept line) keys. Other keys may be active as well.
For example, by default gets calls getc which calls DOS' standard input
routine. If you type a control-C or break key while reading from DOS' standard
input it will abort the program. If this bothers you, you can always redirect
stdlib's getc routine so it calls BIOS directly rather than reading data
through DOS' keyboard input routine.
Example:
gets ;Read a string from the keyboard
puts ;Print it
putcr ;Print a new line
free ;Deallocate storage for string.
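The redirection suggested above might be sketched as follows. BIOSIn and
ReadDone are hypothetical names, the wrapper assumes that stdlib reaches an
input routine with a far call and that GetcBIOS preserves every register
except AX, and the code has not been tested against the library.
        BIOSIn proc far
                GetcBIOS                ;AL= character, AH= scan code.
                ret                     ;Far return because of "proc far".
        BIOSIn endp
                .
                .
                mov ax, seg BIOSIn
                mov es, ax
                mov di, offset BIOSIn
                PushInAdrs              ;Getc (and so gets) now reads via BIOS.
                jc ReadDone             ;Internal stack full, give up.
                gets
                puts
                putcr
                free
                PopInAdrs               ;Changes ES:DI, so do this last.
        ReadDone:
PopInAdrs returns the old routine address in ES:DI, which is why the string is
printed and freed before the input routine is restored.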
3.9 Scanf
* Formatted input from stdlib standard input.
* Similar to C's scanf routine.
* Converts ASCII to integer, unsigned, character, string, hex, and
long values of the above.
Inputs: None.
Outputs: None.
Include: stdlib.a
Scanf provides formatted input in a fashion analogous to printf's output
facilities. Actually, it turns out that scanf is considerably less useful than
printf because it doesn't provide reasonable error checking facilities (neither
does C's version of this routine). But for quick and dirty programs whose
input can be controlled in a rigid fashion (or if you're willing to live by
"garbage in, garbage out") scanf provides a convenient way to get input from
the user.
Like printf, the scanf routine expects you to follow the call with a
format string and then a list of (far pointer) memory addresses. The items in
the scanf format string take the following form:
%^f
where f represents d, i, x, h, u, c, s, ld, li, lx, or lu. Like printf, the
"^" symbol tells scanf that the address following the format string is the
address of a (far) pointer to the data rather than the address of the data
location itself.
By default, scanf automatically skips any leading whitespace before
attempting to read a numeric value. You can instruct scanf to skip other
characters by placing that character in the format string. For example, the
following call instructs scanf to read three integers separated by commas
(and/or whitespace):
scanf
db "%i,%i,%i",0
dd i1,i2,i3
Whenever scanf encounters a non-blank character in the format string, it
will skip that character (including multiple occurrences of that character) if
it appears next in the input stream.
Scanf always calls gets to read a new line of text from stdlib's standard
input. If scanf exhausts the format list, it ignores any remaining characters
on the line. If scanf exhausts the input line before processing all of the
format items, it leaves the remaining variables unchanged. Scanf always
deallocates the storage allocated by gets.
Example:
scanf
db "%i %h %^s",0
dd i, x, sptr
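Because scanf leaves the remaining variables unchanged when the input line
runs out early, it can be worth giving them known values first. The following
sketch does that for the three-integer call shown earlier. The prompt text and
the declarations are hypothetical, and DS is assumed to address the segment
containing i1, i2, and i3.
        i1 dw ?
        i2 dw ?
        i3 dw ?
                .
                .
                mov word ptr i1, 0      ;Defaults, in case the user
                mov word ptr i2, 0      ;enters fewer than three values.
                mov word ptr i3, 0
                print
                db "Enter three integers: ",0
                scanf
                db "%i,%i,%i",0
                dd i1,i2,i3
                printf
                db "%i %i %i",13,10,0
                dd i1,i2,i3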
4 Conversion Routines
4.1 ATOL/ATOL2
* Converts an ASCII string of digits to long integer format.
Inputs: ES:DI- Points at string to convert.
Outputs: DX:AX- Long integer converted from string.
Carry flag- Error status
DI (ATOL2)- First character beyond string of digits.
Include: stdlib.a
ATOL converts the string of digits that ES:DI points at to a long integer
(signed) value and returns this value in DX:AX. ATOL2 works in a similar
fashion except it doesn't preserve the DI register. That is, it leaves DI
pointing at the first character beyond the string of digits. This routine
returns the carry flag clear if it translated the string of digits without
error. It returns the carry flag set if overflow occurred. Note that this
routine stops on the first non-digit. If the string does not begin with a
digit, this routine returns zero. The only exception to the "string of digits"
rule is that the number can have a preceding minus sign to denote a negative
number. In particular, note that this routine does not allow leading spaces.
Example:
gets ;Get a string from user
atol ;Convert to a value in DX:AX
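A slightly fuller sketch checks the carry flag before using the result and
releases the input line afterwards. BadNumber is a hypothetical label, and the
code assumes that putl and putcr preserve ES:DI (ATOL itself preserves DI, as
described above).
                gets                    ;ES:DI points at the input line.
                atol                    ;DX:AX= value, carry set on overflow.
                jc BadNumber            ;Skip the output on overflow.
                putl                    ;Print the converted value.
                putcr
        BadNumber:
                free                    ;Release the line in either case.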
4.2 ATOUL/ATOUL2
Just like ATOL above, except this guy handles unsigned long integers.
4.3 ATOI/ATOI2
* Converts an ASCII string of digits to integer format.
Inputs: ES:DI- Points at string to convert.
Outputs: AX- Integer converted from string.
Carry flag- Error status
DI (ATOI2)- First character beyond string of digits.
Include: stdlib.a
Works just like ATOL except it translates the string to a signed 16-bit
integer rather than a 32-bit long integer.
4.4 ATOU/ATOU2
* Converts an ASCII string of digits to unsigned integer format.
Inputs: ES:DI- Points at string to convert.
Outputs: AX- Unsigned 16-bit integer converted from string.
Carry flag- Error status
DI (ATOU2)- First character beyond string of digits.
Include: stdlib.a
Like ATOI except it handles unsigned 16-bit integers in the range 0..65535.
4.5 ATOH/ATOH2
* Converts an ASCII string of hex digits to a value in AX.
Inputs: ES:DI- Points at string to convert.
Outputs: AX- Unsigned 16-bit integer converted from hex string.
Carry flag- Error status
DI (ATOH2)- First character beyond string of hex digits.
Include: stdlib.a
This routine converts a string of hexadecimal digits into numeric form and
returns that value in the AX register.
Example:
les di, Str2Convrt
atoh ;Convert to value in AX.
putw ;Print word in AX.
4.6 ATOLH/ATOLH2
Like ATOH above, except it handles 32-bit values and returns the result in
DX:AX.
4.7 ITOA
* Converts a 16-bit signed integer value in AX to a string of
characters.
* Automatically allocates storage for string on the heap.
Inputs: AX- Signed 16-bit value to convert to a string.
Outputs: ES:DI- Pointer to string containing converted characters.
Include: stdlib.a
ITOA converts the signed integer value in AX to a string of characters
which represent that value. It allocates storage for this string on the heap
via a call to the malloc routine and returns a pointer to that string in
ES:DI. The string contains the minimum number of characters required to hold
the character representation of the value and is always between one and six
characters long.
Example:
mov ax, -1234
itoa ;Convert to string.
puts ;Print it.
free ;Deallocate string.
4.8 UTOA
* Converts a 16-bit unsigned integer value in AX to a string of
characters.
* Automatically allocates storage for string on the heap.
Inputs: AX- Unsigned 16-bit value to convert to a string.
Outputs: ES:DI- Pointer to string containing converted characters.
Include: stdlib.a
Like ITOA above, except it converts the unsigned value in AX to a string
of characters. The string returned by UTOA is always one to five characters
long.
Example:
mov ax, 65000
utoa
puts
free
4.9 HTOA
* Converts an 8-bit value in AL to the two-character hexadecimal
representation of that byte.
* Automatically allocates storage for string on the heap.
Inputs: AL- 8-bit value to convert to a string.
Outputs: ES:DI- Pointer to string containing converted characters.
Include: stdlib.a
Converts a byte to a string containing the hexadecimal representation of
that byte. Otherwise, it's just like ITOA above. This routine always outputs
exactly two hexadecimal digits, including a leading zero (if necessary).
4.10 WTOA
* Converts a 16-bit value in AX to hexadecimal representation.
* Automatically allocates storage for string on the heap.
Inputs: AX- 16-bit value to convert to a string.
Outputs: ES:DI- Pointer to string containing converted characters.
Include: stdlib.a
Like HTOA above, except it converts the 16-bit value in AX to a string of
four hexadecimal digits. Outputs exactly four digits including leading zeros
if necessary.
4.11 LTOA
* Converts a 32-bit signed integer value in DX:AX to a string of
characters.
* Automatically allocates storage for string on the heap.
Inputs: DX:AX- Signed 32-bit value to convert to a string.
Outputs: ES:DI- Pointer to string containing converted characters.
Include: stdlib.a
Like ITOA except it converts a long integer value in DX:AX to a string of
one to eleven characters.
4.12 ULTOA
* Converts a 32-bit unsigned integer value in DX:AX to a string of
characters.
* Automatically allocates storage for string on the heap.
Inputs: DX:AX- Unsigned 32-bit value to convert to a string.
Outputs: ES:DI- Pointer to string containing converted characters.
Include: stdlib.a
Like LTOA except this guy handles unsigned integer values.
4.13 SPrintf
* In-memory formatting routine.
* Just like C's sprintf routine.
* Automatically allocates storage for the string on the heap.
* Programmer selectable maximum length for the output string.
Inputs: CS:RET- Pointer to format string and operands of the sprintf
routine.
Outputs: ES:DI- Pointer to string containing output data.
Include: stdlib.a
Works in a manner quite similar to printf except sprintf writes its output
to a string variable rather than to the stdlib standard output. Sprintf
returns a pointer to the string (which it allocates on the heap) in the ES:DI
registers. SPrintf, by default, allocates 2048 characters for this string and
then deallocates any unnecessary storage. An external variable, sp_MaxBuf,
holds the number of bytes to allocate upon entry into sprintf. If you wish to
allocate more or less than 2048 bytes when calling sprintf, simply change the
value of this public variable (type is word). Sprintf calls malloc to allocate
the storage dynamically. You should call free to return this buffer to the
heap when you are through with it.
Example:
sprintf
db "I=%i, U=%u, S=%s",13,10,0
dd i,u,s
puts
free
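The allocate-then-trim strategy described above can be sketched in C. This is a model under stated assumptions, not the stdlib implementation: sp_max_buf stands in for sp_MaxBuf, sprintf_model is a hypothetical name, operands are passed by value in the C style (stdlib's sprintf reads a format string and operands from the code stream), and truncating output that does not fit is an assumption, since the text does not say what happens on overflow.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Stands in for the public word variable sp_MaxBuf (default 2048).
   Must be greater than zero in this sketch. */
static size_t sp_max_buf = 2048;

/* Hypothetical model: format into a buffer of sp_max_buf bytes, then
   give back the unused tail. Returns NULL if allocation fails. */
static char *sprintf_model(const char *fmt, ...)
{
    char *buf = malloc(sp_max_buf);
    if (buf == NULL)
        return NULL;

    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, sp_max_buf, fmt, ap);
    va_end(ap);
    if (n < 0) {                      /* formatting error */
        free(buf);
        return NULL;
    }

    /* Bytes actually used, including the zero terminator. Output that
       did not fit was truncated by vsnprintf (assumed behaviour). */
    size_t used = ((size_t)n < sp_max_buf ? (size_t)n : sp_max_buf - 1) + 1;

    char *trimmed = realloc(buf, used);   /* release unnecessary storage */
    return trimmed != NULL ? trimmed : buf;
}
```

As with the assembly routine, the caller owns the returned string and should free it when finished.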
4.14 SBPrintf
* In-memory formatting routine.
* Programmer-supplied output buffer for string
Inputs: CS:RET- Pointer to format string and operands of the sbprintf
routine.
ES:DI- Pointer to buffer area to store string data.
Outputs: None.
Include: stdlib.a
Works just like sprintf except it does not automatically allocate storage
for the output string. Instead, you must supply the address of an output
buffer in the ES:DI registers.
Example:
les di, BufferAdrs
sbprintf
db "I=%i, U=%u, S=%s",13,10,0
dd i,u,s
puts
4.15 SScanf
* Formatted in-memory conversions.
* Similar to C's sscanf routine.
* Converts ASCII to integer, unsigned, character, string, hex, and
long values of the above.
Inputs: ES:DI- Points at string containing values to convert.
CS:RET- Points at format string and variable parameter list.
Outputs: None.
Include: stdlib.a
Sscanf provides formatted input in a fashion analogous to scanf. The
difference is that scanf reads a line of text from the stdlib standard input
whereas you pass the address of a sequence of characters to sscanf in es:di.
Example:
;
; This code reads the values for i, j, and s from the characters
; starting at memory location Buffer.
;
les di, Buffer
sscanf
db "%i %i %s",0
dd i,j,s
4.16 ToLower
* Converts uppercase characters in AL to lower case.
* Macro implementation for high performance.
* Leaves characters other than uppercase unchanged.
Inputs: AL- Character to (possibly) convert to lower case.
Outputs: AL- Converted character.
Include: stdlib.a
ToLower checks the character in the AL register. If it is upper case it
converts it to lower case. If it is anything else, ToLower leaves the value in
AL unchanged. Note: this routine is implemented as a macro rather than as a
procedure call. This routine is so short you would spend more time actually
calling the routine than executing the code inside. However, the code is
definitely longer than a (far) procedure call, so if space is critical and
you're invoking this code several times, you may want to convert it to a
procedure call to save a little space.
Example:
mov al, char
ToLower
4.17 ToUpper
* Converts lowercase characters in AL to upper case.
* Macro implementation for high performance.
* Leaves characters other than lowercase unchanged.
Inputs: AL- Character to (possibly) convert to upper case.
Outputs: AL- Converted character.
Include: stdlib.a
This is just like the ToLower routine except it converts lower case to
uppercase rather than vice versa.
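The two conversions amount to a range check followed by a fixed offset. The C sketch below models that logic for ASCII characters; to_lower_model and to_upper_model are hypothetical names, and the sketch says nothing about how the actual macros are coded.

```c
/* Model of ToLower: only 'A'..'Z' change; everything else passes through.
   Assumes the ASCII character set. */
static unsigned char to_lower_model(unsigned char al)
{
    if (al >= 'A' && al <= 'Z')
        return (unsigned char)(al + ('a' - 'A'));
    return al;
}

/* Model of ToUpper: only 'a'..'z' change; everything else passes through. */
static unsigned char to_upper_model(unsigned char al)
{
    if (al >= 'a' && al <= 'z')
        return (unsigned char)(al - ('a' - 'A'));
    return al;
}
```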
5 Utility Routines
5.1 ISize
* Computes the number of print positions required by a 16-bit signed
integer value.
Inputs: AX- 16-bit value to compute the output size for.
Outputs: AX- Number of print positions required by this number (including the
minus sign, if necessary).
Include: stdlib.a
ISize computes the minimum number of character positions it will take to
print the signed decimal value in the AX register. If the number is negative,
it will include space for the minus sign in the count.
Example:
mov ax, I
ISize
puti ;Prints positions req'd by I.
5.2 USize
Just like ISize above, except this guy returns the number of print
positions required by a 16-bit unsigned value.
5.3 LSize
* Computes the number of print positions required by a 32-bit signed
integer value.
Inputs: DX:AX- 32-bit value to compute the output size for.
Outputs: AX- Number of print positions required by this number (including the
minus sign, if necessary).
Include: stdlib.a
LSize computes the minimum number of character positions it will take to
print the signed decimal value in the DX:AX registers. If the number is
negative, it will include space for the minus sign in the count.
Example:
mov ax, word ptr L
mov dx, word ptr L+2
LSize
puti ;Prints positions req'd by L.
5.4 ULSize
As with LSize, except ULSize treats the value in DX:AX as an unsigned long
integer.
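The four size routines all answer the same question: how many characters does the decimal form of the number need? A C sketch of that computation follows. The function names are hypothetical and the code is a model of the documented result, not the library's algorithm.

```c
#include <stdint.h>

/* Model of ULSize: count the decimal digits of an unsigned 32-bit value. */
static int ulsize_model(uint32_t value)
{
    int positions = 1;              /* even zero needs one digit */
    while (value >= 10) {
        value /= 10;
        positions++;
    }
    return positions;
}

/* Model of LSize: a negative value needs one extra position for '-'. */
static int lsize_model(int32_t value)
{
    if (value < 0) {
        /* Negate in unsigned arithmetic so the most negative value
           does not overflow. */
        return 1 + ulsize_model(0u - (uint32_t)value);
    }
    return ulsize_model((uint32_t)value);
}

/* Models of ISize and USize: the 16-bit cases are the same computation. */
static int isize_model(int16_t value)  { return lsize_model(value); }
static int usize_model(uint16_t value) { return ulsize_model(value); }
```

The largest result for a signed 32-bit value is eleven (ten digits plus the minus sign), which matches the "one to eleven characters" quoted for LTOA.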
5.5 IsAlNum
* Checks character in AL to see if it is alphanumeric.
Inputs: AL- Character to check.
Outputs: Zero flag- Set if character is alphanumeric, clear if not.
Include: stdlib.a
This procedure checks the character in the AL register to see if it is in
the range A-Z, a-z, or 0-9. Upon return, you can use the JE instruction to
check to see if the character was in this range (or, conversely, you can use
jne to see if it is not in the range).
Example:
mov al, char
IsAlNum
je IsAlNumChar
5.6 IsXDigit
* Checks character in AL to see if it is a hexadecimal digit.
Inputs: AL- Character to check.
Outputs: Zero flag- Set if character is a hex digit, clear if not.
Include: stdlib.a
This procedure checks the character in the AL register to see if it is in
the range A-F, a-f, or 0-9. Upon return, you can use the JE instruction to
check to see if the character was in this range (or, conversely, you can use
jne to see if it is not in the range).
Example:
mov al, char
IsXDigit
je IsXDigitChar
5.7 IsDigit
* Checks character in AL to see if it is numeric.
* Macro implementation for high performance.
Inputs: AL- Character to check.
Outputs: Zero flag- Set if character is numeric, clear if not.
Include: stdlib.a
This procedure checks the character in the AL register to see if it is in
the range 0-9. Upon return, you can use the JE instruction to check to see if
the character was in this range (or, conversely, you can use jne to see if it
is not in the range).
Example:
mov al, char
IsDigit
je IsDecChar
5.8 IsAlpha
* Checks character in AL to see if it is alphabetic.
* Macro implementation for high performance.
Inputs: AL- Character to check.
Outputs: Zero flag- Set if character is alphabetic, clear if not.
Include: stdlib.a
This procedure checks the character in the AL register to see if it is in
the range A-Z, or a-z. Upon return, you can use the JE instruction to check to
see if the character was in this range (or, conversely, you can use jne to see
if it is not in the range).
Example:
mov al, char
IsAlpha
je IsAlChar
5.9 IsLower
* Checks character in AL to see if it is a lower case alphabetic
character.
* Macro implementation for high performance.
Inputs: AL- Character to check.
Outputs: Zero flag- Set if character is lower case alphabetic, clear if not.
Include: stdlib.a
This procedure checks the character in the AL register to see if it is in
the range a-z. Upon return, you can use the JE instruction to check to see if
the character was in this range (or, conversely, you can use jne to see if it
is not in the range).
Example:
mov al, char
IsLower
je IsLowerChar
5.10 IsUpper
* Checks character in AL to see if it is uppercase alphabetic.
* Macro implementation for high performance.
Inputs: AL- Character to check.
Outputs: Zero flag- Set if character is uppercase alpha, clear if not.
Include: stdlib.a
This procedure checks the character in the AL register to see if it is in
the range A-Z. Upon return, you can use the JE instruction to check to see if
the character was in this range (or, conversely, you can use jne to see if it
is not in the range).
Example:
mov al, char
IsUpper
je IsUpperChar
6 Memory Management
The stdlib memory management routines let you dynamically allocate storage
on the heap. These routines are somewhat similar to those provided by the "C"
programming language. These routines do not perform garbage collection. Doing
so would introduce too many restrictions. Of course, feel free to add your own
garbage collection if you like...
The allocation/deallocation routines should be fairly fast. Malloc and
free use a modified first/next fit algorithm which lets the system quickly find
a memory block of the desired size without undue fragmentation problems
(average case). The overhead (eight bytes) per allocated block may seem rather
high, but that is part of the price to pay for faster malloc and free routines.
The memory manager data structure has an overhead of eight bytes (meaning
each malloc operation requires at least eight more bytes than you ask for) and
a granularity of 16 bytes. All pointers are far pointers and I allocate each
new item on a paragraph boundary. The current memory manager routines always
allocate (n+8) bytes, rounding up to the next multiple of 16 if the result is
not evenly divisible by sixteen. The first eight bytes of the structure are
used by the memory management routines; the remaining bytes are available for
use by the caller (malloc, et al., return a pointer to the first byte beyond
the memory management overhead structure). Of course, you should never count
on any of this stuff. I could rewrite the memory manager tomorrow and if you
use the interface which follows your code will still work properly. If you
make assumptions about the structure of the memory management record, your code
may go up in flames on the next revision.
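The size arithmetic described above (add the eight-byte overhead, then round up to a whole paragraph) can be written out in C. This is a sketch of the stated formula only; block_size_model is a hypothetical name and, as the text warns, the numbers apply to the current memory manager and may change.

```c
#include <stdint.h>

enum {
    OVERHEAD    = 8,    /* bytes used by the memory manager per block */
    GRANULARITY = 16    /* one paragraph */
};

/* Total bytes the current memory manager consumes for a request of
   n bytes: (n + 8) rounded up to the next multiple of 16. */
static uint32_t block_size_model(uint16_t n)
{
    uint32_t total = (uint32_t)n + OVERHEAD;
    return (total + GRANULARITY - 1) / GRANULARITY * GRANULARITY;
}
```

For example, a request for 8 bytes fits in a single paragraph, while a request for 9 bytes needs two; a 256-byte request consumes 272 bytes.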
6.1 MemInit
* Initializes memory manager system.
Inputs: DX- Number of paragraphs to reserve.
zzzzzzseg- Segment name of last segment in your program.
PSP- Public word variable which holds the PSP value for your
program.
Outputs: CX- Number of paragraphs actually reserved by MemInit
Carry=0 if no error. If carry=1, AX contains DOS error code.
Include: stdlib.a
This routine initializes the memory manager system. You must call it
before using any routines which call any of the memory manager procedures
(since a good number of the stdlib routines call the memory manager, you should
get in the habit of always calling this routine). The system will die a
horrible death if you call a memory manager routine (like malloc) without first
calling MemInit.
This routine expects you to define (and set up) two global names:
zzzzzzseg and PSP. "zzzzzzseg" is a dummy segment which must be the name of
the very last segment defined in your program. MemInit uses the name of this
segment to determine the address of the last byte in your program. If you do
not declare this segment last, the memory manager will happily wipe out
anything which follows zzzzzzseg. The "shell.asm" file provides you with a
template for your programs which properly defines this segment.
PSP should be a word variable which contains the program segment prefix
value for your program. MS-DOS passes the PSP value to your program in the DS
and ES registers. You should save this value in the PSP variable. Don't
forget to make PSP a public symbol in your main program's source file. The
"shell.asm" file demonstrates how to properly set up this value.
The DX register contains the number of 16-byte paragraphs you want to
reserve for the heap. If DX contains zero, MemInit will allocate all of the
available memory to the heap. If your program is going to allow the user to
run a copy of the command interpreter, or if your program is going to EXEC some
other program, you should not allocate all storage to the heap. Instead, you
should reserve some memory for those programs. By setting DX to some value
other than zero, you can tell MemInit how much memory you want to reserve for
the heap. All left over memory will be available for other system (or program)
use.
If the value in DX is larger than the amount of available RAM, MemInit
will split the available memory in half and reserve half for the heap leaving
the other half unallocated. If you want to force this situation (to leave half
of available memory for other purposes), simply load DX with 0FFFFh before
calling MemInit. There will never be this much memory available, so this will
force MemInit to split the available RAM between the heap and unallocated
storage.
On return from MemInit, the CX register contains the number of paragraphs
actually allocated. You can use this value to see if MemInit has actually
allocated the number of paragraphs you requested. You can also use this value
to determine how much space is available when you elect to split the free space
between the heap and the unallocated portions.
If all goes well, this routine returns the carry flag clear. If a DOS
memory manager error occurs, this routine returns the carry flag set and the
DOS error code in the AX register.
Example:
;
; Don't forget to set up PSP and zzzzzzseg before calling MemInit.
;
xor dx, dx ;DX=0: allocate all available RAM
MemInit
jc MemoryError
;
; cx contains the number of paragraphs actually allocated.
;
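The three cases for the DX parameter can be summarized as a small decision function. The C sketch below models only the rules stated above; meminit_reserve_model is a hypothetical name, `available` stands for the number of free paragraphs DOS can supply, and rounding down when halving an odd count is an assumption.

```c
#include <stdint.h>

/* Model of how many paragraphs MemInit reserves for the heap (the value
   returned in CX), given the request in DX and the paragraphs available. */
static uint16_t meminit_reserve_model(uint16_t dx, uint16_t available)
{
    if (dx == 0)
        return available;        /* zero means "take everything" */
    if (dx > available)
        return available / 2;    /* too big: split free memory in half */
    return dx;                   /* otherwise honour the request */
}
```

Loading DX with 0FFFFh therefore always lands in the second case, which is the trick the text suggests for leaving half of memory free.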
6.2 Malloc
* Allocates storage from the heap.
* Allocates blocks up to 64K long.
* Very fast combination first/next fit allocation strategy
Inputs: CX- Number of bytes to reserve.
Outputs: CX- Number of bytes actually reserved by malloc.
ES:DI- Pointer to first byte of memory allocated by malloc.
Carry=0 if no error. Carry=1 if insufficient memory
Include: stdlib.a
Malloc is the workhorse routine you use to allocate a block of memory.
You give it the number of bytes you need and if it finds a block large enough,
it will allocate the requested amount and return a pointer to that block.
Most memory managers require a small amount of overhead for each block
they allocate. Stdlib's (current) memory manager requires an overhead of eight
bytes. Furthermore, the granularity is 16 bytes. This means that malloc
always allocates blocks of memory in paragraph multiples. As a result, malloc
may actually reserve more storage than you specify, and the value
returned in CX may be somewhat greater than the requested value. By setting
the minimum allocation size to a paragraph, I was able to reduce the overhead
and improve the speed of malloc by a considerable amount.
Stdlib's memory management system does not do any garbage collection.
Doing so would place too many demands on malloc's users. Therefore, it is
quite possible for you to fragment memory with multiple calls to malloc,
realloc, and free. You could wind up in a situation where there is enough free
memory to satisfy your request, but there isn't a single contiguous block large
enough for the request. Malloc treats this as an insufficient memory error and
returns with the carry flag set.
If malloc cannot allocate a block of the requested size, it returns with
the carry flag set. In this situation, the contents of ES:DI are undefined.
Attempting to dereference this pointer will produce erratic and, perhaps,
disastrous results.
Example:
mov cx, 256
malloc
jnc GoodMalloc
print
db "Insufficient memory to continue.",cr,lf,0
jmp Quit
GoodMalloc: mov byte ptr es:[di], 0 ;Init to the empty string.
6.3 Realloc
* Reallocates a block of memory on the heap.
* Allocates blocks up to 64K long.
* Allows you to make the new block smaller or larger than the original
block.
* Automatically copies the data from the original block to the new
block if the new block is larger than the old block.
Inputs: CX- Number of bytes to reserve.
ES:DI- Pointer to block to reallocate.
Outputs: CX- Number of bytes actually reserved by realloc.
ES:DI- Pointer to first byte of memory allocated by realloc.
Carry=0 if no error. Carry=1 if insufficient memory
Include: stdlib.a
Realloc lets you change the size of an allocated block in the heap. It
allows you to make the block larger or smaller. If you make the block smaller,
realloc simply frees (returns to the heap) any leftover bytes at the end of the
block. If you make the block larger, realloc goes out and allocates a block of
the requested size, copies the bytes from the old block to the beginning of the
new block (leaving the bytes at the end of the new block uninitialized), and
then frees the old block.
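The grow and shrink paths described above can be modelled in C on top of the C library's own malloc and free. This is a sketch, not the stdlib code: realloc_model is a hypothetical name, the caller passes the old size explicitly (the real routine finds it in the block header), the shrink path does not actually return the tail to the heap, and leaving the old block intact when the new allocation fails is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Model of Realloc. Returns the (possibly moved) block, or NULL if a
   larger block could not be allocated (the carry=1 case). */
static void *realloc_model(void *old, size_t old_size, size_t new_size)
{
    if (new_size <= old_size) {
        /* Shrinking: the block stays where it is. The real routine also
           returns the leftover bytes to the heap; not modelled here. */
        return old;
    }

    /* Growing: allocate a new block, copy, then free the old one. */
    void *fresh = malloc(new_size);
    if (fresh == NULL)
        return NULL;             /* old block left untouched (assumption) */

    memcpy(fresh, old, old_size);   /* bytes past old_size stay uninitialized */
    free(old);
    return fresh;
}
```

The practical consequence of the grow path is that any other copy of the old pointer is stale once the call succeeds.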
6.4 Free
* Deallocates a block of memory on the heap.
* Automatically coalesces all contiguous, unused, blocks on the heap.
* Very fast algorithm.
* Handles the situation where several active pointers may still point
at the specified block.
Inputs: ES:DI- Pointer to block to deallocate.
Outputs: Carry=0 if no error. Carry=1 if es:di doesn't point at an
allocated block.
Include: stdlib.a
Free (possibly) deallocates storage allocated on the heap by malloc or
realloc. Free returns this storage to the heap so other code can reuse it
later. Note, however, that free doesn't always return storage to the heap.
The memory manager data structure keeps track of the number of pointers
currently pointing at a block on the heap (see DupPtr, below). If you've set
up several pointers such that they point at the same block, free will not
deallocate the storage until you've freed all of the pointers which point at
that block.
Free usually returns an error code (carry flag = 1) if you attempt to free
a block which is not currently allocated or if you pass it a memory address
which was not returned by malloc (or realloc). By no means is this routine
totally robust. If you start calling free with arbitrary pointers in es:di
(which happen to be pointing into the heap) it is possible, under certain
circumstances, to confuse free and it will attempt to free a block it really
shouldn't. I could fix this problem by adding a lot of (slow) code to the free
routine. However, this library is for assembly language programmers. People
who are supposed to know what they are doing. Therefore, I opted to sacrifice
a little safety for a lot of speed.
Example:
les di, HeapPtr
free
6.5 DupPtr
* Informs the memory manager that you have more than one active
pointer pointing at a block of memory.
* Prevents free from deallocating storage to a block while there are
still some active pointers to that block.
Inputs: ES:DI- Pointer to block.
Outputs: Carry=0 if no error. Carry=1 if es:di doesn't point at an
allocated block.
Include: stdlib.a
DupPtr increments the pointer count for the block at the specified
address. Malloc sets this counter to one. Free decrements it by one. If free
decrements the value and it becomes zero, free will release the storage to the
heap for other use. By using DupPtr you can tell the memory manager that you
have several pointers pointing at the same block and that it shouldn't
deallocate the storage until you free all of those pointers.
Example:
les di, Ptr
DupPtr
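The interaction between Malloc, DupPtr, and Free is ordinary reference counting. The C sketch below models it with a small header placed in front of each block. All names are hypothetical, and the header layout is invented for illustration; it is not the memory manager's actual record.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-block header holding the pointer count. */
typedef struct {
    size_t refs;
} header_t;

/* Model of Malloc: a new block starts with a pointer count of one. */
static void *malloc_model(size_t n)
{
    header_t *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->refs = 1;
    return h + 1;                /* caller sees the bytes after the header */
}

/* Model of DupPtr: one more active pointer to this block. */
static void dupptr_model(void *p)
{
    ((header_t *)p - 1)->refs++;
}

/* Model of Free: decrement the count and release the storage only when
   it reaches zero. Returns 1 if the storage was released, 0 otherwise. */
static int free_model(void *p)
{
    header_t *h = (header_t *)p - 1;
    if (--h->refs > 0)
        return 0;
    free(h);
    return 1;
}
```

After one malloc_model and one dupptr_model, the first free_model call only drops the count; the second one releases the block.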
6.6 IsInHeap
* Tells you if a pointer contains the address of a byte in the heap.
Inputs: ES:DI- Pointer to block.
Outputs: Carry=0 if es:di points into the heap. Carry=1 if not.
Include: stdlib.a
This routine lets you know if es:di contains the address of a byte in the
heap somewhere. It does not tell you if es:di contains a valid pointer
returned by malloc (see IsPtr, below). For example, if es:di contains the
address of some particular element of an array (not necessarily the first
element) allocated on the heap, IsInHeap will return with the carry clear
denoting that es:di points somewhere in the heap. Keep in mind that
calling this routine does not validate the pointer. It could be pointing at a
byte which is part of the memory manager data structure rather than at actual
data (since the memory manager maintains that information within the bounds of
the heap). This routine is mainly useful for seeing if something is allocated
on the heap as opposed to somewhere else (like your code, data, or stack
segment).
6.7 IsPtr
* Tells you if a pointer contains the address of the start of a block
in the heap.
Inputs: ES:DI- Pointer to block.
Outputs: Carry=0 if es:di is a valid pointer. Carry=1 if not.
Include: stdlib.a
IsPtr is much more specific than IsInHeap. This guy returns the carry
flag clear if and only if es:di contains the address of a properly allocated
(and currently allocated) block on the heap. This pointer must be a value
returned by malloc, realloc, or DupPtr and that block must be currently
allocated for IsPtr to return the carry flag clear.
7 String Routines
The stdlib string package supports "C" style zero-terminated strings.
Most of these routines mirror their "C" counterpart. Of course, I've added a
few additional routines which seem useful to me.
7.1 Strcpy, Strcpyl
* Copies a zero terminated string from one buffer to another.
* Does not require the use of the DS segment register.
Inputs: ES:DI- Pointer to source string (Strcpy only).
CS:RET- Pointer to source string (Strcpyl only).
DX:SI- Pointer to destination string.
Outputs: ES:DI- Points at the destination string.
Include: stdlib.a
Strcpy is used to copy a zero-terminated string from one location to
another. ES:DI points at the source string, DX:SI points at the destination
address. Strcpy copies all bytes, up to and including the zero byte, from the
source address to the destination address. The target buffer must be large
enough to hold the string. Strcpy performs no error checking on the size of
the destination buffer.
Strcpyl copies the zero-terminated string immediately following the call
instruction to the destination address specified by DX:SI. Again, this routine
expects you to ensure that the target buffer is large enough to hold the
result.
Examples:
mov dx, seg target
mov si, offset target
Strcpyl
db "String for Strcpyl",0
;
; Copy that string to Target2 as well, note that ES:DI already points
; at "Target".
;
mov dx, seg Target2
mov si, offset Target2
Strcpy
7.2 StrDup, StrDupl
* Duplicates a string by copying a zero-terminated string from one
location to a newly allocated spot on the heap.
* Automatically allocates sufficient storage for destination string on
the heap.
* Does not require the use of the DS segment register.
Inputs: ES:DI- Pointer to source string (Strdup only).
CS:RET- Pointer to source string (Strdupl only).
Outputs: ES:DI- Points at the destination string allocated on heap.
Carry=0 if operation successful. Carry=1 if insufficient memory for
new string.
Include: stdlib.a
Strdup and strdupl duplicate strings. You pass them a pointer to the
string (in es:di for strdup, via the return address for strdupl) and they
allocate sufficient storage on the heap for a copy of this string. Then these
two routines copy their source strings to the newly allocated storage and
return a pointer to the new string in ES:DI.
Examples:
Strdupl
db "String for Strdupl",0
jc MallocError
mov word ptr Dest1, di
mov word ptr Dest1+2, es
;
; Create another copy of this string. Note that es:di points at
; Dest1 upon entry to Strdup, but it points at the new string on
; exit.
;
Strdup
jc MallocError
mov word ptr Dest2, di
mov word ptr Dest2+2, es
7.3 Strlen
* Computes the length of a zero terminated string.
Inputs: ES:DI- Pointer to source string.
Outputs: CX- Length of specified string.
Include: stdlib.a
Strlen computes the length of the string whose address appears in ES:DI.
It returns the number of characters up to, but not including, the zero
terminating byte.
Example:
les di, String
strlen
mov sl, cx
printf
db "Length of '%s' is %d\n",0
dd String, sl
7.4 Strcat, Strcat2, Strcatl, Strcat2l
* Concatenates one string to the end of another.
* Strcatl and Strcat2l allow literal string operands.
* Strcat2 and Strcat2l automatically allocate storage for destination
string.
Inputs: ES:DI- Pointer to first string.
DX:SI- Pointer to second string (Strcat & Strcat2 only).
CS:RET- Pointer to second string (Strcatl & Strcat2l only).
Outputs: ES:DI- Pointer to new string (Strcat2 & StrCat2l only).
Carry=0 No error. Carry=1 Insufficient memory (Strcat2 & StrCat2l
only).
Include: stdlib.a
These routines concatenate two strings together. They differ mainly in
the location of their source and destination operands.
Strcat concatenates the string pointed at by DX:SI to the end of the
string pointed at by ES:DI in memory (both strings must be zero-terminated).
The buffer pointed at by ES:DI must be large enough to hold the resulting
string. Strcat performs no bounds checking on the data.
Strcat2 works just like strcat except it does not append the second string
on to the end of the first. Instead, Strcat2 computes the length of the two
strings and attempts to allocate this much storage on the heap. If it is
unsuccessful, Strcat2 returns with the carry flag set. If it successfully
allocates this storage on the heap, it copies the string pointed at by es:di to
the heap and then concatenates the string dx:si points at to the end of this
string on the heap and returns with the carry flag clear and es:di pointing at
the new string on the heap.
Strcatl and Strcat2l work just like Strcat and Strcat2 except you supply
the second string as a literal constant immediately after the call rather than
pointing dx:si at it.
Examples:
les di, String1
mov dx, seg String2
lea si, String2
Strcat ;String1 <- String1 + String2
;
les di, String1
Strcatl
db "Appended String",0
;
les di, String1
mov dx, seg String2
lea si, String2
Strcat2 ;NewString<-String1+String2
puts
free
;
les di, String1
Strcat2l
db "Appended String",0
puts
free
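The difference between Strcat and Strcat2 is that Strcat2 builds the result in fresh heap storage and leaves both inputs alone. A C sketch of that behaviour follows; strcat2_model is a hypothetical name, and reserving one extra byte for the zero terminator is an assumption about how the length is computed.

```c
#include <stdlib.h>
#include <string.h>

/* Model of Strcat2: returns a newly allocated string holding `first`
   followed by `second`, or NULL if the allocation fails (carry=1). */
static char *strcat2_model(const char *first, const char *second)
{
    size_t n1 = strlen(first);
    size_t n2 = strlen(second);

    char *out = malloc(n1 + n2 + 1);   /* +1 for the terminator (assumed) */
    if (out == NULL)
        return NULL;

    memcpy(out, first, n1);            /* copy the first string */
    memcpy(out + n1, second, n2 + 1);  /* append the second, with its zero */
    return out;
}
```

As in the assembly examples, the caller should free the result when it is no longer needed.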
7.5 Strchr
* Searches for a single character inside a string.
Inputs: ES:DI- Pointer to string.
AL- Character to search for.
Outputs: CX- Position (starting at zero) where Strchr found the character.
Carry=0 if Strchr found the character.
Carry=1 if the character wasn't present in the string.
Include: stdlib.a
Strchr locates the first occurrence of a character within a string. It
searches through the zero-terminated string pointed at by es:di for the
character passed in AL. If it locates the character, it returns the position
of that character in the CX register. The first character in the string
corresponds to location zero. If the character is not in the string, Strchr
returns the carry flag set. CX's value is undefined in that case. If Strchr
locates the character in the string, it returns with the carry flag clear.
Example:
les di, String
mov al, Char2Find
strchr
jc NotPresent
mov CharPosn, cx
7.6 Strstr, Strstrl
* Searches for a substring inside another string.
Inputs: ES:DI- Pointer to string.
DX:SI- Pointer to substring (strstr).
CS:RET- Pointer to substring (strstrl).
Outputs: CX- Position (starting at zero) where Strstr/Strstrl found the
substring.
Carry=0 if Strstr/Strstrl found the substring.
Carry=1 if the substring wasn't present in the string.
Include: stdlib.a
Strstr searches for the position of a substring within another string.
ES:DI points at the string to search through, DX:SI points at the substring.
Strstr returns the index into ES:DI's string where DX:SI's string is found. If
the string is found, Strstr returns with the carry flag clear and CX contains
the (zero-based) index into the string. If Strstr cannot locate the substring
within the string ES:DI points at, it returns the carry flag set.
Strstrl works just like Strstr except it expects the substring to search
for immediately after the call instruction (rather than passing this address in
DX:SI).
Examples:
les di, MainString
lea si, Substring
mov dx, seg Substring
strstr
jc NoMatch
mov i, cx
printf
db "Found the substring '%s' at location %i\n",0
dd Substring, i
jmp Done
;
NoMatch: print
db "Could not find the substring.",cr,lf,0
Done: les di, MainString
strstrl
db "test",0
jc NoMatch2
print
db "Found 'test' in the string",cr,lf,0
jmp Done2
;
NoMatch2: print
db "Did not find 'test' in the string",cr,lf,0
Done2:
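Unlike C's strstr, which returns a pointer, Strstr reports a zero-based index plus a found/not-found flag. The C sketch below shows how the two relate. The name strstr_index_model is hypothetical, and -1 stands in for the carry=1 result.

```c
#include <string.h>

/* Model of Strstr's result: the zero-based position of `sub` inside `s`,
   or -1 where the assembly routine would return with the carry flag set. */
static long strstr_index_model(const char *s, const char *sub)
{
    const char *hit = strstr(s, sub);
    return hit != NULL ? (long)(hit - s) : -1;
}
```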
7.7 Strcmp, Strcmpl
* Compares two strings.
* Reflects comparison in 8086 condition code flags.
Inputs: ES:DI- Pointer to first string.
DX:SI- Pointer to second string (strcmp).
CS:RET- Pointer to second string (strcmpl).
Outputs: CX- Position (starting at zero) where the two strings differ.
Flags- hold the result of the comparison (should use unsigned
branches).
Include: stdlib.a
Strcmp and strcmpl compare two strings. Strcmp compares the string which
es:di points at to the string which dx:si points at. Strcmpl compares the
string which es:di points at to the string immediately following the call
instruction in the code stream. Strcmp(l) reflects the status of this
comparison in the flags register. Immediately upon return from strcmp(l) you
can use the unsigned jump instructions to test the comparison between the two
strings. Also (upon return), the CX register contains the index into the
strings where they are different (if the two strings are equal, Strcmp(l)
returns with CX containing the offset of the zero byte in the two strings).
Examples:
les di, String1
mov dx, seg String2
lea si, String2
strcmp
jae s1GEs2
mov i, cx
printf
db "String1 is less than String2 and they "
db "differ at position %i\n",0
dd i
;
les di, String3
strcmpl
db "Hello",0
jbe S3BEHello
;
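Strcmp returns two things at once: an ordering (in the flags) and the index of the first difference (in CX). The C sketch below models both results. The type cmp_result_t and the function strcmp_model are hypothetical, with `order` standing in for the flags: negative for below, zero for equal, positive for above.

```c
#include <stddef.h>

typedef struct {
    int    order;   /* <0, 0, >0: models the flags after Strcmp */
    size_t index;   /* models CX: where the strings first differ */
} cmp_result_t;

static cmp_result_t strcmp_model(const char *a, const char *b)
{
    size_t i = 0;

    /* Advance while the characters match and the strings have not ended. */
    while (a[i] != '\0' && a[i] == b[i])
        i++;

    /* Compare as unsigned bytes, matching the advice to use unsigned
       branches (jb, ja, and so on) after the call. */
    unsigned char ca = (unsigned char)a[i];
    unsigned char cb = (unsigned char)b[i];

    cmp_result_t r = { (ca > cb) - (ca < cb), i };
    return r;
}
```

For equal strings the loop stops on the zero byte, so `index` is the offset of the terminator, as the text describes for CX.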
7.8 Stricmp, Stricmpl
* Compares two strings ignoring differences in alphabetic case.
* Reflects comparison in 8086 condition code flags.
Inputs: ES:DI- Pointer to first string.
DX:SI- Pointer to second string (stricmp).
CS:RET- Pointer to second string (stricmpl).
Outputs: CX- Position (starting at zero) where the two strings differ.
Flags- hold the result of the comparison (should use unsigned
branches).
Include: stdlib.a
Stricmp and Stricmpl work just like Strcmp and Strcmpl except that these
two routines are case insensitive. Strcmp and Strcmpl treat "GETS" and "gets"
as different strings. Stricmp and Stricmpl treat these two strings as equal.
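A case-insensitive comparison can be modelled in C by folding each character to one case before comparing. The sketch below folds to upper case; that choice is an assumption (folding to lower case would order the few punctuation characters between 'Z' and 'a' differently), and stricmp_model is a hypothetical name.

```c
#include <ctype.h>

/* Model of Stricmp: returns <0, 0, or >0 like the flags after the call.
   Characters are folded to upper case before comparison (assumption). */
static int stricmp_model(const char *a, const char *b)
{
    for (;; a++, b++) {
        int ca = toupper((unsigned char)*a);
        int cb = toupper((unsigned char)*b);
        if (ca != cb || ca == 0)       /* mismatch, or both strings ended */
            return (ca > cb) - (ca < cb);
    }
}
```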
7.9 Strupr, Strupr2
* Converts all of the lower case characters in a string to upper case.
* Converts the characters in place (Strupr) or creates a new string on
the heap for the converted string (Strupr2).
Inputs: ES:DI- Pointer to string.
Outputs: ES:DI- Pointer to new string on heap (Strupr2 only).
Carry=1 if memory allocation error (Strupr2 only).
Include: stdlib.a
Strupr and Strupr2 convert the alphabetic characters in a string to upper
case. You pass the address of the string containing the characters you want to
convert in ES:DI. Strupr converts the characters in place. That is, it will
actually modify the string you pass to it. Strupr2 first calls strdup to
duplicate the string (on the heap) and then it converts the characters in this
duplicate string to upper case, returning the pointer to the new string in
ES:DI.
Examples:
les di, Str2Cnvrt
strupr
les di, Str2Cnvrt
puts
les di, Str2Cnvrt2
strupr2
puts
free
7.10 Strlwr, Strlwr2
* Converts all of the upper case characters in a string to lower case.
* Converts the characters in place (Strlwr) or creates a new string on
the heap for the converted string (Strlwr2).
Inputs: ES:DI- Pointer to string.
Outputs: ES:DI- Pointer to new string on heap (Strlwr2 only).
Carry=1 if memory allocation error (Strlwr2 only).
Include: stdlib.a
Strlwr and Strlwr2 convert the alphabetic characters in a string to lower
case. You pass the address of the string containing the characters you want to
convert in ES:DI. Strlwr converts the characters in place. That is, it will
actually modify the string you pass to it. Strlwr2 first calls strdup to
duplicate the string (on the heap) and then it converts the characters in this
duplicate string to lower case, returning the pointer to the new string in
ES:DI.
Examples:
les di, Str2Cnvrt
strlwr
les di, Str2Cnvrt
puts
les di, Str2Cnvrt2
strlwr2
puts
free
7.11 Strset, Strset2
* Initializes all the characters in a string to a single value.
* Automatically allocates storage on the heap for the string (Strset2
only).
Inputs: ES:DI- Pointer to string (Strset only)
AL- Character to initialize the string with.
CX- Length of string (Strset2 only).
Outputs: ES:DI- Pointer to new string on heap (Strset2 only).
Carry=1 if memory allocation error (Strset2 only).
Include: stdlib.a
Strset and Strset2 initialize strings such that each element of the string
contains the same value (passed in AL). Strset overwrites the data in an
existing string, replacing the characters previously in the string. To use
Strset, simply load ES:DI with the address of a string, load AL with the
character you want to overwrite the string with, and then call Strset. Strset
will replace each existing character (up to the zero terminating byte) of the
string with the character in AL.
Strset2 lets you create a brand-new string. You pass the initialization
character in AL and the length of the string in CX. Strset2 allocates CX+1
bytes on the heap and initializes the first CX bytes to the value in AL. It
stores a zero in the last memory location.
Examples:
les di, Str2Cnvrt
mov al, '*'
Strset
;
mov al, '#'
mov cx, 32
Strset2
puts
free
;
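A C model of the two routines may make the CX+1 allocation easier to see.
This is a sketch of the documented behavior only; model_strset and
model_strset2 are hypothetical names, and a NULL return stands in for the
carry flag.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Overwrite every character up to (but not including) the zero byte. */
static void model_strset(char *s, char fill)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        *s = fill;
}

/* Allocate len+1 bytes, fill the first len, store zero in the last one. */
static char *model_strset2(char fill, size_t len)
{
    char *s = malloc(len + 1);

    if (s == NULL)
        return NULL;                /* assembly version: carry = 1 */
    memset(s, fill, len);
    s[len] = '\0';
    return s;
}
```

Note that model_strset never changes the length of the string; it only
replaces the characters that are already there.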
7.12 Strspan, Strspanl
* Allows you to skip over successive characters in a string.
* Very compact implementation.
Inputs: ES:DI- Pointer to string to scan.
DX:SI- Pointer to character set (Strspan only).
CS:RET- Pointer to character set (Strspanl only).
Outputs: CX- First position where Strspan(l) could not find a character in
the attendant character set. This is the index of the zero
terminating byte of the string if all of the characters in the
string were present in the set.
Include: stdlib.a
Strspan(l) scans a string counting the number of characters which are
present in a second string (which represents a character set). While each
successive character in the source string is present in the character set,
Strspan(l) advances past it. ES:DI points at a zero-terminated string of
characters to check. DX:SI (strspan) or CS:RET (strspanl) points at another
zero-terminated string containing the set of characters to compare against.
While the character that ES:DI points at is present (anywhere) in the character
set string, the routine advances to the next character and bumps a counter by
one. Upon encountering a character which is not in the character set string,
the routine terminates and returns the number of characters (i.e., an index
into the string) where the mismatch occurred.
Although strspan (and, especially, strspanl) is very compact and
convenient to use, it is not particularly efficient. The character set
routines described in the next section provide a much faster alternative at the
expense of a little more space.
Examples:
les di, String
mov dx, seg CharSet
lea si, CharSet
strspan
mov i, cx
printf
db "The first char which is not in CharSet "
db "occurs at position %d in String.\n",0
dd i
;
les di, String
strspanl
db "aeiou",0
mov j, cx
printf
db "The first char which is not a vowel "
db "occurs at position %d in String.\n",0
dd j
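The scan described above behaves much like the C library function strspn.
The following sketch spells the loop out; model_strspan is a hypothetical
name, and the return value plays the role of the count the assembly routine
leaves in CX.

```c
#include <assert.h>
#include <string.h>

/* Count leading characters of s that appear anywhere in set. */
static size_t model_strspan(const char *s, const char *set)
{
    size_t i = 0;

    assert(s != NULL && set != NULL);
    /* Test for the terminator first: strchr would otherwise "find"
       the zero byte at the end of set. */
    while (s[i] != '\0' && strchr(set, s[i]) != NULL)
        i++;
    return i;
}
```

Each step searches the whole set string, which is why the text calls the
routine compact but not particularly efficient.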
7.13 Strcspan, Strcspanl
* Allows you to skip past characters in a string which are not members
of a particular character set.
Inputs: ES:DI- Pointer to string to scan.
DX:SI- Pointer to character set (Strcspan only).
CS:RET- Pointer to character set (Strcspanl only).
Outputs: CX- First position where Strcspan(l) found a character in the
attendant character set. This is the index of the zero terminating
byte of the string if none of the characters in the string were in
the set.
Include: stdlib.a
These two routines work just like strspan and strspanl except they skip
over characters which are not in the set rather than skipping over characters
that are in the associated character set.
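In C terms the only change from the strspan loop is the sense of the test,
which makes the routine comparable to the standard strcspn. This is a sketch
under that assumption; model_strcspan is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Count leading characters of s that do NOT appear in set. */
static size_t model_strcspan(const char *s, const char *set)
{
    size_t i = 0;

    assert(s != NULL && set != NULL);
    while (s[i] != '\0' && strchr(set, s[i]) == NULL)
        i++;
    return i;
}
```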
8 Character Set Routines
The character set routines let you deal with groups of characters as a set
rather than a string. A set is an unordered collection of objects where
membership (presence or absence) is the only important quality. I designed the
stdlib set routines to let you quickly check whether an ASCII character is in a
set and to quickly add characters to, or remove characters from, a set. These
operations are the ones most commonly used on character sets. The other
operations (like union, intersection, difference, etc.) are useful, but are not
used nearly as often. Therefore, I've optimized the data structure for sets to
handle the membership and add/delete operations at the slight expense of the
others.
Character sets are implemented via bit vectors. A "1" bit means that an
item is present in the set and a "0" bit means that the item is absent from the
set. The most common implementation of a character set is to use 32
consecutive bytes, eight bits per byte, giving 256 bits (one bit for each
character in the character set). While this makes certain operations (like
assignment, union, intersection, etc.) fast and convenient, other operations
(membership, add/remove items) run much slower. Since these are the more
important operations, I've chosen a different data structure to represent
sets. A faster approach is to simply use a byte value for each item in the
set. This offers a major advantage over the 32-byte scheme: operations like
membership are very fast (since all you've got to do is index into an array
and test the resulting value). It has two drawbacks: first, operations like
set assignment, union, difference, etc., require 256 operations rather than
32. Second, it takes eight times as much memory.
The first drawback, speed, is of little consequence. You'll rarely use
the operations so affected, so the fact that they run a little slower hardly
matters. Wasting 224 bytes per set is a problem, however, especially if
you have a lot of character sets.
The approach I've used is to allocate 272 bytes. The first eight bytes
contain bit masks, 1, 2, 4, 8, 16, 32, 64, and 128. Each mask tells you which
bit in the following 264 bytes is associated with a particular set. This lets
me pack eight sets into 272 bytes (34 bytes per character set). This provides
almost the speed of the 256-byte set with only a two byte overhead per set
compared with the 32-byte bit vector.
In the stdlib.a file there is a macro that lets you define a group of
character sets: set. You use the macro as follows:
set set1, set2, set3, ..., set8
You must supply between one and eight labels in the operand field. These
are the names of the sets you want to create. The set macro automatically
attaches these labels to the appropriate mask bytes in the set. The actual bit
patterns for the set begin eight bytes later (from each label). Therefore, the
byte corresponding to chr(0) is staggered by one byte for each set (which
explains the other eight bytes needed above and beyond the 256 required for the
set).
When using the set manipulation routines, you should always pass the
address of the mask byte (i.e., the seg/offset of one of the labels above) to
the particular set manipulation routine you're using. Passing the address of
the structure created with the macro above will reference only the first set in
the group.
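The layout just described can be modeled in C as a flat 272-byte array. This
is a sketch inferred from the description above, not a transcription of the
library source; group_init, set_member and set_add are hypothetical names. A
"set" is represented by a pointer to its mask byte, just as the routines
expect in ES:DI.

```c
#include <assert.h>
#include <string.h>

enum { GROUP_SETS = 8, GROUP_BYTES = 272 };

/* Build an empty group: mask bytes 1, 2, 4, ..., 128, then zeroed flags. */
static void group_init(unsigned char g[GROUP_BYTES])
{
    int k;

    memset(g, 0, GROUP_BYTES);
    for (k = 0; k < GROUP_SETS; k++)
        g[k] = (unsigned char)(1u << k);
}

/* 'set' points at one of the eight mask bytes. The flag byte for
   character c lies 8 + c bytes past that mask byte, so successive sets
   are staggered by one byte and never share a (byte, bit) pair. */
static int set_member(const unsigned char *set, unsigned char c)
{
    return (set[8 + c] & set[0]) != 0;
}

static void set_add(unsigned char *set, unsigned char c)
{
    set[8 + c] |= set[0];
}
```

With this layout, 'A' in the fourth set (g + 3) and 'B' in the third set
(g + 2) use the same byte of the array but different bits, so adding one
does not affect the other. Membership costs one indexed load and one
bit test.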
Note that you can use the set operations for fast pattern matching
applications. The set membership operation, for example, is much faster than
the strspan routine found in the string package. Proper use of character sets
can produce a program which runs much faster than some of the equivalent string
operations.
8.1 Createsets
* Allocates storage for eight character sets on the heap.
Inputs: None.
Outputs: ES:DI- Pointer to eight sets.
Carry=0 if no error.
Carry=1 if insufficient memory to allocate storage for sets.
Include: stdlib.a
Createsets allocates 272 bytes on the heap. This is sufficient room for
eight character sets. It then initializes the first eight bytes of this
storage with the proper mask values for each set. Location es:0[di] gets set
to 1, location es:1[di] gets 2, location es:2[di] gets 4, etc. The createsets
routine also initializes all of the sets to the empty set by clearing all the
bits to zero.
Example:
createsets
jc NoMemory
mov word ptr SetPtr, di
mov word ptr SetPtr+2, es
;
8.2 EmptySet
* Clears all of the bits for a particular set to zero.
Inputs: ES:DI- pointer to first byte of desired set.
Outputs: None.
Include: stdlib.a
Emptyset clears out the bits in a character set to zero (thereby setting
it to the empty set). Upon entry, es:di must point at the first byte of the
character set you want to clear. Note that this is not necessarily the address
returned by createsets, which identifies only the first set in the group. The
first eight bytes of a character set structure are the mask bytes of eight
different sets, and the address of each mask byte identifies one set. ES:DI
must point at one of these bytes upon entry into emptyset.
Example:
les di, SetPtr
add di, 3 ;Point at 4th set in group.
emptyset
;
8.3 RangeSet
* Adds all of the elements between two values to a set.
Inputs: ES:DI- pointer to first byte of desired set.
AL- Lower bound for range of items.
AH- Upper bound for range (must be greater than AL).
Outputs: None.
Include: stdlib.a
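Rangeset adds a range of values to a set (via a union operation). Every
character from AL through AH, inclusive, becomes a member of the set; items
already in the set are unaffected.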
Example:
les di, SetPtr
add di, 4 ;Point at 5th set in group.
mov al, 'A' ;Add in the upper case letters
mov ah, 'Z'
rangeset
;
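A C model of the range union, assuming the staggered 272-byte layout
described at the start of this section. The function names are hypothetical,
and treating the upper bound as inclusive is an assumption based on the
'A'..'Z' example.

```c
#include <assert.h>
#include <string.h>

enum { GROUP_BYTES = 272 };

/* Mask bytes 1, 2, 4, ..., 128 followed by zeroed flag bytes. */
static void group_init(unsigned char g[GROUP_BYTES])
{
    int k;

    memset(g, 0, GROUP_BYTES);
    for (k = 0; k < 8; k++)
        g[k] = (unsigned char)(1u << k);
}

static int set_member(const unsigned char *set, unsigned char c)
{
    return (set[8 + c] & set[0]) != 0;
}

/* Union the characters lo..hi (inclusive) into the set. */
static void model_rangeset(unsigned char *set,
                           unsigned char lo, unsigned char hi)
{
    unsigned c;                     /* wider than a byte so hi = 255 ends */

    assert(lo < hi);                /* AH must be greater than AL */
    for (c = lo; c <= hi; c++)
        set[8 + c] |= set[0];
}
```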
8.4 Addstr, Addstrl
* Adds all of the characters from a string to a set.
Inputs: ES:DI- pointer to first byte of desired set.
DX:SI- pointer to string to add to set (Addstr only).
CS:RET-pointer to string to add to set (Addstrl only).
Outputs: None.
Include: stdlib.a
Addstr lets you add a group of characters to a set by specifying a string
containing the characters you want in the set. To Addstr you pass a pointer to
a zero-terminated string in dx:si. Addstr will add (union) each character from
this string into the set. Addstrl lets you specify the string as a literal
constant immediately after the call to addstrl.
Example:
les di, SetPtr
add di, 1 ;Point at 2nd set in group.
mov dx, seg CharStr ;Pointer to string containing
lea si, CharStr ; chars to add to set.
addstr ;Union in these characters.
;
les di, SetPtr ;Point at first set in group.
addstrl
db "AaBbCcDdEeFf0123456789",0
;
8.5 Rmvstr, Rmvstrl
* Removes all of the characters in a string from a set.
Inputs: ES:DI- pointer to first byte of desired set.
DX:SI- pointer to string to remove from set (Rmvstr only).
CS:RET-pointer to string to remove from set (Rmvstrl only).
Outputs: None.
Include: stdlib.a
Rmvstr is the converse operation to Addstr. It removes from a set the
characters appearing in the associated character string. Rmvstrl works the
same way except you pass the string of characters immediately after the call
rather than via a pointer in DX:SI.
Example:
les di, SetPtr
add di, 1 ;Point at 2nd set in group.
mov dx, seg CharStr ;Pointer to string containing
lea si, CharStr ; chars to remove from set.
rmvstr ;Remove these characters.
;
les di, SetPtr ;Point at first set in group.
rmvstrl
db "AaBbCcDdEeFf0123456789",0
;
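Addstr and Rmvstr are mirror images of each other: one sets the set's bit for
every character in the string, the other clears it. The C sketch below models
both, assuming the staggered 272-byte layout described at the start of this
section; all names are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum { GROUP_BYTES = 272 };

/* Mask bytes 1, 2, 4, ..., 128 followed by zeroed flag bytes. */
static void group_init(unsigned char g[GROUP_BYTES])
{
    int k;

    memset(g, 0, GROUP_BYTES);
    for (k = 0; k < 8; k++)
        g[k] = (unsigned char)(1u << k);
}

static int set_member(const unsigned char *set, unsigned char c)
{
    return (set[8 + c] & set[0]) != 0;
}

/* Union every character of a zero-terminated string into the set. */
static void model_addstr(unsigned char *set, const char *chars)
{
    for (; *chars != '\0'; chars++)
        set[8 + (unsigned char)*chars] |= set[0];
}

/* Remove every character of a zero-terminated string from the set. */
static void model_rmvstr(unsigned char *set, const char *chars)
{
    for (; *chars != '\0'; chars++)
        set[8 + (unsigned char)*chars] &= (unsigned char)~set[0];
}
```

Removing a character that is not in the set is harmless, since clearing an
already clear bit changes nothing.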
8.6 AddChar
* Adds a single character to a set.
Inputs: ES:DI- pointer to first byte of desired set.
AL- character to add to the set.
Outputs: None.
Include: stdlib.a
AddChar lets you add a single character (passed in AL) to a set.
Example:
les di, SetPtr
add di, 1 ;Point at 2nd set in group.
mov al, Ch2Add ;Character to add to set.
addchar
8.7 RmvChar
* Removes a single character from a set.
Inputs: ES:DI- pointer to first byte of desired set.
AL- character to remove from the set.
Outputs: None.
Include: stdlib.a
RmvChar lets you remove a single character (passed in AL) from a set.
Example:
les di, SetPtr
add di, 1 ;Point at 2nd set in group.
mov al, Ch2Rmv ;Character to remove from set.
rmvchar
8.8 Member
* Checks a character value to see if it is in the set.
Inputs: ES:DI- pointer to first byte of desired set.
AL- character to check.
Outputs: Zero flag=1 if character is in the set.
Zero flag=0 if character is not in the set.
Include: stdlib.a
Member lets you check for set membership, that is, it lets you see if a
character value is present in some set. This routine is probably the
most-often called routine in the collection of set routines.
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
mov al, Ch2Chk ;Character to check.
member
je IsInSet
;
8.9 CopySet
* Copies one set to another.
Inputs: ES:DI- pointer to first byte of destination set.
DX:SI- pointer to first byte of source set.
Outputs: None.
Include: stdlib.a
CopySet copies the items from one set to another. This is a straight
assignment, not a union operation. After the operation the destination set is
identical to the source set, both in terms of the elements present in the set
and those absent from it.
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
mov dx, seg SetPtr2 ;Point at first set in group.
lea si, SetPtr2
copyset
;
8.10 SetUnion
* Unions (adds) the members of one set into another.
Inputs: ES:DI- pointer to first byte of destination set.
DX:SI- pointer to first byte of source set.
Outputs: None.
Include: stdlib.a
The SetUnion routine computes the union of two sets. That is, it adds all
of the items present in a source set to a destination set. This operation
preserves items present in the destination set before the SetUnion operation.
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
mov dx, seg SetPtr2 ;Point at first set in group.
lea si, SetPtr2
setunion
;
8.11 SetIntersect
* Computes the intersection of two sets.
Inputs: ES:DI- pointer to first byte of destination set.
DX:SI- pointer to first byte of source set.
Outputs: None.
Include: stdlib.a
SetIntersect computes the intersection of two sets, leaving the result in
the destination set. The new set consists only of those items which previously
appeared in both the source and destination sets.
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
mov dx, seg SetPtr2 ;Point at first set in group.
lea si, SetPtr2
setintersect
;
8.12 SetDifference
* Computes the difference of two sets.
Inputs: ES:DI- pointer to first byte of destination set.
DX:SI- pointer to first byte of source set.
Outputs: None.
Include: stdlib.a
SetDifference computes the result of (ES:DI) := (ES:DI) - (DX:SI). The
destination set is left with its original items minus those items which are
also in the source set.
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
mov dx, seg SetPtr2 ;Point at first set in group.
lea si, SetPtr2
setdifference
;
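The four whole-set operations (CopySet, SetUnion, SetIntersect and
SetDifference) differ only in how they combine the source and destination
bits for each of the 256 characters. The following C sketch models them under
the staggered 272-byte layout described at the start of this section. All
names are hypothetical, and the loops illustrate the semantics rather than
the library's actual code.

```c
#include <assert.h>
#include <string.h>

enum { GROUP_BYTES = 272 };

/* Mask bytes 1, 2, 4, ..., 128 followed by zeroed flag bytes. */
static void group_init(unsigned char g[GROUP_BYTES])
{
    int k;

    memset(g, 0, GROUP_BYTES);
    for (k = 0; k < 8; k++)
        g[k] = (unsigned char)(1u << k);
}

static int has(const unsigned char *set, unsigned c)
{
    return (set[8 + c] & set[0]) != 0;
}

static void put(unsigned char *set, unsigned c, int on)
{
    if (on)
        set[8 + c] |= set[0];
    else
        set[8 + c] &= (unsigned char)~set[0];
}

/* dst := src */
static void model_copyset(unsigned char *dst, const unsigned char *src)
{
    unsigned c;
    for (c = 0; c < 256; c++)
        put(dst, c, has(src, c));
}

/* dst := dst + src */
static void model_setunion(unsigned char *dst, const unsigned char *src)
{
    unsigned c;
    for (c = 0; c < 256; c++)
        put(dst, c, has(dst, c) || has(src, c));
}

/* dst := items present in both dst and src */
static void model_setintersect(unsigned char *dst, const unsigned char *src)
{
    unsigned c;
    for (c = 0; c < 256; c++)
        put(dst, c, has(dst, c) && has(src, c));
}

/* dst := dst - src */
static void model_setdifference(unsigned char *dst, const unsigned char *src)
{
    unsigned c;
    for (c = 0; c < 256; c++)
        put(dst, c, has(dst, c) && !has(src, c));
}
```

Each operation visits all 256 flag bytes, which is the cost the byte-per-item
layout accepts in exchange for fast membership tests.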
8.13 NextItem
* Locates the next (first) available item in a set.
* Searches for items in ascending order using the ASCII collating
sequence.
Inputs: ES:DI- pointer to first byte of set.
Outputs: AL- Contains first item found in set (zero if the set is empty).
Include: stdlib.a
NextItem searches for the next available item in a set. It returns the
ASCII code of the character it finds in the AL register. If the set is empty,
NextItem returns zero (since chr(0) is illegal). This call does not affect the
set in any way. In particular, after the call the character located will still
be present in the set.
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
nextitem
mov ch2, al
;
8.14 RmvItem
* Locates the next (first) available item in a set and then removes
that item from the set.
* Searches for items in ascending order using the ASCII collating
sequence.
Inputs: ES:DI- pointer to first byte of set.
Outputs: AL- Contains first item found in set (zero if the set is empty).
Include: stdlib.a
RmvItem searches for the next available item in a set. It returns the
ASCII code of the character it finds in the AL register and removes that item
from the set. If the set is empty, RmvItem returns zero (since chr(0) is
illegal).
Example:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
rmvitem
mov ch3, al
;
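NextItem and RmvItem can be modeled in C as a linear scan in ascending
character order, assuming the staggered 272-byte layout described at the
start of this section. The names are hypothetical. Starting the scan at
chr(1) is an assumption of this sketch, made because zero doubles as the
"empty set" result.

```c
#include <assert.h>
#include <string.h>

enum { GROUP_BYTES = 272 };

/* Mask bytes 1, 2, 4, ..., 128 followed by zeroed flag bytes. */
static void group_init(unsigned char g[GROUP_BYTES])
{
    int k;

    memset(g, 0, GROUP_BYTES);
    for (k = 0; k < 8; k++)
        g[k] = (unsigned char)(1u << k);
}

/* Return the lowest-valued member, or 0 if the set is empty.
   The set itself is not modified. */
static unsigned char model_nextitem(const unsigned char *set)
{
    unsigned c;

    for (c = 1; c < 256; c++)
        if (set[8 + c] & set[0])
            return (unsigned char)c;
    return 0;
}

/* Same search, but the item found is also removed from the set. */
static unsigned char model_rmvitem(unsigned char *set)
{
    unsigned char c = model_nextitem(set);

    if (c != 0)
        set[8 + c] &= (unsigned char)~set[0];
    return c;
}
```

Calling model_rmvitem repeatedly until it returns zero enumerates the members
of a set in ascending order and leaves the set empty.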