Contents of the DLL.TXT file
OS/2 Dynamic Link Libraries
by Ross M. Greenberg
You are in a maze of twisty little passages, all alike.
>USE DYNAMIC LINK LIBRARY
I see no DLL here.
I don't know how to make a DLL.
an 80286 machine
An OS/2 toolkit. In the toolkit is:
a text editor
a C compiler
>USE TOOLS IN TOOLKIT TO MAKE DLL.
Like an ADVENTURE game, not knowing the keywords when you're
trying to create a Dynamic Link Library can be very frustrating.
Once you know the keywords, though, you can explore new areas of
the game, and bring home new prizes and treasures.
The objective of this article is to help teach you some of the
new keywords and key techniques necessary to understand and build
a Dynamic Link Library of your own.
What is a Dynamic Link Library and Why Should I Care?
Well, I think the idea of Dynamic Link Libraries is one of the
more important concepts OS/2 introduces. Although the principle
of DLL's has been around for a while (usually called shareable
libraries in other operating systems), OS/2 is one which makes
such a shareable library an intrinsic part of the operating
system, not just an added-on, "neat" idea. In fact, the system
library functions themselves are DLL's in OS/2, with a clear
separation by device type, allowing an easy upgrade route.
In MS-DOS, after you compile a program, you next link it with
other portions of the program and with portions from a library of
commonly used routines. The end result is a stand-alone piece
of code which is loaded into memory, outstanding address references
resolved, and then executed. The physical file resulting from
the link contains portions of the library it used: two programs
which use the printf() function will each contain a copy of the
library functions which comprise that ubiquitous function.
In a single-tasking operating system with a sufficiently large
hard disk, this really isn't a problem. In a multi-tasking
operating system which allows for shared memory usage, loading
multiple copies of the same code seems wasteful. OS/2 obviates
this need by allowing only one copy of a given function to be
loaded in memory and to have this copy shared by any task which
seeks its functionality. Since the function itself is not
physically part of the program file, it is possible for the
executable to be rather small, and to only update the library as
required. The concept of separate overlay files (and complicated
linkers) is no longer needed: just include the specific DLL's
required and let the operating system do the rest.
But the advantages of using DLL's go far beyond the convenience:
there is a functionality to DLL's which can be exploited in many
different ways. One way includes the ability for two (or more)
totally separate and distinct programs to be able to share memory
by simply accessing the same run-time routine. This message
passing, already an intrinsic part of OS/2, can be fine-tuned
with DLLs to fit the exact needs you might have in a complicated
This is not without a price though: there are some "tricks" to
writing an operable DLL, and some cautions and caveats. I
discovered some of them the hard way while preparing an
example of DLL usage for this article.
Of course, if you wish to avoid the supposed complexity of using
these useful techniques, there is nothing in OS/2 to prevent you
from using the standard, and more familiar, linking techniques of
the past. Except, perhaps, the knowledge that There Is A Better
A Dynamic Link Library is not simply a new way of linking an old
library. There are some intrinsic differences between the two
techniques. A look at how statically linked libraries are linked
into your code will help in understanding how the new DynaLink
approach differs, and why it is a better way.
When you compile a standard C program, the resulting output of
the compiler is an object file. The object file contains a
number of different types of records. There are different record
types for procedures and routines, externally accessible
variables, local stack variables and so on. Each unique type of
item has a unique record type associated with it.
There is also a unique record type which indicates where the
code for a routine starts. Another unique record type has the
name of the routine and a pointer to that routine's code. Some
record types indicate that the routine requested is not found,
and hence must be external to the object module. Other record
types indicate that the object is an externally located data
item. And so on.
The important thing is that each call to a routine which cannot
be resolved within the particular source module is changed into a
parameter which simply involves an "external item" record.
Helping out the compiler, you can specify the call type as being
a near or far routine, or a near or far external item of some
After you've finished compiling all of your various source
modules into their object modules, you next link them together,
along with some appropriate libraries, and end up with an
executable piece of code. What does the linker actually do,
The linker examines each object module it sees (usually in the
order in which they're presented) and keeps a list of all record
types which either indicate a request for an external item or
which define a global item. Then as it sees record types which
indicate actual code routines, it determines that routine's
placement and resolves all calls for it into an actual address
within the eventual output file. Far calls, of course, indicate
not just an offset within a 64K segment, but allow additional
segments to be addressed, which in turn allow much larger
programs to be created.
There is nothing intrinsically foreign to the compiler in the
concept of mixed memory-model code as long as it knows how a routine
will be called. The compiler will generate a far return for routines
defined as far calls, and near returns for near calls. Addressing of
"far" data items is resolved in a similar way: the compiler puts out
a record type which the linker can understand and resolve into an actual
segment and offset (there's an extra step for the actual loading and
executing of the code, covered below).
Whatever items are not resolved within the linking of the various
object modules are next searched for in the libraries. These
libraries are, basically, object modules with nothing except
local references resolved. A module in the library starts off as
a simple object file usually, and then is stored and indexed into
a library as an entire unit: it is not stored on a routine by
routine basis, but rather on an object file by object file basis.
The appropriate routine or external item is found in the library,
the module is then pulled from the library and inserted into the
executable form. All references to it are resolved and the
process continues. An important consideration is that the object
module originally loaded into the library as one unit is pulled
from it as one unit as well, even if only one of the functions
defined in the module is referenced.
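To make the "one unit in, one unit out" behaviour concrete, here
is a minimal sketch in C of a library search. The Module structure,
the function names and the idea of counting bytes are all invented
for illustration; no real librarian uses this format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4

/* One library member: a whole object module and the routines it defines. */
typedef struct {
    const char *name;               /* module name, e.g. "output.obj" */
    const char *defines[MAX_SYMS];  /* public routines, NULL-terminated */
    int         size;               /* bytes of code in the whole module */
    int         linked;             /* nonzero once pulled into the image */
} Module;

/* Return the library member that defines `symbol`, or NULL if none does. */
Module *find_module_for(Module *lib, size_t count, const char *symbol)
{
    assert(lib != NULL && symbol != NULL);
    for (size_t i = 0; i < count; i++)
        for (size_t j = 0; j < MAX_SYMS && lib[i].defines[j] != NULL; j++)
            if (strcmp(lib[i].defines[j], symbol) == 0)
                return &lib[i];
    return NULL;
}

/* Resolve one external reference. The whole module joins the image the
   first time any one of its routines is needed. Returns the number of
   bytes added to the image, or -1 if no member defines the symbol. */
int resolve_external(Module *lib, size_t count, const char *symbol)
{
    Module *m = find_module_for(lib, count, symbol);
    if (m == NULL)
        return -1;
    if (m->linked)
        return 0;           /* already in the image, nothing more to add */
    m->linked = 1;
    return m->size;
}
```

Asking for a single routine costs the size of its entire module,
while a later request for a sibling routine in that module costs
nothing more.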
The end result is a totally self-contained image out on disk.
This image is loaded at some address (called the base address)
when you run it. The base segment address is added to all of the
other segment addresses throughout the code in the mysterious
load routines, and then finally, with a simple call or jump, your
program is executed.
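The "mysterious load routines" are less mysterious in code. The
sketch below is a simplified model in C, not the real loader: it
assumes the image is an array of 16-bit words and that the linker
left behind a list of the word positions holding segment values.
The function name and the layout are my own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add the load-time base segment to every word of the image that the
   linker flagged as holding a segment address. `fixups` lists the word
   indexes of those locations; all other words are left untouched. */
void apply_segment_fixups(uint16_t *image, size_t image_words,
                          const size_t *fixups, size_t fixup_count,
                          uint16_t base_segment)
{
    assert(image != NULL && fixups != NULL);
    for (size_t i = 0; i < fixup_count; i++) {
        assert(fixups[i] < image_words);
        image[fixups[i]] = (uint16_t)(image[fixups[i]] + base_segment);
    }
}
```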
That's basically how static linking works, with the more
technical details glossed over.
What, then, are the differences between Dynamic Linking and the
idea of statically linking an already existing library? Well,
the differences in the process are not all that substantial. The
end result is, though. And because of that, the conceptual
design of the Dynamic Link Library is different, as I describe
With a "normal" library, you compile all of the object modules
you'll need, then use a librarian program to create a library.
The library itself is in some strange format, suitable only to
linkers and librarian programs.
Things are a little different with DLL's, though. First, there
are two separate link steps. You must link the constituent
object file members which form the DLL together, and then link
your own code with the resulting DLL. Creating the DLL itself,
however, requires a bit of work:
After you've created the object files from your source, you use
the normal linker to create the DLL (plus a special file
described below), and its format is really no different than a
normal EXE file (really: it even has the 'MZ' as its first two
bytes!). It is therefore admirably suited for the standard
system loader to load as if it were actually a program. Later on
the system will, basically, do just that for the initialization
routine. Typically, the new library will have an extension of
The special file, described fully below, is called a module
definition file. It describes the external interface for each of
the accessible routines: their public names and their
attributes. Anything not specifically mentioned in the DEF file
can not be accessed routinely by an outside program. This
definition file is called the "export module definition file".
By running the export DEF file through a program called IMPLIB
(Import Librarian), a special library file can be created. This
library file is conceptually similar to the "standard" idea of a
library, and hence has the LIB extension. (See Figure x)
An alternative to running the special file through the IMPLIB program
is to create what is in essence the *inverse* of the export DEF file.
Such a file is called an "import module definition" file. It, too,
has the extension of DEF. (See Figure x)
When you link the DLL with your own code, the linker sees the
special record format of the import library (the LIB file created
by IMPLIB), or reads the import DEF file, and creates special
records which are understood by OS/2's program load facilities.
The end result of a link which uses DLL's is a hybrid file. It
can be considered a partial EXE and a partial OBJ at the
same time. A compiled object module will resolve local variables
and routines into a segment and an offset, leaving external
references virtually undefined. Linking a Dynalink program
results in an EXE coming from the linker with its external
references to DLL routines effectively unresolved.
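One way to picture those unresolved references is as a table of
records, each naming a DLL and an entry point, which the loader
patches once it knows where the DLL lives. The C sketch below is
only a model: the ImportRecord and LoadedModule structures, the
CHATLIB name used when exercising it, and the use of ordinals as
the lookup key are assumptions, not the real file format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* An unresolved reference left in the EXE by the linker. */
typedef struct {
    const char *module;     /* DLL name recorded by the linker */
    unsigned    ordinal;    /* entry point within that DLL, from 1 */
    unsigned    selector;   /* filled in by the loader */
    unsigned    offset;     /* filled in by the loader */
    int         resolved;
} ImportRecord;

/* A DLL as the loader knows it once it is in memory. */
typedef struct {
    const char     *name;
    unsigned        selector;       /* selector assigned to its code */
    const unsigned *entry_offsets;  /* indexed by ordinal - 1 */
    unsigned        entry_count;
} LoadedModule;

/* Patch every import that refers to a known module and entry point.
   Returns the number of records still unresolved afterwards. */
size_t resolve_imports(ImportRecord *imports, size_t import_count,
                       const LoadedModule *modules, size_t module_count)
{
    size_t unresolved = 0;
    assert(imports != NULL && modules != NULL);
    for (size_t i = 0; i < import_count; i++) {
        ImportRecord *rec = &imports[i];
        rec->resolved = 0;
        for (size_t m = 0; m < module_count; m++) {
            const LoadedModule *mod = &modules[m];
            if (strcmp(mod->name, rec->module) == 0 &&
                rec->ordinal >= 1 && rec->ordinal <= mod->entry_count) {
                rec->selector = mod->selector;
                rec->offset   = mod->entry_offsets[rec->ordinal - 1];
                rec->resolved = 1;
                break;
            }
        }
        if (!rec->resolved)
            unresolved++;
    }
    return unresolved;
}
```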
At this point, I'm going to stop referring to 'segments' as such,
and start calling them 'selectors': DLL's are applicable only in
protected mode OS/2, after all.
Part of OS/2's program loader recognizes that the EXE it's about
to load contains DLL calls. Finding these records causes a
lookup on an internal table to determine if the DLL has already
been loaded. Now, each module in the DLL can be defined as a
"load at runtime" or a "load on demand" module. Regardless of
this definition, a selector is allocated for each module and all
references to those modules are now resolved into a selector and
offset pair. If the module has been defined as a "load at
runtime" module, then the actual code for the module is read from
the file, loaded into memory and any outstanding linkages
[Sidebar on alternative "manual" approach to locating and calling
A brief mention of why protected mode is a handy thing: consider
what happens if a "load-on-demand" function is called before that
selector points to valid code: a fault occurs (strictly speaking a
segment-not-present fault on the 80286, which has no paging
hardware, but loosely a page fault), and the
memory management module can easily resolve what the problem is,
load the appropriate code, and allow the program to continue
operating as if nothing had happened! Subsequent calls to
routines within the same selector would operate without a page
fault. Once the page fault mechanism (an intrinsic part of OS/2
and protected mode applications) has been enabled, it is
virtually transparent whether or not a requested page exists in
"real" memory or in virtual memory.
The 80286 and 80386 chips support a table held in memory, called the
Local Descriptor Table (See Figure x), which holds selectors, and
the characteristics of these selectors. There is an LDT for
each of the processes currently running. If a process attempts
to access memory using a selector not within its LDT then the
hardware will cause a fault to occur: effective hardware
protection of memory space.
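A selector is just a 16-bit value with three fields packed into
it: a 13-bit table index, one bit saying whether the index refers
to the LDT or the GDT, and a 2-bit requested privilege level. The
C sketch below decodes those fields and models the "is it within
the LDT" test. The structure and function names are mine, and the
real check is of course done by the processor, not by code.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned index;     /* descriptor-table slot, bits 15..3 */
    int      uses_ldt;  /* table indicator, bit 2: 1 = LDT, 0 = GDT */
    unsigned rpl;       /* requested privilege level, bits 1..0 */
} SelectorFields;

/* Split a selector into its three fields. */
SelectorFields decode_selector(uint16_t selector)
{
    SelectorFields f;
    f.index    = selector >> 3;
    f.uses_ldt = (selector >> 2) & 1;
    f.rpl      = selector & 3;
    return f;
}

/* Model of the hardware check: a selector whose index lies past the end
   of the process's LDT cannot be used. Returns 1 if the access may go
   ahead, 0 if it would fault. */
int selector_in_ldt(uint16_t selector, unsigned ldt_entries)
{
    SelectorFields f = decode_selector(selector);
    assert(ldt_entries <= 8192u);   /* 13 index bits allow 8K entries */
    return f.uses_ldt && f.index < ldt_entries;
}
```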
The GDT, or Global Descriptor Table, is similar to the LDT,
except that all tasks may access the selectors (and their
associated memory) contained therein. Although this seems a
simple way in which to make a selector and its data space
accessible to multiple processes, OS/2 does not use the GDT for
shared memory access. Instead it makes an entry into the LDT of each
process.
When a request is made to OS/2 for memory allocation, the type of
memory (shared or non-shared) is included in the request, and an
entry made in the LDT for all processes allowed to share this
Only the kernel (Ring 0) code may write to these tables, however.
(Device drivers also run at Ring 0, so they'd have write access
to the descriptor tables as well, but we'll save that for another
So...What's the Big Deal?
Well, so far, functionally, a relatively efficient mechanism
exists for linking in routines as required at run time instead of
just once at link time. All automatically and transparently, of
course, but what are the advantages of such an ability? There are
quite a few. First, swapping the DLL routines in and out of
memory becomes pretty easy: each LDT entry has a 'present' bit which
indicates whether the requested segment is in memory or not.
If not in memory, a page fault occurs as described above, and the
swapped out DLL routine can be brought into 'real' memory. Since
the selector itself is but an index into a table which contains
real address information, the individual DLL modules can end up
anywhere in memory. Transparent to your own code, of course.
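The present-bit mechanism can be modelled in a few lines of C.
This is a toy simulation under my own assumptions (a Descriptor
holding nothing but a present flag and a load counter); the real
work is split between the processor and the OS/2 memory manager.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int present;    /* 1 if the segment is currently in memory */
    int loads;      /* how many times it has been brought in */
} Descriptor;

/* Stand-in for the fault handler: bring the segment in from disk. */
void handle_not_present(Descriptor *d)
{
    assert(d != NULL);
    d->loads++;
    d->present = 1;
}

/* Every call goes through the descriptor. Returns 1 if the call had to
   fault the segment in first, 0 if the segment was already there. */
int call_through(Descriptor *d)
{
    assert(d != NULL);
    if (!d->present) {
        handle_not_present(d);
        return 1;
    }
    return 0;
}

/* The system may push the segment out again whenever memory is short. */
void swap_out(Descriptor *d)
{
    assert(d != NULL);
    d->present = 0;
}
```

The caller never changes: the same selector is used before the
load, after the load, and after a swap-out and reload.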
Program code without some data space associated with it is a
rarity: pure code can't manipulate items, although it is often useful
in purely mathematical routines. The 8088/8086 family of
processors used the data segment register to address its data
space. The 80286/80386 family of chips requires data to be
addressed through a selector as well. And, the information for
the data selector is also stored in the LDT (See Figure x).
By setting the appropriate bits in the LDT entry for a given
selector, its associated memory can be made private or publicly
accessible, or it can be set so it may be written to or is a
read-only piece of memory. Data selectors can even require a
certain level of privilege in the code attempting to access them.
Any "illegal" operation will cause a fault to occur, and
OS/2 is able to deal with the faulting process as required.
This means that, with the LDT set properly, memory can be
shareable between tasks, memory can be protected from illegal or
erroneous access, and other interesting memory usage and control
techniques can be enabled.
As such, the DLL can be controlled and fine tuned in a variety of
different ways. This fine tuning is done through the DEF files
Defining the DEF File
There are two different types of DEF files. One, the EXPORT
definition file, is used to let the world know what the various
entry points and characteristics of these entry points are. The
IMPORT definition file indicates what functions from the DLL will
be used and therefore should be linked at run time. There is
also the IMPORT library created by processing the EXPORT
definition file through IMPLIB. Let's look at each piece
separately, with a list of the available features and options
handy (See Figure ?). All these options, by the way, must be
entered in the appropriate DEF file in UPPERCASE.
The DEF Files: Showing Your Face to the Outside World
The EXPORT DEF file really only requires a few fields. The most
important required field is the LIBRARY field. This defines that
this is a DLL Export definition file, instead of a "normal"
application DEF file.
The LIBRARY statement must be the first one in the DEF file,
allowing the linker (and IMPLIB) to have a bit of a head start on
what is about to come. The first argument [name] to the LIBRARY
statement is the eventual output name for the created DLL. The
extension of DLL is used unless you specify a different one.
When the DLL is first loaded, there may well be some things you'd
want to be initialized (setting certain data items, assuring
certain system resources are available, etc.). Each DLL has the
ability of having an initialization routine which will be called
when the DLL is first loaded, or upon each invocation of the DLL.
[init_type] allows you to specify if you want the initialization
routine called each time the DLL is invoked (INITINSTANCE) or
only once, when the DLL is first loaded (INITGLOBAL - the
In a like manner, if the linker sees the NAME statement, it
understands that you are creating an application and not a DLL.
The NAME statement allows you to specify whether the application
is WINDOWS compatible and, if so, whether it is capable of
running in real mode or protected mode. If you specify WINDOWAPI
as the second argument, then WINDOWS is required by this
application in order to execute. Specifying WINDOWCOMPAT means
that it is not only WINDOWS compatible, but can also run in its
own screen group under OS/2. Finally specifying NOTWINDOWCOMPAT
indicates that the application requires its own screen group when
running, and is the default if you specify nothing.
NAME allows you to specify (with [apname]) the name the
application will have after linking. The default
extension, naturally enough, is .EXE.
All code segments within the DLL will share a similar set of
attributes unless otherwise specified. The default set of these
attributes is set with the CODE statement.
There are several other optional parameters allowed (but ignored)
in the CODE statement for compatibility with WINDOWS.
The [load] parameter indicates if you wish the segment to be
automatically loaded upon DLL invocation (PRELOAD) or to wait
until the segment is actually accessed with a call (LOADONCALL -
the default). In an application which may have large areas of
the code which might never be called, there is no real need any
longer to load those library calls into memory all at once. If
they're called, then they'll be loaded automatically if the
LOADONCALL option is specified. Once the routine is loaded, it
will stay loaded in memory (except for swapping out to disk, of
If you use the [executeonly] option to specify that other
processes cannot read this segment (by using EXECUTEONLY), then,
even though the LDT marks the selector as accessible on a global
basis, it may not be read (or treated like a data segment
selector) by any process without the appropriate privilege
level. The default (EXECUTEREAD) allows the memory allocated
to this selector to be read for purposes other than for
Only code segments with a high enough privilege level may access
the hardware directly. You may specify that a segment has this
ability with the [iopl] parameter. The default (NOIOPL) makes
sense: unless otherwise specified, an attempt to access the
hardware (such as the comm port directly) will cause an immediate
fault to be taken. In the case where you allow a segment to
access hardware directly, include the IOPL parameter in your
CODE line (it is probably a better idea to specify IOPL only when
required; see the SEGMENTS statement below).
OS/2 still requires that you make a system call in order to
request the privilege of hardware access.
A brief description of how the Privilege Level in OS/2 functions
is important to understanding the implications of using the IOPL
The 80286/80386 chips prohibit direct transitions between code
segments of differing levels of privilege. The default
privilege level for application code in OS/2 is Ring 3. Ring 2
code segments may access hardware directly. The only way to
transfer from one privilege level to another is through what is
called a call gate. A call gate has a specific selector type in
the LDT and an actual "destination selector" which is the
selector belonging to the actual code segment of the privileged
call. Additionally, the gateway has its own attendant privilege
level and may only be called by code segments of the same
Conceptually, when a call is made to a privileged routine, it
passes through the call gate before passing to the privileged
routine. Since the only way through the call gate is with
either a CALL instruction (going into the routine) or a RET
(coming back from the routine), the call gateway concept provides
for an extra level of code security, but at the cost of some
additional hardware overhead. Each transition via a gateway
causes parameters on the stack to be copied to a new stack,
another interesting security feature of the 80286, since a lower
privileged program could manipulate the return address on the
stack otherwise. See the EXPORTS statement below for more
information on that requirement.
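The parameter copy is the part of the gateway that you have to
plan for, so here is a small C model of it. It is only a sketch:
the function name is mine, the stacks are plain arrays, and the
limit of 31 words reflects the five-bit word count field in an
80286 call gate descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_GATE_WORDS 31   /* the gate's word count is a 5-bit field */

/* Copy `pwords` parameter words from the caller's stack to the stack
   the privileged routine will run on. Returns the number of words
   copied, or -1 if the count cannot be expressed in a call gate. */
int gate_copy_params(const uint16_t *caller_stack, uint16_t *inner_stack,
                     int pwords)
{
    assert(caller_stack != NULL && inner_stack != NULL);
    if (pwords < 0 || pwords > MAX_GATE_WORDS)
        return -1;
    for (int i = 0; i < pwords; i++)
        inner_stack[i] = caller_stack[i];
    return pwords;
}
```

Because the privileged routine works from its own copy, nothing
the less privileged caller does to its own stack afterwards can
change what the routine sees.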
An IOPL'ed routine also uses up an additional slot in the LDT.
Although the LDT has 8K possible entries in it
(each LDT entry takes up eight bytes, so an entire segment has
been allocated to the LDT in OS/2), 5K of those are reserved for
OS/2 itself. That leaves you with only about 3K LDT entries.
Probably enough for the foreseeable future.
The final parameter in the CODE statement allows you to specify
whether the segment is a CONFORMING or NONCONFORMING segment.
This also deals with the IOPL privilege level and can be pretty
confusing at first. Consider it to be the inverse of the gateway
Normally, a segment will execute with the privilege level
of the calling segment. However, there are times when this
might not be appropriate: consider a Ring 3 communications
protocol checking routine called from a Ring 2 device driver.
In this situation, you might not want to allow the protocol
checker to operate with the higher privilege of its calling
segment. The default case, the NONCONFORMING parameter, would
cause the Ring 3 routine to execute at Ring 3. Set to
CONFORMING, it would execute at the privilege level of the
routine calling it: the device driver running at Ring 2.
Data Space Definitions
Just as code segments have a method of setting default parameters,
the DATA segments also allow certain parameters to be set. This is
done with the DATA statement.
The DATA statement shares some of its parameter list with the
CODE statement. This makes a great deal of sense since these
parameters describe how to make the default settings for each
data selector in the LDT. The format of the DATA statement is
therefore very similar to the CODE statement:
[load], as above, indicates whether the data segment should be
loaded upon first invocation or load should wait until the first
access to the selector's address. The default condition is
LOADONCALL, however you can specify invocation load with PRELOAD.
[readonly] allows you to determine if the data segment is allowed
to be written into (with the default parameter of READWRITE) or
whether it should be protected against write access (READONLY).
Attempts to write to a READONLY segment will cause a hardware
[instance] allows you to specify whether or not this data segment
(the DGROUP data segment in most cases) should be automatically
allocated upon invocation, and if so whether there should be one
copy allocated for the entire DLL (SINGLE, which is the default
setting for DLL's), or whether each instance of DLL usage should
have its own automatic data segment allocated (MULTIPLE, which is the
default setting for applications). If no automatic allocation is
required, then the parameter should be set to NONE.
Each data segment can also have its own IOPL level. This allows
you to set the minimum privilege level required in order to
access this data segment: setting the [iopl] parameter to IOPL
means that only Ring 2 and more privileged levels are allowed
access to the data segment. The default, NOIOPL, allows Ring 3
code segment routines to have access to the data affiliated with
the data segment. This allows an interesting interface to be
created between IOPL'ed segments and non-IOPL'ed segments through
common shared memory: like passing a message through a keyhole.
Finally, [shared] allows you to determine whether a data segment
marked as a READWRITE segment may be shared among different
tasks. If it is marked as shareable, then only one segment is
allocated at load time, and any process with privilege level
sufficient to write to it may do so. The default, NONSHARED,
does not allow write access to a common data segment and causes a
separate copy to be loaded for each instance. If a data segment
is marked as READONLY, then it is shareable by definition.
Segment by Segment Parameters
Unless otherwise specified, code and data segments have the
attributes you set in the CODE and DATA statements as described
above (or their pre-defined default values if you don't describe
However, using the SEGMENTS statement, you may specify the
individual characteristics for a given named segment. Using the
fields as specified above, the format for the SEGMENTS statement
The [CLASS 'classname'] is an option which allows you to specify
that the segment name parameter (which is required) be assigned
to the class specified. If you don't specify a classname, then
the 'CODE' classname will be assigned.
Other arguments to the SEGMENT members are as outlined above.
The EXPORTS statement is the only method of letting the outside
world know about the routines of the DLL (the EXPORTS statement is
only applicable to DLL's. See the IMPORTS statement below for
Unless specified by inclusion in the EXPORTS section of the DEF
file, a DLL routine is invisible to applications. The full
format of each line within the EXPORTS section is:
A name used internally within the DLL need not be the name the
outside application world knows the routine by: you can specify
the outside name as different from the internal name easily. This
allows you to have a class of functions each serving a similar
purpose and then to categorize them if you wish with a meaningful
If you wish, you can allow access to the function by its ordinal
(or the routine's library "slot" number) instead of by its name,
by specifying the desired ordinal (obviously unique for the DLL)
preceded by an '@' sign. If you do, lookups will be faster at
load time, and less space will be required for the in-memory
If you do use the [@ordinal] option, then you may have to
consider using the [RESIDENTNAME] option as well: normally, if
an ordinal is used, then OS/2 will not keep the specified
external name available. If you're not using the ordinal
parameter, then OS/2 will keep the name resident in its search
If you've included usage of any privileged functions in your
routine, you'll have to let the linker know how many words to
reserve for parameter copying by using the [pwords] variable.
Since a calling task will have its own parameters copied as it
passes through the gateway, you have to reserve that space now.
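Pulling the statements described so far together, an export
definition file might look something like the sketch below. The
library name CHATLIB, the routine names and the ordinals are all
invented for illustration, and I am assuming the usual semicolon
comment convention; check the exact syntax against the linker
documentation in your toolkit before relying on it.

```
; CHATLIB.DEF - hypothetical export definition file (sketch only)
LIBRARY CHATLIB INITGLOBAL

; defaults for every code and data segment in the DLL
CODE LOADONCALL EXECUTEREAD NOIOPL NONCONFORMING
DATA LOADONCALL READWRITE SINGLE NOIOPL SHARED

EXPORTS
    ; exported under its own name, reachable by ordinal 1
    LOGIN               @1
    ; known outside as SENDMSG, inside the DLL as POSTMSG
    SENDMSG=POSTMSG     @2 RESIDENTNAME
    ; takes two parameter words through a call gate; assumes the
    ; routine's segment is marked IOPL with SEGMENTS (not shown)
    PORTWRITE           @3 2
```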
The imports section allows you to specify which external DLL
routines you require in your application (although a DLL can
import functions from another DLL, ad infinitum). The format of
a line in the IMPORTS section is:
Again, like the EXPORTS lines, you can specify a name your
routine uses when it is trying to resolve external routines. You
could, therefore, create a debugging DLL and a "normal" DLL and
be able to link between them only by changing the module name or
the entry name associated with the named routine.
The module name is the name of the application or DLL which contains
the desired entry name, which was specified in the EXPORTS statement
for the DLL. The entry name can also be an ordinal number.
If the optional [name=] parameter is not specified, then the
default name the routine will be "known" as will be the same as
the entry name. You must specify an internal name, however, if
you've specified an ordinal number instead of an entry name.
There are a variety of other statements which can be included in
the DEF file(s). They are described in Figure xDEF.
Using the DEF files
There are two specific ways in which the DEF files can be used:
first, just including them on the command line to the linker, and
second, passing them onto IMPLIB.
IMPLIB is the Import Library Manager utility, a standard part of
the developer's toolkit in OS/2. If you're creating the DLL and
the application to use the DLL, you don't have an absolute need
for IMPLIB, since you can create the EXPORT and IMPORT library
definition files as you desire. However, if you're creating a
DLL for other applications to use (perhaps a commercial functions
library, or perhaps a replacement for an already available
product you produce), then IMPLIB should be part of your
IMPLIB takes a definition file for input and produces what
appears to be a simple LIB file for output. This then allows you
to include the LIB file in the link step. It also allows you to
combine multiple DEF files into one LIB file. Assuming you
had two DLL's, called COM_INP.DLL and COM_OUT.DLL, each with its
associated DEF file, you could specify:
IMPLIB COM_STUF.LIB COM_INP.DEF COM_OUT.DEF
and then simply distribute the COM_STUF.LIB and the two DLL's,
keeping the internal details of the DLL's to yourself.
A DLL Example
In attempting to create a DLL for this article, I ran into a
number of difficulties. Some cannot, by the very nature of the
DLL and multi-tasking software, be resolved: deadlock can occur
in DLL's just as it can in other types of software.
Think of the required aspects of a simple multi-session process
such as a "chat" facility: multiple copies of the same process
running, each of which occasionally generates a message, which is
added to some internal queue. Each message generated must be
collected by all other processes before it can be erased from the
queue of outstanding messages, and each such message must be
displayed, eventually, by each process. Finally, each of the
processes must be able to "login" or "logout" from the chat
session, and each must have some type of unique identifier.
I've designed such a facility as a method of demonstrating some
of the unique abilities and problem spots of using a DLL as the
"glue" which holds a multi-process concept like this together.
Is it useful? Well, perhaps not on a single-screen machine, but
if the output were to a number of communications ports, it might
Er... one aspect of this code should be brought to your attention
before you start reading. All of the problems inherent with this
code design can be readily and easily solved using an approach
which includes the OS/2 system resource of queues. Why wasn't
that approach used for this article, then?
Primarily because it wouldn't have required the concept of using
One of the underlying advantages of DLL's which makes them useful
in this application is the ability to not only have private and
shared memory, but the ability of separately compiled and
executed tasks to utilize the same code at the same time. In
essence, there is nothing to prevent one of the "users" of this chat
code from following the coding conventions I've created and
creating their own user-friendly interface (the bane of spiffy-
concept-designers everywhere). In fact, there is no reason why
differently designed front ends couldn't be used for each
Starting with a concept like that, I designed this code using a
majority of the capabilities in the DLL.
One of the abilities of the DLL is to provide for initialization
code which will be executed either upon just the first invocation
of the DLL, or upon each invocation. This initialization routine
is called before the process itself starts to run. This DLL only
calls its initialization routine the first time, so the EXPORT
file for it contains the INITGLOBAL parameter. Since this is the
default condition, it could be excluded, if you wished. The
routine I use in this DLL is a simple one, merely setting certain
default conditions and allocating some required queue space.
First, there is a login procedure. The login procedure must advise
the library code that another consumer and provider of messages has
suddenly appeared. To make things easier, the login procedure
returns some user identifier to the process: it becomes useful to
include an ID when generating new messages, when consuming old
ones and, of course, when logging out.
When the DLL sees the login, it also allocates and assigns
whatever global and local objects and structures are required for
the new process. A choice had to be made in the design as to
where the actual allocations of memory would be made, since the
memory could be allocated either in the DLL (becoming, in
essence, a hidden object from the "client" code) or in the per-
process code itself. There are advantages to having a DLL
routine allocate memory which is globally accessible to all
processes but which only the DLL routines know about.
Additionally, a login causes each message already in the queue to
appear unread to the newly logged in task. Later, when requests
are made for an outstanding and unread message, these messages
will be returned.
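As a rough sketch of the login bookkeeping described above (not the actual DLL_CHAT source), assume at most 16 sessions tracked in a single 16-bit word. The names chat_login, chat_logout, MAX_USERS and users_in_use are hypothetical and chosen only for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_USERS 16            /* assumption: one bit per session in a 16-bit word */

static uint16_t users_in_use;   /* bit n set means session ID n is logged in */

/* Returns a session ID in 0..MAX_USERS-1, or -1 if every slot is taken. */
int chat_login(void)
{
    int id;
    for (id = 0; id < MAX_USERS; id++) {
        uint16_t bit = (uint16_t)(1u << id);
        if ((users_in_use & bit) == 0) {
            users_in_use |= bit;    /* claim the first free slot */
            return id;
        }
    }
    return -1;
}

/* Frees the slot so a later login can reuse the ID. */
void chat_logout(int id)
{
    assert(id >= 0 && id < MAX_USERS);
    users_in_use &= (uint16_t)~(1u << id);
}
```

In a real DLL the in-use word would sit in a shared data segment and would need the same semaphore protection as the message queue; the sketch leaves that out.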
The general design of the DLL causes a sharing of the "cleanup"
task on each call to the "get a message" routine. When a message
is passed to the DLL, it is added to a queue - a structure which
includes a flag word with one bit for each session. A mask word
with a set bit for each empty task slot is used for the initial
value of this flag word. The current task ID is then or'ed in,
allowing the sender of the message to indicate it has already
received the message.
When a process fetches a new message, it sets the bit in the flag
word to indicate that it has fetched this message. Then, when
that flag indicates all processes have gotten a copy of the
message, the message can be removed from the global queue. Each
process therefore has to have the ability to manipulate that
queue directly or must call a routine which has that ability.
I've opted for a more modular design: using a routine to
specifically remove the message from the queue (or to add a
message to the queue) allows me to isolate the queue itself.
Although the queue resides in global memory at this point,
perhaps in the future it might reside on some node on a network,
or some memory device which might require a higher privilege
level? Therefore, isolating the routine which physically
modifies the queues is a good idea.
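The flag-word bookkeeping can be sketched in a few lines of C. This is an illustration under the assumption of a 16-bit flag word with one bit per session; the function names and the ALL_READ constant are hypothetical, not taken from DLL_CHAT:

```c
#include <assert.h>
#include <stdint.h>

#define ALL_READ 0xFFFFu   /* every bit set: nobody still needs the message */

/* Initial flag word for a new message: bits for empty slots are pre-set,
   and the sender's own bit is or'ed in so it never re-reads its message. */
uint16_t msg_initial_flags(uint16_t users_in_use, int sender_id)
{
    uint16_t empty_slots = (uint16_t)~users_in_use;
    return (uint16_t)(empty_slots | (1u << sender_id));
}

/* Marks the message as fetched by reader_id; returns 1 when the message
   has now been seen by every logged-in session and may be removed. */
int msg_mark_read(uint16_t *flags, int reader_id)
{
    assert(reader_id >= 0 && reader_id < 16);
    *flags |= (uint16_t)(1u << reader_id);
    return *flags == ALL_READ;
}
```

With sessions 0, 1 and 2 logged in and session 0 sending, the initial flag word is 0xFFF9; the message becomes removable once sessions 1 and 2 have both fetched it.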
Since there isn't a human attached to each of the sessions, I
have each session send a message only after a random amount of
time has passed. And, just to keep things interesting, there is
a suitable sleep period whilst the imaginary typist is "entering"
his or her message. This allows messages to build up in the
queue. Whenever the sender is not "typing" or "sending" a
message, it is executing a loop which constantly polls the
outstanding message count. Blocking on a null message count
would prohibit the sender from sending a message. Of course,
OS/2 provides the ability of having two different threads, one of
which could block on a null message count within the DLL, but
that is not within the scope of this article.
Displaying of messages received takes place on a per-process
basis. This can cause problems when the session does not
currently have screen access. Eventually, when the internal
queue for the process fills up, not having access to the screen
will cause it to block. When a process blocks, it stops fetching
messages from the DLL queue. Eventually that queue will fill up.
When it does, another session will block when it attempts to add
a message to the queue. This condition can cascade until all
sessions are blocked.
Therefore, before any session sends a message, it checks to determine
if room exists in the queue. However, OS/2 is a multi-tasking operating
system. Therefore, a routine must not be interrupted between the time
it determines there is room on the queue and the process of actually
adding the message to the queue. Two specific alternatives exist to
get around this problem: the first is to call DOSCritSec, which
prohibits the given task from being interrupted by any other
system process - rather drastic, and inherently ugly.
The other, and the one I used in DLL_CHAT, was to set up a
globally accessible RAM semaphore and to claim the semaphore
immediately upon entry to the "add a message" routine. Other
processes attempting to add a message would programmatically block
on this flag and would wait in a loop for it to free up, or for
a certain amount of time to pass. If the flag didn't change
within the specified time-out period, then an error condition
would be returned to the calling task.
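A minimal sketch of the "check for room, then add" sequence guarded by a flag with a time-out follows. It uses a C11 atomic_flag as a stand-in for the RAM semaphore and a retry count as a stand-in for the time-out period; both are assumptions made so the example runs on a modern compiler, and all names and limits are hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>

#define QUEUE_MAX   8      /* hypothetical queue capacity */
#define MAX_TRIES   1000   /* hypothetical stand-in for the time-out period */

#define ADD_OK       0
#define ADD_TIMEOUT  1
#define ADD_FULL     2

static atomic_flag queue_lock = ATOMIC_FLAG_INIT;  /* stand-in for the RAM semaphore */
static int queue_count;

/* Try to take the lock, giving up after MAX_TRIES attempts. */
static int lock_with_timeout(void)
{
    int tries;
    for (tries = 0; tries < MAX_TRIES; tries++) {
        if (!atomic_flag_test_and_set(&queue_lock))
            return 1;               /* flag was clear: we now own it */
    }
    return 0;                       /* still busy: report a time-out */
}

static void lock_release(void)
{
    atomic_flag_clear(&queue_lock);
}

/* The room check and the add happen under the same lock, so no other
   caller can slip in between them. */
int add_message(void)
{
    int result = ADD_FULL;
    if (!lock_with_timeout())
        return ADD_TIMEOUT;
    assert(queue_count >= 0 && queue_count <= QUEUE_MAX);
    if (queue_count < QUEUE_MAX) {
        queue_count++;              /* a real routine would link the message in here */
        result = ADD_OK;
    }
    lock_release();
    return result;
}
```

A caller treats ADD_TIMEOUT and ADD_FULL as "try again later". A real version would presumably sleep between attempts rather than spin.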
I used a little trick here which the optimization of the MSC 5.0
compiler makes easy. I set the initial value of the RAM
semaphore to 1111111111111110. With a simple right shift of one
bit position, I can simultaneously read the current status of the
semaphore as well as reserve it for my own usage if it is not in
use. When I grab the semaphore, I immediately set it to all
1's (since the right shift causes the topmost bit to be set to
a zero in 80286 architecture), forcing subsequent right shifts to
not only see the semaphore is in use, but to do so without having
to "turn off interrupts" or indicate it is a critical section.
This will only work when the word is right shifted in place: if
your C compiler does not generate this as an in-place shift, then
this will not be a safe way for you to manipulate the semaphore.
It's an easy operation to do with an assembler routine, though,
in any case. [Reed - Which optimization switches cause the SHR
directly to memory?]
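The arithmetic of the shift trick can be modelled in portable C. This sketch only shows the bit logic: the C version performs a separate read and write and is NOT atomic, so it is not a safe semaphore on its own. The names SEM_FREE, SEM_BUSY, sem_try_grab and sem_release are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define SEM_FREE 0xFFFEu   /* low bit clear: available */
#define SEM_BUSY 0xFFFFu   /* all ones: owned */

/* Shift the word right one place and report the bit that fell off the
   low end. On the 80286 a single SHR on the memory word does both steps
   at once; this C version only models the arithmetic and is NOT atomic. */
static int shift_out_low_bit(uint16_t *sem)
{
    int low = *sem & 1u;
    *sem = (uint16_t)(*sem >> 1);
    return low;
}

/* Returns 1 if the caller now owns the semaphore, 0 if it was in use. */
int sem_try_grab(uint16_t *sem)
{
    if (shift_out_low_bit(sem) != 0)
        return 0;                   /* a 1 fell out: someone else owns it */
    *sem = SEM_BUSY;                /* a 0 fell out: claim it */
    return 1;
}

/* Hands the semaphore back by restoring the "available" pattern. */
void sem_release(uint16_t *sem)
{
    *sem = SEM_FREE;
}
```

Note that in this model a failed attempt leaves the word shifted (0x7FFF after one failure), so a production version has to consider what repeated failed attempts do to the word.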
Finally, the logout routine. When the session gets a quit
command from the keyboard, it immediately passes control to the
DLL logout routine. This sets the above mentioned RAM semaphore,
then proceeds to loop through the outstanding message list. For
each outstanding message, it sets the flag as if the process had
already received the message. After each flag word has been so
set, it is examined to determine if it has been read by all
processes. If so, it is removed from the queue.
Each message on the queue is a member of a linked list, and its
memory is allocated from the global memory pool. When removing a
message from the queue, the pointers of the other messages it
points to are modified to point to each other, then the memory is
freed.
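The logout sweep and the unlinking step might look something like the following. This is a sketch only: it uses malloc and free where the DLL would use the global memory pool, it omits the semaphore, and every name in it is hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct msg {
    struct msg *prev, *next;
    uint16_t    flags;      /* one bit per session; 0xFFFF means read by all */
};

static struct msg *queue_head;

/* Push a freshly allocated message on the front of the queue. */
struct msg *queue_add(uint16_t flags)
{
    struct msg *m = malloc(sizeof *m);
    if (m == NULL) return NULL;
    m->flags = flags;
    m->prev = NULL;
    m->next = queue_head;
    if (queue_head) queue_head->prev = m;
    queue_head = m;
    return m;
}

/* Make the neighbours point at each other, then release the memory. */
static void queue_remove(struct msg *m)
{
    if (m->prev) m->prev->next = m->next; else queue_head = m->next;
    if (m->next) m->next->prev = m->prev;
    free(m);
}

/* Logout: mark every message as read by id and drop the fully read ones.
   Returns the number of messages removed. */
int logout_sweep(int id)
{
    int removed = 0;
    struct msg *m = queue_head;
    assert(id >= 0 && id < 16);
    while (m != NULL) {
        struct msg *next = m->next;   /* saved because m may be freed */
        m->flags |= (uint16_t)(1u << id);
        if (m->flags == 0xFFFFu) { queue_remove(m); removed++; }
        m = next;
    }
    return removed;
}
```

Keeping queue_remove as the only routine that touches the pointers matches the idea of isolating the code which physically modifies the queue.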
Well, that is the basic design of the DLL_CHAT program.
Now for the bad news.
Caveats and Warnings
It's not really as bad as all that, but there are a few things
you have to be aware of when you're designing your DLL's.
Above I mentioned some extraordinary lengths I went to in the
original design to assure that certain areas of the code are
protected against two "competing" tasks attempting to access it
at the same time.
This is a problem inherent in any multi-processing system.
Typically, it's called a "re-entrancy" problem, that is, a piece
of code being entered by a calling process before another process
has finished with its call. Using semaphores, as I did, is effective
in most circumstances. But the method I chose was not the optimal
one.
Consider what happens if the session currently executing
the semaphore routine happens to be interrupted by some high
priority event (perhaps a keystroke, or (if attached to a comm
port) the modem losing carrier). There is no guarantee it will
return to where it left off. Yet, if it doesn't return and
finish the routine, then the semaphore will forever be marked as
in use.
OS/2 does, however, provide an alternative if you use one of the
system semaphores. The semaphore is created with a DosCreateSem()
call, which returns a semaphore handle (similar to a file handle).
By using other semaphore calls, a process can effectively keep the
re-entrancy problem from occurring. In the event that a process which
"owns" the semaphore at that time (and therefore is blocking others
waiting on it) gets killed for some reason, even unintentionally,
the system will effectively call DosCloseSem(), which will clear the
semaphore if set and restore it as a system resource if there are no
other references to it.
In this application, I chose not to use system semaphores,
since there would be frequent system calls with heavy overhead,
and the likelihood that a process would be killed unintentionally
was pretty small. However, this also meant that I had to ensure
that a client program "dying" would die only after relinquishing
control of the semaphore.
Therefore, I use the OS/2 system call to add my own specific routine
to my exit list, that is, the list of routines which OS/2 will execute
on my behalf between the time the client program dies, and the time it
is buried. This routine simply calls the logout procedure, which in turn
will reset the system-wide flagword, the bits in each message, and finally
clean up the message base and any outstanding semaphores.
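The exit-list idea can be modelled without any OS/2 calls. In the sketch below, exit_list_add and exit_list_run are hypothetical stand-ins for what the system does on the client's behalf (standard C's atexit is a loose analog); the two flags simply represent the state the logout procedure must release:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EXIT_ROUTINES 4    /* hypothetical limit for this sketch */

typedef void (*exit_routine)(void);

static exit_routine exit_list[MAX_EXIT_ROUTINES];
static int exit_count;

static int logged_in;          /* 1 while this client holds a chat session */
static int holds_semaphore;    /* 1 while this client owns the RAM semaphore */

/* Stand-in for asking the system to add a routine to the exit list. */
int exit_list_add(exit_routine fn)
{
    assert(fn != NULL);
    if (exit_count >= MAX_EXIT_ROUTINES) return -1;
    exit_list[exit_count++] = fn;
    return 0;
}

/* Stand-in for what the system does when the client dies:
   run every registered routine, most recently added first. */
void exit_list_run(void)
{
    while (exit_count > 0)
        exit_list[--exit_count]();
}

/* The routine a client registers: release everything it still holds. */
void chat_cleanup(void)
{
    holds_semaphore = 0;
    logged_in = 0;
}
```

The point of the design is that the client registers chat_cleanup once, right after login, and from then on never has to remember to call it, however the program ends.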
When designing a DLL, you should always keep in mind worst case
scenarios: what would happen if this line of code were running while
ten other processes were running *those* ten different lines of code.
Since you cannot really control what the other processes might be doing
as they start to execute common areas of code, it is better to design
the code as modularly as possible, and be sure to semaphore around areas
sensitive to multi-tasking happening at just the wrong time. Chances are
that it will!
Remember that, not only must you program defensively against other
processes using the DLL routines and their attendant data, but if you
opt to use OS/2 threads, you'll have to protect against their re-entrant
usage of the DLL routines (in fact, most of the considerations I'm
advising you of regarding DLL's can also be of importance when
designing a threaded program).
When speaking about unanticipated or asynchronous interruptions,
you should be thinking about signal catching. And about not doing
it in a DLL!
If you're going to use the system to set a routine to catch a
particular asynchronous event (such as program termination, or
control-C trapping done with the DosSetSigHandler system call),
doing it in the DLL can be dangerous. The concept of "resource"
is the one which plays a critical role here. The question is,
who owns the "resource" of a signal catcher in a DLL? Remember
that the code is re-entrant, and that trying to determine the
death of the last client member for the DLL can be tricky:
especially if the signal catcher for client process termination
is within the DLL itself. [Why? What portion of OS/2 doesn't allow it?]
On a similar basis, it is probably a good idea to stay away from
DosError (which allows a process to suspend hardware error
processing) and DosSetVec (which lets your exception handler be
called when certain conditions, such as attempts to execute an
illegal opcode, occur).
If you must include such calls in your code, be sure to thoroughly
isolate those portions of the code from other client members of the
DLL's, and to preserve all aspects of your process state. Be sure
to terminate "normally", too, not in some unique way, since DLL's have
some special characteristics which are taken care of properly in
automatic exit list processing upon client death.
Ramifications of what happens if you "signal out" of a DLL
instead of "normal" termination include the possibility that the
"active" count of the selectors which the DLL has used will not
be updated properly. The DLL may still be considered by OS/2 to
have some client members accessing it, since Process Termination
was handled by a signal handler of your own design which doesn't
know how to update the DLL active client count. [What Signals Apply
on a global basis versus local? What happens to a Signal Catcher
when its owner dies?]
When designing your program to use DLL's, there are a few things
you'll have to be careful of in your initial program design.
First, access to all DLL routines is through a far call. So,
although you can use the small memory models if you wish in the
client section of your code, and in the DLL itself, the external
definition of the DLL routines must indicate it is a far routine.
As such, the routine itself must also indicate in its prototype
that it is a far routine: otherwise the CALL and RET statement
types won't match.
What about the data allocated in the DLL? That, too, must be
addressed as far data from the client routines. Locally, within
the DLL, it may be addressed as near or far as required.
Before your client code ever executes, the initialization routine
for the DLL will have already executed. Expecting any
initialization by the main() routine in your client code would be
premature. Therefore, your DLL initialization code should only
access data within the DLL itself, since the startup code may not
have even allocated memory as of yet! [Exactly where does the DLL
init routine get called from?]
What of the differences between global and instance data items?
Well, obviously, they can be confusing concepts, since each DLL
module has no easy method of determining whether the data space
it is using is private or common to all tasks. This can be
tricky, since many programmers routinely use temporary pointers
to objects which they place in "global" data space instead of
allocating it on the stack for local usage.
It is important to recognize the differences here between global
data (such as items defined and allocated outside the scope of
any routine in 'C') and globally accessible data. In the first
case it really is "local" data, that is, data local to the client
process itself and not accessible to other clients of the DLL.
In the second case it is accessible to all clients of the DLL.
And that can fool you if you're not careful: you must be sure
that globally accessible data items don't change value when
you're not looking! Keeping items to a local stack frame is
probably the safest bet. Items which are kept around without
changing value are best kept in private client data space.
You can easily indicate through the DATA statement in the EXPORT
file which data segments you wish to be allocated on a private
per-client basis and which ones you wish globally accessible. If
a segment is marked as READONLY, then it is globally accessible.
The MSC data group named CONST should always be marked as
READONLY: that allows for only one copy of literal strings to be
loaded for the entire DLL.
This brings up another interesting topic: using a C compiler to
create DLL's. There were some rumors floating about for a while
that this was impossible since the stack segment (SS) did not
equal the data segment (DS) upon entry into a DLL. Since the
library has many routines which expect them to be equal, it at
first appeared that creating DLL's in C was blatantly forbidden.
Using Microsoft C to Create and Use DLL's
There are several specific enhancements available in the MSC
5.1 C compiler which make writing C DLL's very easy.
A new pragma, #pragma data_seg, allows you to specify for any
function that later loads its own data segment, exactly which
data segment to use. By specifying the data segment as:
#pragma data_seg (segment_name)
you not only make things easier for using DLL's, but you have
more control over which data segment all initialized static and
global data will reside in. The default data segment name if you
don't specify one is the one used by DGROUP, which depends upon
the memory model you use.
This is half of the solution of which data segment to use in the
DLL. The other half is to specify the called function as one
which uses the previously saved data segment with the _loadds
keyword. Upon entry into a _loadds function, the current DS
register is saved, the last one specified in the #pragma data_seg
is written into it, the function executed, and the saved DS
restored upon exit. This is not such a new concept, since you've
had the ability to use /Au as a compiler option for quite some
time now, but this allows you to specify some capability on a
function by function basis.
In order for the compiler to know, in advance, that the routine
is going to be part of a dynamic link library, the new keyword
_export has been added. In particular, if the function is one
with an IO privilege level associated with it, then the number of
words to reserve for the privilege level transition stack copy
operation can be easily calculated at compile time if the _export
keyword is used. In fact, if you use the _export keyword as part
of your function definition, the number of words to reserve as
indicated in the DEF file is ignored. [Is this TRUE? I haven't
figured out how to verify it yet....]
When setting up the various data segments into their constituent
types (SHARED, READONLY, etc), you should also take a look at the
map file produced from the link: some additional segments might
be created which you hadn't thought about. In particular, some
NULL segments are created for each group as _CONST, and _BSS. In
order not to confuse the linker, each member of the group should
be specified within the SEGMENTS section of the EXPORT DEF file,
and you need only mention the "special" segments: those with
attributes different from the default setting of the DATA
statement. [Is this true?]
Creating the Initialization Routine for Your C DLL
Remember that the DLL, once passed through the linker, looks much
like an EXE file. In fact, the same load routine used for your
own client module is used to load the DLL itself. And, if you've
defined an initialization routine within the DLL, it will be
executed almost as if a stand-alone routine: it is called immediately
after the DLL is first loaded, and is called again upon subsequent
loads of the DLL only if you specify so.
You can easily tell the loader where the initialization routine
is located by including a small assembly language routine as part
of your DLL, and linking it into the DLL when you do its link.
In fact, it probably is not a bad idea to have a module similar
to the one in Figure xINIT, and to always name your DLL initialization
routine the same. The secret of the initialization routine?
Simply the fact that the only "program" the loader will find is
the one which is addressed by the 'END START' directive!
The MS C compiler throws a small monkey wrench in your path, as
well. Meaning to be helpful, the compiler throws a usage of the
_acrtused variable into each object module. This forces the
linker to be sure to include some of the startup routines from
the C run-time library into the eventual output of the linker
(which the compiler thought was going to be a normal EXE file).
To prevent this code from being loaded into your DLL, you should
define the variable yourself, as external data in a 'solo'
int _acrtused = 0x1234;
or some number of particular meaning to you.
Additionally, in order to have global data items show up in the
named segment for the particular object module you're linking, it
should either be initialized, or declared as static. Or both.
[What about using const?]
When writing your own DLL, you'll also want to use the -Gs switch
on the MSC compiler to disable stack checking. Aside from the
slight added efficiency you'll gain (slightly smaller code and
one less function call per function), this is a requirement for
the DLL since the stack segment is different for each client
process and the size of the stack may vary on a per client basis
as well. In the few places where you really need to add stack
checking, MSC provides you with an abundant set of #pragma's and
MSC 5.1 also includes some very welcome additions to the run time
libraries package. Three new libraries exist for working with
programs requiring support for multi-thread, and for DLL's with
single thread and multi thread. A couple of changes which will
affect you are the subtle differences such as errno now being a
macro which translates into a function call: a table must now be
used somehow in the functions of the run-time to enable a single
run-time package to handle errors from multiple sources. [Assuming
there is a table, how is the offset into the table created? PID?]
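One way an errno-style macro can hide a function call is sketched below. How MSC 5.1 actually locates the per-thread slot is not established here; the table, the current_thread variable and the my_errno names are all hypothetical and only show the general shape of the technique:

```c
#include <assert.h>

#define MAX_THREADS 8                 /* hypothetical table size */

static int errno_table[MAX_THREADS];  /* one error slot per thread */
static int current_thread;            /* stand-in for "which thread is running" */

/* Returns the address of the calling thread's private error slot. */
int *my_errno_address(void)
{
    assert(current_thread >= 0 && current_thread < MAX_THREADS);
    return &errno_table[current_thread];
}

/* Looks like a plain variable to the caller, but is really a function call. */
#define my_errno (*my_errno_address())
```

Because the macro expands to a dereferenced pointer, existing code such as `my_errno = 5;` or `if (my_errno == 5)` keeps compiling unchanged, while each thread reads and writes only its own slot.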
Additionally, the new DLL run-time libraries allow you to use
any of the functions you've grown accustomed to. Although I have
not tried each and every function, I trust that MS would have
specifically mentioned any there might be a problem with. [Er...
hopefully this is so?]
By the look of things and how they operate, it is probably safe
to assume semaphoring was used throughout the library --- this to
keep a call using a globally accessible variable from being
clobbered from two client processes trying to use it simultaneously.
This forces a heavy overhead in system calls to frequently called
routines, but one which there is little choice about. Remember that
the libraries had to be written under a worst case scenario, and you
pay a penalty in speed and efficiency for the safety inherent in
putting semaphores around the "dangerous" routines.
[SideBar/Figure of printf() being interrupted with and without
With the introduction of DLL's in OS/2, another programming
environment was created. Much like WINDOWS programming, it has
its own strict rules. These rules, however, make a great deal
of sense once the underlying design concept and limitations of
both the chipset and of the appropriate portions of OS/2 are
understood.
You can avoid a lot of these sticky problems by piece-at-a-time
programming: get as much of your program to work using routines
in a more "normal" library (using the library utilities), then
move the routines out into a DLL. Then by adding the additional
functionality and safeguards required for shared memory access
between sessions and re-entrancy problems, you'll be able to
easily create a program which uses up less disk space, less
memory space, and allows for inter-process communication in
whatever manner *you* wish to design. Not a bad feature at all
for a new operating system to be written around.
And, once you've been through the maze of twisty little passages
once, the next time it isn't so hard to get through it rapidly
and collect that treasure. The secret is just knowing a couple
of key phrases. And thinking ahead before you enter the maze.
STUB 'filename'    Allows you to specify the name of a DOS 3.x
                   file to be run if this file is run under DOS
                   instead of under OS/2.
PROTMODE           Indicates that this file can only be run in
                   Protected Mode. An aid to the linker.
OLD                This statement allows you to preserve the
                   names associated with ordinal numbers in a
                   multi DLL environment. I haven't really
                   figured out a use for it yet,
REALMODE           The opposite of PROTMODE, this indicates the
                   program can only be run in real mode. An aid
                   to the linker.
EXETYPE            Ensures that the specified operating system
                   is the current one for the program. You can
                   specify OS2, WINDOWS, or DOS4. DOS4?!?!? Yep.
                   More on this in a
HEAPSIZE           Determines how much local heap must be
                   allocated within the automatic data
STACKSIZE          Allows you to specify how much space should
                   be reserved in the stack segment when the
                   program is run.
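        EXTRN   INITROUTINE:FAR     ; assumed: the init routine lives in another module
        ASSUME  CS:_TEXT
_TEXT   SEGMENT BYTE PUBLIC 'CODE'
START   PROC    FAR
        call    INITROUTINE         ; the real initialization routine
        ret                         ; hand its return value back to the loader
START   ENDP
_TEXT   ENDS
        END     START               ; defines auto-init entry point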