Category : Tutorials + Patches
Archive   : OBERON.ZIP
Filename : OBERON.TXT

 
Output of file : OBERON.TXT contained in archive : OBERON.ZIP
==========================
oberon/long.messages #11, from frode, 21774 chars, Tue Feb 23 20:43:46 1988
--------------------------
TITLE: Oberon Language Article, (c) ETH-ZENTRUM, SWITZERLAND

From Modula to Oberon

N. Wirth

Abstract

The programming language Oberon is the result of a
concentrated effort increase the power of Modula-2 and
simultaneously to reduce its complexity. Several features
were eliminated, and a few were added in order to increase
the expressive power and flexibility of the language. This
paper describes and motivates the changes. The language is
defined in a concise report.

Introduction

The programming language Oberon evolved from a project whose
goal was the design of a modern, flexible and efficient
operating system for a single- user workstation. A
principal guideline was to concentrate on properties that
are genuinely essential and - as a consequence - to omit
ephemeral issues. It is the best way to keep a system in
hand, to make it understandable, explicable, reliable and
efficiently implementable.

Initially, it was planned to express the system in Modula-2
[1], as that language supports the notion of modular design
quite effectively, and because an operating system has to
be designed in terms of separately compilable parts with
conscientiously chosen interfaces. In fact, an operating
system should be no more than a set of basic modules, and
the design of an application must be considered as a
goal-oriented extension of that basic set: Programming is
always *extending* a given system.

Whereas modern languages, such as Modula, support the notion
of extensibility in the procedural realm, the notion is
less well established in the domain of data types. Modula
in particular does not allow the definition of new data
types as extensions of other, programmer-defined types in an
adequate manner. An additional feature was called for,
thereby giving rise to an extension of Modula.

The concept of the planned operating system also called for
a highly dynamic, centralized storage management relying on
the technique of garbage collection. Although Modula does
not prevent the incorporation of a garbage collector in
principle, its variant record feature constitutes a genuine
obstacle. As the new facility for extending types would
make the variant record feature superfluous, the removal of
this stumbling block was a logical decision. This step,
however gave rise to a restriction (subset) of Modula.
^LIt soon became clear that the rule to concentrate on the
essential and to eliminate inessential should not only be
applied to the design of the new system, but equally
stringently to the language in which the system is
formulated. The application of the principle thus led from
Modula to a new language. The adjective new, however, has
to be understood in proper context: Oberon evolved from
Modula by very few additions and several subtractions. In
relying on evolution rather than revolution we remain in the

tradition of a long development that led from Algol to
Pascal, then to Modula-2, and eventually to Oberon. The
common trait of these languages are their procedural rather
than functional model, and the strict typing of data. More
fundamental even is perhaps the idea of abstraction: the
language must be defined in terms of mathematical, abstract
concepts without reference to any computing mechanism. Only
if a language satisfies this criterion can it be called
"higher-level". No syntactic coasting whatsoever can earn a
language this attribute alone.

The definition of a language must be coherent and concise.
This can only be achieved by a careful choice of the
underlying abstractions an appropriate structure combining
them. The language manual must be reasonably short,
avoiding the explanation of individual cases derivable from
the general rules. The power of a formalism must not be
measured by the length of its description. To the contrary,
an overly lengthy definition is a sure symptom of
inadequacy. In this respect, not complexity but simplicity
must be the goal.

In spite of its brevity, a description must be complete.
Completeness is to be achieved within the framework of the
chosen abstractions. Limitations imposed by particular
implementations do not belong to a language definition
proper. Examples of such restrictions are the maximum
values of numbers, rounding and truncation errors in
arithmetic, and actions taken when a program violates the
stated rules. It should not be necessary to supplement a
language definition with voluminous standards document to
cover "unforeseen" cases.

But neither should a programming language be a mathematical
theory only. It must be practical tool. This imposes
certain limits on the terseness of the formalism. Several
features of Oberon are superfluous from a purely theoretical
point of view. They are nevertheless retained for practical
reasons, either for programmers' convenience or to allow
for efficient code generation without the necessity of
complex, "optimizing" pattern matching algorithms in
compilers. Examples of such features are the presence of
several forms of repetitive statements, and of standard
procedures such as INC, DEC, and ODD. They complicate
neither the language conceptually nor the compiler to any
significant degree.

These underlying premises must be kept in mind when
comparing Oberon with other languages. Neither the language
nor its defining document reach the ideal; but Oberon
approximates these goals much better than its predecessors.

A compiler for Oberon has been implemented for the NS32000
processor family and is embedded in the Oberon operating
environment. The following data provide an estimate of the
simplicity and efficiency of the implementation, and readers
are encouraged to compare them with implementations of other
languages. (Measurements with 10 MHz NS32032).

----------------------------------------------------------------
length of source program
length of compiled code
time of self-comp.
----------------------------------------------------------------
lines characters bytes seconds
----------------------------------------------------------------
Parser 1116 3671 99928 11.53
Scanner 346 9863 3388 3.80
Import/Export 514 18386 4668 5.25
Code generator 19636 5901 21636 21.02
Total 3939 130869 39620 41.60

Subsequently, we present a brief introduction to Oberon
assuming familiarity with Modula (or Pascal), concentrating
on the added features and listing the eliminated ones. In
order to be able to "start with a clean table", the latter
are taken first.

Features omitted from Modula

Data types

Variant records are eliminated, because they constitute a
genuine difficulty for the implementation of a reliable
storage management system based on automatic garbage
collection. The functionality of variant records is
preserved by the introduction of extensible data types.

Opaque types cater to the concept of the abstract data type
and information hiding. They are eliminated because again
the concept is covered by the new facility of extended
record types.

Enumeration types appear to be a simple enough feature to be
uncontroversial. However, they defy extensibility over
module boundaries. Either a facility to extend enumeration
types would have to be introduced, or they would have to be
dropped. A reason in favour of the latter, radical solution
was the observation that in a growing number of programs
the indiscriminate use of enumerations had led to a pompous
style that contributed not to program clarity, but rather
to verbosity. In connection with import and export,
enumerations gave rise to the exceptional rule that import
of a type identifier also causes the (automatic) import of
all associated constant identifiers. This exceptional rule
defies conceptual simplicity and causes unpleasant problems
for the implementor.

Subrange types were introduced in Pascal (and adopted in
Modula) for two reasons: (1) to indicate that a variable
accepts a limited range of values of the base type and
allow a compiler to generate appropriate guards for
assignments, and (2) to allow a compiler to allocate the
minimal storage space needed to store values of the
indicated subrange. This appeared desirable in connection
with packed records. Very few implementations have taken
advantage of this space saving facility, because additional
compiler complexity is very considerable. Reason 1 alone,
however, did not appear to provide sufficient justification
to retain the subrange facility in Oberon.

With the absence of enumeration and subrange types, the
general possibility to define set types based on given
element types appeared as redundant. Instead, a single,
basic type SET is introduced, whose values are sets of
integers from 0 to an implementation-defined maximum.

The basic type CARDINAL had been introduced in Modula-2 in
order to allow address arithmetic with values from 0 to
2^16 on 16-bit computers. With the prevalence of 32-bit
addresses in modern processors, the need for unsigned
arithmetic has practically vanished, and therefore the type
CARDINAL has been eliminated. With it, the bothersome
incompatibilities of operands of types CARDINAL and INTEGER
have disappeared.

The notion of a definable index type of arrays has also been
abandoned: All indecies are by default integers.
Furthermore, the lower bound is fixed to 0; array
declarations specify a number of elements (length) rather
than a pair of bounds. This break with a long standing
tradition since Algol 60 demonstrates the principle of
eliminating the inessential most clearly. The specification
of an arbitrary lower bound provides no expressive power at
all, but it introduces a non-negligible amount of hidden,
computational effort. (Only in the case of static
declarations can it be delegated to the compiler).

Modules and import/export rules

Experience with Modula over the last eight years has shown
that local modules were rarely used. The additional
complexity of the compiler required to handle them, and the
additional complications in the visibility rules of the
language definition appear not to justify local modules.

The qualification of an imported object's identifier x by
the exporting module's name M, viz. M.x can be circumvented
in Modula by the use of the import clause FROM M IMPORT x.
This facility has also been discarded. Experience in
programming systems involving many modules has taught that
the explicit qualification of each occurrence of x is
actually preferable. A simplification of the compiler is a
welcome side-effect.

The dual role of the main module in Modula is conceptually
confusing. It constitutes a module in the sense of a
package of data and procedures enclosed by a scope of
visibility, and at the same time it constitutes a single
procedure called the main program. Within the Oberon system,
the notion of a main program has vanished. Instead, the
system allows the user to activate any (exported,
parameterless) procedure (called a command). Hence, the
language excludes modules without explicit definition parts,
and every module is defined in terms of a definition part
and an implementation part (not definition module and
implementation module).

Statements

The with statement has been discarded. Like in the case of
exported identifiers, the explicit qualification of field
identifiers is to be preferred.

The elimination of the for statement constitutes a break
with another long standing tradition. The baroque mechanism
in Algol 60's for statement had been trimmed considerably
in Pascal (and Modula). Its marginal value in practice has
led to its absence in Oberon.

Low-level facilities

Modula-2 makes access to machine-specific facilities
possible trough low-level constructs, such as the data
types ADDRESS and WORD, absolute addressing of variables,
and type casting functions. Most of them are packaged in a
module called SYSTEM. The features were supposed to rarely
used and easily visible trough the presence of SYSTEM in a
module's import list. Experience has revealed, however,
that a significant number of programmers import this module
quite indiscriminately. A particulary seducing trap are
Modula's type transfer functions.

It appears preferable to drop the pretense of portability of
programs that import a "standard", yet system-specific
module. Both the module SYSTEM and the type transfer
functions are eliminated, and with them also the types
ADDRESS and WORD. Individual implementors are free to
provide system-dependent modules, but they do not belong to
the general language definition. Their use then declares a
program to be patently implementation-specific, and thereby
non-portable.

Concurrency

The system Oberon does not require any language facilities
for expressing concurrent processes. The pertinent,
rudimentary features of Modula, in particular the
coroutine, were therefore not retained. This exclusion is
merely a reflection of our actual needs within the concrete
project, but not on the general relevance of concurrency in
programming.

Features introduced in Oberon

Type extension

The most important addition is the facility of extended
record types. It permits the construction of new types on
the basis of existing types, and establishing a certain
degree of compatibility between the names of the new and
old types. Assuming a given type

T = RECORD x, y: INTEGER END

extensions may be defined which contain certain fields in
addition to the existing ones. For example

T0 = RECORD (T) z: REAL END;
T1 = RECORD (T) w: LONGREAL END;

define types with fields x, y, z and x, y, w respectively.
We define a type declared by

T' = RECORD (T) END

to be a (direct) extension of T, and conversely T to be the
(direct) base type of T'. Extended types may be extended
again, giving rise to the following definitions:

A type T' is an extension of T, if T' = T or T' is a direct
extension of an extension of T. Conversely, T is a base of
T', if T = T' or T is the direct base type of a base type
of T'. We denote this relationship by T' => T.

The rule of assignment compatibility states that values of
an extended type are assignable to variables of their base
types. For example, a record of type T0 can be assigned to
a variable of the base type T. This assignment involves the
fields x and y only, and in fact constitutes a projection of
the value onto the space spanned by the base type.

It is important that an extended type may be declared in a
module that imports the base type. In fact, this is
probably the normal case.

This concept of extensible data type gains importance when
extended to pointers. It is appropriate to say that a
pointer type P' bound to T' extends a pointer type P, if P
is bound to a base type T of T', and to extend the
assignment rule to cover this case. It is now possible to
form structures whose nodes are of different types, i.e.
inhomogenious data structures. The inhomogeneity is
automatically (and most sensibly) bounded by the fact that
the nodes are linked by pointers of a common base type.

Typically, the pointer fields establishing the structure are
contained in the base type T, and the procedures
manipulating the structure are defined in the same (base)
module as T. Individual extensions (variants) are defined in
client modules together with procedures operating on nodes
of the extended type. This scheme is in full accordance
with the notion of system extensibility: new modules
defining new extensions may be added to a system without
requiring a change of the base modules, not even their
recompilation.

As access to an individual node via a pointer bound to a
base type provides a projected view of the node data only,
a facility to widen the view is necessary. It depends on
the possibility to determine the actual type of the
referenced node. This is achieved by a type test, a Boolean
expression of the form

t IS T' (or p IS P')

If the test is affirmative, an assignment t' := t (t' of
type T') or p' := p (p' of type P') should be possible. The
static view of types, however, prohibits this. Note that
both assignments violate the rule of assignment
compatibility. The desired statement is made possible by
providing a type guard of the form

t' := t(T) (p' := p(P))

and by the same token access to the field z of a T0 (see
previous examples) is made possible by a type guard in the
designator t(T0).z. Here the guard asserts that t is
(currently) of type T0.

The declaration of extended record types, the type test, and
the type guard are the only additional features introduced
in this context. A more extensive discussion is provided in
[2]. The concept is very similar to the class notion of
Simula 67 [3], Smalltalk [4], and others. Differences lie in
the fact that the class facility stipulates that all
procedures applicable to objects of the class are defined
together with the data declaration. It is awkward to be
obliged to define a new class solely because a method
(procedure) has been added or changed. In Oberon, procedure
(method) types rather than methods are connected with
objects in the program text. The binding of actual methods
(specific procedures) to objects (instances) is delayed
until the program is executed. In Smalltalk, the
compatibility rules between a class and its subclasses are
confined to pointers, thereby intertwining the concept of
access method and data type in an undesirable way. Here,
the relationship between a type an its extensions is based
on the established mathematical concept of projection.

In Modula, it is possible to declare a pointer type within
an implementation module, and to export it as an opaque
type by listing the same identifier in the corresponding
definition module. The net effect is that the type is
exported whereby its associated binding remains hidden
(invisible to clients). In Oberon, this facility is
generalized in the following way: Let a record type be
defined in a certain implementation part, for example:

Viewer = RECORD width, height: INTEGER; x, y: INTEGER END

In the corresponding definition part, a partial definition
of the same type may be specified, for example

Viewer = RECORD width, height: INTEGER END

with the effect that a partial view - a public projection -
is visible to clients. In client modules as well as in the
implementation part it is possible to define extensions of
the base type (e.g. TextViewers or GraphViewers).

Type inclusion

Modern processors feature arithmetic operations on several
number formats. It is desirable to have all these formats
reflected in the language as basic types. Oberon features
five of them:

LONGINT, INTEGER, SHORTINT(integer types)
LONGREAL, REAL(real types)

With the proliferation of basic types, a relaxation of
compatibility rules between them becomes almost mandatory.
(Note that in Modula the arithmetic types INTEGER, CARDINAL
and REAL are uncompatible). To this end, the notion of type
inclusion is introduced: a type T includes a type T', if the
values of T' are also values of type T. Oberon postulates
the following hierarchy:

LONGREAL > REAL > LONGINT > INTEGER > SHORTINT

[Note that ">" should be replaced by the appropriate
mathematical sign. Limitation of type-in..]

The assignment rule is relaxed accordingly: A value of type
T' can be assigned to a variable of type T, if T' is
included in T (if T' extends T), i.e. if T > T' or T' => T.
In this respect, we return to (and extend) the flexibility
of Algol 60. For example, given variables

i: INTEGER; k: LONGINT; x: REAL

the assignments

k:=i; x:=k; x:=1; k:=k+1; x := x*10 + i

are confirming to the rules, where the assignments

i:=k; k:=x

are not acceptable. Finally, it is worth noting that the
various arithmetic types represent a limited set of
subrange types.

The multi-dimensional open array and the closure statement
(in symmetry to a module's initialization body) are the
remaining facilities of Oberon not present in Modula.

Summary

The language Oberon has evolved from Modula-2 and
incorporates the experiences of many years of programming
in Modula. A significant number of features have been
eliminated. They appear to have contributed more to
language and compiler complexity than to genuine power and
flexibility of expression. A small number of features have
been added, the most significant one being the concept of
type extension.

The evolution of a new language that is smaller, yet more
powerful than its ancestor is contrary to common practices
and trends, but has inestimable advantages. Apart from
simpler compilers, it results in a concise definition
document [5], and indispensible prerequisite for any tool
that must serve in the construction of sophisticated and
reliable systems.

Acknowledgement

It is impossible to explicitly acknowledge all contributions
of ideas that ultimately simmered down to what is now
Oberon. Most came from the use or study of existing
languages, such as Modula-2, Ada, Smalltalk, C++ and Cedar,
which often though us how not to do it. Of particular value
was the contribution of Oberon's first user, J. Gutknecht.
The author is grateful for his insistence on the
elimination of deadwood and on basing the remaining
features on a sound mathematical foundation.

References

1.N. Wirth. Programming in Modula-2.
Springer-Verlag, 1982.

2.N. Wirth. Type Extensions. ACM Trans. on Prog.
Languages and Systems (to appear)

3.G. Birtwistle, et al. Simula Begin. Auervach,
1973.

4.A. Goldberg, D. Robson. Smalltalk-80: The language
and its implementation. Addison-Wesley, 1983.

5.N. Wirth. The Programming language Oberon
(language definition document)



  3 Responses to “Category : Tutorials + Patches
Archive   : OBERON.ZIP
Filename : OBERON.TXT

  1. Very nice! Thank you for this wonderful archive. I wonder why I found it only now. Long live the BBS file archives!

  2. This is so awesome! 😀 I’d be cool if you could download an entire archive of this at once, though.

  3. But one thing that puzzles me is the “mtswslnkmcjklsdlsbdmMICROSOFT” string. There is an article about it here. It is definitely worth a read: http://www.os2museum.com/wp/mtswslnk/