Category : C Source Code
Archive   : CSRC2.ZIP
Filename : CSET.H

* c s e t . h



title cset Header file for character set functions
index Header file for character set functions


#ifdef vms
#include "c:cset.h"


The character set functions provide a set of routines for describing
and manipulating sets of characters. The character sets, called
"csets", created in this way can be manipulated quickly, and require
relatively little storage. They are meant to be used as arguments
to pattern-matching functions like span() (which see).

For these purposes, functions are provided to create csets and to
produce the complement of a cset (with respect to the set of all 8-bit
characters), as well as the join (union), meet (intersection), and
difference of two csets; see cset(), cscomp(), csjoin(), csmeet(),
and csdiff().
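
As a rough illustration of what these operations compute, the sketch
below implements complement, join, meet and difference on a plain
256-bit bitmap. The type Bits256 and the bits_* names are hypothetical,
and this is not the representation used by cset.c; it only shows the
set algebra described above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type: one bit for each of the 256 8-bit character codes. */
typedef struct { uint32_t w[8]; } Bits256;

/* Build a set from the characters of a string. */
Bits256 bits_from(const char *s) {
    Bits256 b;
    memset(&b, 0, sizeof b);
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        b.w[c / 32] |= (uint32_t)1 << (c % 32);
    }
    return b;
}

/* Nonzero if c is in the set. */
int bits_member(const Bits256 *b, unsigned char c) {
    return (int)((b->w[c / 32] >> (c % 32)) & 1u);
}

/* Complement with respect to all 256 character codes. */
Bits256 bits_comp(Bits256 a) {
    int i;
    for (i = 0; i < 8; i++) a.w[i] = ~a.w[i];
    return a;
}

/* Join (union). */
Bits256 bits_join(Bits256 a, Bits256 b) {
    int i;
    for (i = 0; i < 8; i++) a.w[i] |= b.w[i];
    return a;
}

/* Meet (intersection). */
Bits256 bits_meet(Bits256 a, Bits256 b) {
    int i;
    for (i = 0; i < 8; i++) a.w[i] &= b.w[i];
    return a;
}

/* Difference: members of a that are not in b. */
Bits256 bits_diff(Bits256 a, Bits256 b) {
    int i;
    for (i = 0; i < 8; i++) a.w[i] &= ~b.w[i];
    return a;
}
```

For example, the meet of bits_from("aeiou") and bits_from("abcde")
contains only 'a' and 'e', and their difference contains 'i', 'o'
and 'u'.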

csets can also be used more generally as representations of sets -
i.e., the name can be read as "C sets". In this case, the universe is
the set of numbers 0...(cssize-1), where cssize is a global parameter
defined in cset.c; it is normally 256 for character work. The
functions provided for this kind of application include csmember(),
which checks membership, and cswith() and csless(), which respectively
add elements to and remove elements from sets.

When csets are used in this way, it is important to understand that
a cset is a data object with an internal structure, and that different
csets may share internal data - i.e., csets are not normally "atomic"
objects and care must be taken in manipulating them. A look at the
representation of csets should help clarify this point.

The only object you normally manipulate directly in your code is
a cset pointer, of type (CSET *). This pointer points to a cset
header, which contains a mask and a pointer to a table of cssize
bytes. A character is in the cset if any of the bits in the header's
mask is on in the table entry for that character. Csets created by
cset() always have a one-bit mask; however, csjoin() and friends
avoid using up a bit position, where possible, by creating a header
whose mask contains more than one bit. Hence, the join of two csets
can often be represented very cheaply.
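
The header-and-table layout just described can be sketched as follows.
This is a minimal illustration, not the actual CSET definition: the
names SketchSet, sk_make(), sk_member() and sk_join() are hypothetical,
CSSIZE stands in for the cssize global, and the caller picks the bit
position by hand, whereas the real library presumably allocates bits
and tables itself.

```c
#define CSSIZE 256   /* stand-in for the cssize global */

/* Hypothetical header: a bit mask plus a pointer to a (possibly shared) table. */
typedef struct {
    unsigned char  mask;   /* bits of each table byte that belong to this set */
    unsigned char *table;  /* CSSIZE bytes, one per character code */
} SketchSet;

/* Create a one-bit set in the given table from the characters of a string. */
SketchSet sk_make(unsigned char *table, unsigned char bit, const char *members) {
    SketchSet s;
    s.mask = bit;
    s.table = table;
    for (; *members != '\0'; members++)
        table[(unsigned char)*members] |= bit;
    return s;
}

/* A character is a member if any mask bit is on in its table entry. */
int sk_member(const SketchSet *s, unsigned char c) {
    return (s->table[c] & s->mask) != 0;
}

/* Cheap join: when both sets share a table, OR the masks together.
   No table byte is touched and no new bit position is used up.
   Returns 1 on success, 0 if the tables differ (a fuller version
   would then have to build the join some other way). */
int sk_join(const SketchSet *a, const SketchSet *b, SketchSet *out) {
    if (a->table != b->table)
        return 0;
    out->mask = (unsigned char)(a->mask | b->mask);
    out->table = a->table;
    return 1;
}
```

With two sets built in the same table using bits 0x01 and 0x02, the
join is just a header with mask 0x03 over that table, which is why it
costs so little.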

Complements of csets are represented still more efficiently; a cset
and its complement even share the same header. Only the pointer
is changed - its bit pattern is complemented.
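
One way such a complemented pointer could work is sketched below. The
text does not say how a complemented pointer is told apart from a plain
one, so this sketch makes an assumption: headers sit at even addresses,
and a set low bit therefore marks a complemented handle. The names
CHeader, CHandle and the ch_* functions are hypothetical, and the code
relies on the usual round trip between pointers and uintptr_t.

```c
#include <stdint.h>

/* Hypothetical header, as in the description above. */
typedef struct {
    unsigned char  mask;
    unsigned char *table;
} CHeader;

/* Hypothetical handle: a header address, or its bitwise complement. */
typedef uintptr_t CHandle;

CHandle ch_wrap(CHeader *h) { return (uintptr_t)(void *)h; }

/* Complementing a set only flips the bits of the handle. */
CHandle ch_comp(CHandle h) { return ~h; }

int ch_member(CHandle h, unsigned char c) {
    /* Assumption: headers are at even addresses, so a set low bit
       means the handle is a complemented address. */
    int flipped = (h & 1u) != 0;
    const CHeader *hdr = (const CHeader *)(void *)(flipped ? ~h : h);
    int in = (hdr->table[c] & hdr->mask) != 0;
    return flipped ? !in : in;
}
```

Complementing twice gives back the original handle, and no header or
table is ever copied.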

A consequence of this representation is that a great deal of data is
often shared between csets. When manipulating csets as arbitrary
sets, it is important to understand that applying csless() or cswith()
to a cset may cause any related csets to be changed. Thus, after the
sequence of calls:

uvowels = cset("AEIOU");
lvowels = cset("aeiou");
vowels = csjoin(uvowels,lvowels);
lvowels = cswith(lvowels,'y');

'y' is probably a member of vowels. (Only "probably" because it
is impossible to predict whether uvowels and lvowels happen to get
the same table; csjoin() cannot use the "cheap" representation if
they don't.)
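
The effect can be reproduced with a small stand-alone model. The sketch
below is not the library's code: AliasSet, al_with(), al_member() and
al_demo() are hypothetical names, and it simply assumes the case where
uvowels and lvowels do land in the same table, so that the join is a
header with both mask bits set.

```c
#define TABLE_SIZE 256

/* Hypothetical header: mask plus pointer to a shared table. */
typedef struct {
    unsigned char  mask;
    unsigned char *table;
} AliasSet;

int al_member(const AliasSet *s, unsigned char c) {
    return (s->table[c] & s->mask) != 0;
}

/* Analogue of cswith(): add c by turning on the set's mask bits
   in the table entry for c. */
void al_with(AliasSet *s, unsigned char c) {
    s->table[c] |= s->mask;
}

/* Returns 1 if 'y' shows up in the join after being added to lvowels only. */
int al_demo(void) {
    unsigned char table[TABLE_SIZE] = {0};
    AliasSet uvowels = { 0x01, table };
    AliasSet lvowels = { 0x02, table };
    AliasSet vowels  = { 0x03, table };   /* cheap join: both mask bits */
    const char *p;

    for (p = "AEIOU"; *p != '\0'; p++) al_with(&uvowels, (unsigned char)*p);
    for (p = "aeiou"; *p != '\0'; p++) al_with(&lvowels, (unsigned char)*p);

    al_with(&lvowels, 'y');               /* touches the shared table */
    return al_member(&vowels, 'y');
}
```

Because vowels reads the same table bytes through a mask that includes
the lvowels bit, al_demo() reports that 'y' has become a member of the
join, while uvowels itself is unaffected.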

Two methods are available to avoid this problem. First, cscopy()
returns a guaranteed-"unique" copy of a cset. Second, the global
csunique (in cset.c) can be set, forcing functions such as
