Dec 08 2017
 
Updated versions of original Turbo Power Software Utilities. Includes: DIFF (difference finder), SDIR (directory finder), ROOT (file finder), REP (command repeater), RPL (pattern match and replace).
File TPOWER.ZIP from The Programmer’s Corner in
Category Utilities for DOS and Windows Machines
File Name     File Size   Zip Size   Zip Type
DIFF.COM          28866      15723   deflated
REP.COM           32030      18560   deflated
ROOT.COM          21844      13710   deflated
RPL.COM           36162      19654   deflated
SDIR.COM          24686      14675   deflated
TPOWER.DOC        80289      24515   deflated


Contents of the TPOWER.DOC file


The TurboPower Utilities were originally designed for users of
Turbo Pascal 2.0 and 3.0. Since then, the Pascal-specific
utility programs have been updated and extended to form the
Turbo Analyst product. Several of the original utilities are
still useful in their original form, and these are now being
released for free distribution. These utilities are general
purpose DOS utilities for displaying directories, finding
files, comparing files, performing regular expression search
and replace on text files, and automating repetitive command
sequences.

The following documentation consists of the relevant
portions of the original TurboPower Utilities manual. (It's
interesting how much things have changed since it was written
in 1985!)

You may use and distribute these files freely, but you may not
sell them (except for nominal handling costs charged by
shareware distributors) without the express written permission
of TurboPower Software.

Kim Kokkonen

March 1991
CompuServe 76004,2611

-----------------------------------------------------------


TurboPower Programmer's Utilities

User's Manual

Copyright (C) 1985 TurboPower Software.

All Rights Reserved.
Second Edition


Trademarks Mentioned

TurboPower Software and the distinctive TurboPower logo are
trademarks of TurboPower Software. TurboPower Utilities is a
trademark of TurboPower Software.

WordStar is a trademark of Micropro International.

IBM is a trademark of International Business Machines
Corporation.

Turbo Pascal, Turbo Tutor and Sidekick are trademarks of
Borland International.

Unix is a trademark of Bell Laboratories.

-----------------------------------------------------------

TABLE OF CONTENTS

1. Introduction
2. Getting Started
3. Using the Utilities
4. References

5. SDIR - Super Directory Utility
   A. Purpose
   B. Usage
   C. Sorting Options
   D. Filtering Options
   E. Listing Options
   F. File Attribute Codes
   G. Examples

6. ROOT - File Finder
   A. Purpose
   B. Usage
   C. Command Options
   D. Examples

7. REP - Command Repeater
   A. Purpose
   B. Usage
   C. Command Options
      1. Redirecting Input and Output
      2. Changing Current Directory
      3. Changing Delimiters
      4. Sending Keystrokes
      5. Double-Checking Commands
      6. Output Options
      7. Command Line from a File
      8. Miscellaneous Options
   D. Examples

8. DIFF - Text File Difference Finder
   A. Purpose
   B. Usage
   C. Command Options
      1. Disregarding Differences
      2. Formatting the Output of DIFF
      3. Miscellaneous Options
   D. Using the Script Mode
   E. Further Examples

9. RPL - Pattern Match and Replace
   A. Purpose
   B. Usage
      1. Command Files
      2. Specifying Input and Output
      3. Behavior at Runtime
   C. Command Options
      1. Specifying Regular Expressions
      2. Deciding What Lines to Output
      3. Miscellaneous Formatting
   D. Select and Match Expressions
   E. Replace Expressions
   F. Examples

-----------------------------------------------------------

1. INTRODUCTION

Each of the Utilities is described in a separate section of
this manual. Here is an overview of each program:

o DIFF - Difference Finder
Finds differences between two text files. Reports differences
in any of several formats, one of which allows efficient
archiving of file history. Allows certain differences to be
disregarded (including spacing, case, selected characters and
Pascal comments).

o RPL - Pattern Match and Replace
Uses regular expressions to find arbitrarily complex text
patterns in a file, then optionally allows replacement of
those patterns. Supports many advanced regular expression
features, including nesting, tagged match words, alternation
and three types of closures.

o SDIR - Super Directory
Displays the MS-DOS disk directory with a large number of
options, including sort order, extended pattern matching,
hidden file display, date filtering and others.

o ROOT - File Finder
Finds files (singular or wildcarded) anywhere in the directory
hierarchy and then allows you to act on them with a single
keystroke (e.g. copy, type, execute, or delete). Displays the
directory structure in any of several formats.

o REP - Command Repeater
Combines a programmable text parser with general purpose
command execution capability to automate many repetitive
tasks. Uses include applying file operations across multiple
subdirectories and running RPL and DIFF operations on all
source files comprising a program.

-----------------------------------------------------------

2. GETTING STARTED

The TurboPower Utilities require that you be operating under
PC-DOS 2.X or later on an IBM PC, XT, AT or 100% compatible.
We are testing on non-IBM hardware as it becomes available to
us. We would appreciate hearing from you if you find that
these programs work well on your non-IBM machine. We will use
the terms MS-DOS, PC-DOS and DOS interchangeably throughout
the manual.

All of the utilities will run in a system with 128K bytes
available RAM. They are designed to take advantage of all
available RAM space. To get full performance out of the
programs that build large internal data structures (such as
DIFF), your system should have at least 192K bytes of RAM. If a
lot of your RAM space is consumed by print spoolers, RAM disks
and resident programs, you may need more RAM.

Two double-sided disk drives are highly recommended unless you
have a hard disk, which is even better.

-----------------------------------------------------------

3. USING THE UTILITIES

Most of the Utilities are designed so that you simply type the
command name and something useful will happen. Nevertheless,
it is worthwhile to do a little preparation before starting
out.

The capacity of the Utilities has been set so that each can
easily handle the kinds of programs that a Turbo Pascal
programmer is likely to produce. This generally sets a limit
of 64K bytes per individual text file and several thousand
lines per program. Generally, the programs can take advantage
of increased RAM space, but in some cases an array size
hard-coded into the program will set an upper limit.

All of the utilities follow a consistent command format. If
the name of the utility is entered by itself, either the
utility will execute a default set of actions or it will
prompt for additional information. Optionally, each utility
can be called with command line arguments. For example:

ROOT -T -F myfile.* >files.dat

Each of the words -T and -F specifies a command option to the
utility. Every command option must begin with the hyphen - and
be followed immediately by one or more characters. Individual
options must be separated from one another by at least one
space or tab. The case (upper or lower) of the words typed on
the command line is generally not significant. Differing case
will be used in this manual for emphasis only. The fourth word
on the example command line above represents a file or set of
files on which the utility will operate. Depending on the
utility, there may be from 0 up to 3 files specified on the
command line. These files sometimes may contain wildcards as in
the example, and usually may contain MS-DOS pathnames. These
details are specified in the section on each utility.

The final word on the example command line controls MS-DOS
output redirection. In this case the results of the command
will not be sent to the screen, but rather to the file named
FILES.DAT. If you do not know about the redirection features
of MS-DOS 2.0+, it is worth your time to study the DOS manual,
since redirection provides some very powerful capabilities.
Where it makes sense, the TurboPower Utilities support
input/output redirection.
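
As an illustration only, the command format described above can
be modeled in a few lines of Python. The function name
classify_words is hypothetical and is not part of the
Utilities. In practice DOS itself removes the redirection word
before the utility runs; it is kept here only to mirror the
description of the example command line.

```python
def classify_words(command_line):
    """Split a command line into utility name, options, files, redirection."""
    words = command_line.split()      # spaces and tabs separate words
    if not words:
        return None, [], [], []
    utility = words[0].upper()        # case is generally not significant
    options, files, redirection = [], [], []
    for word in words[1:]:
        if word[0] in "<>":
            redirection.append(word)  # e.g. >files.dat
        elif word.startswith("-") and len(word) > 1:
            options.append(word.upper())
        else:
            files.append(word)        # file or directory specifier
    return utility, options, files, redirection


print(classify_words("ROOT -T -F myfile.* >files.dat"))
```

For the example command line this yields the options -T and -F,
the file specifier myfile.* and the redirection word
>files.dat.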

The discussion of each utility will include the following
format:

ROOT [options] [file or directory] [I/O redir]

Optional entries will always be enclosed in brackets. The
order of the entries does not matter except when explicitly
specified.

Each of the Utilities that takes command line options has a
built-in help feature. Simply enter the name of the utility
followed by either -? or just ? to get a quick help screen
describing the options. For example,

ROOT -?

will show what options ROOT provides.

The directly executable TurboPower utilities (SDIR, ROOT, REP,
DIFF, RPL) provide return codes after an abnormal exit from
the program. If the program is terminated by Ctrl-Break, it
returns a code of 1. If the program exits with an error, it
returns a code of 2. These codes can be accessed through the
MS-DOS batch ERRORLEVEL function or through MS-DOS function
call $4D.

You can usually abort the operation of the utilities by typing
Ctrl-C or Ctrl-Break.

-----------------------------------------------------------

4. REFERENCES

The PC-DOS manuals contain a wealth of information for both
beginners and advanced programmers. The following sections are
relevant to using the TurboPower Utilities.

Subdirectories and Pathnames:

Using Tree-Structured Directories. (Chapter 5 for either
MS-DOS 2.X or 3.0).

PATH command. (Chapter 6 for 2.X, Chapter 7 for 3.0)


Redirection of Input and Output:

Chapter 10 for 2.X, Chapter 6 for 3.0.

-----------------------------------------------------------

5. SDIR - Super Directory Utility

A. Purpose

SDIR displays a file directory in various ways. It allows the
order in which the files are displayed to be sorted according
to name, extension, size and/or time. Hidden or subdirectory
entries optionally can be shown. Files older or newer than a
chosen date can be excluded from the listing. File names and
extensions may be specified using any combination of the
standard wild cards * and ?. Finally, the listing can be
printed in either of two formats, or displayed in any of
several formats.

SDIR has been designed to work well with REP, the TurboPower
Command Repeater. Together, these utilities can save you a lot
of repetitive typing.

SDIR does not modify the actual directory encoded on the disk.


B. Usage

SDIR [options] [directory] [output redir]

Optional entries are enclosed in brackets [ ] here and
throughout this manual.

The directory specifier follows normal DOS 2.0+ rules. If not
specified, the current drive and directory are used. If a
partial pathname (not beginning with \) is specified, this
pathname is appended to the current directory pathname. If a
pathname beginning with ..\ is specified, the remainder of the
specified pathname is appended to the pathname of the
directory above the current directory. If a pathname beginning
with just .. is specified, any file specification that follows
will refer to the contents of the directory above the current
directory.

If only a drive is specified, the current default directory of
that drive is used. File specifiers (including wildcards) may
also be appended to the pathname.

SDIR divides the directory output into columnar fields. When a
filename does not include an extension, the column for that
field will appear blank. It is NOT really blank, but is filled
with nulls (ASCII 0). This allows the REP parser to keep the
word count consistent between files with or without extensions
(that is, @3 in REP will always refer to file size).
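
The effect of the null-filled column can be sketched in Python.
This is a simplified model, not SDIR or REP code: the listing
lines and column widths are made up, and we assume that a run
of delimiters counts as a single separator.

```python
def parse_words(line):
    """Split a line into words on runs of spaces and tabs."""
    return [w for w in line.replace("\t", " ").split(" ") if w]


with_ext = "TEMPFILE DAT 32544"
null_ext = "README   \x00\x00\x00 1200"   # extension column filled with nulls
blank_ext = "README       1200"           # what a truly blank column would give

print(parse_words(with_ext)[2])     # third word is the size
print(parse_words(null_ext)[2])     # still the size
print(len(parse_words(blank_ext)))  # only two words, so the size would shift
```

Because the nulls are not delimiters, they form a word of their
own and the size stays in third position.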

Each option must be specified individually, and separated from
other options by a space or tab. The order of the directory
and option entries is not important. The case (upper or lower)
of entries is not important.

If output is being sent to the screen, and the screen has been
filled, SDIR will prompt "MORE?". Press Enter (carriage return)
to get another single line of output, or Y (for Yes) to get
another screenful, or any other key to quit operation of SDIR.

The maximum number of files SDIR will find in a given
directory is 512 (if you have more than this, you should
probably create some subdirectories).

The output of SDIR can be redirected using the standard DOS
2.X techniques. These allow sending the output to a file, to
the printer, or to the input of another program. In the case
of the printer, it is recommended that you use one of the
print command options described below, as they provide print
formatting as well as redirection.

Obtain a summary of SDIR options at any time by typing

SDIR -?


C. Sorting Options

-AN
Sort the listing by file Names in ascending (alphabetical)
order.

-AE
Sort the listing by file Extensions in ascending order.

-AS
Sort the listing by file Size in ascending order.

-AT
Sort the listing by file Time (combined date and time) in
ascending order.

-DN
-DE
-DS
-DT
Sort the listing in descending order for any of the four
categories described above.

Two different sort keys may be specified simultaneously. The
first key encountered on the command line becomes the primary
key, and the second the secondary. If no sort key is
specified, SDIR defaults to -AN -AE (files sorted first by
name, then by extension). If only one sort key is specified,
the second one defaults to a reasonable value. If more than
two sort keys are specified, all but the first two are
ignored.
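
The primary and secondary key behavior can be sketched in
Python as below. The tuple layout and the name sort_entries are
assumptions made for this illustration. The sketch does not
model the "reasonable value" SDIR picks for a missing second
key; with one key, ties simply keep their input order.

```python
# Hypothetical directory entries: (name, extension, size, time).
FIELDS = {"N": 0, "E": 1, "S": 2, "T": 3}


def sort_entries(entries, keys=("AN", "AE")):
    """Sort by up to two keys such as "AE" or "DS"; extra keys are ignored."""
    result = list(entries)
    # Apply the secondary key first. Python's sort is stable, so the
    # primary key, applied last, decides the final order.
    for key in reversed([k.upper() for k in keys[:2]]):
        direction, field = key[0], FIELDS[key[1]]
        result.sort(key=lambda e: e[field], reverse=(direction == "D"))
    return result


entries = [
    ("RPL", "PAS", 36162, "1985-02-11 12:02"),
    ("DIFF", "PAS", 28866, "1985-02-10 09:30"),
    ("REP", "COM", 32030, "1985-02-09 17:45"),
]
for entry in sort_entries(entries, ("AE", "DS")):
    print(entry)
```

With the keys -AE -DS, the .COM entry comes first and the two
.PAS entries follow in order of descending size.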


D. Filtering Options

-H[O]
Show Hidden files [Only] in the directory listing. SDIR finds
hidden, read-only, and system files using this search.

-S[O]
Show Subdirectories [Only] in the listing. SDIR will not show
the directory entries . and .. since they are (almost) always
there.

-M[n]
Show only files Modified since n days ago. If n is not
specified, it defaults to 0, and thus only those files
modified today are listed. n is an integer (0<=n<32768), and
must follow immediately after the letter M (without spaces).

-B[n]
Show only files last modified Before n days ago. If n is not
specified, it defaults to 0, and thus only those files NOT
modified today are listed. n meets the same constraints
specified for the M option.

-MB
Show only files Modified since last Backup. This uses the
modify bit supported by MS-DOS. This option can be used in
combination with any of the other options of SDIR.
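
The -M and -B date filters described above can be sketched in
Python. We assume here that the comparison is made on calendar
dates and that -B[n] selects exactly the files that -M[n]
rejects; the manual confirms this only for n = 0. The function
names are hypothetical.

```python
from datetime import date, timedelta


def modified_since(file_date, n=0, today=None):
    """-M[n]: keep files modified within the last n days (n=0 means today)."""
    today = today or date.today()
    return file_date >= today - timedelta(days=n)


def modified_before(file_date, n=0, today=None):
    """-B[n]: keep files last modified before n days ago."""
    today = today or date.today()
    return file_date < today - timedelta(days=n)


reference = date(1985, 2, 11)
print(modified_since(date(1985, 2, 11), 0, reference))   # modified today
print(modified_before(date(1985, 2, 10), 0, reference))  # not modified today
```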


E. Listing Options

-T[O]
Include Titles [Only] in the directory listing. Titles include
an expanded version of the directory and wildcards specified,
the disk volume label, bytes remaining on the disk, number of
files found, and bytes used in the files found. By default,
titles are not output. This makes the response faster, and the
output is more easily used as input to other programs.

-Sn
Use a Sector size of n bytes when calculating the disk space
usage of a group of files. By default, SDIR uses a
sector size read from the disk. For a DSDD disk this creates
an effective sector size of 1024 bytes, and for a 10Mbyte hard
disk an effective sector size of 4096 bytes. File sizes
smaller than an exact sector size boundary are allocated the
full extra sector on the disk, thus using more space than you
might expect. You can use a non-default size to determine how
much space files will take when transferred to another disk.

-C
Output the directory in a Compressed format. Here only the
filename and extension are shown. The filenames are arranged
in 5 columns across the screen, so you can fit a lot of files
in a little space. The order of the display reads DOWN the
columns, which seems a more natural way to read than the DOS
DIR/W command provides. Subdirectory entries, if selected,
have the character \ appended.

-E
Precede each filename with the complete pathname to its
directory; otherwise SDIR output is normal. This option is
useful with REP.

-W
Include Whole pathname with each file found. No other file
information (such as size, date, etc.) is written out. This
option is useful when SDIR is to supply input to another
program such as the command repeater REP. If -W is specified,
the -T and -C options are overridden.

-P
Print the directory listing (at normal size) instead of
sending it to the display screen. When the printing options
are set, output redirection is not required in order to print
the data. Specifying -P will send form feeds to the printer in
order to avoid the perforations. Titles are included by
default when printing is specified. If -P is specified, the -W
and -C options are overridden.

-PT
Print the directory at Tiny size (on Epson or compatible
printers only). The size is such that the listing will fit
conveniently on a floppy disk or in its sleeve.
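
The space calculation behind the -Sn option amounts to rounding
each file size up to a whole number of sectors. A minimal
Python sketch follows; the function name allocated_bytes is
ours, and the file size is that of DIFF.COM from the file list
in this distribution.

```python
def allocated_bytes(file_size, sector_size):
    """Round file_size up to a whole number of sectors."""
    if sector_size <= 0:
        raise ValueError("sector size must be positive")
    sectors = -(-file_size // sector_size)   # ceiling division
    return sectors * sector_size


print(allocated_bytes(28866, 1024))  # 29696 with a 1024-byte sector
print(allocated_bytes(28866, 4096))  # 32768 with a 4096-byte sector
```

The same 28866-byte file therefore occupies 29696 bytes at the
DSDD effective sector size and 32768 bytes at the 10Mbyte hard
disk effective sector size.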


F. File Attribute Codes

SDIR displays file attributes when the default display options
are used. The codes for the attributes are:

n - normal file
s - system file
d - directory
r - read-only file
h - hidden file
m - modified since backup


G. Examples

SDIR

Displays all the normal files in the current directory, sorted
in alphabetical order by filename, then by extension.

SDIR -t -dt -m5 c:\*.pas

Displays all normal files in the root directory of drive C
having the extension .PAS, and which were modified within the
last 5 days. The listing will be presented in order of newest
file first. Titles are included in the output.

SDIR -ae -ds mydir

Displays all of the normal files in the subdirectory mydir of
your current directory. The listing will be sorted first by
extension, then by descending size.

SDIR a: -pt

Makes a tiny printout of the files on drive A.

SDIR *tmp.exe

Displays all .EXE files whose name ends in TMP. The DOS DIR
command will not give the proper answer here.

SDIR >old.dir

Saves the current directory listing in the file old.dir.

SDIR \oldsrc\rpl*.pas -w | REP {type @0} -s

Pipes the output of the SDIR command to the input of the REP
command. All the files matching rpl*.pas in the remote
directory specified are typed to the screen.

SDIR ..\pascal\*.pas

Looks at the subdirectory named pascal which is linked to the
directory above the current one.
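
The *tmp.exe example above relies on a wildcard that matches
text before a fixed ending. As a rough illustration, Python's
fnmatch module behaves the same way for this pattern. It stands
in for SDIR's matcher here and may differ from it in other
cases; the helper name matches is hypothetical.

```python
from fnmatch import fnmatchcase


def matches(filename, pattern):
    """Case-insensitive wildcard match using * and ?."""
    return fnmatchcase(filename.upper(), pattern.upper())


for name in ["MYTMP.EXE", "TMP.EXE", "TMPFILE.EXE", "MYTMP.COM"]:
    print(name, matches(name, "*tmp.exe"))
```

Only MYTMP.EXE and TMP.EXE match, because the name part must
end in TMP.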

-----------------------------------------------------------

6. ROOT - File Finder

A. Purpose

ROOT serves several purposes. First, it will show the
subdirectory structure of a disk. This is similar to the DOS
TREE function but more powerful (and even readable!).

Second, ROOT will find a file or files anywhere in the
subdirectory structure and then perform any of several
functions relative to the file.

Third, ROOT is designed to work well with the command repeater
REP. By providing path and filename inputs to REP, ROOT helps
to automate many tedious tasks.

ROOT has many options, which are described below.


B. Usage

ROOT [drive][pathname] [options] [output redir]

Optional entries are enclosed in brackets [ ] here and
throughout this manual.

Drive is a single letter followed by a colon (A:, B:, C:, ...,
to a maximum of H:). It indicates which drive is to be
searched. If not specified, the default drive is used.
Pathname can optionally follow drive. If provided, pathname
specifies the starting directory for searching. If not
specified, the current default directory is used. DOS pathname
shorthand is supported. If a new drive is specified, the
starting directory defaults to the root directory of that
drive.

Each option must be specified individually, and separated from
other options by a space or tab. The order of the entries is
not important. The case (upper or lower) of entries is not
important.

When you are displaying a subdirectory diagram and output is
going to the screen, ROOT will display the current directory
location in low intensity, while the rest of the
subdirectories are in high intensity.

If output is being sent to the screen, and the screen has been
filled, ROOT will prompt "MORE?". Press Enter (carriage return)
to get another single line of output, or Y to get another
screenful, or any other key to quit operation of ROOT.

The output of ROOT can be redirected using the standard MS-DOS
facilities. This allows the results of ROOT to be sent to a
file, the printer, or the input of another program. By
default, the output of ROOT is sent to the screen.

Obtain a summary of ROOT options at any time by typing

ROOT -?


C. Command Options

-S
Show total Size (in bytes) used by files in each subdirectory;
otherwise size is not shown. Size is cluster-justified using a
cluster size read from the drive being searched.

-N
Do Not sort the subdirectory entries alphabetically. Otherwise
each level of the root structure is shown in alphabetical
order.

-W
Show Whole pathname for each subdirectory entry, otherwise
only the significant words of each pathname are shown.

-T
Start the subdirectory search at the Top level directory,
otherwise ROOT searches downward from the current default
directory. (This is equivalent to specifying a pathname of \).

-F filename
Find the file named filename. Wildcard characters accepted by
DOS (*, ?) are acceptable in filename. Filename must be
separated with a space from the -F flag, but must follow
immediately thereafter. -T is the only other option that has
meaning when -F is used. The search for filename will find a
maximum of 512 matching files in each directory, but will warn
when that limit is reached. The search is also limited to 6
levels of subdirectories below the starting directory. Again,
a warning will be issued when this limit is reached. When a
file matching filename is found, ROOT offers the following
commands:

Q Quit searching.
S Search further.
T Type the file to the screen.
P Print the file on the default printer.
C Copy the file to your current directory.
M Move default directory to where file was found and quit.
D Delete the file.
E Execute the file and quit (.COM and .EXE files only).
I Display full directory information regarding the file.
? Display a help message.

You will be prompted to choose one of these commands.

If the output of ROOT has been redirected to anywhere besides
the screen, interactive prompting will not occur. Instead, a
list of the matches found will be written to the redirected
output device. This list consists of a single line for
each match, where each line is a complete pathname and
filename.
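
The redirected form of the -F search can be approximated in
Python as follows. This is a sketch of the behavior described
above, not ROOT's implementation: the name find_files is
hypothetical, the 512-files-per-directory limit and the
warnings are not modeled, and fnmatch stands in for DOS
wildcard matching.

```python
import os
from fnmatch import fnmatchcase


def find_files(start, pattern, max_depth=6):
    """Return sorted full paths of files under start that match pattern."""
    start = os.path.abspath(start)
    base_depth = start.rstrip(os.sep).count(os.sep)
    found = []
    for dirpath, dirnames, filenames in os.walk(start):
        depth = dirpath.rstrip(os.sep).count(os.sep) - base_depth
        if depth >= max_depth:
            dirnames[:] = []      # do not descend below the depth limit
        for name in filenames:
            if fnmatchcase(name.upper(), pattern.upper()):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Calling find_files(".", "*.pas") returns one complete pathname
per match, which is the shape of output that ROOT writes when
its output is redirected.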


D. Examples

ROOT

Displays the subdirectory diagram of the default drive,
starting at the current directory, sorted alphabetically, on
the screen.

ROOT c:\ -s -w >prn

Prints the entire directory structure of disk C: including the
number of bytes used in each subdirectory, and the whole
pathname of each subdirectory.

ROOT -t -f *.pas

Searches the current drive, starting at the top level, for all
files with the extension .PAS. For each file found, ROOT will
offer the choices described at the end of the options section.

ROOT -t -f *.pas >pasfiles.dat

Writes a file pasfiles.dat containing the names and locations
of all files having the extension .PAS.

ROOT -f *.bak | REP {del @0} -q

Pipes the output of ROOT into REP, which then deletes all .BAK
files found in and below the current directory, after getting
user confirmation for each one.

ROOT .. -n

Shows all subdirectories attached to the parent of the current
directory, in the order that they are stored on the disk.

-----------------------------------------------------------

7. REP - Command Repeater

A. Purpose

REP combines a programmable text parser with the ability to
execute any command allowed by MS-DOS. Programmers often want
to run the same program several times, with the only
difference being the data or files supplied as input to the
program. REP was developed specifically for this purpose, but
it offers a level of programmability that will make it
suitable for a wide range of applications.

A few examples may illustrate REP's value. REP can
automatically:

o Delete all .BAK files older than 90 days in all
subdirectories.

o Copy files from all subdirectories of a diskette into a
single directory of a target drive.

o Print all the source files making up a program.

o Use RPL to find all instances of the variable XXX1 in all
source files of a program (and optionally replace it).

o Use DIFF to find differences between .BAK files and their
latest versions, and save the results to a single file.

o Print 107 copies of a disk label.

o Run a batch file and prompt for or provide the changing
input parameters to the batch file.

o Act as a general purpose (albeit slow) parser using the
ECHO statement.

We will show how to implement these examples later in this
section.

The basic operation of REP is straightforward, although it
offers so much power that you may initially be hesitant about
using it. REP reads one text line at a time from the standard
input. Using specified delimiters, REP then parses the text
line into words, and from those words and other options REP
builds a full command description. REP then executes the
command. When the command is finished, control is returned to
REP, which reads another text line from standard input and
repeats the process.


B. Usage

REP is called as follows:

REP [options] {Command} [input redir]

REP does not offer a prompted input mode, except to allow you
to specify a command tail if you have typed just REP.

Command is any DOS internal, .COM, .EXE, or .BAT command along
with associated command line arguments. Command must be
surrounded by curly braces {}. It may contain spaces. The
command should be entered in just the same way as it is
usually entered at the DOS prompt, with two exceptions. First,
the Command name and its parameters or options may include
parser symbols, to be described below. Second, Command may not
directly specify input/output redirection. Redirection is
obtained using a REP option below. If I/O redirection were
specified within Command, MS-DOS would interpret it to mean
that we were redirecting the I/O of REP and not that of
Command.

If you want Command to specify another call to REP (the DOS
equivalent of recursion), you must be careful. REP does not
support nesting of curly braces, so the second level Command
must be hidden using the REP -F option to be described below.
RAM space will also set a limit on this form of recursion.

REP reads data lines from the standard input, which you may
wish to redirect so that they come from a file, or from the
output of another program. If REP input is not redirected, you
will be prompted to enter a data line at the keyboard. In that
case, you should terminate each data line with a carriage
return. When you are done, signal the end of input.

REP parses each input data line into words using a set of
delimiters. By default, the delimiters are space (ASCII 32)
and tab (ASCII 9). You may change the delimiter set using a
REP option. When parsing is complete, REP has a set of words
that may be used to perform substitutions into Command and
into other areas of the REP command line.

The symbols by which you refer to the parsed words are as
follows:

@n
nth parsed word of input line (n greater than 0).

@0
Entire input line (zero, not "oh").

@L
Last word of input line. (Lowercase @l is also accepted.)

@P
All but last word of input line. (Lowercase @p is also
accepted.)

@B
Discretionary backslash. (Lowercase @b is also accepted.)

@C(Start,Length)
Copies a substring of the standard input line starting at
position Start and extending for Length characters.

For example, if the input data line is

TEMPFILE DAT 32544 2-11-85 12:02

then @1 refers to TEMPFILE, @3 refers to 32544, @L refers to
12:02, and @P refers to TEMPFILE DAT 32544 2-11-85

Note that the delimiter characters themselves do not appear in
the parsed words, with the exception of the @0 and @P symbols.
For these symbols, all delimiters internal to the string
remain as in the input line.
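
The substitution of @n, @0, @L and @P can be sketched in
Python. This is only a model of the rules stated above, using
the example data line; the function name expand is
hypothetical, runs of delimiters are assumed to count as one
separator, and @B, @C and the @ escape are left out.

```python
import re


def expand(template, line, delimiters=" \t"):
    """Expand @n, @0, @L and @P in template using words parsed from line."""
    splitter = "[" + re.escape(delimiters) + "]+"
    words = [w for w in re.split(splitter, line) if w]
    if words:
        # @P keeps internal delimiters, so cut the original text just
        # before the last word instead of re-joining the parsed words.
        body = line.strip(delimiters)
        all_but_last = body[: len(body) - len(words[-1])].rstrip(delimiters)
    else:
        all_but_last = ""

    def substitute(match):
        symbol = match.group(1).upper()
        if symbol == "L":
            return words[-1] if words else ""
        if symbol == "P":
            return all_but_last
        index = int(symbol)
        if index == 0:
            return line           # @0 is the entire input line
        return words[index - 1] if index <= len(words) else ""

    return re.sub(r"@(\d+|[LlPp])", substitute, template)


line = "TEMPFILE DAT 32544 2-11-85 12:02"
print(expand("type @1.@2", line))
print(expand("@L", line))
print(expand("@P", line))
```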

The @B symbol is introduced to work around a quirk of MS-DOS.
The pathname of the root directory ends in backslash \, but no
other subdirectory names end in backslash. When we want to
append a filename to a pathname, there must always be a
backslash between the pathname and filename. The discretionary
backslash symbol looks at the previous character of the
string. If it does not find a backslash, it adds one. If it
does find one, it doesn't add another one. In the Examples
section we illustrate the importance of this symbol.
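
The rule for the discretionary backslash is small enough to
state in Python. The function name join_path is ours; the
sketch only shows the rule described above.

```python
def join_path(pathname, filename):
    """Join with a backslash only if pathname does not already end in one."""
    if not pathname.endswith("\\"):
        pathname += "\\"
    return pathname + filename


print(join_path("C:\\", "TEMPFILE.DAT"))     # root directory: no extra backslash
print(join_path("C:\\SRC", "TEMPFILE.DAT"))  # subdirectory: backslash added
```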

Parser symbols may be used anywhere in Command, Device, or
Path (see below for usage of Device and Path) in order to
customize the command for the particular data line read.

@ is also used as an escape character, when a literal instance
of any of the special characters must be included. Special
characters are #,^,@,{, and }. Thus, to obtain a literal right
curly brace, you must enter @}.


C. Command Options

1. Redirecting Command Input and Output

-I Device
redirect Command Input.

-O Device
redirect Command Output.

-A Device
Append Command output.

Device is any valid MS-DOS device (e.g. PRN, CON, NUL) or file
(including optional pathname). As in MS-DOS, using PRN (the
printer) as an input device is not acceptable. Only files are
meaningful when using the append option. These three command
options are the equivalent of the symbols <, >, and >>
respectively that normally are entered at the DOS prompt.
Device must be separated by a space from the option keyword,
but otherwise must immediately follow it. Device may include
parser symbols, which will be replaced by input data before
Command is executed.

If you want command input to come from the keyboard and
command output to go to the screen, then there is no need to
specify these options. Note that standard input of Command is
redirected independently from the standard input of REP.


2. Changing Current Directory

Some commands require that any files used be located in the
current default directory. REP offers the capability of
automatically changing directory before Command is executed,
by means of the following option:

-M Path
Move to directory Path before executing command.

Path is any valid MS-DOS 2.X pathname. Path may contain parser
symbols, which will be replaced by input data before Command
is executed.

You may wish to guarantee that you return to the initial
directory at the end of the REP run. This is specified with
the following option.

-H
Return to initial (Home) directory after executing all
commands.


3. Changing Delimiters Used by the Parser

As mentioned above, the data line parser uses space and tab as
default delimiters. Sometimes you may wish to use other
characters instead. This is specified with the following option:

-D Delims
Provide a new Delimiter set to the parser. When -D is
specified, the default delimiters are forgotten. Delims may
not contain any blank space and must be separated from the -D
keyword by at least a space. Each character in Delims is
significant. The new delimiters are simply typed one after the
other. Order of the delimiters is not important.

In some cases you may wish to specify non-printable or blank
characters as delimiters. REP supports the following conventions
for interpreting characters:

c
Any printable character c, if c has no special meaning.

#nnn
ASCII character nnn (0<=nnn<=255, must be terminated with
non-numeric character). For example, #13 is a carriage return.

^c
Control character c (c must be a legal control character). For
example, ^I is a tab.

If you wish to use # or ^ as a literal delimiter, you must
specify it indirectly, either with an escape symbol (e.g. @^) or
using the ASCII sequence (e.g. ^ is #94).
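
The character conventions above can be modeled with a small
decoder in Python. This sketch covers the Delims case only: the
name decode is hypothetical, @ is treated purely as an escape
for the following character, and the way REP itself reports
malformed input is not known to us, so the sketch simply raises
ValueError.

```python
DIGITS = "0123456789"


def decode(spec):
    """Decode c, #nnn, ^c and @-escaped characters into a plain string."""
    out = []
    i = 0
    while i < len(spec):
        ch = spec[i]
        if ch == "#":
            j = i + 1
            while j < len(spec) and spec[j] in DIGITS:
                j += 1                     # digits run to a non-numeric char
            digits = spec[i + 1:j]
            if not digits or int(digits) > 255:
                raise ValueError("bad #nnn sequence")
            out.append(chr(int(digits)))
            i = j
        elif ch == "^":
            if i + 1 >= len(spec):
                raise ValueError("^ needs a following character")
            code = ord(spec[i + 1].upper()) - 64   # ^I -> 9 (tab)
            if not 0 <= code <= 31:
                raise ValueError("not a control character")
            out.append(chr(code))
            i += 2
        elif ch == "@":
            if i + 1 >= len(spec):
                raise ValueError("@ needs a following character")
            out.append(spec[i + 1])        # literal special character
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(repr(decode(",;#32^I")))   # comma, semicolon, space and tab
```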


4. Sending Keystrokes to the Command

REP can supply a limited amount of interactive input to the
commands that it executes. For example, a program may require
that you answer a Yes/No prompt before it continues with the
rest of its activities. REP handles this by temporarily taking
over the keyboard interrupt handler. It requires IBM BIOS
compatibility to function correctly. The number of keystrokes
that REP can send to the command is not limited by the size of
the keyboard buffer. Key stuffing in REP is compatible with
Superkey and other programs which expand the keyboard buffer
size. You can specify up to 255 characters which REP will pass
along to the command being executed.

REP also overrides the keyboard clearing operation that many
programs perform at startup. Certain programs may work around
the approved clearing technique, but REP works at least with
Turbo Pascal, WordStar and DOS commands such as FORMAT.

To send keystrokes, use the following option:

-K Keys
Send Keys to the keyboard buffer before command begins.

Specify Keys exactly the same way as Delims in the previous
option. In addition, you may use Parser Symbols within Keys.

Some programs (PC-File III version 3 is an example) cannot
keep up with the speed of machine-generated keystroking. To
slow down the keystrokes for such programs, use the -Wn
option:

-Wn
Wait n keystrokes.

If it appears that your program cannot keep up, try -W2 or
larger. You can send function and <Alt> keys to your commands:
simply use a #0 to start the sequence and follow that by
the scan code of the character. For example, <Alt>A is #0#30,
function key <F1> is #0#59, and shift <F1> is #0#84. You can
find these scan codes in your BASIC manual or the Turbo Pascal
version 3.0 manual.


5.Double-Checking Commands

You may be concerned that a command repeater could make an
error and run wild, deleting everything on your disk. This is
possible, but REP provides options to control the possibility.
It is especially important to use these control options when
you are developing a new REP application.

-Q
Query (Yes/No/Quit) before proceeding with each command.

-X
eXit after DOS errors (otherwise, the next data line is
attempted).

-W File
Write commands to File without executing them. You can check
the contents of this File to make sure it will do what you
intend. The file can then be executed as a batch file. Default
extension for File is .BAT. It may be preceded by optional
drive and pathname.

Note that by default, REP does none of these.

A DOS error is triggered if REP tries to move to a
non-existent directory, or tries to execute a non-existent
command, or tries to open a non-existent file as standard
input or output. Whenever these conditions are encountered,
the command line in question is not executed. However, when
the -X option is specified, REP gives up totally and does not
attempt any more data lines.

REP cannot tell whether the command that was executed had its
own internal error unless that command sends back a return
code. Unfortunately, most programs do not provide return
codes, but REP tests them anyway, just in case. The TurboPower
Utilities themselves provide return codes.

If for some reason you need to abort the operation of REP, you
should hold the <Ctrl> and <Break> keys down continuously
until the program is stopped. REP checks for breaks in several
ways, but because it can nest several levels of command
interpreters it may take several breaks to stop all
operations.


6.REP Output Options

You may wish to know where in a long list of commands REP
currently is, or you may wish to include the command line in
the output of the command being executed for later reference.
These possibilities are specified as follows:

-C
Append each Command line to command output before it is
executed.

-S
Write each command line to Screen before it is executed.

The first option, -C, looks at the output destination of the
command itself and appends the command line to that device or
file.

The second case writes the command to the screen, no matter
where command output is going.


7.Taking REP Command Line from a File

You will probably not want to repeatedly type REP command
lines once you have them figured out. There are two ways to
avoid this. First, you can make the entire REP command part of
a batch file, possibly with batch parameters passed into REP.
Second, you can have REP read its command line from a file as
specified by the following option:

-F File
Read first line of File and include it into REP command line.

File is any valid DOS filename (including optional drive and
pathname). If no pathname is supplied, and the file is not found
in the current directory, REP searches the PATH environment to
find File. Note that only a single line of File is read. Other
lines may be used as comments. -F commands may be nested to
arbitrary depth. The default extension for File is .REP.

By means of -F options or batch commands, REP calls may be
nested to any depth, limited only by available RAM. Each time
REP is called, it reduces the amount of RAM available to a
called program by about 40K bytes.


8.Miscellaneous Options

-B
Beep after each command is completed.

-Rn
Repeat command n times. When -R is specified, REP will not
read any data from the standard input, and thus will not supply
any parsed words to the command. -R is meant for brute force
repetition, such as the example of printing 107 disk labels.


D. Examples

Many of the following examples are general purpose, and are
best implemented as single line batch files. In the following,
some commands have been split over two lines in order to fit
the dimensions of the manual. They should of course be typed
on a single line when you use them.

root -t -f *.bak | REP {del @0} -q

Finds all .BAK files (in all subdirectories of the default
drive) and deletes them after first getting confirmation.

root -t -w | REP {sdir @1 -b90 -dt} -a oldfiles.dat -c

Makes a file OLDFILES.DAT containing all filenames whose
contents were not modified in the last ninety days.

root -t -w | REP {sdir @1@B*.bak -b90 -w} | REP {del @0}

Deletes all .BAK files older than 90 days (no confirmation).
Note the use of @B to append a backslash to the pathname.

root -t -w | REP {sdir *.bak -b90 -w} -m @1 -h |
REP {del @0} -w delfile.bat

Illustrates another way of approaching the previous example,
using the -m (move to directory) option and the -h (return
home) option. This also illustrates the use of the write batch
facility. This command does not execute any deletions. Instead
it builds a batch file DELFILE.BAT that contains the deletion
commands. You can then examine the batch file and execute if
it is OK.

root a: -w | REP {sdir @1 -w} | REP {copy @0 @L} -d \ -s

Copies all files in all subdirectories of drive a: to the
current default directory.

root -w c: | REP {sdir -t @1} -o prn

Prints a directory of every subdirectory on disk C:.

sdir rpl*.inc | REP {print @1.@2}

Prints all files in the current directory which match
rpl*.inc. Note that a literal period . must be inserted into
the output of SDIR to get a legal filename.

REP {print dlabel.dat} -r107

Prints 107 copies of dlabel.dat.

REP {rpl -n -m lineval} <make.rpl
-i @0 -a lineval.loc -c -s

MAKE.RPL contains a line naming each source file that goes
into making up the program RPL. This command then creates a
file named LINEVAL.LOC that contains the name of the file,
then each line number and program line containing the word
lineval. In using RPL, your match patterns may not contain the
characters {, }, # or ^ unless you hide them using the -F
option of RPL, or "escape" them with the REP character @.

REP {diff @1.@2 @1.BAK -c} -d . -a rpl.dif -c

Runs DIFF on all files making up RPL and writes the number of
differences between each file and its backup to RPL.DIF. Note
the only parser delimiter is a period. If a .BAK file doesn't
exist, REP will continue with the next input file.

sdir | REP {ECHO @4 @3} -a sizedate.dat

REP acts as a simple parser to put the date and size of each
file in the current directory into the file sizedate.dat.

root \ -w | REP {sdir @1 -mb -to -s1024} -a baksize.dat

Builds a file baksize.dat that shows the size of files
requiring backup in every directory of a disk. Sector rounding
assumes that you will write backup to a DSDD floppy. See the
supplied batch file BAKUP.BAT for a more sophisticated version
of this function.

REP {mybatfil @1 @2 2311 1266 6697} -q

MYBATFIL is your own batch file requiring 5 parameters. REP
waits for you to type the first two parameters followed by a
<CR>, then echoes the command line and gets you to confirm
it before it runs the batch job.

-----------------------------------------------------------

8. DIFF - Text File Difference Finder

A. Purpose

DIFF is used to compare the contents of two text files and
then report any differences. It uses a sophisticated algorithm
to regain synchronization after unmatched lines are found. It
therefore can be used to compare files with an arbitrarily
large number of differences, or files with a very small number
of differences (including zero).

DIFF offers a number of features that generalize the
comparison process. By default, DIFF compares every character
of each file. Optionally, certain types of differences can be
ignored. These include the case (upper/lower) of characters,
blanks and tabs, blank lines and formfeeds, arbitrary
characters, and comments delimited as they are in Pascal.

One application of DIFF is as an archival mechanism. To reduce
archival storage space, only the differences between
successive file versions need be stored. To support this
application, DIFF offers an output format which is an EDLIN
editor script. With this script, the previous version of a
file can be automatically generated from the current version.
Use of this script will be described in detail in the Examples
section.

DIFF also provides output formats meant to be read by humans.
These are intended to answer the general questions, "Are these
two files different? and (perhaps) where? and (perhaps) how?"
The various options will be described below.


B. Usage

Simply typing DIFF will initiate an interactive prompting
session wherein DIFF asks about everything it needs to know.
This is referred to as prompt mode. To fully understand the
meaning of the prompts, it is worthwhile to study the options
section below. To get started, however, you can just dive in
and type DIFF.

DIFF may also be called with command line arguments. In this
case, the command line format is:

DIFF [options] OrigFile ModFile [output redir]

Options will be described in detail below. OrigFile and
ModFile are any valid DOS filenames whose contents are ASCII
text. Both files must be specified. There is no default
extension. Files should be found in the current default
directory unless preceded by an optional drive and pathname.

DIFF sends its results to the standard output device. By
default this is the screen. To send results to a file, to the
printer, or to the input of another program use the MS-DOS 2.X
redirection facilities.

DIFF maintains a status line during execution showing its
current location. This status information will NOT be
redirected when the standard output is redirected.

DIFF is designed to be used only with ASCII text files. Using
it with binary files will probably overflow DIFF's input
buffers and is sure to produce garbage for results. Each line
of the ASCII text file must be terminated by at least a
<carriage return> (ASCII character 13). Line feed characters
(ASCII 10) are ignored. The longest line allowed is 1024
characters. During the comparison, any line longer than 1024
characters will be broken at 1024 and an extra <carriage
return><linefeed> sequence inserted. This may lead to
unexpected differences showing up in the output.
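
The line rules just described (carriage return ends a line, line
feeds are ignored, long lines are broken at 1024 characters) can
be sketched in Python as follows. This is an illustration only,
not DIFF's own code. The names split_lines and MAX_LINE are
hypothetical, and keeping a final line that has no terminating
carriage return is an assumption.

```python
MAX_LINE = 1024  # longest line the manual says DIFF allows


def split_lines(data: bytes) -> list[bytes]:
    """Split raw file contents into comparison lines."""
    text = data.replace(b"\n", b"")       # ASCII 10 is ignored
    pieces = text.split(b"\r")            # ASCII 13 terminates a line
    if pieces and pieces[-1] == b"":
        pieces.pop()                      # nothing follows the last CR
    lines = []
    for piece in pieces:
        if not piece:
            lines.append(piece)           # an empty (blank) line
            continue
        # Break any over-long line into chunks of at most MAX_LINE.
        for start in range(0, len(piece), MAX_LINE):
            lines.append(piece[start:start + MAX_LINE])
    return lines
```

A 2500-character line therefore becomes three lines of 1024,
1024 and 452 characters, which is why such a file can show
differences that are not visible in an editor.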

DIFF contains an integral MORE filter. Whenever output is
written to the screen (not redirected to a file or any other
device), and 24 lines of output are written, DIFF will prompt
"MORE?". Enter a space or a Y to get another screenful, a
<CR> to get another line, or any other key to quit.

The largest files that DIFF can compare are limited by RAM
space, and by the degree of dissimilarity of the two files.
DIFF uses all available memory, up to a maximum of 128K bytes,
for internal text string storage from the two files being
compared. As a result, DIFF can generally compare two files
each up to 64K bytes in size, but with no more than 3000 lines
per file. However, if you have the source code, these limits can
be raised by changing the constant declarations in DIFF.PAS.
If two files bearing no relation to one another are compared,
the maximum file size may be less. If any of the DIFF options
which disregard text features are being used, the maximum file
size is reduced by roughly half, since DIFF must internally
store both the original and compressed text lines. If DIFF
runs out of memory, it will complain and halt gracefully. If
this happens you should divide your files into smaller pieces
before running DIFF. DIFF can be interrupted by typing
or . Output can be temporarily stopped by
typing and then continued by typing .


C. Command Options

1.Disregarding Differences

By default, none of these options is active.

-DC
Disregard Case of alphabetic characters. All characters are
converted to upper case before comparison (this occurs
internally to the program, and does not affect the input
files, or the appearance of the output).

-DS
Disregard Spacing. All blanks and tabs are removed from each
text line before comparison. This includes leading blanks,
trailing blanks, and blanks in the middle of the line.

-DB
Disregard Blank lines. Any line consisting of only <CR>
(carriage return), <CR><LF> (carriage return linefeed), or the
combination of either of these with a formfeed <FF>, will be
ignored. Please note that the line numbers reported by DIFF in
this case count only the non-blank lines.

-DP
Disregard Pascal comments. Any sequence of text that is
properly delimited by the Pascal comment identifiers (which
are { } (* *) ) will be ignored. Comments may extend across
multiple lines and still be properly handled. Note that the
appearance of these comment delimiters within Pascal literals
(e.g., string assignments) will still cause DIFF to interpret
the comment delimiters. Watch out for this!

-DK Keys
Disregard any characters contained within the list Keys. Keys
are specified just like Keys or Delims for the REP utility.
You may specify up to 63 characters in the key list.

Again, note that using the Disregard options will reduce
DIFF's file capacity limits. The exact amount of reduction
also depends on the extra degree of similarity gained by
disregarding text features.

If you wish to compare two Pascal files, one of which has been
formatted by PF, and the other of which hasn't, you should use
the two options -DC and -DS. Then DIFF reports only
substantive differences.
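
The effect of the -DC, -DS and -DK options on a single line can
be sketched in Python as below. This is a simplified
illustration, not DIFF's implementation. The function name
comparison_key is hypothetical, the order in which the options
are applied is an assumption, and the multi-line -DP and -DB
options are left out.

```python
def comparison_key(line: str, *, case: bool = False,
                   spacing: bool = False, keys: str = "") -> str:
    """Return the compressed form of a line that is actually compared."""
    if keys:
        # -DK: drop every character that appears in the key list.
        line = "".join(ch for ch in line if ch not in keys)
    if spacing:
        # -DS: remove all blanks and tabs, wherever they occur.
        line = line.replace(" ", "").replace("\t", "")
    if case:
        # -DC: fold to upper case (internally only).
        line = line.upper()
    return line
```

With -DC and -DS both active, "begin  x := 1;" and "BEGIN X:=1;"
compress to the same string and are reported as matching, which
is the behavior wanted when comparing formatted and unformatted
Pascal source.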


2.Formatting the Output of DIFF

-S
Build an EDLIN Script. In this mode DIFF builds a list of
editor commands which will automatically convert OrigFile into
ModFile. We will describe how to run this script in the
Examples section below. By default, Script mode is not active.

When -S is selected, all Disregard options are ignored. Script
mode is intended to produce an exact replica of ModFile, and
therefore no text features can be disregarded. When -S is
selected, none of the other Output Formatting options can be
activated. -S is a self-contained option. However, the
miscellaneous options described in section 3 below are still
available.

-M
Send Matched lines of the two files to the output. By default,
this option is not active. If any of the Disregard options are
active, the full versions of two lines may be different
although they match after compression. In this case, DIFF will
display the text line as it appears in ModFile.

-ND
Do Not send Deleted lines to the output. Deleted lines are
those that appear in OrigFile but do not appear (in the same
relative order, at least) in ModFile. By default, all deleted
lines are sent to the output.

-NI
Do Not send Inserted lines to the output. Inserted lines are
those not found in OrigFile that must be inserted (or
appended) to create ModFile. By default, all inserted lines
are sent to the output.

-NH
Do Not send a Header line for each line sent to the output.
The header line tells whether the text line following is
matched, deleted, or inserted. It also gives the line number
of the text line. The line number refers to position in
OrigFile for deleted lines, and to the position in ModFile for
inserted lines. For matched lines, the position in both
OrigFile and ModFile is reported. By default, a header line is
sent with each text line that goes to the output.

-B
Use a Block format in reporting the output. In this mode, all
consecutive instances of inserts (or deletes, or matches) are
grouped into a continuous block of output, preceded by a
single header line. In this case the header line reports the
range of line numbers affected. This format is easier for
humans to look at, but harder for machines to interpret. The
-M, -ND, and -NI options may be applied in combination with
block mode, but the headers cannot be turned off. By default,
block mode is not active.

-C
Count number of differing lines and send this number to the
output. This mode turns off all other output formatting modes
and produces just a single number. By default, this mode is
not active.
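
The grouping performed by block format (-B) can be sketched in
Python as follows. This only illustrates the idea of collapsing
consecutive lines of one kind under a single header with a line
range. The record layout and the name group_blocks are
hypothetical, and the actual header text written by DIFF is not
reproduced here.

```python
from itertools import groupby


def group_blocks(records):
    """Group consecutive records of the same kind into blocks.

    records: sequence of (kind, line_number, text) in output order,
    where kind is "match", "delete" or "insert". All three kinds
    are assumed to be present, so matched lines separate the runs.
    Returns a list of (kind, first_line, last_line, texts).
    """
    blocks = []
    for kind, run in groupby(records, key=lambda rec: rec[0]):
        run = list(run)
        first_line = run[0][1]
        last_line = run[-1][1]
        texts = [text for _, _, text in run]
        blocks.append((kind, first_line, last_line, texts))
    return blocks
```

Each returned tuple corresponds to one header line followed by
its block of text; options such as -ND or -NI would simply skip
the blocks of the unwanted kind.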


3.Miscellaneous Options

-Wn
Write a maximum of n differences before quitting. n is an
integer between 1 and 32767. By default, all differences are
reported, no matter how many. n must immediately follow the W,
with no spaces intervening.

-P
Send Performance statistics to the screen when the run has
been completed. Performance statistics include the amount of
RAM space used and the processing rate (in lines per second).
By default, performance statistics are not printed.

-?
Display a help screen describing the command options. DIFF
halts after displaying the help screen.


D. Using the Script Mode

DIFF myfile.pas myfile.bak -s >myfile.s00

Produces an EDLIN script that will convert myfile.pas into
myfile.bak. The script is saved on a file named myfile.s00.
Once the script has been created, the .BAK file can be
deleted. A series of scripts having sequentially numbered
extensions can be stored to represent the entire history of a
file.

To use the script of this example to recreate rev 00 of
myfile, type

COPY myfile.pas myfile.r00
EDLIN myfile.r00 <myfile.s00

As EDLIN executes, you will see a trace of its activities sent
to the screen. When EDLIN finishes, myfile.r00 will be
identical to what myfile.bak was when you started.

If you do not wish to watch EDLIN's activities, call it as
follows:

EDLIN myfile.r00 <myfile.s00 >nul

If you want to save the editing trace, call EDLIN as follows:

EDLIN myfile.r00 <myfile.s00 >myfile.t00

Myfile.t00 will contain the trace.

If you have saved a series of scripts, and want to regenerate
a version of the file that is more than one version out of
date, then it is necessary to call EDLIN once for each script
between the current file and the desired one.

If you have edited the file since you created a script, and
you wish to regenerate an old version, you must use the .BAK
version of the file as a starting point, and not the most
current version. To avoid confusion, it is better to keep two
full versions of the file around at any time. One version is
the last one which was archived using DIFF, and the other
version is the current one. The previously archived version
should be named as something besides *.BAK so that it isn't
overwritten during editing.

E. Further Examples

DIFF -B xxx.old xxx.new

All differences between xxx.old and xxx.new will be sent to
the screen in block format.

DIFF -M -ND -NI -NH xxx.old xxx.new >xxx.mat

All of the matched lines between xxx.old and xxx.new will be
sent to the file xxx.mat. No header information, only the
text, is included.

DIFF -C -DC -DS -DB xxx.old xxx.new

Reports the number of lines that differ between xxx.old and
xxx.new after differences in case, spacing and blank lines are
disregarded.

DIFF -DK ;,.':()! -DC xxx.old xxx.new

Reports the differences after case and common punctuation are
disregarded.

-----------------------------------------------------------

9. RPL - Pattern Match and Replace

A. Purpose

RPL is used to find occurrences of specified text patterns in
a file and then optionally replace the matched text with other
text. RPL provides much more flexibility in what can be
matched and replaced than the typical replace command offered
by a text editor.

RPL is not an interactive program like an editor. It takes a
set of patterns and files as an input, and produces lines of
output based on all of the input. The input and output can be
of any size, limited only by disk space and an individual line
length of 1024 characters.

RPL takes its input from the standard input device and writes
its output to the standard output device. Because RPL is often
used in pipelines (where the output of one program feeds the
input of another program), the use of standard I/O is
important and powerful.

To specify match and replace patterns, RPL uses regular
expressions in the style of the Unix operating system. If you
haven't seen regular expressions before, you will probably
think that a sparrow has been hopping on the top row of your
keyboard when you see your first one. However, like the APL
language, regular expressions carry a heavy load of meaning in
a small amount of space.

To get you started with RPL, about a dozen RPL application
commands have been supplied. As you will see in the Examples
section, these commands can handle some useful and complicated
problems. Because RPL application commands are often general
purpose and reusable, RPL is designed to provide convenient
access to libraries of these commands.

Those familiar with Unix text utilities will find that RPL is
somewhere between EGREP and AWK with regard to the complexity
of problems that it can solve. Several words of the regular
expression syntax have been changed from their format in Unix
in order to mesh well with MS-DOS.


B. Usage

Simply typing RPL will initiate an interactive prompting
session. To fully understand the meaning of the prompts, it is
worthwhile to study the material below. If you have not used
regular expressions before, you should definitely read the
sections below.

RPL may be called with command line arguments as follows:

RPL [-F CommandFile] [options] [I/O redir]

Options are described in detail below.


1.Command Files

The entry -F CommandFile specifies that RPL should read the
first line of a text file named CommandFile and interpret it
as part of the command line. To facilitate the use of
libraries of command files, RPL will search the system PATH
environment in order to find CommandFile (after checking the
current default directory). This allows all command files to
be kept in one place (e.g. on your utilities diskette, or in
the top level directory of your hard disk) and accessed from
anywhere on the system. The default extension of a command
file is .PAT.

Note that ONLY the first line of CommandFile is read and
interpreted as a command line. This line may be up to 255
characters long. The line read from CommandFile may also
contain -F options, which allows nesting of command files to
arbitrary depth. Since lines after the first are not
interpreted, they may be used for comments describing the
purpose of the command they follow.
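
The nesting rule for -F can be sketched in Python as below. This
is an illustration under stated assumptions, not RPL's own code.
The name expand_command_files is hypothetical, and so is the
read_first_line callback, which stands in for locating the file
(default extension, current directory, then PATH) and returning
its first line. The max_depth guard is also an addition of this
sketch, to stop a file that includes itself.

```python
def expand_command_files(args, read_first_line, max_depth=16):
    """Replace each '-F name' pair by the first line of that file."""
    expanded = []
    i = 0
    while i < len(args):
        if args[i].upper() == "-F":
            if i + 1 >= len(args):
                raise ValueError("-F must be followed by a file name")
            if max_depth == 0:
                raise RecursionError("command files nested too deeply")
            # Only the first line counts; later lines are comments.
            line = read_first_line(args[i + 1])
            # Expressions may not contain blanks, so a plain split
            # on white space recovers the individual arguments.
            expanded.extend(
                expand_command_files(line.split(), read_first_line,
                                     max_depth - 1))
            i += 2
        else:
            expanded.append(args[i])
            i += 1
    return expanded
```

Because the included line is expanded recursively before being
spliced in, a command file can itself name further command
files, as described above.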

Any single command option or regular expression must be
specified completely on a single line. See the Options section
below.


2.Specifying Input and Output

If you call RPL with any options on the command line, you must
specify an input file as part of the command line (or input
must have been previously redirected). If you call RPL with
nothing on the command line, you will be prompted for an input
filename.

The input file contains the text which will be read to find
matches. Only a single input file may be specified (this
differs from Unix, which allows multiple files). The file may
be specified simply by name, or preceded by the MS-DOS input
redirection character <. This file will be found in the
current default directory, or can be preceded by a drive and
pathname if you wish.

RPL writes its output to the standard output device. By
default this is the screen. You may redirect the output by
using any of the MS-DOS 2.X techniques.

You should NOT redirect the output to the same file from which
the input is taken. Because line lengths may change
significantly during transformation, this might write output
over unread input.


3.Behavior at Runtime

If RPL cannot find a file, or if the syntax of a regular
expression or command option is wrong, it will complain (in a
somewhat helpful manner) and halt.

You can obtain a summary of RPL syntax at any time by typing
the command

RPL -?

RPL contains an integral MORE filter. Whenever output is
written to the screen (not redirected to a file or other
device) and 24 lines of output are written, the program will
prompt "MORE?". Enter a space or a Y for another screenful, a
<CR> to get a single line, and any other key to quit. For
some RPL patterns, the MORE filter may not be effective, since
RPL can write one line containing many linefeeds.

If RPL standard output is not writing to the screen, it will
maintain a status indicator that shows the current line number
within the input text file. The line processing rate of RPL
varies significantly, depending on what patterns were
specified and what the input text is. For the application
commands supplied with RPL, the processing rate varies from
about two lines per second to over fifty lines per second, so
you may want some reassurance as to where RPL is.


C. Command Options

RPL allows control over several independent aspects of the
matching and replacing process. These aspects include
specifying the regular expression patterns, deciding what
types of lines to output, and miscellaneous formatting
commands. These areas will be described one at a time.

1.Specifying the Regular Expressions

RPL allows up to three regular expressions to be applied
during a single pass over a text file. The three types of
expressions are Select/Avoid, Match, and Replace.

A Select/Avoid expression is used exclusively to determine
which lines of text are suitable for further processing. If a
line of text does not meet the Select/Avoid criteria, no
further matching or replacing will be done on that line.
Whether that line is output or not comes under the control of
the output options described in the next section. Generally, a
Select/Avoid expression should be a relatively simple one to
filter out lines that may waste a lot of time in more complex
Match and Replace expressions.

Use of a Select/Avoid expression is optional. Either a -S or a
-V option may be specified, but not both. These expressions
mean:

-S Select
Use only lines from the input file that match the regular
expression Select.

-V Select
Use only lines from the input file that do NOT match Select.

Select represents a regular expression in the format to be
described. The regular expression must be separated from the
option letter by at least one blank, but otherwise must
immediately follow it on the command line. The option letter
and the regular expression must be found on the same command
line (in the same command file).

A Match expression is also optional. However, you must provide
at least one Select, Avoid, or Match expression, or RPL will
have nothing to work with.

The Match expression is the main workhorse of RPL. Replacement
of text is based on what the Match expression finds. To
specify a Match expression, use the following command option:

-M Match
The syntax of this option is the same as for the Select/Avoid
option.

Finally, an optional Replace expression can be specified. This
expression tells RPL how to go about replacing text that was
matched. To specify a Replace expression, use the following
command option:

-R Replace
The syntax of this option is the same as for the Select/Avoid
option (although the regular expression syntax is different,
as you will see later).


2.Deciding What Lines to Output

Depending upon what regular expressions have been specified,
there are many combinations of the input lines that we may
want to output. We provide three additional command options to
control the selection:

-US
Output Un-Selected lines.

-OM
Output Only Modified lines.

-OC
Output only a Count of the lines that would otherwise have
been written out.

In the following table we describe all of the combinations of
expressions, options and output behavior. 1 means the expression
or option has been specified. The -OC option is not shown in the
table. It always has the effect of forcing just a count of the
lines that would otherwise have been written out.


                   Output Combinations

 Expressions      Options
 S/V   M    R     US   OM    Lines Output
 ---  ---  ---    ---  ---   ------------------------
  1    0    0      0    0    selected only
  0    1    0      0    0    matched only
  1    1    0      0    0    selected AND matched
  0    1    1      0    0    unmatched unchanged,
                             matched modified
  1    1    1      0    0    selected AND unmatched
                             unchanged, matched
                             modified
  1    1    1      1    0    unselected unchanged,
                             selected AND unmatched
                             unchanged, matched
                             modified
  0    1    1      0    1    matched modified
  1    1    1      0    1    selected AND matched
                             modified
  1    1    1      1    1    unselected unchanged,
                             selected AND matched
                             modified

Combinations not shown will produce either an error or trivial
results. Perhaps the best way to get the feel of these will be
to study the examples provided.

The operation of a find function in a typical editor is like
that of the second row of the table.

The operation of the replace function in a typical editor is
like that of the fourth row of the table.

The Select and Match expressions combine in a logical AND
fashion. Text must match BOTH expressions to be considered for
output.

The Select/Avoid expression supports the additional function
of "avoiding" text. Only text which does NOT match the
expression will be considered for output or modification.
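
The rows of the Output Combinations table can be condensed into
one small decision function, sketched here in Python. It covers
only the combinations listed in the table and is not taken from
RPL's source. The name line_fate is hypothetical. The sketch
assumes that a missing Select/Avoid expression counts as
"selected", that a missing Match expression counts as "matched",
and that for -V the caller passes selected as the negation of
the expression's result.

```python
def line_fate(selected: bool, matched: bool, *, replace: bool,
              us: bool = False, om: bool = False) -> str:
    """Return "modified", "unchanged" or "dropped" for one line.

    replace: a Replace expression was given (-R)
    us:      -US, output un-selected lines
    om:      -OM, output only modified lines
    """
    if not selected:
        # Unselected lines are never matched or replaced.
        return "unchanged" if us else "dropped"
    if not replace:
        # Pure search: Select and Match combine as a logical AND.
        return "unchanged" if matched else "dropped"
    if matched:
        return "modified"
    # Selected but unmatched: kept unless only modified lines are wanted.
    return "dropped" if om else "unchanged"
```

With -OC, the program would write only the number of lines whose
fate is not "dropped" instead of the lines themselves.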


3.Miscellaneous Formatting Commands

RPL supports two additional formatting commands:

-N
Write the line number (from the input text file) to the start
of each output line.

-I
Ignore case. All text is internally converted to upper case
before matching proceeds. This does not affect the appearance
of the output.


D. Select/Avoid and Match Expressions

Regular expressions are combinations of normal text characters
and special characters that are interpreted by RPL. A regular
expression may be as simple as a fixed text string (e.g.,
PROCEDURE will match all instances of the word PROCEDURE) or
it may be extremely complicated and match a broad class of
text strings.

A Match expression operates on a single line of text at one
time. No match can span multiple lines of text.

Select/Avoid and Match regular expressions are composed of the
following:

A ? matches any single character except newline. A newline is
really two characters in a specific order -- <CR>
followed by <LF>. To match a newline, you must always
explicitly specify a newline.

A ^ matches at the beginning of a line only. A ^ occurring
ANYWHERE in the match expression (except within a character
class) is interpreted in this manner. This allows meaningful
use of ^ in combination with grouping or alternation (see
below).

A $ matches at the end of a line only. As with ^ the $
character retains its special meaning anywhere within the
expression (except in a character class).

A \ followed by a single character matches that character. In
this way \* will match an asterisk, \\ will match a backslash,
\$ will match a dollar sign, etc. However, the following
sequences are special:

\s space (ASCII #32)
\t tab (ASCII #9)
\b backspace (ASCII #8)
\r return (ASCII #13)
\l linefeed (ASCII #10)
\n newline (#13 followed by #10)
\i input redirection <
\o output redirection >
\p pipe character |
\w word delimiter
Matches any of the following:
\t\s!"&()*+,-./:;<=>?@[\]^`{|}~
\h hex character
Matches any of 0123456789ABCDEF

Case is ALWAYS significant when using the special characters.
Thus \s will match a space while \S will match a capital
letter S.

When a regular expression is specified on a command line or in
a command file, the regular expression may NOT contain any
blank space. The special characters above should be used to
produce instances of blanks and tabs.

A single character not otherwise endowed with special meaning
matches that character. Thus z matches a single instance of
the letter z. Intuitive, huh?

A string enclosed in brackets [] specifies a character class.
Any single character in the string will be matched. For
example, [abc] will match an a, b, or c. Ranges of ASCII
letters and numbers may be abbreviated as, for example,
[a-z0-9]. If the first symbol following the [ is ^ then a
negative character class is specified. In this case, the
string matches all characters EXCEPT those enclosed in the
brackets. For example, [^a-z] matches everything except lower
case characters (and newlines). The special characters defined
above may be used inside of character classes with the
exception of \n, \w and \h, which are shorthand for their own
character classes. If the characters - or ] are to be used
literally inside of a character class, they should be preceded
by the escape character \. Note that *?+(){}!^$#& are not
special characters when found inside a character class.
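
Character classes carry over almost unchanged to most modern
regex engines. The sketch below uses Python's re module as a
stand-in for RPL (an assumption made purely for illustration;
the helper first_hit is hypothetical). One difference to keep
in mind: Python's negated class also matches newline
characters, whereas the text above says RPL's does not.

```python
import re

lower_or_digit = re.compile(r"[a-z0-9]")    # range abbreviation
not_lower = re.compile(r"[^a-z]")           # negative character class
hex_digit = re.compile(r"[0-9A-F]")         # the set that \h abbreviates
dash_or_bracket = re.compile(r"[\-\]]")     # - and ] escaped to be literal


def first_hit(pattern, text):
    """Return the first character of text matched by pattern, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None


print(first_hit(lower_or_digit, "ABC7x"))
print(first_hit(not_lower, "abcDef"))
```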

A regular expression followed by * matches zero or more
matches of the regular expression. This is referred to as a
closure. Thus ba*b matches the string bb (no instances of a),
bab (one instance), or baaaaaab (several instances).

A regular expression followed by a + matches one or more
matches of the regular expression. This is another type of
closure. In this case ba+b will not match bb, but it will
match bab, or baaaaaab.

A regular expression followed by a ! matches zero or one
matches of the regular expression. This is another closure.
Here, ba!b will match bb or bab, but not baaaaaab.
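
The three closures can be tried out with Python's re module,
assuming the correspondence * to *, + to +, and ! to ? (Python
writes "zero or one" as ? because it uses . rather than ? for
"any character"). The table CLOSURES and the helper matches are
hypothetical names used only for this sketch.

```python
import re

# RPL expression -> assumed Python equivalent
CLOSURES = {
    "ba*b": r"ba*b",   # zero or more a
    "ba+b": r"ba+b",   # one or more a
    "ba!b": r"ba?b",   # zero or one a
}


def matches(rpl_expr, text):
    """True if all of text matches the Python form of rpl_expr."""
    return re.fullmatch(CLOSURES[rpl_expr], text) is not None


for expr in CLOSURES:
    print(expr, [matches(expr, s) for s in ("bb", "bab", "baaaaaab")])
```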

Two regular expressions concatenated match a match of the
first followed by a match of the second. Thus (abc)(def)
matches the string abcdef.

Two regular expressions separated by # match either a match of
the first or a match of the second. This is referred to as
alternation. Any number of regular expressions can be strung
together in this way. Alternation matches are tested in order
from left to right, and the first match obtained is used. Then
the remaining alternate expressions are skipped over. (Unix
users note that we do not use a | character for alternation,
since it unavoidably causes MS-DOS to start a pipeline.) See
the next paragraph for an example.

A regular expression enclosed in parentheses () matches a
match of the regular expression. Parentheses are used to
provide grouping, and may be nested to arbitrary depth. Open
and close parentheses must be balanced. For example, the
following two expressions are not equivalent, and the second
probably expresses what was intended:

PROCEDURE#FUNCTION

(PROCEDURE)#(FUNCTION)

The first expression is equivalent to PROCEDUR(E#F)UNCTION.
The second expression matches either of the two well-known
words.
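
The difference between the two expressions can be checked in
Python, with the caveat that Python's | binds more loosely than
concatenation, the opposite of RPL's #. To reproduce RPL's
reading of the first expression, the sketch therefore has to
spell out the implied grouping. The helper hits and the word
list are hypothetical.

```python
import re

# PROCEDURE#FUNCTION as RPL reads it: only E and F are alternates.
first = re.compile(r"PROCEDUR(?:E|F)UNCTION")
# (PROCEDURE)#(FUNCTION): the two whole words are alternates.
second = re.compile(r"(?:PROCEDURE)|(?:FUNCTION)")


def hits(pattern, words):
    """Return the words in which pattern is found anywhere."""
    return [w for w in words if pattern.search(w)]


words = ["PROCEDURE", "FUNCTION", "PROCEDUREUNCTION", "PROCEDURFUNCTION"]
print(hits(first, words))
print(hits(second, words))
```

The first pattern finds only the two artificial words, while
the second finds all four, since each contains PROCEDURE or
FUNCTION.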

A regular expression enclosed in curly braces {} forms a
tagged match word. Whatever was matched within the braces may
be referred to by a Replace expression in a manner to be
described. Tagged match words may NOT be nested. Open and
close braces must be balanced. A maximum of nine tagged match
words may be referenced by the Replace expression. Note that
the use of curly braces in Select/Avoid expressions is
meaningless. However, these expressions share an expression
interpreter with the Match expressions, so no error will be
flagged. As an example, consider the expression b{a*}b. If the
string tested is bab, then the tagged match word will contain
a single a. If the string tested is baaaaaab, then the tagged
match word will contain aaaaaa. If the string tested is bb,
then the tagged match word will be empty.
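
A tagged match word behaves much like a capturing group in
other regex engines. As a rough analogy, assuming Python's re
module and writing the braces as capturing parentheses, the
b{a*}b example looks like this (tagged_word is a hypothetical
helper):

```python
import re


def tagged_word(text):
    """Return what the tag in b{a*}b would hold, or None if no match."""
    m = re.fullmatch(r"b(a*)b", text)
    return None if m is None else m.group(1)


for s in ("bab", "baaaaaab", "bb"):
    print(s, repr(tagged_word(s)))
```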

Regular expressions are interpreted from left to right. The
order of precedence of operators at the same parenthesis level
is [] then *+! then # then concatenation.

Tag braces are interpreted strictly from left to right and do
not control precedence in any way. The first tagged match word
found is given a tag of 1, the second a tag of 2, and so on up
to a maximum tag of 9. The tag number that each word receives
is based on when it is encountered in the line. If tags are
skipped over as a result of alternation, then any remaining
tags in a line will receive shifted tag numbers. For example,
consider the expression:

(FUNCTION)#({PROCEDURE})\s+{[^\s(]+}

If a line contains the word PROCEDURE then the word following
PROCEDURE will have a tag number of 2. If a line contains the
word FUNCTION, then the word following FUNCTION will have a
tag number of 1. It is up to the user to take advantage of
this behavior. Generally, it is good practice to surround an
entire set of alternates with tag markers:

{(FUNCTION)#(PROCEDURE)}\s+{[^\s(]+}
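
Anyone porting such expressions to another engine should note
that the shifting behavior is specific to RPL. Python's re
module, for instance, numbers groups by their position in the
pattern, so a skipped group keeps its number and simply holds
None. The sketch below emulates RPL's numbering for the first
expression above by discarding skipped groups; PATTERN and
rpl_style_tags are hypothetical names, and a literal space
stands in for RPL's \s.

```python
import re

# Assumed Python form of (FUNCTION)#({PROCEDURE})\s+{[^\s(]+}
PATTERN = re.compile(r"(?:FUNCTION|(PROCEDURE)) +([^ (]+)")


def rpl_style_tags(line):
    """Number the tags in order of encounter, skipping unused ones."""
    m = PATTERN.search(line)
    if m is None:
        return {}
    encountered = [g for g in m.groups() if g is not None]
    return {i: g for i, g in enumerate(encountered, start=1)}


print(rpl_style_tags("PROCEDURE Foo(x)"))
print(rpl_style_tags("FUNCTION Bar(x)"))
```

With PROCEDURE the name lands in tag 2; with FUNCTION it lands
in tag 1, mirroring the behavior described above.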


E. Replace Regular Expressions

Replace regular expressions are constructed the same way as
Match regular expressions, but the number of operators is
reduced. The replacement process occurs in the following
manner: any text of the input line that does not match the
Match expression is sent to the output unchanged. The Match
expression will find a string of text that starts at the
leftmost position in the input line that matches, and
continues to the rightmost position that matches. The string
of matched text is operated upon by the Replace expression and
output. The Match expression is then tried again on the input
line, starting at the first position beyond the previous match
string. This recurs until the end of line is found.
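
The scan described above can be sketched as a short loop. This
is an illustration in Python, not RPL's code: replace_line and
its render callback are hypothetical, and Python's re.finditer
is assumed to approximate RPL's leftmost, longest-reaching
search.

```python
import re


def replace_line(pattern, render, line):
    """Copy unmatched text through and pass each match to render."""
    out = []
    pos = 0
    for m in re.finditer(pattern, line):
        out.append(line[pos:m.start()])   # unmatched text, unchanged
        out.append(render(m))             # output for the matched text
        pos = m.end()                     # resume just past the match
    out.append(line[pos:])                # remainder of the line
    return "".join(out)


# Replace each run of a's with its length in angle brackets.
print(replace_line(r"a+", lambda m: "<%d>" % len(m.group(0)), "baaab a"))
```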

Replace expressions are composed of the following:

When a regular expression is specified on a command line or in
a command file, the regular expression may NOT contain any
blank space. The special characters below should be used to
produce instances of blanks, tabs and the null expression.

If a null Replace expression is desired, the -R keyword must
either occupy the last position on its input line or use the
special symbol \z to indicate a null expression. Null Replace
expressions are used to delete text strings from a file.

A single character not otherwise endowed with special meaning
is sent to the output.

A \ followed by a single character sends that character to the
output. In this way a \& will write an ampersand and a \\ will
write a backslash. However, the following sequences are
special:

     \s   space (ASCII #32)
     \t   tab (ASCII #9)
     \b   backspace (ASCII #8)
     \r   return (ASCII #13)
     \l   linefeed (ASCII #10)
     \n   newline (#13 followed by #10)
     \i   input redirection <
     \o   output redirection >
     \p   pipe character |
     \z   null expression

Unless a newline combination was explicitly matched in the
Match expression, it is not necessary to explicitly specify
newlines in the Replace expression. Each newline of the input
text line will be written out in the unmatched category of
output. Extra newlines may be added or separate lines may be
combined by understanding this feature (see the examples).

Another special case occurs when \ is followed by a single
digit in the range of 1 through 9. In this case the tagged
match word found by the Match expression is sent to the
output. If a tagged match word for that tag number was not
defined, or if the tagged match word didn't match anything,
then nothing is output. The tagged match words may be output
in any order and may be repeated any number of times.

An & appearing in the Replace expression causes all text
matched by the match expression to be sent to the output. &
may appear in the Replace expression as many times as desired.
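
For comparison, the same two ideas exist in Python's re.sub,
where \1 through \9 refer to groups and \g<0> plays the role of
RPL's &. The sample line and both patterns below are made up
for illustration and are not taken from the RPL distribution.

```python
import re

line = "width=80 height=25"

# Like a Match of {[a-z]+}={[0-9]+} with a Replace of \2=\1:
# the two tagged words are written out in the opposite order.
swapped = re.sub(r"([a-z]+)=([0-9]+)", r"\2=\1", line)

# Like a Match of [0-9]+ with a Replace of &&:
# the whole matched text is written out twice.
doubled = re.sub(r"[0-9]+", r"\g<0>\g<0>", line)

print(swapped)
print(doubled)
```

In both cases the text between matches (here, the space) passes
through untouched.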


F. Examples

We will start with some simple examples and work up to
complicated ones. Some of the example commands are split onto
two lines of the manual. When you actually use them, they
should be typed on a single line.

RPL -N -M PROCEDURE <myfile.pas

Finds all lines containing the word PROCEDURE in the file
myfile.pas and writes them to the screen, preceded by their
line number.

RPL -I -N -M (PROCEDURE)#(FUNCTION) <myfile.pas

Does the same thing as the example above, but also shows lines
containing the word FUNCTION. The matching is not
case-sensitive.

RPL -M oldvariable -R newvariable <myfile.pas >myfile.new

Replaces all instances of the word oldvariable with the word
newvariable in myfile.pas. The modified text is written to the
file myfile.new.

RPL -M ^{?+[^\s]}\s* -R \1 <myfile.txt >myfile.cln

Strips trailing blanks from each line of the file myfile.txt.
Output is sent to the file myfile.cln. The Match expression
can be explained as follows:

Anchor at the beginning of the line ^
Start a tag word {
Match one or more instances of any character ?+
Match one instance of any character but a space [^\s]
End the tag word }
Match any and all trailing spaces \s*

The Replace expression then outputs the tagged word, which
contains the entire line up to the last non-blank character.

Empty lines and lines consisting entirely of blanks are output
unchanged. Leading blanks are output as in the input file. A
more complex expression could be used to remove blanks from
all-blank lines as well.
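
The same expression can be tried in Python as a check on the
explanation above. The translation is an assumption: . replaces
?, capturing parentheses replace the tag braces, and a literal
space replaces \s, since RPL's \s means a space only while
Python's \s covers all whitespace. The names TRAILING and
strip_trailing_blanks are hypothetical.

```python
import re

# Assumed Python form of ^{?+[^\s]}\s*
TRAILING = re.compile(r"^(.+[^ ]) *")


def strip_trailing_blanks(line):
    """Apply the equivalent of -R \\1 to one line (no line ending)."""
    return TRAILING.sub(r"\1", line)


for sample in ("begin   ", "  x := 1;  ", "   ", ""):
    print(repr(strip_trailing_blanks(sample)))
```

Under this translation a line whose only non-blank character is
a single character in the first column is also left unchanged,
because the pattern needs at least two characters before the
trailing blanks.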

RPL -M ? -OC <longfile.pas

Counts the number of non-empty lines in longfile.pas. The
match command -M ? matches any line containing at least one
character (besides a newline). The -OC option says to present
just a count of the result.
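
A minimal Python sketch of the same count, assuming the text is
already in memory and that Python's . stands in for RPL's ?
(count_nonempty_lines is a hypothetical name):

```python
import re


def count_nonempty_lines(text):
    """Count lines holding at least one character besides the newline."""
    return sum(1 for line in text.splitlines() if re.search(r".", line))


sample = "program demo;\r\n\r\nbegin\r\n \r\nend.\r\n"
print(count_nonempty_lines(sample))
```

Note that a line containing only a space still counts, since a
space is a character.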

RPL -I -M \w*{['$#A-Z0-9]+}\w* -R \1\n file2

Strips word delimiters and puts one word per line into file2.
The Match expression can be explained as follows:

Find zero or more instances of any word delimiter \w*
Start a tag word {
Match one or more instances of any character that might
reasonably form part of a capitalized word ['$#A-Z0-9]+
End the tag word }
Find zero or more instances of any word delimiter \w*

The Replace expression then writes the tagged word and adds a
newline to put each word on a separate line.

This command will create empty lines mixed in with the lines
containing a single word. The blank lines may be removed with
the expressions of the previous example, or they may be
removed using a supplied application command as follows.

RPL -F rmblines <file2 >file2.cln

Calls the command file RMBLINES.PAT to remove all blank lines
from file2 and write the result to file2.cln.


If you intend to use an RPL application repeatedly, it will be
worthwhile to spend some time optimizing it for performance.
Regular expressions producing the same result can vary
significantly in processing speed. The following list
describes the relative performance of various RPL capabilities
(listed fastest to slowest):

fixed text strings anchored to the beginning of line

fixed text strings not anchored to the beginning of line

expressions including character classes, alternation or
grouping

expressions including closures (especially * and +)

In general, the longer the expression, the slower will be the
response.


