Dec 24, 2017
 
Utility for modifying ASCII text files. Similar to AWK.
File DO33.ZIP from The Programmer’s Corner in
Category Word Processors
File Name    File Size    Zip Size    Zip Type
DO.COM           36349       21702    deflated
DO.DOC           25362        7546    deflated


Contents of the DO.DOC file


-DO v3.3 $10 LARRIE HUTTON 10361 Glen Hannah,Laurel,MD 20723 301-604-3827-
(s) Strip LF to CR/LF or (x)-out LFs.    (p) Paragraph or (o)ne-line lines.
(w) WordStar from ASCII.                 (h) High bits out [WS to ASCII].
(u) Upper or (l)ower case only.          (q) Quicksort [2048 lins / 127 chr].
(n) Number lines; no trailing spaces.    (b) Bomb repeated & blank lines.
(e) Expand tabs with spaces.             (t) Tag lines within range.
(c) Copy lines between two columns.      (i) Insert, at column, a string.
(f) Find or (r)eplace string.            (a) Ask before replacing.
(j) Join or (m)erge-alpha two files.     (g) Glue two files side by side.
(k) Change all letters but 1st to low.   (d) Makes "Tom Mix,OH" "Mix,Tom,OH"
-----------------------------------------------------------------------------
NOTES on find, replace, and ask: Upper case (F,R,A) causes commands to
ignore case. ASCII values entered with #nnn#; eg, #13##10# is CR/LF.
-----------------------------------------------------------------------------
COMMAND LINE EXAMPLES (where: i=input l=line c=col s=string #n#=ascii)
-----------------------------------------------------------------------------
do s i             do p d:\dir\i        do w i > d:\dir\o    do h i
do u i             do q con: c c s      do n i > prn         do b con: > prn
do e i c           do t i +/-l +/-l     do c i c c           do i i c s s s
do r i s // s      do a i s#n#s // s    do j i i +/-l        do g i i
-----------------------------------------------------------------------------
Choose from above (or "?" for cmnd ln sntx, "+" for 9 lns, "-" for 4 lns):


-----------------------------------------------------------------------------
D O . D O C

Larrie V. Hutton
10361 Glen Hannah Drive
Laurel, MD 20723
(301) 604-3827

This is admittedly pretty sparse documentation for a fairly complicated
program--the DO text file manipulator. I had hoped the program screen
would be self-explanatory. It wasn't, so here goes. The manual enclosed
here (to use the term loosely) is divided into three small sections: [1] a
BRIEF description of most of the less obvious commands, [2] a section
providing some details of the screen and command line formats with a few
sample batch file suggestions, and [3] a listing of the comments in the
source code. The last is included because some of you may find it
helpful to see the internal documentation, but I am not going to elaborate
(I have a job!). Incidentally, the last section was produced easily by
using DO to search for all lines with an asterisk.


INDIVIDUAL COMMANDS

(s) Strips LF's (ASCII 10) or CR's (ASCII 13) and replaces them with
a CR/LF combination. Useful for files that overwrite lines (have
only CR's) or begin just below the end of the previous line (have
only LF's).

(x) Rubs out LF's and leaves only CR's. Useful for uploading to UNIX
editors, for example, that count both CR's and LF's as valid line
terminators. Without this ability, files are often double spaced
when they should not be.

(p) Paragraphs are formed from single lines by replacing CR/LF
combinations with spaces; joins consecutive lines. Good for
putting ASCII text into form suitable for most word processors.
A CR/LF combination IS issued, however, for blank lines--in fact,
two CR/LF combinations are issued at blank lines to preserve the
"original intent".

(o) One-lines paragraphs--the previous command in reverse. Useful
for taking long text strings and turning them into proper ASCII
format. Lines are formed at the next word break whenever the
number of characters exceeds 60.
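
As a rough illustration of the rule just described (and not DO's own
source, which is not reproduced here), the Python sketch below breaks a
long line at the first word break after the running length passes 60
characters. The name one_line and the handling of repeated spaces are
assumptions made for the example.

```python
def one_line(paragraph: str, limit: int = 60) -> list[str]:
    """Split one long line into shorter ones.

    A line is ended at the first word break reached after its length
    exceeds `limit`, so output lines can run slightly past the limit.
    """
    lines = []
    current = ""
    for word in paragraph.split():
        current = word if not current else current + " " + word
        if len(current) > limit:
            lines.append(current)
            current = ""
    if current:
        lines.append(current)
    return lines
```

Note that this differs from a conventional word wrap, which would break
BEFORE the word that crosses the limit.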

(w) WordStar from ASCII. Not perfect, but does a pretty good job at
guessing, and much easier than hand-editing an ASCII file from
scratch.

(h) Reverse of the previous, but much easier to implement. Simply
clears the high bit of every character, so produces proper ASCII text.
Useful for peeking at a WS file.
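
The idea behind (h) fits in one line. The Python sketch below is an
illustration only (the function name is hypothetical): it masks each
byte with 7F hex, which clears bit 7 and leaves the low seven bits
untouched.

```python
def strip_high_bits(data: bytes) -> bytes:
    """Clear bit 7 of every byte, leaving plain 7-bit ASCII."""
    return bytes(b & 0x7F for b in data)
```

For example, the byte EF hex becomes 6F hex, the letter "o".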

(u) Upper case text only produced.

(l) Lower case text only produced.

(q) Really a merge sort for text, and a shell sort for numbers. Very
fast and flexible. Maximum number of lines shows on screen;
since routine uses the heap, the maximum file size is dependent
upon available RAM.

(n) Numbers lines (default is to use first eight columns for this
information). Also removes trailing blanks from all lines.
Sometimes the latter function is really what is needed; in that
case, number the lines and copy only from the ninth column and
beyond with the (c) option. Blank lines are counted, but the
numbers aren't printed.

(b) Bombs (deletes) all blank lines and duplicate lines. Sometimes
useful after sorting.

(e) Expands tabs with spaces. Default is eight spaces.

(t) Tags lines: Lets you copy lines 37-94 only, for example.
Entering -5 to -1 would produce the LAST 5 lines.
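
One way to picture the line numbering is the Python sketch below. It is
an assumption about the arithmetic, not a copy of DO's routine: line
numbers are taken to be 1-based and inclusive, and a negative number is
counted back from the end of the file, so -1 is the last line.

```python
def tag_lines(lines: list[str], start: int, end: int) -> list[str]:
    """Return lines start..end (1-based, inclusive).

    Negative values count from the end: -1 is the last line.
    """
    total = len(lines)

    def resolve(n: int) -> int:
        return total + n + 1 if n < 0 else n

    first, last = resolve(start), resolve(end)
    return lines[max(first, 1) - 1:last]
```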

(c) Copies only text that falls within the specified columns. If the
end column is not specified, assumes to end. If end column is
negative, pads with spaces if necessary to assure that all lines
are same length.
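
The three cases can be sketched in Python as follows. This is only an
illustration under stated assumptions: columns are taken to be 1-based
and inclusive, and copy_columns is a made-up name.

```python
from typing import Optional


def copy_columns(line: str, start: int, end: Optional[int] = None) -> str:
    """Copy the text between two columns (1-based, inclusive).

    end omitted  -> copy from `start` to the end of the line
    end positive -> copy up to column `end`, short lines stay short
    end negative -> copy up to column abs(end) and pad with spaces
    """
    if end is None:
        return line[start - 1:]
    last = abs(end)
    piece = line[start - 1:last]
    if end < 0:
        piece = piece.ljust(last - start + 1)
    return piece
```

With a negative end column every output line has the same width, which
is what the (g) command wants from its first file.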

(i) Inserts a string at any given column position. Could be used,
for example, to widen borders. If a CR/LF is inserted (with
#13##10#), causes lines to be split at that point into two lines.

(f) Finds strings in text. Again, can look for any ASCII string by
using #n# combinations. Upper case F causes case to be ignored.
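
A minimal Python sketch of the #n# notation follows. It assumes one to
three decimal digits between the # marks and leaves anything else alone;
how DO itself treats malformed or out-of-range values is not documented
here, so that part is a guess.

```python
import re


def expand_ascii(text: str) -> str:
    """Replace each #n# with the character whose code is n."""

    def substitute(match: re.Match) -> str:
        code = int(match.group(1))
        # Out-of-range values are left as typed (an assumption).
        return chr(code) if code < 256 else match.group(0)

    return re.sub(r"#(\d{1,3})#", substitute, text)
```

So "#13##10#" becomes a CR/LF pair, and "a#32#b" becomes "a b".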

(r) Similar, but replaces all instances of one string with another.
Can also be used to produce output that has only lines containing
the string, or all lines NOT containing the string (by using
"????" or "!!!!" as the replacement string). On the command line,
a double slash (//), with a space on each side, should separate
the old and new strings. Again, ASCII values can be entered
directly. An upper case R causes case to be ignored.

(a) Asks before replacing. Lets you see the line with and without
the replacement before you decide.

(j) Joins two files by letting you insert the second file at any line
in the first.

(m) Merges two files alphabetically. In other words, if you have two
sorted files, this will create a single sorted file. Since there
is no limit on file size, this would permit files too large to be
(q)uick sorted to be split into two (or many) files, sorted with
(q), and then merged.
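
The reason a merge needs no size limit is that it only ever holds one
line from each file at a time. The Python sketch below shows the general
technique; it is not DO's code, and it compares lines with Python's
plain string ordering, which may differ from the ordering DO uses.

```python
from typing import Iterable, Iterator


def merge_sorted(first: Iterable[str], second: Iterable[str]) -> Iterator[str]:
    """Merge two already-sorted sequences of lines into one."""
    a, b = iter(first), iter(second)
    x, y = next(a, None), next(b, None)
    while x is not None and y is not None:
        if x <= y:
            yield x
            x = next(a, None)
        else:
            yield y
            y = next(b, None)
    # One input is exhausted; copy the rest of the other.
    while x is not None:
        yield x
        x = next(a, None)
    while y is not None:
        yield y
        y = next(b, None)
```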

(g) Glues two files side-by-side. Most useful if the first file has
been padded with spaces to a consistent line length with the (c)
command.

(k) Change all letters but first letter in word to lower case; first
letter is capitalized. Two common uses would be to take a
mailing list with all caps and turn it into a more respectable-
looking file, or to clean up inconsistently formatted source
code.

(d) Takes the last word in the first field of a comma-delimited data
file and creates a new, first field. For example, mailing lists
that don't separate the last and first names with a comma (and
therefore can't be easily sorted by last name) will have a last-
name field first, followed by the rest of the name as the second
field. All other fields are unchanged. Rather special-purpose,
but can be a real time-saver.
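
The "Tom Mix,OH" example from the program screen can be reproduced with
the short Python sketch below. It is an illustration under assumptions:
fields are split at the first comma, words at the last space, and a
one-word first field is left unchanged. The function name is made up.

```python
def last_name_first(record: str) -> str:
    """Move the last word of the first field into a new first field."""
    first_field, comma, rest = record.partition(",")
    given, _, last = first_field.rpartition(" ")
    if not given:
        # Only one word in the first field: nothing to move.
        return record
    return last + "," + given + comma + rest
```

Applied to "Tom Mix,OH" this gives "Mix,Tom,OH".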

DETAILS OF OPERATION

Entering DO alone at the DOS command line brings up a work area that
lets you see the operation of each command. This is useful for
testing. If you do not wish to produce a new file (or printer
output), merely press RETURN when you are asked for the output file
name. You will be prompted for all other values. If you press RETURN
at these prompts, a default value (usually appropriate) will be
assumed.

Pressing the space bar during the operation of a command permits you
to step through the process--sometimes helpful to see what is going
on. Depending upon the command, the stepping process may occur
character-by-character or line-by-line. Pressing ENTER reverts to
full-speed operation. Pressing ESC causes a graceful abort at that
point, and the output file (if any) is closed with all processing up
to that point contained therein. DO NOT specify an extant file as
your output file unless you want to overwrite the file. Summary
statistics are printed at the conclusion of each command.

An exception to the space bar option during the operation of a command
is the sort, which cannot be single stepped (because the file must be
read in its entirety before processing can occur). However, pressing
the "+" key lets you see how much room is left on the heap at any
point. I originally had that feature for debugging purposes, and it
was interesting to watch, so I left it in.

Before a command is issued, there are three options that may be
specified. A "+" causes the input and output windows to be 9 lines
long, rather than 3; a "-" reverts to the default of 3. Both commands
also turn off the blinking price of the program, if you find that
annoying. A "?" shows you what the DOS command line syntax should be
for the last command issued.

Issuing commands at the DOS prompt is MUCH faster, and permits you to
build batch files using DO, but you lose the feedback from the
windows. The DOS option is forced by including a parameter (s, h, q,
etc.) at the DOS prompt. If you do not specify enough parameters, and
defaults cannot be logically assumed by the program, you will be asked
for the missing information. Output can be redirected if desired.
For example, the command

DO h infile > outfile

will produce a file named "outfile" that contains "infile" with its
high bits stripped. If you do not redirect the output at the DOS
prompt, output will be directed to the screen. This is often a useful
way to check your output before sending it to a disk file or printer,
or simply to peek at an otherwise unreadable file.

To show the utility of using DO in batch files, I include one that
will take a WS file (%1) and produce, in ASCII paragraph format, a
sorted list of all unique words in the original file. It might be
called by entering "getuniq infile outfile" at the DOS prompt:

do h %1 > scrap.1
do r scrap.1 #32# // #13##10# > scrap.2
do q scrap.2 > scrap.1
do b scrap.1 > scrap.2
do p scrap.2 > scrap.1
do o scrap.1 > %2
del scrap.*
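
For readers who want to follow the data through the first five steps of
that batch file, here is a hedged Python sketch of the same stages (the
final "do o" rewrap is left out). It is not a replacement for DO: the
function name is hypothetical, the sort uses Python's default string
ordering, and "repeated" lines are taken to mean adjacent duplicates.

```python
def get_unique_words(ws_data: bytes) -> str:
    """Mimic h, r, q, b and p on the contents of a WordStar file."""
    # do h: clear the high bit of every byte
    text = bytes(b & 0x7F for b in ws_data).decode("ascii")
    # do r #32# // #13##10#: put every word on its own line
    lines = text.replace(" ", "\r\n").split("\r\n")
    # do q: sort the lines
    lines.sort()
    # do b: drop blank lines and lines that repeat the previous one
    kept = []
    for line in lines:
        if line.strip() and (not kept or line != kept[-1]):
            kept.append(line)
    # do p: join the lines into a single paragraph
    return " ".join(kept)
```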

A second batch file will take two files, say two batch files (%1 and
%2), and produce a third file (%3) in which the two original files are
placed side-by-side. It might be invoked by a command like
"glue in1 in2 outfile":

do c %1 1 -40 > scrap.1
do g scrap.1 %2 > %3
del scrap.*

One last point: it is permissible to use the console as an input file
in either the DOS command line or menu mode. This may be done by
entering "con:" (or simply ENTER) at the prompt for the input file.
This is often useful for simply checking to see how a particular
command affects text of your choosing. It is also a quick way to
create a small output file that is already processed. To end console
entry, enter Ctrl-Z as your last character.

Finally, if you wish merely to see summary statistics, just enter NUL
as your output file. For example, "do h infile > nul" will give
information on high bit characters quickly, but without creating an
output file or showing output to the screen.

It pays to spend some time experimenting if you deal very often with
text files. Please feel free to pass this program on to friends. If
you liked it (or didn't) or if you have suggestions, please let me
know.

{****************************************************************************
*
* Written by Larrie Hutton, 10361 Glen Hannah, Laurel, MD, (301) 604-3827.
* Version 3.3, last revised 1/1/90.
*
* This program consists of a collection of useful utility routines. They
* are intended to supplement the DOS commands, not to replace them. Each
* routine is contained as a case statement in the ConvertFile procedure.
*
* The commands can be entered as command line parameters, in which case all
* windowing of input and output is suppressed. (This also speeds up the
* program by a factor of about 3.)
*
* The BASIC OPERATION is as follows: All file I/O (from the ConvertFile
* procedure) is handled through the ReadIn and WriteOut procedures, which
* in turn make decisions about windowing and character-vs-line processing.
* GetWindow, driven by ReadIn and WriteOut, handles the details of cursor
* positioning and window sizing. The details of the three core procedures
* (GetWindow, ReadIn, and WriteOut) can be seen in the first three
* procedures of this program.
*
* The original version of this program was an ASCII-to-WordStar conversion
* utility, which remains essentially unchanged.
*
* Twenty-two other utility options are provided: (S)tripping all LFs [with
* CRs replaced by CR/LF], (X)-ing out all LFs, (P)aragraphing adjacent
* lines [with (O) to reverse], (Q)uicksort, (H)igh bit removal [or WS to
* ASCII], (U)pper case conversion [or (L) for lower], (N)umbering lines,
* (B)ombing repeated and empty lines, (E)xpanding tabs, (T)agging lines,
* (C)opying a file between any two column positions, (I)nserting a string at
* any column position, (F)inding or (R)eplacing one string with another,
* (A)sking before replacing, (J)oining one file into another at a given line
* or (M)erging them alphabetically, (G)luing two files line by line [useful
* for column formats], (K) converting all initial letters to upper case and
* all others to lower [useful for all upper-case mailing lists], and (D)
* making the last word in a comma-delimited file the first field [useful
* for mailing lists in which first and last names not entered separately].
*
* MAJOR GLOBAL VARIABLES
*
* Choice - selection from menu. Console - true if command line.
* Commands - records user input. Cursor - array of cursor positions.
* Active - current window (of 3). ActiveWin - array of window coordinates.
*
****************************************************************************}
{****************************************************************************
*
* Given the appropriate window, calculates new coordinates and saves old
* cursor position in the Cursor array. Makes WinNum the Active window.
*
****************************************************************************}
{****************************************************************************
*
* Writes input data into appropriate window (input or output), taking data
* type (line or character) into account.
*
****************************************************************************}
{****************************************************************************
*
* Writes output data into appropriate window (input or output), taking data
* type (line or character) into account.
*
****************************************************************************}
{****************************************************************************
*
* Beeps speaker for msec duration.
*
****************************************************************************}
{****************************************************************************
*
* Converts string to upper case.
*
****************************************************************************}
{****************************************************************************
*
* Returns given string repeated Factor times.
*
****************************************************************************}
{****************************************************************************
*
* Replaces instances of OldSt with NewSt in Line.
*
****************************************************************************}
{****************************************************************************
*
* Determines space left on heap.
*
****************************************************************************}
{****************************************************************************
*
* Determines number of lines that Mergesort can handle.
*
****************************************************************************}
Lines := ( FreeMem * 16.0 - MaxArray * 4.0 - 8192 ) / ( MaxStr + 10.0 )
{****************************************************************************
*
* User escape and pause.
*
****************************************************************************}
#43: writeln( con, 16.0 * FreeMem:6:0, ' bytes free on heap.... ' );
{****************************************************************************
*
* Performs standard Shell sort on numeric array.
*
****************************************************************************}
{**************************************************************************
* Converts any real number to a string representation that does not use
* exponents; strips all trailing zeros.
**************************************************************************}
{****************************************************************************
*
* Read the data in, perform mergesort, and print results. The Rev
* parameter controls reverse sort; the Col parameter the column on which
* the sort is performed.
*
****************************************************************************}
{**************************************************************************
* Initializes pointers and fields and reads data.
**************************************************************************}
{**************************************************************************
* Returns pointer (Merge) to new list gotten from merged sublists.
**************************************************************************}
{************************************************************************
* Sets calling variable and PntrB to PntrA; PntrA advances to next rec.
************************************************************************}
{**************************************************************************
* Recursively sorts sublist at First, or MergeSorts both; Merge is pntr.
**************************************************************************}
{**************************************************************************
* Realigns pointer array; resets link.
**************************************************************************}
{**************************************************************************
* Prints out linked list and reclaims heap.
**************************************************************************}
{****************************************************************************
*
* Replaces any #nn# combination in Line with its ASCII representation.
*
****************************************************************************}
{****************************************************************************
*
* Gets string values from keyboard or command line, as appropriate.
*
****************************************************************************}
{****************************************************************************
*
* Returns number from string; default if error. Gets input from command
* line if possible.
*
****************************************************************************}
{****************************************************************************
*
* Returns adjusted line numbers, size of file, and string reps of lines.
*
****************************************************************************}
{****************************************************************************
*
* Sets up information necessary for Replace and Ask operations.
*
****************************************************************************}
{****************************************************************************
*
* Shows menu if there are no command line parameters.
*
****************************************************************************}
{****************************************************************************
*
* Choose menu option.
*
****************************************************************************}
{****************************************************************************
*
* Gets input and output file names.
*
****************************************************************************}
{**************************************************************************
* Checks for I/O errors and aborts if ambiguous.
**************************************************************************}
{****************************************************************************
*
* Strips high bit from all ASCII Characters (Choice = 'H'), or converts
* to standard WordStar file (Choice = 'W'). See menu for other options.
*
****************************************************************************}
'S': {********************** STRIP LFs AND CHANGE ALL CR TO CR/LF ******}
'X': {************************************** X-OUT ALL LINE FEEDS ******}
'P': {*************** PARAGRAPH FORMAT (CONSECUTIVE LINES JOINED) ******}
'O': {********** ONE-LINE PARAGRAPHS AT COLUMN 60 FROM LONG LINES ******}
'W': {*************************************** WORDSTAR FROM ASCII ******}
'H': {******************** HIGH BITS REMOVED (ALSO ASCII FROM WS) ******}
'U': {**************************** CONVERT FILE TO ALL UPPER CASE ******}
'L': {**************************** CONVERT FILE TO ALL LOWER CASE ******}
'Q': {********************************** QUICKSORT (REALLY MERGE) ******}
'N': {******************* NUMBER LINES AND REMOVE TRAILING BLANKS ******}
'B': {***************************** BOMB REPEATED AND BLANK LINES ******}
'E': {*********************************************** EXPAND TABS ******}
'T': {****************************** TAG LINES BETWEEN TWO POINTS ******}
'C': {*********************************************** COLUMN COPY ******}
'I': {******************** INSERT STRING AT GIVEN COLUMN POSITION ******}
'F': {*********************************************** FIND STRING ******}
'R': {*************************** REPLACE ONE STRING WITH ANOTHER ******}
'A': {******************* ASK AND REPLACE ONE STRING WITH ANOTHER ******}
'J': {***** JOIN TWO FILES BY INSERTING SECOND INTO FIRST AT LINE ******}
'M': {*************** MERGE TWO FILES ALPHABETICALLY LINE BY LINE ******}
'G': {********** GLUE FILES BY APPENDING FILE2'S LINES TO FILE1'S ******}
'K': {**** CONVERT ALL INITIAL LETTERS TO UPPER & OTHERS TO LOWER ******}
'D': {******* MAKE FIRST FIELD INTO TWO--LAST WORD TO FIRST FIELD ******}
'+': {****************************************** MAKE BIG WINDOWS ******}
'-': {*************************************** MAKE LITTLE WINDOWS ******}
'?': {*********************************** REPEAT PREVIOUS COMMAND ******}
{****************************************************************************
*
* Closes input and output files.
*
****************************************************************************}
{****************************************************************************
*
* M A I N P R O G R A M
*
****************************************************************************}
begin { DO_ }
FirstTime := true;
ShowDirections;
GetChoice;
while ValidChoice do
begin
GetFiles;
ConvertFile;
CloseFiles;
GetChoice
end
end. { DO_ }


