Category : Miscellaneous Language Source Code
Archive   : QTAWKD42.ZIP
Filename : QTAWK.DOC

 




















QTAwk
Utility Creation Tool


For PC/MS-DOS
Version 4.20 10-10-90








Saturday, November 10, 1990





(c) Copyright 1989, 1990 Pearl Boldt

Darnestown, MD 20878



































































QTAwk - ii - QTAwk






QTAwk License
Utility Creation Program
Version 4.20 10-10-90
(c) Copyright 1988 - 1990 Pearl Boldt. All Rights Reserved.

Pearl Boldt
Quik Trim
13012 Birdale Lane
Darnestown, MD 20878
CompuServe ID: 72040.434

Registration Information

QTAwk is a copyrighted program protected by both U.S. and
international copyright law. If you obtained QTAwk from a
shareware disk vendor, an on-line computer service or bulletin
board, a friend or colleague, or another similar source, you have
an unregistered (trial) copy. You may use this copy without
charge for a limited period of time under the terms of the QTAwk
license agreement (below). After this time is up, you must
register and pay for QTAwk to continue using it.

This method of distribution is known as shareware. It allows you
to determine whether QTAwk meets your needs before you pay for
it.

The registration fee for a single copy of QTAwk is $50. Payment
of this fee entitles you to:

* A disk with the latest version of QTAwk, registered to you.

* One copy of the printed QTAwk manual.

* An upgrade to the next release of QTAwk.

* Technical support via electronic mail or telephone.

If you prefer, you may register for $35 and receive only the
disk and notices of future upgrades. Network, site, and corporate
licenses are also available; contact the copyright holder for
more information.

Upgrade Information

If you purchased QTAwk version 4.02 or later at the $50 rate, or
a site license for version 4.02 or later, you are entitled to a
free upgrade to version 4.20. If you are not entitled to a free
upgrade, or you wish to order a version 4.20 manual, use the order
form following the License Agreement.

QTAwk License Agreement

1. Copyright: The QTAwk program and all other programs and
documentation distributed or shipped with it are Copyright
1988 - 1990 Pearl Boldt and are protected by U.S. and
International Copyright law. In the rest of this document,
this collection of programs is referred to simply as "QTAwk".
You are granted a license to use your copy of QTAwk only
under the terms and conditions specified in this license
agreement.

2. Definitions: QTAwk is distributed in two forms. A
"registered" copy of QTAwk is a copy distributed on diskette,
purchased from the copyright holder. A "shareware" copy of
QTAwk is a copy distributed on diskette or via an electronic
bulletin board, on-line service, or other electronic means,
obtained from a shareware disk vendor, or obtained from
another individual.

3. Shareware Copies: Shareware copies of QTAwk are distributed
to allow you to try the program before you pay for it. They
are Copyright 1988 - 1990, Pearl Boldt and do not constitute
"free" or "public domain" software. You may use a shareware
copy of QTAwk at no charge for a trial period of up to 21
days. If you wish to continue using QTAwk after that period,
you must purchase a registered copy. If you choose not to
purchase a registered copy, you must stop using QTAwk, though
you may keep copies and pass them along to others. You may
give QTAwk to others for noncommercial use IF:

=> All Files And Documentation Accompany The Programs.
=> The Files Are Not Modified In Any Way.

4. Registered Copies: Registered copies of QTAwk are
distributed to those who have purchased them from the
copyright holder.

5. Use of One Copy on Two Computers: If you have a registered
copy of QTAwk which is licensed for use on a single computer,
you may install it on two computers used at two different
locations (for example, at work and at home), provided there
is no possibility that the two computers will be in use at
the same time, and provided that you yourself have purchased
QTAwk, or if QTAwk was purchased by your employer, that you
have your employer's explicit permission to install QTAwk on
two systems as described in this paragraph. The right to
install one copy of QTAwk on two computers is limited to
copies originally licensed for use on a single computer, and
may not be used to expand the number of systems covered under
a multi-system license.

6. Use of QTAwk on Networks or Multiple Systems: You may
install your registered copy of QTAwk on a computer attached
to a network, or remove it from one computer and install it
on a different one, provided there is no possibility that
your copy will be used by more users than it is licensed for.
A "user" is defined as one keyboard which is connected to a
computer on which QTAwk is installed or used, regardless of
whether or not the user of the keyboard is aware of the
installation or use of QTAwk in the system.

7. Making Copies: You may copy any version of QTAwk for normal
backup purposes, and you may give copies of the shareware
version to other individuals subject to paragraph (4) above.
You may not give copies of the registered version to any
other person for any purpose, without explicit written
permission from the copyright holder.

8. Distribution Restrictions: You may NOT distribute QTAwk
other than through individual copies of the shareware version
passed to friends and associates for their individual,
non-commercial use. Specifically, you may not place QTAwk or
any part of the QTAwk package in any user group or commercial
library, or distribute it with any other product or as an
incentive to purchase any other product, without express
written permission from the copyright holder and you may not
distribute for a fee, or in any way sell copies of QTAwk or
any part of the QTAwk package. If you are a shareware disk
vendor approved by the Association of Shareware Professionals
(ASP), you may place QTAwk in your library without prior
written permission, provided you notify the copyright holder
within 15 days of doing so and provided your application has
been fully approved in writing by the ASP, and is not simply
submitted or awaiting review.

9. Use of QTAwk: QTAwk is a powerful program. While we have
attempted to build in reasonable safeguards, if you do not
use QTAwk properly you may destroy files or cause other
damage to your computer software and data. You assume full
responsibility for the selection and use of QTAwk to achieve
your intended results. As stated below, the warranty on QTAwk
is limited to replacement of a defective program diskette or
manual.



10. LIMITED WARRANTY: All warranties as to this software,
whether express or implied, are disclaimed, including without
limitation any implied warranties of merchantability, fitness
for a particular purpose, functionality or data integrity or
protection are disclaimed.

11. Satisfaction Guarantee: If you are dissatisfied with a
registered copy of QTAwk for any reason (whether or not you
find a software error or defect), you may return the entire
package at any time up to 90 days after purchase for a full
refund of your original registration fee.

Questions may be sent to:

Pearl Boldt
Quik Trim
13012 Birdale Lane
Darnestown, MD 20878
CompuServe ID: 72040.434































QTAwk 4.20 Order Form
Utility Creation Program
Version 4.20 10-10-90
(c) Copyright 1988 - 1990 Pearl Boldt. All Rights Reserved.

Return to:
Pearl Boldt
Quik Trim
13012 Birdale Lane
Darnestown, MD 20878

Make all Checks Payable to: Pearl Boldt

Name:
Company:
Address:

Phone:

Register QTAwk to: Company (___) or Individual (___)
Send information on: Site Licenses (___), Reseller Pricing (___)

I have read and agree to abide by the QTAwk license agreement,

Signature:

Where did you hear about QTAwk?


                                        Quantity        Price

Disk, manual, next update ($50/copy):   ________   $ ________.____
Disk only, no update ($35/copy):        ________   $ ________.____

Disk size: ___ 5.25" acceptable ___ 3.5" required

Subtotal $ ________.____

Shipping charges, per copy: $ ________.____

Disk, manual, next update:        | Disk only:
US standard      - included       | US standard      - included
US 2-day         - $8.00 (US)     | US 2-day         - $8.00 (US)
Canada (air)     - $5.00 (US)     | Canada (air)     - $3.00 (US)
All Others (air) - $10.00 (US)    | All Others (air) - $5.00 (US)

Total enclosed: $ ________.____



===> Please read the following before ordering! <===

Order Information

International Orders:

Orders from outside the U.S. must be paid by a check or money
order in U.S. funds and drawn on a U.S. bank; or by an
international postal money order in U.S. dollars. Checks which
are not in U.S. funds and drawn on a U.S. bank will be returned
due to extremely high charges imposed by U.S. banks to collect
the funds. Purchase orders (minimum $200) can be accepted from
outside the U.S., but you must contact us before ordering.

Company Purchase Orders:

Purchase orders for amounts of $100 and over are accepted from
established U.S. companies; orders under $100 are accepted but
must be prepaid. Have your purchasing agent contact Pearl Boldt
for terms. Credit references will be required for new customers.

Multi-System Licenses:

Multi-system licensing arrangements are available for network,
site, and corporate use of QTAwk. Check the line on the order
form or contact us for more information. A sample schedule of
license fees is below; contact us for pricing on the exact number
of systems you wish to license. The fee includes a master
diskette and one manual; additional manuals are $10 each (less
for over 100 copies).


Systems     Price    Systems     Price    Systems     Price
      2     85.00         15    425.00         50  1,150.00
      3    120.00         20    550.00         60  1,160.00
      4    155.00         25    675.00         70  1,320.00
      5    190.00         30    750.00         80  1,480.00
     10    330.00         40    950.00        100  1,800.00

Return to:
Pearl Boldt
Quik Trim
13012 Birdale Lane
Darnestown, MD 20878

Make all Checks Payable to: Pearl Boldt




QTAwk Update History

==> QTAwk Version 4.02. This version contains two additions
from the previous versions:

1. The command line argument, double hyphen, "--", stops
further scanning of the command line for options. The double
hyphen argument is not passed to the QTAwk utility in the
ARGV array or counted in the ARGC variable. Although QTAwk
recognizes only two command options, this has been included to
be compatible with the latest Unix(tm) conventions.

2. The built-in array ENVIRON has been added. This array
contains the environment strings passed to QTAwk. Changing a
string in ENVIRON will have no effect on the environment
strings passed in the QTAwk "system" built-in function.
Environment strings are set with the PC/MS-DOS "SET" command.
The strings are of the form:

name = string

where the blanks on either side of the equal sign, '=', are
optional and depend on the particular form used in the "SET"
command. The QTAwk utility may scan the elements of ENVIRON
for a particular name or string as desired.
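
The scan described in item 2 can be sketched outside QTAwk. The
following fragment is Python, not QTAwk code, and is only an
illustration: the list environ_strings and the helper find_env are
hypothetical names standing in for the ENVIRON array and a
user-written search. It assumes each element has the "name = string"
form, with optional blanks around the equal sign, and that names
are compared without regard to case.

```python
def find_env(environ_strings, wanted):
    """Return the value stored under 'wanted', or None if absent."""
    for entry in environ_strings:
        # Split at the first '=' only; the value may itself contain '='.
        name, sep, value = entry.partition("=")
        # Assumption: names are matched case-insensitively.
        if sep and name.strip().upper() == wanted.upper():
            return value.strip()
    return None

# Hypothetical environment strings, as the DOS "SET" command might leave them.
environ_strings = ["PATH=C:\\DOS;C:\\UTIL", "TEMP = C:\\TMP", "PROMPT=$p$g"]
temp_dir = find_env(environ_strings, "temp")
```

The blanks around the equal sign are stripped before comparison, so
both forms of the "SET" command yield the same result.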


==> QTAwk Version 4.10. This version contains one addition from
the previous versions:

1. In previous versions, the GROUP pattern keyword could accept
patterns consisting only of a regular expression constant.
For version 4.10, the GROUP pattern keyword has been expanded
to accept regular expression constants, string constants and
variables. The variables are evaluated at the time the GROUP
patterns are first utilized to scan an input record. The value
is converted to string form and interpreted as a regular
expression.

GROUP /regular expression constant/ { ... }
GROUP "string constant" { ... }
GROUP Variable_name { ... }

GROUP patterns are still converted into an internal form for
regular expressions only once, when the pattern is first used
to scan an input line. Any variables in a GROUP pattern will
be evaluated, converted to string form and interpreted as a
regular expression.


==> QTAwk Version 4.20, dated 10/11/90. This version contains
three additions from the previous versions:

1. The behavior of the RS pre-defined variable has been
changed. It is now similar to the behavior of the FS
variable. If RS is assigned a value which, when converted to
a string, is a single character in length, then that
character becomes the record separator. If the string is
longer than a single character, then it is treated as a
regular expression, and any string matching the regular
expression is treated as a record separator. As for FS, the
string value is converted to the internal regular expression
form when the assignment is made.

2. Two new functions have been added:
getc() --> reads a single character from the current input
file. The character is returned by the function.
fgetc(file) --> reads a single character from the file
'file'. The character is returned by the function.

These functions allow the user to naturally obtain single
characters from any file including the standard input file
(which would be the keyboard if not redirected or piped).

3. Error messages now have a numerical value displayed in
addition to the short error message. The error messages are
listed in numerical order in the QTAwk documentation with a
short explanation of the error. In some cases, an attempt has
been made to provide guidance as to what may have caused the
error and possible remedies. Since the error messages are
generated at fixed points within QTAwk and may be caused by
different reasons in different utilities during
interpretation or during execution on data files, it is not
possible to list every possible reason for the display of the
error messages. The line number within the QTAwk utility on
which the error was discovered and the input data file record
number are provided in the error message to provide some help
to the user in attempting to ascertain the real reason for
the error.
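
The new RS rule in item 1 of the version 4.20 changes can be
illustrated with a short sketch. This is Python, not QTAwk, and
split_records is a hypothetical helper. It only mirrors the rule as
stated above (a single character is taken literally, a longer string
is treated as a regular expression) and uses Python's regular
expression syntax, which differs from QTAwk's in other respects.

```python
import re

def split_records(text, rs):
    """Split text into records following the RS rule sketched above."""
    if len(rs) == 1:
        # A single character is the record separator itself.
        return text.split(rs)
    # A longer string is interpreted as a regular expression.
    return re.split(rs, text)

by_char = split_records("a;b;c", ";")
by_regex = split_records("one--two---three", "-+")
```

Note that a single-character separator such as "." is never given a
regular expression meaning in this sketch; only longer strings are.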











Introduction

QTAwk is called a Utility Creation Tool and not a programming
language because it is intended for the average computer user as
well as the more experienced user and programmer. QTAwk has been
designed to make it easy for the average user to create those
small, or maybe not so small, utilities needed to accomplish
small, or not so small, everyday jobs: the jobs which are too
small to justify the time and cost of using a traditional
computer programming language, or of hiring a professional
programmer to accomplish them.

This paper presents a description of the QTAwk utility creation
tool and its use. Most computer users have many small tasks to
accomplish that are usually left undone for lack of the proper
tool. Typically these tasks require finding one or more records
within a file and executing some action depending on the record
located.

In order to accomplish these tasks the user needs a tool which
will allow the following to be accomplished easily:

1. reading files record by record,

2. splitting (parsing) the records read into words or fields,

3. determining if a record, or records, satisfy
pre-determined match criteria, i.e. finding the "proper"
record(s),

4. when the proper records are found, executing some action or
actions on the records or fields of the records.

QTAwk supplies the user with all of these features in an easy to
use manner. Specifying the name of a file is all the user need do
to open the file and read it record by record. The user may
easily change what a "record" is or let it default to an ASCII
text line as used by all text editors and which can be written by
all word processors. QTAwk will automatically split (parse)
records into fields. Initially a field is a word or a sequence of
non-blank characters. The user may change the definition of a
field easily to adapt to the needs of a particular situation.

Arithmetic expressions, logical expressions or regular
expressions may be used to define the criteria for selecting
records for action. Regular expressions are a powerful means of
describing the criteria for selecting, i.e. matching, the text of
records. Arithmetic expressions utilize the ordinary arithmetic
operators (addition, subtraction, multiplication, etc.) for
describing the criteria for selecting records and logical
expressions utilize the logical operators (less than, equal to,
greater than, etc.) for selecting records.

Of all the operators available in QTAwk, the regular expression
operators may be the only ones with which most readers are not
familiar. Regular expressions are a powerful and useful tool for
working with text. Yet for all their power, they are surprisingly
simple and easy to use once learned. Chapter Two explains regular
expressions fully, in a manner that will make them usable by a
person totally unfamiliar with them.

QTAwk is patterned after The Awk Programming Language by Alfred
V. Aho, Brian W. Kernighan and Peter J. Weinberger. The Awk
program implementing The Awk Programming Language is available on
most Unix (tm) systems. Aho, Kernighan and Weinberger invented
the automatic input loop and the pattern-action pairs used in
QTAwk and are to be heartily congratulated for this. Without Awk,
QTAwk would not exist. QTAwk is an extensive expansion of The Awk
Programming Language in many important aspects. In addition, some
of the admitted shortcomings of The Awk Programming Language
have been corrected.

A short summary of the major differences between QTAwk and Awk
is given below. Appendix II contains a more detailed listing of
the differences.

1. Expanded set of regular expression operators,
2. Use of "named expression"s in regular expressions to simplify
construction of complicated regular expressions,
3. Expanded arithmetic operator set,
4. Expanded set of pre-defined patterns giving more control in
the sequence of utility execution,
5. True multi-dimensional arrays,
6. Integration of the multi-dimensional arrays with the
arithmetic operators allowing the assignment of and operation
on entire arrays,
7. Integration of the multi-dimensional arrays with user-defined
functions allowing the use of arrays in functions in a
natural and intuitive manner,
8. Expanded set of keywords allowing local variables,
'switch'/'case' flow control, and premature closure of the
current input file,
9. Expanded set of arithmetic and string built-in functions, a
new array function, and new variable access functions,
10. Corrected input function syntax,
11. Added new Input/Output functions,
12. Expanded formatted I/O capability,
13. Expanded user-defined functions allowing variable number of
arguments,
14. New user controlled utility execution trace capability,
15. Expanded list of built-in variables giving more control and
access to QTAwk utility execution.












































Table of Contents

QTAwk License ............................................... iii
== Registration Information ................................. iii
== Upgrade Information ...................................... iii
== QTAwk License Agreement ................................... iv
QTAwk 4.20 Order Form ....................................... vii
== Order Information ....................................... viii
== International Orders: ................................... viii
== Company Purchase Orders: ................................ viii
== Multi-System Licenses: .................................. viii
Update History ............................................... ix
Introduction ................................................. xi

1.0 TUTORIAL ................................................ 1-1
1.1 Data .................................................... 1-1
1.2 Running QTAwk ........................................... 1-2

2.0 REGULAR EXPRESSIONS ..................................... 2-1
2.1 'OR' Operator ........................................... 2-2
2.2 Character Classes ....................................... 2-2
2.3 Closure ................................................. 2-4
2.4 Repetition Operator ..................................... 2-6
2.5 Escape Sequences ........................................ 2-8
2.6 Position Operators ...................................... 2-9
2.7 Examples ................................................ 2-9
2.8 Look Ahead Operator .................................... 2-11
2.9 Match Classes .......................................... 2-11
2.10 Named Expressions ..................................... 2-12
2.11 Predefined Names ...................................... 2-15
2.12 Operator Summary ...................................... 2-17

3.0 EXPRESSIONS ............................................. 3-1
3.1 New/Changed Operators ................................... 3-2
3.2 Sequence Operator ....................................... 3-4
3.3 Match Operator Variables ................................ 3-5
3.4 Constants ............................................... 3-5

4.0 STRINGS and REGULAR EXPRESSIONS ......................... 4-1
4.1 Regular Expression and String Translation ............... 4-1
4.2 Regular Expressions in Patterns ......................... 4-1

5.0 PATTERN-ACTIONS ......................................... 5-1
5.1 QTAwk Patterns .......................................... 5-1
5.2 QTAwk Predefined Patterns ............................... 5-2



6.0 VARIABLES and ARRAYS .................................... 6-1
6.1 QTAwk Arrays ............................................ 6-2
6.2 QTAwk Arrays in Arithmetic Expressions .................. 6-3

7.0 GROUP PATTERNS .......................................... 7-1
7.1 GROUP Pattern Advantage ................................. 7-1
7.2 GROUP Pattern Disadvantage .............................. 7-1
7.3 GROUP Pattern Regular Expressions ....................... 7-2

8.0 STATEMENTS .............................................. 8-1
8.1 QTAwk Keywords .......................................... 8-1
8.2 'cycle' and 'next' ...................................... 8-1
8.3 'delete' and 'deletea' .................................. 8-3
8.4 'if'/'else' ............................................. 8-5
8.5 'in' .................................................... 8-5
8.6 'switch', 'case', 'default' ............................. 8-5
8.7 Loops ................................................... 8-7
8.8 'while' ................................................. 8-7
8.9 'for' ................................................... 8-7
8.10 'do'/'while' ........................................... 8-8
8.11 'local' ................................................ 8-8
8.12 'endfile' .............................................. 8-9
8.13 'break' ............................................... 8-10
8.14 'continue' ............................................ 8-10
8.15 'exit opt_expr_list' .................................. 8-10
8.16 'return opt_expr_list' ................................ 8-10

9.0 BUILT-IN FUNCTIONS ...................................... 9-1
9.1 Arithmetic Functions .................................... 9-1
9.2 String Functions ........................................ 9-3
9.3 I/O Functions ........................................... 9-9
9.4 Miscellaneous Functions ................................ 9-11
9.4.1 Expression Type ...................................... 9-11
9.4.2 Execute String ....................................... 9-11
9.4.3 Array Function ....................................... 9-13
9.4.4 System Control Function .............................. 9-14
9.4.5 Variable Access ...................................... 9-14

10.0 FORMAT SPECIFICATION .................................. 10-1
10.1 Output Types .......................................... 10-2
10.2 Output Flags .......................................... 10-3
10.3 Output Width .......................................... 10-4
10.4 Output Precision ...................................... 10-4

11.0 USER-DEFINED FUNCTIONS ................................ 11-1
11.1 Local Variables ....................................... 11-1
11.2 Argument Checking ..................................... 11-1
11.3 Variable Length Argument Lists ........................ 11-2
11.4 Null Argument List .................................... 11-3
11.5 Arrays and User-Defined Functions ..................... 11-3

12.0 TRACE STATEMENTS ...................................... 12-1
12.1 Selective Statement Tracing ........................... 12-1
12.2 Trace Output .......................................... 12-1

13.0 BUILT-IN VARIABLES .................................... 13-1
13.1 User Function Variable Argument Lists ................. 13-5

14.0 COMMAND LINE INVOCATION ............................... 14-1
14.1 Multiple QTAwk Utilities .............................. 14-1
14.2 Setting the Field Separator ........................... 14-2
14.3 Setting Variables on the Command Line ................. 14-2
14.4 QTAwk Execution Sequence .............................. 14-3

15.0 LIMITS ................................................ 15-1

16.0 Appendix I ............................................ 16-1

17.0 Appendix II ........................................... 17-1

18.0 Appendix III .......................................... 18-1

19.0 Appendix IV ........................................... 19-1

20.0 Appendix V ............................................ 20-1



















Section 1.0 Tutorial


1.0 TUTORIAL

1.1 Data

QTAwk is designed to be used to search data or text files using
short user created utilities. The types of files that QTAwk is
designed to work with are "text" files, commonly called ASCII
files. The files contain user readable text and numbers. The text
is contained in lines and the lines end with carriage-return,
new-line character pairs or single new-line characters. Text
files are written by application programs, word processors and
text editors.

The information in the files is grouped by fields on a single
line or by lines separated by a blank line or some other
"special" characters. For example, the following lines list
information on various states:

US # 10461 #  4375 # MD # Annapolis  ( Maryland )
US # 40763 #  5630 # VA # Richmond   ( Virginia )
US #  2045 #   620 # DE # Dover      ( Delaware )
US # 24236 #  1995 # WV # Charleston ( West Virginia )
US # 46047 # 12025 # PA # Harrisburg ( Pennsylvania )
US #  7787 #  7555 # NJ # Trenton    ( New Jersey )
US # 52737 # 17895 # NY # Albany     ( New York )
US #  9614 #   535 # VT # Montpelier ( Vermont )
US #  9278 #   975 # NH # Concord    ( New Hampshire )
US # 33265 #  1165 # ME # Augusta    ( Maine )

Each line, or record in QTAwk, consists of 12 words (13 when the
state name is two words long, as in West Virginia). The 12
words of the first record are:

1: US
2: #
3: 10461
4: #
5: 4375
6: #
7: MD
8: #
9: Annapolis
10: (
11: Maryland
12: )

The first word lists the country, the third word lists the state
area in square miles, the fifth word lists the state population
in thousands, the seventh word lists the state abbreviation, the
ninth word lists the state capital, and the eleventh word lists
the state name. The second, fourth, sixth, eighth, tenth and last
words are word separators. The word separators are not necessary
for QTAwk, but make each line easier for people to read. A copy
of this entire file, states.dta, is given in Appendix IV.

This information could be manipulated in various ways. A few of
the ways in which this could be done are:
1. the manner of listing changed, or
2. only lines meeting certain criteria listed:
   a) those states with a minimum area,
   b) those states with a minimum population,
   c) population greater than a minimum and less than a
      maximum,
   d) area less than a maximum and population greater than a
      minimum,
   e) population density (population / area) less than a
      maximum.
3. the list could be sorted
   a) alphabetically by
      1: state capital,
      2: state name,
      3: state abbreviation.
   b) by area,
   c) by population.
4. some information could be deleted from the list, such as the
   capital.

There are many more ways to manipulate the information. In order
to do so the information in the list must first be read record by
record and each record split into its constituent parts. Once the
parts for each record have been determined, the information can
be easily manipulated, changed, or rearranged.
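
To make these steps concrete, here is a minimal sketch of reading
two of the records above, splitting each into blank-separated words,
and selecting by population density. It is written in Python, not
QTAwk, purely as an illustration. The variable names and the density
limit of 0.2 are assumptions made for the example; the word positions
follow the description above (third word is area, fifth word is
population in thousands).

```python
records = [
    "US # 10461 # 4375 # MD # Annapolis ( Maryland )",
    "US # 33265 # 1165 # ME # Augusta ( Maine )",
]

sparse = []
for record in records:
    words = record.split()          # split on runs of blanks
    area = float(words[2])          # third word: area in square miles
    population = float(words[4])    # fifth word: population in thousands
    density = population / area     # thousands of people per square mile
    if density < 0.2:               # assumed maximum, for illustration only
        sparse.append(words[6])     # seventh word: state abbreviation
```

Once a record has been split, each selection in the list above
reduces to a comparison on one or two of the words.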

1.2 Running QTAwk

QTAwk is started from the DOS command prompt, giving the QTAwk
utility to run and the files to search. The QTAwk utility may be
written directly on the command line or contained in one or more
files named on the command line. If given on the command line, it
is usually enclosed in double quotes:

QTAwk "$3 > 50000 {print;}" states.dta



This QTAwk utility will print the record for every state for
which the area is greater than 50,000 square miles.

The example shows the form of QTAwk utilities, a sequence of
patterns and actions in the form:

pattern1 { action1 }
pattern2 { action2 }
pattern3 { action3 }
.
.
.

QTAwk opens the files named on the command line, reads a record,
splits (parses) each record into the individual words or fields
and compares the record with each pattern in the order in which
they have been written in the QTAwk utility. If the record
matches a pattern, the corresponding action contained in braces
is executed.
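
The processing loop just described can be sketched as follows. This
is a Python illustration under stated assumptions, not a description
of how QTAwk is implemented: the run function and the representation
of a utility as a list of (pattern, action) pairs are hypothetical.
The single rule selects records whose third word, the area, exceeds
50000.

```python
def run(rules, lines):
    """Apply each (pattern, action) pair, in order, to every record."""
    output = []
    for record in lines:
        fields = record.split()              # split the record into fields
        for pattern, action in rules:        # patterns are tried in written order
            if pattern(fields):
                output.append(action(record, fields))
    return output

# One rule: select records whose third field (area) exceeds 50000.
rules = [(lambda f: float(f[2]) > 50000, lambda rec, f: rec)]
lines = [
    "US # 46047 # 12025 # PA # Harrisburg ( Pennsylvania )",
    "US # 52737 # 17895 # NY # Albany ( New York )",
]
selected = run(rules, lines)
```

Every pattern is tried against every record, so a record matching
several patterns would trigger several actions in this sketch.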

Patterns may be arithmetic expressions, logical expressions,
regular expressions or combinations of all three types of
expressions. The example above has a logical expression pattern.

Under DOS, programs indicate the end of text lines in ASCII
files with a Carriage Return, Newline character pair. QTAwk
follows the practice of converting all such pairs to a single
newline when reading the file. In writing files, QTAwk converts
single Newline characters to a Carriage Return, Newline pair.
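
The conversion in both directions can be sketched as below. The
fragment is Python, not QTAwk, and the helper names read_text and
write_text are hypothetical; it shows only the substitution described
above, not QTAwk's actual file handling.

```python
def read_text(raw):
    """Convert each Carriage Return, Newline pair to a single newline."""
    return raw.replace("\r\n", "\n")

def write_text(text):
    """Convert each single newline to a Carriage Return, Newline pair."""
    return text.replace("\n", "\r\n")

on_disk = "first line\r\nsecond line\r\n"
in_memory = read_text(on_disk)
round_trip = write_text(in_memory)
```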

For the data in the "states" data file, a question that may be
asked is the total population of Canada. The first field can be
used to identify the data for Canada and the fifth field contains
population data. The following utility will sum the population
data for Canada:

$1 == "Canada" { Total += $5 }
END { print Total; }

In this example, when the first field of a record is equal to
"Canada", the fifth field is accumulated into the variable Total.
When all records have been processed, Total is printed. The
printing of Total is accomplished in the action associated with
the pattern 'END'. 'END' is a pre-defined pattern; the associated
action is executed after closing the input file.
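
The same accumulation can be sketched in Python (not QTAwk) to show
where the 'END' action falls in the sequence. The two "Canada"
records below are made-up placeholders, not data from states.dta,
and the variable names are assumptions made for the example.

```python
lines = [
    "US # 10461 # 4375 # MD # Annapolis ( Maryland )",
    "Canada # 100 # 10 # XX # Placeholder ( Example One )",
    "Canada # 200 # 20 # YY # Placeholder ( Example Two )",
]

total = 0
for record in lines:
    fields = record.split()
    if fields[0] == "Canada":       # pattern: $1 == "Canada"
        total += float(fields[4])   # action: Total += $5
# After the last record, the END action would print the total.
print(total)
```

The print statement sits outside the loop, just as the 'END' action
runs only after all records have been read.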



The remaining chapters explain QTAwk expressions, patterns,
action statements and more. All of these are combined into a
QTAwk utility. In using and creating QTAwk utilities, the user
needs to remember the fundamental QTAwk processing sequence:

1. QTAwk opens each input file and reads the file record by
record,
2. as each record is read, it is split into fields,
3. the record is then compared against the patterns for matches,
4. when a match is found, the associated action is executed.

Keeping this fundamental loop in mind will make using QTAwk very
simple indeed.



































Section 2.0 Regular Expression


2.0 REGULAR EXPRESSIONS

Regular expressions are a means of describing sequences of
"characters". In the discussion of QTAwk, "character" will be
taken to mean any character from the extended ASCII sequence of
characters from ASCII '1' to ASCII '255'. Appendix I contains a
listing of the ASCII characters with both their decimal and
hexadecimal equivalent.

A string is a finite sequence of characters. The length of a
string is the number of characters contained in the string. A
special string is the empty string, also called the null string,
which is of zero length, i.e., it contains no characters. We
shall use the symbol 'ε' below to refer to the null string.

Another way to think of a string is as the concatenation of a
sequence of characters. Two strings may be concatenated to form
another string. Concatenating the two strings:

"abcdef"

and

"ghijklmn"

forms the third string:

"abcdefghijklmn"

In many instances, it is desirable to describe a string with
several alternatives for one or more of the characters. Thus we
may wish to find the strings:

FRED

or

TED

A convenient manner of describing both strings with the same
regular expression is

/(FR|T)ED/

Strings in QTAwk are enclosed in double quotes, ", and regular
expressions are enclosed in slashes, '/'.


2.1 'OR' Operator

The symbol '|' means "or", so the above regular expression is
read as: the string "FR" or the string "T", concatenated with the
string "ED". The parentheses group the alternative strings so
that they occupy the same position in the resulting regular
expression. In this manner it is possible to build a regular
expression for several alternative strings.

In many instances it is also desirable to build regular
expressions that contain many alternatives for one character,
i.e., one character strings. For example, we may want to find all
instances of the words "doing" or "going". We could build the
regular expression:

/(d|g)oing/

2.2 Character Classes

Although the last regular expression is a fairly simple example,
it serves to introduce the notion of "character class". If we
define the notation:

[dg] = (d|g)

then we may write the regular expression as:

/[dg]oing/

The character class notation saves us from having to explicitly
write the "or" symbols in the regular expression. The "or" is
implied between each character of the class.
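
The equivalence of the two spellings can be checked with a short
Python sketch. It uses Python's 're' module as a stand-in for QTAwk's
matcher, and the word list is made up for the illustration.

```python
import re

words = ["doing", "going", "boing", "dgoing"]

with_or = re.compile(r"(d|g)oing")     # explicit "or"
with_class = re.compile(r"[dg]oing")   # character class, "or" implied

# fullmatch requires the whole word to match the expression
or_hits = [w for w in words if with_or.fullmatch(w)]
class_hits = [w for w in words if with_class.fullmatch(w)]

print(or_hits)
print(class_hits)
```

Both lists contain the same words, since the class is only a shorter
way of writing the alternation.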

Now suppose that we wanted to expand our search to all five
letter words ending in "ing" and starting with any lower-case
letter and having any lower-case letter as the second character.
We would write the regular expression:

/(a|b|c|d|...|x|y|z)(a|b|c|d|...|x|y|z)ing/

or

/[abcd...xyz][abcd...xyz]ing/

Regular expressions in these cases can not only get very long,
but can be very tedious to write and are very prone to error. We
introduce the notion of a range of characters into the character
class and define:

[a-z] = [abcd...xyz] = (a|b|c|d|...|x|y|z)

The above regular expression can now be written:

/[a-z][a-z]ing/

which is considerably shorter and less error prone. The hyphen, '-', is
recognized as expressing a range of characters only when it
occurs within a character class. Within character classes, the
hyphen loses this significance in the following three cases:

1. when it is the first character of the character class, e.g.,

[-b] = (-|b)

2. when it is the last character of the character class, e.g.,

[b-] = (b|-)

3. when the first character of the indicated range is greater
in the ASCII collating sequence than the second character of
the indicated range, e.g.,

[z-a]

would be recognized as:

(z|-|a)

In interpreting the range notation in character classes, QTAwk
uses the ASCII collating sequence. Thus:

[0-Z]

is equivalent to:

[0123456789:;<=>?@A-Z]
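
A small Python sketch shows how a range follows the ASCII collating
sequence. The helper name expand_range is hypothetical and exists
only for this illustration.

```python
def expand_range(first, last):
    """List every character from first to last in ASCII order."""
    return "".join(chr(code) for code in range(ord(first), ord(last) + 1))


print(expand_range("0", "Z"))
print(expand_range("a", "z"))
# A reversed range such as z-a yields nothing here; this is the case
# QTAwk instead treats as the three literal characters z, - and a.
print(repr(expand_range("z", "a")))
```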

Continuing the last example, if we did not want to limit the
first character to lower-case, but also wanted to include the
possibility of upper-case letters, we could use the following
regular expression:



/([A-Z]|[a-z])[a-z]ing/

This regular expression allows the first letter to be any
character in the range from A to Z or in the range from a to z.
But the "or" is implied in character classes, shortening the
above regular expression to:

/[A-Za-z][a-z]ing/

If we now wish to expand the above from all five letter words
ending in "ing" to all six letter words ending in "ing", we could
write the regular expression as:

/[A-Za-z][a-z][a-z]ing/

In general, if we did not want to specify the number of
characters between the first letter and the "ing" ending, we
could specify a regular expression as:

/[A-Za-z](ε|[a-z])(ε|[a-z])...(ε|[a-z])ing/

By specifying the null string in the 'or' regular expression,
the regular expression allows a character in the range a to z or
no character to match. The shortest string matched by this
regular expression would be a single upper or lower case letter
followed by "ing". The regular expression would also match any
string starting with an upper or lower case letter with any
number of lower case letters following and ending in "ing".

2.3 Closure

What we need to describe this regular expression is a notation
for specifying "zero or more" copies of a character or string.
Such a notation exists and is written as:

/[A-Za-z][a-z]*ing/

where the notation

[a-z]*

means zero or more occurrences of the character class [a-z].
This operation is called closure and the '*' is called the
closure operator. In general, the notation may be used for any
regular expression within a regular expression. The following are
valid regular expressions using the notion of zero or more
occurrences of a regular expression within another regular
expression:

/mis*ion/

would match "miion", "mision", "mission", "misssion",
"missssion", etc.

/bot*om/

would match "boom", "botom", "bottom", "botttom", "bottttom",
etc.

/(Fr|T)*ed/

would match "ed", "Fred", "Ted", "FrFred", "TTed", "FrFrFred",
"TTTed", "FrTFred", "FrFrTed", "TFrFred", etc.

As an extension to the '*' operator, we frequently would want to
search for one or more occurrences of a regular expression. As
above we would write this as:

/[A-Za-z][a-z][a-z]*ing/

The [a-z][a-z]* construct would ensure that at least one letter
occurred between the initial letter and the string "ing". This
occurs often enough that the notation

[a-z]+ = [a-z][a-z]*

has been adopted to handle this situation. Thus, use the operator
'*' for zero or more occurrences and the operator '+' for one or
more occurrences. The '+' operator is called the positive closure
operator.
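
The two closure operators behave the same way in Python's 're'
module, so the examples above can be tried with a short sketch. The
helper name matches and the candidate words are made up for the
illustration.

```python
import re


def matches(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [c for c in candidates if re.fullmatch(pattern, c)]


# '*' : zero or more occurrences
star_s = matches(r"mis*ion", ["miion", "mision", "mission", "minion"])
star_group = matches(r"(Fr|T)*ed", ["ed", "Fred", "FrTFred", "Fed"])

# '+' : one or more occurrences, the same as [a-z][a-z]*
plus = matches(r"[a-z]+ing", ["ing", "sing", "doing"])
spelled_out = matches(r"[a-z][a-z]*ing", ["ing", "sing", "doing"])

print(star_s, star_group, plus, spelled_out)
```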

In many cases it is desirable to search for either zero or one
occurrence of a regular expression. For example, it would be
desirable to search for names preceded by either Mr or Mrs. The
regular expression:

/Mrs*/

would find: Mr and Mrs and Mrss and Mrsss, etc. The following
regular expression will accomplish what we really want in this
case:

/Mr(ε|s)/


This regular expression would find 'Mr' followed by zero or one
's'.

The operator '?' has been selected to denote 'zero or one' of
the preceding regular expression. Thus,

/Mrs?/ = /Mr(ε|s)/

2.4 Repetition Operator

In some cases we wish to specify a minimum and maximum repeat
count for a regular expression. For example, suppose it was
desirable for a regular expression to contain a minimum of 2 and
a maximum of 4 copies of "abc". We could specify this as:

/abcabc(abc)?(abc)?/

The notation {2,4} has been adopted for expressing this. The
general form of the repetition operator is {n1,n2}. n1 and n2 are
integers, with n1 greater than or equal to 1 and n2 greater than
or equal to n1, 1 <= n1 <= n2. A repetition count would be
specified as:

/r{n1,n2}/ = /rrrrrrrrrrrrrr?r?r?r?r?r?/
              │<─── n1 ───>│         │
              │<──────── n2 ────────>│

The above could be expressed as:

/(abc){2,4}/ = /(abc)(abc)(abc)?(abc)?/

Since the repetition operator repeats the immediately preceding
regular expression, the parentheses around "abc" are necessary to
repeat the whole string. Without the parentheses the regular
expression would expand as:

/abc{2,4}/ = /abccc?c?/

The repetition operator can be used to repeat single characters,
groups of characters, character classes, quoted strings or named
expressions. The use of the operator is illustrated below for
each case:

1. Single characters:

/abc{2,4}/ = /abccc?c?/


2. Groups of regular expressions:

/(abc){2,4}/ = /(abc)(abc)(abc)?(abc)?/

3. character classes:

/[abc]{2,4}/ = /[abc][abc][abc]?[abc]?/

4. quoted string:

/"abc"{2,4}/ = /"abcabc(abc)?(abc)?"/

For quoted strings, the whole of the string contained within
quotes is repeated, with all repetitions maintained within
the quotes.

5. named expressions (described later):

/{abc}{2,4}/ = /{abc}{abc}{abc}?{abc}?/
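
The expansion rule for single characters, groups and character
classes can be sketched in Python. The helper expand_repeat is
hypothetical; Python's 're' module happens to accept the same
{n1,n2} notation, which lets the sketch compare the short and the
expanded forms. The quoted-string case is not covered here.

```python
import re


def expand_repeat(unit, n1, n2):
    """Rewrite unit{n1,n2} as n1 required copies plus (n2 - n1) optional ones."""
    if not 1 <= n1 <= n2:
        raise ValueError("need 1 <= n1 <= n2")
    return unit * n1 + (unit + "?") * (n2 - n1)


group_form = expand_repeat("(abc)", 2, 4)
char_form = "ab" + expand_repeat("c", 2, 4)
print(group_form)
print(char_form)

# The short and the expanded forms accept the same strings.
for text in ["abc", "abcabc", "abcabcabcabc", "abcabcabcabcabc"]:
    short_ok = re.fullmatch(r"(abc){2,4}", text) is not None
    expanded_ok = re.fullmatch(group_form, text) is not None
    assert short_ok == expanded_ok
```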

A special case exists for character classes in which the set of
characters to exclude is smaller than the set of characters to
include. For example, suppose that we wanted a certain character
position to accept all characters that are not numerics. We could
build a character class of all characters and leave the numerics
out. An easier method is to use the "complemented" or "negated"
character class. A special operator has been introduced for this
purpose. The logical NOT symbol, '!', occurring as the first
character in a character class, negates the class, i.e., any
character NOT in the class is recognized at the character
position.

Thus, to define the negated character class of all characters
which are not numerics, we would specify:

[!0-9]

To define all characters except the semi-colon, we would
specify:

[!;]

Note that the symbol '!' has this special meaning only as the
FIRST character in a character class. The caret symbol, '^', as
the FIRST character in a character class may also be used to
negate a character class. Traditionally, the caret has been used for
this purpose, but QTAwk allows the logical NOT operator, '!'
also.
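
Python's 're' module accepts only the caret spelling, so a sketch in
Python has to rewrite a class written with '!'. The helper name below
is hypothetical and handles only the leading '!' case described
above.

```python
import re


def negated_class_for_python(char_class):
    """Rewrite a class spelled [!...] into the [^...] spelling Python expects."""
    if char_class.startswith("[!"):
        return "[^" + char_class[2:]
    return char_class


non_digit = re.compile(negated_class_for_python("[!0-9]"))
not_semicolon = re.compile(negated_class_for_python("[!;]"))

print(non_digit.findall("a1b22;"))
print(not_semicolon.findall("x;y"))
```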

Very complicated regular expressions may be built for searching
for strings in files by combining the above concepts:
concatenating characters, concatenating regular expressions to
build more complicated regular expressions, using parentheses to
nest regular expressions within regular expressions, using
character classes to denote constructs with implied "or"s, and
using the closure operators, '*', '+' and '?', and the repetition
operator, {n1,n2}, to express multiple copies.

2.5 Escape Sequences

To round out the ability for building regular expressions for
searching, we need only a few more tools. In some cases we may
wish for the regular expression to contain blanks or tab
characters. In addition, other non-printable characters may be
included in regular expressions. These characters are defined
with "escape sequences". Escape sequences are two or more
characters used to denote a single character. The first character
is always the backslash, '\'. The second character is by
convention a letter as follows:

\a    == bell (alert)      ( \x07 )
\b    == backspace         ( \x08 )
\f    == formfeed          ( \x0c )
\n    == newline           ( \x0a )
\r    == carriage return   ( \x0d )
\s    == space             ( \x20 )
\t    == horizontal tab    ( \x09 )
\v    == vertical tab      ( \x0b )
\c    == c  [ \\ == \ ]
\ooo  == character represented by octal value ooo
         1 to 3 octal digits acceptable
\xhhh == character represented by hexadecimal value hhh
         1 to 3 hexadecimal digits acceptable

Any other character following the backslash is translated to
mean that character. Thus '\c' would become a single 'c', '\['
would become '[', etc. The latter is necessary in order to
include such characters as '[', ']', '-', '!', '(', ')', '*',
'+', '?' in regular expressions without invoking their special
meanings as regular expression operators.
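
The table can be turned into a small decoder as a sketch. This is
Python, not QTAwk, and decode_escape is a hypothetical helper that
assumes it is given exactly one well-formed escape sequence. Note
that '\s' means a single space here, whereas in Python's own regular
expressions '\s' stands for any white space character.

```python
# Letter escapes and their character codes, taken from the table above.
ESCAPES = {
    "a": 0x07, "b": 0x08, "f": 0x0C, "n": 0x0A,
    "r": 0x0D, "s": 0x20, "t": 0x09, "v": 0x0B,
}


def decode_escape(seq):
    """Decode one backslash sequence such as \\t, \\101 or \\x41."""
    body = seq[1:]                       # drop the leading backslash
    if body[0] == "x" and len(body) > 1:
        return chr(int(body[1:], 16))    # \xhhh : hexadecimal value
    if body[0] in "01234567":
        return chr(int(body, 8))         # \ooo  : octal value
    if body in ESCAPES:
        return chr(ESCAPES[body])        # \a, \b, \f, ...
    return body                          # any other character is itself


print(repr(decode_escape("\\s")), repr(decode_escape("\\x41")),
      repr(decode_escape("\\101")), repr(decode_escape("\\[")))
```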



2.6 Position Operators

Three additional special characters have, by convention, been
defined for use in writing regular expressions, namely the period
'.', the caret, '^' and the dollar sign, '$'. The period has been
assigned to mean "any character" in the set of characters except
the newline character, '\n'. For our use the period means any
character from ASCII 1 to ASCII 9 inclusive and ASCII 11 to ASCII
255 inclusive.

The caret and the dollar sign are position indicators and not
character indicators. The caret, '^', is used to indicate the
beginning or start of the search string. Thus, any character
following the caret in a regular expression must be the first
character of the string to be searched otherwise the match fails.
The dollar sign, '$', is used to indicate the end of the search
string. Thus, any character preceding the dollar sign in a
regular expression must be the last character of the string to be
searched or the match fails.

To indicate "beginning of line", the caret must be in the first
character position of a regular expression. Similarly, to
indicate "end of line", the dollar sign must be in the last
character position of a regular expression. In any other
position, these characters lose their special significance. Thus,
the regular expression:

/(^|[\s\t])A/

means that 'A' must be the first character on a line, or be
preceded by a space or tab character to match. Similarly

/A($|[\s\t])/

means that 'A' must be the last character on a line or be
followed by a space or tab character.
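
The two expressions can be tried in Python as a sketch. Python
spells the space/tab class as [ \t] because '\s' has a wider meaning
there; the sample lines are made up for the illustration.

```python
import re

starts_word = re.compile(r"(^|[ \t])A")   # 'A' first on the line or after a blank
ends_word = re.compile(r"A($|[ \t])")     # 'A' last on the line or before a blank

lines = ["A first", "see A here", "seeA here", "Apple"]

start_hits = [bool(starts_word.search(line)) for line in lines]
end_hits = [bool(ends_word.search(line)) for line in lines]

print(start_hits)
print(end_hits)
```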

2.7 Examples

The regular expression:

/[A-Za-z][a-z]\s+.*/

will match an upper or lower-case letter followed by a
lower-case letter followed by one or more blanks followed by any
character except a newline zero or more times.


The regular expression:

/\([A-Z]+\)[!\s]+/

will match a left parenthesis followed by one or more uppercase
letters followed by a right parenthesis followed by one or more
characters which are not blanks.

The regular expression:

/[\s\t]+ARCHIVE([\s\t]+|$)/

will match one or more blanks or tabs followed by the word (in
upper-case) "ARCHIVE" followed either by one or more blanks or
tabs or by the end of line. Note that this kind of construct is
handy for finding words as independent units and not buried
within other words.

The trailing group of this regular expression:

/([\s\t]+|$)/

is necessary to find words with trailing blanks or that end the
search line. If only [\s\t]+ had been used then words ending the
search line would not be found, since there are no trailing
blanks or tabs.

Note that for files with the newline character, '\n', at the end
of all lines, commonly called ASCII text files, it is possible to
search for regular expressions that may span more than one line.
For example, if we wanted to find all sequences of the names

Ted, Alice, George and Mary

that were separated by spaces, tabs or line boundaries, we would
write the following regular expression:

/[\t-\r\s]+Ted[\t-\r\s]+Alice[\t-\r\s]+George[\t-\r\s]+Mary[\t-\r\s]/

The regular expression:

/^As\s+(Fred|Ted|Jed|Ned)\s+(began|ended)(\s+|$)/

will match the beginning of the search line followed by "As",
i.e., 'A' as the first character of the search line, followed by
one or more blanks followed by "Fred" or "Ted" or "Jed" or "Ned"
followed by one or more blanks followed by "began" or "ended"
followed by one or more blanks or the end of the search line.
This could be modified slightly to be:

/^As\s+(Fr|T|J|N)ed\s+(began|ended)(\s+|$)/

or

/^As\s+(Fr|[TJN])ed\s+(began|ended)(\s+|$)/

Either form will result in exactly the same search.

2.8 Look Ahead Operator

Sometimes it is necessary to find a regular expression, but only
when it is followed by another regular expression. Thus we wish
to find "Mr", but only when it is followed by "Smith". The
"look-ahead" operator, '@', is used to denote this situation. In
general, if r is a regular expression we wish to match, but only
when followed by the regular expression s, then we would express
this as:

/r@s/

Thus, to find "Mr", but only when followed by "Smith", we have:

/Mr@[\s\t]+Smith/
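
Python's 're' module has no '@' operator, but its (?=...) construct
expresses the same idea: the text after the match must fit, yet is
not part of the match. The following sketch uses that construct as an
analogue; the sample sentence is made up.

```python
import re

# Analogue of /Mr@[\s\t]+Smith/ : "Mr" only when blanks and "Smith" follow
look_ahead = re.compile(r"Mr(?=[ \t]+Smith)")

text = "Dear Mr   Smith and Mr Jones"
first = look_ahead.search(text)
all_hits = look_ahead.findall(text)

print(first.group(0), first.span())
print(all_hits)
```

Only the first "Mr" is found, and the matched text is "Mr" alone; the
blanks and "Smith" are not included in the match.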

2.9 Match Classes

There are also circumstances in which we wish to find pairs of
characters. For example, we wish to find all clauses in a letter
enclosed within parentheses, "()", braces, "{}", or brackets,
"[]". We could write several separate regular expressions which
are identical except that one would use parentheses, another
braces, etc. A simpler method has been introduced using the
concept of matched character classes. A matched character class
is denoted as:

[#\(\{\[] and [#\)\}\]]

The first instance of a "matched character class" in a regular
expression will match any character in the class. The second
instance will match only the character that occupies the same
position in its class as the character matched by the first
instance. For example, in the above two classes, if the character
that matched the first class was '[', then only a ']' would match
the second class and not a ')' or a '}'. Note the use of the
backslash above to avoid any confusion in interpreting the
characters "()", "{}", and "[]" as characters and regular
expression operators. Except for ']', the backslash is not needed
since the characters do not act as operators within a character
class. For the character ']', the backslash is necessary to
prevent early termination of the character class.
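
Python's 're' module has no matched character classes, so the
following sketch emulates the pairing rule by hand: the closing
character is looked up at the same position as the opening one. The
names OPENERS, CLOSERS and find_enclosed are hypothetical, and the
sample text is made up.

```python
import re

OPENERS = "({["   # first class
CLOSERS = ")}]"   # second class, same order


def find_enclosed(text):
    """Return clauses whose closing character pairs with the opening one."""
    found = []
    for opener in re.finditer(r"[({\[]", text):
        # Only the closer in the corresponding position may end the clause.
        closer = CLOSERS[OPENERS.index(opener.group(0))]
        end = text.find(closer, opener.end())
        if end != -1:
            found.append(text[opener.start():end + 1])
    return found


clauses = find_enclosed("see (note 1], then [item 2] and {x}")
print(clauses)
```

The mismatched pair "(note 1]" is not reported, because ']' is not in
the position corresponding to '('.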

Note that matched character classes cannot be nested. Thus, the
span of characters between two different matched character
classes cannot overlap. If we wanted to find regular expressions
contained within "([" and ")]" or within "{[" and "}]", the
instances of each in the regular expression could not overlap,
i.e., we could NOT write a regular expression like:

/this [#\(\[] exp [#\{\[] contains [#\)\]] two [#\}\]] matched/
      │<────────────────────────────────>│           │
      │           │<────────────────────────────────>│

This regular expression would be interpreted as:

/this [#\(\[] exp [#\{\[] contains [#\)\]] two [#\}\]] matched/
      │<───────────────>│          │<───────────────>│

2.10 Named Expressions

If the strings to be found using regular expressions are
complicated, the associated regular expressions can become very
difficult to understand. This makes it very hard to determine if
the regular expression is correct. For example, the regular
expression (as one line):

/^[A-Za-z_][A-Za-z0-9_]*([\s\t]+\**[A-Za-z_][A-Za-z0-9_]*)*
\((([\s\t]*[\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*)(,([\s\t]*
[\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*))*)*\)([\s\t]*
(\/\*.*\*\/)[\s\t]*)*$/

will find function definitions in C language programs.
Constructing and analysing this regular expression as a single
entity is difficult.

Breaking such regular expressions into smaller units, which are
shorter and simpler, makes the task much easier. QTAwk has
introduced the concept of "named expressions" for this purpose.
Named expressions are QTAwk variable names enclosed in braces,
'{' '}'. In translating the regular expression into internal form,
QTAwk scans the regular expression for named expressions and
substitutes the current value of the variable named. If a
variable does not exist by the name specified, no substitution is
made.

After defining a variable:

fst = "first words";

the following regular expression:

/The {fst} of the child/

would expand into:

/The first words of the child/

Named expressions allow for building up regular expressions from
smaller more easily understood regular expressions and for
re-using the smaller regular expressions. The following example
QTAwk utility builds the previous regular expression for
recognizing C language function definitions (all on one line)
from many smaller regular expressions. Each constituent regular
expression is built to recognize a particular part of the
function definition. When combined into the final regular
expression, the three parts of the definition can be easily
understood. The final regular expression is expanded in the final
print statement. It spans several 80 character lines and is much
more difficult to understand due to its length and complexity.

Example:
BEGIN {
    # define variables for use in regular expressions:
    # Define C name expression
    c_n = /[A-Za-z_][A-Za-z0-9_]*/;
    # Define C comment expression
    # Note: Does NOT allow comment to span lines
    c_c = /(\/\*.*\*\/)/;
    # Define single line comment
    c_slc = /({_w}*{c_c}{_w}*)*/;
    # Define C name with pointer
    c_np = /\**{c_n}/;
    # Define C name with pointer or address
    c_ni = /[\*&]*{c_n}/;
    # Define C function type and name declaration
    c_fname = /{c_n}({_w}+{c_np})*/;
    # Define expression for first argument in function list
    c_first_arg = /({_w}*{c_ni}{_w}*)/;
    # Define expression for remaining arguments in function list
    c_rem_arg = /(,{c_first_arg})*/;
    # Define C function argument list
    c_arg_list = /\(({c_first_arg}{c_rem_arg})*\)/;
    #
    # Expression to find all C function definitions
    totl_name = /^{c_fname}{c_arg_list}{c_slc}$/;
    #
    # print total expression to illustrate expansion of named
    # expressions
    # Refer to the description of the 'replace' function
    #
    print replace(totl_name);
}

The string output by this utility is:

^[A-Za-z_][A-Za-z0-9_]*([\s\t]+\**[A-Za-z_][A-Za-z0-9_]*)*
\((([\s\t]*[\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*)(,([\s\t]*
[\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*))*)*\)([\s\t]*
(\/\*.*\*\/)[\s\t]*)*$

Note that in printing the regular expression, the leading and
trailing slash, '/', were not printed.
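
The substitution of named expressions can be sketched in Python. This
is not how QTAwk is implemented; the function expand and its
max_passes limit are hypothetical, and only the predefined name {_w}
is supplied. The definitions of c_first_arg and c_rem_arg used here
are the ones that reproduce the printed string above.

```python
import re

NAME_REF = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand(pattern, variables, predefined=None, max_passes=20):
    """Replace {name} references repeatedly until nothing changes."""
    names = dict(variables)
    names.update(predefined or {})   # predefined names take precedence
    for _ in range(max_passes):      # bound guards against self-reference
        # Unknown names are left in place, as the text describes.
        expanded = NAME_REF.sub(
            lambda m: names.get(m.group(1), m.group(0)), pattern)
        if expanded == pattern:
            break
        pattern = expanded
    return pattern


variables = {
    "c_n": r"[A-Za-z_][A-Za-z0-9_]*",
    "c_c": r"(\/\*.*\*\/)",
    "c_slc": r"({_w}*{c_c}{_w}*)*",
    "c_np": r"\**{c_n}",
    "c_ni": r"[\*&]*{c_n}",
    "c_fname": r"{c_n}({_w}+{c_np})*",
    "c_first_arg": r"({_w}*{c_ni}{_w}*)",
    "c_rem_arg": r"(,{c_first_arg})*",
    "c_arg_list": r"\(({c_first_arg}{c_rem_arg})*\)",
}

total = expand(r"^{c_fname}{c_arg_list}{c_slc}$", variables,
               predefined={"_w": r"[\s\t]"})
print(total)
```

The repetition operator {n1,n2} is not disturbed by this sketch,
because a name reference must start with a letter or an underscore.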






2.11 Predefined Names

In translating regular expressions, names starting with an
underscore and followed by a single upper or lower case letter
are reserved as predefined. The following predefined names are
currently available for use in named expressions:

Alphabetic
    {_a} == [A-Za-z]
Brackets
    {_b} == [{}()[]<>]
Control Character
    {_c} == [\x001-\x01f\x07f]
Digit
    {_d} == [0-9]
Exponent
    {_e} == [DdEe][-+]?{_d}{1,3}
Floating point number
    {_f} == [-+]?({_d}+\.{_d}*|{_d}*\.{_d}+)
Floating, optional exponent
    {_g} == {_f}({_e})?
Hexadecimal digit
    {_h} == [0-9A-Fa-f]
Integer
    {_i} == [-+]?{_d}+
alpha-Numeric
    {_n} == [A-Za-z0-9]
Octal digit
    {_o} == [0-7]
Punctuation
    {_p} == [\!-/:-@[-`{-\x07f]
double or single Quote
    {_q} == {_s}["'`]
Real number
    {_r} == {_f}{_e}
zero or even number of Slashes
    {_s} == (^|[!\\](\\\\)*)
printable character
    {_t} == [\s-~]
graphical character
    {_u} == [\x01f-~]
White space
    {_w} == [\s\t]
space, \t, \n, \v, \f, \r, \s
    {_z} == [\t-\r\s]



The above predefined names will take precedence over any
variables with identical names in replacing named expressions in
regular expressions and the 'replace' function.

2.12 Operator Summary

The QTAwk regular expression operators are summarized below:

^           matches Beginning of Line
$           matches End of Line as last character of regular expression
\c          matches following (hexadecimal value shown in parentheses):
                \a    == bell (alert)      ( \x07 )
                \b    == backspace         ( \x08 )
                \f    == formfeed          ( \x0c )
                \n    == newline           ( \x0a )
                \r    == carriage return   ( \x0d )
                \s    == space             ( \x20 )
                \t    == horizontal tab    ( \x09 )
                \v    == vertical tab      ( \x0b )
                \c    == c  [ \\ == \ ]
                \ooo  == character represented by octal value ooo
                         1 to 3 octal digits acceptable
                \xhhh == character represented by hexadecimal value hhh
                         1 to 3 hexadecimal digits acceptable

.           matches any character except newline, '\n'
[abc0-9]    Character Class - match any character in class
[^abc0-9]   Negated Character Class - match any character not in class
[!abc0-9]   Negated Character Class - match any character not in class
[#abc0-9]   Matched Character Class - for second match, class
            character must match in corresponding position
*           Closure, Zero or more matches
+           Positive Closure, One or More matches
?           Zero or One matches
r(s)t       embedded regular expression s
r|s|t       '|' == logical 'or' operator. Regular expression r or s or t
@           Look-Ahead, r@t, matches regular expression 'r' only when r
            is followed by regular expression 't'. Regular expression t
            not contained in final match. Symbol loses special meaning
            when contained within parentheses, '()', or character class,
            '[]'.
r{n1,n2}    at least n1 and up to n2 repetitions of regular expression r
            n1, n2 integers with 1 <= n1 <= n2
                r{2,6} ==> rrr?r?r?r?
                r{3,3} ==> rrr
            Expressions grouped by "...", (), [], or names, "{name}",
            are repeated as a group (note the treatment of quoted
            expressions):


                (r){2,6} ==> (r)(r)(r)?(r)?(r)?(r)?
                [r]{2,6} ==> [r][r][r]?[r]?[r]?[r]?
                {r}{2,6} ==> {r}{r}{r}?{r}?{r}?{r}?
                "r"{2,6} ==> "rr(r)?(r)?(r)?(r)?"

{named_expr} - named expression. In regular expressions "{name}"
            is replaced by the string value of the corresponding
            variable. Unrecognized variable names are not replaced.

3.0 EXPRESSIONS

QTAwk provides a rich set of operators which may be used in
expressions. The QTAwk operators are listed below from highest to
lowest precedence:

Operation                 Operator           Associativity
grouping                  ()                 left to right
array subscripting        []                 left to right
field                     $                  left to right
tag                       $$                 left to right
logical negation (NOT)    !                  left to right
one's complement          ~                  left to right
increment/decrement       ++ --              right to left
unary plus/minus          + -                left to right
exponentiation            ^                  right to left
multiply, divide,         * / %              left to right
  remainder
binary plus/minus         + -                left to right
concatenation             (adjacency)        left to right
concatenation             ∩                  left to right
shift left/right          << >>              left to right
relational                < <= > >=          left to right
equality                  == !=              left to right
matching                  ~~ !~              left to right
array membership          in                 left to right
bit-wise AND              &                  left to right
bit-wise XOR              @                  left to right
bit-wise OR               |                  left to right
logical AND               &&                 left to right
logical OR                ||                 left to right
conditional               ? :                right to left
assignment                = ^= *= /= %=      right to left
                          += -= &= @= |=
                          <<= >>= ∩=
sequence                  ,                  left to right

3.1 New/Changed Operators

Note that QTAwk has changed some operators from C and Awk. QTAwk
has retained '^' as the Awk exponentiation operator (in C, '^' is
the bitwise XOR operator) and made '@' the bitwise XOR operator.
QTAwk has changed the Awk match operators to '~~' and '!~' to
bring them more in alignment with the equality operators, '=='
and '!='. This has freed up the single tilde to restore it to its
C meaning of one's complement. QTAwk has also brought forward the
remainder of the C operators: shift, '<<' and '>>', bit-wise
operators, '&', '@' and '|', and the sequence operator, ','.

QTAwk has retained the practice of forcing string concatenation
by placing two constants, variables or function calls adjacent.
QTAwk has introduced the string concatenation operator, '∩'
(character 239, 0xef of the extended ASCII character set). The
string concatenation operator has the advantage of making
concatenation explicit and allowing the string concatenation
assignment operator, '∩='. Thus, string concatenation operations
which previously had to be written as:

new_string = new_string old_string;

may now be written:

new_string ∩= old_string;

Thus a loop to build a string of numerics which previously was
written as:

for( i = 8 , j = 9 ; i ; i-- ) j = j i;

can be written as:

for( i = 8 , j = 9 ; i ; i-- ) j ∩= i;

and will produce a value for j of:

"987654321"

The string concatenation operator also makes some constructs
behave as expected that otherwise do not. For example, without
it, the statements:

ostr = "prefix";
suff = "suffix";
k = 1;
j = ostr ++k suff;
print j;
print ostr;

will produce the seemingly odd output:

prefix1suffix
1

This results from two factors:

1. In tokenizing the statements, white space is used to break
keyword, variable and function names. Otherwise it is
ignored.

2. The increment operator, '++', has higher precedence than
string concatenation.

Thus, QTAwk processes the following stream of tokens:

1. j
2. =
3. ostr
4. ++
5. k
6. suff
7. ;

In interpreting the stream, '++' is encountered immediately
after 'ostr' and is interpreted as a postfix operator operating
on 'ostr' instead of a prefix operator operating on 'k'. Thus,
the stream appears to QTAwk as:

j = ostr++ k suff;

After concatenating the current string value of ostr, "prefix",
with the string value of k, "1", ostr is converted to a numeric,
yielding a value of zero, 0, which is incremented to one, 1.

This seemingly anomalous situation can be remedied in two ways:

1. Surround ++k with parenthesis, thus explicitly binding '++'
to 'k':

j = ostr (++k) suff;



2. Use the string concatenation operator, '∩', to make explicit
the string concatenation:

j = ostr ∩ ++k suff;

or

j = ostr ∩ ++k ∩ suff;

The output produced by this is what was really desired:

prefix2suffix
prefix


In addition, QTAwk has added one operator, the tag operator,
'$$'. The tag operator is analogous to the field operator, but
can be followed only by the single numerical value of zero (0).
This operator returns the string matched by a regular expression.
When used in an action, the last regular expression match in the
corresponding pattern will set the value of the tag operator. If
there was no regular expression match in the pattern, $$0 is the
null string. The operator may also be used in the 'sub' and
'gsub' functions in the same manner that '&' is used. Regular
expressions used in actions will not disturb the value of the tag
operator set by the pattern. Consider the pattern/action pair:

/[-+]?[0-9]+/ {
print $$0;
if ( $0 ~~ /[789]/ ) print $$0;
}

With the input line:

this line contains an integer 12745

both 'print' statements will output "12745".
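
The behaviour can be imitated in Python as a sketch. The variable tag
is a hypothetical stand-in for $$0: it is captured once from the
pattern match and is not affected by the later match performed inside
the action.

```python
import re

line = "this line contains an integer 12745"

# The pattern /[-+]?[0-9]+/ selects the record and sets the tag.
pattern_match = re.search(r"[-+]?[0-9]+", line)
tag = pattern_match.group(0)          # plays the role of $$0

print(tag)
# A second match inside the action finds "7" but leaves the tag alone.
inner = re.search(r"[789]", line)
if inner:
    print(tag)
```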

3.2 Sequence Operator

QTAwk uses the C sequence operator, the comma, ','. Using the
sequence operator, expressions may be combined into an expr_list:

expression_1 , expression_2 , expression_3 , ...

As in C, a list of expressions separated by the sequence
operator is valid anywhere an expression is valid. Such lists of
expressions separated by the sequence operator will be referred
to as an expression list or expr_list. Each expression in an
expr_list is evaluated in turn. The final value of the expr_list
is the value of the last expression. The sequence operator is
very useful in the loop control statements discussed below.

3.3 Match Operator Variables

QTAwk has defined two new built-in variables associated with the
match operators, MLENGTH and MSTART. Whenever a match operator
is executed, MLENGTH is set equal to the length of the matching
string or zero if no match is found. MSTART is set equal to the
position of the start of the matching string or zero if no match
is found. These built-in variables are completely analogous to
the built-in variables RLENGTH and RSTART for the built-in
'match' function.
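
As a sketch, the two values can be computed in Python. The helper
match_vars is hypothetical, and the sketch assumes that positions are
counted from 1, as for RSTART in Awk, so that zero can signal "no
match".

```python
import re


def match_vars(pattern, text):
    """Return (MSTART, MLENGTH) for the first match, or (0, 0) if none."""
    found = re.search(pattern, text)
    if found is None:
        return 0, 0
    # Python counts positions from 0; add 1 for the assumed 1-based MSTART.
    return found.start() + 1, len(found.group(0))


print(match_vars(r"[0-9]+", "abc123"))
print(match_vars(r"[0-9]+", "no digits"))
```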

3.4 Constants

Expressions in QTAwk can contain several types of constants:

1. numeric constants
2. character constants
3. string constants
4. regular expressions

Numeric constants have two forms: integer constants and
floating point constants. Integers follow the C practice of
allowing decimal, octal and hexadecimal base constants.

Decimal constants match the form:

[-+]?[0-9]+

Octal constants match the form:

0[0-7]+

Hexadecimal constants match the form:

0[xX][0-9A-Fa-f]+

The results of all three of the following expressions are
equivalent. All set the variable, int_cons, to the integer value,
11567.


int_cons = 11567;

int_cons = 026457;

int_cons = 0x2d2f;
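
The equality of the three constants can be checked with a few lines
of Python. Python writes octal literals with a '0o' prefix, so the
C-style spellings used above are parsed from strings with an explicit
base in this sketch.

```python
decimal_value = 11567
octal_value = int("026457", 8)     # C-style octal constant 026457
hex_value = int("0x2d2f", 16)      # hexadecimal constant 0x2d2f

print(decimal_value, octal_value, hex_value)
print(decimal_value == octal_value == hex_value)
```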

Floating point numeric constants match the form:

{_g}

Note that octal and hexadecimal integers are only recognized in
QTAwk expressions and not in the fields of input records. Only
decimal and floating point numeric constants are recognized in
input fields.

String constants are character sequences enclosed in double
quotes, ". The same escape sequences allowed in regular
expressions are allowed in string constants.

Character constants are single characters enclosed in single
quotes, '. The same escape sequences allowed in strings and
regular expressions are allowed in character constants. All three
of the following expressions will set the variable, chr_cons, to
'A':

chr_cons = 'A';

chr_cons = '\x041';

chr_cons = '\101';

QTAwk will maintain variables set to character constants as
single characters, but they may be used in arithmetic expressions
as any other number and QTAwk will automatically convert them to
their numeric value.
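
The equivalence of the three spellings, and the numeric value used in
arithmetic, can be checked with a few lines of Python. This is only an
illustration of the character codes involved; Python needs an explicit
ord() call where QTAwk is described as converting automatically.

```python
# The three spellings of the character constant from the text above.
as_literal = "A"
as_hex = chr(0x41)       # corresponds to '\x041'
as_octal = chr(0o101)    # corresponds to '\101'
print(as_literal == as_hex == as_octal)   # True

# Numeric value of the character when used in arithmetic.
print(ord(as_literal) + 1)                # 66
```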

The 'substr' function will return a character constant when the
requested substring is only a single character wide.


4.0 STRINGS and REGULAR EXPRESSIONS

Strings and regular expressions in QTAwk are very similar, yet
very different. Regular expressions can be used wherever strings
are used and strings may be used in most cases where a regular
expression may be used.

4.1 Regular Expression and String Translation

Regular expressions and strings used as regular expressions are
turned into an internal form for scanning the target string for a
match. For regular expressions this process of conversion into
the internal form is done once, when the regular expression is
first used. For strings the process is done every time the string
is used as a regular expression. The process of conversion into
the internal form can be time consuming if done repeatedly. The
judicious use of strings and regular expressions can give both
flexibility and speed. By using regular expressions in those
places where the content of the regular expression will not
change after the first use, the speed of a single conversion can
be attained. By using strings in those places where a regular
expression is called for, e.g., the first argument of the 'gsub'
function and the right hand expression for the match operators,
the flexibility of dynamically changing expressions can be gained
at the expense of speed.

4.2 Regular Expressions in Patterns

There are, however, some places where strings cannot be used as
regular expressions. The most notable of these is as stand-alone
regular expressions in patterns. Stand-alone regular expressions
in patterns are a shorthand for:

$0 ~~ /re/

Thus, complex expressions may be built from stand-alone regular
expressions in patterns. For example, the pattern:

/re1/ && /re2/

will match only those records for which both regular
expressions re1 and re2 match. Using the logical, relational,
equality and bit-wise operators, two or more regular expressions
may be combined in patterns to test records against more than one
regular expression. The following pattern:




/re1/ != /re2/

will select those records matching re1 and NOT matching re2, as
well as those records matching re2 and not matching re1. That is,
exactly one of the two regular expressions must match.

!/re1/

will select those records not matching the regular expression.
To use regular expressions in this manner the following logical
truth table may be used for selecting desired records which match
or do not match desired regular expressions:

r1   T  T  F  F
r2   T  F  T  F

==   T  F  F  T
!=   F  T  T  F
<=   T  F  T  T
<    F  F  T  F
>    F  T  F  F
>=   T  T  F  T
&    T  F  F  F
|    T  T  T  F
@    F  T  T  F
&&   T  F  F  F
||   T  T  T  F

Thus, if you wanted to select only those records that matched
both regular expressions and reject those records that did not
match both, the following patterns are the only ones to do so:

/re1/ & /re2/

or

/re1/ && /re2/

To select those records matching re1 but not re2 (rejecting
records that match both or neither), the following patterns could
be used:

/re1/ > /re2/

or

/re1/ && !/re2/
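
The truth table above can be regenerated mechanically by treating each
regular expression result as 1 (match) or 0 (no match). The sketch
below is Python rather than QTAwk; Python's operator functions stand in
for the QTAwk operators, with xor standing in for '@'.

```python
import operator

# Each regular expression pattern yields 1 (match) or 0 (no match).
OPERATORS = [
    ("==", operator.eq), ("!=", operator.ne),
    ("<=", operator.le), ("<", operator.lt),
    (">", operator.gt), (">=", operator.ge),
    ("&", operator.and_), ("|", operator.or_),
    ("@", operator.xor),                       # '@' is the bit-wise XOR
    ("&&", lambda a, b: int(bool(a) and bool(b))),
    ("||", lambda a, b: int(bool(a) or bool(b))),
]
CASES = [(1, 1), (1, 0), (0, 1), (0, 0)]       # (r1, r2) columns of the table

def truth_row(func):
    """Return the T/F entries of one table row."""
    return ["T" if func(r1, r2) else "F" for r1, r2 in CASES]

table = {name: truth_row(func) for name, func in OPERATORS}
for name, row in table.items():
    print(name.ljust(3), " ".join(row))
```

Reading the rows confirms the two selections discussed above: only '&'
and '&&' are TRUE solely in the first column, and '>' is TRUE solely in
the second column.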



Regular expressions and strings may also be used in 'case'
statements as described later. However, strings are not
equivalent to regular expressions in the 'case' statement.


5.0 PATTERN-ACTIONS

QTAwk recognizes utilities in the following format:

pattern { action }

The opening brace, '{', of the action must be on the same line
as the pattern. Patterns control the execution of actions. When a
pattern matches a record, the associated action is executed.
Patterns consist of valid QTAwk expressions or regular
expressions. The sequence operator acquires a special meaning in
pattern expressions and loses its meaning as a sequence operator.

QTAwk follows the C practice in logical operations of
considering a nonzero numeric value as true and a zero numeric
value as false. This has been expanded in QTAwk for strings by
considering the null string as false and any non-null string as
true. When a logical operation is performed, the operation
returns an integer value of one (1) for a true condition and an
integer value of zero (0) for a false condition.

5.1 QTAwk Patterns

QTAwk recognizes the following types of patterns:

1. { action }
the pattern is assumed TRUE for every record and the action
is executed for all records.

2. expression
the default action {print;} is executed for every record for
which expression evaluates to TRUE.

3. expression { action }
the actions are executed for each record for which expression
evaluates to TRUE.

4. /regular expression/ { action }
the actions are executed for each record for which the
regular expression matches a string in the record (TRUE
condition). The regular expression may be specified
explicitly as shown or specified by a variable with a regular
expression value. For example, setting the variable, var_re,
as:

var_re = /Replacement String/;



and specifying the pattern as:

var_re { action }

would be identical to:

/Replacement String/ { action }

The use of a variable has the advantage that the value of the
variable can be changed. Changing the variable to another
regular expression gives a QTAwk utility the capability of
dynamically changing the patterns recognized.

5. compound pattern { action }
the pattern combines regular expressions with logical NOT,
'!', logical AND, '&&', logical OR, '||', bit-wise AND, '&',
bit-wise OR, '|', bit-wise XOR, '@', the relational
operators, '<=', '<', '>', '>=', the equality operators, '=='
and '!=', and the matching operators, '~~' and '!~'. The
action is executed for each record for which the compound
pattern is TRUE.

6. expression1 , expression2 { action }
range pattern. The action is executed for the first record
for which expression1 is TRUE and every record until
expression2 evaluates TRUE. The range is inclusive. This
illustrates the special meaning of the sequence operator in
patterns.

7. predefined pattern { action }
the predefined patterns are described next
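
The inclusive behavior of the range pattern (item 6 above) can be
modelled in a few lines of Python. This is a sketch, not QTAwk code:
the function name range_pattern is hypothetical, and testing
expression2 on the same record that opened the range follows the usual
Awk convention, which is an assumption here.

```python
def range_pattern(records, starts, ends):
    """Yield the records selected by the range pattern 'starts , ends'."""
    in_range = False
    for record in records:
        if not in_range and starts(record):
            in_range = True            # expression1 is TRUE: open the range
        if in_range:
            yield record               # the action runs for this record
            if ends(record):
                in_range = False       # expression2 is TRUE: close, inclusively

records = ["a", "START", "b", "END", "c"]
selected = list(range_pattern(records,
                              lambda r: r == "START",
                              lambda r: r == "END"))
print(selected)   # ['START', 'b', 'END']
```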

5.2 QTAwk Predefined Patterns

QTAwk provides six predefined patterns, all of which (except
for the 'GROUP' pattern) require actions. The six predefined
patterns are:

1. BEGIN { action }
the action(s) associated with the BEGIN pattern are executed
once prior to opening the first input file. There may be
multiple BEGIN { action } combinations. Each action is
executed in the order in which it is specified.

2. INITIAL { action }
or
INITIALIZE { action }
the action(s) associated with the INITIAL (INITIALIZE)
pattern are executed after each input file is opened and
before the first record is read. There may be multiple
INITIAL { action } combinations. Each action is executed in
the order in which it is specified.

3. GROUP re { action }
GROUP re { action }
GROUP re { action }
GROUP re { action }
or
GROUP re
GROUP re { action }
GROUP re
GROUP re { action }
or
GROUP re
GROUP re { action }
GROUP re
GROUP re

the pattern associated with the 'GROUP' pattern keyword may
be a single regular expression constant, a string constant or
a variable name. All consecutive GROUP/action pairs are
grouped and the search for the regular expressions optimized
over the group. Each regular expression of the GROUP may have
a separate action associated with it. In this case the
appropriate action is executed if the regular expression is
matched on the current input record. If the action for a
regular expression is not given, then the next action
explicitly given is executed. If no action is given for the
last regular expression of a GROUP, then the default action

{ print ; }

is assigned to it. When one of the regular expressions of
the GROUP is matched, the built-in variable, NG, is set equal
to the number of the regular expression. The numbering of the
regular expressions in the GROUP starts with one, 1.

There may be more than one GROUP of regular expression
patterns. Any pattern not preceded with the 'GROUP' keyword
will cause a GROUP to be terminated. The occurrence of the
'GROUP' keyword again will start a new GROUP and the
numbering of the new group starts at one, 1.



GROUP patterns are discussed in more detail later.

4. NOMATCH { action }
the action(s) associated with the NOMATCH pattern are
executed for each record for which no pattern is TRUE. There
may be multiple NOMATCH { action } combinations. Each action
is executed in the order in which it is specified.

5. FINAL { action }
or
FINALIZE { action }
the actions associated with the FINAL (FINALIZE) pattern are
executed after the last record of each input file has been
read and before the file is closed. There may be multiple
FINAL { action } combinations. Each action is executed in the
order in which it is specified.

6. END { action }
the action(s) associated with the END pattern are executed
once after the last input file has been closed. There may be
multiple END { action } combinations. Each action is executed
in the order in which it is specified.
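
The relative timing of the predefined patterns described in the list
above can be summarized by a small driver loop. The sketch below is a
Python model only: the function name trace_run, the file names and the
stand-in pattern (records containing "x") are all hypothetical, and
GROUP patterns are left out.

```python
def trace_run(files):
    """Return the order in which predefined and ordinary actions would fire."""
    trace = ["BEGIN"]                               # once, before the first file
    for name, records in files:
        trace.append("INITIAL " + name)             # after open, before first record
        for record in records:
            if "x" in record:                       # stand-in for an ordinary pattern
                trace.append("match " + record)
            else:
                trace.append("NOMATCH " + record)   # no pattern was TRUE
        trace.append("FINAL " + name)               # after last record, before close
    trace.append("END")                             # once, after the last file
    return trace

for step in trace_run([("a.txt", ["x1", "y2"]), ("b.txt", ["x3"])]):
    print(step)
```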

Note that there may be multiple predefined pattern-action pairs
defined in a QTAwk utility. Each action is executed at the
appropriate time in the order defined.


6.0 VARIABLES and ARRAYS

Variables in QTAwk are of four kinds:
1. user defined
2. built-in
3. field
4. tag

The names of user defined variables start with an upper or lower
case character or underscore optionally followed by one or more
upper or lower case characters, digits or underscores. Most QTAwk
built-in variables are named with upper case letters and
underscores (only three are defined with lower case characters).

Variables are defined by using them in expressions. Variables
have numeric, string or regular expression values or all three
depending upon the context in which they are used in expressions
or function calls. Except for variables defined with the 'local'
keyword, all variables are global in scope. That is they are
accessible and can be changed anywhere within a QTAwk utility.
Local variables will be discussed later when the 'local' keyword
is discussed. All variables are initialized with a zero (0)
numeric value and the null string value when created by
reference. The value of the variable is changed with the
assignment operator, '=' or 'op='.

var1 = 45.87;

var2 = "string value";

var3 = /[\s\t]+[A-Za-z_][A-Za-z0-9_]+/;

var1 has a numeric value of 45.87 from the assignment
statement. It has a string value of "45.87" and a value as a
regular expression of /45\.87/. The string and regular expression
values of var1 may be changed by changing the value of the
built-in variable "OFMT". The string value of OFMT is used to
convert numeric values to string and regular expression values.
OFMT is initialized with a value of "%.6g" and can be changed
with an assignment statement. Such changes would then affect the
string and regular expression values of numeric quantities. For
example, if OFMT is assigned a value of "%u", then the string and
regular expression values of var1 would become "45" and /45/
respectively.
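
The effect of OFMT on the string value can be tried with Python's
printf-style formatting, which accepts the same two format strings.
This is an illustration only; the use of re.escape to mimic the regular
expression value is an assumption about how the '.' is protected, not a
description of QTAwk internals.

```python
import re

var1 = 45.87

string_value = "%.6g" % var1       # the default OFMT
print(string_value)                # 45.87
print(re.escape(string_value))     # 45\.87

print("%u" % var1)                 # 45
```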

The numeric values of both var2 and var3 are zero (0).

The string value of var3 is "[\s\t]+[A-Za-z_][A-Za-z0-9_]+".
Note that the tab escape sequence, '\t', is not expanded in
converting the regular expression to a string. The reverse is not
true. One difference between strings and regular expressions is
the time at which escape sequences such as '\t' are translated to
ASCII hexadecimal characters. For strings, the translation is
done when the strings are read from the QTAwk utility. For
regular expressions the escape sequences are translated when the
regular expression is converted to internal form. For this
reason, strings used in the place of regular expressions undergo
a double translation, first when read from the QTAwk utility and
second when converted into the internal regular expression form.
The second translation of strings used for regular expressions is
the reason backslash characters, '\', must be doubled for strings
used in this manner.

6.1 QTAwk Arrays

Arrays in QTAwk are a blending of Awk and C. The use of the Awk
associative arrays is continued and expanded to allow integer
indices. The use of the comma to delineate multiple array indices
is discontinued. The comma is now the sequence operator and will
be so treated in array index expressions. Thus, the reference

A[i,j]

will now reference the element of A subscripted by the current
value of the variable j. As a consequence of this the Awk
built-in variable SUBSEP has been dropped. QTAwk allows
multidimensional arrays referenced in the same manner as C. Thus:

A[i][j]

references the jth column of the ith row of the two-dimensional
array A. Array subscripts may be strings. Thus:

A[i]["state"]

would reference the "state" element of the ith row of the two
dimensional array, A. QTAwk allows array indices to be either
integers or strings. Integer or string indices may be used on the
same array. Integer indices are stored before string indices,
integer indices follow the usual numeric ordering and string
indices follow the ASCII collating sequence. The ordering will be
apparent in use of the 'in' form of the 'for' statement:




for ( k in A ) statement

k is stepped through the indices of the singly dimensioned
array, A, in the order stored. Thus, if A has the following
indices: 1, 3, 5, 7, 8, 9, 10, 12, 14, "county", "state", "zip",
then k would be stepped through the indices in that order. Note
that allowing both string and integer indices overcomes the
disconcerting order of the "stringized numerical" indices of Awk.
Specifically, index 10 does not precede 2 as "10" does precede
"2" in Awk. QTAwk still allows the use of numeric strings such as
"10", "2", etc., but in most cases where such strings would be
used, the user should be aware that integer indices are now
available and will prevent the counterintuitive ordering of Awk.
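
The ordering rule (integer indices first in numeric order, then string
indices in ASCII order) can be expressed as a sort key. The Python
sketch below is a model of the documented order, not of the internal
storage; the function name index_order is hypothetical.

```python
def index_order(indices):
    """Sort integer indices numerically first, then string indices by character code."""
    return sorted(indices, key=lambda k: (isinstance(k, str), k))

print(index_order(["zip", 10, "state", 2, "county", 1]))
# [1, 2, 10, 'county', 'state', 'zip']

# Numeric strings sort as strings, which gives the counterintuitive order.
print(index_order(["10", "2", "1"]))
# ['1', '10', '2']
```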

Note that only indexed elements of an array actually referenced
exist. Thus, for the array A above, the elements for indices 2,
4, 6 and 13 do not exist since they have not been referenced.
This follows the general philosophy that a variable does not
exist until it has been referenced.

6.2 QTAwk Arrays in Arithmetic Expressions

When arrays are used in arithmetic expressions in QTAwk, the
entire array is operated on or assigned. For example, if the
variable 'B' is a 3x3 array with the following values:

B[1][1] = 11, B[1][2] = 12, B[1][3] = 13
B[2][1] = 21, B[2][2] = 22, B[2][3] = 23
B[3][1] = 31, B[3][2] = 32, B[3][3] = 33

Assigning B to the variable 'A':

A = B

will duplicate the entire array into A.

A[1][1] = 11, A[1][2] = 12, A[1][3] = 13
A[2][1] = 21, A[2][2] = 22, A[2][3] = 23
A[3][1] = 31, A[3][2] = 32, A[3][3] = 33

If A and B are array variables and C is a scalar (non-array)
variable, then the following expression forms for the assignment
operators, 'op=', are legal:

1. A = B
assign one array to a second. The original elements of array
A are deleted and the elements of B duplicated into A.

2. C = B
assigning an array to a variable currently a scalar. Again
the elements of B are duplicated into elements of C which
becomes an array.

3. A = C
assigning a scalar to a variable which is an array. The
elements of the array are discarded and the variable becomes
a scalar.

4. A = B[i]...[j]
assigning an array element to a variable which is currently
an array. Since the element of an array is a scalar, this
case is essentially the same as the immediately previous
case.

5. A[i]...[j] = B[k]...[l]
since array elements are scalars, this is the usual scalar
assignment case.

6. A op= C
the 'op=' operator is applied to every element of A. Thus, A
+= 2, would add '2' to every element of A.

7. A op= B
the 'op=' operator is applied to every element of A for which
an element exists in B with identical indices. No elements
are created in A to match elements of B with indices
different from any element of A. Thus, the sequence of
statements:

A = B;
A += B;

would leave every element of A with twice the value of the
corresponding element of B.
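
Forms 6 and 7 of the list above can be modelled with Python
dictionaries standing in for singly dimensioned arrays. The helper
names op_assign_scalar and op_assign_array are hypothetical; the sketch
only illustrates which elements are touched.

```python
import operator

def op_assign_scalar(a, op, c):
    """A op= C: apply the operator to every element of A."""
    for key in a:
        a[key] = op(a[key], c)

def op_assign_array(a, op, b):
    """A op= B: apply the operator only where B has an element with the same index."""
    for key in a:
        if key in b:
            a[key] = op(a[key], b[key])

B = {1: 11, 2: 12, 3: 13}
A = dict(B)                             # A = B;   duplicates the array
op_assign_array(A, operator.add, B)     # A += B;
print(A)                                # {1: 22, 2: 24, 3: 26}

op_assign_scalar(A, operator.add, 2)    # A += 2;
print(A)                                # {1: 24, 2: 26, 3: 28}
```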

There are three cases of using arrays with the assignment
operators that are not legal and for which QTAwk will issue an
error message at runtime.

1. A[i]...[j] = B
2. A[i]...[j] op= B
3. C op= B



These are all variations on the same expression. In the first
case, the expression is attempting to assign an array to a
scalar, an array element. Since an array element cannot be
further expanded into an array, the assignment is not allowed. In
the second and third cases, the expressions are attempting to
operate on a scalar with an array and assign the result to the
scalar. Both of these expressions fail for the same reason, an
array cannot operate on a scalar. It is possible for a single
value, a scalar, to operate on every element of an array, but the
reverse, having each element of the array operate on the scalar
is not permitted.

The reasoning prohibiting the second and third case above is
extended to all binary expressions involving arrays in QTAwk. In
general, arrays are allowed in expressions with binary arithmetic
operators:

~ ^ * / % + - << >> & @ |

as well as string concatenation:

A B (equivalent to A ∩ B)

In such expressions, arrays are allowed in the following forms:

1. A op B
2. A op C

But not as

C op A

It could be argued that expressions such as,

2 + A

should be allowed since '+' is commutative and the expression
could be written equivalently as,

A + 2

This is true for addition, but not for all of the binary
arithmetic operators. For example, the division operator is not
commutative.

2 / A



could not be written equivalently as:

A / 2

For this reason, QTAwk does not allow any array expressions of
the form:

scalar op array

The unary arithmetic operators may also be used to operate on
entire arrays:

++A (pre-fix increment operator)

--A (pre-fix decrement operator)

A++ (post-fix increment operator)

A-- (post-fix decrement operator)

-A (Unary minus operator)

+A (Unary plus operator)

~A (Unary one's complement operator)

An expression such as:

A + B

will result in an array with element indices identical to those
of A, and with values which are the sum of the elements of A and
B, which have identical indices. If A has an element for which B
does not have a corresponding element, the resultant element
value is equal to the A element value. Elements of B which have
no corresponding element in A are not represented in the resultant
array.
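
The index rule for 'A + B' just described can be sketched in Python
with dictionaries standing in for arrays. The function name array_op is
hypothetical and the sample values are invented for illustration.

```python
import operator

def array_op(a, op, b):
    """Model of 'A op B': the result has exactly the indices of A."""
    return {key: op(value, b[key]) if key in b else value
            for key, value in a.items()}

A = {1: 10, 2: 20, 3: 30}
B = {2: 1, 3: 2, 4: 3}
D = array_op(A, operator.add, B)
print(D)   # {1: 10, 2: 21, 3: 32}
```

Index 1 keeps the value from A because B has no such element, and
index 4 of B does not appear in the result.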

An array with elements of double the value of the elements of B
can be created as:

A = B;
D = A + B;

or as




D = B + B;

or as

D = B * 2;

Any of the above sequences of statements will result in an array,
D, with elements with indices identical to B, and with double the
element values. The array A could be made an array with elements
twice the element values of B with the statement:

A = B;
A *= 2;

Arrays may be used in expressions with arithmetic operators and
the whole array will be utilized in the expression. This does not
extend to the logical operators:

! < <= >= > == != ~~ !~ && ||

Using an array with a logical operator will result in the first
element in the array only being used in the expression.


7.0 GROUP PATTERNS

GROUP patterns follow the syntax:

GROUP re1 { optional action }
GROUP re2 { optional action }
GROUP re3 { optional action }
GROUP re4 { optional action }
GROUP re5 { optional action }
GROUP re6 { optional action }

Actions are optional with any particular regular expression in
the group. If no action is given, the next action specified in
the group is executed. If no action is specified for the last
regular expression in a group, the default action, "{print;}" is
assigned to it.

Any utility may have more than one GROUP of patterns. A group is
terminated by any pattern not starting with the 'GROUP' keyword.

7.1 GROUP Pattern Advantage

GROUP patterns have two distinct advantages in QTAwk:

1. the regular expressions contained in the GROUP are optimized
to decrease search time, and
2. input records are searched once for all regular expressions
in a GROUP. If the regular expressions were organized as
individual pattern-actions, each record is searched
separately for each regular expression.

For utilities containing many regular expression patterns for
which to search, a utility organized into one or more GROUPs
can be many times faster than a utility organized as ordinary
pattern/action pairs. For example, the QTAwk utility ansicstd.exp
shown in Appendix III searches a C source file listing for ANSI C
Standard defined names. The utility organizes the search into a
single GROUP and will search a source file approximately 6 times
faster than the same utility organized as separate pattern/action
pairs without the use of a GROUP.

7.2 GROUP Pattern Disadvantage

GROUP patterns have one disadvantage compared to ordinary
pattern/action pairs. QTAwk will find only one of the regular
expressions in a GROUP. A set of GROUP patterns:



GROUP re1 { action1; }
GROUP re2 { action2; }
GROUP re3 { action3; }

is similar in execution to:

re1 { action1; next; }
re2 { action2; next; }
re3 { action3; next; }

If more than one regular expression in a group will match a
given string in the input record, the regular expression listed
first in the GROUP will be matched and the appropriate action
executed. If all regular expression patterns in a GROUP must be
found in input records, then separate pattern-action pairs must
be used.
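
The behavior described in this section, together with the rule for
omitted actions given at the start of the chapter, can be modelled as
follows. This is a Python sketch under stated assumptions: the names
resolve_actions and match_group are hypothetical, actions are
represented by plain strings, and the model tries the regular
expressions in listed order, which matches the 'next' analogy above but
says nothing about the optimized search QTAwk actually performs.

```python
import re

def resolve_actions(group):
    """Give each (regex, action) pair its effective action.

    A missing action (None) takes the next action given in the group;
    a missing action on the last pattern falls back to the default 'print'.
    """
    resolved = []
    pending = "print"
    for regex, action in reversed(group):
        if action is not None:
            pending = action
        resolved.append((regex, pending))
    return list(reversed(resolved))

def match_group(group, record):
    """Return (NG, action) for the first listed regex that matches, else None."""
    for number, (regex, action) in enumerate(resolve_actions(group), start=1):
        if re.search(regex, record):
            return number, action
    return None

group = [("a", None), ("b", "action2"), ("c", None)]
print(resolve_actions(group))
print(match_group(group, "xbx"))   # (2, 'action2')
print(match_group(group, "abc"))   # (1, 'action2')
```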

7.3 GROUP Pattern Regular Expressions

The regular expressions associated with the GROUP pattern can be
either regular expression constants, e.g.,

GROUP /regular expression constant/

a string constant, e.g.,

GROUP "string constant"

or a variable, e.g.,

GROUP var_name


GROUP patterns are converted into an internal form for regular
expressions only once, when the pattern is first used to scan an
input line. Any variables in a GROUP pattern will be evaluated,
converted to string form and interpreted as a regular expression.


8.0 STATEMENTS

QTAwk has departed from Awk by using the C convention of using
the semi-colon, ';', as a statement terminator. QTAwk treats
newline characters as white space and ignores them, except for
terminating comments. Comments are introduced by the symbol, '#',
and continue to the next newline character. Thus the Awk practice
of letting newlines terminate some statements can no longer be
used. The Awk rules for terminating statements with the newline
except under some conditions can now be forgotten. In QTAwk,
terminate all statements with a semi-colon, ';'.

QTAwk provides braces for grouping statements to form compound
statements. Various keywords are available for controlling the
logical flow of statement execution and for looping over
statements multiple times.

8.1 QTAwk Keywords

The QTAwk keywords are:

1. break
2. case
3. continue
4. cycle
5. default
6. delete
7. deletea
8. do
9. else
10. endfile
11. exit
12. for
13. if
14. in
15. local
16. next
17. return
18. switch
19. while

The keywords 'cycle', 'deletea', 'local' and 'endfile' are new
to QTAwk. The keywords 'switch', 'case' and 'default' have been
appropriated from C with expanded functionality over C.

8.2 'cycle' and 'next'

The 'cycle' and 'next' statements allow the user to control the
execution of the QTAwk outer loop which reads records from the
current input file and compares them against the patterns. Both
statements restart the pattern matching.

The 'next' statement causes the next input record to be read
before restarting the outer pattern matching loop with the first
pattern-action pair.

The 'cycle' statement may use the current input record or the
next input record for restarting the outer pattern matching loop.
As each input record is read from the current input file, the
built-in variable CYCLE_COUNT is set to one. The 'cycle'
statement increments the numeric value of CYCLE_COUNT by one and
compares the new value to the numeric value of the built-in
MAX_CYCLE variable. One of two actions is taken depending on the
result of this comparison:

1. If CYCLE_COUNT is greater than MAX_CYCLE, then the next
input record is read, setting NR, FNR, $0, NF and the record
fields $1, $2, ... $NF, before restarting the outer pattern
matching loop. This is identical to the action of the 'next'
keyword.

2. If CYCLE_COUNT is less than or equal to MAX_CYCLE, the
current values of NR, FNR, $0, NF and the record fields are
utilized when restarting the outer pattern matching loop.

The default value of MAX_CYCLE is 100. Both CYCLE_COUNT and
MAX_CYCLE are built-in variables and may be set by the user's
utility. Setting MAX_CYCLE is useful to control the number of
iterations possible on a record. Setting MAX_CYCLE to 1 would
make the 'cycle' and 'next' keywords identical.

If the value of CYCLE_COUNT is set by the user's utility, care
should be taken to prevent the possibility of the utility
entering a loop from which it cannot exit.

The 'cycle' statement is useful when it is necessary to process
the current input record through the outer pattern match loop
more than once. The following utility is a trivial example of one
such use. This utility will print each record with the record
number multiple times. The number of times is determined by the
value assigned MAX_CYCLE in the 'BEGIN' action.

BEGIN {
    MAX_CYCLE = 10;
}

{
    print FNR,$0;
    cycle;
}
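
The CYCLE_COUNT bookkeeping behind this utility can be traced with a
short Python model. It is a sketch only: the function name
run_cycle_rule is hypothetical, and the single space between FNR and $0
assumes the usual Awk default output field separator.

```python
def run_cycle_rule(records, max_cycle=100):
    """Return the lines printed by the rule  { print FNR,$0; cycle; }."""
    output = []
    for fnr, record in enumerate(records, start=1):
        cycle_count = 1                       # set to one as each record is read
        while True:
            output.append("%d %s" % (fnr, record))   # print FNR,$0;
            cycle_count += 1                  # 'cycle' increments CYCLE_COUNT
            if cycle_count > max_cycle:       # greater: read the next record
                break
            # otherwise the same record restarts the pattern matching loop
    return output

print(len(run_cycle_rule(["a", "b"], max_cycle=10)))   # 20
print(run_cycle_rule(["a", "b"], max_cycle=1))         # ['1 a', '2 b']
```

With MAX_CYCLE set to 10 each record is printed ten times, and with
MAX_CYCLE set to 1 the model behaves exactly as 'next' would.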

8.3 'delete' and 'deletea'

The 'delete' and 'deletea' statements allow the user to delete
individual elements of an array or an entire array respectively.
The forms of the 'delete' and 'deletea' statements are:

delete A[expr_list];

and

deletea A;

The first form will delete the element of array A referenced by
the subscript determined by 'expr_list'. The second form will
delete the entire array. Note that for singly dimensioned arrays,
the 'deletea' statement is equivalent to the statement:

for ( j in A ) delete A[j];

The use of the 'deletea' statement is encouraged for simplicity
and speed of execution. The 'delete' statement may be used for
arrays of any dimension. However, for arrays with dimension
greater than 2, the elements of the array are not deleted, but
simply initialized to zero and the null string. This behavior has
to do with the structure of arrays and the 'holes' which could be
left by deleting elements. For singly dimensioned arrays, there
is no problem, since there can be no 'hole' left by deleting an
element. For example consider the singly dimensioned array:

A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]

If the array element A[5] is deleted

A[1] A[2] A[3] A[4] ____ A[6] A[7] A[8] A[9]

Then the remaining elements 'shift' to fill the 'hole'.

A[1] A[2] A[3] A[4] A[6] A[7] A[8] A[9]



For two-dimensional arrays a complication arises in trying to
fill the 'hole' left by deleting an array element.

A[1][1] A[1][2] A[1][3] A[1][4] A[1][5] A[1][6]
A[2][1] A[2][2] A[2][3] A[2][4] A[2][5] A[2][6]
A[3][1] A[3][2] A[3][3] A[3][4] A[3][5] A[3][6]
A[4][1] A[4][2] A[4][3] A[4][4] A[4][5] A[4][6]
A[5][1] A[5][2] A[5][3] A[5][4] A[5][5] A[5][6]
A[6][1] A[6][2] A[6][3] A[6][4] A[6][5] A[6][6]

If element A[4][4] is deleted, then we have the 'hole':

A[1][1] A[1][2] A[1][3] A[1][4] A[1][5] A[1][6]
A[2][1] A[2][2] A[2][3] A[2][4] A[2][5] A[2][6]
A[3][1] A[3][2] A[3][3] A[3][4] A[3][5] A[3][6]
A[4][1] A[4][2] A[4][3] _______ A[4][5] A[4][6]
A[5][1] A[5][2] A[5][3] A[5][4] A[5][5] A[5][6]
A[6][1] A[6][2] A[6][3] A[6][4] A[6][5] A[6][6]

In trying to fill the 'hole', we have a choice of shifting the
elements below the deleted element up to fill the 'hole', column
priority, or shifting the elements to the right of the deleted
element to fill the 'hole', row priority. In QTAwk, row priority
is used in filling the 'hole':

A[1][1] A[1][2] A[1][3] A[1][4] A[1][5] A[1][6]
A[2][1] A[2][2] A[2][3] A[2][4] A[2][5] A[2][6]
A[3][1] A[3][2] A[3][3] A[3][4] A[3][5] A[3][6]
A[4][1] A[4][2] A[4][3] A[4][5] A[4][6]
A[5][1] A[5][2] A[5][3] A[5][4] A[5][5] A[5][6]
A[6][1] A[6][2] A[6][3] A[6][4] A[6][5] A[6][6]

For arrays of higher dimensions the situation is even more
complicated. Not only do elements have to be "shifted", but
elements in the array will have to be discarded to do so. For
example, if A is a 3x3x3 array and element A[2][2][2] is deleted,
then element A[2][2][3], if it existed, would also be deleted by
shifting other elements to fill the 'hole'. QTAwk will in this
case initialize the element A[2][2][2] to zero and the null
string rather than delete the element and lose other elements.
Thus, the 'delete' statement only truly deletes elements for one
and two dimensional arrays.

The 'deletea' statement, however, works on arrays of any
dimension. For multi-dimensional arrays, the 'deletea' would be
equivalent to nested 'for' statements. For example, if the
'delete' statement truly deleted elements of a three dimensional
array, then the 'deletea' statement could be imagined as
equivalent to:

for ( i in A )
    for ( j in A[i] )
        for ( k in A[i][j] ) delete A[i][j][k];

8.4 'if'/'else'

The 'if' and 'else' keywords provide for executing one of
possibly two statements conditioned upon the TRUE or FALSE value
of an expr_list. The form of the 'if'/'else' statement is:

if ( expr_list ) statement1

or

if ( expr_list ) statement1 else statement2

If expr_list, when evaluated, produces a TRUE value then
statement1 is executed. If the expr_list produces a FALSE value,
then for the second form, statement2 is executed.

8.5 'in'

The 'in' keyword allows the user to test membership in arrays in
expressions. The form of an expression containing the 'in'
keyword is:

expression in A

If the value of 'expression' is a current subscript value of
the array A, the expression yields a TRUE value, otherwise FALSE.
For multidimensional arrays, the statement:

expression in A[i]

would test if 'expression' is a valid column subscript in the
ith row of array A. Note that A may have more than two dimensions
for this statement to be correct. The next higher dimension than
stated in the expression is always tested.
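
The membership test can be pictured outside QTAwk. The following is a minimal sketch in Python, not QTAwk code. It assumes that a multidimensional array can be modeled as nested dictionaries; the array contents and the helper name subscript_in are hypothetical and chosen only for illustration.

```python
# Hypothetical model: a two-dimensional QTAwk array pictured as nested dicts.
# Row 1 has column subscripts 1 and 2; row 2 has only column subscript 5.
A = {1: {1: "a", 2: "b"}, 2: {5: "c"}}

def subscript_in(expression, array):
    """Model of 'expression in array': TRUE when the value is a current subscript."""
    return expression in array

print(subscript_in(2, A))      # 2 is a row subscript of A
print(subscript_in(5, A[2]))   # 5 is a column subscript of row 2
print(subscript_in(5, A[1]))   # row 1 has no column subscript 5
```

Only the next index level is consulted in each call, which mirrors the rule that the next higher dimension than stated in the expression is the one tested.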

8.6 'switch', 'case', 'default'

QTAwk includes an expanded form of the C 'switch'/'case'
statements. In C, the 'switch'/'case' statements must be of the
form:

switch ( expr_list ) {
case constant1: statement
case constant2: statement
case constant3: statement
case constant4: statement
default: statement
}

The expr_list of the 'switch' statement must evaluate to an
integral value and 'constant1', 'constant2', 'constant3', and
'constant4', must be compile-time integral constant values. In
QTAwk, the 'case' statement may contain any valid QTAwk
expression or expr_list:

switch ( expr_list ) {
case expr_list1: statement
case expr_list2: statement
case expr_list3: statement
case expr_list4: statement
default: statement
}

The expr_lists of the case statements are evaluated in turn at
execution time. The resultant value is checked against the value
of the expr_list of the 'switch' statement using the following
logic.


if ( cexpr is a regular expression ) sexpr ~~ cexpr;
else sexpr == cexpr;

where cexpr is the value of the case expr_list and sexpr is the
value of the 'switch' statement expr_list. Thus if cexpr is a
regular expression, a match operation is performed. If cexpr is a
string, a string comparison is performed. If cexpr is numeric,
a numerical comparison is performed. It is possible to have case
statements with differing types of expr_list values in the same
'switch' statement and the proper comparison is made.

Once a TRUE value is returned by a case statement comparison,
the execution falls through from 'case' to 'case' with no further
comparisons made. The fall through of execution is broken by the
use of the 'break' statement as in C.
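
The comparison and fall-through rules above can be sketched as a small model. This is Python, not QTAwk, and it makes several assumptions: case labels are given as already evaluated values, statements are represented only by their names, the markers BREAK and DEFAULT are hypothetical stand-ins for the keywords, and QTAwk's own string/numeric conversion rules are not modeled.

```python
import re

BREAK = object()    # hypothetical marker standing in for a 'break' statement
DEFAULT = object()  # hypothetical marker standing in for the 'default' label

def case_matches(sexpr, cexpr):
    """Regular expression labels are matched; all other labels are compared."""
    if isinstance(cexpr, re.Pattern):
        return cexpr.search(str(sexpr)) is not None
    return sexpr == cexpr

def switch(sexpr, cases):
    """cases is a list of (label, statements) pairs in source order.

    Returns the names of the statements that would be executed.
    """
    start = None
    for n, (label, _) in enumerate(cases):
        if label is not DEFAULT and case_matches(sexpr, label):
            start = n          # first TRUE comparison; no further comparisons
            break
    if start is None:          # no 'case' matched: use 'default' if present
        for n, (label, _) in enumerate(cases):
            if label is DEFAULT:
                start = n
                break
    executed = []
    if start is None:
        return executed
    for _, statements in cases[start:]:   # fall through from 'case' to 'case'
        for statement in statements:
            if statement is BREAK:
                return executed
            executed.append(statement)
    return executed

cases = [
    (re.compile(r"^err"), ["log"]),            # no 'break': falls through
    ("stop",              ["halt", BREAK]),
    (42,                  ["answer", BREAK]),
    (DEFAULT,             ["other"]),
]

print(switch("error 12", cases))  # ['log', 'halt']
print(switch("stop", cases))      # ['halt']
print(switch(42, cases))          # ['answer']
print(switch("none", cases))      # ['other']
```

The first call shows a regular expression label matching and execution falling through into the next 'case' until a 'break' is met.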



Note that the expr_list of a 'case' statement is evaluated at
execution time and it is possible for some 'case' expr_lists to
never be evaluated. Thus side effects from the evaluation of
'case' expr_lists should not be relied upon. This is particularly
true where execution falls through from one 'case' statement to
the next.

If the expr_list of a 'case' statement evaluates to a regular
expression, then two built-in variables are set when the match
operation is performed: CLENGTH and CSTART. CLENGTH is set to the
length of the matching string found (or zero) and CSTART is set
to the starting position of the matching string found (or zero).
CLENGTH and CSTART are completely analogous to RLENGTH and RSTART
set for the 'match' function and MLENGTH and MSTART for the match
operators, '~~' and '!~'.

The 'default' keyword is provided in analogy to C. The
statements following the 'default' statement are executed if the
'switch' expr_list matches no 'case' expr_list. The 'default'
statement may be combined with other 'case' statements. It need
not be the last statement as shown.

8.7 Loops

QTAwk has four forms of loop control statements:

1. for ( exp1 ; exp2 ; exp3 ) stmt
2. for ( var in array ) stmt
3. while ( exp ) stmt
4. do stmt while ( exp );

8.8 'while'

The 'while' statement has the form:


while ( expr_list ) stmt

The expr_list is evaluated and, if TRUE, 'stmt' is executed and
expr_list is re-evaluated. This cycle continues until expr_list
evaluates to FALSE, at which point the cycle is terminated and
execution resumes with the statement following 'stmt'.

8.9 'for'

The 'for' statement has two forms:



1. for ( expr_list1 ; expr_list2 ; expr_list3 ) stmt
2. for ( var in array ) stmt

In the first form the following sequence of operations is
performed:
1. The expressions in expr_list1 are evaluated,
2. The expressions in expr_list2 are evaluated,
3. The action taken is dependent upon whether the resultant
value of expr_list2 is true or false:
a) TRUE
1: Execute 'stmt', which may be a compound statement.
2: Execute the expressions in expr_list3.
3: Control returns to item 2. above.
b) FALSE - terminate loop
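
The sequence of operations for the first form can be sketched as follows. This is a Python model, not QTAwk; the three expression lists and the loop body are passed in as functions, and the names for_loop, init, test, step and body are hypothetical. The effect of 'break' and 'continue' is not modeled.

```python
def for_loop(expr_list1, expr_list2, expr_list3, stmt):
    """Model of the first form of the 'for' statement."""
    expr_list1()              # step 1: evaluated once, before the loop
    while expr_list2():       # step 2: evaluated before every pass; FALSE ends the loop
        stmt()                # step 3 a) 1: execute the loop body
        expr_list3()          # step 3 a) 2: then control returns to step 2

state = {"i": 0, "total": 0}

def init():
    state["i"] = 1

def test():
    return state["i"] <= 5

def step():
    state["i"] += 1

def body():
    state["total"] += state["i"]

for_loop(init, test, step, body)
print(state["total"])   # 15
print(state["i"])       # 6
```

The final value of i shows that the third expression list is executed after the last pass through the body, before the test that ends the loop.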


The second form may also be used for multi-dimensional arrays:

for ( var in array[s_expr_list]...[s_expr_list] ) stmt

For each subscript in the next higher index level in the array
reference, var is set to the index value and 'stmt' is executed.
'stmt' may be a compound statement. For a multidimensional array,
the second form may be used to loop sequentially through the
indices of the next higher index level. Thus for a two
dimensional array:

for ( i in A )
    for ( j in A[i] )

will loop through the indices in the array in row order.

8.10 'do'/'while'

The form of the 'do'/'while' statement is:

do stmt while ( expr_list );

'stmt' is executed, expr_list evaluated and if TRUE 'stmt' is
executed again else the loop is terminated. Note that 'stmt' is
executed at least once.
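
The difference from the 'while' statement can be sketched with a small model. This is Python, not QTAwk; the helper names do_while and while_loop are hypothetical, and each returns the number of times 'stmt' was executed.

```python
def do_while(stmt, expr_list):
    """Model of: do stmt while ( expr_list );"""
    passes = 0
    while True:
        stmt()                # the body always runs before the test
        passes += 1
        if not expr_list():   # a FALSE value terminates the loop
            break
    return passes

def while_loop(expr_list, stmt):
    """Model of: while ( expr_list ) stmt"""
    passes = 0
    while expr_list():        # the test comes first, so the body may never run
        stmt()
        passes += 1
    return passes

print(do_while(lambda: None, lambda: False))    # 1
print(while_loop(lambda: False, lambda: None))  # 0
```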

8.11 'local'

The 'local' keyword is used to define variables within a
compound statement that are local to the compound statement and
that disappear when the statement is exited. The 'local' keyword
may be used within any compound statement, but is especially
useful in user-defined functions as described later. Variables
defined with the 'local' keyword may be assigned an initial value
in the statement and multiple variables may be defined with a
single statement. If a variable is not assigned an initial value,
it is initialized to zero and the null string just as global
variables are initialized.

Thus:

local i, j = 12, k = substr(str,5);

will define three variables local to the enclosing compound
statement:
1. i initialized to zero/null string,
2. j initialized to 12, and
3. k initialized to a substring of the variable 'str'

Local variables initialized explicitly in 'local' statements may
be initialized to constants, the values of global variables,
values returned by built-in functions, values returned by
user-defined functions or previously defined local variables. If
the value is set to that of a previously defined local variable,
the variable may not be defined in the same 'local' statement.
Thus:

local k = 5;
local j = k;

is correct, but

local k = 5, j = k;

is not. In the latter case QTAwk will quietly assume that the k,
to which j is assigned, is a global variable.

8.12 'endfile'

The 'endfile' keyword causes the utility to behave as if the end
of the current input file has been reached. Any 'FINAL' actions
are executed. If any input files remain to be processed from the
command line, the next is opened for processing. If no further
input files remain to be processed, any 'END' actions are
executed.



8.13 'break'

This keyword will terminate the execution of the enclosing
'while', 'for', 'do'/'while' loop or break execution in cascaded
'case' statements.

8.14 'continue'

This keyword will cause execution to jump to just after the last
statement in the loop body and execute the next iteration of the
enclosing loop. The loop may be any 'for', 'while' or
'do'/'while'.

8.15 'exit opt_expr_list'

This statement causes the utility to behave as if the end of the
current input file had been reached. Any further input files
specified are ignored. If there are any FINAL or END actions,
they are executed.

If encountered in a FINAL action, the action is terminated, any
further input files are ignored and any END actions are executed.

If encountered in an END action, the execution of the action is
terminated and utility execution is terminated.

The optional expr_list is evaluated and the resultant value
returned to DOS upon termination by QTAwk as the exit status. If
no expr_list is present, or no 'exit' statement encountered,
QTAwk returns a value of zero for the exit status.

8.16 'return opt_expr_list'

This statement will cause execution to return from a user
defined function. If the optional expr_list is present, it is
evaluated and the resultant value returned as the functional
value.

9.0 BUILT-IN FUNCTIONS

QTAwk offers a rich set of built-in arithmetic, string, I/O,
array and system functions. The set of built-in functions has
been extended beyond that available in Awk. The I/O functions
have been changed to match the functional syntax of all other
built-in and user defined functions.

9.1 Arithmetic Functions

QTAwk offers the following built-in arithmetic functions. Those
marked with an asterisk, '*', are new to QTAwk:

1. acos(x) ==> return arc-cosine of x (refer to the DEGREES
built-in variable).

2. asin(x) ==> return arc-sine of x (refer to the DEGREES
built-in variable).

3. atan2(y,x) ==> return arc-tangent of y/x, -π to π (refer to
the DEGREES built-in variable).

4. cos(x) ==> return cosine of x (refer to the DEGREES built-in
variable).

5. * cosh(x) ==> return hyperbolic cosine of x

6. exp(x) ==> return e^x

7. * fract(x) ==> return fractional portion of x

8. int(x) ==> return integer portion of x

9. log(x) ==> return natural (base e) logarithm of x

10. * log10(x) ==> return base 10 logarithm of x

11. * pi() ==> return pi

12. * pi ==> return pi

13. rand() ==> return random number r, 0 <= r < 1

14. sin(x) ==> return sine of x (refer to the DEGREES built-in
variable).



15. * sinh(x) ==> return hyperbolic sine of x

16. sqrt(x) ==> return square root of x

17. srand(x) ==> set x as new seed for rand()

18. srand() ==> set current system time as new seed for rand()

9.2 String Functions

QTAwk offers the following built-in string handling functions.
Those marked with an asterisk, '*', are new to QTAwk:

1. * center(s,w) ==> return string s centered in w blank
characters.

2. * center(s,w,c) ==> return string s centered in w 'c'
characters.

3. * copies(s,n) ==> return n copies of string s.

4. * deletec(s,p,n) ==> return string s with n characters
deleted starting at position p.

5. gsub(r,s) ==> substitute s for strings matched by regular
expression, r, globally in $0, return number of substitutions
made.

6. gsub(r,s,t) ==> substitute s for strings matched by regular
expression, r, globally in string t, return number of
substitutions made.

7. index(s1,s2) ==> return position of string s2 in string s1.

8. * insert(s1,s2,p) ==> return string formed by inserting
string s2 into string s1 starting at position p.

9. * justify(a,n,w) ==> return string w characters long formed
by justifying n elements of array a padded with blanks. If n
elements of array a with at least one blank between elements
would exceed width w, then the number of elements justified
is reduced to fit in the length w.

10. * justify(a,n,w,c) ==> return string w characters long
formed by justifying n elements of array a padded with
character 'c'. If n elements of array a with at least one 'c'
character between elements would exceed width w, then the
number of elements justified is reduced to fit in the length
w.

11. length ==> return number of characters in $0.

12. length() ==> return number of characters in $0.



13. length(s) ==> return number of characters in string s.

14. match(s,r) ==> return true/false if string s contains a
substring matched by r. Set RLENGTH to length of substring
matched (or zero) and RSTART to start position of substring
matched (or zero).

15. * overlay(s1,s2,p) ==> return string formed by overlaying
string s2 on string s1 starting at position p. May extend
length of s1. If p > length(s1), s1 padded with blanks to
appropriate length.

16. * remove(s,c) ==> return string formed by removing all 'c'
characters from string s

17. * replace(s) ==> return string formed by replacing all
repeated expressions, {n1,n2}, and named expressions, {name},
in string s. Same operation performed for strings used as
regular expressions.

18. * sdate(fmt) ==> return current system date formatted
according to integer value of fmt.
mname == full month name
amname == abbreviated month name (3 characters)
wkday == full day name
aday == abbreviated day name (3 characters)
integer value of fmt:
0 - mm/dd/yy
1 - mm/dd/yyyy
2 - dd/mm/yy
3 - dd/mm/yyyy
4 - amname dd, yyyy
5 - mname dd, yyyy
6 - aday mm/dd/yyyy
7 - wkday mm/dd/yyyy
8 - aday, amname dd, yyyy
9 - wkday, mname dd, yyyy
10 - return amname
11 - return month name
12 - return aday
13 - return wkday
14 - return current system date in form yymmdd for sorting
15 - return number of days this century
16 - return number of days this year
A value greater than 16 gives a run-time error and QTAwk
halts execution.


19. split(s,a) ==> split string s into array a on field
separator FS. Return number of fields. The same rules applied
to FS for splitting the current input record apply to its use
in splitting s into a.

20. split(s,a,fs) ==> split string s into array a on field
separator fs. Return number of fields. The same rules applied
to FS for splitting the current input record apply to the use
of fs in splitting s into a.

21. * srange(c1,c2) ==> return string formed by
concatenating characters from c1 to c2 inclusive. If c2 < c1,
the null string is returned. Thus,

srange('a','k') == "abcdefghijk".

22. * srev(s) ==> return string formed by reversing string s.

srev(srange('a','k')) == "kjihgfedcba".

23. * stime(fmt) ==> return current system time formatted
according to the integer value of fmt.
0 - hh:mm:ss:00, 0 <= hh <= 24
1 - hh:mm:ss, 0 <= hh <= 24
2 - hh:mm, 0 <= hh <= 24
3 - hh:mm:ss am/pm
4 - hh:mm am/pm
A value greater than 4 gives a run-time error and QTAwk
halts execution.

24. * stran(s) ==> return string formed by translating
characters in string s matching characters in string value of
built-in variable, TRANS_FROM, to corresponding character in
string value of built-in variable, TRANS_TO. If no
corresponding character in TRANS_TO, then replace with blank.
TRANS_FROM and TRANS_TO initially set to:

TRANS_FROM = srange('A','Z');

TRANS_TO = srange('a','z');

25. * stran(s,st) ==> return string formed by translating
characters in string s matching characters in string value of
built-in variable, TRANS_FROM, to corresponding character in
st. If no corresponding character in st then replace with
blank.


26. * stran(s,st,sf) ==> return string formed by translating
characters in string s matching characters in sf to
corresponding character in st. If no corresponding character
in st then replace with blank.

27. * strim(s) ==> return string formed by trimming leading and
tailing white space from string s. Leading white space
matches the regular expression /^[\s\t]+/. Tailing white
space matches the regular expression /[\s\t]+$/.

28. * strim(s,le) ==> return string formed by trimming string
matching le and tailing white space from string s. Differing
actions are taken depending on the type of le:

────────────┬────────────────────────────────────────────────────────
le type     │ action
────────────┼────────────────────────────────────────────────────────
regular     │
expression  │ delete first string matching regular expression
            │
string      │ convert to regular expression and delete first
            │ matching string
            │
single      │
character   │ delete all leading characters equal to 'le'
            │
non-zero    │
numeric     │ delete leading white space matching /^[\s\t]+/
            │
zero        │
numeric     │ ignore
────────────┴────────────────────────────────────────────────────────

strim(s,TRUE) is equivalent to the form strim(s)

The following all delete the leading dashes from the given
string:

strim("------ remove leading -------",/^-+/);
strim("------ remove leading -------",/-+/);
strim("------ remove leading -------",'-');
==> "remove leading -------"

29. * strim(s,le,te) ==> return string formed by trimming
string matching le and string matching te from s. 'le' and
'te' may be a regular expression, a string, a single
character or a numeric. Differing actions are taken depending
on the type of le and te:

────────────┬────────────────────────────────────────────────────────
le/te type  │ action
────────────┼────────────────────────────────────────────────────────
regular     │
expression  │ delete first string matching regular expression
            │
string      │ convert to regular expression and delete first
            │ matching string
            │
single      │
character   │ delete all leading/tailing characters equal to
            │ 'le'/'te' respectively
            │
non-zero    │
numeric     │ delete leading/tailing white space matching /^[\s\t]+/
            │ or /[\s\t]+$/ respectively
            │
zero        │
numeric     │ ignore
────────────┴────────────────────────────────────────────────────────

strim(s,TRUE,TRUE) is equivalent to the form strim(s)


strim("======remove leading and tailing-------",'=','-')
or
strim("======remove leading and tailing-------",/^=+/,'-')
or
strim("======remove leading and tailing-------",'=',/-+$/)
or
strim("======remove leading and tailing-------",/^=+/,/-+$/)
==> "remove leading and tailing"

strim("======remove leading-------",'=',FALSE)
==> "remove leading-------"

strim("======remove tailing-------",FALSE,'-')
==> "======remove tailing"

30. * strlwr(s) ==> return string s translated to lower-case.

31. * strupr(s) ==> return string s translated to upper-case.



32. sub(r,s) ==> substitute s for leftmost string matched by
regular expression, r, in $0, return number of substitutions
made (0/1).

33. sub(r,s,t) ==> substitute s for leftmost string matched by
regular expression, r, in t, return number of substitutions
made (0/1).

34. substr(s,p) ==> return string formed from suffix of string
s starting at position p.

35. substr(s,p,n) ==> return string formed from n characters of
string s starting at position p. If n == 1, a character
constant is returned.
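
The behavior described for srange, srev and stran in the list above can be sketched as a model. This is Python, not QTAwk. It assumes that characters are ordered by character code, and that characters of s which do not appear in the 'from' string are left unchanged; the module-level names TRANS_FROM and TRANS_TO only imitate the built-in variables.

```python
def srange(c1, c2):
    """Characters from c1 to c2 inclusive; empty when c2 < c1."""
    return "".join(chr(n) for n in range(ord(c1), ord(c2) + 1))

def srev(s):
    """String s reversed."""
    return s[::-1]

TRANS_FROM = srange("A", "Z")   # initial values given in the text
TRANS_TO = srange("a", "z")

def stran(s, st=None, sf=None):
    """Translate characters of s found in sf to the matching character of st."""
    sf = TRANS_FROM if sf is None else sf
    st = TRANS_TO if st is None else st
    out = []
    for ch in s:
        pos = sf.find(ch)
        if pos < 0:
            out.append(ch)          # assumption: unmatched characters are kept
        elif pos < len(st):
            out.append(st[pos])     # corresponding character in st
        else:
            out.append(" ")         # no corresponding character: blank
    return "".join(out)

print(srange("a", "k"))             # abcdefghijk
print(srev(srange("a", "k")))       # kjihgfedcba
print(repr(srange("k", "a")))       # ''
print(stran("QTAwk Utility"))       # qtawk utility
print(repr(stran("abc", "x", "ab")))  # 'x c'
```

The last call shows the blank substitution: 'b' is the second character of the 'from' string, but the 'to' string has only one character.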

9.3 I/O Functions

QTAwk offers the following built-in I/O functions. Those marked
with an asterisk, '*', differ from those in AWK.

1. * getline();
or
* getline ; ==> reads next record from current input file
into $0. Sets fields, NF, NR and FNR. Returns the number of
characters read, 0 if end-of-file was encountered or -1 if an
error occurred.

2. * getline(v);
or
* getline v ; ==> reads next record from current input file
into variable v. Sets NR and FNR. Returns the number of
characters read, 0 if end-of-file was encountered or -1 if an
error occurred.

3. * fgetline(F) ==> reads next record from file F into $0.
Sets fields and NF. Returns the number of characters read, 0
if end-of-file was encountered or -1 if an error occurred.

4. * fgetline(F,v) ==> reads next record from file F into
variable v. Returns the number of characters read, 0 if
end-of-file was encountered or -1 if an error occurred.

5. * fprint(F);
or
* fprint F ; ==> prints $0 to file 'F' followed by ORS.
Returns number of characters printed.

6. * fprint(F,...);
or
* fprint F,... ; ==> prints expressions in the expr_list,
'...', to the file 'F', each separated by OFS. The last
expression is followed by ORS. Returns number of characters
printed.

7. fprintf(F,fmt,...) ==> print expr_list, ..., to file 'F'
according to format 'fmt'. Returns the number of characters
printed.

8. print();
or
print ; ==> prints $0 to standard output file followed by
ORS. Returns number of characters printed.

9. print(...);
or
print ... ; ==> prints expressions in the expr_list, ..., to
the standard output file, each separated by OFS. The last
expression is followed by ORS. Returns number of characters
printed.

10. printf(fmt,...) ==> print expr_list, ..., to standard
output file according to format, fmt. Returns number of
characters printed.

11. sprintf(fmt,...) ==> return string formed by formatting
expr_list, ... , according to format, fmt.

12. close(F) ==> close file F.

The use of the re-direction and pipeline operators, '<', '>',
'>>' and '|', has been discontinued as error prone. The use of
the syntax:

{ print $1, $2 > $3 }

has been replaced by the 'fprint' function:

{ fprint($3,$1,$2); }

or

{ fprint $3,$1,$2; }

9.4 Miscellaneous Functions

9.4.1 Expression Type

*e_type(expr) ==> returns the type of 'expr'. Function evaluates
the expression 'expr' and returns the type of the final result.
The return is an integer defining the type:

Return  Type
0       Un-initialized (returned when 'expr' is a variable which
        has not had a value assigned to it, or which has not
        been assigned a value since it was acted on by a
        'deletea' statement)
1       Regular Expression Value
2       String Value
3       Single Character Value
4       Integral Value
5       Floating Point Value

local lvar;
e_type(lvar) ==> 0
e_type(/string test/) ==> 1
e_type("string test") ==> 2
e_type('a') ==> 3
e_type(45) ==> 4
e_type(45.6) ==> 5
e_type(45.6 ∩ "") ==> 2
e_type("45.6" + 0.0) ==> 5
e_type("45" + 0) ==> 4



9.4.2 Execute String

QTAwk offers two forms of a function to dynamically execute QTAwk
expressions or statements. The first form will execute strings as
QTAwk expressions or statements. The second will execute array
elements as QTAwk expressions or statements.

*execute(s[,se[,rf]]) ==> execute string s as a QTAwk statement
or expression. If se == TRUE, then string s is executed as an
expression and the resultant value is returned by the 'execute'
function. If se == FALSE, then string s is executed as a
statement and the constant value of one, 1, is returned. The se
parameter is optional and defaults to FALSE. Any built-in or
user-defined function may be executed in the 'execute' function
except the 'execute' function itself. New variables may be
defined as well as new constant strings and regular expressions.

The optional rf parameter is the error recovery flag. If rf ==
FALSE (the default value), an error encountered in parsing or
executing the string s will cause QTAwk to issue the appropriate
error message and halt execution. If rf == TRUE, an error
encountered in parsing or executing the string s will cause QTAwk
to issue the appropriate error message, discontinue parsing or
execution of the string and continue executing the current QTAwk
utility. Attempting to execute the 'execute' function from within
the 'execute' function is a fatal error and will always cause
QTAwk to halt execution.

The following string can be executed as either an expression or
statement:

nvar = "power2 = 2 ^ 31;";

If executed as an expression:

print execute(nvar,1);

the output will be: 2147483648

If executed as a statement:

print execute(nvar,0);

or

print execute(nvar);

the output will be: 1

Multiple statements/expressions may be executed with a compound
statement of the form:

pvar = "{ pow8 = 2 ^ 8; pow16 = 2 ^ 16; pow31 = 2 ^ 31; }";

Then

execute(pvar,0);

or



execute(pvar);

will set the three variables:

1. pow8
2. pow16
3. pow31

even if the variables were not previously defined. If the
variables were not previously defined, they will be added to the
list of the utility variables.

Note that attempting to execute pvar as an expression:

execute(pvar,1);

will result in the error message "Undefined Symbol". All three
expressions may be executed, as an expression, by the use of the
sequence operator in the following manner:

pvar = "pow8 = 2 ^ 8 , pow16 = 2 ^ 16 , pow31 = 2 ^ 31;";

*execute(a[,se[,rf]]) ==> execute the elements of array a as a
QTAwk statement or expression. The se and rf parameters have the
same function and default values as above. For example, the
compound statement contained in pvar above may be split among
the elements of an array:

avar[1] = "{";
avar[2] = "pow8 = 2 ^ 8;";
avar[3] = "pow16 = 2 ^ 16;";
avar[4] = "pow31 = 2 ^ 31;";
avar[5] = "}";

and executed as:

execute(avar);

or

execute(avar,0);



9.4.3 Array Function

QTAwk offers the following built-in array function.

rotate(a) - the values of the array are rotated.
The value of the first element goes to the last element, the
second to the first, third to the second, etc. If the array has
the following elements:

1. a[1] = 1
2. a[2] = 2
3. a[3] = 3
4. a[4] = 4

then rotate(a) will have the result:

1. a[1] = 2
2. a[2] = 3
3. a[3] = 4
4. a[4] = 1

The argument need not be a one-dimensional array; a row of a
multidimensional array may be given. If:

1. a[1][1] = 1
2. a[1][2] = 2
3. a[1][3] = 3
4. a[1][4] = 4

Then rotate(a[1]) will produce the result:

1. a[1][1] = 2
2. a[1][2] = 3
3. a[1][3] = 4
4. a[1][4] = 1
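
The rotation shown above can be sketched as a model. This is Python, not QTAwk; it assumes the array can be pictured as a dictionary whose insertion order stands in for the order of the subscripts, with nested dictionaries for additional dimensions.

```python
def rotate(a):
    """Rotate the values of dict 'a' in place; subscripts stay where they are."""
    keys = list(a)                       # dict order stands in for subscript order
    values = [a[k] for k in keys]
    rotated = values[1:] + values[:1]    # first value moves to the last element
    for key, value in zip(keys, rotated):
        a[key] = value

a = {1: 1, 2: 2, 3: 3, 4: 4}
rotate(a)
print(a)        # {1: 2, 2: 3, 3: 4, 4: 1}

b = {1: {1: 1, 2: 2, 3: 3, 4: 4}}
rotate(b[1])    # rotate one row of a two-dimensional array
print(b)        # {1: {1: 2, 2: 3, 3: 4, 4: 1}}
```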



9.4.4 System Control Function

1. system(e) ==> executes the system command specified by the
string value of the expression e.



9.4.5 Variable Access

There are two built-in functions available for access to
variables. The first, "pd_sym", accesses pre-defined variables
and the second, "ud_sym", accesses user-defined variables. Each has
two forms:

pd_sym(name_str)

or

pd_sym(name_num,name_str)

1. To access pre-defined variables, the function "pd_sym" may
be used. This function has been supplied to provide a
pre-defined variable access function similar to the function
"ud_sym" for accessing user-defined variables. The forms and
returns are similar.

2. To access user-defined variables where the variable name may
not be known in advance, the function "ud_sym" has been
supplied. The first form:

ud_sym(name_expr)

is useful in situations where the variable name is not known
until the statement is to be executed. In these cases,
name_expr may be any expression or variable with the string
value of the unknown variable. In this form, the string value
of name_expr is used to access the variable. ud_sym returns
the variable in question, if one exists with the string value
passed.

The functional return value may be used in any expression
just as the variable itself would. This includes operating on
the return value with the array index operators, "[]".

Note: This form may be used to access both local and global
variables. If both a local and global variable have been
defined with the desired name and the local variable is
within scope, then the local variable is returned.

The second form:

ud_sym(name_expr,name_str)

is useful in those situations where it may be impractical to
use string values to access the variables, e.g., in a "for",
"while" or "do" loop, but a numeric value can be used to
access the variables.



The user variables are accessed in the order defined in the
user utility starting with one (1). If the integer value of
name_expr exceeds the number of user-defined variables, then
a non-variable is returned. The second parameter must be a
variable. Upon return, this variable will have a string value
equal to the name of the variable found or the null string if
name_expr exceeds the number of user-defined variables. The
return value of this variable may be tested to assure that a
variable was found.

The functional return value may be used in any expression
just as the variable itself would. This includes operating on
the return value with the array index operators, "[]".


Note: This form may be used to access global variables ONLY.
Local variables cannot be accessed with this form of the
function.

The following short function will return the number of
user-defined global variables:

# function to return the current number of
# GLOBAL variables defined in utility
function var_number(display) {
    local cnt, j, jj;

    for ( cnt = 1, j = ud_sym(cnt,jj) ; jj ; j = ud_sym(++cnt,jj) )
        if ( display ) print cnt ∩ " : " ∩ jj ∩ " ==>" ∩ j ∩ "<==";
    return cnt - 1;
}

The following function may be called with the name of the
variable desired. The value of the variable will be returned.
Note that the appropriate variables have been defined in the
"BEGIN" action.

BEGIN {
#define the conversion variables
_kilometers_to_statute_miles_ = 1/1.609344;
_statute_miles_to_kilometers_ = 1.609344;
_inches_to_centimeters_ = 2.54;
_centimeters_to_inches_ = 1/2.54;
_radians_to_degrees_ = 180/pi;
_degrees_to_radians_ = pi/180;
}
# function to return the appropriate conversion
function conversion_factor(conversion_name) {
    local name = '_' ∩ conversion_name ∩ '_';
    return ud_sym(name);
}
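
The two access forms and the two functions above can be sketched as a model. This is Python, not QTAwk. It assumes the utility's global variables can be pictured as a dictionary kept in definition order; the names variables, ud_sym_by_name and ud_sym_by_number are hypothetical, and the second form returns the variable name as a second result because Python has no equivalent of the name_str output parameter.

```python
import math

# Hypothetical stand-in for the utility's global variables, in definition order.
variables = {
    "_inches_to_centimeters_": 2.54,
    "_centimeters_to_inches_": 1 / 2.54,
    "_radians_to_degrees_": 180 / math.pi,
    "_degrees_to_radians_": math.pi / 180,
}

def ud_sym_by_name(name_expr):
    """Model of ud_sym(name_expr): look a variable up by its name string."""
    return variables.get(name_expr)

def ud_sym_by_number(name_expr):
    """Model of ud_sym(name_expr,name_str): returns (value, name_str).

    name_str is the null string when the number exceeds the variable count.
    """
    names = list(variables)
    if 1 <= name_expr <= len(names):
        name = names[name_expr - 1]
        return variables[name], name
    return None, ""

def var_number():
    """Count the variables by stepping until the returned name is null."""
    cnt = 1
    while ud_sym_by_number(cnt)[1]:
        cnt += 1
    return cnt - 1

def conversion_factor(conversion_name):
    """Build the variable name from the conversion name and look it up."""
    return ud_sym_by_name("_" + conversion_name + "_")

print(var_number())                                # 4
print(conversion_factor("inches_to_centimeters"))  # 2.54
print(conversion_factor("no_such_conversion"))     # None
```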

10.0 FORMAT SPECIFICATION

QTAwk follows the Draft ANSI C language standard for the format
string in the 'printf' and 'fprintf' functions except for the 'P'
and 'n' types, which are not supported and will give
unpredictable results.

A format specification has the form:

%[flags][width][.precision][h | l | L]type

which is matched by the following regular expression:

/%{flags}?{width}?{precision}?[hlL]?{type}/

with:

flags = /[-+\s#0]/;
width = /([0-9]+|\*)/;
precision = /(\.([0-9]+|\*))/;
type = /[diouxXfeEgGcs]/;
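
The same pattern can be tried outside QTAwk. The following is a sketch in Python, which has no named expressions, so the pieces are joined by string concatenation. It assumes that QTAwk's \s stands for a blank, so a literal space is used in the flags class; the names conv_type, spec and find_specs are hypothetical.

```python
import re

# The named expressions from the text. QTAwk's \s is assumed here to mean a
# blank, so a literal space is used in the Python character class.
flags = r"[-+ #0]"
width = r"([0-9]+|\*)"
precision = r"(\.([0-9]+|\*))"
conv_type = r"[diouxXfeEgGcs]"

spec = re.compile(
    "%" + flags + "?" + width + "?" + precision + "?" + "[hlL]?" + conv_type
)

def find_specs(fmt):
    """Return every format specification found in the format string."""
    return [m.group(0) for m in spec.finditer(fmt)]

print(find_specs("Total: %5d items at %-8.2f each (%s)"))
print(spec.fullmatch("%*.*e") is not None)   # '*' width and precision
print(spec.fullmatch("%ld") is not None)     # size prefix
print(spec.fullmatch("%q") is not None)      # 'q' is not a type character
```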

Each field of the format specification is a single character or
a number signifying a particular format option. Optional fields
are shown above enclosed in brackets, '[..]'. The type character,
which appears after the last optional format field, determines
whether the associated argument is interpreted as a character, a
string, or a number.
The simplest format specification contains only the percent sign
and a type character (for example, %s). The optional fields
control other aspects of the formatting, as follows:

1. flags ==> Control justification of output and printing of
signs, blanks, decimal points, octal and hexadecimal
prefixes.

2. width ==> Control minimum number of characters output.

3. precision ==> Controls maximum number of characters printed
for all or part of the output field, or minimum number of
digits printed for integer values.

4. h, l, L ==> Prefixes that determine size of argument
expected (this field is retained only for compatibility to C
format strings).

a) h ==> Used as a prefix with the integer types d, i, o,
x, and X to specify that the argument is short int, or
with u to specify a short unsigned int

b) l ==> Used as a prefix with d, i, o, x, and X types to
specify that the argument is long int, or with u to
specify a long unsigned int; also used as a prefix with
e, E, f, g, and G types to specify a double, rather than
a float

c) L ==> Used as a prefix with e, E, f, g, and G types to
specify a long double

If a percent sign, '%', is followed by a character that has no
meaning as a format field, the character is simply copied to the
output. For example, to print a percent-sign character, use "%%".

10.1 Output Types

Type characters:

1. d ==> integer, Signed decimal integer
2. i ==> integer, Signed decimal integer
3. u ==> integer, Unsigned decimal integer
4. o ==> integer, Unsigned octal integer
5. x ==> integer, Unsigned hexadecimal integer, using "abcdef"
6. X ==> integer, Unsigned hexadecimal integer, using "ABCDEF"
7. f ==> float, Signed value having the form

[-]dddd.dddd

where dddd is one or more decimal digits. The number of
digits before the decimal point depends on the magnitude of
the number, and the number of digits after the decimal point
depends on the requested precision.

8. e ==> float, Signed value having the form

[-]d.dddde[sign]ddd,

where d is a single decimal digit, dddd is one or more
decimal digits, ddd is exactly three decimal digits, and sign
is + or -.

9. E ==> float, Identical to the e format, except that E
introduces the exponent instead of e.
10. g ==> float, Signed value printed in f or e format,
whichever is more compact for the given value and precision.
The e format is used only when the exponent of the value is
less than -4 or greater than the precision argument. Trailing
zeros are truncated and the decimal point appears only if one
or more digits follow it.

11. G ==> float, Identical to the g format, except that E
introduces the exponent (where appropriate) instead of e.

12. c ==> character, Single character

13. s ==> string, Characters printed up to the first null
character ('\0') or until the precision value is reached.
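The type characters follow the C conventions, so they can be
tried in any language that accepts C-style format strings. The
sketch below uses Python's '%' operator as a stand-in. It is not
QTAwk code, the function name is hypothetical, and details such
as the number of exponent digits may differ from QTAwk's output.

```python
# Python stand-in for C-style format strings (not QTAwk itself).
def type_samples():
    """Return one formatted value per type character."""
    return {
        "d": "%d" % -255,    # signed decimal integer
        "o": "%o" % 255,     # octal integer
        "x": "%x" % 255,     # hexadecimal, using "abcdef"
        "X": "%X" % 255,     # hexadecimal, using "ABCDEF"
        "f": "%f" % 3.5,     # [-]dddd.dddd, default precision of 6
        "c": "%c" % "A",     # single character
        "s": "%s" % "awk",   # string
        "%": "100%%" % (),   # "%%" prints one percent sign
    }

for key, text in type_samples().items():
    print(key, "==>", text)
```

Each entry pairs a type character with the text it produces, so
the table above can be checked one line at a time.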

10.2 Output Flags

Flag Characters

1. '-' ==> Left justify the result within the given field
width. Default: Right justify.

2. '+' ==> Prefix the output value with a sign (+ or -) if the
output value is of a signed type. Default: Sign appears only
for negative signed values (-).

3. blank (' ') ==> Prefix the output value with a blank if the
output value is signed and positive. The blank is ignored if
both the blank and + flags appear. Default: No blank.

4. '#' ==> When used with the o, x, or X format, the # flag
prefixes any nonzero output value with 0, 0x, or 0X,
respectively. Default: No prefix.

5. '#' ==> When used with the e, E, or f format, the # flag
forces the output value to contain a decimal point in all
cases. Default: Decimal point appears only if digits follow
it.

6. '#' ==> When used with the g or G format, the # flag forces
the output value to contain a decimal point in all cases and
prevents the truncation of trailing zeros. Default: Decimal
point appears only if digits follow it. Trailing zeros are
truncated.

7. '#' ==> Ignored when used with c, d, i, u or s

8. '0' ==> For d, i, o, u, x, X, e, E, f, g, and G conversions,
leading zeros (following any indication of sign or base) are
used to pad to the field width; no space padding is
performed. If the 0 and - flags both appear, the 0 flag will
be ignored. For d, i, o, u, x, and X conversions, if a
precision is specified, the 0 flag will be ignored. For other
conversions the behavior is undefined. Default: Use blank
padding

If the argument corresponding to a floating-point specifier is
infinite or indefinite, the following output is produced:

+ infinity ==> 1.#INFrandom-digits
- infinity ==> -1.#INFrandom-digits
Indefinite ==> digit.#INDrandom-digits
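The effect of the flags is easiest to see side by side. The
following sketch again uses Python's '%' operator as a stand-in
for C-style formatting; the function name is hypothetical. The
'#' flag with the o format is left out on purpose, because
Python's octal prefix differs from the C prefix described above.

```python
# Python stand-in for C-style format strings (not QTAwk itself).
def flag_samples():
    """Return formatted values showing the effect of each flag."""
    return {
        "default": "[%5d]" % 42,     # right justified, blank padded
        "minus": "[%-5d]" % 42,      # '-' left justifies
        "plus": "[%+d]" % 42,        # '+' forces a sign
        "blank": "[% d]" % 42,       # ' ' reserves a sign position
        "hash_x": "[%#x]" % 255,     # '#' adds the 0x prefix
        "hash_X": "[%#X]" % 255,     # '#' adds the 0X prefix
        "hash_f": "[%#.0f]" % 3.0,   # '#' forces a decimal point
        "plain_f": "[%.0f]" % 3.0,   # no '#': no decimal point
        "zero": "[%05d]" % -42,      # '0' pads after the sign
    }

for name, text in flag_samples().items():
    print(name, text)
```

The brackets are only there to make the padding visible.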

10.3 Output Width

The width argument is a non-negative decimal integer controlling
the minimum number of characters printed. If the number of
characters in the output value is less than the specified width,
blanks are added to the left or the right of the value
(depending on whether the - flag is specified) until the minimum
width is reached. If width is prefixed with a 0 flag, zeros are
added until the minimum width is reached (not useful for
left-justified numbers).

The width specification never causes a value to be truncated; if
the number of characters in the output value is greater than the
specified width, or width is not given, all characters of the
value are printed (subject to the precision specification).

The width specification may be an asterisk (*), in which case an
integer argument from the argument list supplies the value. The
width argument must precede the value being formatted in the
argument list. A nonexistent or small field width does not cause
a truncation of a field; if the result of a conversion is wider
than the field width, the field expands to contain the conversion
result.
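A short sketch of the width rules, using Python's '%' operator as
a stand-in for C-style formatting (the function name is
hypothetical, and this is not QTAwk code). Note that the width
supplied through '*' comes before the value in the argument list.

```python
# Python stand-in for C-style format strings (not QTAwk itself).
def width_samples():
    """Return formatted values showing the width rules."""
    return {
        "pad_left": "[%6s]" % "abc",       # padded to 6, right justified
        "pad_right": "[%-6s]" % "abc",     # padded to 6, left justified
        "no_truncate": "[%2s]" % "abcdef", # width never truncates
        "zero_pad": "[%06d]" % 42,         # zeros instead of blanks
        "star": "[%*d]" % (6, 42),         # width taken from the arguments
    }

for name, text in width_samples().items():
    print(name, text)
```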

10.4 Output Precision

The precision specification is a non-negative decimal integer
preceded by a period, '.', which specifies the number of
characters to be printed, the number of decimal places, or the
number of significant digits. Unlike the width specification, the
precision can cause truncation of the output value, or rounding
in the case of a floating-point value.

The precision specification may be an asterisk, '*', in which
case an integer argument from the argument list supplies the
value. The precision argument must precede the value being
formatted in the argument list.

The interpretation of the precision value, and the default when
precision is omitted, depend on the type, as shown below:

1. d,i,u,o,x,X ==> The precision specifies the minimum number
of digits to be printed. If the number of digits in the
argument is less than precision, the output value is padded
on the left with zeros. The value is not truncated when the
number of digits exceeds precision. Default: If precision is
0 or omitted entirely, or if the period (.) appears without a
number following it, the precision is set to 1.

2. e, E ==> The precision specifies the number of digits to be
printed after the decimal point. The last printed digit is
rounded. Default: Default precision is 6; if precision is 0
or the period (.) appears without a number following it, no
decimal point is printed.

3. f ==> The precision value specifies the number of digits
after the decimal point. If a decimal point appears, at least
one digit appears before it. The value is rounded to the
appropriate number of digits. Default: Default precision is
6; if precision is 0, or if the period (.) appears without a
number following it, no decimal point appears.

4. g, G ==> The precision specifies the maximum number of
significant digits printed. Default: Six significant digits
are printed, with any trailing zeros truncated.

5. c ==> No effect. Default: Character printed

6. s ==> The precision specifies the maximum number of
characters to be printed. Characters in excess of precision
are not printed. Default: All characters of the string are
printed.
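
The type-dependent meaning of the precision can be sketched as
follows, again with Python's '%' operator standing in for C-style
formatting (hypothetical function name, not QTAwk code). Python
prints two exponent digits where this manual shows three, so the
g sample below may differ from QTAwk's output in that detail.

```python
# Python stand-in for C-style format strings (not QTAwk itself).
def precision_samples():
    """Return formatted values showing what precision means per type."""
    return {
        "int_pad": "%.3d" % 7,          # minimum number of digits
        "int_keep": "%.2d" % 12345,     # integers are never truncated
        "fixed": "%.2f" % 3.14159,      # digits after the decimal point
        "fixed_zero": "%.0f" % 2.7,     # precision 0: no decimal point
        "general": "%.3g" % 1234.56,    # three significant digits
        "string": "%.3s" % "abcdef",    # maximum number of characters
        "star": "%.*f" % (2, 3.14159),  # precision taken from the arguments
    }

for name, text in precision_samples().items():
    print(name, text)
```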






11.0 USER-DEFINED FUNCTIONS

QTAwk supports user-defined functions and has enhanced them over
Awk in several important respects.

11.1 Local Variables

In QTAwk it is no longer necessary to declare local variables as
excess arguments in the function definition. QTAwk has included
the 'local' keyword. This keyword may be used in any compound
statement, but was invented specifically for user-defined
functions. Consider the simple function to accumulate words from
the current input record in the formatting utility:

# accumulate words for line
function addword(w) {
    local lw = length(w); # length of added word

    # check new line length
    if ( cnt + size + lw >= width ) printline(yes);
    line[++cnt] = w; # add word to line array
    size += lw;
}

The definition of lw makes it obvious that lw is local to the
function and disappears when the function exits. The function
arguments, 'w' in this case, are now easy to pick out, as is the
initialization of lw to the length of the argument passed. The
'local' keyword thus cleanly separates local variables from the
function arguments.

11.2 Argument Checking

By using the '_arg_chk' built-in variable, it is also possible
to have QTAwk now do some argument checking for the user. If
_arg_chk is TRUE, then QTAwk will, at run-time, check the number
of arguments passed against the number of arguments defined. If
the number passed differs from the number defined, then a
run-time error is issued and QTAwk halts. When '_arg_chk' is
FALSE, QTAwk will check at run-time only that the number of
arguments passed is less than or equal to the number defined.
This follows the Awk practice and allows for the use of arguments
defined, but not passed, as local variables. The default value of
'_arg_chk' is FALSE. It is recommended that '_arg_chk' be set to
TRUE and the 'local' keyword used to define variables meant to be
local to a function.
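
The two checking modes can be summarized in a few lines. The
sketch below is a Python model of the rule as described above,
not QTAwk code; the function name and its parameters are
hypothetical, and functions with variable-length argument lists
are not covered.

```python
# Model of the run-time argument count check (assumed, simplified).
def call_allowed(defined, passed, arg_chk):
    """Return True if a call with 'passed' arguments is accepted.

    defined -- number of arguments in the function definition
    passed  -- number of arguments in the call
    arg_chk -- value of the _arg_chk built-in variable
    """
    if arg_chk:
        return passed == defined   # TRUE: counts must match exactly
    return passed <= defined       # FALSE: extra definitions act as locals

print(call_allowed(3, 1, False))   # Awk practice, accepted
print(call_allowed(3, 1, True))    # rejected when _arg_chk is TRUE
```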

11.3 Variable Length Argument Lists

QTAwk allows user-defined functions to be defined with a
variable number of arguments. The actual number of arguments will
be determined from the call at run-time. QTAwk follows the C
syntax for defining a function with a variable number of
arguments:

# function to determine maximum value
function max(...) {
    local max = vargv[1];
    local i;

    for ( i = 2 ; i <= vargc ; i++ )
        if ( max < vargv[i] ) max = vargv[i];
    return max;
}

The ellipsis, '...', is used as the last argument in a
user-defined argument list to indicate that a variable number of
arguments follow. In the max function shown, no fixed arguments
are indicated. Within the function, the variable arguments are
accessed via the built-in singly-dimensioned array, 'vargv'. The
built-in variable 'vargc' is set equal to the number of elements
of the array and, hence, the number of variable arguments passed
to the function. Since the variable arguments are passed in a
singly-dimensioned array, the 'in' form of the 'for' statement
may be used to access each in turn:

# function to determine maximum value
function max(...) {
    local max = vargv[1];
    local i;

    for ( i in vargv )
        if ( max < vargv[i] ) max = vargv[i];
    return max;
}

A user-defined function may have fixed arguments and a variable
number of arguments following:

# function with both fixed and variable number of arguments
function sample(arg1,arg2,...) {
    .
    .
    .
}

If a user-defined function is to have a variable number of
arguments, then the 'local' keyword must be used to define local
variables. The ellipsis denoting the variable arguments must be
last in the function definition argument list.
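
For readers who know the idea from other languages, the max
function above has a close Python analogue. The sketch below is
only an analogy, not QTAwk code: the names qt_max and sample are
hypothetical, and Python indexes from 0 where 'vargv' is indexed
from 1.

```python
# Python analogue of a QTAwk function defined with '...'.
def qt_max(*vargv):
    """Return the largest of a variable number of arguments."""
    vargc = len(vargv)            # QTAwk sets vargc automatically
    largest = vargv[0]            # corresponds to vargv[1] in QTAwk
    for i in range(1, vargc):
        if largest < vargv[i]:
            largest = vargv[i]
    return largest

# Fixed arguments followed by a variable number of arguments.
def sample(arg1, arg2, *vargv):
    """Return the fixed arguments and the count of the remaining ones."""
    return arg1, arg2, len(vargv)

print(qt_max(3, 9, 4))
print(sample("a", "b", 1, 2, 3))
```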

11.4 Null Argument List

A user-defined function may be defined with no arguments.
Consider the function to accumulate words from input records for
the text formatter:

# function to add current line to parsed text
function add_line() {
    for ( i = 1 ; i <= NF ; i++ ) if ( length($i) ) addword($i);
}

In the case of a user-defined function with no arguments to be
passed, the function may be invoked with no parenthesized
parameter list. Consider the invocation of the add_line function
in the text formatter. The action executed for input records
which do not start with a format control word is:

{
    if ( format ) add_line;
    else if ( table_f ) format_table($0);
    else output_line($0,FALSE);
}

In QTAwk, the add_line function may be invoked as "add_line" as
above or as "add_line()", with a null length parameter list.

QTAwk has also relaxed the Awk rule that the left parenthesis of
the parameter list must immediately follow a user-defined
function invocation. QTAwk allows blanks between the name and the
left parenthesis. The blanks are ignored.

11.5 Arrays and User-Defined Functions

Just as arrays are integrated into QTAwk expressions, arrays are
also integrated into the passing of arguments to, and the return
value from, user-defined functions. User-defined functions may
return arrays as well as scalars. This will be illustrated in a
sample utility later.

QTAwk passes scalar arguments to user-defined functions by
value, i.e., if a scalar variable is specified as an argument to
a function, a copy of the variable is passed to the function and
not the variable itself. This is called pass by value. Thus, if
the function alters the value of the argument, the variable is
not altered, only the copy. When the function terminates, the
copy is discarded, and the variable still retains its original
value.

In contrast, QTAwk passes array variables by "reference". This
means that the local variable represented by the function
argument is the referenced variable and not a copy. Any changes
to the local variable are actually made to the referenced
variable.

In QTAwk, function arguments may also be constant arrays and not
variable arrays, i.e., the argument may be the result of an
arithmetic operation on an array. For example, if A is an array,
then the result of the expression

"A + 10"

is an array and would be passed as a constant array as a
function argument. Such arrays are discarded at function
termination.

QTAwk passes by reference under three conditions:

1. The argument is a global or local variable and an array,

2. The argument is a global or local variable and used as an
array, i.e., indexed or referenced by an 'in' statement, in
the called function. This is true whether the referenced
variable is a scalar or array when the function is called. If
the referenced variable was a scalar when the function is
called, then at function termination, if the statement(s) in
which the argument was indexed WERE EXECUTED, the referenced
variable will be an array with the index values referenced.
This behaviour is identical to creating array elements in
global variables by referencing the elements.

3. The argument is a global or local scalar variable and at
function termination the argument is an array. In this case,
the argument may not have been directly referenced as an
array, but may be the result of an operation involving an
array. Alternatively the argument may have been passed to
another function which referenced it as an array or set it to
the result of operations on arrays.

The following QTAwk utility with a user-defined function will
illustrate the use of arrays and scalars as function arguments
and the return of arrays by user-defined functions.

BEGIN {
    # create arrays 'a' and 'b'
    for ( i = 1 ; i < 6 ; i++ ) a[i] = b[i] = i;
    # create scalars 'c' and 'f'
    c = f = 10;
    # pass scalar variables/values and return scalar value
    print "scalar : "set_var(c,c,c)" and c == "c;
    # pass two arrays, 'a' & 'b',
    # and one scalar constant, 'c+0'
    # function will return an array "== a + b + (c+0)"
    d = set_var(a,b,c+0);
    # print returned array 'd' (== a + b + (c+0))
    for ( i in d ) print "d["i"] = "d[i];
    # print scalar 'c' to show unchanged
    print c;
    # pass two arrays, 'a' & 'b',
    # and one scalar variable, 'c'
    # function will return an array "== a + b + c"
    e = set_var(a,b,c);
    # print returned array
    for ( i in e ) print "e["i"] = "e[i];
    # print former scalar, 'c',
    # converted to array by operation c = b + 2;
    for ( i in c ) print "c["i"] = "c[i];
    # pass two arrays, 'a' & 'b', and constant array, 'b+0'
    h = set_var(a,b,b+0);
    # print returned array
    for ( i in h ) print "h["i"] = "h[i];
    # print array 'b' to assure that unchanged
    for ( i in b ) print "b["i"] = "b[i];
    # attempt illegal operation in function: w = f + b
    # adding array, 'b', to scalar, 'f'.
    # error message will be issued and execution halted
    g = set_var(f,b,f);
}

function set_var(x,y,z) {
    # create local variable
    local w = x + y + z;
    # alter third argument
    # if first & second arguments arrays,
    # this will convert third to an array
    # (if not already passed as an array).
    z = y + 2;
    return w;
}


This QTAwk utility illustrates several ideas in using arrays and
user-defined functions in QTAwk. The line:

print "scalar : "set_var(c,c,c)" and c == "c;

calls the function 'set_var' with three scalar variables, all
'c'. Three copies of 'c' are actually passed. The local variable,
'w', is computed using scalar quantities and is a scalar
quantity. Since argument 'y' is a scalar quantity, the result of
the expression:

z = y + 2;

is a scalar and the third argument, 'c', is unchanged. A
functional value of 30 (== c + c + c) is returned.

The line:

d = set_var(a,b,c+0);

passes arrays as the first and second arguments. The third
argument is a constant scalar value, and thus cannot be changed
by the function called. The return value of the function:

w = x + y + z; (== a + b + 10;)

is an array. The line:

for ( i in d ) print "d["i"] = "d[i];

prints the values of the array:

d[1] = 12
d[2] = 14
d[3] = 16
d[4] = 18
d[5] = 20

The line:

e = set_var(a,b,c);

passes arrays as the first and second arguments. The third
argument is a variable scalar value, and thus can be changed by
the function called if the third argument at function termination
is an array. The return value of the function:

w = x + y + z; (== a + b + c;)

is an array as above. Note that at function termination, the
third argument is now an array since it was set to the result of
an operation on an array:

z = y + 2;

which is now equivalent to:

z = b + 2;

Thus, at function termination the scalar variable 'c' has been
converted to an array. The line:

for ( i in c ) print "c["i"] = "c[i];

will print the values of the array elements:

c[1] = 3 ( == b[1] + 2)
c[2] = 4 ( == b[2] + 2)
c[3] = 5 ( == b[3] + 2)
c[4] = 6 ( == b[4] + 2)
c[5] = 7 ( == b[5] + 2)

The line:

h = set_var(a,b,b+0);

passes arrays as the first and second arguments. The third
argument is a constant array value, and thus cannot be changed by
the function called. The return value of the function is an array
as above. Note that at function termination, the third argument
is again an array as above. However, the third argument has been
passed as a constant array and thus no variable is changed as 'c'
was above. The third argument is discarded at function
termination. The line:

for ( i in b ) print "b["i"] = "b[i];

prints the array 'b' to assure that it was not changed.

The line:

g = set_var(f,b,f);

will result in an illegal operation in the function:

local w = x + y + z; (== f + b + f;)

This operation attempts to add an array to a scalar:

f + b

This operation will result in an error message and halt
execution.
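
The arithmetic in the walk-through above can be checked with a
small model. The Python sketch below is an assumed model, not
QTAwk code: a QTAwk array is represented by a dict, a scalar by a
number, and element-wise addition is assumed to pair elements
with equal index values. Python cannot turn the caller's scalar
into an array, so this set_var returns the final value of its
third argument alongside 'w'; the caller decides whether to keep
it (variable argument) or drop it (copy or constant).

```python
# Assumed model: QTAwk array == dict, QTAwk scalar == number.
def add(left, right):
    """Model of '+' for the scalar/array combinations discussed above."""
    if isinstance(left, dict):
        if isinstance(right, dict):
            return {i: left[i] + right[i] for i in left}   # array + array
        return {i: left[i] + right for i in left}          # array + scalar
    if isinstance(right, dict):
        raise TypeError("attempt to add an array to a scalar")
    return left + right                                    # scalar + scalar

def set_var(x, y, z):
    """Return (w, z) where w = x + y + z and z has been set to y + 2."""
    w = add(add(x, y), z)
    z = add(y, 2)
    return w, z

a = {i: i for i in range(1, 6)}
b = dict(a)
c = 10

value, _ = set_var(c, c, c)     # scalars: copies passed, 'c' unchanged
d, _ = set_var(a, b, c + 0)     # constant third argument: result dropped
e, c = set_var(a, b, c)         # variable third argument: 'c' becomes an array

print(value, d, c)
```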

The above sample QTAwk utility illustrates the power of
user-defined functions in automatically handling scalars and
arrays as both arguments and return values and adjusting
accordingly. The same function may be used interchangeably for
both arrays and scalars with natural and predictable results.

12.0 TRACE STATEMENTS

QTAwk has added a facility for debugging utilities. This
facility is activated through the built-in variable 'TRACE'.
QTAwk can trace the flow control statements, 'if', 'while', 'do',
'for' (both forms), and 'switch'. In addition, built-in functions
and user-defined functions are traced.

By default, TRACE is set to FALSE and no tracing is done. The
variable may be set to any value, numeric, string or regular
expression and the value will determine the statements traced. If
TRACE has a nonzero numeric value then QTAwk will trace all
statements of the type listed.

12.1 Selective Statement Tracing

If TRACE has a string value, then the string is compared against
the keywords:

1. if
2. while
3. do
4. for
5. switch
6. function_b (built-in functions)
7. function_u (user-defined functions)

If an exact match (case is important) is found, then the
statement is traced. If TRACE is set to a regular expression,
then the keywords are matched against the regular expression. If
a match is found, then the statement is traced.
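
The selection rules for numeric, string and regular expression
values of TRACE can be modeled in a few lines. The Python sketch
below is an assumed model of the rules as stated, not QTAwk code;
the function name is hypothetical, and Python's 're' module
stands in for QTAwk regular expressions.

```python
import re

KEYWORDS = ["if", "while", "do", "for", "switch",
            "function_b", "function_u"]

def traced_keywords(trace):
    """Return the statement keywords selected by a TRACE value."""
    if hasattr(trace, "search"):
        # regular expression: keywords are matched against it
        return [k for k in KEYWORDS if trace.search(k)]
    if isinstance(trace, str):
        # string: exact, case-sensitive comparison
        return [k for k in KEYWORDS if k == trace]
    # numeric: nonzero traces everything, zero (FALSE) traces nothing
    return list(KEYWORDS) if trace else []

print(traced_keywords("if"))
print(traced_keywords(re.compile(r"^[iwd]")))
print(traced_keywords(re.compile(r"_u$")))
```

Note that a regular expression such as /^f/ would select 'for',
'function_b' and 'function_u' together under this model.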

12.2 Trace Output

In tracing a statement, QTAwk issues a message to the standard
output file. The message issued will have the form:


Stmt Trace: stmt_str value_str
Action File line: xxxx
Scanning File: FILENAME
Line: xxxxx
Record: xxxxxx

where stmt_str is the appropriate keyword listed above for the
statement traced and value_str is a value dependent upon the
statement traced as listed below:

keyword        value string
if         ==> 0/1, conditional expression FALSE/TRUE
while      ==> 0/1, conditional expression FALSE/TRUE
do         ==> 0/1, conditional expression FALSE/TRUE
for        ==> 0/1, conditional expression FALSE/TRUE
for        ==> subscript value ('in' form)
switch     ==> switch expression value
function_b ==> function name
function_u ==> function name

When a statement that can be traced is encountered, the value
of the statement is determined before the trace message is
issued, e.g., for an 'if' statement, the conditional is evaluated
first.

The following TRACE values will trace the statements indicated:

1. TRACE = "if";
This value will trace all 'if' statements, indicating the
TRUE/FALSE value of the conditional.

2. TRACE = /^[iwd]/;
This value will trace all 'if', 'while' and 'do' statements,
indicating the TRUE/FALSE value of the conditional.

3. TRACE = /_u$/;
This value will trace all user-defined functions, indicating
the function name in the trace message.

13.0 BUILT-IN VARIABLES

QTAwk offers the following built-in variables. The variables may
be set by the user. Those marked with an asterisk, '*', are new
to QTAwk:

1. * _arg_chk ==> TRUE/FALSE. Default value = FALSE. If FALSE,
the number of arguments passed to a user-defined function is
checked only to ensure that it does not exceed the number
defined. Arguments defined but not passed are initialized
and passed for use as local variables, as in Awk. If TRUE, the
number of arguments passed to a user-defined function is
checked against the number defined for the function, unless
the function was defined with a variable number of arguments.
If the number passed is not exactly equal to the number
defined, an error message is issued and execution is halted.
In this case, any local variables must be defined with the
'local' keyword.

2. ARGC ==> set equal to the number of arguments passed to
QTAwk as in Awk.

3. * ARGI ==> equal to the index value in ARGV of the next
command line argument to be processed. This value may be
changed and will change the array element of ARGV processed
next. When the last element of ARGV is the current input
file, ARGI is set to one of two integer values:

a) the integer value of the index of the last element of
ARGV plus one, or
b) if the last element of ARGV has a string index, ARGI is
set to zero.

Setting ARGI to zero, ARGC or a value for which there is no
element of ARGV with a corresponding index value, will cause
the current input file to be the last command line argument
processed.

4. ARGV ==> one-dimensional array with elements equal to the
arguments passed to QTAwk as in Awk. The index values are
integers ranging in value from zero to ARGC - 1. ARGV[0] ==
filename by which QTAwk was invoked, including full path
information.

5. * CYCLE_COUNT ==> value for the current cycle through the
outer pattern match loop for the current record. Value
incremented by the 'cycle' statement.

6. * DEGREES ==> TRUE/FALSE. Default value = FALSE. If FALSE
trigonometric functions assume radian values are passed and
return radian values. If TRUE trigonometric functions assume
degree values are passed and return degree values.

7. ENVIRON ==> one-dimensional array with elements equal to the
environment strings passed to QTAwk. The index values are
integers ranging in value from zero to the number of
environment strings defined less one.

8. * FALSE ==> predefined with zero, 0, constant value.

9. FILENAME ==> equal to string value of current filename,
including any path specified. If assigned a new value, the
file with a name equal to the new string value is opened (or
an error message displayed if the filename is illegal). The
new file becomes the current input file. The former input
file is not closed; reading from it may be resumed by
re-assigning FILENAME or by putting its name in ARGV for
future use, or it may be read with the 'fgetline' function.

10. FNR ==> equal to current record number of current input
file.

11. FS ==> value of current input field separator. The default
value for FS is /[\t-\r\s]+/, i.e., any consecutive white
space characters. If FS is set on the command line or in the
user utility then the following rules apply (see also RS
below):
a) setting to a single blank, ' ' or " ", will set FS to the
default value of /[\t-\r\s]+/,
b) setting to a single character or a value which when
converted to a string yields a string of a single
character in length, 'x' or "x", will cause the single
character to be used as the input record field separator,
c) setting to a regular expression, a multiple character
string or a value which when converted to a string yields
a multiple character string, the string will be
considered a regular expression and converted to the
regular expression internal form when the assignment is
made. Input records are scanned for a string matching the
regular expression and matching strings become field
separators. The length of matching strings is governed by
the LONGEST_EXP built-in variable.

12. * LONGEST_EXP ==> TRUE/FALSE. Default value = TRUE. If TRUE,
the longest string matching a regular expression is found in:
a) patterns
b) match operators, '~~' and '!~'
c) 'match' function
d) 'gsub' function
e) 'sub' function
f) 'strim' function
g) input record separator strings when RS is considered a
regular expression,
h) input record field separator strings when FS is
considered a regular expression,
i) field separator matching in 'split' function when field
separator is a regular expression

If FALSE then the first string matching the regular
expression is found.

13. * MAX_CYCLE ==> Maximum value for CYCLE_COUNT for cycling
through outer pattern match loop with current input record.
Default value == 100.

14. NF ==> equal to the number of fields in the current input
record

15. * NG ==> set to the number of the expression matching the
current input record for GROUP patterns.

16. NR ==> total number of records read so far across all input
files.

17. OFMT ==> output and string conversion format for numbers.
Default value of "%.6g".

18. OFS ==> output field separator. Default value of a single
blank, ' '.

19. ORS ==> output record separator. Default value of a single
newline character, '\n'.

20. * RETAIN_FS ==> TRUE/FALSE. Default value = FALSE. If FALSE
then OFS is used between fields to reconstruct $0 whenever a
field value is altered. If TRUE the original field separator
characters are retained in reconstructing $0 whenever a field
value is altered.

21. RS ==> input record separator. The default value for RS is
a single newline character, '\n'. If RS is set on the command
line or in the user utility then the following rules apply
(see also FS above):
a) setting to the null string, "", will set RS to the
regular expression /\n\n/. Thus, blank lines, i.e., two
consecutive newline characters, bound input records.
b) setting to a single character or a value which when
converted to a string yields a string of a single
character in length, 'x' or "x", will cause the single
character to be used as the input record separator,
c) setting to a regular expression, a multiple character
string or a value which when converted to a string yields
a multiple character string, the string will be
considered a regular expression and converted to the
regular expression internal form when the assignment is
made. The input file character stream is scanned for a
string matching the regular expression and matching
strings become record separators. The length of matching
strings is governed by the LONGEST_EXP built-in
variable.

22. * TRACE ==> control statement tracing. Default value =
FALSE. Determines whether statements are traced during
execution.

23. * TRANS_FROM ==> string used by 'stran' function for
translating from. Default value is
"ABCDEFGHIJKLMNOPQRSTUVWXYZ".

24. * TRANS_TO ==> string used by 'stran' function for
translating to. Default value is
"abcdefghijklmnopqrstuvwxyz".

25. * TRUE ==> predefined with one, 1, constant value.

26. * CLENGTH ==> length of string matched in 'case' statement

27. * CSTART ==> start of string matched in 'case' statement

28. * MLENGTH ==> length of string matched in match operators,
'~~' and '!~'

29. * MSTART ==> start of string matched in match operators,
'~~' and '!~'

30. RLENGTH ==> length of string matched in 'match' function

31. RSTART ==> start of string matched in 'match' function
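
The three rules for setting FS (item 11 above) can be modeled as
follows. This Python sketch is an assumed, simplified model and
not QTAwk code: the function name is hypothetical, '\s' in the
default value is assumed to denote the blank character, Python's
're' module stands in for QTAwk regular expressions, and the
LONGEST_EXP setting and any special treatment of leading white
space are ignored.

```python
import re

# Stand-in for the default FS, /[\t-\r\s]+/ (assumes \s is the blank).
DEFAULT_FS = re.compile(r"[\t-\r ]+")

def split_fields(record, fs=" "):
    """Split a record into fields under rules a) to c) for FS."""
    if fs == " ":
        return DEFAULT_FS.split(record)   # rule a: default white space
    if len(fs) == 1:
        return record.split(fs)           # rule b: one literal character
    return re.split(fs, record)           # rule c: regular expression

print(split_fields("a  b\tc"))
print(split_fields("a|b|c", "|"))
print(split_fields("a::b:c", ":+"))
```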

13.1 User Function Variable Argument Lists

1. * vargc ==> count of variable arguments passed to the current
function invocation.

2. * vargv ==> singly-dimensioned array of variable arguments
passed to the current function invocation. Indices are numeric,
starting at one, 1.

14.0 COMMAND LINE INVOCATION

There are two ways of specifying utilities to QTAwk:
1. Specifying the utility on the command line, e.g.,

QTAwk "/^$/{if(!bc++)print;next;}{bc=FALSE;print;}" file1

This short command line utility will read file1, printing
only the first blank line in a series of blank lines. All
other lines are printed.

Note that the "utility" has been enclosed in double quotes,
". This is necessary to keep PC/MS-DOS from interpreting the
utility as a file. In addition, if the utility contains
symbols recognized by PC/MS-DOS, e.g., the re-direction
operators, '<' or '>', the quotes keep PC/MS-DOS from
recognizing the symbols. If the utility contains quotes,
e.g., a constant string definition, then the imbedded quotes
should be preceded by a back-slash, '\'.

For example, the short utility:

QTAwk "{print FNR\" : \"$0;}" file1

prints each line of file1 preceded by the line number. The
constant string,

" : "

separates the line number from the line. Back-slashes must
precede the quotes surrounding the constant string.

2. -futilityfile
or
-f utilityfile

When a utility may be used frequently or grows too long to
include on the command line as above, it becomes necessary to
store it in a file. The utility may then be specified to
QTAwk with this option. The blank between the 'f' command
line option and the utility file name is optional.
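
To make the behavior of the two command line utilities shown
above concrete, the following Python sketch models what each one
does to a list of input lines. It is a model for illustration
only, not QTAwk code; both function names are hypothetical.

```python
def squeeze_blank_lines(lines):
    # Model of: /^$/{if(!bc++)print;next;}{bc=FALSE;print;}
    out = []
    bc = 0                       # blank lines seen in the current run
    for line in lines:
        if line == "":           # pattern /^$/ matches a blank line
            if bc == 0:          # if(!bc++): only the first is printed
                out.append(line)
            bc += 1
            continue             # next
        bc = 0                   # bc=FALSE
        out.append(line)
    return out

def number_lines(lines):
    # Model of: {print FNR" : "$0;}
    return ["%d : %s" % (fnr, line)
            for fnr, line in enumerate(lines, start=1)]

print(squeeze_blank_lines(["a", "", "", "", "b", "", "c"]))
print(number_lines(["first", "second"]))
```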

14.1 Multiple QTAwk Utilities

More than one utility file may be specified to QTAwk in this
manner. Each utility file specified is read in the order
specified and combined into a single QTAwk utility. In this
manner it is possible to keep constantly used pattern-actions or
user-defined functions in separate files and combine them into
QTAwk utilities as necessary. The order of the utility files
matters only for the order in which predefined patterns are
executed and the order in which pattern-action pairs are
executed. Thus, if a utility file contains only common
user-defined functions, it may be specified in any order in
relation to other utility files.

Scanning of the command line for arguments may be stopped with
the double hyphen command line argument, "--". This argument is
not passed to the QTAwk utility.

This method of specifying utilities to QTAwk cannot be combined
with the command line utility definition method.

The command line is scanned for all utility files specified with
the 'f' option prior to reading the utility files or any input
files. The utility files are then "removed" from the command line
and the command line argument count.

14.2 Setting the Field Separator

The QTAwk input field separator, FS, may be set on the command
line with the 'F' option.

QTAwk -F "/:/"

or

QTAwk -F/:/

The blank between the 'F' and the string or regular expression
defining the new input field separator is optional. This option
may only be specified once on the command line. The command line
is scanned for all 'F' options prior to reading any utility files
or input files. The option and the new value for FS are then
"removed" from the command line and the command line argument
count.

Another method is available for setting FS prior to reading the
input files. This method is more general, may be used multiple
times on the command line and may be used to set any utility
variable and not just FS.

14.3 Setting Variables on the Command Line




Including the following on the command line:

var = value

or

var=value

will set the variable 'var' to the value specified. 'var' may
be any built-in or user-defined variable in the QTAwk utility.
'var' must be a variable defined in the current QTAwk utility or
a run-time error will occur and QTAwk will stop processing.
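
The distinction between a variable assignment and any other argument
can be sketched as follows. This is a Python model under stated
assumptions, not QTAwk's own parser: the regular expression for a
variable name and the function name apply_argument are hypothetical,
and the error raised merely stands in for the run-time error
mentioned above.

```python
import re

# Assumption: a variable name is a letter or underscore followed by
# letters, digits or underscores.
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$")

def apply_argument(arg, variables):
    """Apply 'var=value' to variables; return True if arg was an assignment."""
    match = ASSIGNMENT.match(arg)
    if match is None:
        return False                     # not an assignment: treat as a file name
    name, value = match.group(1), match.group(2)
    if name not in variables:
        # Models the run-time error for a variable the utility does not define.
        raise RuntimeError(f"variable {name!r} is not defined in the utility")
    variables[name] = value
    return True
```

Both forms, "limit = 25" and "limit=25", set the same variable in this
model; an argument such as "states.dta" is left to be treated as a
file name.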

14.4 QTAwk Execution Sequence

QTAwk execution proceeds in the following sequence:

1. The command line is scanned for any 'f' and 'F' options. If
any such options are found, they are removed from the command
line.

2. The QTAwk utility is read and processed. If any 'f' options
were found in the preceding step, the associated utility
files are opened, read and processed in the order specified.
If no 'f' options were specified, the first command line
argument is processed as the QTAwk utility and then removed
from the command line arguments.

3. The ARGC and ARGV built-in variables are set according to
the command line parameters. The ARGI built-in variable is
set to 1.

4. Any "BEGIN" actions in the QTAwk utility are executed. This
is done prior to any further interpretation of the command
line arguments.

5. The command line argument, ARGV[ARGI], is examined. One of
two actions is taken depending on the form of the argument:

a) An argument of the form:

var = value

or

var=value

is interpreted as setting the specified variable to
the specified value.

b) Any other argument is interpreted as a file name. The
file specified is opened for input. If the file does not
exist, an error message is issued and execution is halted.
If a single hyphen, '-', is specified, it is interpreted
as representing the standard input file. If no command
line arguments are specified beyond the QTAwk utility or
variable setting commands, the standard input file is
read for input.

6. Any "INITIAL" actions in the QTAwk utility are executed.

7. The input file is read record by record and matched against
the patterns present in the QTAwk utility. If no
pattern/action pairs are given in the QTAwk utility, each
record is read, the NF, FNR, NR and field variables are set
and the record is then discarded. If an 'exit' or 'endfile'
statement is executed, action passes to the next step below.

8. When the end of the input file is reached or an "exit" or
"endfile" statement is executed, any 'FINAL' actions are
executed.

9. The input file is closed.

10. If an "exit" statement was executed, processing passes to
step 11) below, else the following steps are executed:

a) The element of ARGV corresponding to the current index
value of ARGI is sought. If none is found, processing
proceeds as if the "exit" statement was executed.
b) ARGI is set to the index value of the next element of
ARGV. If there is no next element of ARGV, processing
proceeds as if the "exit" statement was executed.
c) processing continues with step 5) above.

11. Any "END" actions in the QTAwk utility are executed.

12. QTAwk execution halts.
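
The ordering of the steps above can be illustrated with a small
model. The following Python sketch is a simplification under stated
assumptions and not QTAwk itself: the function name run, the
dictionary standing in for files on disk, and the event strings are
all hypothetical, and the 'exit' and 'endfile' statements are not
modelled.

```python
def run(args, files, variables):
    """Return the ordered list of events for a simplified QTAwk run.

    args      -- command line after the utility and -f/-F options are removed
    files     -- dict mapping file name -> list of records (stands in for disk)
    variables -- dict of utility variables that may be set from the command line
    """
    events = ["BEGIN"]                           # step 4
    if all("=" in arg for arg in args):
        args = args + ["-"]                      # step 5b: no file named, read standard input
    for arg in args:                             # steps 5 and 10: walk ARGV
        if "=" in arg:                           # step 5a: var=value
            name, value = arg.split("=", 1)
            variables[name.strip()] = value.strip()
            continue
        if arg not in files:                     # step 5b: a missing file halts execution
            raise FileNotFoundError(arg)
        events.append(f"INITIAL {arg}")          # step 6
        for record in files[arg]:                # step 7
            events.append(f"record {record}")
        events.append(f"FINAL {arg}")            # step 8; step 9 closes the file
    events.append("END")                         # step 11
    return events
```

Called with the arguments ["a.txt", "n=5", "b.txt"], the model runs
BEGIN once, then INITIAL, the records and FINAL for a.txt, sets n, and
only then processes b.txt, finishing with END. This mirrors the point
made in step 4: command line assignments are not seen by BEGIN
actions.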













Section 15.0 QTAwk Limits


15.0 LIMITS

QTAwk has the following limitations:

1024 fields

4096 characters per input record

4096 characters per formatted output record

256 characters in character class (with character ranges
expanded)

256 user-defined functions

256 local variables

256 global variables

1024 characters in constant strings

1024 characters in regular expressions on input

4096 characters in regular expressions after expansion of named
expressions and repetition operators.

4096 characters in strings used as regular expressions after
expansion of named expressions and repetition operators.

4096 characters in strings returned by 'replace' functions

4096 characters in input strings read by 'getline' and 'fgetline'
functions

4096 characters in strings after substitution for 'gsub' and
'sub' functions

4096 characters maximum in strings returned by following
functions:
1. copies
2. deletec
3. insert
4. overlay
5. remove










Section 16.0 Appendix I


16.0 Appendix I

ASCII character set
( escape sequences shown for non-printable characters )

dec hex char   | dec hex char | dec hex char | dec hex char
  0  00 NUL    |  32  20 \s   |  64  40 @    |  96  60 `
  1  01 SOH    |  33  21 !    |  65  41 A    |  97  61 a
  2  02 STX    |  34  22 "    |  66  42 B    |  98  62 b
  3  03 ETX    |  35  23 #    |  67  43 C    |  99  63 c
  4  04 EOT    |  36  24 $    |  68  44 D    | 100  64 d
  5  05 ENQ    |  37  25 %    |  69  45 E    | 101  65 e
  6  06 ACK    |  38  26 &    |  70  46 F    | 102  66 f
  7  07 \a BEL |  39  27 '    |  71  47 G    | 103  67 g
  8  08 \b BS  |  40  28 (    |  72  48 H    | 104  68 h
  9  09 \t HT  |  41  29 )    |  73  49 I    | 105  69 i
 10  0A \n LF  |  42  2A *    |  74  4A J    | 106  6A j
 11  0B \v VT  |  43  2B +    |  75  4B K    | 107  6B k
 12  0C \f FF  |  44  2C ,    |  76  4C L    | 108  6C l
 13  0D \r CR  |  45  2D -    |  77  4D M    | 109  6D m
 14  0E SO     |  46  2E .    |  78  4E N    | 110  6E n
 15  0F SI     |  47  2F /    |  79  4F O    | 111  6F o
 16  10 DLE    |  48  30 0    |  80  50 P    | 112  70 p
 17  11 DC1    |  49  31 1    |  81  51 Q    | 113  71 q
 18  12 DC2    |  50  32 2    |  82  52 R    | 114  72 r
 19  13 DC3    |  51  33 3    |  83  53 S    | 115  73 s
 20  14 DC4    |  52  34 4    |  84  54 T    | 116  74 t
 21  15 NAK    |  53  35 5    |  85  55 U    | 117  75 u
 22  16 SYN    |  54  36 6    |  86  56 V    | 118  76 v
 23  17 ETB    |  55  37 7    |  87  57 W    | 119  77 w
 24  18 CAN    |  56  38 8    |  88  58 X    | 120  78 x
 25  19 EM     |  57  39 9    |  89  59 Y    | 121  79 y
 26  1A SUB    |  58  3A :    |  90  5A Z    | 122  7A z
 27  1B ESC    |  59  3B ;    |  91  5B [    | 123  7B {
 28  1C FS     |  60  3C <    |  92  5C \    | 124  7C |
 29  1D GS     |  61  3D =    |  93  5D ]    | 125  7D }
 30  1E RS     |  62  3E >    |  94  5E ^    | 126  7E ~
 31  1F US     |  63  3F ?    |  95  5F _    | 127  7F DEL











ASCII character set (continued)

dec hex char | dec hex char | dec hex char | dec hex char
128  80 Ç    | 160  A0 á    | 192  C0 └    | 224  E0 α
129  81 ü    | 161  A1 í    | 193  C1 ┴    | 225  E1 ß
130  82 é    | 162  A2 ó    | 194  C2 ┬    | 226  E2 Γ
131  83 â    | 163  A3 ú    | 195  C3 ├    | 227  E3 π
132  84 ä    | 164  A4 ñ    | 196  C4 ─    | 228  E4 Σ
133  85 à    | 165  A5 Ñ    | 197  C5 ┼    | 229  E5 σ
134  86 å    | 166  A6 ª    | 198  C6 ╞    | 230  E6 µ
135  87 ç    | 167  A7 º    | 199  C7 ╟    | 231  E7 τ
136  88 ê    | 168  A8 ¿    | 200  C8 ╚    | 232  E8 Φ
137  89 ë    | 169  A9 ⌐    | 201  C9 ╔    | 233  E9 Θ
138  8A è    | 170  AA ¬    | 202  CA ╩    | 234  EA Ω
139  8B ï    | 171  AB ½    | 203  CB ╦    | 235  EB δ
140  8C î    | 172  AC ¼    | 204  CC ╠    | 236  EC ∞
141  8D ì    | 173  AD ¡    | 205  CD ═    | 237  ED φ
142  8E Ä    | 174  AE «    | 206  CE ╬    | 238  EE ε
143  8F Å    | 175  AF »    | 207  CF ╧    | 239  EF ∩
144  90 É    | 176  B0 ░    | 208  D0 ╨    | 240  F0 ≡
145  91 æ    | 177  B1 ▒    | 209  D1 ╤    | 241  F1 ±
146  92 Æ    | 178  B2 ▓    | 210  D2 ╥    | 242  F2 ≥
147  93 ô    | 179  B3 │    | 211  D3 ╙    | 243  F3 ≤
148  94 ö    | 180  B4 ┤    | 212  D4 ╘    | 244  F4 ⌠
149  95 ò    | 181  B5 ╡    | 213  D5 ╒    | 245  F5 ⌡
150  96 û    | 182  B6 ╢    | 214  D6 ╓    | 246  F6 ÷
151  97 ù    | 183  B7 ╖    | 215  D7 ╫    | 247  F7 ≈
152  98 ÿ    | 184  B8 ╕    | 216  D8 ╪    | 248  F8 °
153  99 Ö    | 185  B9 ╣    | 217  D9 ┘    | 249  F9 ∙
154  9A Ü    | 186  BA ║    | 218  DA ┌    | 250  FA ·
155  9B ¢    | 187  BB ╗    | 219  DB █    | 251  FB √
156  9C £    | 188  BC ╝    | 220  DC ▄    | 252  FC ⁿ
157  9D ¥    | 189  BD ╜    | 221  DD ▌    | 253  FD ²
158  9E ₧    | 190  BE ╛    | 222  DE ▐    | 254  FE ■
159  9F ƒ    | 191  BF ┐    | 223  DF ▀    | 255  FF


















Section 17.0 Appendix II


17.0 Appendix II

Major differences between QTAwk and Awk.

1. Expanded Regular Expressions
All of the Awk regular expression operators are allowed plus
the following:
a) complemented character class using the Awk notation,
'[^...]', as well as the Awk/QTAwk and C logical negation
operator, '[!...]'.

b) Matched character classes, '[#...]'. These classes are
used in pairs. The position of the character matched in
the first class of the pair, determines the character
which must match in the position occupied by the second
class of the pair.

c) Look-ahead Operator. r@t regular expression r is matched
only when followed by regular expression t.

d) Repetition Operator. r{n1,n2} at least n1 and up to n2
repetitions of regular expression r. 1 <= n1 <= n2

e) Named Expressions.
{named_expr} is replaced by the string value of the
corresponding variable.

2. Consistent statement termination syntax. The QTAwk Utility
Creation Tool utilizes the semi-colon, ';', to terminate all
statements. The practice in Awk of using newlines to
"sometimes" terminate statements is no longer allowed.

3. Expanded Operator Set
The Awk set of operators has been changed to more closely
match those of C. The Awk match operator, '~', has been
changed to '~~' so that the similarity between the match
operators, '~~' and '!~', and the equality operators, '==' and
'!=', is complete. The single tilde symbol, '~', reverts to
the C one's complement operator, an addition to the operator
set over Awk. An explicit string concatenation operator has
also been introduced. The operators new to QTAwk are:








Operation            Operator
tag                  $$
one's complement     ~
concatenation        ∩
shift left/right     << >>
matching             ~~ !~
bit-wise AND         &
bit-wise XOR         @
bit-wise OR          |
sequence             ,

The caret, '^', remains as the exponentiation operator. The
symbol '@' is used for the exclusive OR operator.

4. Expanded set of recognized constants in QTAwk utilities:
a) decimal integers,
b) octal integers,
c) hexadecimal integers,
d) character constants, and
e) floating point constants.

5. Expanded Pre-defined patterns giving more control:
a) INITIAL - similar to BEGIN. Actions executed after
opening each input file and before reading first record.
b) FINAL - similar to END. Actions executed after reading
last record of each input file and before closing file.
c) NOMATCH - actions executed for each input record for
which no pattern was matched.
d) GROUP - used to group multiple regular expressions for
search optimization. Can speed search by a factor of six.

6. True multi-dimensional arrays
The use of the comma in index expressions to simulate
multiple array indices is no longer supported. True multiple
indices are supported. Indexing is in the C manner,
'a[i1][i2]'. The SUBSEP built-in variable of Awk has been
dropped since it is no longer necessary.

7. Integer array indices as well as string indices
Array indices have been expanded to include integers as well
as the string indices of Awk. Indices are not automatically
converted to strings as in Awk. Thus, for true integer
indices, the index ordering follows the numeric sequence with
an integer index value of '10' following a value of '2'
instead of preceding it.





8. Arrays integrated into QTAwk
QTAwk integrates arrays with arithmetic operators so that
the operations are carried out on the entire array. QTAwk
also integrates arrays into user-defined functions so that
they can be passed to and returned from such functions in a
natural and intuitive manner. Awk does not allow returning
arrays from user-defined functions or allow arithmetic
operators to operate on whole arrays.

9. NEW keywords:

a) cycle
similar to 'next' except that the current record may be used
in restarting the outer pattern matching loop.
b) deletea
similar to 'delete' except that ALL array values are
deleted.
c) switch, case, default
similar to C syntax with the allowed 'switch' and 'case'
values expanded to include any legal QTAwk expression,
evaluated at run-time. The expressions may evaluate to
any value including any numeric value, string or regular
expression.
d) local
new keyword to allow the declaration and use of local
variables within compound statements, including
user-defined functions. Its use in user defined functions
instead of the Awk practice of defining excess formal
parameters, leads to easier to read and maintain
functions. The C 'practice' of allowing initialization in
the 'local' statement is followed.
e) endfile
similar to 'exit'. Simulates end of current input file
only, any remaining input files are still processed.

10. Expanded arithmetic functions
QTAwk includes 18 built-in arithmetic functions. All of the
functions supported by Awk plus the following:
a) acos(x)
b) asin(x)
c) cosh(x)
d) fract(x)
e) log10(x)
f) pi() or pi
g) sinh(x)





11. Expanded string functions
QTAwk includes 33 built-in string functions. All of the
functions supported by Awk plus the following:
a) center(s,w) or center(s,w,c)
b) copies(s,n)
c) deletec(s,p,n)
d) insert(s1,s2,p)
e) justify(a,n,w) or justify(a,n,w,c)
f) overlay(s1,s2,p)
g) remove(s,c)
h) replace(s)
i) sdate(fmt)
j) srange(c1,c2)
k) srev(s)
l) stime(fmt)
m) stran(s) or stran(s,st) or stran(s,st,sf)
n) strim(s) or strim(s,c) or strim(s,c,d)
o) strlwr(s)
p) strupr(s)

12. New Miscellaneous functions
a) The function 'rotate(a)' is provided to rotate the
elements of the array a.
b) execute(s) or execute(s,se) or execute(s,se,rf) - execute
string s
c) execute(a) or execute(a,se) or execute(a,se,rf) - execute
array a
d) pd_sym - access pre-defined symbols
e) ud_sym - access user defined symbols

13. New I/O functions
I/O function syntax has been made consistent with syntax of
other functions. The redirection operators, '<', '>' and
'>>', and pipeline operator, '|', have been deleted as
excessively error prone in expressions. The functional syntax
of the 'getline' function has been made identical to that of
the other built-in functions. The new functions 'fgetline',
'fprint' and 'fprintf' have been introduced for reading and
writing to files other than the current input file. The new
functions 'getc()' and 'fgetc()' have been introduced for
single character input.

14. Expanded capability of formatted Output.
The limited output formatting available with the Awk 'printf'
function has been expanded by adopting the complete output
format specification of the draft ANSI C standard.




15. Use of 'local' keyword
The 'local' keyword has been introduced to allow for
variables local to user-defined functions (and any compound
statement). This expansion makes the Awk practice of defining
'extra' formal parameters no longer necessary.

16. Expanded user-defined functions
With the 'local' keyword, QTAwk allows the user to define
functions that may accept a variable number of arguments.
Functions, such as finding the minimum/maximum of a variable
number of variables, are possible with one function rather
than defining separate functions for each possible
combination of arguments.

17. User controlled trace capability
A user controlled statement trace capability has been added.
This gives the user a simple to use mechanism to trace
utility execution. Rather than adding 'print' statements,
merely re-defining the value of a built-in variable will give
utility execution trace information, including utility line
number.

18. Expanded built-in variable list
With 30 built-in variables, QTAwk includes all (with the
exception of SUBSEP) of the built-in variables of Awk plus
the following:
a) _arg_chk - used to determine whether to check number of
arguments passed to user-defined functions.
b) ARGI - index value in ARGV of next command line argument.
Gives more control of command line argument processing.
c) CYCLE_COUNT - count number of outer loop cycles with
current input record.
d) DEGREES - if TRUE, trigonometric functions assume degree
values, radians if FALSE.
e) ENVIRON - array of environment strings passed to QTAwk
f) FALSE - pre-defined with constant value, 0.
g) TRUE - predefined with constant value, 1
h) LONGEST_EXP - used to control whether the longest or the
first string matching a regular expression is found.
i) MAX_CYCLE - maximum number of outer loop cycles permitted
with current input record.
j) NG - equal to the number of the regular expression in a
group matching a string in the current input record.
k) RETAIN_FS - if TRUE the original characters separating
the fields of the current input record are retained
whenever a field is changed, causing the input record to
be re-constructed. If FALSE the output field separator,
OFS, is used to separate fields in the current input
record during reconstruction. The latter practice is the
only method available in Awk.
l) TRACE - value used to determine utility statement
tracing.
m) TRANS_FROM/TRANS_TO - strings used by 'stran' function if
second and/or third arguments not specified.
n) CLENGTH - similar to 'RLENGTH' of Awk. Set whenever a
'case' value evaluates to a regular expression.
o) CSTART - similar to 'RSTART' of Awk. Set whenever a
'case' value evaluates to a regular expression.
p) MLENGTH - similar to 'RLENGTH' of Awk. Set whenever a
stand-alone regular expression is encountered in
evaluating a pattern.
q) MSTART - similar to 'RSTART' of Awk. Set whenever a
stand-alone regular expression is encountered in
evaluating a pattern.
r) vargc - used only in user-defined functions defined with
a variable number of arguments. At runtime, set equal to
the actual number of variable arguments passed.
s) vargv - used only in user-defined functions defined with
a variable number of arguments. At runtime, a singly
dimensioned array with each element set to the argument
actually passed.

19. Definition of built-in variable, RS, expanded to include
string form. If RS is set to a string longer than one character,
the string is interpreted as a regular expression and any
string matching the regular expression becomes a record separator.

20. In QTAwk, setting built-in variable, "FILENAME", to another
value will change the current input file. Setting the
variable in Awk, has no effect on current input file.

21. Corrected admitted problems with Awk. The problems
mentioned on page 182 of "The Awk Programming Language" have
been corrected. Specifically: 1) true multi-dimensional
arrays have been implemented, 2) the 'getline' syntax has
been made to match that of other functions, 3) declaring
local variables in user-defined functions has been corrected,
4) intervening blanks are allowed between the function call
name and the opening parenthesis (in fact, under QTAwk it is
permissible to have no opening parenthesis or argument list
for user-defined functions that have been defined with no
formal arguments).
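
The index ordering described in item 7 above can be seen in any
language that distinguishes string from integer comparison. The
following Python lines are only an illustration of that difference
and say nothing about how QTAwk stores its arrays; the variable names
are hypothetical.

```python
# The same three index values, once as strings and once as integers.
string_indices = ["1", "2", "10"]
integer_indices = [1, 2, 10]

# Strings compare character by character, so "10" sorts before "2".
print(sorted(string_indices))    # ['1', '10', '2']

# Integers compare numerically, so 10 follows 2.
print(sorted(integer_indices))   # [1, 2, 10]
```

The first ordering is what results when every index is converted to
a string, as in Awk; the second is the numeric sequence that item 7
describes for true integer indices.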








Section 18.0 Appendix III


18.0 Appendix III


The following QTAwk utility is designed to search C source code
files for keywords defined in the ANSI C standard. It is included
here to illustrate the use of the 'GROUP' keyword.

# QTAwk utility to scan C source files for keywords
# defined in the ANSI C standard keywords:
# macro or function names defined in the standard
# types or constants defined in the standard
#
# program to illustrate GROUP pattern keyword
#
# input: C source file
# output: all lines containing ANSI C standard defined keywords
#
# use 'GROUP' pattern keyword to form one large GROUP of
# patterns to speed search. Only two actions defined:
# 1) action to print macro or function names
# 2) action to print types or constants
#
#
BEGIN {
#
# ANSI C key words
#
# expression for leader
ldr = /(^|[\s\t])/;
# opening function parenthesis - look-ahead to find
o_p = /@[\s\t]*\(/;
#
# define strings for formatted output
#
tls = "Total Lines Scanned: %lu\n";
tlc = "Total Line containing macro/function names: %lu\n";
tlt = "Total Line containing type/constant names: %lu\n";
}
#
#
# Following are macro or functions names as defined
# by ANSI C standard
#
# 1
GROUP /{ldr}assert{o_p}/
# 2
# Following regular expression split across 2 lines
# for documentation only.
GROUP /{ldr}is(al(num|pha)|cntrl|x?digit|graph|
p(rint|unct)|space|(low|upp)er){o_p}/
# 3
GROUP /{ldr}to(low|upp)er{o_p}/
# 4
GROUP /{ldr}set(locale|v?buf){o_p}/
# 5
GROUP /{ldr}a(cos|sin|tan2?|bort){o_p}/
# 6
GROUP /{ldr}(cos|sin|tan)h?{o_p}/
# 7
GROUP /{ldr}(fr|ld)?exp{o_p}/
# 8
GROUP /{ldr}log(10)?{o_p}/
# 9
GROUP /{ldr}modf{o_p}/
# 10
GROUP /{ldr}pow{o_p}/
# 11
GROUP /{ldr}sqrt{o_p}/
# 12
GROUP /{ldr}ceil{o_p}/
# 13
GROUP /{ldr}(f|l)?abs{o_p}/
# 14
GROUP /{ldr}f(loor|mod){o_p}/
# 15
GROUP /{ldr}jmp_buf{o_p}/
# 16
GROUP /{ldr}(set|long)jmp{o_p}/
# 17
GROUP /{ldr}signal{o_p}/
# 18
GROUP /{ldr}raise{o_p}/
# 19
GROUP /{ldr}va_(arg|end|list|start){o_p}/
# 20
GROUP /{ldr}re(move|name|wind){o_p}/
# 21
GROUP /{ldr}tmp(file|nam){o_p}/
# 22
GROUP /{ldr}(v?[fs])?printf{o_p}/
# 23
GROUP /{ldr}[fs]?scanf{o_p}/
# 24
GROUP /{ldr}f?get(c(har)?|s|env){o_p}/
# 25
GROUP /{ldr}f?put(c(har)?|s){o_p}/
# 26
GROUP /{ldr}ungetc{o_p}/
# 27
# Following regular expression split across 2 lines
# for documentation only.
GROUP /{ldr}f(close|flush|(re)?open|read|write|
[gs]etpos|seek|tell|eof|ree|pos_t){o_p}/
# 28
GROUP /{ldr}clearerr{o_p}/
# 29
GROUP /{ldr}[fp]error{o_p}/
# 30
GROUP /{ldr}ato[fil]{o_p}/
# 31
# Following regular expression split across 2 lines
# for documentation only.
GROUP /{ldr}str(to(d|k|u?l)|n?c(py|at|mp)|
coll|r?chr|c?spn|pbrk|str|error|len){o_p}/
# 32
GROUP /{ldr}s?rand{o_p}/
# 33
GROUP /{ldr}(c|m|re)?alloc{o_p}/
# 34
GROUP /{ldr}_?exit{o_p}/
# 35
GROUP /{ldr}(f|mk|asc|c|gm|local|strf)?time{o_p}/ {
printf("Macro/function\n%uE - %luR: %s\n%s\n",NG,FNR,$0,$$0);
mf_count++;
}
#
# following are types or constants
#
# 36
GROUP /errno/
# 37
GROUP /NULL/
# 38
GROUP /offsetof/
# 39
GROUP /(fpos|ptrdiff|size|wchar)_t/
# 41
GROUP /NDEBUG/
# 42
GROUP /LC_(ALL|COLLATE|CTYPE|NUMERIC|TIME)/
# 43
GROUP /E(DOM|RANGE|OF)/
# 44
GROUP /HUGE_VAL/
# 45
GROUP /sig_atomic_t/
# 46
GROUP /SIG(_(DFL|ERR|IGN)|ABRT|FPE|ILL|INT|SEGV|TERM)/
# 47
GROUP /FILE/
# 48
GROUP /_IO[FLN]BF/
# 49
GROUP /BUFSIZ/
# 50
GROUP /L_tmpnam/
# 51
GROUP /(OPEN|RAND|TMP|U(CHAR|INT|LONG|SHRT))_MAX/
# 52
GROUP /SEEK_(CUR|END|SET)/
# 53
GROUP /std(err|in|out)/
# 54
GROUP /l?div_t/
# 55
GROUP /CLK_TCK/
# 56
GROUP /(clock|time)_t/
# 57
GROUP /tm_(sec|min|hour|[mwy]day|mon|year|isdst)/
# 58
GROUP /CHAR_(BIT|M(AX|IN))/
# 59
GROUP /(INT|LONG|S(CHAR|HRT))_(M(IN|AX))/
# 60
GROUP /(L?DBL|FLT)_((MANT_)?DIG|EPSILON|M(AX|IN)(_(10_)?EXP)?)/
# 61
GROUP /FLT_R(ADIX|OUNDS)/ {
printf("type/constant\n%uE - %luR: %s\n%s\n",NG,FNR,$0,$$0);
tc_count++;
}

FINAL {
printf(tls,FNR);
printf(tlc,mf_count);
printf(tlt,tc_count);
}
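
The idea behind the utility above, many expressions searched as one
group with the number of the matching expression reported, can be
modelled with an ordinary regular expression library. The following
Python sketch is an analogy under stated assumptions and is not how
QTAwk implements GROUP: the names LDR, O_P, PATTERNS and which_pattern
are hypothetical, only three of the expressions are shown, and the
look-ahead operator '@' is written in Python's '(?=...)' form.

```python
import re

LDR = r"(?:^|[ \t])"         # leader: start of line or a blank/tab
O_P = r"(?=[ \t]*\()"        # look-ahead for the opening parenthesis

PATTERNS = [
    r"is(?:alnum|alpha|digit)",
    r"to(?:low|upp)er",
    r"printf",
]

# One capturing group per pattern, so the group index identifies the pattern.
COMBINED = re.compile("|".join(f"({LDR}{p}{O_P})" for p in PATTERNS))

def which_pattern(line):
    """Return the 1-based number of the pattern matched at the leftmost
    position in line, or None if no pattern matches."""
    match = COMBINED.search(line)
    return match.lastindex if match else None
```

In this model the returned number plays the role that NG plays in
the utility above. The line "c = toupper(c);" reports pattern 2,
while "isalpha = 1;" reports no match because no opening parenthesis
follows the name.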



















































Section 19.0 Appendix IV


19.0 Appendix IV

This is a complete copy of the data file, states.dta, used to
illustrate QTAwk. The fields of a sample record for the default
field separator FS = /{_z}+/ are shown below, followed by the
fields for the field separator FS = /{_w}+[\#()]({_w}+|$)/


Fields for Default FS = /{_z}+/
1. US -- country/continent name
2. # -- separator
3. 47750 -- area, square miles
4. # -- separator
5. 4515 -- population in thousands
6. # -- separator
7. LA -- abbreviation (US & Canada only)
8. # -- separator
9. Baton -- first half capital city name
10. Rouge -- second half capital city name
11. ( -- separator
12. Louisiana -- state/country name
13. ) -- Terminator

Fields for FS = /[\s\t]+[\#()]([\s\t]+|$)/:
1. US -- country/continent name
2. 47750 -- area, square miles
3. 4515 -- population in thousands
4. LA -- abbreviation (US & Canada only)
5. Baton Rouge -- full capital city name
6. Louisiana -- state/country name
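
The two field lists above can be reproduced with a short model. This
is a Python sketch rather than QTAwk code: Python's whitespace split
stands in for the default separator, the QTAwk escape '\s' is written
as a literal blank, and dropping the empty string left after the
closing parenthesis is an assumption about how the trailing separator
is treated.

```python
import re

record = "US # 47750 # 4515 # LA # Baton Rouge ( Louisiana )"

# Default separator: runs of white space.
default_fields = record.split()

# Second separator: white space, one of '#', '(' or ')', then white
# space or the end of the record.
separator = re.compile(r"[ \t]+[#()](?:[ \t]+|$)")
fields = [f for f in separator.split(record) if f != ""]
```

With the default separator the record yields 13 fields, with "Baton"
and "Rouge" as fields 9 and 10. With the second separator it yields
6 fields and the capital city name stays in one piece.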

US # 10461 # 4375 # MD # Annapolis ( Maryland )
US # 40763 # 5630 # VA # Richmond ( Virgina )
US # 2045 # 620 # DE # Dover ( Delaware )
US # 24236 # 1995 # WV # Charleston ( West Virginia )
US # 46047 # 12025 # PA # Harrisburg ( Pennsylvania )
US # 7787 # 7555 # NJ # Trenton ( New Jersey )
US # 52737 # 17895 # NY # Albany ( New York )
US # 9614 # 535 # VT # Montpelier ( Vermont )
US # 9278 # 975 # NH # Concord ( New Hampshire )
US # 33265 # 1165 # ME # Augusta ( Maine )
US # 8286 # 5820 # MA # Boston ( Massachusetts )
US # 5019 # 3160 # CT # Hartford ( Conneticut )
US # 1212 # 975 # RI # Providence ( Rhode Island )
US # 52669 # 6180 # NC # Raleigh ( North Carolina )
US # 31116 # 3325 # SC # Columbia ( South Carolina )
US # 58914 # 5820 # GA # Atlanta ( Georgia )
US # 51704 # 4015 # AL # Montgomery ( Alabama )
US # 42143 # 4755 # TN # Nashville ( Tennessee )
US # 40414 # 3780 # KY # Frankfort ( Kentucky )
US # 58668 # 10925 # FL # Tallahassee ( Florida )
US # 68139 # 4395 # WA # Olympia ( Washington )
US # 412582 # 8985 # OR # Salem ( Oregon )
US # 147045 # 830 # MT # Helena ( Montana )
US # 83566 # 1020 # ID # Boise ( Idaho )
US # 110562 # 945 # NV # Carson City ( Nevada )
US # 84902 # 1690 # UT # Salt Lake City ( Utah )
US # 97808 # 525 # WY # Cheyenne ( Wyoming )
US # 104094 # 3210 # CO # Denver ( Colorado )
US # 158704 # 25620 # CA # Sacramento ( California )
US # 121594 # 1425 # NM # Sante Fe ( New Mexico )
US # 114002 # 3040 # AZ # Phoenix ( Arizona )
US # 70702 # 690 # ND # Bismark ( North Dakota )
US # 77120 # 715 # SD # Pierre ( South Dakota )
US # 77350 # 1615 # NE # Lincoln ( Nebraska )
US # 82282 # 2450 # KS # Topeka ( Kansas )
US # 69697 # 5040 # MO # Jefferson City ( Missouri )
US # 69957 # 3375 # OK # Oklahoma City ( Oklahoma )
US # 266805 # 16090 # TX # Austin ( Texas )
US # 86614 # 4205 # MN # St Paul ( Minnesota )
US # 56275 # 2970 # IA # Des Moines ( Iowa )
US # 53191 # 2375 # AR # Little Rock ( Arkansas )
US # 47750 # 4515 # LA # Baton Rouge ( Louisiana )
US # 47691 # 2640 # MS # Jackson ( Mississippi )
US # 57872 # 11620 # IL # Springfield ( Illinois )
US # 66213 # 4800 # WI # Madison ( Wisconsin )
US # 97107 # 9090 # MI # Lansing ( Michigan )
US # 36417 # 5585 # IN # Indianapolis ( Indiana )
US # 44786 # 10760 # OH # Columbus ( Ohio )
US # 591004 # 515 # AK # Juneau ( Alaska )
US # 6473 # 1045 # HI # Honolulu ( Hawaii )
Canada # 255285 # 2370 # AB # Edmonton ( Alberta )
Canada # 366255 # 2885 # BC # Victoria ( British Columbia )
Canada # 251000 # 1060 # MB # Winnipeg ( Manitoba )
Canada # 251700 # 1010 # SK # Regina ( Saskatchewan )
Canada # 21425 # 875 # NS # Halifax ( Nova Scotia )
Canada # 594860 # 6585 # PQ # Quebec ( Quebec )
Canada # 2184 # 126 # PE # Charlottetown ( Prince Edward Island )
Canada # 156185 # 585 # NF # St John's ( New Foundland )
Canada # 28354 # 715 # NB # Fredericton ( New Brunswick )
Canada # 412582 # 8985 # ON # Toronto ( Ontario )
Canada # 1304903 # 51 # NW # Yellowknife ( Northwest Territories )
Canada # 186300 # 23 # YU # Whitehorse ( Yukon Territory )
Europe # 92100 # 14030 # Bonn ( West Germany )
Europe # 211208 # 55020 # Paris ( France )
Europe # 94092 # 56040 # London ( United Kingdom )
Europe # 27136 # 3595 # Dublin ( Ireland )
Europe # 194882 # 38515 # Madrid ( Spain )
Europe # 116319 # 56940 # Rome ( Italy )
Europe # 8600383 # 275590 # Moscow ( Russia )
Europe # 120728 # 37055 # Warsaw ( Poland )
Europe # 32377 # 7580 # Vienna ( Autria )
Europe # 35921 # 10675 # Budapest ( Hungary )










































Section 20.0 Appendix V


20.0 Appendix V

QTAwk error returns. When QTAwk encounters an error which it
cannot correct, it generates and displays an error message in
the format:

1: Error (xxxx): Error Message Text
2: From 'execute' Function.
3: Action File line: llll
4: Scanning File: utility filename
5: Line: llll
6: Record: rrrr

Line 2 is generated only if the error occurred during execution
of the 'execute' function. Lines 4 to 6 are displayed only if an
input file is currently being scanned.

On a normal exit QTAwk returns a value of zero, 0, to PC/MS-DOS.
This value may be set with the 'exit' statement. On encountering
an error which generates an error message, QTAwk exits with a
non-zero value between 1 and 6. The warning messages below
exit with a value of zero. The exit values generated on detecting
an error are:

1. Warning Errors            ==> 0 ,         error value < 1000
2. File Errors               ==> 2 , 2000 <= error value < 3000
3. Regular Expression Errors ==> 3 , 3000 <= error value < 4000
4. Run-Time Errors           ==> 4 , 4000 <= error value < 5000
5. Interpretation Errors     ==> 5 , 5000 <= error value < 6000
6. Memory Error              ==> 6 , 6000 <= error value < 7000

The 'error value' range in the above list shows the range of the
numeric value displayed in the error message for that type of
error.
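
The relationship between the error number and the exit value in the
list above can be written out as a small function. This is a Python
sketch of the table only; the function name exit_value is
hypothetical, and rejecting numbers that fall outside the listed
ranges is an assumption, since the list does not say what happens
there.

```python
def exit_value(error_number):
    """Map an error number from line 1 of the message to the exit value."""
    if error_number < 0:
        raise ValueError("error numbers are not negative")
    if error_number < 1000:
        return 0                         # warnings exit with zero
    category = error_number // 1000      # 2000-2999 -> 2, ..., 6000-6999 -> 6
    if 2 <= category <= 6:
        return category
    # Assumption: ranges not listed in the table are rejected.
    raise ValueError(f"error number {error_number} is not in a listed range")
```

A batch file that tests the exit value can therefore tell the class
of error without parsing the message text: error 2010 gives 2 and
error 4120 gives 4.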

The error number displayed on line 1 may be used to find the
error diagnostic from the following listing.

1. Warning Errors
0

Invalid Option.
The only valid command line options are:
-- -> to stop command line option scanning
-f -> to specify a utility filename
-F -> to specify the input field separator.





10
Warning, Attempt To Close File Not Open.
An attempt has been made to close a file with the 'close'
function, that is not currently open.

2. File Errors
2000
2010
File Not Found: {filename}
The filename given in the error message was
specified on the command line. The file named does not
exist. QTAwk displays this error message and terminates
processing.

3. Regular Expression Errors
3000
Stop pattern without a start
The range pattern has the form:

expression , expression

The comma, ',', is used to separate the expressions of
the pattern. The associated action is executed when the
first or start expression is TRUE. Execution continues
for every input record until, and including, the second
or stop expression is TRUE. A comma, ',', has been found
in a pattern without the first expression. This is
usually caused by unbalanced braces, "{}". Check all
prior braces to ensure that every left brace, '{', has an
associated terminating right brace, '}'.

3010
Already have a stop pattern
The range pattern has the form: expression , expression
The comma, ',', is used to separate the expressions of
the pattern. The associated action is executed when the
first or start expression is TRUE. Execution continues
for every input record until, and including, the second
or stop expression is TRUE. A second comma, ',', has been
found in a pattern. This may be caused by the unbalanced
braces as for error number 3000 above. A second cause may
stem from the fact that new patterns for pattern/action
pairs must be separated from previous patterns by a
new-line if no action, i.e., the default action, is
associated with the previous pattern.
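
The behaviour of a range pattern described for errors 3000 and 3010
can be modelled as follows. This is a Python sketch, not QTAwk code:
the function name range_select and its arguments are hypothetical,
and closing the range on the same record that opened it, when that
record also satisfies the stop expression, is an assumption carried
over from Awk.

```python
def range_select(records, start, stop):
    """Yield each record from one where start(record) is true through
    the next one where stop(record) is true, inclusive."""
    active = False
    for record in records:
        if not active and start(record):
            active = True                # the start expression opens the range
        if active:
            yield record                 # the action runs for this record
            if stop(record):
                active = False           # the stop record itself is included
```

Applied to the records a, begin, x, end, b, begin, end with start
and stop tests for "begin" and "end", the model selects begin, x, end
and then begin, end; the records a and b fall outside both ranges.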





4. Run-Time Errors
4000
Command Line Variable Set - Not Used.
Only variables defined in the QTAwk utility may be set on
the command line with the form "variable = value"

4010
Missing Opening Parenthesis On 'while'.
The proper syntax for the 'while' statement is:

while ( conditional_expression ) statement

The left parenthesis, '(', starting the conditional
expression was not found following the 'while' keyword.
Check that the syntax conforms to the form above.

4020
Missing Opening Parenthesis On 'switch'.
The form of the 'switch' construct is:

switch ( switch_expression ) statement

The left parenthesis, '(', was not found following the
'switch' keyword.

4030
Unable to compile regular Expression
QTAwk was unable to convert a regular expression to
internal form. Please contact the QTAwk author with
information on the circumstances of this error message.

4040
Internal Array Error.
Internal error. Please contact the QTAwk author with
information on the circumstances of this error message.

4050
pre/post '++' or '--' Needs a Variable.
The pre/post ++/-- operators operate on variables only.
This error is usually generated because of an incorrect
understanding of the precedence rules. The operator was
associated differently by QTAwk when the utility line was
parsed than the user expected. Check the precedence rules
and the syntax of the line cited.

4060
'$$' will accept '0' argument only.
The '$$' operator assumes the value of the string matched
by the last explicit or implicit match operator, '~' or
'!~'. Implicit matching is done in patterns. The only
value which is permissible for the '$$' operator is zero.

4070
Undefined Symbol.
A symbol has been found which QTAwk does not recognize.
This error should not occur and represents an internal
error. Please contact the QTAwk author with information
on the circumstances of this error message.

4080
Internal Error #200
Internal error. Please contact the QTAwk author with
information on the circumstances of this error message.

4090
Attempt To Delete Non-existent Array Element.
The 'delete' statement was followed with a reference to
an array element that does not exist.

4100
Internal GROUP parse error 1001.
Internal error. Please contact the QTAwk author with
information on the circumstances of this error message.

4100
Warning, Attempt To Close File Not Successful.
An attempt has been made to close a file with the 'close'
function. The close action has not been successful,
usually because the file named does not exist. Check the
name specified.

4110
'strim' Function Result Exceeds Limits.
The built-in function 'strim' has been called with a
string to trim which exceeds the maximum limit of 4096
characters.

4120
Cannot Nest 'execute' Function.
The 'execute' function cannot be called from within a
string/array that is itself being executed by this
function. An attempt has been made to do this. Check the
string/array which was executed.

4130
'(g)sub' Function Result Exceeds Limits.
The function 'sub' or 'gsub' has been called to replace
matching strings and the resultant string after
replacement would exceed the limit of 4096 characters.

4140
Missing ')' for Function Call.
A built-in function has been called with a left
parenthesis starting the argument list, but no right
parenthesis terminating the argument list. Check the line
in question.

4150
[sf]printf functions take a minimum of 1 argument.
The first arguments for the 'fprintf' and 'sprintf'
functions are necessary to specify the file or string
respectively as the target for the output string.

4160
[sf]printf needs format string as first argument
The 'fprintf' and 'sprintf' functions need a format
string which specifies the output. The format string is
the second argument and must be specified for these
functions.

4170
4180
4190
Format Specifications Exceed Arguments To Print.
The 'printf', 'fprintf' and 'sprintf' functions use a
format string to control the output. Certain character
strings in the format control the output of numerics and
embedded strings. There must be exactly one extra
argument for each of these character control strings.
This error occurs when there are more control strings
than extra arguments.
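
The rule of one extra argument per control string can be checked
mechanically. The following Python sketch is an illustration only. The
pattern SPEC is a simplified, hypothetical description of a control
string and does not reproduce the full QTAwk format grammar. The names
count_specs and check_printf are likewise hypothetical.

```python
import re

# Assumed shape of a control string: '%', optional flags, width and
# precision, then a single conversion letter.
SPEC = re.compile(r"%[-+ #0]*\d*(?:\.\d+)?[a-zA-Z]")


def count_specs(fmt):
    """Count control strings that consume an argument ('%%' is literal)."""
    return len(SPEC.findall(fmt.replace("%%", "")))


def check_printf(fmt, *args):
    """Raise if the format names more control strings than arguments."""
    needed = count_specs(fmt)
    if needed > len(args):
        raise ValueError("Format Specifications Exceed Arguments To Print.")
    return needed


print(count_specs("%d items cost %5.2f%%\n"))
```

The format shown contains two control strings, so at least two extra
arguments would have to follow it.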

4220
Third Argument For '(g)sub' Function Must Be A Variable.
The optional third argument of the 'sub' and 'gsub'
functions must be a variable. The string value of this
variable is replaced after string substitution has been
accomplished.

4230
Excessive Length Specified 'substr' Function.
The form of the 'substr' function is: substr(s,p[,n]).
The third argument is optional, but if specified cannot
exceed 4096.

4240
Start Position Specified Too Great, 'substr' Function.
The form of the 'substr' function is: substr(s,p[,n]).
The second argument cannot exceed 4096.

4250
Incorrect Time Format.
The form of the time function is: stime(fmt) where fmt is
converted to an integer and must be in the range:

0 <= fmt <= 4

4260
Incorrect Date Format.
The form of the date function is: sdate(fmt) where fmt is
converted to an integer and must be in the range:

0 <= fmt <= 16

4270
'rotate' Function Needs Array Member As Argument.
The argument for the 'rotate' function must be an array.
If a variable is used, make sure that it is an array when
the function is called.

4280
Excessive Width Specified 'center' Function.
The second argument specifies the width of the line in
which to center the string value of the first argument.
The width specified cannot exceed 4096.

4290
Excessive Copies Specified 'copies' Function.
The second argument of the 'copies' function specifies
the number of copies of the string value of the first
argument to return. The number of copies specified cannot
exceed 65,536. See error number 4300 below also.

4300
'copies' Function Return Exceeds Limits.
The 'copies' function returns the string value of the
first argument, copied the number of times specified by
the second argument. The total length of the returned
string:

arg2 * length(arg1)

cannot exceed 4096 characters.
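
The two limits on 'copies' (error numbers 4290 and 4300) can be
expressed as a short model. This is a sketch in Python rather than
QTAwk. The constant names are hypothetical, and only the limits quoted
in the text are used.

```python
MAX_STRING = 4096    # limit on the returned string, error 4300
MAX_COPIES = 65536   # limit on the copy count, error 4290


def copies(text, count):
    """Model of copies(arg1, arg2) with the documented limit checks."""
    if count > MAX_COPIES:
        raise ValueError("Excessive Copies Specified 'copies' Function.")
    # arg2 * length(arg1) must not exceed the string limit.
    if count * len(text) > MAX_STRING:
        raise ValueError("'copies' Function Return Exceeds Limits.")
    return text * count


print(copies("ab", 3))
```

A two character string may therefore be copied at most 2048 times in
this model, since 2049 copies would give 4098 characters.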

4310
Excessive Characters Specified 'deletec' Function.
The 'deletec' function deletes the number of characters
specified by the third argument starting at the position
specified by the second argument from the string value of
the first argument. The form of the function is:

deletec(string,start,num)

The number of characters specified to delete, 'num',
cannot exceed 65,536. If 'num' is zero or exceeds the
number of characters remaining in the string from the
start position, then the remainder of the string is
deleted. See also error 4320 below.

4320
Excessive Characters Specified 'deletec' Function.
The 'deletec' function deletes the number of characters
specified by the third argument starting at the position
specified by the second argument from the string value of
the first argument. The form of the function is:

deletec(string,start,num)

If the start is negative or greater than the length of the
string value of the first argument, then no characters
are deleted.
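
The rules given for 'deletec' under error numbers 4310 and 4320 can be
collected into one model. The Python sketch below is an illustration
and not the QTAwk implementation. It assumes that positions are counted
from 1, that a start of zero deletes nothing, and that 'num' is not
negative. None of these three points is stated in the text above.

```python
def deletec(string, start, num):
    """Model of deletec(string,start,num) under the stated assumptions."""
    # A start outside the string deletes nothing.
    if start < 1 or start > len(string):
        return string
    begin = start - 1                 # assumed 1-based position
    remaining = len(string) - begin
    # Zero, or more than what remains, deletes the rest of the string.
    if num == 0 or num > remaining:
        return string[:begin]
    return string[:begin] + string[begin + num:]


print(deletec("abcdef", 2, 3))
print(deletec("abcdef", 3, 0))
```

Under these assumptions the first call leaves "aef" and the second
leaves "ab".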

4330
'deletec' Intermediate Result Exceeds Limits.
The 'deletec' function deletes the number of characters
specified by the third argument starting at the position
specified by the second argument from the string value of
the first argument. The form of the function is:

deletec(string,start,num)

If the length of the string value of the first argument
exceeds 4096 then this error is triggered.

4340
Start Position Specified Too Great, 'insert' Function.
The 'insert' function inserts the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:

insert(string1,string2,start)

The third argument cannot exceed 65,536. If start
exceeds the length of the string value of 'string1', then
the string value of 'string2' is concatenated onto the
string value of 'string1'.

4350
'insert' Function Intermediate Result Exceeds Limits.
The 'insert' function inserts the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:

insert(string1,string2,start)

The length of the string value of 'string1' cannot
exceed 4096. The result of inserting 'string2' into
'string1' also cannot exceed 4096. See error number
4360 below.

4360
'insert' Function Return Exceeds Limits.
The 'insert' function inserts the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:

insert(string1,string2,start)

The length of the string value of 'string1' cannot
exceed 4096. The result of inserting 'string2' into
'string1' also cannot exceed 4096. See error number
4350 above.

4370
Start Position Specified Too Great, 'overlay' Function.
The 'overlay' function overlays the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:

overlay(string1,string2,start)

The third argument cannot exceed 65,536. If start
exceeds the length of the string value of 'string1', then
blanks are appended to 'string1' to create a string of
length 'start'. The second string is then concatenated to
this string. See also error numbers 4380, 4390, and 4400
below.

4380
'overlay' Function Result Exceeds Limits.
The 'overlay' function overlays the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:

overlay(string1,string2,start)

The third argument cannot exceed 4096. If start exceeds
the length of the string value of 'string1', then blanks
are appended to 'string1' to create a string of length
'start'. The second string is then concatenated to this
string. See also error number 4370 above and 4390 and
4400 below.

4390
'overlay' Function Intermediate Result Exceeds Limits.
The 'overlay' function overlays the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:

overlay(string1,string2,start)

The length of the string value of 'string1' cannot
exceed 4096 characters. See also error number 4370 and
4380 above and 4400 below.

4400
'overlay' Function Result Exceeds Limits.
The 'overlay' function overlays the string value of the
second argument into the string value of the first
argument, starting at the position specified by the third
argument. The form of the function is:

overlay(string1,string2,start)

The length of the resultant string after overlaying
'string2' onto 'string1' cannot exceed 4096. See also
error numbers 4370, 4380, and 4390 above.

4410
'remove' Function Intermediate Result Exceeds Limits.
The 'remove' function removes all characters specified by
the second argument from the string value of the first
argument. The form of the function is:

remove(string,char)

The length of 'string' before any characters are removed
cannot exceed 4096.

4420
Excessive Width Specified 'justify' Function.
The 'justify' function forms a string from the elements
of the array specified by the first argument. The string
will have a length specified by the integer value of the
third argument and will be formed from the number of
array elements specified by the second argument. Any
padding characters necessary between array elements can
be specified by the optional fourth argument. The form of
the function is:

justify(array_var,count,width [,pad_char] );

The width specified cannot exceed 65,536. See also error
number 4430 below.

4430
Excessive Number Of Array Elements Specified 'justify'
Function.
The 'justify' function forms a string from the elements
of the array specified by the first argument. The string
will have a length specified by the integer value of the
third argument and will be formed from the number of
array elements specified by the second argument. Any
padding characters necessary between array elements can
be specified by the optional fourth argument. The form of
the function is:

justify(array_var,count,width [,pad_char] );

The count of array elements to use cannot exceed 65,536.
See also error number 4420 above.

4440
Bad Function Call - Internal Error.
An internal error has occurred in calling a built-in
function. Please contact the QTAwk author with
information on the circumstances of this error.

4450
Missing ')' for Function Call.
A user-defined function has been called with an argument
list and no right parenthesis, ')', terminating the
argument list.

4460
More Arguments For Function Than Defined. Function:
{User_Function_Name}.
More arguments are passed to the user defined function
named in the error message than were defined for the
function. Check the user function name or the definition
of the function for necessary extra arguments.

4470
Less Arguments For User Function Than Defined. Function:
'{User_Function_Name}'.
Fewer arguments are passed to the user defined function
named in the error message than were defined for the
function. This error message is generated ONLY if the
built-in variable '_arg_chk' has a TRUE value. Variables
local to a user-defined function should be defined with
the 'local' keyword.
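
The difference between error numbers 4460 and 4470 is that an excess of
arguments is always reported, while a shortage is reported only when
'_arg_chk' is TRUE. The Python sketch below models that rule. The
function check_call and its parameters are hypothetical, and functions
defined with a variable length argument list are not considered.

```python
def check_call(name, defined, passed, arg_chk=False):
    """Model of the argument-count checks for a user defined function.

    defined  - number of arguments in the function definition
    passed   - number of arguments in the call
    arg_chk  - stands in for the built-in variable '_arg_chk'
    """
    if passed > defined:
        raise TypeError("More Arguments For Function Than Defined. "
                        "Function: " + name)
    if passed < defined and arg_chk:
        raise TypeError("Less Arguments For User Function Than Defined. "
                        "Function: '" + name + "'")
    return True


print(check_call("trim", 2, 1))
```

With the default setting in this sketch the call with one argument to a
two argument function passes. It would be rejected with arg_chk set.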

4480
Constant Passed For Function Array Parameter.
A parameter to a user defined function used as an array
within the function cannot be passed a constant value.
Only a variable can be passed for this parameter. If the
statement where the variable is indexed as an array is
executed, the variable will be an array upon return from
the function.

4490
Internal Error - Misalignment Of Local List ( ).
This is an internal QTAwk error. It should ideally never
happen. If this error message is generated, please
contact the QTAwk author with information on the
circumstances.

4500
Cannot Assign Array To Array Element.
Arrays can be assigned to variables, however, it is an
error to attempt to assign an array to a single element
of another array.

4510
Array Cannot Operate on Scalar.
A scalar may operate on an array, but the reverse is not
true.

4520
Assignment Operator needs a Variable on left.
The assignment operator, '=', or any of the
operator/assignment operators, 'op=', only operate on a
variable to the left of the operator.

4530
Stack Underflow
Internal stack error. Please contact the QTAwk author
with information on the circumstances of this error
message.

5. Interpretation Errors
5000
Expecting Filename After 'f' Option.
QTAwk utility files are specified on the command line
with the 'f' option. The filename of the utility
immediately follows the 'f' flag. A blank between the
flag and the filename is optional. This message is
generated when no arguments follow the 'f' flag.

5010
Unable To Compile Regular Expression
The input record field separator may be specified on the
command line with the 'F' option. If the string specified
for FS is longer than a single character, a regular
expression is assumed. This error message is generated
when QTAwk is unable to convert the string into a regular
expression internal form for whatever reason.

5020
'-F' command line option specified more than once.
The command line 'F' option to specify the input record
field separator, FS, may be specified only once.

5030
Internal Error #3.
Internal error. Please contact the QTAwk author with
information on the circumstances of this error message.

5040
Internal Error #2.
Internal error. Please contact the QTAwk author with
information on the circumstances of this error message.

5050
BEGIN/END/NOMATCH/INITIAL/FINAL Patterns or User Function
Require An Action.
The pre-defined patterns:

BEGIN
INITIAL
NOMATCH
FINAL
END

must have actions associated with them. The brace
opening the action must be on the same line as the
pre-defined pattern.

5060
Exceeded Internal Stack Size on Scan.
The internal stack for containing parsed tokens has been
exceeded. Attempt to simplify the utility in the area
where this error occurred.

5070
Underflow Internal Stack on Scan.
This is an internal error. If this error occurs please
contact the QTAwk author with information on the
circumstances of this error message.

5080
Missing ')' For Function Call.
A user defined function argument list must be terminated
with a right parenthesis, ')'. A symbol has been found
which cannot be part of the argument list and is not a
right parenthesis.

5090
Function Call Without Parenthisized Argument List.
A user defined function definition must include an
argument list. The argument list may be empty, e.g.,
"()", if there are no formal arguments.

5100
'fprint' Function Takes A minimum Of 1 Argument.
The 'fprint' built-in function must have at least the
name of the output file specified.

5110
printf and 'sprintf' Functions Take A Minimum Of 1
Argument.
These functions must have at least a format string
defined.

5120
'fprintf' Function Needs A Minimum of Two Arguments.
This function needs an output file name and a format
string.

5130
Second Argument Of 'fgetline' Has To Be A Variable.
If two arguments are specified for the 'fgetline'
built-in function, the second must be a variable.

5140
Argument Of 'getline' Has To Be A Variable.
If an argument for the 'getline' function is specified,
it must be a variable.

5150
split Function Needs Variable Name As Second Argument.
The second argument for the 'split' function must
be a variable. The pieces into which the first argument
is split will be returned as array elements of the
variable specified.

5160
'rotate' Function Needs Variable As Argument.
The argument of the 'rotate' function has to be an array
variable.

5170
'justify' Function Needs Variable As First Argument.
The format of the 'justify' built-in function is:

justify(a,n,w)

or

justify(a,n,w,c)

The first argument, 'a', must be an array variable. The
first n elements of the array are concatenated to form a
string 'w' characters long. A single space is used to
separate the concatenated elements. If the optional fourth
argument is specified, it is converted to a character
value and used to separate the elements.

5180
'[pu]_sym' Function Needs Variable As Second Argument.
The second argument must be a variable whose value can be
changed to equal the string value of the named variable
specified.

5190
Bad Function Call
Internal QTAwk error. Please contact the QTAwk author
with information on the circumstances of this error.

5200
Improper Number Of Arguments, {Function_Name} Function.
The built-in function specified has been called with an
improper number of arguments for the function. Check the
user manual for the correct use of the intended function.

5210
Need Variable On Left Side Of Assignment.
In an assignment statement of the form:

variable = expression;

a variable must be specified on the left side of the
assignment operator to receive the value of the
expression on the right side of the operator.

5220
Conditional Expression Error - Missing ':'
The form of the conditional expression is:

test_expression ? expression_1 : expression_2;

test_expression is evaluated; if the result is TRUE
(non-zero numeric or non-null string), expression_1 is
evaluated and the value becomes the value of the
conditional expression. If the value of test_expression
is FALSE (zero numeric or null string), expression_2 is
evaluated and the value becomes the value of the
conditional expression.
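
The TRUE/FALSE rule used by the conditional expression can be modelled
as follows. This is a sketch in Python, not QTAwk. The names is_true
and conditional are hypothetical, and the two branch expressions are
passed as functions so that only the selected one is evaluated.

```python
def is_true(value):
    """TRUE for a non-zero numeric or a non-null string."""
    if isinstance(value, str):
        return value != ""
    return value != 0


def conditional(test_expression, expression_1, expression_2):
    """Evaluate and return exactly one of the two branch expressions."""
    if is_true(test_expression):
        return expression_1()
    return expression_2()


print(conditional(3, lambda: "first", lambda: "second"))
print(conditional("", lambda: "first", lambda: "second"))
```

The first call selects expression_1 because 3 is a non-zero numeric.
The second selects expression_2 because the test is a null string.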

5230
'in' Operator Needs Array As Right Operand
The form of the 'in' operator is:

expression in array_var

The operand to the right of 'in', array_var here, has to
be a variable. If the variable is not an array, then the
value of the expression is FALSE.
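
The behaviour of 'in' when the right operand is not an array can be
modelled as below. The sketch is in Python, with a dictionary standing
in for a QTAwk array. The name qtawk_in is hypothetical, and the sketch
assumes, as in conventional awk, that 'in' tests whether the expression
is an index of the array.

```python
def qtawk_in(expression, array_var):
    """Model of 'expression in array_var'."""
    # A variable that is not an array gives FALSE rather than an error.
    if not isinstance(array_var, dict):
        return False
    return expression in array_var


colors = {"red": 1, "blue": 2}
print(qtawk_in("red", colors))
print(qtawk_in("red", "a scalar value"))
```

The first call gives TRUE in this model. The second gives FALSE because
the right operand is a scalar.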

5240
Missing ')' in Expression Grouping.
An expression has been scanned with unbalanced
parentheses. Check for a missing terminating right
parenthesis.

5250
Pre-Increment/Decrement Operators Need Variable.
The increment and decrement operators, '++' and '--',
only operate on variables. An instance has been found in
which the operator has been used as a pre-fix operator on
something other than a variable. Check that grouping has
not changed a post-fix operator into a pre-fix operator.

5260
Undefined Symbol.
A symbol has been found which matches no defined QTAwk
syntax. This usually, but not always, occurs when the
terminating semi-colon, ';', has been left off a
statement.

5270
Need Variable for Array Reference
A left bracket for indexing an array has been
encountered. However, the preceding symbol was not a
variable. Only variables may be arrays and indexed.

5280
Missing Index For Array
A left bracket for indexing an array has been
encountered. However, the index expression is missing:
var[]. A null index is not allowed in QTAwk.

5290
Missing ']' Terminating array index.
A left bracket and an index expression for indexing an
array have been encountered. However, the right bracket
terminating the index expression was not recognized.
Check that the array index follows the form:

var[index_expression]

5300
Post-Increment/Decrement Operators Need Variable.
The increment and decrement operators, '++' and '--',
only operate on variables. An instance has been found in
which the operator has been used as a post-fix operator
on something other than a variable. Check that grouping
has not changed a pre-fix operator into a post-fix
operator.

5310
'if' Keyword - No Expression To Test.
The proper syntax for the 'if' statement is:

if ( conditional_expression ) statement

The left parenthesis, '(', starting the conditional
expression was not found following the 'if' keyword.
Check that the syntax conforms to the form above.

5320
'if' Keyword - No Terminating ')' On Test Expression.
The proper syntax for the 'if' statement is:

if ( conditional_expression ) statement

The right parenthesis, ')', terminating the conditional
expression was not found. Check that the syntax
conforms to the form above.

5330
'while' Keyword - No Terminating ')' On Test Expression.
The proper syntax for the 'while' statement is:

while ( conditional_expression ) statement

The right parenthesis, ')', terminating the conditional
expression was not found. Check that the syntax
conforms to the form above.

5340
Missing 'while' Part Of 'do'.
The proper syntax for the 'do' statement is:

do statement while ( conditional_expression );

The 'while' keyword was not found following the
statement portion. Check whether a left brace, '{',
starting a compound statement has been deleted, or
for the possible misuse of a keyword as a variable.

5350
Missing '(' On 'while' Part Of 'do'.
The proper syntax for the 'do' statement is:

do statement while ( conditional_expression );

The left parenthesis, '(', starting the conditional
expression was not found following the 'while' keyword.
Check that the syntax conforms to the form above.

5360
Missing ')' On 'while' Part Of 'do'.
The proper syntax for the 'do' statement is:

do statement while ( conditional_expression );

The right parenthesis, ')', terminating the conditional
expression was not found. Check that the syntax
conforms to the form above.

5370
Missing ';' Terminating 'do - while'.
The proper syntax for the 'do' statement is:

do statement while ( conditional_expression );

Note the semicolon following the right parenthesis
terminating the conditional expression. The semicolon is
necessary here.

5380
Missing Opening Parenthesis On 'for'.
The proper syntax for the 'for' statement is:

for ( initial_expression ; conditional_expression ;
loop_expression )
statement

or

for ( variable_name in array_name ) statement

The left parenthesis, '(', was not found following the
'for' keyword. Check that the syntax conforms to the form
above.

5390
5400
5420
Improper Syntax - 'for' Conditional.
The proper syntax for the 'for' statement is:

for ( initial_expression ; conditional_expression ;
loop_expression )
statement

One of the semicolons separating the three expressions
or the terminating right parenthesis was not found. Check
that the syntax follows the form above.

5410
'in' Operator Needs Variable As Left Operand in 'for'
Expression.
The proper syntax for the 'for' statement is:

for ( variable_name in array_name ) statement

The symbol following the left parenthesis and preceding
the 'in' keyword must be a valid variable name.

5430
break/continue Keyword Outside Of Loop.
Either of these keywords must be used inside of a
'while', 'for' or 'do' loop. In addition, the 'break'
statement may be used inside a 'switch-case' construct to
terminate execution flow. One of the keywords has been
found outside of such a construct. Check for an imbalance
of braces, '{}', enclosing compound statements.

5440
'return' Statement Outside Of User Function.
The 'return' statement may only be used inside of a
user-defined function to terminate execution of the
function and cause execution to return to the place where
the function was called. The 'return' keyword was
encountered outside of the definition of such a function.
Check for the use of the keyword as a variable or for
unbalanced braces, '{}', enclosing the statements of the
function.

5450
Exceeded Limits on Number of Local Variable Definitions
(1).
QTAwk places a limit of 256 local variables within any
compound statement. An attempt has been made to define
more local variables than this limit allows.

5460
No Variables Defined With 'local' Keyword.
A local variable definition with the 'local' keyword
follows the form:

local var1, var2 = optional_expression;

The 'local' keyword was encountered followed immediately
by a semicolon. Check that the syntax follows the above
form.

5470
'switch' Keyword - No Terminating ')' On Expression.
The form of the 'switch' construct is:

switch ( switch_expression ) statement

The right parenthesis, ')', terminating the
switch_expression was not found.

5480
'case/default Statement Without Switch Statement.
The 'case' keyword is used within the 'switch' statement
to specify case expressions to which execution should
transfer after matching the switch expression. A 'case'
keyword was found outside of the 'switch' statement.
Check for the use of the keyword as a variable or for
unbalanced braces enclosing a compound 'switch'
statement.

5490
Multiple 'default' Statements in 'switch'.
The 'default' keyword is used within a 'switch' statement
to specify a transfer point at which execution should proceed
when the switch_expression fails to match any
case_expression. Only one 'default' transfer point is
allowed per 'switch' statement. Check for possible
unbalanced braces, '{}', enclosing a compound statement
in previous 'case' statements.

5500
Missing ':' Following Expression On Case Label.
The form of the 'case' statement is:

case case_expression:

A colon, ':', must terminate the case expression. QTAwk
did not find the terminating colon.

5510
Need Variable For 'delete' Reference
The form of the 'delete' statement is:

delete variable_name;

or

delete (variable_name);

or

delete variable_name[index];

or

delete (variable_name[index]);

where variable must be a global or local variable.

5520
'deletea' Statement Variable Cannot Be Indexed.
The form of the 'deletea' statement is:

deletea variable_name;

or

deletea (variable_name);

where variable must be a global or local variable and
cannot be indexed.

5530
Need Variable For 'deletea' Reference
The form of the 'deletea' statement is:

deletea variable_name;

or

deletea (variable_name);

where variable must be a global or local variable.

5540
No ';' Terminating Statement.
All statements in QTAwk are terminated by a semicolon.
The terminating semicolon was not found by QTAwk.

5550
Internal Compilation Error - Action Strings.
This is a QTAwk internal error that should never happen.
If this error message is encountered, please contact the
QTAwk author with information on the circumstances of
this error.

5560
Error On Single Line Action. No Termination.
In parsing/compiling an action entered from the command
line or by executing the 'execute' built-in function, the
end of the line was reached without reaching the end of
the action expression(s). Typically caused by a missing
right brace, '}' (or unbalanced braces - more left
braces than right braces).

5570
Too Many User Functions Defined.
QTAwk currently has a limit of 256 user defined
functions. The current utility has attempted to define
more than that limit. Please contact the QTAwk author
with information on the circumstances of this error
message.

5580
Exceeded Limits on Number of Local Variable Definitions
(2).
QTAwk currently has a limit of 256 'local' variables
defined within any single compound statement. The
current utility has attempted to define more than that
limit. Please contact the QTAwk author with information
on the circumstances of this error message.

5590
Expecting Function Name To Follow 'function' Keyword In
Pattern.
The 'function' keyword has been encountered in a pattern
without a function name immediately following. This
syntax error may be corrected by inserting the missing
name or by removing the function keyword from the
pattern.

5600
Multi-Defined Function Name.
The name supplied for a user defined function has been
used previously. The current usage attempts to redefine
the name. Change either the first use of the name or the
present.

5610
Unexpected Symbol - Function Argument List Definition.
A user defined function has been encountered with the
accompanying list defining the passed argument names. The
form of the list is a variable name followed by 1) a
comma and more names, 2) an ellipsis, '...', followed by a
right parenthesis, or 3) a right parenthesis ending the
list. A symbol other than a comma or right parenthesis
has been found following a variable name.

5620
Expecting ')' To Terminate Function Parameter List.
A user defined function has been encountered with the
accompanying list defining the passed argument names. The
form of the list is a variable name followed by 1) a
comma and more names, 2) an ellipsis, '...', followed by a
right parenthesis, or 3) a right parenthesis ending the
list. A symbol has been found other than the right
parenthesis following the ellipsis.

5630
Unexpected Symbol - Function Argument List Definition.
A user defined function has been encountered with the
accompanying list defining the passed argument names. The
form of the list is a variable name followed by 1) a
comma and more names, 2) an ellipsis, '...', followed by a
right parenthesis, or 3) a right parenthesis ending the
list. A symbol other than a comma or right parenthesis
has been found following a variable name.

5640
Expecting Parenthesized Argument Definition List For
Function.
A user defined function has the following form:

function function_name ( argument_list )

The left parenthesis of the argument list was not found.

5650
Improper Syntax - Improper Ending For Pattern
A pattern expression must be ended by: 1) a comma (the
first expression in a range expression only), 2) the left
brace, '{', starting the associated action, 3) an
End-of-File, or 4) a new line. A symbol other than those
above has been encountered.

5660
GROUP Pattern Only Accepts a Regular Expression, a String
or a Variable.
The GROUP pattern keyword may only be followed by a
regular expression constant, a string constant or a
variable. A symbol other than one of these three has been
encountered.

5670
Internal Parse Error: 1001.
Internal parser error. Please contact the QTAwk author
with information on the circumstances of this error
message.

5680
Local Variable With Reserved Name.
An attempt has been made to define a local variable, in
either a user defined function argument list or with the
'local' keyword, with a name equal to a reserved word.

5690
Improper Use of Keyword.
A pattern keyword has been encountered in an action
statement.

5700
User Function Variable Argument Keyword Outside Of User
Function.
The two predefined local variables: vargc and vargv can
only be used within user defined functions which have
been defined with a variable length argument list using
the ellipsis, '...'. One of these variables has been
encountered outside of a user defined function.

5710
Variable Argument Keyword In User Function Defined
Without Variable Number Of Arguments.
The two predefined local variables: vargc and vargv can
only be used within user defined functions which have
been defined with a variable length argument list using
the ellipsis, '...'. One of these variables has been
encountered inside of a user defined function which was
not defined with a variable length argument list.

5720
Internal Error - Variable Argument List Variable.
The two predefined local variables: vargc and vargv can
only be used within user defined functions which have
been defined with a variable length argument list using
the ellipsis, '...'. One of these variables has been
previously defined as a local variable within the current
compound statement.

5730
Internal Error - Variable Argument List Variable.
The two predefined local variables: vargc and vargv can
only be used within user defined functions which have
been defined with a variable length argument list using
the ellipsis, '...'. One of these variables has been
previously defined as a global variable.

5740
Internal Parse Error: 1002.
Internal parser error. Please contact the QTAwk author
with information on the circumstances of this error
message.

5750
Internal Parse Error: 1003.
Internal parser error. Please contact the QTAwk author
with information on the circumstances of this error
message.

5760
Empty Regular Expression.
A regular expression must have some characters between
the beginning and ending slashes. A regular expression
has been encountered with none.

5770
Regular Expression - No Terminating /.
A regular expression constant must be contained on one
line and be terminated by a slash. A regular expression
has been found with no terminating slash before
encountering a new line.

5780
Internal Parse Error: 1004.
Internal parser error. Please contact the QTAwk author
with information on the circumstances of this error
message.

5790
String Constant - No Terminating ".
A string constant must be contained on one line and be
terminated by a double quote. A string constant has been
found with no terminating double quote before
encountering a new line.

5800
Internal Parse Error: 1005.
Internal parser error. Please contact the QTAwk author
with information on the circumstances of this error
message.

5810
Character Constant - No Terminating '.
A character constant must be contained on one line and be
terminated by a single quote. A character constant has
been found with no terminating single quote before
encountering a new line.
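
The "No Terminating" errors for regular expression, string
and character constants (5770, 5790 and 5810), together with
the empty regular expression error (5760), all describe the
same kind of scan: look for the closing delimiter before the
line ends. The sketch below is written in Python, not QTAwk,
and is not taken from the QTAwk source. The names
UNTERMINATED, find_closing and check_constant are
hypothetical, and it assumes that a backslash escapes the
character after it, as in C.

```python
# Error number reported when the closing delimiter is missing.
UNTERMINATED = {"/": 5770, '"': 5790, "'": 5810}


def find_closing(line, start):
    """line[start] is the opening delimiter; return the index of the
    matching closing delimiter on the same line, or -1 if none."""
    delim = line[start]
    i = start + 1
    while i < len(line) and line[i] != "\n":
        if line[i] == "\\":
            i += 2  # assumption: backslash escapes the next character
            continue
        if line[i] == delim:
            return i
        i += 1
    return -1


def check_constant(line, start):
    """Return None if the constant is well formed, else an error number."""
    close = find_closing(line, start)
    if close < 0:
        return UNTERMINATED[line[start]]
    if line[start] == "/" and close == start + 1:
        return 5760  # empty regular expression
    return None
```

For example, check_constant('print "hello', 6) returns 5790
because the line ends before a closing double quote is
found.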

5820
Character Constant Longer Than One Character
A character constant is a single character bounded by
single quotes, as in 'A'. Escape sequences may also be
used for specifying the character for a character
constant, e.g., '\f' or '\x0c' or '\014' are three ways
to specify a single form feed character (decimal 12).
This error reports that an attempt has been made to use
single quotes to bound more than a single character.
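
The form feed character has the code decimal 12, which is
0c in hexadecimal and 014 in octal. The sketch below shows
how a named, a hexadecimal and an octal escape can each
reduce to that one character, and how more than one
character between the quotes is rejected. It is written in
Python, not QTAwk, and is not taken from the QTAwk source.
The names SIMPLE and char_constant_value are hypothetical,
and the list of named escapes is assumed to follow C; the
set QTAwk accepts may differ.

```python
# Assumed C-style named escapes and their character codes.
SIMPLE = {"a": 7, "b": 8, "t": 9, "n": 10, "v": 11, "f": 12, "r": 13,
          "\\": 92, "'": 39, '"': 34}


def char_constant_value(source):
    """source includes the single quotes, e.g. r"'\\f'" or "'A'".
    Return the character code, or raise ValueError."""
    if len(source) < 2 or source[0] != "'" or source[-1] != "'":
        raise ValueError("no terminating single quote")  # compare 5810
    body = source[1:-1]
    if body == "":
        raise ValueError("empty character constant")
    if len(body) == 1 and body != "\\":
        return ord(body)  # plain single character
    if body.startswith("\\x") and len(body) > 2:
        return int(body[2:], 16)  # hexadecimal escape
    if body[0] == "\\" and body[1:].isdigit():
        return int(body[1:], 8)  # octal escape
    if len(body) == 2 and body[0] == "\\" and body[1] in SIMPLE:
        return SIMPLE[body[1]]  # named escape
    raise ValueError("longer than one character")  # compare 5820
```

With this sketch, the three spellings r"'\f'", r"'\x0c'" and
r"'\014'" all give 12, while "'AB'" raises ValueError.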

5830
Lexical Error - Illegal '.'
Periods are used only in floating point numerics, e.g.,
0.88 or .33, or in user defined function definitions to
indicate a variable number of arguments, e.g.,

function max(...) {

A period has been found which does not match either of
these uses.

5840
Lexical Error
A character has been read which does not fit any syntax
for a valid utility.

5850
Exceeded Max. Limits On Number Of Variables.
A maximum of 256 global variables may be defined in any
single QTAwk utility.

6. Memory Errors
6000
Out of Memory (n: , s: )
The QTAwk utility has used all available memory and
attempted to exceed that limit. It is recommended that
the utility be made shorter, or split into multiple
utilities run separately.

6010
Insufficient Memory.
The QTAwk utility has used all available memory and
attempted to exceed that limit. It is recommended that
the utility be made shorter, or split into multiple
utilities run separately.

6020
6030
Action Too Long
An action has been defined which exceeds the limits set
for the internal length. The maximum length for the
internal form of any action is 409,600 characters.

6040
6050
6060
Out of Memory
The QTAwk utility has used all available memory and
attempted to exceed that limit. It is recommended that
the utility be made shorter, or split into multiple
utilities run separately.




















