Text of Part 1 of PERL language from May 1990 UNIX Review.



UNIX Review June 1990 volume 8 number 5 p30(7)

PERL: THE SUPER-LANGUAGE

by Rob Kolstad (Daemons and Dragons column)


System administrators often find themselves in the position of being the
local expert on the entire gamut of UNIX tools: awk, sed, shell scripts, and,
of course, C. Each of these has its strengths and weaknesses, along with
various quirks.

Larry Wall (famous for authoring patch) has written another fine program.
This one, a complete interpretive language, is called PERL. PERL stands for
"Practical Extraction and Report Language". It combines all the best features
of C, sed, awk, shell programming, database access, and text manipulation
into one giant, kitchen-sink language.

In this column, and some of those that follow, I will give an overview of
PERL. The presentation will include the subset of PERL that seems most
useful to me at this time. PERL's occasional obscure constructs will not be
mentioned. You should read PERL's reference manual to get a general feeling
of its power.

PERL
~~~~

PERL has many fine features for attacking those one-time administrative
programming tasks that appear so frequently in our profession:

It's easy to learn because it's very similar to other UNIX tools.

Because it's an interpreter, it can make program development incremental
and very fast.

It seems to execute programs much more quickly than its associated tools
such as sed or awk.

Since it's much richer than any of the other tools, hacks and kludges are
hardly ever necessary.

As an interpreter, it's available across a large number of architectures.

It has few arbitrary limits (there are no limits on lengths of strings, for
instance).

It fits nicely into the UNIX tool-and-filter philosophy.

It's free.

PERL is easy to obtain. It is widely distributed on the comp.sources.unix
newsgroup and can be found in the newsgroup's archives. It is also available
from the archive servers uunet.uu.net [192.48.96.2] and
tut.cis.ohio-state.edu [128.148.8.60].

Data in PERL
~~~~~~~~~~~~

PERL has three fundamental scalar data types: the numeric type, which
combines both floating-point and integer formats (such as 0, 3.14159, -7.5,
and 1.8e9); the boolean type, in which 0 represents false and 1 represents
true; and the string type, exemplified by strings surrounded by double quotes
that expand internal variable names (such as "There are $count pigs."), or
single quotes that don't expand internal variable names (such as 'Red
House').

All references to scalar variables in the PERL language are of the form
$name, where name is the variable's name. Here are some typical examples of
variables in a PERL program file (note the use of # for comments):

$foo = 3.14159;         # numeric
$color = 'red';         # string
$color = "was $color";  # interpolate a variable
$host = `hostname`;     # backquotes like shell

PERL resembles awk in that variables are typed by their most recent
assignment and require no previous declaration. They sort of "spring into
being" upon reference. One rarely sees the error message "undefined
variable" in PERL. Like global variables in BASIC and C, PERL variables are
initialized to 0, false, or the null string as appropriate.
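
The default values described above can be checked with a minimal sketch. It assumes a current Perl 5 interpreter (hence the `my` declarations, which the PERL of this article does not have); the names $count, $label, and defaults_demo are invented for the illustration.

```perl
# Sketch: unassigned scalars act as 0 in numeric context and as the
# empty string in string context. All names here are hypothetical.

sub defaults_demo {
    my ($count, $label);             # declared but never assigned
    my $sum  = $count + 5;           # $count acts as 0 here
    my $text = "[" . $label . "]";   # $label acts as "" here
    return ($sum, $text);
}

my ($sum, $text) = defaults_demo();
print "$sum $text\n";                # prints "5 []"
```

Running the same code with warnings enabled would report the uninitialized values, which is a useful safety net once scripts grow.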

PERL's output can be string-oriented, just like awk's. You must explicitly
insert newlines, as in C. The canonical example is:

print "Hello world\n";

You could put this line in a file (hello.perl) and invoke PERL with the
file's name:

% perl hello.perl

or use PERL's #! feature and include an extra line in the program file:

#!/usr/local/bin/perl
print "Hello world\n";

and invoke the program as its filename alone:

% hello.perl

You will probably tire of using the .perl extension as your confidence in the
language grows.

PERL supports an interesting variety of aggregate types (variable names that
represent collections of scalar variables). These include: numerically
indexed arrays of scalars, arbitrarily indexed associative arrays of scalars,
and functions (with or without arguments) that can return either scalar or
composite values. Scalar variables, aggregate variables, and functions all
have separate namespaces in PERL.

Vectors are one-dimensional and are indexed from the value of the scalar
variable $[ through to the highest element assigned. You can change $[ to
move the lowest bound from 0 (its default) to 1, as in FORTRAN, or to any
other integer. Vectors, like scalars, do not require declarations.

Vector elements are accessed (and assigned) as $name[index]. The vector's
index must be numeric when using vectors, but not necessarily when using
associative arrays (described later).

You can denote all the elements of a vector by prefixing its name with an @.
Here are some typical vector-assignment statements:

$zzz[4] = 3;
@foo = (1, 3, 5); # note list notation
@foo = (); # empty array
@foo = @bar; # copy entire array
@foo = @bar[2..5]; # copy slice of an array

PERL automatically copies a program's arguments into $ARGV[0] through to
$ARGV[$#ARGV]. Note that this differs from C's use of the highest index,
which is usually not defined. Further note that $ARGV[0] is the first
argument, not the invoked program name as in C.
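
As a rough illustration of slices and of $#name (the highest index, as used with $#ARGV above), here is a small sketch. It assumes a current Perl 5 interpreter and the default lower bound of 0; @bar, its contents, and slice_demo are invented sample names.

```perl
# Sketch of list assignment, slices, and $#name (the highest index).
# @bar and its contents are invented sample data.

sub slice_demo {
    my @bar = (10, 20, 30, 40, 50, 60);
    my @foo = @bar[2..5];        # elements at indices 2 through 5
    my $last_index = $#foo;      # highest index, one less than the count
    return ($last_index, @foo);
}

my ($last, @foo) = slice_demo();
print "last index $last, elements @foo\n";
# prints "last index 3, elements 30 40 50 60"
```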

PERL also supports associative arrays. Each element contains a scalar and is
addressed by an index that can be any scalar or, for multidimensional arrays,
a comma-separated list of scalars. References to associative-array elements
are also prefixed with a $, but curly braces ({}) surround the index (or indices).
You can access entire associative arrays by using the prefix %. Here are
some typical associative-array assignments:

$frogs{'green'} = 23; # 23 green frogs
$locn{$x, $y} = "rat"; # multidimensional
%foo = %bar; # copy entire array

The variable names prefixed by $, %, and @ take just a bit of getting used
to. Here's another simple example:

$frogs{'green', 'blue'} = "hi there";
print $frogs{'green', "blue"} . "\n";

which produces:

hi there

Here's a tricky one:

@frogs{'green', 'blue'} = (3, 5);
print $frogs{'green'}."\n";
print $frogs{'blue'}."\n";

which produces:

3
5

Mentioning frogs in an array context (@frogs) makes the assignment an
aggregate one, with elements going to both the green subscript and the blue
one.
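
The difference between the two forms can be seen by counting elements. This is a sketch for a current Perl 5 interpreter; it relies on the documented behavior that a comma-separated subscript is joined into one key using the $; variable. The helper name key_demo is hypothetical.

```perl
# Sketch contrasting a multidimensional key with an aggregate (slice)
# assignment. %frogs matches the article; key_demo is a made-up name.

sub key_demo {
    my %frogs;
    $frogs{'green', 'blue'} = "hi there";   # ONE element, composite key
    my $n_after_scalar = keys %frogs;       # 1

    @frogs{'green', 'blue'} = (3, 5);       # TWO elements, one per key
    my $n_after_slice = keys %frogs;        # 3 in total

    # The composite key is the subscripts joined with $;.
    my $same = $frogs{join($;, 'green', 'blue')};
    return ($n_after_scalar, $n_after_slice, $same);
}

my @r = key_demo();
print "@r\n";    # prints "1 3 hi there"
```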

PERL has lots of special, predefined variable names. Special scalars
include:

$0 - the name of the currently executing script;

$$ - the current process id;

$. - the current line number of the last input;

$[ - the index of the first element in a vector.

There are a host of others.

Special array variables include:

@ARGV - the command-line arguments;

@_ - the default line for many PERL functions (for instance, @_ contains the
input arguments to a user-defined function);

%ENV - the current environment (such as $ENV{'HOME'}).

There are several more of these kinds of variables in PERL.

Here's an example that demonstrates some of the special variables:

#!/usr/local/bin/perl
print "My home is $ENV{'HOME'}\n";
print "My parameters are:\n";
for ($i = 0; $i <= $#ARGV; $i++)
{
printf ("--%s\n", $ARGV[$i]);
}

which, when run, produces:

% tester These are the args
My home is /mntkolstad
My parameters are:
--These
--are
--the
--args

Note that PERL has a for statement just like C's. We'll cover the control
statements later.

PERL uses all of C's operators except the type-casting and address operators
(& and *). Additionally, PERL has:

Exponentiation (**, **=):

print 2 ** 5;

which produces:

32

Range operator (..):

@a = 10..20;
print @a; # 11 numbers
for $i (60..75) { print $i; }
@new = @old[30..50];

String Concatenation (., .=):

print "Hello" . " " . "World\n";

which produces:

Hello World

String Repetition (x, x=):

print "-" x 20 . "\n";

which produces:

--------------------

Note that the string must be on the left and the count on the right.

String tests (eq, ne, lt, le, gt, ge):

if ("cat" lt "dog")
{
print "'cat' is lexically less than 'dog'\n";
}

File tests (like the shell's):

if (-e "/tmp/foo" && -z "/tmp/foo") {
print "/tmp/foo exists and is empty.\n";
}
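
A short sketch may help separate the string operators from their numeric look-alikes, since the same operands can give opposite answers. It assumes a current Perl 5 interpreter; the sample values and the name operator_demo are arbitrary.

```perl
# Sketch: string operators compare character by character, numeric
# operators compare values. The sample values are arbitrary.

sub operator_demo {
    my @out;
    push @out, ("10" <  "9") ? "yes" : "no";   # numeric: 10 < 9 is false
    push @out, ("10" lt "9") ? "yes" : "no";   # string: "1" sorts before "9"
    push @out, "-" x 5;                        # repetition
    push @out, 2 ** 5;                         # exponentiation
    push @out, join(",", 3..6);                # range
    return @out;
}

print join(" ", operator_demo()), "\n";
# prints "no yes ----- 32 3,4,5,6"
```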

Flow Control
~~~~~~~~~~~~

PERL has flow-control constructs very similar to those of C and awk (with the
exception that PERL lacks a case statement). Unlike C, however, PERL's
control constructs always require a set of enclosing braces. Some prototype
examples of each flow-control construct are shown in Figure 1.

PERL uses next and last instead of C's continue and break. In fact, you can
label a looping construct and use next or last to refer to a particular loop
(instead of some enclosing loop):

guy: foreach $person (@people)
{
for ($i=0; $i<length($person); $i++) {
if (substr($person, $i, 1) eq "#") { next guy; }
. . .
}
}
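
For a complete, runnable version of the labeled-loop idea, here is a sketch that keeps only the names containing no "#" character. It assumes a current Perl 5 interpreter; the label PERSON, the sub names_without_hash, and the sample names are invented.

```perl
# Sketch of a labeled next: skip any name containing '#'.
# The list of names and the sub name are invented.

sub names_without_hash {
    my @people = @_;
    my @kept;
  PERSON: foreach my $person (@people) {
        for (my $i = 0; $i < length($person); $i++) {
            # Jump to the next person, not just the next character.
            next PERSON if substr($person, $i, 1) eq "#";
        }
        push @kept, $person;
    }
    return @kept;
}

print join(",", names_without_hash("ann", "b#b", "cy")), "\n";
# prints "ann,cy"
```

Without the label, next would only advance the inner character loop, and every name would be kept.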

Input/Output
~~~~~~~~~~~~

I most often use PERL's line-oriented I/O capabilities. PERL also has the
ability to read binary files (such as /etc/utmp). PERL uses file handles to
reference various paths to I/O. The file handles are in their own namespace
but by convention are almost always written in upper-case letters. File
handles are very similar to file identifiers (fids) used in C and UNIX
programs. They are an identifier that can specify which file to use for
various I/O operations.

A typical PERL open statement looks like this:

open (SCRATCH, "/tmp/scratch");

If this statement succeeds, the SCRATCH file handle will reference the file
/tmp/scratch that has been successfully opened for reading. The open command
returns 1 when it succeeds in opening a file (although pipe opens return
other numbers).

A better example might be:

if (open(SCRATCH, "/tmp/scratch")!=1)
{ die "Can't open /tmp/scratch\n";}

Reading lines from the SCRATCH file handle is easy; just mention the file
handle between angle brackets:

$inline = <SCRATCH>;

Even more conveniently, you can use the while construct in PERL to read each
line of a file successively:

if (open(SCRATCH, "/tmp/scratch")!=1)
{ die "Can't open /tmp/scratch\n";}
while ($line = <SCRATCH>)
{
# process $line
}
close (SCRATCH);

Even easier than that, PERL has a default variable named $_, which stores the
result of many operations that might normally use an argument. Particularly,
angle brackets read into the $_ variable when no other assignment is
mentioned. For example:

if (open(SCRATCH, "/tmp/scratch")!=1)
{ die "Can't open /tmp/scratch\n";}
while (<SCRATCH>)
{
print $_;
}
close (SCRATCH);

will simply print the contents of /tmp/scratch if the open is successful
(much like cat).

By the way, PERL has the predefined file handles of STDIN, STDOUT, and
STDERR.

Sometimes it's better just to read an entire set of lines at once and then
process them. This is a piece of cake in PERL (and is fairly fast, as well).
Here is an example:

if (open(PEOPLE, "roster")!=1)
{ die "Can't open roster\n";}
@list_of_people = <PEOPLE>;
close (PEOPLE);
foreach (@list_of_people)
{
# process $_
}
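
To try the read-everything-at-once approach without an existing roster file, the following sketch first creates a small file and then reads it back. It assumes a current Perl 5 interpreter and a writable /tmp; the path, the file contents, and the sub name slurp_lines are invented.

```perl
# Sketch: read a whole file into an array, then loop over it.
# The path and its contents are created here so the example is self-contained.

sub slurp_lines {
    my ($path) = @_;
    open(PEOPLE, $path) || die "Can't open $path\n";
    my @lines = <PEOPLE>;      # list context: every remaining line
    close(PEOPLE);
    return @lines;
}

my $path = "/tmp/roster.$$";   # $$ is the process id
open(OUT, ">$path") || die "Can't create $path\n";
print OUT "alice\nbob\n";
close(OUT);

my @list_of_people = slurp_lines($path);
foreach (@list_of_people) {
    print "got: $_";           # the newline is still attached
}
unlink($path);
```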

Some remarks on reading from files:

Newlines are left intact on input lines. Remove the last character of any
string using PERL's chop function:

while (<STDIN>)
{
chop;
print $_; # string them all together!
}

Mentioning <> reads lines from files mentioned as arguments on the
invocation line or STDIN if no files were mentioned.

Open files for output this way:

open (OUT, ">outputfile");

Append output to existing files this way:

open (OUT, ">>outputfile");

Open a pipe to which to write this way:

open (OUT, "| lpr");

Open a pipe from which to read this way:

open (IN, "netstat -a |");

Print to an output file handle by mentioning the handle after the print or
printf token:

printf OUT "Total of %d dogs\n", $ndogs;
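
Several of these remarks can be combined into one sketch: create a file, append to it, and read it back while chopping newlines. It assumes a current Perl 5 interpreter and a writable /tmp; the file name, the value 7, and the sub name write_then_append are invented.

```perl
# Sketch: create a file, append to it, and read it back with chop.
# The file name and the $ndogs value are invented.

sub write_then_append {
    my ($path, $ndogs) = @_;
    open(OUT, ">$path")  || die "Can't create $path\n";
    printf OUT "Total of %d dogs\n", $ndogs;
    close(OUT);

    open(OUT, ">>$path") || die "Can't append to $path\n";
    print OUT "done\n";
    close(OUT);

    my @lines;
    open(IN, $path) || die "Can't open $path\n";
    while (<IN>) {
        chop;                  # drop the trailing newline
        push @lines, $_;
    }
    close(IN);
    return @lines;
}

my $path = "/tmp/dogs.$$";
print join("|", write_then_append($path, 7)), "\n";
# prints "Total of 7 dogs|done"
unlink($path);
```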

A Real Example
~~~~~~~~~~~~~~

I use a database paradigm of lines with fixed-width strings to keep track of
some of the information at my site. I have a specification file called
master.spec that describes the database. A typical description looks like
this:

# length  r=1  shortname  Title
30        0    name       L Name
12        0    hphone     H Phone
38        0    hstreet    H Street

. . .

Lines with # are comments and are ignored (a useful convention in all data
files). Each descriptor line has tab-separated fields that tell the length
of the fixed-length string, whether it is right- or left-justified, a name by
which to reference the string, and a title to describe the string. These
files are easy to make, and can be very useful. I have a program that reads
them and performs input and editing operations.

To process the files, I wrote a short set of extremely flexible PERL
routines. They respond automatically to changing the order or length of
fields specified in the master data file, can handle arbitrary field lengths,
and are easy to program. Here's the routine to read and parse the
master.spec file:

1 if (open(IN, "master.spec")!=1) {
2 die "Can't open master.spec";
3 }
4 $fieldcount = 0;
5 while (<IN>)
6 {
7 if (substr($_, 0, 1) eq "#" ) { next; }
8 chop;
9 split (/\t/);
10 $len{$_[2]} = $_[0];
11 $start{$_[2]} = $fieldcount;
12 $fieldcount += $_[0];
13 }
14 close(IN);

These fourteen lines load two arrays: %len, an associative array of field
lengths (subscripted by the field's name), and %start, an associative array
telling where a particular field starts (also subscripted by the field's
name).

The first three lines open the file for input (in a fairly safe way). Line 4
initializes $fieldcount, which accumulates the total offset for strings on a
line. The while loop processes each line of the file (naming it $_ during
processing). Line 7 discards comments. Line 8 discards the newline. Line 9
performs the standard awk function of splitting a line into fields, this time
separating at tab characters (more on this next month). By default, the
split function places the components into the array @_. The next three lines
assign the numbers needed to keep track of field placement.
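
The offset arithmetic is easier to follow with concrete numbers. The sketch below is not the routine above but a variant for a current Perl 5 interpreter: it stores the split result in an explicit array (split without a destination filling @_ is specific to the PERL of this article) and takes the spec lines as arguments instead of reading master.spec. The sub name parse_spec is hypothetical.

```perl
# Sketch of the same parse with the split result stored explicitly.
# The spec lines are passed in as a list instead of being read from a file.

sub parse_spec {
    my (%len, %start);
    my $fieldcount = 0;
    foreach my $line (@_) {
        next if substr($line, 0, 1) eq "#";   # skip comments
        my @f = split(/\t/, $line);           # length, justify, name, title
        $len{$f[2]}   = $f[0];
        $start{$f[2]} = $fieldcount;          # offset where this field begins
        $fieldcount  += $f[0];
    }
    return (\%len, \%start, $fieldcount);
}

my ($len, $start, $width) = parse_spec(
    "# length\tr=1\tshortname\tTitle",
    "30\t0\tname\tL Name",
    "12\t0\thphone\tH Phone",
    "38\t0\thstreet\tH Street",
);
print "$start->{'hphone'} $start->{'hstreet'} $width\n";
# prints "30 42 80"
```

Each field starts where the previous ones end: name at 0, hphone at 30, hstreet at 30 + 12 = 42, for a total record width of 80.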

It is now a simple matter to access any field on a line. The function that
does this returns a string from an input line (which is always named $_ for
this routine). The function's single argument is the name for the field
(name from the master.spec file above). The argument appears in function
field as the first element of the @_ array.

Here is the function:

1 sub field {
2 local ($out);
3 die "Invalid field request $_[0]" if ($start{$_[0]} eq "");
4 $out = substr($_, $start{$_[0]}, $len{$_[0]});
5 return $out;
6 }

The second line sets up a local variable named $out. This is sort of a
declaration, though no type information can be specified. The next line uses
an alternate syntax for conditionals as it checks to make sure that a field's
name is valid (defined). Note that this particular implementation allows only
one master.spec to be active at a time. Line 5 sends back the function's
result.
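
To see the extraction on real data, here is a sketch that builds one fixed-width record and pulls a field out of it. It is a variant, not the function above: it assumes a current Perl 5 interpreter, passes the record as an argument rather than using $_, and fills %start and %len by hand with the offsets implied by the three-field spec shown earlier. The sub name field_of and the sample record are invented.

```perl
# Sketch of field extraction with the record passed in explicitly.
# The offsets below follow the three-field spec shown earlier.

my %start = ('name' => 0,  'hphone' => 30, 'hstreet' => 42);
my %len   = ('name' => 30, 'hphone' => 12, 'hstreet' => 38);

sub field_of {
    my ($record, $name) = @_;
    die "Invalid field request $name\n" unless exists $start{$name};
    return substr($record, $start{$name}, $len{$name});
}

# Build one fixed-width record: each value is left-justified and padded.
my $record = sprintf("%-30s%-12s%-38s", "Kolstad", "555-0100", "1 Main St");
my $phone  = field_of($record, 'hphone');
print "[$phone]\n";    # prints "[555-0100    ]"
```

The returned string keeps its padding, so a caller that wants the bare value still has to trim trailing blanks.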

We've seen only a brief glimpse of PERL's capabilities. Next month we'll
cover regular expressions, system calls, directory access, subroutines,
functions, dbm databases, and several examples.

Go ahead and get PERL and try some easy examples. You'll love it.



