UNIX For Beginners - Second Edition



Brian W. Kernighan


Bell Laboratories

Murray Hill, New Jersey 07974



ABSTRACT


This paper is meant to help new users get
started on the UNIX* operating system. It
includes:

o basics needed for day-to-day use of the system -
typing commands, correcting typing mistakes,
logging in and out, mail, inter-terminal
communication, the file system, printing files,
redirecting I/O, pipes, and the shell.

o document preparation - a brief discussion of the
major formatting programs and macro packages,
hints on preparing documents, and capsule
descriptions of some supporting software.

o UNIX programming - using the editor, programming
the shell, programming in C, other languages and
tools.

o An annotated UNIX bibliography.



INTRODUCTION


From the user's point of view, the UNIX operating system

is easy to learn and use, and presents few of the usual

impediments to getting the job done. It is hard, however,

for the beginner to know where to start, and how to make the

best use of the facilities available. The purpose of this

introduction is to help new users get used to the main ideas

of the UNIX system and start making effective use of it

quickly.

__________________________
* UNIX is a Trademark of Bell Laboratories.

November 16, 1985


You should have a couple of other documents with you for

easy reference as you read this one. The most important is

The UNIX Programmer's Manual; it's often easier to tell you

to read about something in the manual than to repeat its

contents here. The other useful document is A Tutorial

Introduction to the UNIX Text Editor, which will tell you

how to use the editor to get text - programs, data, docu-

ments - into the computer.


A word of warning: the UNIX system has become quite popu-

lar, and there are several major variants in widespread use.

Of course details also change with time. So although the

basic structure of UNIX and how to use it is common to all

versions, there will certainly be a few things which are

different on your system from what is described here. We

have tried to minimize the problem, but be aware of it. In

cases of doubt, this paper describes Version 7 UNIX.


This paper has five sections:


1. Getting Started: How to log in, how to type, what to do

about mistakes in typing, how to log out. Some of this is

dependent on which system you log into (phone numbers, for

example) and what terminal you use, so this section must

necessarily be supplemented by local information.


2. Day-to-day Use: Things you need every day to use the

system effectively: generally useful commands; the file

system.


3. Document Preparation: Preparing manuscripts is one of

the most common uses for UNIX systems. This section con-

tains advice, but not extensive instructions on any of the

formatting tools.


4. Writing Programs: UNIX is an excellent system for

developing programs. This section talks about some of the

tools, but again is not a tutorial in any of the program-

ming languages provided by the system.


5. A UNIX Reading List. An annotated bibliography of

documents that new users should be aware of.


I. GETTING STARTED


Logging In


You must have a UNIX login name, which you can get from

whoever administers your system. You also need to know the

phone number, unless your system uses permanently connected

terminals. The UNIX system is capable of dealing with a

wide variety of terminals: Terminet 300's; Execuport, TI and

similar portables; video (CRT) terminals like the HP2640,

etc.; high-priced graphics terminals like the Tektronix

4014; plotting terminals like those from GSI and DASI; and

even the venerable Teletype in its various forms. But note:

UNIX is strongly oriented towards devices with lower case.

If your terminal produces only upper case (e.g., model 33

Teletype, some video and portable terminals), life will be

so difficult that you should look for another terminal.


Be sure to set the switches appropriately on your device.

Switches that might need to be adjusted include the speed,

upper/lower case mode, full duplex, even parity, and any

others that local wisdom advises. Establish a connection

using whatever magic is needed for your terminal; this may

involve dialing a telephone call or merely flipping a

switch. In either case, UNIX should type ``login:'' at you.

If it types garbage, you may be at the wrong speed; check

the switches. If that fails, push the ``break'' or ``inter-

rupt'' key a few times, slowly. If that fails to produce a

login message, consult a guru.


When you get a login: message, type your login name in

lower case. Follow it by a RETURN; the system will not do

anything until you type a RETURN. If a password is

required, you will be asked for it, and (if possible) print-

ing will be turned off while you type it. Don't forget

RETURN.


The culmination of your login efforts is a ``prompt char-

acter,'' a single character that indicates that the system

is ready to accept commands from you. The prompt character

is usually a dollar sign $ or a percent sign %. (You may

also get a message of the day just before the prompt charac-

ter, or a notification that you have mail.)


Typing Commands


Once you've seen the prompt character, you can type com-

mands, which are requests that the system do something. Try

typing


date

followed by RETURN. You should get back something like


Mon Jan 16 14:17:10 EST 1978

Don't forget the RETURN after the command, or nothing will

happen. If you think you're being ignored, type a RETURN;

something should happen. RETURN won't be mentioned again,

but don't forget it - it has to be there at the end of each

line.


Another command you might try is who, which tells you

everyone who is currently logged in:


who

gives something like


mb    tty01   Jan 16 09:11
ski   tty05   Jan 16 09:33
gam   tty11   Jan 16 13:07

The time is when the user logged in; ``ttyxx'' is the

system's idea of what terminal the user is on.


If you make a mistake typing the command name, and refer

to a non-existent command, you will be told. For example,

if you type


whom

you will be told


whom: not found

Of course, if you inadvertently type the name of some other

command, it will run, with more or less mysterious results.


Strange Terminal Behavior


Sometimes you can get into a state where your terminal

acts strangely. For example, each letter may be typed

twice, or the RETURN may not cause a line feed or a return

to the left margin. You can often fix this by logging out

and logging back in. Or you can read the description of the

command stty in section I of the manual. To get intelligent

treatment of tab characters (which are much used in UNIX) if

your terminal doesn't have tabs, type the command


stty -tabs

and the system will convert each tab into the right number

of blanks for you. If your terminal does have computer-

settable tabs, the command tabs will set the stops correctly

for you.


Mistakes in Typing


If you make a typing mistake, and see it before RETURN has

been typed, there are two ways to recover. The

sharp-character # erases the last character typed; in fact

successive uses of # erase characters back to the beginning of

the line (but not beyond). So if you type badly, you can

correct as you go:


dd#atte##e

is the same as date.


The at-sign @ erases all of the characters typed so far on

the current input line, so if the line is irretrievably

fouled up, type an @ and start the line over.


What if you must enter a sharp or at-sign as part of the

text? If you precede either # or @ by a backslash \, it

loses its erase meaning. So to enter a sharp or at-sign in

something, type \# or \@. The system will always echo a

newline at you after your at-sign, even if preceded by a

backslash. Don't worry - the at-sign has been recorded.



To erase a backslash, you have to type two sharps or two

at-signs, as in \##. The backslash is used extensively in

UNIX to indicate that the following character is in some way

special.
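The erase and kill rules above can be tried out without a terminal. What follows is only a rough model written as a shell function; the name erase_kill is made up for this illustration, and the real work is done by the system's terminal handling, whose details may differ from this sketch.

```sh
# erase_kill LINE
# Rough model of the erase (#) and kill (@) conventions.
# Hypothetical helper for illustration; not part of any UNIX system.
erase_kill() {
    rest=$1     # characters still to be processed
    line=       # what would be kept so far
    while [ -n "$rest" ]; do
        tail=${rest#?}          # everything after the first character
        ch=${rest%"$tail"}      # the first character itself
        rest=$tail
        case $ch in
        '#')
            case $line in
            *\\) line=${line%?}'#' ;;   # \# leaves a literal sharp
            *)   line=${line%?} ;;      # otherwise erase one character
            esac ;;
        '@')
            case $line in
            *\\) line=${line%?}'@' ;;   # \@ leaves a literal at-sign
            *)   line= ;;               # otherwise discard the whole line
            esac ;;
        *)
            line=$line$ch ;;
        esac
    done
    printf '%s\n' "$line"
}

erase_kill 'dd#atte##e'      # prints: date
erase_kill 'garbage@date'    # prints: date
```

In this model \## comes out empty: the first sharp is taken literally and the second one erases it, which is why two sharps are needed to get rid of a backslash.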


Read-ahead


UNIX has full read-ahead, which means that you can type as

fast as you want, whenever you want, even when some command

is typing at you. If you type during output, your input

characters will appear intermixed with the output

characters, but they will be stored away and interpreted in the

correct order. So you can type several commands one after

another without waiting for the first to finish or even

begin.


Stopping a Program


You can stop most programs by typing the character ``DEL''

(perhaps called ``delete'' or ``rubout'' on your terminal).

The ``interrupt'' or ``break'' key found on most terminals

can also be used. In a few programs, like the text editor,

DEL stops whatever the program is doing but leaves you in

that program. Hanging up the phone will stop most programs.


Logging Out


The easiest way to log out is to hang up the phone. You

can also type


login

and let someone else use the terminal you were on. It is

usually not sufficient just to turn off the terminal. Most

UNIX systems do not use a time-out mechanism, so you'll be

there forever unless you hang up.


Mail


When you log in, you may sometimes get the message


You have mail.

UNIX provides a postal system so you can communicate with

other users of the system. To read your mail, type the com-

mand


mail

Your mail will be printed, one message at a time, most

recent message first. After each message, mail waits for

you to say what to do with it. The two basic responses are

d, which deletes the message, and RETURN, which does not (so

it will still be there the next time you read your mailbox).

Other responses are described in the manual. (Earlier

versions of mail do not process one message at a time, but are

otherwise similar.)


How do you send mail to someone else? Suppose it is to go

to ``joe'' (assuming ``joe'' is someone's login name). The

easiest way is this:


mail joe
now type in the text of the letter
on as many lines as you like ...
After the last line of the letter
type the character ``control-d'',
that is, hold down ``control'' and type
a letter ``d''.

And that's it. The ``control-d'' sequence, often called

``EOF'' for end-of-file, is used throughout the system to

mark the end of input from a terminal, so you might as well

get used to it.


For practice, send mail to yourself. (This isn't as

strange as it might sound - mail to oneself is a handy

reminder mechanism.)


There are other ways to send mail - you can send a previously

prepared letter, and you can mail to a number of people

all at once. For more details see mail(1). (The notation

mail(1) means the command mail in section 1 of the UNIX

Programmer's Manual.)


Writing to other users


At some point, out of the blue will come a message like


Message from joe tty07...

accompanied by a startling beep. It means that Joe wants to

talk to you, but unless you take explicit action you won't

be able to talk back. To respond, type the command


write joe

This establishes a two-way communication path. Now whatever

Joe types on his terminal will appear on yours and vice

versa. The path is slow, rather like talking to the moon.

(If you are in the middle of something, you have to get to a

state where you can type a command. Normally, whatever pro-

gram you are running has to terminate or be terminated. If

you're editing, you can escape temporarily from the editor -

read the editor tutorial.)


A protocol is needed to keep what you type from getting

garbled up with what Joe types. Typically it's like this:


Joe types write smith and waits.
Smith types write joe and waits.
Joe now types his message (as many lines as he likes).
When he's ready for a reply, he signals it by typing (o),
which stands for ``over''.
Now Smith types a reply, also terminated by (o).
This cycle repeats until someone gets tired; he then
signals his intent to quit with (oo), for ``over and
out''.
To terminate the conversation, each side must type a
``control-d'' character alone on a line. (``Delete'' also
works.) When the other person types his ``control-d'', you
will get the message EOF on your terminal.



If you write to someone who isn't logged in, or who

doesn't want to be disturbed, you'll be told. If the target

is logged in but doesn't answer after a decent interval,

simply type ``control-d''.


On-line Manual


The UNIX Programmer's Manual is typically kept on-line.

If you get stuck on something, and can't find an expert to

assist you, you can print on your terminal some manual sec-

tion that might help. This is also useful for getting the

most up-to-date information on a command. To print a manual

section, type ``man command-name''. Thus to read up on the

who command, type


man who

and, of course,


man man

tells all about the man command.


Computer Aided Instruction


Your UNIX system may have available a program called

learn, which provides computer aided instruction on the file

system and basic commands, the editor, document preparation,

and even C programming. Try typing the command


learn

If learn exists on your system, it will tell you what to do

from there.


II. DAY-TO-DAY USE


Creating Files - The Editor


If you have to type a paper or a letter or a program, how

do you get the information stored in the machine? Most of

these tasks are done with the UNIX ``text editor'' ed.

Since ed is thoroughly documented in ed(1) and explained in

A Tutorial Introduction to the UNIX Text Editor, we won't

spend any time here describing how to use it. All we want

it for right now is to make some files. (A file is just a

collection of information stored in the machine, a simplis-

tic but adequate definition.)


To create a file called junk with some text in it, do the

following:


ed junk      (invokes the text editor)
a            (command to ``ed'', to add text)
now type in
whatever text you want ...
.            (signals the end of adding text)

The ``.'' that signals the end of adding text must be at the

beginning of a line by itself. Don't forget it, for until

it is typed, no other ed commands will be recognized -

everything you type will be treated as text to be added.


At this point you can do various editing operations on the

text you typed in, such as correcting spelling mistakes,

rearranging paragraphs and the like. Finally, you must

write the information you have typed into a file with the

editor command w:


w

ed will respond with the number of characters it wrote into

the file junk.


Until the w command, nothing is stored permanently, so if

you hang up and go home the information is lost.* But after

w the information is there permanently; you can re-access it

any time by typing


ed junk

Type a q command to quit the editor. (If you try to quit

without writing, ed will print a ? to remind you. A second

q gets you out regardless.)


Now create a second file called temp in the same manner.

You should now have two files, junk and temp.

__________________________
* This is not strictly true - if you hang up while
editing, the data you were working on is saved in a
file called ed.hup, which you can continue with at your
next session.


What files are out there?


The ls (for ``list'') command lists the names (not

contents) of any of the files that UNIX knows about. If you

type


ls

the response will be


junk
temp

which are indeed the two files just created. The names are

sorted into alphabetical order automatically, but other

variations are possible. For example, the command


ls -t

causes the files to be listed in the order in which they

were last changed, most recent first. The -l option gives a

``long'' listing:


ls -l

will produce something like


-rw-rw-rw- 1 bwk 41 Jul 22 2:56 junk
-rw-rw-rw- 1 bwk 78 Jul 22 2:57 temp

The date and time are of the last change to the file. The

41 and 78 are the number of characters (which should agree

with the numbers you got from ed). bwk is the owner of the

file, that is, the person who created it. The -rw-rw-rw-

tells who has permission to read and write the file, in this

case everyone.


Options can be combined: ls -lt gives the same thing as

ls -l, but sorted into time order. You can also name the

files you're interested in, and ls will list the information

about them only. More details can be found in ls(1).


The use of optional arguments that begin with a minus

sign, like -t and -lt, is a common convention for UNIX

programs. In general, if a program accepts such optional

arguments, they precede any filename arguments. It is also

vital that you separate the various arguments with spaces:

ls-l is not the same as ls -l.


Printing Files


Now that you've got a file of text, how do you print it so

people can look at it? There are a host of programs that do

that, probably more than are needed.


One simple thing is to use the editor, since printing is

often done just before making changes anyway. You can say


ed junk
1,$p

ed will reply with the count of the characters in junk and

then print all the lines in the file. After you learn how

to use the editor, you can be selective about the parts you

print.


There are times when it's not feasible to use the editor

for printing. For example, there is a limit on how big a

file ed can handle (several thousand lines). Secondly, it

will only print one file at a time, and sometimes you want

to print several, one after another. So here are a couple

of alternatives.


First is cat, the simplest of all the printing programs.

cat simply prints on the terminal the contents of all the

files named in a list. Thus


cat junk

prints one file, and


cat junk temp

prints two. The files are simply concatenated (hence the

name ``cat'') onto the terminal.
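To see the concatenation for yourself without touching your real files, something like the following sketch can be used. It assumes that a mktemp command is available to make a scratch directory (that is an assumption about your system, not something described in this paper), and the file contents are invented for the example.

```sh
# Work in a throwaway directory so no real files are disturbed.
# (mktemp is assumed to exist on your system.)
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

# Stand-ins for the junk and temp files made with the editor.
printf 'text of junk\n' > junk
printf 'text of temp\n' > temp

# One file, then both files in the order named.
one=$(cat junk)
both=$(cat junk temp)
printf '%s\n' "$both"
```

The second cat prints the contents of junk followed immediately by the contents of temp, with nothing added in between.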


pr produces formatted printouts of files. As with cat, pr

prints all the files named in a list. The difference is

that it produces headings with date, time, page number and

file name at the top of each page, and extra lines to skip

over the fold in the paper. Thus,


pr junk temp

will print junk neatly, then skip to the top of a new page

and print temp neatly.


pr can also produce multi-column output:


pr -3 junk

prints junk in 3-column format. You can use any reasonable

number in place of ``3'' and pr will do its best. pr has

other capabilities as well; see pr(1).


It should be noted that pr is not a formatting program in

the sense of shuffling lines around and justifying margins.

The true formatters are nroff and troff, which we will get

to in the section on document preparation.


There are also programs that print files on a high-speed

printer. Look in your manual under opr and lpr. Which to

use depends on what equipment is attached to your machine.


Shuffling Files About


Now that you have some files in the file system and some

experience in printing them, you can try bigger things. For

example, you can move a file from one place to another

(which amounts to giving it a new name), like this:


mv junk precious

This means that what used to be ``junk'' is now

``precious''. If you do an ls command now, you will get


precious
temp

Beware that if you move a file to another one that already

exists, the already existing contents are lost forever.


If you want to make a copy of a file (that is, to have two

versions of something), you can use the cp command:


cp precious temp1

makes a duplicate copy of precious in temp1.


Finally, when you get tired of creating and moving files,

there is a command to remove files from the file system,

called rm.


rm temp temp1

will remove both of the files named.


You will get a warning message if one of the named files

wasn't there, but otherwise rm, like most UNIX commands,

does its work silently. There is no prompting or chatter,

and error messages are occasionally curt. This terseness is

sometimes disconcerting to newcomers, but experienced users

find it desirable.
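The whole sequence of moving, copying and removing can be rehearsed safely in a scratch directory. This is only a sketch: mktemp is assumed to be available on your system, and the file contents are invented.

```sh
# Work in a throwaway directory (mktemp is assumed to exist).
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
printf 'some text\n' > junk
printf 'more text\n' > temp

mv junk precious          # junk is now called precious
after_mv=$(ls)

cp precious temp1         # temp1 is a second copy of precious
after_cp=$(ls)

rm temp temp1             # both files are removed, silently
after_rm=$(ls)

printf '%s\n' "$after_rm"
```

None of the three commands prints anything of its own; the only output comes from the final printf, which shows that precious is the one file left.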


What's in a Filename


So far we have used filenames without ever saying what's a

legal name, so it's time for a couple of rules. First,

filenames are limited to 14 characters, which is enough to

be descriptive. Second, although you can use almost any

character in a filename, common sense says you should stick

to ones that are visible, and that you should probably avoid

characters that might be used with other meanings. We have

already seen, for example, that in the ls command, ls -t

means to list in time order. So if you had a file whose

name was -t, you would have a tough time listing it by name.

Besides the minus sign, there are other characters which

have special meaning. To avoid pitfalls, you would do well

to use only letters, numbers and the period until you're

familiar with the situation.


On to some more positive suggestions. Suppose you're typing

a large document like a book. Logically this divides

into many small pieces, like chapters and perhaps sections.

Physically it must be divided too, for ed will not handle

really big files. Thus you should type the document as a

number of files. You might have a separate file for each

chapter, called


chap1
chap2
etc...

Or, if each chapter were broken into several files, you

might have


chap1.1
chap1.2
chap1.3
...
chap2.1
chap2.2
...

You can now tell at a glance where a particular file fits

into the whole.


There are advantages to a systematic naming convention

which are not obvious to the novice UNIX user. What if you

wanted to print the whole book? You could say


pr chap1.1 chap1.2 chap1.3 ......

but you would get tired pretty fast, and would probably even

make mistakes. Fortunately, there is a shortcut. You can

say


pr chap*

The * means ``anything at all,'' so this translates into

``print all files whose names begin with chap'', listed in

alphabetical order.
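A harmless way to find out what a pattern like chap* will turn into is to hand it to echo instead of pr. The sketch below does this in a scratch directory; mktemp is assumed to be available, and the file names are a made-up miniature book.

```sh
# Work in a throwaway directory (mktemp is assumed to exist).
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

# Hypothetical book: three chapter files plus one unrelated file.
for name in chap1.1 chap1.2 chap2.1 notes; do
    printf 'contents of %s\n' "$name" > "$name"
done

# chap* is replaced by the matching names before the command
# runs; echo simply shows what the command would receive.
matched=$(echo chap*)
echo "$matched"           # prints: chap1.1 chap1.2 chap2.1
```

The file notes is left out because its name does not begin with chap.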


This shorthand notation is not a property of the pr

command, by the way. It is system-wide, a service of the

program that interprets commands (the ``shell,'' sh(1)). Using

that fact, you can see how to list the names of the files in

the book:


ls chap*

produces


chap1.1
chap1.2
chap1.3
...

The * is not limited to the last position in a filename - it

can be anywhere and can occur several times. Thus


rm *junk* *temp*

removes all files that contain junk or temp as any part of

their name. As a special case, * by itself matches every

filename, so


pr *

prints all your files (alphabetical order), and


rm *

removes all files. (You had better be very sure that's what

you wanted to say!)


The * is not the only pattern-matching feature available.

Suppose you want to print only chapters 1 through 4 and 9.

Then you can say


pr chap[12349]*

The [...] means to match any of the characters inside the

brackets. A range of consecutive letters or digits can be

abbreviated, so you can also do this with


pr chap[1-49]*

Letters can also be used within brackets: [a-z] matches any

character in the range a through z.


The ? pattern matches any single character, so


ls ?

lists all files which have single-character names, and


ls -l chap?.1

lists information about the first file of each chapter

(chap1.1, chap2.1, etc.).


Of these niceties, * is certainly the most useful, and you

should get used to it. The others are frills, but worth

knowing.
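The bracket and question-mark patterns can be checked the same way, by letting echo show what each one expands to. This is a sketch only: mktemp is assumed to be available, and the nine chapter files and the files a and b are invented for the purpose.

```sh
# Work in a throwaway directory (mktemp is assumed to exist).
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

# Hypothetical files: nine chapters with one section each, plus
# two files with single-character names.
for n in 1 2 3 4 5 6 7 8 9; do
    : > "chap$n.1"
done
: > a
: > b

set_list=$(echo chap[12349]*)    # any one of the listed characters
range=$(echo chap[1-49]*)        # the same set, using a range
single=$(echo ?)                 # names exactly one character long
first=$(echo chap?.1)            # first section of every chapter

echo "$range"     # prints: chap1.1 chap2.1 chap3.1 chap4.1 chap9.1
echo "$single"    # prints: a b
```

Both bracket forms pick out chapters 1 through 4 and 9, while chap?.1 matches all nine chapters.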


If you should ever have to turn off the special meaning of

*, ?, etc., enclose the entire argument in single quotes, as

in


ls '?'

We'll see some more examples of this shortly.
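The effect of the quotes is easiest to see side by side. In this sketch (mktemp is assumed to be available; the files a and b are invented) echo is used so that nothing is listed or removed by accident.

```sh
# Work in a throwaway directory (mktemp is assumed to exist).
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
: > a
: > b

unquoted=$(echo ?)      # ? is expanded to the matching names
quoted=$(echo '?')      # the quotes pass a literal ? to the command

echo "$unquoted"        # prints: a b
echo "$quoted"          # prints: ?
```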


What's in a Filename, Continued


When you first made that file called junk, how did the

system know that there wasn't another junk somewhere else,

especially since the person in the next office is also

reading this tutorial? The answer is that generally each user

has a private directory, which contains only the files that

belong to him. When you log in, you are ``in'' your direc-

tory. Unless you take special action, when you create a new

file, it is made in the directory that you are currently in;

this is most often your own directory, and thus the file is

unrelated to any other file of the same name that might

exist in someone else's directory.


The set of all files is organized into a (usually big)

tree, with your files located several branches into the

tree. It is possible for you to ``walk'' around this tree,

and to find any file in the system, by starting at the root

of the tree and walking along the proper set of branches.

Conversely, you can start where you are and walk toward the

root.


Let's try the latter first. The basic tool is the

command pwd (``print working directory''), which prints the

name of the directory you are currently in.


Although the details will vary according to the system you

are on, if you give the command pwd, it will print something

like


/usr/your-name

This says that you are currently in the directory your-name,

which is in turn in the directory /usr, which is in turn in

the root directory called by convention just /. (Even if

it's not called /usr on your system, you will get something

analogous. Make the corresponding changes and read on.)


If you now type


ls /usr/your-name

you should get exactly the same list of file names as you

get from a plain ls: with no arguments, ls lists the

contents of the current directory; given the name of a

directory, it lists the contents of that directory.
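That a plain ls and an ls given the name of the current directory agree can be checked directly. The following is a sketch in a scratch directory rather than your own; mktemp is assumed to be available, and junk and temp are empty stand-in files.

```sh
# Work in a throwaway directory (mktemp is assumed to exist).
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
: > junk
: > temp

here=$(pwd)             # full name of the current directory
plain=$(ls)             # no argument: list the current directory
named=$(ls "$here")     # same directory, named explicitly

echo "$here"
echo "$plain"
```

The directory name printed by pwd will differ from system to system, but the two listings held in plain and named should be identical.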


Next, try


ls /usr

This should print a long series of names, among which is

your own login name your-name. On many systems, usr is a

directory that contains the directories of all the normal

users of the system, like you.


The next step is to try


ls /

You should get a response something like this (although

again the details may be different):


bin
dev
etc
lib
tmp
usr

This is a collection of the basic directories of files that

the system knows about; we are at the root of the tree.


Now try


cat /usr/your-name/junk

(if junk is still around in your directory). The name


/usr/your-name/junk

is called the pathname of the file that you normally think

of as ``junk''. ``Pathname'' has an obvious meaning: it

represents the full name of the path you have to follow from

the root through the tree of directories to get to a partic-

ular file. It is a universal rule in the UNIX system that

anywhere you can use an ordinary filename, you can use a

pathname.
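
You can check this rule for yourself. The following is only a sketch: it builds a scratch directory with mktemp -d (a later addition to UNIX systems, assumed to be available) so that nothing in your real directory is touched, and the file contents are made up.

```shell
# Make a scratch directory and a file called junk in it.
dir=$(mktemp -d)
cd "$dir" || exit 1
echo "some text" >junk

by_name=$(cat junk)           # ordinary filename, relative to where you are
by_path=$(cat "$dir/junk")    # full pathname, starting from the root

echo "by name: $by_name"
echo "by path: $by_path"
```

Both commands read the same file, so both variables end up holding the same text.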


Here is a picture which may make this clearer:


                      (root)
                     /  |  \
                    /   |   \
                   /    |    \
      bin     etc      usr      dev     tmp
     / | \   / | \    / | \    / | \   / | \
                     /  |  \
                    /   |   \
                adam   eve   mary
                /     /   \     \
                     /     \     junk
                  junk     temp




Notice that Mary's junk is unrelated to Eve's.


This isn't too exciting if all the files of interest are

in your own directory, but if you work with someone else or

on several projects concurrently, it becomes handy indeed.

For example, your friends can print your book by saying


pr /usr/your-name/chap*

Similarly, you can find out what files your neighbor has by

saying


ls /usr/neighbor-name

or make your own copy of one of his files by


cp /usr/your-neighbor/his-file yourfile


If your neighbor doesn't want you poking around in his

files, or vice versa, privacy can be arranged. Each file

and directory has read-write-execute permissions for the

owner, a group, and everyone else, which can be set to control

access. See ls(1) and chmod(1) for details. As a

matter of observed fact, most users most of the time find

openness of more benefit than privacy.
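
As a rough illustration of how the permissions are set and inspected, here is a sketch in a scratch directory. The file name diary is hypothetical, mktemp -d is assumed to be available, and the exact form of the ls -l listing may differ slightly from system to system.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
echo "private notes" >diary

chmod 600 diary                     # owner may read and write; nobody else
mode=$(ls -l diary | cut -c1-10)    # the first ten characters show the mode
echo "$mode"

chmod go+r diary                    # let the group and everyone else read it
open_mode=$(ls -l diary | cut -c1-10)
echo "$open_mode"
```

The mode string is read in threes after the first character: owner, group, everyone else.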


As a final experiment with pathnames, try


ls /bin /usr/bin

Do some of the names look familiar? When you run a program,

by typing its name after the prompt character, the system

simply looks for a file of that name. It normally looks

first in your directory (where it typically doesn't find


it), then in /bin and finally in /usr/bin. There is nothing

magic about commands like cat or ls, except that they have

been collected into a couple of places to be easy to find

and administer.


What if you work regularly with someone else on common

information in his directory? You could just log in as your

friend each time you want to, but you can also say ``I want

to work on his files instead of my own''. This is done by

changing the directory that you are currently in:


cd /usr/your-friend

(On some systems, cd is spelled chdir.) Now when you use a

filename in something like cat or pr, it refers to the file

in your friend's directory. Changing directories doesn't

affect any permissions associated with a file - if you

couldn't access a file from your own directory, changing to

another directory won't alter that fact. Of course, if you

forget what directory you're in, type


pwd

to find out.


It is usually convenient to arrange your own files so that

all the files related to one thing are in a directory

separate from other projects. For example, when you write

your book, you might want to keep all the text in a directory

called book. So make one with


mkdir book


then go to it with


cd book

then start typing chapters. The book is now found in

(presumably)


/usr/your-name/book

To remove the directory book, type


rm book/*
rmdir book

The first command removes all files from the directory; the

second removes the empty directory.
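
Put together, the whole life of such a directory might look like the sketch below. It runs inside a scratch directory made with mktemp -d (assumed to be available), and the chapter contents are invented.

```shell
top=$(mktemp -d)
cd "$top" || exit 1

mkdir book                           # make the directory
cd book || exit 1                    # go to it
echo "Call me Ishmael." >chap1       # start typing chapters
echo "It was a dark night." >chap2
cd "$top" || exit 1                  # back to where we started

rm book/*                            # first remove the files ...
rmdir book                           # ... then the now-empty directory

if [ -d book ]; then gone=no; else gone=yes; fi
echo "book removed: $gone"
```

If you try rmdir before the files are gone, it will refuse, which is a useful safety net.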


You can go up one level in the tree of files by saying


cd ..

``..'' is the name of the parent of whatever directory you

are currently in. For completeness, ``.'' is an alternate

name for the directory you are in.
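
A short experiment shows both names at work. This is a sketch only; the scratch directory comes from mktemp -d, which is assumed to be available.

```shell
top=$(mktemp -d)
mkdir "$top/book"
cd "$top/book" || exit 1

cd .              # "." is the directory you are in, so nothing changes
here=$(pwd)
cd ..             # ".." is its parent, one level up the tree
up=$(pwd)

echo "$here -> $up"
```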


Using Files instead of the Terminal


Most of the commands we have seen so far produce output on

the terminal; some, like the editor, also take their input

from the terminal. It is universal in UNIX systems that the

terminal can be replaced by a file for either or both of

input and output. As one example,


ls

makes a list of files on your terminal. But if you say





ls >filelist

a list of your files will be placed in the file filelist

(which will be created if it doesn't already exist, or

overwritten if it does). The symbol > means ``put the output

on the following file, rather than on the terminal.''

Nothing is produced on the terminal. As another example,

you could combine several files into one by capturing the

output of cat in a file:


cat f1 f2 f3 >temp


The symbol >> operates very much like > does, except that

it means ``add to the end of.'' That is,


cat f1 f2 f3 >>temp

means to concatenate f1, f2 and f3 to the end of whatever is

already in temp, instead of overwriting the existing contents.

As with >, if temp doesn't exist, it will be created

for you.
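
The difference between the two symbols is easy to see by counting lines. In this sketch the one-line files and the scratch directory (from mktemp -d, assumed available) are invented, and tr -d ' ' is there only to strip the padding that some versions of wc print.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
echo one >f1
echo two >f2
echo three >f3

cat f1 f2 f3 >temp        # temp is created (or overwritten): 3 lines
cat f1 f2 f3 >>temp       # added to the end of what is there: 6 lines

cat f1 >temp2
cat f1 >temp2             # overwritten each time, so still 1 line

lines_temp=$(wc -l <temp | tr -d ' ')
lines_temp2=$(wc -l <temp2 | tr -d ' ')
echo "temp: $lines_temp lines, temp2: $lines_temp2 lines"
```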


In a similar way, the symbol < means to take the input for

a program from the following file, instead of from the terminal.

Thus, you could make up a script of commonly used

editing commands and put them into a file called script.

Then you can run the script on a file by saying


ed file <script

As another example, you can use ed to prepare a letter in

file let, then send it to several people with




mail adam eve mary joe <let
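
The same trick works with any program that reads the terminal. As a small sketch, here sort stands in for ed and mail, since it needs neither an editor script nor anyone to write to; the file names are invented and mktemp -d is assumed to be available.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'pear\napple\nfig\n' >fruit    # a small input file

sort <fruit >sorted       # input comes from fruit, output goes to sorted
first=$(head -n 1 <sorted)
echo "first line after sorting: $first"
```

Notice that input and output can both be redirected on the same command line.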

Pipes


One of the novel contributions of the UNIX system is the

idea of a pipe. A pipe is simply a way to connect the output

of one program to the input of another program, so the

two run as a sequence of processes - a pipeline.


For example,


pr f g h

will print the files f, g, and h, beginning each on a new

page. Suppose you want them run together instead. You

could say


cat f g h >temp
pr <temp
rm temp

but this is more work than necessary. Clearly what we want

is to take the output of cat and connect it to the input of

pr. So let us use a pipe:


cat f g h | pr

The vertical bar | means to take the output from cat, which

would normally have gone to the terminal, and put it into pr

to be neatly formatted.


There are many other examples of pipes. For example,


ls | pr -3

prints a list of your files in three columns. The program


wc counts the number of lines, words and characters in its

input, and as we saw earlier, who prints a list of

currently-logged on people, one per line. Thus


who | wc

tells how many people are logged on. And of course


ls | wc

counts your files.
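
Here is the file count done both ways, as a sketch in a scratch directory with four invented files. The -l option asks wc for the line count alone, tr -d ' ' strips any padding, and mktemp is assumed to be available.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch chap1 chap2 chap3 notes

# With a pipe: ls writes one name per line straight into wc.
count=$(ls | wc -l | tr -d ' ')
echo "files: $count"

# The long way, through a temporary file kept outside the directory.
tmp=$(mktemp)
ls >"$tmp"
count2=$(wc -l <"$tmp" | tr -d ' ')
rm "$tmp"
echo "files: $count2"
```

The pipe gives the same answer with no temporary file to create, name, or clean up.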


Any program that reads from the terminal can read from a

pipe instead; any program that writes on the terminal can

drive a pipe. You can have as many elements in a pipeline

as you wish.


Many UNIX programs are written so that they will take

their input from one or more files if file arguments are

given; if no arguments are given they will read from the

terminal, and thus can be used in pipelines. pr is one

example:


pr -3 a b c

prints files a, b and c in order in three columns. But in


cat a b c | pr -3

pr prints the information coming down the pipeline, still in

three columns.
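
The same convention can be tried with sort, used in this sketch instead of pr because pr stamps each page with the date, which makes its output awkward to compare. The files and their contents are invented, and mktemp -d is assumed to be available.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
echo gamma >a
echo alpha >b
echo beta >c

from_args=$(sort a b c)          # file arguments given: read the files
from_pipe=$(cat a b c | sort)    # no arguments: read what comes down the pipe

echo "$from_args"
```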


The Shell


We have already mentioned once or twice the mysterious



``shell,'' which is in fact sh(1). The shell is the program

that interprets what you type as commands and arguments. It

also looks after translating *, etc., into lists of

filenames, and <, >, and | into changes of input and output

streams.


The shell has other capabilities too. For example, you

can run two programs with one command line by separating the

commands with a semicolon; the shell recognizes the semi-

colon and breaks the line into two commands. Thus


date; who

does both commands before returning with a prompt character.


You can also have more than one program running simultaneously

if you wish. For example, if you are doing something

time-consuming, like the editor script of an earlier section,

and you don't want to wait around for the results

before starting something else, you can say


ed file <script &

The ampersand at the end of a command line says ``start this

command running, then take further commands from the termi-

nal immediately,'' that is, don't wait for it to complete.

Thus the script will begin, but you can do something else at

the same time. Of course, to keep the output from interfer-

ing with what you're doing on the terminal, it would be

better to say


ed file <script >script.out &


which saves the output lines in a file called script.out.
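
To see this happen without an editor script to hand, here is a sketch in which sleep plays the part of the slow job. The wait command, which is not described in this paper, tells the shell to pause until its background commands have finished; mktemp -d is assumed to be available.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

# A stand-in for a slow job: pause, then report, with output kept in a file.
(sleep 1; echo "script finished") >script.out &

echo "the shell is ready for more while the job runs"
wait                          # now hold on until the background job is done
result=$(cat script.out)
echo "$result"
```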


When you initiate a command with &, the system replies

with a number called the process number, which identifies

the command in case you later want to stop it. If you do,

you can say


kill process-number

If you forget the process number, the command ps will tell

you about everything you have running. (If you are

desperate, kill 0 will kill all your processes.) And if

you're curious about other people, ps a will tell you about

all programs that are currently running.
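
In a file of commands there is nobody to read the process number off the terminal, so the sketch below picks it up from $!, a shell variable (not mentioned in this paper) that holds the process number of the most recent background command. The exact non-zero status reported after a kill varies from shell to shell.

```shell
sleep 30 &                 # a long-running background command
pid=$!                     # its process number
echo "started process $pid"

kill "$pid"                # stop it

# wait reports how the command ended; non-zero means it did not finish normally.
if wait "$pid" 2>/dev/null; then status=0; else status=$?; fi
echo "status after kill: $status"
```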


You can say


(command-1; command-2; command-3) &

to start three commands in the background, or you can start

a background pipeline with


command-1 | command-2 &
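
Both forms can be tried with harmless commands. This is only a sketch: the output files are invented names, wait (not covered in this paper) pauses until the background work is finished, and mktemp -d is assumed to be available.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

# Three commands run one after another, as a single background job.
(echo first; echo second; echo third) >seq.out &

# A pipeline can be put in the background in the same way.
printf 'b\na\nc\n' | sort >pipe.out &

wait                                   # let both background jobs finish
seq_lines=$(wc -l <seq.out | tr -d ' ')
pipe_first=$(head -n 1 <pipe.out)
echo "$seq_lines lines in seq.out; pipe.out starts with $pipe_first"
```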


Just as you can tell the editor or some similar program to

take its input from a file instead of from the terminal, you

can tell the shell to read a file to get commands. (Why

not? The shell, after all, is just a program, albeit a

clever one.) For instance, suppose you want to set tabs on

your terminal, and find out the date and who's on the system

every time you log in. Then you can put the three necessary

commands (tabs, date, who) into a file, let's call it



startup, and then run it with


sh startup

This says to run the shell with the file startup as input.

The effect is as if you had typed the contents of startup on

the terminal.


If this is to be a regular thing, you can eliminate the

need to type sh: simply type, once only, the command


chmod +x startup

and thereafter you need only say


startup

to run the sequence of commands. The chmod(1) command marks

the file executable; the shell recognizes this and runs it

as a sequence of commands.
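
The whole sequence can be tried in a scratch directory, as in the sketch below. The contents of startup here are illustrative rather than the tabs, date and who of the text. One assumption to note: on many present-day systems the shell does not search the current directory for commands, so the file is run as ./startup rather than plain startup. mktemp -d is also assumed to be available.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

# Put two commands into a file called startup.
echo 'echo "logged in"' >startup
echo 'date' >>startup

out1=$(sh startup | head -n 1)    # run it by handing the file to sh

chmod +x startup                  # mark it executable, once only
out2=$(./startup | head -n 1)     # now the name alone runs it

echo "$out1 / $out2"
```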


If you want startup to run automatically every time you

log in, create a file in your login directory called

.profile, and place in it the line startup. When the shell

first gains control when you log in, it looks for the

.profile file and does whatever commands it finds in it.

We'll get back to the shell in the section on programming.



III. DOCUMENT PREPARATION


UNIX systems are used extensively for document prepara-

tion. There are two major formatting programs, that is,

programs that produce a text with justified right margins,



automatic page numbering and titling, automatic hyphenation,

and the like. nroff is designed to produce output on terminals

and line-printers. troff (pronounced ``tee-roff'')

instead drives a phototypesetter, which produces very high

quality output on photographic paper. This paper was formatted

with troff.


Formatting Packages


The basic idea of nroff and troff is that the text to be

formatted contains within it ``formatting commands'' that

indicate in detail how the formatted text is to look. For

example, there might be commands that specify how long lines

are, whether to use single or double spacing, and what run-

ning titles to use on each page.


Because nroff and troff are relatively hard to learn to

use effectively, several ``packages'' of canned formatting

requests are available to let you specify paragraphs, run-

ning titles, footnotes, multi-column output, and so on, with

little effort and without having to learn nroff and troff.

These packages take a modest effort to learn, but the

rewards for using them are so great that it is time well

spent.


In this section, we will provide a hasty look at the

``manuscript'' package known as -ms. Formatting requests

typically consist of a period and two upper-case letters,

such as .TL, which is used to introduce a title, or .PP to



begin a new paragraph.


A document is typed so it looks something like this:


.TL
title of document
.AU
author name
.SH
section heading
.PP
paragraph ...
.PP
another paragraph ...
.SH
another section heading
.PP
etc.

The lines that begin with a period are the formatting

requests. For example, .PP calls for starting a new paragraph.

The precise meaning of .PP depends on what output

device is being used (typesetter or terminal, for instance),

and on what publication the document will appear in. For

example, -ms normally assumes that a paragraph is preceded

by a space (one line in nroff, 1/2 line in troff), and the

first word is indented. These rules can be changed if you

like, but they are changed by changing the interpretation of

.PP, not by re-typing the document.


To actually produce a document in standard format using

-ms, use the command


troff -ms files ...

for the typesetter, and


nroff -ms files ...



for a terminal. The -ms argument tells troff and nroff to

use the manuscript package of formatting requests.


There are several similar packages; check with a local

expert to determine which ones are in common use on your

machine.


Supporting Tools


In addition to the basic formatters, there is a host of

supporting programs that help with document preparation.

The list in the next few paragraphs is far from complete, so

browse through the manual and check with people around you

for other possibilities.


eqn and neqn let you integrate mathematics into the text

of a document, in an easy-to-learn language that closely

resembles the way you would speak it aloud. For example,

the eqn input


sum from i=0 to n x sub i ~=~ pi over 2

produces the output


\sum_{i=0}^{n} x_i = \frac{\pi}{2}


The program tbl provides an analogous service for preparing

tabular material; it does all the computations necessary

to align complicated columns with elements of varying

widths.





refer prepares bibliographic citations from a data base,

in whatever style is defined by the formatting package. It

looks after all the details of numbering references in

sequence, filling in page and volume numbers, getting the

author's initials and the journal name right, and so on.


spell and typo detect possible spelling mistakes in a

document. spell works by comparing the words in your document

to a dictionary, printing those that are not in the

dictionary. It knows enough about English spelling to

detect plurals and the like, so it does a very good job.

typo looks for words which are ``unusual'', and prints

those. Spelling mistakes tend to be more unusual, and thus

show up early when the most unusual words are printed first.


grep looks through a set of files for lines that contain a

particular text pattern (rather like the editor's context

search does, but on a bunch of files). For example,


grep 'ing$' chap*

will find all lines that end with the letters ing in the

files chap*. (It is almost always a good practice to put

single quotes around the pattern you're searching for, in

case it contains characters like * or $ that have a special

meaning to the shell.) grep is often useful for finding out

in which of a set of files the misspelled words detected by

spell are actually located.
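
A small experiment, as a sketch with two invented chapter files in a scratch directory (mktemp -d is assumed to be available). The -l option used at the end is a standard grep option that prints only the names of the files containing a match, which is handy for tracking down a word that spell has complained about.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'I like writing\nwriting is hard\n' >chap1
printf 'stop typing\nand go home\n' >chap2

# Quoted, so the shell leaves the $ alone and grep sees the pattern ing$.
grep 'ing$' chap* >hits
cat hits
nhits=$(wc -l <hits | tr -d ' ')

# Which file contains the word typing?
files=$(grep -l 'typing' chap*)
echo "$nhits matching lines; typing is in $files"
```

When more than one file is searched, grep puts the file name in front of each line it prints.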


diff prints a list of the differences between two files,



so you can compare two versions of something automatically

(which certainly beats proofreading by hand).


wc counts the words, lines and characters in a set of

files. tr translates characters into other characters; for

example it will convert upper to lower case and vice versa.

This translates upper into lower:


tr A-Z a-z <input >output
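
Since tr reads its input and writes its output like any other program, it is just as easy to try in a pipe. A minimal sketch, with a sample line chosen for illustration:

```shell
lower=$(echo "UNIX for Beginners" | tr A-Z a-z)    # upper into lower
upper=$(echo "UNIX for Beginners" | tr a-z A-Z)    # and vice versa
echo "$lower"
echo "$upper"
```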


sort sorts files in a variety of ways; cref makes cross-references;

ptx makes a permuted index (keyword-in-context

listing). sed provides many of the editing facilities of

ed, but can apply them to arbitrarily long inputs. awk provides

the ability to do both pattern matching and numeric

computations, and to conveniently process fields within

lines. These programs are for more advanced users, and they

are not limited to document preparation. Put them on your

list of things to learn about.


Most of these programs are either independently documented

(like eqn and tbl), or are sufficiently simple that the

description in the UNIX Programmer's Manual is adequate

explanation.


Hints for Preparing Documents


Most documents go through several versions (always more

than you expected) before they are finally finished.

Accordingly, you should do whatever possible to make the job



of changing them easy.


First, when you do the purely mechanical operations of

typing, type so that subsequent editing will be easy. Start

each sentence on a new line. Make lines short, and break

lines at natural places, such as after commas and semi-

colons, rather than randomly. Since most people change

documents by rewriting phrases and adding, deleting and

rearranging sentences, these precautions simplify any edit-

ing you have to do later.


Keep the individual files of a document down to modest

size, perhaps ten to fifteen thousand characters. Larger

files edit more slowly, and of course if you make a dumb

mistake it's better to have clobbered a small file than a

big one. Split into files at natural boundaries in the

document, for the same reasons that you start each sentence

on a new line.


The second aspect of making change easy is to not commit

yourself to formatting details too early. One of the advantages

of formatting packages like -ms is that they permit

you to delay decisions to the last possible moment. Indeed,

until a document is printed, it is not even decided whether

it will be typeset or put on a line printer.


As a rule of thumb, for all but the most trivial jobs, you

should type a document in terms of a set of requests like

.PP, and then define them appropriately, either by using one



of the canned packages (the better way) or by defining your

own nroff and troff commands. As long as you have entered

the text in some systematic way, it can always be cleaned up

and re-formatted by a judicious combination of editing com-

mands and request definitions.


IV. PROGRAMMING


There will be no attempt made to teach any of the program-

ming languages available but a few words of advice are in

order. One of the reasons why the UNIX system is a produc-

tive programming environment is that there is already a rich

set of tools available, and facilities like pipes, I/O

redirection, and the capabilities of the shell often make it

possible to do a job by pasting together programs that

already exist instead of writing from scratch.


The Shell


The pipe mechanism lets you fabricate quite complicated

operations out of spare parts that already exist. For example,

the first draft of the spell program was (roughly)


cat ...        collect the files
| tr ...       put each word on a new line
| tr ...       delete punctuation, etc.
| sort         into dictionary order
| uniq         discard duplicates
| comm         print words in text
               but not in dictionary

More pieces have been added subsequently, but this goes a

long way for such a small effort.
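
The arguments elided above are not given in this paper, so what follows is only one plausible way to fill them in, not the original spell. It folds case first and lets a single tr do the work of splitting words and dropping punctuation. The text, the tiny dictionary, and the file names are all invented, and mktemp -d is assumed to be available.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'The quick brown fox\njumps over teh lazy dog.\n' >chap1
# A stand-in dictionary; comm needs it to be in sorted order.
printf 'brown\ndog\nfox\njumps\nlazy\nover\nquick\nthe\n' >dict

cat chap1 |
tr A-Z a-z |          # fold upper case into lower
tr -cs a-z '\n' |     # each run of non-letters becomes one newline
sort |                # into dictionary order
uniq |                # discard duplicates
comm -23 - dict >misspelled    # keep words in the text but not in dict

cat misspelled
```

Only the mistyped word survives to the end of the pipeline.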





The editor can be made to do things that would normally

require special programs on other systems. For example, to

list the first and last lines of each of a set of files,

such as a book, you could laboriously type


ed
e chap1.1
1p
$p
e chap1.2
1p
$p
etc.

But you can do the job much more easily. One way is to type


ls chap* >temp

to get the list of filenames into a file. Then edit this

file to make the necessary series of editing commands (using

the global commands of ed), and write it into script. Now

the command


ed <script

will produce the same output as the laborious hand typing.

Alternatively (and more easily), you can use the fact that the

shell will perform loops, repeating a set of commands over

and over again for a set of arguments:


for i in chap*
do
    ed $i <script
done

This sets the shell variable i to each file name in turn,

then does the command. You can type this command at the

terminal, or put it in a file for later execution.
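
Here is the same loop in a form you can try anywhere, as a sketch. Because ed is not installed on every present-day system, head and tail stand in for the editor script; the chapter files are invented and mktemp -d is assumed to be available.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'first of one\nmiddle\nlast of one\n' >chap1
printf 'first of two\nlast of two\n' >chap2

for i in chap*
do
    head -n 1 "$i"       # first line of the file named by $i
    tail -n 1 "$i"       # last line
done >ends               # output of the whole loop goes to one file

cat ends
```

Redirecting after done collects everything the loop prints, rather than the output of one command inside it.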




Programming the Shell


An option often overlooked by newcomers is that the shell

is itself a programming language, with variables, control

flow (if-else, while, for, case), subroutines, and interrupt

handling. Since there are many building-block programs, you

can sometimes avoid writing a new program merely by piecing

together some of the building blocks with shell command

files.


We will not go into any details here; examples and rules

can be found in An Introduction to the UNIX Shell, by S. R.

Bourne.


Programming in C


If you are undertaking anything substantial, C is the only

reasonable choice of programming language: everything in the

UNIX system is tuned to it. The system itself is written in

C, as are most of the programs that run on it. It is also an

easy language to use once you get started. C is introduced

and fully described in The C Programming Language by B. W.

Kernighan and D. M. Ritchie (Prentice-Hall, 1978). Several

sections of the manual describe the system interfaces, that

is, how you do I/O and similar functions. Read UNIX Programming

for more complicated things.


Most input and output in C is best handled with the stan-

dard I/O library, which provides a set of I/O functions that

exist in compatible form on most machines that have C


compilers. In general, it's wisest to confine the system

interactions in a program to the facilities provided by this

library.


C programs that don't depend too much on special features

of UNIX (such as pipes) can be moved to other computers that

have C compilers. The list of such machines grows daily; in

addition to the original PDP-11, it currently includes at

least Honeywell 6000, IBM 370, Interdata 8/32, Data General

Nova and Eclipse, HP 2100, Harris /7, VAX 11/780, SEL 86,

and Zilog Z80. Calls to the standard I/O library will work

on all of these machines.


There are a number of supporting programs that go with C.

lint checks C programs for potential portability problems,

and detects errors such as mismatched argument types and

uninitialized variables.


For larger programs (anything whose source is on more than

one file) make allows you to specify the dependencies among

the source files and the processing steps needed to make a

new version; it then checks the times that the pieces were

last changed and does the minimal amount of recompiling to

create a consistent updated version.


The debugger adb is useful for digging through the dead

bodies of C programs, but is rather hard to learn to use

effectively. The most effective debugging tool is still

careful thought, coupled with judiciously placed print



statements.


The C compiler provides a limited instrumentation service,

so you can find out where programs spend their time and what

parts are worth optimizing. Compile the routines with the

-p option; after the test run, use prof to print an execution

profile. The command time will give you the gross

run-time statistics of a program, but they are not super

accurate or reproducible.


Other Languages


If you have to use Fortran, there are two possibilities.

You might consider Ratfor, which gives you the decent con-

trol structures and free-form input that characterize C, yet

lets you write code that is still portable to other environ-

ments. Bear in mind that UNIX Fortran tends to produce

large and relatively slow-running programs. Furthermore,

supporting software like adb, prof, etc., are all virtually

useless with Fortran programs. There may also be a Fortran

77 compiler on your system. If so, this is a viable alter-

native to Ratfor, and has the non-trivial advantage that it

is compatible with C and related programs. (The Ratfor pro-

cessor and C tools can be used with Fortran 77 too.)


If your application requires you to translate a language

into a set of actions or another language, you are in effect

building a compiler, though probably a small one. In that

case, you should be using the yacc compiler-compiler, which



helps you develop a compiler quickly. The lex lexical

analyzer generator does the same job for the simpler

languages that can be expressed as regular expressions. It

can be used by itself, or as a front end to recognize inputs

for a yacc-based program. Both yacc and lex require some

sophistication to use, but the initial effort of learning

them can be repaid many times over in programs that are easy

to change later on.


Most UNIX systems also make available other languages,

such as Algol 68, APL, Basic, Lisp, Pascal, and Snobol.

Whether these are useful depends largely on the local

environment: if someone cares about the language and has

worked on it, it may be in good shape. If not, the odds are

strong that it will be more trouble than it's worth.


V. UNIX READING LIST


General:


K. L. Thompson and D. M. Ritchie, The UNIX Programmer's

Manual, Bell Laboratories, 1978. Lists commands, system

routines and interfaces, file formats, and some of the

maintenance procedures. You can't live without this,

although you will probably only need to read section 1.


Documents for Use with the UNIX Time-sharing System. Volume

2 of the Programmer's Manual. This contains more extensive

descriptions of major commands, and tutorials and reference

manuals. All of the papers listed below are in it, as are


descriptions of most of the programs mentioned above.


D. M. Ritchie and K. L. Thompson, ``The UNIX Time-sharing

System,'' CACM, July 1974. An overview of the system, for

people interested in operating systems. Worth reading by

anyone who programs. Contains a remarkable number of one-

sentence observations on how to do things right.


The Bell System Technical Journal (BSTJ) Special Issue on

UNIX, July/August, 1978, contains many papers describing

recent developments, and some retrospective material.


The 2nd International Conference on Software Engineering

(October, 1976) contains several papers describing the use

of the Programmer's Workbench (PWB) version of UNIX.


Document Preparation:


B. W. Kernighan, ``A Tutorial Introduction to the UNIX Text

Editor'' and ``Advanced Editing on UNIX,'' Bell Labora-

tories, 1978. Beginners need the introduction; the advanced

material will help you get the most out of the editor.


M. E. Lesk, ``Typing Documents on UNIX,'' Bell Laboratories,

1978. Describes the -ms macro package, which isolates the

novice from the vagaries of nroff and troff, and takes care

of most formatting situations. If this specific package

isn't available on your system, something similar probably

is. The most likely alternative is the PWB/UNIX macro package

-mm; see your local guru if you use PWB/UNIX.




B. W. Kernighan and L. L. Cherry, ``A System for Typesetting

Mathematics,'' Bell Laboratories Computing Science Tech.

Rep. 17.


M. E. Lesk, ``Tbl - A Program to Format Tables,'' Bell

Laboratories CSTR 49, 1976.


J. F. Ossanna, Jr., ``NROFF/TROFF User's Manual,'' Bell

Laboratories CSTR 54, 1976. troff is the basic formatter

used by -ms, eqn and tbl. The reference manual is

indispensable if you are going to write or maintain these or

similar programs. But start with:


B. W. Kernighan, ``A TROFF Tutorial,'' Bell Laboratories,

1976. An attempt to unravel the intricacies of troff.


Programming:


B. W. Kernighan and D. M. Ritchie, The C Programming

Language, Prentice-Hall, 1978. Contains a tutorial

introduction, complete discussions of all language features, and

the reference manual.


B. W. Kernighan and D. M. Ritchie, ``UNIX Programming,''

Bell Laboratories, 1978. Describes how to interface with

the system from C programs: I/O calls, signals, processes.


S. R. Bourne, ``An Introduction to the UNIX Shell,'' Bell

Laboratories, 1978. An introduction and reference manual

for the Version 7 shell. Mandatory reading if you intend to

make effective use of the programming power of this shell.




S. C. Johnson, ``Yacc - Yet Another Compiler-Compiler,''

Bell Laboratories CSTR 32, 1978.


M. E. Lesk, ``Lex - A Lexical Analyzer Generator,'' Bell

Laboratories CSTR 39, 1975.


S. C. Johnson, ``Lint, a C Program Checker,'' Bell

Laboratories CSTR 65, 1977.


S. I. Feldman, ``MAKE - A Program for Maintaining Computer

Programs,'' Bell Laboratories CSTR 57, 1977.


J. F. Maranzano and S. R. Bourne, ``A Tutorial Introduction

to ADB,'' Bell Laboratories CSTR 62, 1977. An introduction

to a powerful but complex debugging tool.


S. I. Feldman and P. J. Weinberger, ``A Portable Fortran 77

Compiler,'' Bell Laboratories, 1978. A full Fortran 77 for

UNIX systems.


May 1979

























