Category : OS/2 Files
Archive   : GNUGREP.ZIP
Filename : README.CRA

(Message inbox:135)
Date: Mon, 17 Oct 88 16:53:33 PDT
To: [email protected]
cc: darin%[email protected], [email protected]
From: James A. Woods
Subject: README.cray for GNU e?grep

I just sent this out to comp.unix.cray:

-------------------------------------------------------------------
From: [email protected] (James A. Woods)
Newsgroups: comp.unix.cray
Subject: GNU e?grep on Cray machines
Message-ID: <[email protected]>
Date: 17 Oct 88 23:47:29 GMT
Organization: NASA Ames Research Center, California
Lines: 66

# "What comes after silicon? Oh, gallium arsenide, I'd guess. And after
that, there's a thing called indium phosphide."
-- Seymour Cray, Datamation interview, circa 1980

Now that most Cray software development is done on Crays themselves,
thanks to Unix, GNU e?grep should come in handy. Of course, if you're
scanning GENBANK for the Human Genome Project at 10 MB/second (the raw
X/MP Unix I/O rate), you really do need the speed.

Sample, from one of the Ames Cray 2 machines:

stokes> time ./egrep astrian web2 # GNU egrep
alabastrian
Lancastrian
Zoroastrian
Zoroastrianism
0.5980u 0.0772s 0:01 35%
stokes> time /usr/bin/egrep astrian web2 # ATT egrep
alabastrian
Lancastrian
Zoroastrian
Zoroastrianism
7.6765u 0.1373s 0:15 49%

(web2 is a 2.4 MB wordlist, standard on BSD Unix.)

To bring up GNU e?grep, ftp Mike Haertel's version 1.1 package from
'prep.ai.mit.edu' or 'ames.arc.nasa.gov'. Add -DUSG to the compiler
flags in the Makefile, and define

#define SIGN_EXTEND_CHAR(c) ((c)>(char)127?(c)-256:(c))

in regex.c. [Cray characters, like MIPS chars, are unsigned, but the
compiler won't allow ... #define SIGN_EXTEND_CHAR(c) ((signed char) (c))]
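
For readers who want to see what the macro does on a machine where plain
char is unsigned, here is a minimal sketch. The helper sign_extend_byte()
is hypothetical and not part of regex.c; it takes an unsigned char so the
behaviour is the same on any host. Bytes 128..255 map to -128..-1, and
bytes 0..127 pass through unchanged.

```c
/* The macro exactly as given in the post. */
#define SIGN_EXTEND_CHAR(c) ((c) > (char)127 ? (c) - 256 : (c))

/*
 * Hypothetical helper (not in regex.c): apply the macro to a byte held
 * in an unsigned char, which is how plain char behaves on the Cray.
 * The argument is promoted to int before the comparison, so the
 * subtraction cannot wrap.
 */
int sign_extend_byte(unsigned char c)
{
    return SIGN_EXTEND_CHAR(c);
}
```

For example, sign_extend_byte(200) yields -56 and sign_extend_byte(65)
yields 65.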

However, at least on the Cray 2, there's a compiler bug involving the
increment operator in complex expressions, which requires the following
modification (also in regex.c):

change
m->elems[m->nelem++].constraint |= s2->elems[j++].constraint;
to
m->elems[m->nelem].constraint |= s2->elems[j].constraint;
m->nelem++;
j++;

Thanks go to Darin Okuyama of NASA ARC for providing this workaround.
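
The two forms of the statement should behave identically under a correct
compiler, which is why the split version is a safe workaround. The sketch
below checks that on an ordinary machine. The struct layouts and the
function names merge_combined() and merge_split() are assumptions made
for illustration; only the field names come from the statement quoted
above.

```c
/* Hypothetical stand-ins for the regex.c structures. */
struct elem {
    int index;
    int constraint;
};

struct set {
    struct elem *elems;
    int nelem;
};

/* Original form: both increments buried in one expression. */
void merge_combined(struct set *m, const struct set *s2, int *j)
{
    m->elems[m->nelem++].constraint |= s2->elems[(*j)++].constraint;
}

/* Workaround form: do the OR first, then bump the two counters. */
void merge_split(struct set *m, const struct set *s2, int *j)
{
    m->elems[m->nelem].constraint |= s2->elems[*j].constraint;
    m->nelem++;
    (*j)++;
}
```

Calling both functions on identical copies of the same data should leave
nelem, j, and the constraint fields in the same state.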

-- James A. Woods (ames!jaw)
NASA Ames Research Center

P.S.
Though Crays are not at their best pushing bytes, the timing difference
is even more exaggerated with heavier regexp processing, to wit:

time ./egrep -i 'as.*Trian' web2
...
0.7677u 0.0769s 0:01 44%
vs.
time /usr/bin/egrep -i 'as.*Trian' web2
...
16.1327u 0.1379s 0:32 49%

which is a mite unfair given a known System 5 egrep -i gaffe. You get
extra credit for vectorizing the inner loop of the Boyer/Moore/Gosper
code, though changing all chars to ints might help also.
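
For anyone tempted by the extra credit, the kind of inner loop in
question looks roughly like the sketch below. This is NOT the GNU e?grep
code; it is a plain Boyer-Moore-Horspool search, a simplified relative of
Boyer/Moore, written here only to show the skip-table loop. The name
bmh_search() is hypothetical. The skip table holds word-sized entries
rather than chars, in the spirit of the chars-to-ints remark.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Return a pointer to the first occurrence of pat (length m) in
 * text (length n), or NULL if there is none.
 */
const unsigned char *bmh_search(const unsigned char *text, size_t n,
                                const unsigned char *pat, size_t m)
{
    size_t skip[UCHAR_MAX + 1];
    size_t i, pos;

    if (m == 0)
        return text;
    if (m > n)
        return NULL;

    /* Default shift is the whole pattern length. */
    for (i = 0; i <= UCHAR_MAX; i++)
        skip[i] = m;
    /* Bytes in the pattern (except the last) allow a shorter shift. */
    for (i = 0; i + 1 < m; i++)
        skip[pat[i]] = m - 1 - i;

    pos = 0;
    while (pos <= n - m) {
        size_t k = m;

        /* Compare right to left. */
        while (k > 0 && text[pos + k - 1] == pat[k - 1])
            k--;
        if (k == 0)
            return text + pos;

        /* Shift by the table entry for the byte under the pattern's end. */
        pos += skip[text[pos + m - 1]];
    }
    return NULL;
}
```

Searching "Zoroastrianism" for "astrian" with this routine returns a
pointer four bytes into the text.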

