Dec 25, 2017
 
Text search. Including fuzzy search, and/or. Latest version.
File MAXFND23.ZIP from The Programmer’s Corner in
Category File Managers
File Name     File Size   Zip Size   Zip Type
MAXFIND.DOC       30959      10370   deflated
MF23.EXE          16889      10114   deflated
TXT.BAT             205        149   deflated


Contents of the MAXFIND.DOC file



Manual for MAXFIND Version 2.3.

Copyright 1989 Stanley C. Peters All Rights Reserved

From: Stanley C. Peters Shareware $15
P. O. Box 2028
Fairfield, Iowa 52556


TABLE OF CONTENTS.


OVERVIEW .......................................... 1

USING THE PROGRAM ............................... 2
Specifying strings ............................. 2
Specifying Filenames ........................... 2
Options ........................................ 3
Fuzzy search ................................... 4
Scan windows ................................... 4
SEARCH STRATEGIES BY TYPE OF FILE ................. 5
Name and Address Lists ......................... 5
Letters ........................................ 5
Text file searches ............................. 5
Word processing documents ...................... 5
BBS files ...................................... 6
Program files .................................. 6
General Hints .................................. 7

REFERENCE SECTION ................................. 8
Options ........................................ 8

What's New with MAXFIND ............................. 10
Files in the distribution ........................... 10
WARRANTY ............................................ 10
LICENSE ............................................. 11
HOW TO PAY .......................................... 11
DISTRIBUTION ........................................ 12
REGISTRATION AND ORDER FORM .......................... 13



Page - 01

OVERVIEW.

Basically, this program works much like other FIND programs
that you may have used. Enter the program name at the DOS
prompt, followed by a string, and then the file name. But
there are several powerful advantages:

- Search for up to 15 strings on one pass over the file with
little performance penalty.

- Combination and/or searches are definable in an easy-to-use way.

- Will search subdirectories or the entire disk.

- Has a "fuzzy" search; spelling need not be exact.

- A help screen is available; enter mf at the DOS prompt.

- It works quickly! Now, with Version 2.3, simple searches on
my 8 MHz AT disk scan at 100 thousand bytes per second.
It moves at 70k bytes per second for quite complex searches.

- A scan window size option for matching and display. Great for
Name and Address lists and for finding phrases or quotes that
span more than one line.

- It is useful on word processor documents and databases that
keep their data in an ASCII format.

The overall effect of this combination of features is a text
search program that, in the words of PC Magazine (1/31/89),
"..holds its own with the best commercial programs."

Since then, Lotus has introduced Magellan, a shell with text
search. This $179 program will take two megabytes out of a
20-megabyte hard disk for an index to the entire disk. I have
looked at it briefly and I'm impressed: fast searches and
"launching" into the appropriate program. But I feel that
anyone who needs that should ALSO have MaxFind because,
after all, finding "lost" information is not an easy job with
ANY program. MaxFind is the preferred tool for these cases:

- Names and words with unknown spelling.
- Longer documents. With my limited exposure to Magellan,
it appears that it finds FILES very quickly, but it's up
to you to scroll through the document. MaxFind will
show you up to 24 found word(s), in context, on one
screen.
- Finding a quote. Magellan doesn't index about 150 common
words. If many of these are in the quote - problems.
- Magellan won't fit on a floppy-based laptop.

Page - 02


USING THE PROGRAM:

The program works from the DOS prompt much like the DOS FIND
command. Enter the program name ( mf ), followed by a string,
and complete with a file name to be searched.

SPECIFYING STRINGS:

You can include up to 15 strings. MAXFIND will search for all of
the strings in parallel, reading the file once. For example:
mf string1 string2 string3 my.doc

searches for three strings in file 'my.doc'.

Notice that the strings need not be inside quotes. You need
quotes when the string has an embedded blank, e.g.:
mf "mac intosh" my.fil

or when your string has DOS "piping" characters ( < > | ) such as:
mf "x < 10" "x<10" my.bas

If you use the AND option ( -a ), all the strings on the line
must be present. You can get a compound search using the slash
character / . Consider this:
mf bob smith address.fil -a

This searches for a line that contains bob and smith.

But what if the name in the file is Robert Smith? The slash is
used to separate equivalent names so:
mf bob/robert smith address.fil -a

would find Bob Smith and Robert Smith. Or, adding Robt:
mf bob/robert/robt smith address.fil -a
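
The and/or rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MAXFIND's own code: the function name line_matches is hypothetical, matching is a plain case-insensitive substring test, and the doubled slash (//) described under General Hints is not handled.

```python
def line_matches(line, terms):
    """Return True if every term matches the line (the -a behaviour).

    Each term may hold alternatives separated by '/'; a term matches
    when any one of its alternatives occurs in the line. Case is
    ignored, as in the default search.
    """
    text = line.lower()
    for term in terms:
        alternatives = term.lower().split("/")
        if not any(alt in text for alt in alternatives):
            return False  # one term failed, so the AND fails
    return True


print(line_matches("Robert Smith, 12 Elm St", ["bob/robert", "smith"]))
print(line_matches("Bob Jones, 4 Oak Ave", ["bob/robert", "smith"]))
```

The first call succeeds because "robert" satisfies the first term and "smith" the second; the second call fails because no alternative of the second term is present.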

If you are not quite sure of the spelling, you can use the tilde (~)
in the string; it stands for any character. So 'g~ve' will find
'give' and 'gave'. If you want to search for the tilde, make it the
last (or only) character in the string. The more sophisticated
fuzzy search is discussed below.
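
A minimal Python sketch of the tilde rule follows. It assumes that "any character" means exactly one character and that a trailing tilde is taken literally; the name tilde_find is hypothetical and the code is not taken from MAXFIND.

```python
def tilde_find(pattern, text):
    """Return the index of the first match of pattern in text, or -1.

    A '~' in the pattern matches any one character, except when it is
    the last (or only) character, where it stands for a literal tilde.
    """
    p = pattern.lower()
    t = text.lower()
    last = len(p) - 1
    for start in range(len(t) - len(p) + 1):
        ok = True
        for i, ch in enumerate(p):
            wild = ch == "~" and i != last
            if not wild and t[start + i] != ch:
                ok = False
                break
        if ok:
            return start
    return -1


print(tilde_find("g~ve", "I gave it away"))
print(tilde_find("g~ve", "a glove"))
```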

SPECIFYING FILENAMES:

MAXFIND follows the DOS rules for ambiguous file names (afns).
It uses the "?" and "*" as DOS does. The "?" means "anything is ok"
for this position. The * means anything is OK up to the period
or the end of the name. So "oct*.ltr" selects all files with an
extension (suffix) of "ltr", where the name part starts with
"oct". Suppose you want to search a group of memos to find a
delinquent note to Mr. Jones. You could use this:
mf jones delinquent *.ltr -d

Page - 03
This will work just fine if you are in the directory of all
letters. But if your letters are spread over several
directories, and the parent node for several subdirectories is
"c:\wp", you could search all those subdirectories by this
request:
mf jones delinquent c:\wp\*.ltr -ds

The -s option tells MAXFIND to also search the directories beneath
c:\wp.

Or, you could search the entire C: drive for "*.ltr" with this:
mf jones delinquent c:\*.ltr -sd


OPTIONS:

There are quite a few available options (see below). You use a
minus sign to indicate them. They may occur after the program
name and they may also be the rightmost term(s). These are all
identical in action:
mf -a string1 string2 my.fil -c
mf -ac string1 string2 my.fil
mf string1 string2 my.fil -ac
mf string1 string2 my.fil -a -c
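
One way to picture why these four forms are equivalent is the Python sketch below. It is an assumption about the behaviour, not the program's actual parser: options are taken only from the leading and trailing arguments that begin with a minus sign, letters are pooled, and any digits give the window size. The name split_command is hypothetical.

```python
def split_command(args):
    """Split mf-style arguments into (letters, window, strings, filespec).

    Only the arguments at the very start and the very end that begin
    with '-' are treated as options; anything in between is a search
    string, and the last remaining argument is the file name.
    """
    args = list(args)
    option_words = []
    while args and args[0].startswith("-"):
        option_words.append(args.pop(0))
    while args and args[-1].startswith("-"):
        option_words.append(args.pop())

    letters = set()
    window = None
    for word in option_words:
        number = "".join(ch for ch in word if ch.isdigit())
        if number:
            window = int(number)  # e.g. the 5 in -fwac5
        letters.update(ch for ch in word[1:] if ch.isalpha())

    strings = args[:-1]
    filespec = args[-1] if args else None
    return letters, window, strings, filespec


print(split_command(["-a", "string1", "string2", "my.fil", "-c"]))
```

Because a minus sign in the middle of the line is left alone, a search string such as -ing survives as long as it is not the first or last argument.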

Feel free to enter "mf" at the DOS prompt to get help; I do. This
will appear on your screen:

Search options:
a - 'and', all must be present. f - "fuzzy", approximate spelling.
c - case sensitive search. w - match only if a word.
d - Span entire document, if necessary.
Output options:
l - show line numbers. t - to screen and > file.
m - stop after first match. u - Unix (grep) style output.
n - no pause each 24 lines.
Input options:
b - also search binary files. h - strip hi (8) bits.
AND searches using "sliding windows":
2 - 15 Window size (number of lines) for searching and displaying


The input and output options are discussed below in the reference
section. But more needs to be said about the searching options.

Finding text can be frustrating. This program offers several
strategies to aid searches. One might look at it as a TOOLKIT to
allow customized searches for data.

With windows, multiple strings and a fuzzy search, the problem
becomes one of reducing the number of "false" finds.

I can't know the nature of your data, so a little experimentation
on your part is indicated. Let us discuss what I have found to
be effective ways to use MAXFIND on assorted types of data files.
Page - 04

FUZZY SEARCH:

This option allows you to search when you don't know the exact
spelling of the word.

The technique used is inspired by the Soundex algorithm invented
about 70 years ago to search name files. Names that sound alike
should have the same Soundex number. It uses these rules:
- Vowels are ignored.
- Consonants that sound alike in a pronounced name are given
the same "number".
- Successive consonants with the same number are counted as one
( Willitt is equal to Wilith).

All of which is interesting, but you don't have to worry about
computing the numbers - it's done internally by MAXFIND.
Accented vowels (International Characters) are treated like any
other vowel (they are ignored).

You can get some curious results using the fuzzy option. If you
input "herc", it matches "character" and "horse". Notice that,
ignoring vowels and word boundaries, "character" has the embedded
sequence "hrc", just as "herc" does, so it matches. In the case
of "horse", s and c sound alike, so we again have a match.
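
The rules above can be sketched in Python as follows. This is a guess at the idea, not MAXFIND's internal table: the consonant classes are borrowed from classic Soundex, and h, w, and y are treated as ordinary consonants (an assumption made so that the "herc" examples work). The names CLASSES, fuzzy_key, and fuzzy_match are hypothetical.

```python
# Consonant classes borrowed from classic Soundex. Keeping h, w and y
# as consonants in their own right is an assumption for this sketch.
CLASSES = {}
for code, group in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], 1):
    for letter in group:
        CLASSES[letter] = str(code)

CONSONANTS = "bcdfghjklmnpqrstvwxyz"


def fuzzy_key(text):
    """Reduce text to a key: vowels dropped, sound-alikes merged."""
    key = []
    for ch in text.lower():
        if ch not in CONSONANTS:
            continue  # vowels, accented vowels, spaces, punctuation
        code = CLASSES.get(ch, ch)
        if not key or key[-1] != code:  # successive equal codes count once
            key.append(code)
    return "".join(key)


def fuzzy_match(query, text):
    """True if the query's key occurs anywhere in the text's key."""
    return fuzzy_key(query) in fuzzy_key(text)


print(fuzzy_key("herc"), fuzzy_key("character"), fuzzy_key("horse"))
print(fuzzy_match("herc", "character"), fuzzy_match("herc", "horse"))
```

Under these assumptions "Willitt" reduces to w43, while "Wilith" keeps a trailing h (w43h), so a search for the former finds the latter but the two keys are not identical. The real program may treat h differently.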

So using fuzzy search alone will give many false hits. Combining
it with other search options will help a lot:
- Adding the case sensitive option (c) is effective when searching
for names, where the first letter of the name is capitalized:
mf Suzan Somers *.ltr -fc

- Specifying the word option "w" will also reduce false hits.
- Use the AND, "a" option with several words.

WINDOWS:

If you enter a number ( 2 - 15 ), a "window" slides down your
data files seeking matches within this span of lines. Then it
displays the window, adding a "+-+" to separate the data. At
times, you may see a set with fewer lines. Look in the window
just above, to see the text that completes your request.

If you have lines that are longer than 80 characters (e.g., word
processing documents), select the window option by entering a
number to set a window size of two or more lines.
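
The sliding window can be pictured with this Python sketch. It is a simplified illustration, not the program's method: it reports every window of the given size in which all strings occur and makes no attempt to suppress overlapping windows. The name window_hits is hypothetical.

```python
def window_hits(lines, terms, size):
    """Yield (start_index, window_lines) for each run of `size`
    consecutive lines in which every term occurs somewhere."""
    wanted = [t.lower() for t in terms]
    for start in range(max(1, len(lines) - size + 1)):
        window = lines[start:start + size]
        joined = "\n".join(window).lower()
        if all(t in joined for t in wanted):
            yield start, window


text = ["in the course", "of time", "something else"]
for start, window in window_hits(text, ["course", "time"], 2):
    print(start, window)
    print("+-+")
```

With a window of one line the same search finds nothing, because "course" and "time" never share a line.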

Page - 05

SEARCH STRATEGIES BY FILE TYPE:

Name and Address Lists:

This is an easy type of file to search. It is just what the
Soundex creators had in mind. Use fuzzy and AND:
mf name1 surname address.fil -fa

Since names should be words, we can add "w", so use: -faw.
If name is always a capitalized word, add case, "c": -fwac.
If we want to see surrounding lines, or add the State
to our search, add a number to get a window: -fwac5.

Letters:

Letters are short documents. Often we just want to know the DOS
filename of the letter. Document mode "d" is appropriate. This
will just show one matching line for each string we supply. If
you have an editor that will accept more than one file, the tee
(-t) option can be helpful:
mf name1 name2 topic1 topic2 c:letters\*.* -td >mf$$$

The tee option sends a copy of the screen output to a file, in
this case, mf$$$. Then, with a two file editor, edit mf$$$ in one
window, and look at your "hit" files in the other window.
You may want to use the alternate output format (-u) to shorten
the size of mf$$$.

Text files:

Here the important thing would be to search for a set of words and
to show the surrounding context. With a two file editor, tee
would allow us to capture the text for inclusion in another file.
Fuzzy word may also be helpful:
mf string1 string2 string3 *.ref -7atfw >mf$$$

If you then bring the file up in your editor, you can quickly "cut
and paste" any desired text to another document. Or if you simply
want to know the names of documents to edit, use the -d option.


WORD PROCESSING documents:

I have done limited testing using Word Perfect files and made some
adjustments to the program as a result. MAXFIND appears to be
effective at finding text, but "pasting" the found output back into
documents may have limited success. I suggest using a window of
at least two lines and, if necessary, adjusting the input options.

Page - 06


BBS FILES:

Session log and history files are computer generated and have some
very useful regularities. TELIX can produce a usage log and this:
mf elapsed "++ at" connected telix.use >calls

yields a list of When, Who, and How long - useful when looking at
your phone bill.

I looked at my captured Compuserve logs and realized that this:
mf -3a date: subj: csrv\*.log >summary.msg

gives a nice summary of messages showing Who, When, and the subject
of the messages.

Many Bulletin Boards have a file that contains summary descriptions
of the files available for downloading. These files typically
contain 40 characters of text to describe a file. They come from
many authors and often have abbreviations. So they are quite
difficult to search. An easy example:
mf line/word count space.fil -a

to find a program to count words or lines.

Others get harder to find. Here the speed of searching will
permit you to make repeated tries. For example, I knew there was a
program which would list the disk drive table in the AT BIOS, but
I forgot its name. This succeeded:
mf list/display bios/drive/table space.fil -a

I found the file I wanted. In fact, I got nine "hits", and found
another two files that dealt with the same topic.

With ARC files, the file names within the ARC are in ASCII. So
MAXFIND will act as a file finder.


PROGRAM FILES:

Maxfind can be very useful to programmers. Allowing multiple
strings can give an instant cross reference for several labels.
Using the word option (-w), searching for 'eof' will yield 'eof'
but not 'sizeof'. Or get a cross reference on x, y, and z.
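
The word test can be sketched in Python as below, following the rule given in the reference section (the positions before and after the string must not hold a letter or digit). The function name find_word is hypothetical and the code is an illustration only.

```python
def find_word(needle, line):
    """Return the start indexes where needle occurs as a whole word."""
    hay = line.lower()
    word = needle.lower()
    hits = []
    pos = hay.find(word)
    while pos != -1:
        end = pos + len(word)
        before_ok = pos == 0 or not hay[pos - 1].isalnum()
        after_ok = end == len(hay) or not hay[end].isalnum()
        if before_ok and after_ok:
            hits.append(pos)
        pos = hay.find(word, pos + 1)
    return hits


print(find_word("eof", "while (!eof) n = sizeof(buf);"))
```

Only the standalone 'eof' is reported; the one inside 'sizeof' is rejected because a letter precedes it.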

I comment all my function declarations starting with '/*f'. Then
"mf //*f *.c -ul" gives me a "by module" index to the functions
with their line numbers. Or entering:
mf //*f alloc( free( *.c -u
shows me which functions manage memory. Or finding where idx changed:
mf "idx =/idx=/+idx/-idx/idx-/idx+" *.c -u

Page - 07

General Hints:

By design, the program will split long lines into chunks of 80
characters. This makes the program usable with word processing
documents. If you have long lines, AND's may fail, because of
the definition of a line. Use -a2 or -a3 to expand MAXFIND's
scope.
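
To see why an AND can fail on a long line, here is a Python sketch of the line definition. It assumes a chunk ends at CRLF, a lone CR, a lone LF, or after 80 characters with no line ending; what MAXFIND does when a line ending follows an exactly 80-character chunk is not documented, and this sketch simply emits an empty line there. The name split_lines is hypothetical.

```python
def split_lines(data, width=80):
    """Split bytes into "lines" ending at CRLF, CR, LF, or `width` bytes."""
    lines = []
    current = bytearray()
    i = 0
    while i < len(data):
        byte = data[i:i + 1]
        if byte in (b"\r", b"\n"):
            if byte == b"\r" and data[i + 1:i + 2] == b"\n":
                i += 1  # treat CRLF as a single line ending
            lines.append(bytes(current))
            current.clear()
        else:
            current += byte
            if len(current) == width:  # long line: cut a chunk here
                lines.append(bytes(current))
                current.clear()
        i += 1
    if current:
        lines.append(bytes(current))
    return lines


long_line = b"course " + b"x" * 90 + b" time"
print([len(chunk) for chunk in split_lines(long_line)])
```

Here "course" lands in the first chunk and "time" in the second, so a one-line AND would miss the pair while a window of two lines would catch it.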

If you want to search for the /, enter the / twice, the double
slash (//) indicates to MAXFIND that this is not an "or". So if
you want to search for the date 10/12/85, enter 10//12//85.
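
A small Python sketch of the doubled-slash rule: a single slash separates alternatives, while a doubled slash yields one literal slash. The name split_alternatives is hypothetical, and the handling of three or more slashes in a row is a guess.

```python
def split_alternatives(term):
    """Split a search term on single '/' and turn '//' into a literal '/'."""
    parts = [""]
    i = 0
    while i < len(term):
        if term[i] == "/":
            if term[i + 1:i + 2] == "/":
                parts[-1] += "/"  # doubled slash: literal '/'
                i += 2
                continue
            parts.append("")      # single slash: start a new alternative
        else:
            parts[-1] += term[i]
        i += 1
    return parts


print(split_alternatives("10//12//85"))
print(split_alternatives("bob/robert/robt"))
```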

You could use -a2 to search for a phrase that starts on one line
and completes on the next. To find "in the course of time" use:
mf -a2 course time my.fil

To search for a string that starts with a minus sign, to avoid
having the program confuse it with an option, use an unusual
first string:
mf zzz -ing my.fil

If you decide to use BATCH files, you should be aware of one quirk of
DOS's SEMI-intelligent nature. At least with DOS 3.x, there are a few
characters it will delete from your input - they will NOT be seen by
MaxFind (or any program called via a BATCH file). They are: comma,
semicolon, TAB, and = sign.

If you are a user of DISK NAVIGATOR (another of my shareware
products) you may be interested in these macros to speed your
work:

phon mf address.fil -a4f
Key in 'phon', press ENTER, key in first and last name,
and press F4.

doc mf ^DP\*.* -wd
Key in 'doc', press ENTER, key in a set of words, and press
F4. All documents in the indicated directory that contain
the set of words will be shown.

Disk Navigator (a DOS utility Shell) is on CompuServe, PC-Sig
and other places as DNAV14.ARC. See also COMPUTE magazine's
disk for January 1989.


Page - 08

Reference Section:

Options:

a - 'and', all must be present.
All the strings on the command line must be present.
You can use the / symbol to get and/or combinations.

b - also search binary files.
MAXFIND normally bypasses files that don't appear to be
text files. If you want it to search all specified files,
use this option. You may have to use this option to
search some word processing document files. If your input
filename is an ARC, COM, EXE, or BIN file, binary is
assumed and this switch need not be entered.

c - case sensitive search.
The default is to ignore case (upper/lower) for letters.
Use this option to restrict your output. If you combine
this with the fuzzy option, case is checked only on the
first letter of your input word.

d - Span entire document, if necessary.
This option is particularly useful if you want to scan
many documents for the presence of a set of words. It
uses an AND search; when all search criteria have been
met, the last occurrence of each string will be shown.
In effect, the size of the window is the size of the
document, but only the "hits" will be shown.

f - fuzzy search, accept approximate spelling.
Find even if the words are spelled differently. Generally,
you should use this with the 'word' option, i.e., '-fw'.

h - strip hi (8) bits.
Enter this option if you want the hi bit stripped before
comparison and output. Use this option if you are
scanning word processor documents (Wordstar, and perhaps
others).

l - show line numbers.
Use this to show line numbers on each output line. For a "normal"
ASCII file this should agree with the lines in your document.
MAXFIND advances this count when it detects a CRLF, CR alone,
or LF alone, or when 80 characters have passed without any of
these.

m - stop after first match.
If you just want to find files that contain the strings,
use this option. MAXFIND will show the one line of the
file that satisfies the search criteria for each string
of your request.
Page - 09

n - no page pause.
Normally Maxfind pauses every 25 lines to let you scan
the screen. This option delivers lines continuously,
without the need to press a key at each screenful.

t - Sends output to both the screen and a redirected ( > ) file.
When you add redirection (e.g. >myfile ) to the command
line, DOS sends what would normally have gone to the
screen to a file and there would be little or no screen
output. This option allows you to capture the results to
a file and see the action on the screen.

u - Unix (grep) style output.
This produces less screen output. Each line will be
prefixed with the name of the file containing the string.
The messages for the file being searched, i.e., "+- filename>",
will not appear. Nor will the "hex file skipped" message
appear.

w - match only if a word.
A string will match only if the preceding and following
positions do not contain an alpha or numeric character.
If used with the fuzzy (-f), only the left edge of the
word is checked.

a number (from 2 thru 15)
If you enter a number, MAXFIND will do two things:
- It will search for a match within a span of lines, that
is, all arguments need not occur on the same line.
- When it finds a match, MAXFIND will show it centered in the
number of lines you have specified. The set of lines
will be followed by this line: "+-+".
The text is displayed with little added "ornamentation"
and may not be clear at first glance. Use the line
numbers option (see above) for a while to gain
familiarity with the style.


Page - 10

What's New with MAXFIND version 2.3:

Maintenance release:
Fix occasional problem of not finding words where []\_
was the second character of the word.

What's New with MAXFIND version 2.2:

Maintenance release:
Improved handling of "long lines" (e.g., dBase and Wordstar files).
Corrected rare problem when ~ wildcard or accented vowels are in
the search string.

What's New with MAXFIND version 2.1:

Still more speed:
Simple searches are 30% faster. (100k bytes per second at 8 MHz)
Complex searches are 50% faster. ( 75k bytes per second at 8 MHz)

What's New with MAXFIND version 2.0:

For the common case, searching for one word within one line,
the program is 40% faster. (more than 75k bytes per second at 8 MHz)
Expanded discussion of usage with BBS files.

What's New with MAXFIND version 1.1:

Bug fixed: Rarely, on large files the program would "stall".
For the European user, accented vowels (International Characters)
are handled properly.

What's New with MAXFIND version 1.0:

The program name has been changed. It was SPFIND (SPFND4.ARC).
Fuzzy search for approximate spelling searches.

Files in the distribution:
MF.EXE The program.
MAXFIND.DOC Program documentation.
TXT.BAT A prototype batch file.

Several batch files discussed above are not distributed as batch files.
I suggest you make some like these with one or two active lines:
NAMES.BAT : mf %1 %2 %3 %4 %5 %6 %7 -a4fwc
TXT.BAT   : mf %1 %2 %3 %4 %5 %6 %7 -aw5 >mf$$$.dat
            edit mf$$$.dat          ; use your editor name
LETTER.BAT: mf %1 %2 %3 %4 %5 %6 %7 -fawd

WARRANTY.

MAXFIND is distributed on an "AS IS" basis without warranty,
expressed or implied. Considerable testing effort has been
expended, but the user is advised to check the program's
suitability before relying on it. The user assumes full risk as
to the results of using this program. Any liability of the
author will be limited exclusively to product replacement. In no
event shall the author be liable for any consequential damages
arising from the use, or inability to use this program.
Page - 11

LICENSE.

MAXFIND is copyrighted software that is being distributed as
shareware. It is NOT in the public domain. By using or
distributing this package, you agree to the conditions presented
herein.

You may use MAXFIND for your own personal use. If you find it
useful, you are requested to pay a Registration fee of $15. You
may use the program on multiple machines. Where there is the
potential for use on multiple machines at the same time, pay for
additional copies.

If you are using MAXFIND in a commercial, professional,
educational, or governmental organization, you are granted a
limited license, valid for thirty days, to use this package for
evaluation purposes; if you continue to use this package, you
must pay the registration fee. Operators of bulletin board
systems that offer public domain programs are exempted from
payment.



HOW TO PAY.

The price has been made attractive to encourage users of the
program to send payment. For each copy in use send:

                         5 or more    20 or more

All users:      $15         $10          $7 each

Registered users will be notified by letter of updates to the
program. Add $6 if you wish to receive a disk with the latest
version. Non-U.S. registrants should send $8 for the latest
disk; they may also use their local bank, with the check in
their native currency. Please make an appropriate adjustment
for exchange rate.

If you look at the size of the program and decide it isn't
large enough to pay for, think of its competition:
ZyINDEX $99 to $695
GoFer $89
Magellan $179.
MaxFind is an original piece of work with considerable research
and development behind it. It deserves support!
Page - 12

The idea of shareware with its low cost distribution of quality
programs is an American Treasure. Individuals with good ideas can
afford to implement them. The authors are talented people who may
forgo salary to implement their ideas. They are making a bet that
their efforts will be accepted and that users will respond. A
survey has indicated that a very low percentage of users supply
support.

This is no way to keep the concept alive!

If MAXFIND does not fit your needs, please stop and think about what
shareware packages you do find useful. Support those that you use
regularly.

Send remittance to:
Stan Peters
P. O. Box 2028
Fairfield, Iowa 52556

I will check regularly for messages to me on MAXFIND or DISK NAVIGATOR on:
Compuserve, my id: 76525,1601
San Francisco area BBS, my id: Stan Peters
Space, Mountain View 415 969 0214

DISTRIBUTION.

You may freely copy this program for friends so long as the three
files are included unmodified. Non-profit user groups and bulletin
boards may also include it in their libraries.

For-profit organizations may distribute it provided there is a
PROMINENT statement urging users to support the user supported
concept. This should be in a brief index type (READ.ME?) file that
the user accesses to discover the contents of the disk. If in
doubt, write me, showing me how you get the point across to the
purchaser.

In no case may the cost per disk exceed $6.50. It is OK to put
MaxFind on a public domain or Shareware diskette that contains
primarily textual material such as the Bible, a sports database,
or other reference material so long as the three files are
included unmodified and it is clear that the user has not paid for
MaxFind.

Page - 13




REGISTRATION AND ORDER FORM

Stan Peters
P. O. Box 2028
Fairfield, Iowa 52556
---------------------------------------------------------------
                                          PRICE      PRICE
PRODUCT                          QTY      EACH      EXTENDED
-----------------------------    ---     ------     --------
MAXFIND v. 2.2                   ___     $15.00     $_______


For Registered users:

Disk with current version        ___      $6.00     $_______


Purchase Order (not prepaid)              $5.00     $_______


SUBTOTAL                                            $_______


Iowa, add sales tax (6%)                            $_______


TOTAL                                               $_______


Name: _________________________________Phone:________________

Address:_____________________________________________________

Address: ____________________________________________________

City, State, Zip: ___________________________________________



Where did you find the program?


--------------------------------------------------------------

Any Suggestions?


--------------------------------------------------------------


