Contents of the CMP.DOC file
DOCUMENTATION FOR CMP.COM version 1.0 Sept. 11, 1983
BY Jack Gersbach
CMP.COM is a file comparison utility program that has the power
to perceive added and deleted sections of a file as well as changes.
The files to be compared may be specified on the same line with the
calling command. If they are not included, you will be prompted for
the file specifications. This gives you a chance to change disks before
the compare operation begins. If the file specifications are included
on the command line, it is assumed that the disk that contains CMP.COM
is conveniently mounted and you will not get the question "Compare
more files ?" before exiting to DOS.
The first specification must have a file name.
If a drive is not specified, the default drive is assumed. If the second
specification does not contain a file name, it is assumed to be the
same name as the first. If there is no second specification, the default
drive is assumed. If the file specifications are the same or if the
second defaults to the first, then the file will be compared to itself
and a warning message is issued. This is a handy way to check the
integrity of the disk data.
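The defaulting rules above can be illustrated with a short Python sketch.
It is not taken from CMP.COM; the names split_spec and resolve_specs and
the default_drive argument are hypothetical and exist only for this
illustration.

```python
def split_spec(spec):
    """Split a spec such as 'B:NAME.EXT' into (drive, name); either may be ''."""
    spec = spec.strip().upper()
    if len(spec) >= 2 and spec[1] == ":":
        return spec[0], spec[2:]
    return "", spec


def resolve_specs(first, second="", default_drive="A"):
    """Apply the defaulting rules and report whether a file meets itself."""
    drive1, name1 = split_spec(first)
    if not name1:
        raise ValueError("the first specification must have a file name")
    drive2, name2 = split_spec(second)
    # A missing drive falls back to the default drive.
    drive1 = drive1 or default_drive
    drive2 = drive2 or default_drive
    # A missing second file name falls back to the first file name.
    name2 = name2 or name1
    spec1 = f"{drive1}:{name1}"
    spec2 = f"{drive2}:{name2}"
    return spec1, spec2, spec1 == spec2
```

Under these assumptions, resolve_specs("FILE1.COM", "B:") pairs A:FILE1.COM
with B:FILE1.COM, while resolve_specs("FILE1.COM") pairs the file with
itself, which is the case that triggers the warning message.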
The output is displayed in reverse chronological order.
Data unique to the most recent version of the file is listed first and
labeled "Ins". Following that, data unique to the older version is
displayed and labeled "Del". There may be only 1 file with unique data.
If neither file has unique data, then they are identical.
The display shows the offset within the file of the leftmost byte
in hexadecimal notation. This is always aligned to a paragraph
boundary making the low nibble zero. All data shown in the hex area
is unique. A double period is displayed if the byte is not unique to
the file. The dash separates bytes 7 and 8.
The ASCII representation appears on the right. A dash is displayed if
the byte is not unique. A period is displayed for non-ASCII characters.
Data unique to A:FILE1.COM
Ins:00030 .. .. .. 4E 65 77 .. ..-.. .. .. .. .. .. .. .. [---New----------]
Ins:00140 .. .. .. .. 69 .. .. ..-.. BA .. .. 43 .. .. .. [----i----.--C---]
Data unique to B:FILE1.COM
Del:00140 .. 61 .. .. .. .. 58 ..-.. 00 .. .. .. .. .. .. [-a----X--.------]
In the above example the format is independent of the order in
which the file names were entered. The file on drive "A" is listed
first because it is the more recent version. Data unique to the earlier file on drive "B"
was evidently "Deleted" and is labeled accordingly. The changes would
be perceived by the person who made them to be the added word "New"
and later in the file, 3 bytes whose values have been changed but which occupy
the same positions relative to the data surrounding them.
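The row layout used in the example can be sketched in Python as follows.
This is an illustration only, not CMP.COM's own code. The function name
format_row is hypothetical, and treating bytes 20h through 7Eh as the
printable ASCII range is an assumption.

```python
def format_row(label, offset, data, unique):
    """Format one 16-byte row in the style of the sample output.

    label  -- "Ins" or "Del"
    offset -- paragraph-aligned file offset of the leftmost byte
    data   -- the 16 byte values of the row
    unique -- 16 booleans, True where the byte is unique to this file
    """
    if offset % 16 != 0 or len(data) != 16 or len(unique) != 16:
        raise ValueError("a row is 16 bytes on a paragraph boundary")
    # Hex area: the byte value if unique, otherwise a double period.
    cells = [f"{b:02X}" if u else ".." for b, u in zip(data, unique)]
    # The dash separates bytes 7 and 8.
    hex_area = " ".join(cells[:8]) + "-" + " ".join(cells[8:])
    # ASCII area: dash if not unique, period if not printable (assumed range).
    text = "".join(
        "-" if not u else (chr(b) if 0x20 <= b <= 0x7E else ".")
        for b, u in zip(data, unique)
    )
    return f"{label}:{offset:05X} {hex_area} [{text}]"
```

Calling format_row with the three bytes 4E 65 77 marked unique at
positions 3 to 5 of the row at offset 30h yields the first "Ins" row of
the example.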
Compare does a reasonably good job of deducing differences between
files that have just a few changes. Some difficulty arises when there
is a high difference density. This is especially true when a few
changed bytes are intermixed with added or deleted data. In these cases
it may be difficult to interpret CMP.COM's output, but at a minimum it
will let you know that there is a difference of some kind.
The algorithm implemented in CMP.COM involves 2 basic parameters.
a. The maximum scan range or number of bytes to be scanned and
b. The minimum match length that is considered to be a valid compare.
The default values are 256 and 16, respectively.
The procedure is as follows:
1. Scan ahead for matching bytes on the assumption that no
data has been inserted or deleted until matching bytes are
encountered. Save the distance to the matching bytes. The
maximum range is scanned.
2. Scan ahead for matching bytes on the assumption that data
has been added to the later file. Save the distance to the
matching bytes. The scan range in the earlier file is
twice the minimum match length. The later file is scanned
to the max range.
3. Repeat step 2 with the roles of the files reversed.
4. If steps 2 and 3 did not produce a match, repeat them using
the maximum scan range in both files.
5. Select the minimum of the above 3 lengths and report all
bytes within that distance to be unique to the appropriate
files. In case 1 the data in both files is reported. In case
2 data in the later file is reported and in case 3 data in
the earlier file is reported. If no match was found, report
16 bytes of both files as unique.
6. Load more file data from disk to memory, if appropriate,
and go back to step 1.
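The steps above can be approximated by the following Python sketch. It is
a simplification written for illustration, not CMP.COM's actual code: it
works on two byte strings already in memory (so step 6 is omitted), it
collapses the narrow first pass of step 2 and the retry of step 4 into a
single search anchored at the current pointer of the other file, and it
assumes that a run shorter than the minimum match length still counts as a
match when both files end together. The names same_run, first_match and
compare are hypothetical.

```python
def same_run(a, i, b, j, n):
    """True if n bytes match at a[i:] and b[j:].

    Assumption: a shorter run also counts if both files end together.
    """
    m = min(n, len(a) - i, len(b) - j)
    if m < 0:
        return False
    full = m == n or (i + m == len(a) and j + m == len(b))
    return full and a[i:i + m] == b[j:j + m]


def first_match(a, i, b, j, step_a, step_b, max_range, n):
    """Smallest distance d in 1..max_range at which a valid match starts."""
    for d in range(1, max_range + 1):
        if same_run(a, i + d * step_a, b, j + d * step_b, n):
            return d
    return None


def compare(old, new, max_range=256, min_match=16):
    """Return (deleted, inserted): offsets unique to old and to new."""
    deleted, inserted = [], []
    i = j = 0
    while i < len(old) and j < len(new):
        if old[i] == new[j]:  # the files agree here, move both pointers
            i, j = i + 1, j + 1
            continue
        # Steps 1 to 3: changed bytes, data added to the later file,
        # data deleted from the earlier file (pointer steps per case).
        found = []
        for step_old, step_new in ((1, 1), (0, 1), (1, 0)):
            d = first_match(old, i, new, j, step_old, step_new,
                            max_range, min_match)
            if d is not None:
                found.append((d, d * step_old, d * step_new))
        # Step 5: shortest distance wins; with no match, 16 bytes of both.
        if found:
            _, skip_old, skip_new = min(found, key=lambda t: t[0])
        else:
            skip_old = skip_new = 16
        end_old = min(i + skip_old, len(old))
        end_new = min(j + skip_new, len(new))
        deleted.extend(range(i, end_old))
        inserted.extend(range(j, end_new))
        i, j = end_old, end_new  # the reported data is bypassed
    # Whatever is left in either file has no counterpart in the other.
    deleted.extend(range(i, len(old)))
    inserted.extend(range(j, len(new)))
    return deleted, inserted
```

With small parameters for readability, compare(b"The quick brown fox jumps
over the lazy dog", b"The quick New brown fox jumps over the lazy dog",
max_range=32, min_match=4) reports offsets 10 to 13 of the later file as
inserted and nothing as deleted.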
As data is reported, it is bypassed by the scan pointers and the scan
resumes at the new pointer positions. The data then need not be at the
same offset in each file to produce a match, since the pointers can move
independently. For difficult files, i.e. files with many insertions,
deletions and/or changes, the maximum range and minimum match length
parameters may be modified by the user but only after the prompt message
appears on the screen. Hence the file names must not be included on the
command line that calls CMP.COM. Both parameters must be specified.
The syntax is:
/max scan range/min match length [,filespec]
Any combination of numbers may be used but the max scan range must be
more than twice the min match length and the range cannot exceed half the
available memory, i.e. the memory space allocated to one of the files.
Otherwise the "improper number(s)" message appears and the parameters
are not accepted.
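The acceptance rule for the two parameters can be sketched in Python as
shown here. This is an illustration under stated assumptions, not the
program's own parser: the numbers are assumed to be decimal, the function
name parse_parameters is hypothetical, and available_memory stands in for
whatever amount of memory CMP.COM finds at run time.

```python
import re


def parse_parameters(line, available_memory):
    """Parse '/max/min [,filespec ...]'.

    Returns (max_range, min_match, filespecs). The first two are None
    when the line carries no new parameters.
    """
    # Fields may be separated by tab, space or comma.
    fields = [f for f in re.split(r"[ \t,]+", line.strip()) if f]
    if not fields or not fields[0].startswith("/"):
        return None, None, fields
    # Both parameters must be specified (decimal digits assumed).
    m = re.fullmatch(r"/(\d+)/(\d+)", fields[0])
    if m is None:
        raise ValueError("improper number(s)")
    max_range, min_match = int(m.group(1)), int(m.group(2))
    # The range must be more than twice the match length and must fit
    # in the half of memory allocated to one file.
    if max_range <= 2 * min_match or max_range > available_memory // 2:
        raise ValueError("improper number(s)")
    return max_range, min_match, fields[1:]
```

For example, with a hypothetical 65536 bytes of available memory,
"/512/32 A:FILE1.TXT B:" is accepted, while "/32/16" is rejected because
32 is not more than twice 16.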
Examples of commands entered while still under DOS's control.
CMP FILE1.TXT FILE2.TEXT
Examples of commands entered while under CMP.COM's control.
Enter the files to be compared or new parameters
This is the prompt message to type in the file specifications or
new parameters for the scan algorithm. They must be on the same
line separated by tab, space or comma. The parameters, if present,
must precede the file names. If file specifications have been
entered previously, it is only necessary to press enter to repeat
the comparison.
Compare more files (Y/N) ?
This message only appears if CMP.COM was called without trailing
parameters specifying the files. The assumption is made that the
disk containing CMP.COM was removed from the drive.
If another compare operation is desired, enter "Y", otherwise
enter "N" and control is returned to DOS (the command processor).
Comparing file to itself
The file specifications refer to the same file. It is being compared
to itself to verify the integrity of the disk data.
Files are identical
The files contain identical data and are the same length. The date
and time might be different.
FILESPEC is empty
There is no data in the indicated file.
FILESPEC not found
The indicated file could not be found on the specified drive.
Invalid drive or file specification
DOS found the drive, file name or file extension to be invalid.
Improper number(s)
The max range was either less than or equal to twice the min match
length, or greater than half the available memory space, or only one
parameter was specified.