Contents of the INSID123.DOC file
Inside 1-2-3 Worksheet Files
(PC Tech Journal October 1984 by W. F. Sharpe)
For some tasks, 123 requires substantial effort and has an
excessive "run time." It is difficult to do multiple regression,
quadratic programming, and matrix inversion with 123; these tasks are
more easily performed with other programs.
Even for such procedures, however, 123 is ideal for preparing
input, transforming variables in various ways, and providing graphs of
output. Its good points can be used to advantage and its drawbacks
alleviated if 123 is used for preparation of input for a "foreign"
program and for analysis of its output.
A typical procedure involves five steps:
1. Prepare input using 123, saving the input as a 123
2. Use the Lotus Translate facility to convert the 123
worksheet file to a DIF file.
3. Run the foreign program, which takes its input from
the DIF input file and sends its output to another DIF file.
4. Use the Lotus Translate facility to convert the DIF
output file to a 123 worksheet file.
5. Load the output worksheet file into 123 for further
analysis, plotting, etc.
This is not simple. Moreover, since most of the information is
numeric, a great deal of translation is involved. 123 stores numeric
data in binary format. In the DIF format, numbers are stored as ASCII
characters. Thus step two translates binary numbers to ASCII character
strings. The foreign program must retranslate the character strings to
binary numbers, do its work, then translate its numeric output to
character strings to be written in the DIF format. Step four involves
further translation -- from character strings to binary numbers. For
large problems, such translation can require minutes of processing.
There is a better way. An analytic program can read input directly
from a 123 worksheet file and prepare another 123 worksheet file
containing output. This obviates steps two and four, speeds up the
process and allows more information to be transferred between 123 and
the "foreign" program.
To write such programs, of course, it is necessary to understand
the format of 123 worksheet files. Lotus does not currently provide
this information, but it can be deduced by examining enough examples of
files created with 123.
The following information is believed to be accurate for Version
1A of Lotus 123 for the IBM PC. A BASIC program for reading and
printing data from a file is used for illustration.
Worksheet files are composed of a series of records of variable
lengths. Each record begins with a header consisting of a pair of
two-byte integers that indicates the record type and length. The
record length indicates the number of bytes in the remainder of the
record. There are many types of records. For reading and writing
files of data, the following suffice:
Record type 0: Header
Record type 6: Range
Record type 13: Integer value
Record type 14: Double-precision value
Record type 15: Character string (Label)
Record type 16: Formula and current value
Record type 1: End-of-worksheet
The INSIDE.BAS program reads and prints the contents of a worksheet
To process a worksheet involves: opening the file and checking for
a valid header; reading and processing records; and stopping when the
end-of-worksheet record has been read. Lines 40 through 100 of the
program provide a "main routine" to do this. Two routines are called
-- one (line 120) to open the file, and another (line 210) to read and
process a record.
Worksheet files may contain anything -- including a Control-Z
(ASCII 26) character. Since this signals the end of a standard text
file, "random" file access must be used, even for serial processing.
In BASIC, random files may have no more than 32,767 records. Since
worksheet files can be long, a relatively large record size is
The sample program uses records of 128 characters. Portions of
the file are read into a character string called BUFFER$, as needed.
The procedure to do this extends from lines 120 through 190. Lines
130-160 receive the name of the file from the user, append .WKS, open
it as a random file with a record length of 128 and assign all input to
a 128-character string named BUFFER$. Line 170 reads the first 128
characters into BUFFER$. Line 180 sets POINTER% to indicate that the
next available byte is the first character in BUFFER$. Line 190
returns to the calling procedure (in this case, the main routine).
The workhorse procedure begins at line 210. It reads a record and
arranges for appropriate processing. Line 220 calls a routine to read
the first portion of the next record. This routine (beginning at line
370) sets RECORDTYPE% to the type of record and RECORDLENGTH% to the
length of the remainder of the records. If the record is one of the
seven types listed in lines 240 through 300, control is passed to the
appropriate routine (which will RETURN directly when through). If the
record is another type, the remaining bytes are read but ignored. This
is done with a routine (beginning at line 480) designed to get the next
byte from the file.
The basic strategy for getting a byte from the files uses a
pointer, indicating the position in BUFFER$ in which the desired byte
is stored. This is incremented by one after each byte is used. If the
resulting value exceeds 128, the next 128 characters are read into
BUFFER$, and the pointer is reset to 1.
Lines 480 through 550 do the work. The next byte is copied into
BYTE$ as a character and the pointer is incremented so that it
indicates the next character in BUFFER$. If necessary, a new set of
bytes is read into BUFFER$ and POINTER$ is reset. Line 490 saves the
previous byte for possible further processing.
As indicated earlier, each record starts with a header -- two
bytes indicating the type of record, followed by two bytes indicating
the length of the remainder of the record. These values are to be
stored in two variables: RECORDTYPE% and RECORDLENGTH%. The requisite
procedure begins at line 370. In each case, two bytes are obtained
(using the procedure at line 480). They are combined to form a string
(called PREVIOUSBYTE$+BYTE$) that then is converted to an integer value
by the CVI function.
Processing the header (type 0) record is not strictly necessary.
However, checking its validity is wise. It also should be the first
record on the file. A valid header record has a length of two bytes.
Version 1A writes an ASCII 4 in each one. All three conditions are
checked in the procedure that runs from line 1010 to 1100.
Date are stored in four types of records: integer value records
(type 13), double-precision value records (type 14), character string
records (type 15), and formula records (type 16). One such record
exists for each occupied cell in the worksheet. No information is
stored for empty cells.
Each data record includes information about location of the cell,
including both row and column numbers. These row and column numbers
begin with zero; thus, row number 3 is displayed as row 4, column number
0 is displayed as column A, etc. Cells are stored in order, beginning
with the upper left portion of the worksheet, proceeding from left to
right within rows and from the top row to the bottom.
When a portion of a worksheet is saved (for example, with a /File
Extract Value instruction), cell locations are converted to those that
would have been applicable if the portion had been in the upper left
corner of the worksheet. For ease of interpretation, a cell's location
can be printed in "external" terms -- that is, as it would be
displayed. Lines 680-760 contain a procedure designed to show the
location of a cell in ROW% and COLUMN%.
In each of the four types of data records -- 13, 14, 15, and 16 --
the first byte indicates the format to be used when displaying the
contents of the cell, and the next four bytes indicate the column and
row numbers in which the cell is located. The procedure between lines
570 and 660 processes the first five bytes of a data record. After
this has been run, the procedure beginning at line 680 can be used to
print the row and column in "display" terms.
The simplest data record is type 13, which contains an integer
value stored in two bytes. The location and contents of such records
are obtained and printed with the procedure between lines 1340 and 1410.
Numeric data not stored as two-byte integers are stored as eight-
byte (double-precision) floating-point numbers, such as are present in
records of type 14.
Unfortunately, double-precision numbers are stored in one format
by 123 and in another by BASIC. All double-precision numbers in
worksheet files are stored in the format used by the Intel 8087
coprocessor. The eight bytes representing such values are stored
"backwards" on the file. To be used in BASIC programs, such values
must be converted. Some compilers for other languages (for example,
Pascal and C) do not suffer from this drawback.
It is possible to call up an assembly-language routine designed to
perform such a conversion from a BASIC program. Although inelegant,
this procedure to convert the double-precision value is used beginning
at line 780. Lines 790 through 810 test to see if the value is not
available (displayed by 123 as NA). In worksheet files such values are
indicated by an ASCII 255 byte followed by an ASCII 240 byte. The
remaining bytes have values of zero but the first two suffice for
purposes of identification. Line 820 and 830 test for a zero value.
This is indicated by zero values for all eight bytes, but inspection of
the first two suffices. The remainder of the procedure performs the
Double-precision value records are processed beginning at line
1430. Lines 1460 through 1490 get the eight bytes containing the
value, putting the numeric values of the bytes in vector BYT%. The
bytes are put in "backwards" -- the value of the first BYT%(8), the
second in BYT%(7), etc.
When vectors BYT%(1) through BYT%(8) are ready, the conversion
procedure is called. The result is returned as DOUBLE# (a double-
precision number), unless ISNA% equals 1 in which case the cell
contains an NA. Line 1510 prints either the value or NA.
Character string records (type 15) are much simpler than double-
precision value records. The character string follows the cell format
and location in the data record. The first character of the string
indicates the positioning for display purposes (an apostrophe for left
justification, a quotation mark for right justification, a caret for
centering, etc.). The last character is an ASCII 0 (null). The
requisite procedure runs from line 1540 through line 1630.
Since five bytes are used for the record format, row, and column,
the string is stored in the remaining (RECORDLENGTH%-5) bytes. These
are assembled into a CHARSTRING$, which is then printed.
Much of the power of 123 comes from its use of formulas, which
indicate how the value in a cell is to be calculated. A formula record
(type 16) is used for any cell with such a formula. Fortunately, the
last calculated value of the formula (the number that is displayed) is
A formula record is, in effect, a double-precision value record
plus a series of bytes containing a formula. With some minor
modifications, type 16 records therefore can be processed in a manner
that is similar to that used for type 14 records. A procedure to do
this can be found in lines 1650 through 1780.
Lines 1660 through 1730 correspond to lines 1440 through 1510 in
the procedure for double-precision values. Lines 1740 through 1770
take care of the formula information. The format, cell location and
last computed value require 13 bytes, so the formula is stored in the
remaining (RECORDLENGTH%-13) bytes, which are simply read and ignored.
If desired, this procedure could also be used for double-precision
values, since type 14 records have a record length of 13, and the FOR
loop in lines 1750-1770 will not execute in such cases.
When a worksheet (or portion of one) is saved, 123 writes a record
indicating the range from which the data came. The record (type 6)
indicates the column and row at the upper left corner of the range in
which data were contained. Each value is stored as a two-byte integer.
For files saved with a /FS command, the "from" row and column are
both zero, and the "to" row and column indicate the lower right corner
of the area containing data. For portions of files extracted with a
/FX command, the values indicate the range that the area containing
data occupied in the original worksheet. Since cell locations are
converted to a "base" of 0,0 when a portion of a worksheet is
extracted, the "from" values must be subtracted from the "to" values
for consistency. The resulting locations are printed by lines
The last record in a valid worksheet files if of type 1 and has a
length of zero. The procedure between lines 1800 and 1820 processes
it. The program to read and print the essential information from a
worksheet file is now complete. Incorporating these routines in other
programs is relatively simple, making it possible to do any type of
processing with a worksheet file.
To create a new worksheet file the process must be reversed.
A valid header record must begin the new file, and a valid end-of-
worksheet record must end it. Including a range record after the
header and before the data records is desirable (but not necessary).
If possible, data records should be stored in order (left-to-right
within rows, top-to-bottom by rows).
Files saved with 123 typically include a great deal of additional
information. To print the contents of the records containing such
information, add these lines:
305 PRINT"Record Type";RECORDTYPE%
335 PRINT ASC(BYTE$);
These statements will print the type of each such record, followed by
the numeric values of the bytes in it.