Category : Assembly Language Source Code
Archive   : LZ.ZIP
Filename : LZ.TXT
Limpel-Ziv algorithm.
The compression algorithm builds a string translation table that maps
substrings from the input stream into fixed-length codes. These codes
are used by the decompression algorithm to rebuild the compressor's table
and reconstruct the original data stream. In it's simplest form, the
algorithm can be stated as:
"If
The following comments were paraphrased from the VMS LZCOMP program
sources. The algorithm is from the article "A Technique for High
Performance Data Compression" by Terry A. Welch which appeared in IEEE
Computer Volume 17, Number 6 (June 1984), pp 8-19.
Compress:
1) Initialize table to single character strings.
2) Read first character. Set
character.
3) Read next character, k.
4) If at end of file, output code
5) If
6) Output code
7) Put
8) Set
character k is read in, the table is searched for
combination is found,
next character is read in. Otherwise, this combination is added to the
table, the code
The decompression algorithm builds an identical table by parsing each
received code into a prefix string and extension character. The
extension character is pushed onto the stack and the prefix string
translated again until it is a single character. This completes the
decompression. The decompressed code is then output by popping the stack
and a new entry is made in the table.
The algorithm breaks when the input stream contains the sequence
k
sequence is handled in step 3 below.
Decompress:
1) Read first input code, assign to
finchar. Output k.
2) Read next code
3) If
a) Push finchar onto stack.
b) Set
4) If
a) Push
b) Set
c) Goto step 4.
5) Set oldcode and finchar to
6) While characters on stack pop stack and output character.
7) Add
8) Set oldcode to incode.
9) Goto step 2.
The algorithm used here has one additional feature. The output codes are
variable length. They start at nine bits. Once the number of codes
exceeds the current code size, the number of bits in the code is
increased. When the table is completely full, a clear code is
transmitted for the decompressor and the table is reinitialized. This
program uses a maximum code size of 12 bits for a total of 4096 codes.
The decompressor realizes that the code size is changing when it's table
size reaches the maximum for the current code size. At this point, the
code size is increased. Remember that the decompressor's table is
identical to the compressor's table at any point in the original data
stream.
My sources of reference while implementing these programs were the
following:
Bernie Eiben, DEC
Unix (tm) COMPRESS program sources
VMS LZCOMP/LZDCMP program sources (Martin Minow, DEC)
ARC file archiver sources (Thom Henderson, System Enhancement
Associates)
Very nice! Thank you for this wonderful archive. I wonder why I found it only now. Long live the BBS file archives!
This is so awesome! 😀 I’d be cool if you could download an entire archive of this at once, though.
But one thing that puzzles me is the “mtswslnkmcjklsdlsbdmMICROSOFT” string. There is an article about it here. It is definitely worth a read: http://www.os2museum.com/wp/mtswslnk/