Dec 07 2017
 
LZH file compression, TP5.0+ source code. Well commented.
File TPLZH.ZIP from The Programmer’s Corner in
Category Pascal Source Code
File Name      File Size   Zip Size   Zip Type
LZ.PAS              2679        920   deflated
LZH.PAS             2970       1348   deflated
LZHASM.OBJ          4194       2336   deflated
LZHASMC.OBJ         4194       2336   deflated
LZO.PAS             3513       1366   deflated
TESTLZO.PAS         2589        827   deflated
TPLZH.DOC          16437       5764   deflated
TPLZH.HST           2716       1025   deflated


Contents of the TPLZH.DOC file


Merry Christmas everybody! (Almost)
***********************************************************************
Latest Huffman hacker: Joe Jared of 1:125/[email protected]
File request name : TPHUF or TPLZH (Magic names only please)
Data : (510)834-6906, V.32bis/HST, 8n1

Description : (V0.19) Huffman Compression Engine w/TP {$O+} interface.

implies that there is a known bug
implies that there are no known bugs.
Downloadable name : TPLZH@##.SQZ where @ is the major release number and
## the minor release. The Squeeze-it compression
utility is available by file request as SQZ.
(Or download as SQZ*.EXE)

Topics:
- HELP WANTED
- What's new
- Plans for the fidonet nodelist.
- How to use this utility in your programs
- Legalities and distribution
- Version history: Refer to TPLZH.HST


***********************************************************************
- HELP WANTED

If you are fluent in a language other than Pascal, please send
a working unit that would make this engine portable to your language.
LZHASM.OBJ is a standard Microsoft assembler unit, and if necessary,
I can compile other variations to make it compatible with your
preferred language. There are now 2 versions of the assembler object
file: LZHASM.OBJ uses Pascal calls, and LZHASMC.OBJ uses C calls.


Also, if someone would like to clean up documentation, that would be
appreciated too. I r a programmer, not a speller.



***********************************************************************


-New in current release:

The following is a runtime comparison between TPLZH and the
all pascal version. Basis of comparison is RA.DOC being
compressed by the 100% assembler version, which allocates
a good chunk of itself on the heap, vs. the All pascal version,
which only places its I/O buffers on the heap. (306k of text)

These numbers were based on:

(FinishtimePas - StarttimePas)/(FinishTimeAsm - StartTimeAsm)
Accuracy (+- 0.5%)
OS/2 Compatibility cannot be tested here, but the basis of the
compatibility has to do with whether or not the FS register is
used.

***********************************************************************

(HUFFCOMP) (In Streams.Pas) { Copyright D.J. Murdoch, (1992) }
Input size: 306368 bytes
Output size: 159706 bytes
Timings:
CPU     Compress    Decompress
386       64.7s       67.62s
286     222.89s      229.15s
8086    406.78s       406.5s

Exe size: 12288 bytes
***********************************************************************
LHarc 1.13c (By Haruyasu YOSHIZAKI)
The original designer of LHarc's LZSS-plus-Huffman (LZH) compression.

Input size: 306368 bytes
Output size: 101671 bytes
Timings (seconds):
CPU     Compress    Decompress
386       26.15        8.02
286       85.47       23.78
8086     164.23       43.66

***********************************************************************

Input size:  306368 bytes
Output size: 101639 bytes

Encode time (seconds):
CPU     Original (LZHUFTP5)    TPLZH
386          122.48            76.79
286          457.53           290.89
8086         907.1            502.02

Decode time (seconds):
CPU     Original (LZHUFTP5)    TPLZH
386           14.55             7.2
286           50.76            26.2
8086          97.93            47.01

(Comparison based on LZHUFTP5 "Out of the box")
Speed   CPU     Encode   Decode
33MHz   386     159%     202%
8MHz    286     157%     194%
8MHz    8086    181%     208%
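
The percentages in the comparison can be re-derived from the encode and
decode timings with the Pascal-time over assembler-time formula given
earlier. The following is only a sketch: the function name SpeedPercent
and the use of Round are assumptions, not something taken from the
author's test harness.

```pascal
program SpeedRatio;
{ Recomputes the Encode/Decode speed percentages from the timings above. }
{ SpeedPercent is a hypothetical helper; rounding with Round is assumed. }

function SpeedPercent(PasTime, AsmTime: Real): LongInt;
begin
  { elapsed LZHUFTP5 time divided by elapsed TPLZH time, as a percentage }
  SpeedPercent := Round(PasTime / AsmTime * 100.0);
end;

begin
  WriteLn('386  encode: ', SpeedPercent(122.48, 76.79), '%');
  WriteLn('386  decode: ', SpeedPercent(14.55, 7.2), '%');
  WriteLn('286  encode: ', SpeedPercent(457.53, 290.89), '%');
  WriteLn('286  decode: ', SpeedPercent(50.76, 26.2), '%');
  WriteLn('8086 encode: ', SpeedPercent(907.1, 502.02), '%');
  WriteLn('8086 decode: ', SpeedPercent(97.93, 47.01), '%');
end.
```

With these inputs the six results agree with the Encode and Decode
columns of the comparison.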

***********************************************************************
V0.19 (TPLZH019.SQZ)
Thanks to Andres Cvitkovich, 2:310/[email protected]
for the Object oriented programming interface.

Now, if you don't pre-fill the inputbuffer, the engine will do
it for you.

Added LZHASMC.OBJ, which uses C Calling convention.
This has in no way been tested, and might be unnecessary.

***********************************************************************
- Plans for the fidonet nodelist.

Within the next week, there will be a second utility out, that
will supply the fidonet nodelist as follows:


Nodelist.LZ
Nodelist.LZX

Regions.LZ
Regions.LZx


The structures for these files will be as follows:


Type
  NetlistRec = Record            {Nodelist.LZ}
    Zone,Net   : Word;
    NetPointer : Longint;
  end;

  RegionRec = Record
    Zone,Region   : Word;
    RegionPointer : Longint;
  end;

Nodelist.LZ will have separate segments for each net, which will
be pointed to by NetlistRec.NetPointer. To find a specific net,
all one would have to do is search through the index for the
appropriate Zone:Net match.

To find an appropriate list of nets in either a region or a zone,
you would use Region.LZ and Region.LZX.
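
The index search described above might look roughly like this. It is a
sketch only: the in-memory Index array, its three entries, the FindNet
name and the -1 "not found" value are all made up for illustration, and
NetPointer is assumed to be a position inside Nodelist.LZ.

```pascal
program NetIndexSketch;
{ Linear search of a net index for a Zone:Net match. }

type
  NetlistRec = record
    Zone, Net  : Word;
    NetPointer : LongInt;
  end;

const
  IndexSize = 3;
  { Hypothetical index entries; a real index would be read from disk. }
  Index: array[1..IndexSize] of NetlistRec = (
    (Zone: 1; Net: 125; NetPointer: 0),
    (Zone: 1; Net: 161; NetPointer: 4096),
    (Zone: 2; Net: 310; NetPointer: 9000));

{ Returns the NetPointer of the matching entry, or -1 if there is none. }
function FindNet(WantZone, WantNet: Word): LongInt;
var
  i: Integer;
begin
  FindNet := -1;
  for i := 1 to IndexSize do
    if (Index[i].Zone = WantZone) and (Index[i].Net = WantNet) then
    begin
      FindNet := Index[i].NetPointer;
      Exit;
    end;
end;

begin
  WriteLn('2:310 -> ', FindNet(2, 310));
  WriteLn('3:1   -> ', FindNet(3, 1));
end.
```

A sorted index would allow a binary search instead, but a linear scan
keeps the sketch short.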


The source code will be provided as an example
implementation for your friendly neighborhood nodelist compiler.

Although the compression part will be slow, I'm sure someone will
find a good use for this. Perhaps even to create a nodediff
processor that works with the compressed files. Here is one
proposal for handling of new nodediffs of a different format:

1> Nodelist.LZ is never sorted. (For nodediff reasons)
After compilation, Nodelist.LZX and Region.LZX will be sorted
by region and net.

2> All file comments or notes are 'assumed' to belong to the
current host.
3> As the nodediff is read in, each net is read into memory,
(Please use a filemode of 0, as I'm on a network )

4> CRC/LF/CR
Occurrences of the following are to be translated to CR/LF:

- Amiga line feeds only
- Macintosh-style carriage returns without line feeds

5> (This and a dozen other spaces intentionally left invisible)

6> Processing of nodediffs:
Processing of nodediffs will have little difference between
how it's done now, and how it would be done on the fly.
2 procedures would be set up for reads and writes, and for the
read/modify/write cycle, modifications will be held in another
heap buffer. (2 allocations of LZHMemSeg^).
(Specifics to be worked out at a later date)
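
The line-ending translation proposed in item 4 could be sketched as
follows. NormalizeEol is a hypothetical name, and the routine assumes
one short string at a time (a Turbo Pascal string holds at most 255
characters, so longer results would be truncated). It is not taken from
any nodelist compiler.

```pascal
program EolSketch;
{ Translates lone LF and lone CR to CR/LF; existing CR/LF pairs are kept. }

const
  CR = #13;
  LF = #10;

function NormalizeEol(S: String): String;
var
  R: String;
  i: Integer;
begin
  R := '';
  i := 1;
  while i <= Length(S) do
  begin
    if S[i] = CR then
    begin
      R := R + CR + LF;
      { an existing CR/LF pair is consumed as a single line ending }
      if (i < Length(S)) and (S[i + 1] = LF) then
        Inc(i);
    end
    else if S[i] = LF then
      R := R + CR + LF          { lone line feed }
    else
      R := R + S[i];
    Inc(i);
  end;
  NormalizeEol := R;
end;

begin
  { 'a' LF 'b' CR 'c' CR LF becomes three CR/LF terminated lines }
  WriteLn(Length(NormalizeEol('a' + LF + 'b' + CR + 'c' + CR + LF)));
end.
```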


***********************************************************************
- Implementation, and explanation of the interface to Turbo Pascal.


Memory requirements:
Since this is constantly updated, I won't update it again.
As of version 0.18, memory requirements are as follows:

{$M 1024, 56000, 56000}
Codespace: approximately 2865 bytes.
DataSeg: approximately 700 bytes.

The stack is probably higher than necessary, but heap memory
is definitely accurate.

The engine uses approximately 1.2k of data seg as well, although
these too are 'thrown' onto the heap.

This unit is overlayable.


The following system variables are available for your interface to play
with:

type

  IObuf = array[0..$2800-1] of byte;   {These buffers are now FIXED!}

  LZHRec = Record
    count    : LongInt;
    textsize : LongInt;
    codesize : LongInt;
    inptr,inend,outptr,outend : Word;
    Ebytes   : Longint;
    inbuf,outbuf : IObuf;
                     {Buffersize and position are critical}
    WorkSpace : Array [0..$8657] of byte;
                     {LZHASM work space}
  End;

{$L LZHASM}



var
  StupidAlloc : Boolean;
  StupidPtr   : Pointer;
These variables provide Segment:0 alignment for versions
of Turbo Pascal prior to 6.0. Refer to the notes in InitLzh.

  WriteFromBuffer,
  ReadToBuffer: Procedure;

These variables should "point" to your procedures for reading and
writing of LZH compressed data. Before calling any LZH functions,
you must have the following lines of code in your routine, or some
variation thereof:

{$F+}
Procedure Myprocwrite;
{$F-}
begin
(Your procedure for writing out data from LZHMem^.outbuf)
end;

{$F+}
Procedure Myprocread;
{$F-}
begin
(Your procedure for reading data into LZHMem^.inbuf)
end;


Your startup for your program should have:

WriteFromBuffer := Myprocwrite;
ReadToBuffer := Myprocread;

The actual procedure type MUST be a far call, but the data within
the procedure may be near.
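
A minimal, self-contained sketch of hooking up far procedures through
procedural variables follows. The two variables here are local
stand-ins for the ones exported by LZH.PAS, the counters exist only to
show that the calls arrive, and the two direct calls at the end merely
imitate what the engine would do. The mode directive is only needed
when compiling with Free Pascal.

```pascal
{$IFDEF FPC}{$MODE TP}{$ENDIF}
program CallbackSketch;
{ Assigning far procedures to parameterless procedural variables. }

type
  BufferProc = procedure;

var
  { Stand-ins for the unit's variables (assumption: same names, same type). }
  WriteFromBuffer, ReadToBuffer: BufferProc;
  ReadCalls, WriteCalls: Integer;

{$F+}  { procedures assigned to procedural variables must be far }
procedure MyProcRead;
begin
  { a real handler would refill the input buffer here }
  Inc(ReadCalls);
end;

procedure MyProcWrite;
begin
  { a real handler would flush the output buffer here }
  Inc(WriteCalls);
end;
{$F-}

begin
  ReadCalls := 0;
  WriteCalls := 0;

  ReadToBuffer := MyProcRead;
  WriteFromBuffer := MyProcWrite;

  { Imitates the engine calling back into user code. }
  ReadToBuffer;
  WriteFromBuffer;

  WriteLn('reads=', ReadCalls, ' writes=', WriteCalls);
end.
```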

IObuf = array[0..$2800-1] of byte; {These buffers are now FIXED!}

For your reference, precede all variables with LZHMem^. as this
structure is of heap pointer type.

LZHRec = Record
count : LongInt;
Current position of compression of input data. This counter
will continue to count until you call DInitLZH and InitLZH again.

textsize : LongInt;
Size of input text data

codesize : LongInt;
Size of output code data.

These variables are available to provide user interface for
ratios. Since the assembler obj file is external to turbo
pascal, adding floating point math to your program will have no
effect on the speed of compression.

inptr,inend,outptr,outend : Word;

Inptr points to the position of valid data in your input buffer,
as does outptr for your output buffer. These are counters to
determine where in the appropriate buffer to place the next byte
of data. (See example LZ.PAS for details of implementation).


inend, Outend:

These variables point to the last valid byte in the appropriate
buffer. You MUST set these values to the pointer.
(Again see LZ.PAS for details)



Ebytes : Longint;
EBytes is a count of total bytes to compress. You MUST set this
variable to the total number of bytes you wish to compress.


inbuf,outbuf : IObuf;
These are input/output buffers for the compression engine.
In previous versions, these were separate from LZHMem^, but it
seemed more practical to have one call to allocate memory. If
someone complains loud enough I'll "Put em back" to separate
independent pointers.


WorkSpace : Array [0..$8657] of byte;
{LZHASM work space}
This variable array is work space for LZHASM.OBJ. Please do
not adjust it or change the data while encode or decode is
active. If you wish to keep the space active on your heap, you
can use it in between runs for other things.


End;
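
As a sketch of the ratio display that textsize and codesize are meant
to support, the following computes the output size as a percentage of
the input size. PercentOfOriginal is a hypothetical helper, and the two
sizes are the TPLZH benchmark figures from this document rather than
values read from a live LZHMem^ record.

```pascal
program RatioSketch;
{ Compressed size as a percentage of the original size. }

function PercentOfOriginal(TextSize, CodeSize: LongInt): Real;
begin
  if TextSize = 0 then
    PercentOfOriginal := 0.0      { avoid dividing by zero on empty input }
  else
    PercentOfOriginal := CodeSize / TextSize * 100.0;
end;

begin
  { 306368 bytes in, 101639 bytes out }
  WriteLn(PercentOfOriginal(306368, 101639):0:1, '% of original size');
end.
```

In a real program the arguments would be LZHMem^.textsize and
LZHMem^.codesize, read after (or during) a run.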


LZHMem: ^LZHRec;
LZHMemSeg : WORD;

Notice that there is no offset variable. This is intentional,
and the reason is simple: SPEED!
In LZH.PAS, there is a sample routine for allocating memory that
is segment:0 aligned. The difference in memory allocated using
this method is up to 32 bytes.
(For allocations of 55k, who cares!)
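
The arithmetic behind a segment:0 aligned block can be sketched without
touching real memory. AlignedSeg and PaddedSize are hypothetical names
and this is not the routine from LZH.PAS. In Turbo Pascal the inputs to
AlignedSeg would come from Seg(P^) and Ofs(P^) after GetMem, and the
figure 54896 is simply what the LZHRec field sizes listed in this
document add up to.

```pascal
program AlignSketch;
{ Real-mode paragraph alignment arithmetic (16 bytes per paragraph). }

{ Segment of the first paragraph boundary at or after SegPart:OfsPart. }
function AlignedSeg(SegPart, OfsPart: Word): Word;
begin
  AlignedSeg := SegPart + Word((LongInt(OfsPart) + 15) shr 4);
end;

{ Padded request size: round down to a paragraph, then add one paragraph. }
function PaddedSize(Size: Word): Word;
begin
  PaddedSize := (Size and $FFF0) + 16;
end;

begin
  WriteLn('aligned segment for offset 0: ', AlignedSeg($1234, 0));
  WriteLn('aligned segment for offset 8: ', AlignedSeg($1234, 8));
  WriteLn('padded size for 54896 bytes:  ', PaddedSize(54896));
end.
```

Because the block then starts at offset 0 of its own segment, a single
Word such as LZHMemSeg is enough to address it.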

procedure Encode;      { compresses data }
procedure Decode;      { decompresses data }
Procedure InitLZH;     { memory allocation and seg setterupper }
Procedure DInitLZH;    { memory de-allocation }

(Pay attention to this section in future versions)
The proper sequence is as follows:

Set WriteFromBuffer and ReadToBuffer to point to your procedures
for handling of buffer data. Make sure that, at bare minimum,
the procedure is of far type:
{$F+}
Procedure YourProc;
{$F-}

InitLzh

Set the following LZHMem^. variables to zero:
Inptr
OutPtr

if compressing:
Set LZHMEM^.Ebytes to the size of the data you wish to compress.

Call encode to compress, or decode to decompress.

The memory buffers will have the decompressed data.

***********************************************************************

-Special design notes:

This unit is specifically designed to be a portable unit for all
versions of Turbo Pascal. (Should work for Windows too)
If this unit is modified in any way,
please keep backwards compatibility in mind.

The following commands are identical in operation, although the
first listed command is portable only to Turbo Pascal 6.0 and
higher:

AllocMemSeg(PointerToType, Sizeof(PointerToType^));

GetMem(PointerToType, (Sizeof(PointerToType^) AND $FFF0) + 16);

If you use this unit, for all memory allocations you must use
the latter command, or modify your unit to use AllocMemSeg.

PLEASE DO NOT DISTRIBUTE THIS UNIT IF YOU MODIFY THE MEMORY ALLOCATION
SECTION. ALLOCMEMSEG IS TP6.0 COMPATIBLE AND ABOVE, WHEREAS THIS UNIT IS
COMPATIBLE WITH ALMOST ALL VERSIONS OF TURBO PASCAL.


***********************************************************************
-Legalities:

Note: I hate seeing legal items at the beginnings of files, wading
through tons and tons of used cow food to get to the meat of what the
utility really does. With this in mind, all legal issues or limitations
are here, at the end of the document.
-Compression of this archive:

As far as compression type is concerned, I personally couldn't care
less, as long as the software used to re-compress is freely
available. Please do not recompress this archive with
Lameware, and do not add useless banner files.

This archive should contain the following:

LZHASMC.OBJ -=> The C version (Uses C calling conventions.)
(Untested and maybe unnecessary)

LZHASM.OBJ -=> The Huffman compression object file (MSC6.0)
(Pascal/Basic/Fortran/?)

TPLZH.HST -=> Version history
TPLZH.DOC -=> This document.

LZ.PAS -=> Sample TP 5.0+ version
LZH.PAS -=> Base unit for all pascal platforms

LZO.PAS -=> OOPS examples for TP 6.0+
TESTLZO.PAS -=> OOPS examples for TP 6.0+

Anything additional shall be considered twit behavior.

If this object is used in any utility you write for public
or commercial use, please give credit to the following people
in your documentation:


Haruyasu YOSHIZAKI : Original concepts and LHarc program.
Kenji RIKITAKE     : English translation to C
Peter Sawatzki     : Pascal interface (TP 5.0+)
Wayne Sullivan     : Pascal interface
Joe Jared          : Assembler optimization (TP 5.0++)
Andres Cvitkovich  : Object version (TP 6.0+)

There may be other legal issues, but at this time I'm not aware
of them.

-Distribution

No fee may be charged for distribution of this package, and it
CANNOT be sold for any reason.

Exception: The disk this program is on, may be sold for the cost
of the disk. (Not to exceed $0.70)+exact mail costs if mailed.

If in a shareware package, no additional charges may be added
for the use of this "module". This is a *freeware* implementation
of Huffman compression.


