Category : Tutorials + Patches
Archive   : TOKEN.ZIP
Filename : MYZIP004
BBS: PALACE Conference : ARCHIVE Imported: 08/19/1990
To: DEAN COOPER Number: (NEW) Date: 08/19/1990
From: YOU Reference: 106 Time: 3:49 pm
Subj: ADDITION TO ZIP FORMAT Security: PUBLIC Read: N
Echo Flag : Y Personal Read: Y
DC> Well, I'm sure you'll hear this from many people, but don't
make a change to the ZIP format and still output ZIP files
as the default.
The "compression" system I am working on is for an application I am
developing, but the idea is to publicly document the type of
compression I am using so other people can figure out what I am
doing. The only people who have to worry about it are those who
want to directly use the files my application creates.
DC> Phil went through a TERRIBLE ruckus
(especially from the Unix people) when he added a new
compression scheme (squashing) to the ARC format. It
didn't seem to matter any that the change was EXTREMELY
trivial and only required two lines of code to be changed
(and simple changes to boot).
If they don't like things to change, let 'em go back to $1,000
per K memory, non-multitasking machines without virtual memory
and single user only operating systems, loaded from paper tape.
DC> Even if you can get all the parties who make ZIP files to
agree on the new compression method, I wouldn't output
ZIP files by default until Phil releases the change
himself.
I am simply announcing what I am doing. Nobody else has to support
it since all my programs will handle their own files. The public
notice is to give others the opportunity to do so if they want to.
DC> And believe me, I may well be in such a position to have
to make that exact decision myself with my own enhancements
to ZIP. However, there are other problems involved. Chief
among them is that many people simply won't want a new
compression scheme added unless it compresses significantly
better than the current ones.
I'll give a few details. I'm writing a compiler/interpreter.
Some applications may issue the equivalent of an "INCLUDE" command
to the compiler/interpreter to load source code provided with the
system for use in various things. The source would be stored as a
type of tokenized file the way BASIC source files are stored.
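As a rough illustration of BASIC-style tokenized source (a sketch only; the message does not give the actual format), each keyword can be stored as a single byte with the high bit set, while everything else stays as plain ASCII. The keyword list, the 0x80 token base, and the function names below are assumptions made up for this example.

```python
# Hypothetical keyword table; a real compiler would have its own list.
KEYWORDS = ["PRINT", "IF", "THEN", "GOTO", "INCLUDE"]
TOKEN_BASE = 0x80  # assumed: token bytes start above the 7-bit ASCII range


def tokenize_line(line):
    """Replace each keyword with one byte; copy other words as ASCII."""
    out = bytearray()
    for i, word in enumerate(line.split(" ")):
        if i:
            out.append(0x20)  # keep the single space between words
        if word.upper() in KEYWORDS:
            out.append(TOKEN_BASE + KEYWORDS.index(word.upper()))
        else:
            out += word.encode("ascii")  # assumes ASCII-only source text
    return bytes(out)


def detokenize_line(data):
    """Expand token bytes back to keywords (always in upper case)."""
    parts = []
    for b in data:
        if b >= TOKEN_BASE:
            parts.append(KEYWORDS[b - TOKEN_BASE])
        else:
            parts.append(chr(b))
    return "".join(parts)
```

For example, `tokenize_line("IF X THEN GOTO 10")` stores the 17-character line in 10 bytes, and `detokenize_line` restores the text. This sketch makes no attempt to skip keywords inside string literals or comments, which a real tokenizer would have to handle.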
Also, by doing a word-by-word tokenization, I can do the same thing
with the error message file. Instead of a huge text file, I can
include a compressed, indexed text file. Borland did this with
their TURBO.MSG error message file, but the objective here is
twofold - to have just ONE zip file which is the library, any
include files (Imagine a C program without INCLUDE
the case of a C compiler, <> indicates go to library. So all
files - include files, object files, libraries, sources, error
messages, etc., are in ONE place. People have complained about
Borland not documenting its .TPU file format; I am not going to
have that happen.
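The word-by-word idea for the error message file can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the documented format: the two-byte little-endian token width, the offset table layout, and the names `pack_messages` and `unpack_message` are all invented for the example.

```python
import struct


def pack_messages(messages):
    """Tokenize messages word by word.

    Returns (vocab, offsets, body):
      vocab   - list of distinct words; a word's position is its token id
      offsets - byte offset of each message in body, plus one end marker
      body    - all messages as 16-bit little-endian token ids
    """
    vocab, ids = [], {}
    body = bytearray()
    offsets = []
    for text in messages:
        offsets.append(len(body))
        for word in text.split(" "):
            if word not in ids:
                ids[word] = len(vocab)
                vocab.append(word)
            # "<H" holds ids 0..65535, so this sketch assumes a
            # vocabulary of at most 65536 distinct words.
            body += struct.pack("<H", ids[word])
    offsets.append(len(body))
    return vocab, offsets, bytes(body)


def unpack_message(vocab, offsets, body, number):
    """Fetch one message by number using the offset index."""
    start, end = offsets[number], offsets[number + 1]
    count = (end - start) // 2
    tokens = struct.unpack("<%dH" % count, body[start:end])
    return " ".join(vocab[t] for t in tokens)
```

Because the offset table gives the position of every message, a program can pull out message N directly without scanning or decompressing the rest of the file, which is the point of keeping the file indexed.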
DC> And even if it does, but only works on text files, they are
likely to not want that either. You see, many people out
there have restricted memory conditions and wouldn't like
having more code in the archiver than is necessary (to
alleviate this, your code should be as small as possible
and not require any more memory to extract than any of ZIP's
other compression methods).
The only reason anyone really needs to worry about this is if they
are either using one of my programs or a program by someone else
that supports this. The method of creating files using this
tokenized method isn't necessary unless someone is creating an
application that needs this format.
DC> Second, many people out there have their own code, or code
they got from who knows where and don't, can't, or won't want
to modify it to include your new method.
If they aren't working with the type of files I'm creating, they
don't have to worry about it.
DC> To alleviate this, you should work with the people who are
developing the public domain, portable version of ZIP (they
have started with Sam Smith's code) and provide portable
code for your method.
That is why I am making the announcement, to let people who MIGHT
be interested in the method be able to use it if they want.
One of the reasons I got interested in this was a mention that in
the error message files for the COBOL compilers from IBM, CDC,
Honeywell, Xerox, Burroughs and Univac, there were only 500
different words. Yet the error messages from each of them run
dozens of pages. There should be a way to reduce space on these.
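To put a number on that: 500 distinct words fit in a 9-bit token, since 2^9 = 512. The sketch below estimates the space a word-token encoding would need for a given set of messages. It is only a back-of-the-envelope calculation; the function names and the one-byte-per-word terminator in the dictionary are assumptions for illustration.

```python
def bits_per_token(vocab_size):
    """Smallest whole number of bits that can number vocab_size words."""
    return max(1, (vocab_size - 1).bit_length())


def vocabulary_stats(messages):
    """Compare raw text size with a dictionary-plus-tokens encoding."""
    words = [w for m in messages for w in m.split()]
    distinct = set(words)
    bits = bits_per_token(len(distinct))
    return {
        "total_words": len(words),
        "distinct_words": len(distinct),
        "bits_per_token": bits,
        # size of the plain text, not counting line endings
        "raw_bytes": sum(len(m) for m in messages),
        # token stream, rounded up to a whole byte
        "token_bytes": (len(words) * bits + 7) // 8,
        # each word stored once, with an assumed 1-byte terminator
        "dict_bytes": sum(len(w) + 1 for w in distinct),
    }
```

The dictionary is a fixed cost paid once, so on a tiny sample it dominates; the saving only shows up when many pages of messages reuse the same few hundred words, as in the COBOL case described above.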
The University of Minnesota sends out a Pascal Compiler for the CDC
Cyber series of computers, which comes with a number of source
libraries people can include with their code.
-> MegaMail(tm) #0:COBOL is four of Programming's Seven Deadly Sins.
1.20