Category : Dbase (Clipper, FoxBase, etc) Languages Source Code
Archive   : TN8904.ZIP
Filename : DBASE_IV.ASC

 
DOS File Allocation and CHKDSK.COM

James Fernandez

This article will discuss how dBASE IV files can
be damaged through lost clusters and how this may result
in unpredictable behavior on the part of dBASE IV.

The CHKDSK Command in DOS

The first warning to an unsuspecting user might come after running a
program for weeks, when suddenly the program stops! A cryptic
one-line message, "illegal opcode", appears on the screen. Locked
up, you have little choice but to reboot.

Or perhaps you change a report and (think) you save the changes to
disk. But when you run the report, the changes are not there!

Here's another scenario: Maybe you run the same set of customer labels
over and over, then one day the labels are scattered haphazardly across
the page! You think the printer's gone berserk!

If this is your worst nightmare, or worse yet, if this sounds all
too familiar, read on; this article's for you.

This article is all about passing the buck. While dBASE IV may not
be perfect (but almost), these problems are most likely caused by
improperly accessed or stored files. So, the focus of this article
is on how these types of errors develop and on CHKDSK.COM, a DOS utility
which, when understood and used properly, can often resolve these
types of problems. Okay, okay, I know what you're thinking: "I've
been using this machine for years and I never had these problems until
I used dBASE IV." I don't blame you for thinking it's a software
problem, and in part you would be right. dBASE IV certainly
increases the potential for such disasters, which is why this article
is so important, even if you haven't been hit by the cluster buster.

Down to business now. Before we can solve the problem, we need to
examine how the problem develops. So, let's first explain how files
are stored on the hard disk.

Performance Degradation

Definition of a sector

When a hard disk is formatted with the DOS FORMAT command, it divides
the physical hard disk into fixed-length segments, called sectors,
which are each assigned a unique identifying number. These numbers
provide the hard disk controller with the ability to recognize any
sector on the disk. Sectors which are not usable for data storage
by DOS are flagged during the formatting process as bad. Bad
sectors are often physically flawed areas of the hard disk which cause
the disk space to be unusable.

In DOS 3.x, the size of each sector is 512 bytes (meaning that it
can contain 512 characters). This isn't enough to hold most files. dBASE
IV product files as well as your data files vary in length, so DOS
divides each one of them into groups of sectors. This means that a
single file may be spread over dozens or even hundreds of sectors
(depending on its file size and the hardware environment).

To help organize all these sectors, they are combined into groups,
typically of four. Each such group is called a cluster and has a
capacity of 2048 bytes. Although the hard disk controller can distinguish
sectors on the disk, the smallest section DOS recognizes in storing
and retrieving files is a cluster. This means that when a very small
file, say 10 characters, is written to a hard disk it will require
2048 bytes of disk space. (This can be seen by comparing the available
disk space before and after creating such a small file.)
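
To make the arithmetic concrete, here is a minimal Python sketch of the space calculation just described. It assumes the figures used in this article (512-byte sectors, four sectors per cluster); your own disk may be formatted differently, and the function names are purely illustrative.

```python
SECTOR_BYTES = 512           # sector size quoted in this article for DOS 3.x
SECTORS_PER_CLUSTER = 4      # "typically" four; an assumption, not a constant
CLUSTER_BYTES = SECTOR_BYTES * SECTORS_PER_CLUSTER   # 2048


def clusters_needed(file_bytes):
    """Whole clusters required to hold a file of the given size."""
    # Integer ceiling division: any partial cluster still costs a full one.
    return (file_bytes + CLUSTER_BYTES - 1) // CLUSTER_BYTES


def disk_bytes_used(file_bytes):
    """Disk space the file actually consumes, in bytes."""
    return clusters_needed(file_bytes) * CLUSTER_BYTES


def slack_bytes(file_bytes):
    """Allocated but unused bytes at the end of the file's last cluster."""
    return disk_bytes_used(file_bytes) - file_bytes


for size in (10, 2048, 2049, 10000):
    print(size, clusters_needed(size), disk_bytes_used(size), slack_bytes(size))
```

The 10-character file from the example above comes out at one cluster, with 2038 of its 2048 bytes unused, while a file just one byte over 2048 needs a second full cluster.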

There's a tradeoff here. If DOS acknowledged sectors as the smallest
portion that a file can be stored in, there would be less wasted
(unused) disk space, particularly if the majority of files were
very small or were slightly larger than a 2048 byte increment. However,
by chopping a file into sectors, rather than clusters, it would be
harder to keep track of all the parts (since there would be more of
them), accessing the file would take longer (since there would be
more sections to search through and for) and perhaps most importantly,
there would be a greater chance of being unable to find all the data
for a particular file.


This brings us to the File Allocation Table or FAT. The FAT
is a table whose elements represent the clusters on a disk. There
is one and only one FAT element per cluster and vice versa. Each
FAT element that represents an occupied cluster contains a number
which tells DOS one of two things: what cluster represents the next
segment of data for the current file or no more clusters contain data
from the current file. In other words, the FAT points to the next
cluster if there is one, and if not it tells DOS that the end of the
file has been reached.

This, in effect, associates a file with a particular chain of clusters
which tells DOS where to find the data on the hard disk. Well,
all this is fine, you may think, but how does DOS know which
FAT element to look at first? Where is the start of a file? The
answer is the directory. The DOS directory, besides containing what
you see when you type the DIR command, contains the starting cluster
number for each file. Once this starting cluster has been read, DOS
consults the FAT for the next cluster position or end-of-file marker
by looking up the data in the starting cluster's corresponding FAT
element.
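
The lookup just described can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the FAT is a plain list indexed by cluster number, the directory is a dictionary, the marker values and file names are invented for illustration, and nothing here reads a real disk.

```python
END_OF_FILE = -1   # stand-in for the FAT's end-of-file marker
FREE = 0           # stand-in for an unused FAT element

# One FAT element per cluster; the list index is the cluster number.
# Each occupied element holds the NEXT cluster of the same file.
fat = [FREE, FREE, 3, 5, FREE, END_OF_FILE, 7, END_OF_FILE]

# The directory supplies the starting cluster for each (hypothetical) file.
directory = {"REPORT.FRM": 2, "LABELS.LBL": 6}


def cluster_chain(filename):
    """Return the clusters holding a file, in the order DOS would read them."""
    chain = []
    cluster = directory[filename]          # step 1: directory gives the start
    while cluster != END_OF_FILE:
        if cluster in chain:               # a damaged FAT could loop forever
            raise ValueError("FAT chain loops back on itself")
        chain.append(cluster)
        cluster = fat[cluster]             # step 2: FAT gives the next cluster
    return chain


print(cluster_chain("REPORT.FRM"))
print(cluster_chain("LABELS.LBL"))
```

Notice that the file's data is never consulted to find the next cluster. Everything depends on the directory entry and the FAT, which is why damage to either one is so serious.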

Before we continue, let's sum up what we've stated so far: First,
the FORMAT command is issued to define sectors on a hard disk. These
sectors are grouped by fours into clusters. A cluster is the smallest
amount of disk space that DOS can access and therefore that a file
can exist in. DOS finds a file by looking it up in the directory,
which indicates where the starting cluster is. Once the
data is read from the starting cluster, the associated FAT element
is referenced to find the next cluster which contains data from the
file; this process continues until the end-of-file marker is found
in one of the FAT elements.

What Goes Wrong

So what can go wrong with such a copacetic system? Several things.
As we mentioned earlier, bad sectors are unusable
portions of the hard disk. Although these are flagged to
indicate that data should not be written to them, occasionally data
may get written to a bad sector or a sector containing data may become
bad. Once a sector is bad it cannot become usable again, so any
data stored there is not retrievable. You can, however, sometimes
make more disk space available by reformatting a hard disk which has
a lot of bad sectors; more sectors may become available as they are
redefined on the hard disk. In any case, age and use can eventually
wear out a hard disk, creating more bad sectors as time goes on.

Well, bad sectors are the least of our problems. Let's pause for
a moment and discuss how disk space gets used up. You may picture
a clean hard disk (one with no data written to it) as accumulating
data from its outer edge, with the read/write head writing continuously
as it moves inward, much like a needle
plays on a record (remember records?). This analogy is incorrect. The
most efficient way for DOS to write a file actually depends on how
quickly data can be transferred relative to the revolving speed of
the hard disk. In other words, a timing ratio determines where the
next logically consecutive piece of data is placed. We won't go into any more detail
except to say that this is called the interleave and determines
the logically contiguous (as opposed to physically contiguous) layout
of clusters.

When DOS writes a file to the hard disk it looks for the first available
cluster, writes data to that cluster, continues to the next available
cluster and so on until the entire file is written. If the file was
written as efficiently as possible, the clusters it occupies will
be logically contiguous. However, when files are frequently deleted
from the hard disk and/or disk space gets used up, finding the next
available cluster becomes more difficult. This causes file fragmentation:
a file is considered fragmented when it is strewn over non-contiguous
clusters. The consequence of this is slow processing or executing,
particularly with product files such as dBASE IV.
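
A small Python simulation may help show how fragmentation arises. It is a sketch only: it assumes the simple "first available cluster" rule described in this article (the real allocation strategy can vary between DOS versions), models the disk as a list of cluster owners, and uses made-up one-letter file names.

```python
FREE = None   # marks an unused cluster in this toy disk


def write_file(disk, name, n_clusters):
    """Store a file in the first free clusters found, scanning from the start."""
    free = [i for i, owner in enumerate(disk) if owner is FREE][:n_clusters]
    if len(free) < n_clusters:
        raise RuntimeError("disk full")
    for i in free:
        disk[i] = name
    return free


def delete_file(disk, name):
    """Release every cluster owned by the named file."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = FREE


def is_fragmented(clusters):
    """True if the file's clusters are not one unbroken run."""
    return any(b != a + 1 for a, b in zip(clusters, clusters[1:]))


disk = [FREE] * 12
a = write_file(disk, "A", 3)
b = write_file(disk, "B", 3)
c = write_file(disk, "C", 2)
delete_file(disk, "B")           # leaves a three-cluster hole behind A
d = write_file(disk, "D", 5)     # D fills the hole, then continues after C

print(a, is_fragmented(a))
print(d, is_fragmented(d))
```

File D needs five clusters but the hole left by B holds only three, so the rest of D lands on the far side of C. Repeat this over months of creating and deleting files and most large files end up in pieces.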

File fragmentation can be remedied most easily with a commercial disk
optimizer (Mace Utilities, Disk Optimizer, Norton Utilities,
etc.). A disk optimization utility will make the occupied clusters
logically contiguous. This will increase access speed as the head
actuator mechanism of the hard disk will have less travelling to do
to read data into the computer's memory.

Additionally, any attempt to recover deleted or erased data will have
a much greater chance of success if the file containing the data is
not scattered over the hard disk.

So far we've covered problems due
to bad sectors and file fragmentation. A much more common cause of
errors, like the ones listed at the beginning of this article, is
lost or orphaned clusters. A lost cluster is
one which contains data from a file, but the link to that cluster
(the pointer in the FAT) has been corrupted so that DOS can't
find the cluster. So, lost clusters are due to corruption of the
FAT rather than of the data itself.

When CHKDSK is run, DOS examines every element in the FAT, tracing
through them to determine the status of all the clusters. All clusters
that are marked as belonging to a file but have lost their connection
to the associated file are flagged and DOS gives the following warning:
"xx lost clusters found in xx chains - convert to files?" Answering
yes to this question can do no harm; the damage to the FAT has
already been done.
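
The kind of check described above can be sketched in Python as follows. This is not CHKDSK's actual code, only an illustration under stated assumptions: the FAT is a list indexed by cluster number, the directory is a dictionary of starting clusters, the marker values and file name are invented, and lost chains that loop back on themselves are ignored for simplicity.

```python
FREE = 0           # stand-in for an unused FAT element
END_OF_FILE = -1   # stand-in for the end-of-file marker


def find_lost_chains(fat, directory):
    """Return chains of clusters that are in use but unreachable from any file."""
    # 1. Follow every file's chain from its directory entry.
    reachable = set()
    for start in directory.values():
        cluster = start
        while cluster != END_OF_FILE and cluster not in reachable:
            reachable.add(cluster)
            cluster = fat[cluster]

    # 2. Any cluster marked as in use that no file reaches is lost.
    lost = {c for c, entry in enumerate(fat)
            if entry != FREE and c not in reachable}

    # 3. Group lost clusters into chains. A chain starts at a lost
    #    cluster that no other lost cluster points to.
    pointed_to = {fat[c] for c in lost if fat[c] != END_OF_FILE}
    chains = []
    for head in sorted(lost - pointed_to):
        chain, cluster = [], head
        while cluster != END_OF_FILE and cluster in lost and cluster not in chain:
            chain.append(cluster)
            cluster = fat[cluster]
        chains.append(chain)
    return chains


# Clusters 2-3 belong to a file; 4-5-6 and 8 are in use but orphaned.
fat = [FREE, FREE, 3, END_OF_FILE, 5, 6, END_OF_FILE, FREE, END_OF_FILE, FREE]
directory = {"CUSTOMER.DBF": 2}

chains = find_lost_chains(fat, directory)
total = sum(len(chain) for chain in chains)
print(f"{total} lost clusters found in {len(chains)} chains")
```

In this toy FAT the orphaned data in clusters 4, 5, 6 and 8 is still intact on the disk. What has gone missing is the path leading to it, which is exactly the situation the warning message reports.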

Many users think the /F parameter of the CHKDSK command will fix
a file in a lost cluster. It will not repair the file's contents, so
it is more accurate to think of /F as standing
for file. When you run CHKDSK/F and lost clusters are found, if you
answer yes when prompted, DOS places the newly linked chains (of lost
clusters) in files in the root directory. These files are assigned
filenames of FILEnnnn.CHK, where nnnn is a sequential number beginning
with 0000.

You may then use the TYPE command to determine the contents of the
orphaned (lost) clusters that make up the .CHK file. If that data
is a product file of dBASE IV, you must recopy or reinstall the product
file from the original diskette to restore an uncorrupted version
to the hard disk. If the file is a data or text file, it will be
readily apparent as the file will contain garbled but easily recognizable
data. In this case you must use a backup file, losing your most recent
changes.

If CHKDSK is run without the /F parameter, DOS will report the errors
it finds but will not write any corrections to the disk, and no
FILEnnnn.CHK files will be generated. In any case,
CHKDSK can do no harm; the damage has already been done!

dBASE IV and Hard Disks

So, why do these clusters get lost in the first place and why dBASE
IV? Lost clusters can be created when you exit dBASE IV without properly
quitting the program, closing databases or typing CLEAR ALL at the dot
prompt. Pressing the CTRL-ALT-DEL keys
to reboot the computer can also create lost clusters. Power surges
during a disk read/write can damage the FAT and subsequently create lost
clusters.

Should a data file reside in a lost cluster, DOS cannot properly access
the designated file. If the file happens to be a dBASE product file
or a .PRG file, erratic behavior similar to the examples in the beginning
of this article may result.

End-users, noting the differences between dBASE III PLUS and dBASE
IV, point out that this sort of file corruption never or only very
seldom occurred with earlier versions of dBASE. This is probably because
dBASE III PLUS contained only one dBASE overlay (.OVL) file. dBASE
IV, on the other hand, contains six overlay files. Furthermore, reports,
labels and menus are generated through files with the .GEN extension;
each new report, for example, uses at least four files during its
definition and compilation process. Additionally, dBASE IV has three
resource files (with .RES extensions) and creates many temporary files.
The chance of any one of these files residing in a lost cluster is
far greater in dBASE IV than in previous versions of dBASE simply
because dBASE IV uses many more files.

If you encounter strange problems in dBASE IV, run CHKDSK/F to clean
up the FAT and identify corrupted files. Then reinstall dBASE
IV (unless you know only your data files were corrupted), writing
the product files to valid sectors using a corrected and error-free
FAT.

Disk optimization utilities (mentioned earlier for fragmentation problems)
can be used to detect and correct file system problems as well. These
may also speed up the dBASE IV program by making the product (and
all other) files logically contiguous (and therefore easier and quicker
to access).

Because the FAT records where every file on the disk is stored, damage
to it can affect any file, so it pays to catch problems early. To
keep such errors from accumulating unnoticed from day to day,
you can insert the CHKDSK/F command as a line in your AUTOEXEC.BAT
file. This will detect and correct file system errors upon boot up.

Following these additional preventative measures can also alleviate
potential frustrations:

1. If your power is unreliable, acquire an uninterruptible
power supply (UPS).

2. Train users to exit dBASE IV properly by QUITting.

3. Set up a regular system of backing up and follow
it religiously.

4. Make periodic use of disk optimization software.


5. Insert CHKDSK/F in your AUTOEXEC.BAT to detect
FAT corruption.

6. Keep a file recovery utility like Norton
Utilities or Mace Utilities on hand.

Hopefully, with proper training and disk optimization, problems with
your hard disk will be minimal and infrequent.
