Category : Printer + Display Graphics
Archive   : OCR22.ZIP

Output of file : OCRSHARE.DOC contained in archive : OCR22.ZIP

OCRSHARE Release 2.2
Shareware User's Manual

Solution Technology Inc.
1101 S. Rogers Circle, Bldg. 14
Boca Raton, Florida 33487
Tel. (407) 241-3210
Fax. (407) 241-3251

Copr (c) 1984-1990 Solution Technology, Inc. page 1

Table of Contents
Front Matter
Copyright Acknowledgements....................... 1
License.......................................... 1
Bulletin Boards.................................. 2
Public Domain and Shareware Libraries............ 3
Association of Shareware Professionals (ASP)..... 4
Registration..................................... 4

Introduction......................................... 5
Why OCR Software?................................. 5
Features.......................................... 6
OCRSHARE...................................... 6
ATXSHARE...................................... 6
Advantex OCR.................................. 6
Ver 2.2 Enhancements.............................. 7
How to Use This Manual........................... 7
Tutorial...................................... 7
Trouble Shooting.............................. 7

Installation......................................... 8
Computer Requirements............................. 8
Running from a Hard Disk.......................... 8
Notation.......................................... 8
Hard Disk Installation............................ 9
Modifying CONFIG.SYS.............................. 9
Files & Buffers................................... 9
Starting OCRSHARE................................. 10
Forcing a Display Adaptor......................... 10
Screen Selections................................. 11
Mice.............................................. 11
Networks.......................................... 12
EMS Memory........................................ 12

Tutorial............................................. 15
Starting OCRSHARE................................. 15
Moving Around..................................... 15
The Six Basic Keyboard Keys................... 15
Short Cut Keys................................ 16
Exiting OCRSHARE.................................. 16
Perform OCR on an Image File...................... 17
Load a scanned image file..................... 17
Exploring the loaded image.................... 17
Load an OCR Recognition Font.................. 17
Setup the Text Output File.................... 17
OCR the full page............................. 17
Selecting areas to OCR or Cutout.............. 18
Canceling a selected area..................... 18
OCR of selected area.......................... 18
Unloading an OCR Font......................... 18
Saving a selected area........................ 18
Training a New OCR Font....................... 19
Saving the New OCR Font....................... 19


More about OCRSHARE's File Finders................ 19
Editing the File Name......................... 20
Over typing the File Name..................... 21
Incrementing Numbered File Names.............. 21
Graphics Editing.................................. 21
Invert Page................................... 22
Flip Page Vertical............................ 22
Erase Inside.................................. 22
Erase Outside................................. 22

OCR Notes............................................ 24
Looking at the output text file................... 24
Additional Information on Training................ 24
Naming the ink blots during training.......... 24
Proper training and symbol diversity.......... 25
Making an Alphabet Page....................... 25
Point sizes and Scanning DPI.................. 25
About Autoskip and Manual Training Modes...... 26
More on Autoskip Training..................... 26
About the Interactive Training screen......... 26
Training Your Own Fonts for Recognition....... 27
Using Your New OCR Font....................... 27
Multi font Capability......................... 27
Handling Special OCR Problems..................... 30
Broken Letters................................ 30
Broken Dot Matrix............................. 30
Run together Letters.......................... 30
Big Dirt Spots................................ 30
Underscores................................... 31
The % symbol.................................. 31
Italics....................................... 31
Advanced Training Methods......................... 31
Training Foreign Characters................... 31
Training from FRENCH.ATX...................... 32
Training Foreign Language Letters............. 32
Font Editing for correct OCR output........... 32
Using your new French OCR font................ 33
Training Special Symbols.......................... 33
Ocr Tuneup Fonts.................................. 33
Creating a tuneup font........................ 34
Derivative Fonts.................................. 34
Testing an Existing Font...................... 34
Editing the font.............................. 35
Retraining your font.......................... 35

Trouble Shooting..................................... 36
OCRSHARE V2.2 Registration Form...................... 44
Note To Retail Dealers............................... 45
Note To Shareware Dealers............................ 45


Copyright Acknowledgements

MS-DOS is a registered trademark of Microsoft Corporation. IBM,
PC-XT and PC-AT are registered trademarks of International
Business Machines Corporation. PC Paintbrush, PC Paintbrush +, and
Publisher's Paintbrush are registered trademarks of ZSoft
Corporation. PageMaker is a registered trademark of Aldus
Corporation. Ventura Publisher is a registered trademark of Xerox
Corporation. H.P. is a registered trademark of Hewlett-Packard.


OCRSHARE (c) 1986, 1987, 1988, 1989, 1990 is a copyrighted software
product of Solution Technology, Inc. (STI), Boca Raton,
Florida. OCRSHARE is being distributed under the shareware
distribution process and is intended for the personal use and
enjoyment of the recipient.

OCRSHARE is copyrighted and has been released for distribution as
SHAREWARE. Please note that a great deal of effort and time has
been invested in the development of this program. You are granted a
license to try OCRSHARE for a reasonable trial period without

1. No organization, individual or other entity may reproduce,
print, duplicate, copy, or distribute the program, manual
or any ancillary file herein for any commercial purpose or
personal or commercial gain whatsoever without the
express written permission of Solution Technology, Inc.

2. STI hereby grants the user a single-user license for the
user's private use and enjoyment of the programs, documents and
ancillary files herein.

2. The user is additionally granted the right to and encouraged

a. print a hard copy version of this manual for his own
b. give, copy, upload and otherwise distribute WITHOUT
CHARGE the complete and full and true copy of the
OCRSHRxx.ZIP file.

3. STI makes no warranty, either expressed or implied, with
respect to the shareware software described herein, its
quality, performance, marketability, or fitness for any
particular purpose.


4. The user acknowledges that STI is not liable for any damages,
either consequential or direct, arising out of the use
of this Software. The user assumes complete responsibility
for any decisions made or actions taken based on information
obtained or distributed through use of this software product
and any accompanying instructional or reference materials
provided by Solution Technology, Inc.

Bulletin Board License

1. Operators of electronic bulletin boards (Sysops) are
hereby licensed and encouraged to post OCRSHARE for
downloading by their users subject to the following
provisions. In addition, sysops are encouraged to keep
the uploaded .ZIP file name OCRSHRxx.ZIP, where xx
represents this version of OCRSHARE (e.g., OCRSHR22.ZIP),
so your users will know if they have the latest version.

2. OCRSHARE may be uploaded to and downloaded from commercial
systems such as CompuServe, the Source, and BIX,
so long as the only charge paid by the subscriber is
for on-line time and there is no charge for the program.
Those copying, sharing, and/or electronically
transmitting the program are required not to delete or
modify the copyright notice and restrictive notices
from the program or documentation; anyone doing so will
be treated as a contributory copyright violator.

3. If you, as a BBS sysop or user, are passing this program
on to others, uploading it to a bulletin board system,
or including it in a users group library, do not
separate the files contained in the distribution archive;
pass the entire archive on to the intended
party. This ensures that those who receive the program
will have all the correct configuration utilities and
documentation necessary to get OCRSHARE up and running
quickly. A list of the files you should have and
the purpose of each appears later in this manual.

4. The OCRSHARE documentation may not be modified by
users. The program may not be separated from the documentation
when distributed. Printed or photocopied
("Xeroxed") copies of the OCRSHARE documentation (i.e.,
this manual) may not be distributed or sold without the
written permission of STI.

5. This license to use OCRSHARE does NOT include any right
to distribute or sell OCRSHARE for commercial purposes,
gain, compensation or profit. No entity or person other
than Solution Technology, Inc. may accept payment or
royalties for this program without an expressed written
agreement with STI. Distribution terms are given below.


Public Domain and Shareware Libraries

1. Distributors of "public domain" or user-supported
software libraries must obtain written permission to
distribute copies of OCRSHARE. No one may use OCRSHARE
as a promotion for any commercial venture or as an
enticement for the user to pay for any program,
product, or service unless they have received the
express written permission of Solution Technology, Inc.

2. You must obtain written permission from Solution Technology
to distribute OCRSHARE. Please use the vendor
application supplied near the end of this user manual.
If you do not receive a reply, write again: our silence
does NOT constitute permission, and you may not distribute
"pending" receipt of permission.

3. A maximum disk fee as set by Solution Technology in the
above vendor contract must not be exceeded. OCRSHARE
may not be included on any disk sold for more than this
maximum. Major CD-ROM or optical disk libraries are
exempt from this restriction, provided that they have
STI's permission to distribute OCRSHARE.

4. Vendors may not modify or delete ANY files on the disk.
Vendors may add a "GO" program, and/or a reasonable
number of small text files designed to assist or provide
a service to the user, but these added files must
be easily identifiable and end-users must be allowed to
delete the added files.

5. Vendors must make a reasonable effort to distribute
only the most recent versions of OCRSHARE. All vendors
who have requested and received written permission to
distribute OCRSHARE will receive new MAJOR releases as
they are issued.

6. All disk vendors must comply with any and all vendor
guidelines and requirements set forth by the Association
of Shareware Professionals (ASP); for more information
about ASP, contact its chairman, Jim Button, at
Buttonware in Seattle. Violation of any ASP guideline
or requirement automatically revokes permission to
distribute OCRSHARE.

Until formal requirements are adopted by the ASP, you
must comply with the following guidelines:

Vendors must make an attempt to educate users on the
nature of Shareware. Catalogs, advertisements, order
forms, and all disks sold should contain ASP-approved
or recommended wording describing the nature of shareware,
and should explicitly state that no part of disk
sale revenues is paid to the programs' authors. When
vendor catalogs or advertisements carry both Shareware
and PD programs, the Shareware programs must be differentiated
from the public domain programs in some way
(in the description, with an asterisk, by listing the
registration fee, etc.).

Association of Shareware Professionals (ASP)

OCRSHARE is a Shareware program conforming to standards as
established by the Association of Shareware Professionals
(ASP) located at 325 118th Ave. S.E., Suite 200, Bellevue,
WA 98005.

Registration

If you find OCRSHARE useful you are encouraged to register
your copy. The base registration fee is $45. With registration
you will receive the latest version of OCRSHARE,
called ATXSHARE, which does not have the shareware nags
and allows graphics translations between .PCX, .IMG and .TIF
image files. Additionally, registration allows you to obtain
a full set of pretrained Advantex OCR fonts for only $25.
You will find a registration form at the end of this document
which will aid you in registering your copy of OCRSHARE.

Please keep in mind that we must have a registration form on
file for you before we can offer product support.



Introduction

OCRSHARE is a complete shareware version of STI's Advantex
OCR (Optical Character Recognition) package, which is designed
to convert scanned image files into text and picture
files for use by other programs in your computer. After
using OCRSHARE to input a document, you can manipulate the
extracted data using your favorite word processor, desktop
publishing, or graphics programs.

OCRSHARE is smart, fast and flexible but still easy to use.
We're sure you will find OCRSHARE an indispensable tool to
reduce those endless hours at the keyboard attempting to
computerize volumes of printed material.

Why OCR Software?

When it comes to inputting existing documents and hard copy
data into your computer, you basically have only two
options: either enter the data "manually" by retyping it
into your system, or have an Optical Character Recognition
(OCR) system do it for you.

The page image is first scanned into a bitmapped image file
using the utilities provided with your scanner. Each image
file represents a single page and contains many rows of
black and white dots (called pixels) which come from detecting
the light reflected from the characters, lines, logos,
dirt and other ink patterns on the page. While these graphic
images look like characters on your graphics display, they
are still only collections of dots.
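
As a rough illustration of what "rows of black and white dots"
means, the sketch below draws a small letter "A" as dots and packs
each row into bytes at one bit per pixel. The glyph, the bit order
(leftmost pixel in the high bit) and the zero padding are
assumptions made for this example; real TIF, PCX and IMG files each
have their own headers, layouts and compression.

```python
# A hypothetical 6x6 dot pattern for the letter "A" ('#' = black dot).
GLYPH_A = [
    "..##..",
    ".#..#.",
    "#....#",
    "######",
    "#....#",
    "#....#",
]

def pack_row(row):
    """Pack one row of dots into bytes, 8 pixels per byte.

    Assumption: the leftmost pixel goes in the highest bit and the
    row is padded with white (0) pixels up to a whole byte.
    """
    bits = [1 if ch == "#" else 0 for ch in row]
    bits += [0] * (-len(bits) % 8)  # pad to a multiple of 8 pixels
    packed = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        packed.append(byte)
    return bytes(packed)

# Show each row of dots next to its packed hexadecimal form.
for row in GLYPH_A:
    print(row, pack_row(row).hex())
```

To the computer the page is only these packed bits; nothing in the
file says that this pattern is the letter "A".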

OCRSHARE's major purpose is to learn to identify these ink
patterns and output their ASCII equivalents into a text file.
For example, a graphic "A" becomes an ASCII "A", and so on
until each character has been translated or converted to

OCRSHARE can read using its dictionaries of existing symbols
(stored in .FTF files) or easily learn to recognize new
symbols, company logos or characters in other accented
alphabets such as German, Greek, Spanish, French and Russian.
Thus graphic symbols and characters are translated
into their letter equivalents. Note that this process is
letter and word recognition, not language translation.
OCRSHARE can even be taught to read hand printed letters if
the penmanship is neat and consistent.
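
The idea of a trainable symbol dictionary can be sketched in a few
lines. This manual does not describe how OCRSHARE actually
recognizes characters, so the rule used here (pick the trained
symbol whose dots differ least from the unknown pattern) is an
assumption chosen only for illustration, and the ToyFont class and
the 3x3 patterns are hypothetical.

```python
def distance(a, b):
    """Count the dots that differ between two same-sized patterns."""
    return sum(ca != cb
               for row_a, row_b in zip(a, b)
               for ca, cb in zip(row_a, row_b))

class ToyFont:
    """A toy symbol dictionary: symbol name -> trained dot pattern."""

    def __init__(self):
        self.symbols = {}

    def train(self, name, pattern):
        # "Naming the ink blot": remember this pattern under a name.
        self.symbols[name] = pattern

    def recognize(self, pattern):
        # Return the name of the closest trained pattern.
        return min(self.symbols,
                   key=lambda name: distance(self.symbols[name], pattern))

font = ToyFont()
font.train("T", ["###", ".#.", ".#."])
font.train("L", ["#..", "#..", "###"])

# An "L" with one dot missing, as dirt or a weak scan might produce.
noisy = ["#..", "#..", "##."]
print(font.recognize(noisy))
```

A new symbol, such as a company logo, is added the same way: show
the pattern once and give it the text it should produce.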


Features vary from product to product as follows:

- Reads scanned TIF, PCX and IMG image files
- Outputs ASCII, WordStar or WordPerfect files
- OCR Features
- Trainable OCR Fonts
- Handles Skewed Text
- Reads Monospaced, Proportional and Typeset Text
- Comes with Helvetica, Times Roman & Courier OCR Fonts
- Adjustable Automatic Dirt/Spotting Filter
- Adjustable Automatic Graphics/Line/Box Filter
- Image Cleanup features
- Erase Inside
- Erase Outside
- Rotate Page
- No Direct Scanner Support

ATXSHARE Features ($45 Registration Required)
- All of the OCRSHARE Features above
- Latest Version
- No Shareware Nags
- Save/Cutout TIF, PCX, IMG Files (e.g., file conversion)
- Optional OCR font library ($25)

AdvanTex OCR ($395)
- Direct Scanner Support for Canon, Microtek, HP ScanJet
and ScanJet Plus, Panasonic RS505, 506, Chinon 2000 and
3000, Ricoh, Mitsubishi full page Handscanner, and
- Batch Mode Support
- Multipage Support
- Save/Cutout TIF, PCX, IMG Files (e.g., file conversion)
- OCR Font Library included
- Other extensions.


Ver 2.2 Enhancements

The following improvements, enhancements, features and capabilities
were added to the AdvanTex Release 2.2 base program
as an upgrade over earlier versions.

- Added a line separation preprocessor (Edit Menu) for
text which has lines joined vertically by a few touching
letters.
- Added a ligature (joined character) error detector
which properly marks untrained ligatures in the output
text (with XX).
- Added proper " processing
- Added word break test processor (Edit Menu)
- Improved context rules for capitalization and numbers
- Improved recognition at 200 dpi
- Improved recognition rules for all recognition
- Improved and expanded tolerance for line skew
- Improved word space processing.
- Upgraded all Scanner Drivers
- Upgraded all OCR Fonts

How to Use This Manual
Most PC users, novice to expert, generally hate to read
manuals. Right?! That's why we've made OCRSHARE as
intuitive and easy to use as possible.

We have included easy-to-understand lessons in our tutorial
section. These lessons provide you with clear, step-by-step
instructions on using all the simple, yet powerful features
of OCRSHARE. Even if you are an expert computer user, you
should at least glance through the lessons to get acquainted
with OCRSHARE. Once you understand the basics, you can then
use the manual's index to quickly find out how to make
OCRSHARE do what you want it to do.

You can learn even more about OCRSHARE by visiting the Tutorial
section and going through the step-by-step examples in the

Trouble Shooting
Help is always available while running OCRSHARE. Simply
press "Alt" and "H" together and a Help menu will appear on
your screen.



Computer Requirements

Your computer should have at least the following capabilities:

- An 8088, 8086, 80286 or 80386 based IBM/PC or
compatible computer. The faster the CPU chip the
better. An 8088/4.77 MHz will work, but will be
unbearably slow.

- DOS 3.0 or higher operating system.

- A full 640 K of main memory.

- A hard disk drive with at least 2-3 megabytes of
free space.

- Display card/monitor with graphics capability
(Hercules monochrome, CGA, EGA, VGA, etc.)

Running from Hard Disk or Ram Disk

OCRSHARE can be installed in virtually any directory on any
hard disk drive of your system, or even copied to and run from a
RAM disk (most effective when running the disk EMS simulator).


Notation

- When you see [sp], the space indicated is mandatory, so
press the space bar on the keyboard.

- When you see [enter], press the key marked ENTER on your
keyboard.

- When you see a letter preceded by the word CTRL (as in
[CTRL-C]), hold down the key marked "CTRL" or "Control"
on your keyboard and simultaneously press the letter
key indicated (in this case 'C').


Hard Disk Installation

The installation steps are summarized below.
Installing OCRSHARE is quite simple and mostly involves
copying files.

1. Create a subdirectory called OCRSHARE on
your hard disk. You may use another directory name,
although the examples assume OCRSHARE is used.

2. Use PKUNZIP to expand the files of OCRSHRxx.ZIP into
the directory.

3. Modify CONFIG.SYS if necessary.

4. Log into the OCRSHARE directory and run the program by


Modifying CONFIG.SYS

You should have a CONFIG.SYS file in your root directory.
CONFIG.SYS is a file which is read by the computer during
the power up boot and CTRL-ALT-DEL three-key boot sequence.
CONFIG.SYS, which must be on the boot disk, contains
configuration information for DOS as well as specifications
for the loadable device drivers.

While your CONFIG.SYS may contain more statements, the
statements used by OCRSHARE are:



Files & Buffers

We recommend the following minimum values for the FILES and
BUFFERS statements in CONFIG.SYS. More than the minimum is
perfectly OK.

FILES = 20


Note: If there are multiple FILES and/or BUFFERS lines in
your CONFIG.SYS (usually put there by different application
installation programs) keep only the largest and delete all
other lines.
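
The cleanup described in the note can be sketched as a small
script. It is an illustration only: the sample lines and values are
hypothetical, the consolidate function is not part of OCRSHARE, and
any real CONFIG.SYS should be backed up before it is edited.

```python
import re

# Matches lines such as "FILES=20" or "buffers = 15".
SETTING = re.compile(r"^\s*(FILES|BUFFERS)\s*=\s*(\d+)\s*$", re.IGNORECASE)

def consolidate(lines):
    """Keep one FILES and one BUFFERS line, each with the largest value."""
    largest = {}
    for line in lines:
        match = SETTING.match(line)
        if match:
            name = match.group(1).upper()
            largest[name] = max(largest.get(name, 0), int(match.group(2)))

    result, emitted = [], set()
    for line in lines:
        match = SETTING.match(line)
        if not match:
            result.append(line)        # unrelated statements pass through
            continue
        name = match.group(1).upper()
        if name not in emitted:        # first occurrence gets the largest value
            result.append(f"{name}={largest[name]}")
            emitted.add(name)
    return result

# Hypothetical CONFIG.SYS contents with duplicate statements.
sample = [
    "DEVICE=ANSI.SYS",
    "FILES=8",
    "BUFFERS=10",
    "FILES = 20",
    "buffers=15",
]
print("\n".join(consolidate(sample)))
```

The surviving FILES and BUFFERS lines stay where the first of each
appeared, and every other statement is left untouched.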


To start OCRSHARE type


OCRSHARE will normally automatically select the proper
internal graphics display driver and display mode for your
graphics card.

Note: Color monitors are supported in black and white (2
color) modes only since color modes are too memory intensive
for large images.

Forcing a Display Adaptor

If you have dual screens, an unknown brand display card, or
desire a different display mode, you may have to override
OCRSHARE's auto selection. To do this, simply add the appropriate
switch parameter after OCRSHARE on the command line
when you start OCRSHARE.

For example, the following forces OCRSHARE to use the
2-color 640x350 EGA driver.


Upon entering this command the following message will be
displayed:
IBM Enhanced Graphics Adapter, 640 x 350 2-color
to quit -press CTRL-C
to continue -press any other key

If you signal continue, OCRSHARE will set up for the card
you specified. The hardware configuration is
saved in the file OCRSHARE.CFG so that the next time you
start OCRSHARE your display adaptor is automatically selected.
If OCRSHARE seems to hang on start up, you may still have an
incompatible display card. Press CTRL-X to get back to the
prompt or reboot and append the correct option switch for
your specific display adaptor.


OCRSHARE /E:6 [enter] if you have an EGA compatible card
OCRSHARE /C:1 [enter] if you have a CGA compatible card
OCRSHARE /E:7 [enter] if you have a VGA compatible card
OCRSHARE /xxx [enter] (where xxx is one of the settings below)

Screen Selections

The screen selections supported at the time of the printing
of this manual are listed below.

Option  Graphics Card                              Screen Size

/A      AT&T Graphics Adapter                      640x400
/A:1    AT&T Graphics Adapter                      640x400
/C:1    IBM Color Graphics Adapter (CGA)           640x200
/E:1    IBM Enhanced Graphics Adapter (EGA)        640x350
/E:6    IBM Enhanced Graphics Adapter (EGA)        640x350
/E:7    IBM PS/2 MultiColor Graphics Array (MCGA)  640x480
/G      MDS Genius Display Adapter                 736x1008
/H      Hercules/AST Monochrome                    720x348
/O      Toshiba 3100                               640x400
/T      Tecmar Graphics Adapter                    720x352
/T:1    Tecmar Graphics Adapter                    720x352
/T:2    Tecmar Graphics Adapter                    720x704
/S      STB GraphicsPlus-II Adapter                640x352
/S:1    STB GraphicsPlus-II Adapter                640x352
/S:5    STB GraphicsPlus-III Adapter               640x400
/W      Wyse WY-700                                640x400
/W:1    Wyse WY-700                                640x400
/W:2    Wyse WY-700                                800x400
/W:3    Wyse WY-700                                1280x800
/X      IBM 3270 PC                                720x350
/X:1    IBM 3270 PC                                720x350

For Compaq Monochrome Graphics Monitors, use the following:

/A:1    Portable II and III plasma                 640x400
/C      Portable or CGA compatible                 640x200
/E:1    EGA with monochrome screen                 640x350

Note: VGA will always default to MCGA (/E:7)
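
For anyone scripting the startup of OCRSHARE (from a menu program,
for instance), the main screen selection table can be held as a
simple lookup. The sketch below copies the options from that table;
the function names are hypothetical and the script only builds and
describes the command line, it does not start the program.

```python
# Option -> (graphics card, screen size), copied from the main table.
SCREEN_OPTIONS = {
    "/A":   ("AT&T Graphics Adapter", "640x400"),
    "/A:1": ("AT&T Graphics Adapter", "640x400"),
    "/C:1": ("IBM Color Graphics Adapter (CGA)", "640x200"),
    "/E:1": ("IBM Enhanced Graphics Adapter (EGA)", "640x350"),
    "/E:6": ("IBM Enhanced Graphics Adapter (EGA)", "640x350"),
    "/E:7": ("IBM PS/2 MultiColor Graphics Array (MCGA)", "640x480"),
    "/G":   ("MDS Genius Display Adapter", "736x1008"),
    "/H":   ("Hercules/AST Monochrome", "720x348"),
    "/O":   ("Toshiba 3100", "640x400"),
    "/T":   ("Tecmar Graphics Adapter", "720x352"),
    "/T:1": ("Tecmar Graphics Adapter", "720x352"),
    "/T:2": ("Tecmar Graphics Adapter", "720x704"),
    "/S":   ("STB GraphicsPlus-II Adapter", "640x352"),
    "/S:1": ("STB GraphicsPlus-II Adapter", "640x352"),
    "/S:5": ("STB GraphicsPlus-III Adapter", "640x400"),
    "/W":   ("Wyse WY-700", "640x400"),
    "/W:1": ("Wyse WY-700", "640x400"),
    "/W:2": ("Wyse WY-700", "800x400"),
    "/W:3": ("Wyse WY-700", "1280x800"),
    "/X":   ("IBM 3270 PC", "720x350"),
    "/X:1": ("IBM 3270 PC", "720x350"),
}

def describe(option):
    """Return 'card, size' for a listed option (case-insensitive)."""
    card, size = SCREEN_OPTIONS[option.upper()]
    return f"{card}, {size}"

def start_command(option):
    """Build the DOS command line for a listed screen option."""
    key = option.upper()
    if key not in SCREEN_OPTIONS:
        raise ValueError(f"unknown screen option: {option}")
    return f"OCRSHARE {key}"

print(start_command("/e:6"), "->", describe("/e:6"))
```

Rejecting options that are not in the table avoids starting
OCRSHARE with a switch it may not understand.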

Mice

Although OCRSHARE does not at this time support a mouse
directly, it will work if you have a programmable mouse
hooked up to your computer. Such a mouse can emulate the
directional keys of the keyboard (left, right arrow, etc.).



Networks

OCRSHARE is not generally network aware but can be used from
a server. If your network drivers occupy too much
room in the first 640K, OCRSHARE will not have sufficient room
to run. Also be aware of OCRSHARE's use of EMS memory, with
particular attention to server disk caching. To install,
simply copy all of the programs and files to a server directory.
EMS Memory

EMS memory provides up to 8 megabytes of additional memory
to any IBM/PC compatible computer. This memory is not
directly addressable by the computer; it must be paged in and
out through a 64 Kilobyte buffer by software calls to an EMM
device driver. This driver, installed in CONFIG.SYS, must
conform to the Lotus, Intel, Microsoft (LIM) specification
3.3 or 4.0.

OCRSHARE and EMS memory

OCRSHARE uses the EMM driver to access the EMS memory.
OCRSHARE stores the full sized display image and the zoomed
out image in the EMS memory. While you do not need external
EMS memory for occasional work, you will find the performance
advantages of a real EMS card significant if you are
trying to process large numbers of pages.

EMS Memory Requirements

OCRSHARE images require real or simulated EMS memory.
The chart below gives you some idea of how much memory you
will need for different sized images.

Image Resolution    8.5x11 in

200 Dpi             about 500K
300 Dpi             about 1.1 Meg
400 Dpi             about 1.8 Meg

The formula for calculating the number of bytes required is:
(Resolution x Resolution x length x width) / 8 (bits per byte)
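
The formula can be checked with a few lines of code. This is a
minimal sketch that assumes a one bit per pixel (black and white)
image and an 8.5 x 11 inch page; the function name is made up for
the example.

```python
def image_bytes(dpi, width_in=8.5, length_in=11.0):
    """Bytes needed to hold a 1-bit-per-pixel page image."""
    pixels = (dpi * width_in) * (dpi * length_in)
    return pixels / 8  # 8 pixels (bits) fit in each byte

for dpi in (200, 300, 400):
    size = image_bytes(dpi)
    print(f"{dpi} dpi: {size:,.0f} bytes ({size / 1024:,.0f}K)")
```

The results fall in the same range as the rounded figures in the
chart. Doubling the resolution quadruples the memory needed, since
both the number of rows and the number of dots per row double.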

Note: If you have insufficient real EMS memory to store the
size of the image you are scanning, you will get an EMS
allocation error. You can still process the image by setting
OCRSHARE to simulate the EMS environment on disk, exiting
and restarting OCRSHARE.


Types of EMS Memory

Depending on whether you are using a real EMS card, an
extended memory EMS simulator, or a disk EMS simulator, the
memory used by the EMS system will, respectively, be paged
in and out to a real EMS card, or to extended memory, or to
your hard disk. OCRSHARE's internal EMS simulator is a
highly optimized form of a disk EMS simulator.

OCRSHARE's EMS simulator

If you do not have real EMS memory active, OCRSHARE will use
the free space on your hard disk as temporary virtual memory
and simulate the EMS function internally.

If, for example, you are running OCRSHARE from drive C:
which has, let's say, 400K of hard disk space remaining (use
CHKDSK to determine), then there will not be enough space to
run OCRSHARE. If you have another hard drive (e.g., D:) with
more space in it, you can log into that drive (D:), and run
OCRSHARE as long as OCRSHARE.EXE is in the path. Although
OCRSHARE is installed on C:, because it is called from
another drive (D:), the remaining space on that drive (D:)
will be used as the simulator.

Note: Make sure your hard disk has 1-2 megabytes of free
space on it. You can use DOS's CHKDSK utility to see how
much disk space is available. The OCRSHARE simulator is in
effect while OCRSHARE is running. When OCRSHARE exits, all
of OCRSHARE's simulator files are deleted.
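
The drive choice described above amounts to finding the first drive
with enough free space. In the sketch below the free-space figures
are hypothetical and the 2 megabyte requirement simply follows the
upper end of the note's recommendation; on a current system
Python's shutil.disk_usage(path).free could supply the real
numbers.

```python
def pick_drive(free_bytes, needed=2 * 1024 * 1024):
    """Return the first drive with at least `needed` bytes free, else None."""
    for drive, free in free_bytes.items():
        if free >= needed:
            return drive
    return None

# Hypothetical figures: C: is nearly full, D: has room to spare.
free = {"C:": 400 * 1024, "D:": 5 * 1024 * 1024}
print(pick_drive(free))
```

If no drive qualifies, free some disk space before starting
OCRSHARE with the disk simulator.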

Installing a disk cache in EMS Memory


If you try to run ANY EMS application with an "ill-behaved"
delayed write cache program installed in EMS, it will likely
result in a serious disk crash with corrupted FAT tables.
"Well-behaved" EMS disk caching programs will properly save
and restore EMS information as it operates.

While cache programs usually operate properly when set to
WRITE THROUGH mode (where all writes are done immediately),
the problem occurs when the cache program is set to do
DELAYED WRITES to the disk. This requires that the cache use
a clock tick to interrupt the application program to do the
disk writes. If the cache program DOES NOT properly save
and restore EMS page information during this interrupt, it
WILL cause a disk crash.


This warning does not apply to WRITE THROUGH cache programs
or VDISK programs, which are generally well behaved. If you
want to use DELAYED WRITES with your disk cache program,
install the cache memory buffer in Extended memory. Besides
using the unused 384K that is typically there, this leaves
the EMS Expanded Memory free for OCRSHARE to use.

Scanner Installation

OCRSHARE and ATXSHARE are shareware versions of Solution
Technology's Advantex OCR product, which directly supports
most desktop scanners. Because the installation of scanners
is often complicated (and costly) in terms of telephone
support, these shareware products are not distributed with
scanner drivers or scanning capability. To get direct
scanner support you must purchase regular AdvanTex. (See the
file README.1ST).

OCR on prescanned Images
You must scan images using your scanner's built-in utility
and save them in a TIF, PCX or IMG image file format.
OCRSHARE can then load these images and perform OCR or basic
image manipulation.



With OCRSHARE, you are only a few keystrokes from obtaining
usable text output in the form of either an ASCII text file
from pages scanned and saved in PCX, TIF or IMG image file

This Tutorial is designed to get you up and running with
OCRSHARE in the shortest possible time. If you feel that you
need more detailed instructions, or you are going to use
OCRSHARE almost immediately to do production work, simply
begin with the regular Tutorial (Tutorial) Section.


1. Start OCRSHARE by typing OCRSHARE at the DOS prompt,
then press [enter].

Note: As long as OCRSHARE.EXE is in the PATH command,
OCRSHARE can be called from any drive or sub-directory.
Please refer to your DOS manual for more information about
the use of PATH.

Moving Around

To move the selector bar, use the up, down, left, and right
arrow keys on your numeric/arrow key pad. The [PgUp] and
[Home] keys will move the cursor to the top of a menu, and
the [PgDn] and [End] keys will move the cursor to the bottom
of a menu.

The Six Basic Keyboard Keys
You normally only need to use the following six keys to
execute virtually any function in OCRSHARE.

Arrow Keys - The up and down arrow keys move the menu
cursor bar up and down any menu. The left and right
arrow keys move the cursor bar from menu to menu.

Escape Key - The Esc (escape) key acts as both your
dismiss menu (or cancel) key and as your attention
key. When you press the Esc key, OCRSHARE will stop what
it is doing and return to the previous menu screen.

[enter] Key - The [enter] key is used to select, execute or
activate the action highlighted by the cursor. Once you
have placed the cursor on the action of choice, press
[enter] to activate that action. The [enter] key may be
identified as RETURN or as an arrow on your keyboard.


Short Cut Keys
The following keys allow you to bypass moving around the

Home Key - This key moves you to the top of any menu or
list.

End Key - This key moves you to the bottom of any menu
or list.

PgUp/PgDn - These keys move you up or down one list page on
a scrollable list. If the list is not scrollable these
keys act the same as the Home/End keys.

CTRL Keys - A control key shortcut is indicated on the
menus by a ^ symbol followed by a letter. To execute
the function associated with a control key, hold down
the CTRL key and press the letter indicated on the
menu.

Function Keys - A function key shortcut is indicated on the
menus by an F followed by a number. To execute the
function associated with an F key, simply press the
indicated function key on your keyboard.

CTRL PgDn - When drawing a capture box with Select Area F9,
this combination will move the cursor box down the page
in much larger steps than the down arrow will.

CTRL PgUp - When drawing a capture box with Select Area F9,
this combination will move the cursor box up the page
in much larger steps than the up arrow will.

CTRL -> When drawing a capture box with Select Area
F9, this combination will move the cursor box
to the right in much larger steps than just
the right arrow will.

CTRL <- When drawing a capture box with Select Area
F9, this combination will move the cursor box
to the left in much larger steps than just
the left arrow will.


Exiting OCRSHARE

When it is time to exit OCRSHARE, use the arrow keys to
position the cursor over the Exit to DOS command under the
INFO main menu. Press [enter]. You may also exit OCRSHARE by
pressing the CTRL-X control key shortcut.

Perform OCR on an Image File

Load a scanned image file
1. Press F5 to select the image file loader
2. Press [enter] to select the OCRSHARE file format.
3. Press the Down Arrow Key to move the selector bar over
the PAGE1.ATX file name
4. Press [enter] Once to select that file to load.
5. Press [enter] Again to load the selected file.

Exploring the loaded image
1. Load an image file as described above
2. Select "Select Area" or press F9. A cross hair cursor
will appear.
3. Use the Arrow Keys to move the cross hair cursor to
interesting parts of the scanned image.
4. Press F10 to "Zoom In" on selected area
5. Press F10 again to "Zoom Out" to the full page view.
6. Press [esc] key to cancel the select area.

Load an OCR Recognition Font
1. Select "Font Settings..." on the OCR menu, or press
CTRL-F.
2. Select "Load Existing Font".
3. Move the cursor bar so that it is over TMSROMAN.FTF in
the fonts list.
4. Press [enter] once to select the font.
5. Press [enter] a second time to load the font.
6. Press Esc to get back to the OCR main menu.

Setup the Text Output file
1. Select "Text Settings..." on the OCR menu, or press
CTRL-T.
2. Select "Set Output Text File".
3. Type DEMO at the File: prompt (the .TXT extension will
be appended for you); press [enter].
4. Press [Esc] to get back to the OCR main menu.

OCR the full page
1. Load one or more OCR Fonts.
2. Load a scanned page image file (PAGE1.ATX for
example).
3. Setup text output file name.
4. Select "Convert to Text" or press F4 to perform OCR on
the full page and put the results into a newly created
file called DEMO.TXT. This output text file can be sent
to the printer, viewed from DOS, or imported into your
favorite word processor.

Selecting areas to OCR or Cutout
1. Press F9 (Select area)
2. Move the cross hairs with the arrow keys.
3. Anchor the first corner by pressing the [enter] key.
4. Pull the tiny icon box (dragon) using the arrow keys
until it surrounds the area desired.
5. Use the "+" and "-" keys to move the dragon to
different corners of the larger capture box.
6. When you are satisfied with the positions of all
corners, lock the capture box in place by pressing the
[enter] key.

Canceling a selected area
1. Press F9 (Select area)
2. Press [enter] twice.
3. Since the select area rectangle has no width or height
it is now canceled and full page is implied.

OCR of selected area
1. Mark an area of interest as described above.
2. Press F4 to perform OCR on the selected area and
either append to or overwrite the existing file called
DEMO.TXT. This output text file can be sent to the
printer, viewed from DOS, or imported into your
favorite word processor.

Unloading an OCR Font
1. Select "Font Settings..." on the OCR menu or press
the CTRL-F key.
2. Use Down Arrow to move the selector to TMSROMAN.FTF.
3. Press [enter] to pop up the font management menu.
4. Press [enter] to remove TMSROMAN.FTF

Saving a selected area
1. Mark an area of interest as described above.
2. Select "Save Area" or press F8.
3. Select a graphics format to save to. (OCRSHARE is
OCRSHARE's fast load internal graphics file format.
You must register to obtain the other formats.)
4. Type a file name, say CUTOUT, at the File: prompt; the
appropriate extension will be appended for you.
5. Press the [enter] key to use this file name.
6. The selected area of the image is now saved into the
file CUTOUT.ATX, in the graphics file format you
selected.

Training a New OCR Font
1. If you have not loaded PAGE1.ATX, do so now as described
above.
2. Use Select Area (F9) to draw a box around the letters.
Make sure that you have some upper case or tall letters
on each text line in the selected area.
3. Select "Font Settings..." or press CTRL-F.
4. If TMSROMAN.FTF is still loaded, unload it as described
above.
5. Select "Make a New Font" and type in OCRFONT (or any
other name you want).
6. Press the [enter] key to register that font for
training.
7. Press [Esc] to get back to the OCR main menu.
8. Select "Train Font" or press F7 to begin training
characters from the selected area into the newly
created OCRFONT.FTF.

a) As each letter is displayed and boxed, type the
corresponding letter on the keyboard followed by
[enter] key.
b) If multiple letters are boxed, you can type them
all in sequence, and press the [enter] key.
c) If the symbol boxed has no equivalent keyboard
letter, you should make up an unambiguous two or
three letter sequence to name the letter, and
press the [enter] key. Note that the next time
that symbol appears for training you MUST use the
SAME name you entered earlier.
d) To skip over dirt, spots or badly broken letters,
press [enter] without typing the name of a symbol.
Only named symbols are added to the OCR font data
base.
e) Press the [Esc] key any time to stop training and
get back to the OCR menu.

Saving the New OCR Font
1. When you are finished training the new font, select
"Font Settings..." or press CTRL-F from the OCR main
menu.
a) Select OCRFONT.FTF (i.e., the OCR font you have just
trained).
b) Move the selector bar to "Save Font" and press
[enter]. Except for the letters OCRSHARE hasn't
seen yet, you have just created a partially
trained font.

More about OCRSHARE's File Finders

OCRSHARE uses a menu window called a File Finder to select
files for loading and to name files for saving. A File
Finder displays a list of the available files of a given
type in the currently specified directory. Explore this
with the following steps:

1. The cursor bar is initially located on the File:
prompt bar. Press the up arrow key until the cursor
bar is on the PATH: *.ATX prompt line.
2. Press the down arrow key until the cursor bar is in the
box listing the OCRSHARE image file names.
3. Use the down arrow key again to move the cursor bar
down until it covers the file named PAGE1.ATX.
4. Press the [enter] key.

Note that the file name and the cursor bar pop up to the
File: prompt line. Press the [enter] key again to load the
PAGE1.ATX file into display memory. While loading any file,
OCRSHARE first displays a load progress window, then a
Computing Full Page View progress window as it constructs
the full page view of the image. When loading has finished,
OCRSHARE will display the full page view on your monitor.

5. Press the ESC key to hide the OCR drop down menu as
illustrated above. This lets you view the page without
anything in the way.

6. Press the ESC key again and the OCR main menu will
reappear.

Editing the File Name
All OCRSHARE text fields have an easy to use, built in, line
text editor. Notice the small vertical bar to the right of
PAGE1.ATX. This means that the field editor is in insert
mode.

1. Press the [Ins] key several times to toggle between
insert and overstrike mode.

In insert mode the cursor indicates the point between
letters where the next character will go. In overstrike
mode the block cursor covers the character which will be
overwritten. It indicates that you can edit the characters
of the file name in this field.

2. Press the Home key to move to the beginning of the text
field.
3. Press the End key to move to the end of the text field.
4. Press Left arrow key to move the text cursor (vertical
bar) to the point between the 1 and the period. This
is the position where we are going to begin editing.
5. Press the Backspace key to delete the 1. (You could
have also moved to the point between the E and the 1
and pressed the Del key).
6. Type the number 1 to reinsert the number 1.

7. Type the letter a to change PAGE1.ATX to PAGE1a.ATX
8. Press the Backspace key to delete the a.

Over typing the File Name
You can also type over the existing file name.

1. On any finder menu, move the selector bar to the File:
prompt.
2. Press the Home key. The text cursor will move to the
far left side of the file name.
3. Press the [Ins] (insert) key until the text cursor
changes to type-over mode. OCRSHARE is ready to type
over the existing character or the entire file name.
4. Type in the letters of the file name and press the
[del] key to delete any trailing letters.
5. Press [enter] to use this file name and the path
specified by the Path: field.

Incrementing Numbered File Names
OCRSHARE is designed to work easily with file names that end
in numbers. PAGE1.ATX ends with the number 1, which makes
it a numbered file. Note that the file extension (in this
case .ATX) indicates the type of file information rather
than just serving as the end of the file name.

1. Press the F2 function key on your keyboard. The file
name will change to PAGE2.ATX.
2. Press the F10 function key and the File name changes
back to PAGE1.ATX. You may increment or decrement the
file name in this manner for all numbers between 0
(zero) and 9999.
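
The stepping behavior described above can be sketched in a
few lines of Python. This is only an illustration, not
OCRSHARE's own code: the function name step_numbered_name is
hypothetical, and the sketch assumes that numbers are not
zero padded and that names which would leave the 0 to 9999
range are simply left unchanged.

```python
import re


def step_numbered_name(filename: str, delta: int) -> str:
    """Return filename with its trailing number changed by delta.

    Hypothetical helper: names without a trailing number, or whose
    new number would fall outside 0..9999, are returned unchanged.
    """
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension present
        stem, ext = filename, ""
    match = re.search(r"(\d+)$", stem)
    if match is None:
        return filename  # not a numbered file name
    number = int(match.group(1)) + delta
    if not 0 <= number <= 9999:
        return filename  # stay inside the documented range
    return stem[:match.start()] + str(number) + dot + ext


if __name__ == "__main__":
    print(step_numbered_name("PAGE1.ATX", +1))  # increment
    print(step_numbered_name("PAGE2.ATX", -1))  # decrement
```

The extension is split off first so that only the number at
the end of the base name is touched, which matches the note
above that .ATX describes the file type.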


1. Each type of finder has a separate data storage area
for its current information. This means that you can
set the path for the OCRSHARE finder to one directory
and the path of the PCX finder to another directory and
OCRSHARE will keep them straight.

Graphics Editing

The following graphics menu performs editing functions that
modify pixels, areas or the entire page in the display
memory. This enables you to modify scanned images to improve
the quality of graphics and OCR projects. OCRSHARE can load
and save images to or from PCX, TIFF, or (our own) OCRSHARE
file formats thus allowing you to convert image files for
different purposes.

Invert Page
Invert Page makes negatives of positive (black on
white) images.

1. If you don't have the image PAGE1.ATX still
loaded, load it again.
2. Select Invert Page by moving the cursor bar over Invert
Page and pressing [enter].
3. The black and white image reverses itself.
4. Press [enter] again and the image reverses, bringing it
back to its original state. It is that easy.

Flip Page Vertical
Flip page vertical rights pages which were accidentally
scanned upside down.

1. Select Flip Page Vertical and the image will turn
upside down so that the top is now the bottom.
2. Press [enter] again and the image returns to its
original upright position.
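
For readers who want to see what these two functions do to
the page data, here is a minimal Python sketch. It assumes a
one bit per pixel page held as a list of rows, with 1 for
black and 0 for white; that representation and the function
names are assumptions made for illustration and are not taken
from OCRSHARE itself.

```python
def invert_page(page):
    """Swap black (1) and white (0) for every pixel on the page."""
    return [[1 - pixel for pixel in row] for row in page]


def flip_page_vertical(page):
    """Reverse the row order so the top row becomes the bottom row."""
    return [row[:] for row in reversed(page)]


if __name__ == "__main__":
    page = [
        [1, 1, 0],
        [0, 1, 0],
        [0, 0, 0],
    ]
    for row in flip_page_vertical(invert_page(page)):
        print(row)
```

Both operations are their own inverse, which is why selecting
either function a second time restores the original image.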

Erase Inside
Erase Inside erases the area INSIDE a "Select Area"
rectangle and is good for selectively eliminating unwanted
parts of the scanned page. The area selected can be as small
as one bit or as large as the entire page.

1. Use Select Area to enclose an area inside the monitor
screen in the image as shown above.
2. Select Erase Inside. OCRSHARE will erase everything
within the box. The monitor in the image now appears
clear with a black border around it as shown above.

Erase Outside
Erase Outside erases the area OUTSIDE a "Select Area"
rectangle and is perfect for eliminating edge trash. The
area selected can be as small as one bit or as large as the
entire page.

1. Use Select Area to enclose an area inside the monitor
screen in the image as shown above.
2. Select Erase Outside. OCRSHARE will erase everything
BUT what is inside the select area box. The monitor in
the image now appears clear with a black border around
it as shown above.
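
The difference between the two erase functions can be
sketched as follows. As before, this is only an illustration
under assumptions: the page is a list of rows with 1 for
black and 0 for white, the select area is given as inclusive
top, left, bottom and right pixel positions, and the function
names are hypothetical.

```python
def erase_inside(page, top, left, bottom, right):
    """Clear every pixel that lies inside the select area rectangle."""
    return [
        [0 if top <= y <= bottom and left <= x <= right else pixel
         for x, pixel in enumerate(row)]
        for y, row in enumerate(page)
    ]


def erase_outside(page, top, left, bottom, right):
    """Clear every pixel that lies outside the select area rectangle."""
    return [
        [pixel if top <= y <= bottom and left <= x <= right else 0
         for x, pixel in enumerate(row)]
        for y, row in enumerate(page)
    ]


if __name__ == "__main__":
    page = [[1] * 4 for _ in range(4)]
    for row in erase_inside(page, 1, 1, 2, 2):
        print(row)
```

A rectangle one pixel wide and one pixel high is a valid
select area here, matching the statement that the area can be
as small as one bit.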


1. Your original image, PAGE1.ATX, has not been modified,
and will not be until you elect to save over it.

OCR Notes

1. Due to memory constraints, we provide direct, internal,
support for only a few word processor formats at this
time. If you are going into a different word processor,
you should select the Wordstar document format, and use the
import utility provided with your word processor to
convert the ASCII text file into your word processor's
internal text format.

2. OCRSHARE automatically avoids small spots of dirt as
well as pictures in PAGE1.ATX.

3. You may occasionally see extra spaces or symbols
seemingly run together in the Progress Monitor window.
In addition you may even see two or three lines as
OCRSHARE wraps the text output on the display. Don't
fret, the monitor window is only an early preview of
how conversion is doing. The actual page formatting is
done later, after OCRSHARE can analyze the relationship
of ALL of the symbols from the page. Therefore, the
actual text (.TXT) file will show the most accurate

Looking at the output text file.
You must leave OCRSHARE to examine your output text file.

1. Press CTRL-X or select Exit to DOS on the INFO main menu.
2. At the DOS prompt type in the following command:

TYPE PAGE1.TXT [enter]

Additional Information on Training

Naming the ink blots during training
Font training is the process by which the user teaches
OCRSHARE to recognize new characters. In its simplest form,
training is a two-step process.

1. OCRSHARE automatically locates and draws a box around
an unknown ink pattern.

2. You type in one or more ASCII character(s) to name the
ink pattern. OCRSHARE will remember your choice, and
during OCR translation it will output these ASCII
characters when it detects another ink pattern closely
approximating the one trained.

Proper training and symbol diversity
One of the major components of properly training a new OCR
Font is to select an area of symbols from a scanned page of
text that contains the font desired and a good mixture of
upper case letters, lower case letters, numbers and
punctuation.

Making an Alphabet Page
We have found that the following letter sequence, taken from
a standard keyboard does quite a nice job of training up a
character set. Not perfect, mind you, but easy to implement
and use. Note that there are extra spaces between each
symbol. Note also that the punctuation MUST NOT be put on
separate lines otherwise its size and relative position
relative to the rest of the letters in the font will be

` 1 2 3 4 5 6 7 8 9 0 - = \ ~ ! @ # $ % ^ & * ( ) +
A a B b C c D d E e F f G g H h I i J j K k
L l M m N n O o P p Q q R r S s T t U u V v
W w X x Y y Z z , . < > / ; ' : " [ ] { }

Since we have found that it is usually sufficient to train
each symbol 3 to 8 times, you should, for convenience, put a
number of sets of the above sequence on the same page. In
addition, you should generate a test paragraph (Quick brown
fox paragraph) which uses all of the letters and punctuation
in normal words and sentences.
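
If you prefer to generate the Alphabet Page text rather than
type it, a short script along these lines would do. It is a
sketch only: the function name make_alphabet_page and the
default of four sets are our own choices for illustration.
The four lines are copied from the sequence shown above.

```python
# The four-line training sequence shown above, kept as whole lines so
# that punctuation stays on the same lines as the letters.
ALPHABET_LINES = [
    "` 1 2 3 4 5 6 7 8 9 0 - = \\ ~ ! @ # $ % ^ & * ( ) +",
    "A a B b C c D d E e F f G g H h I i J j K k",
    "L l M m N n O o P p Q q R r S s T t U u V v",
    "W w X x Y y Z z , . < > / ; ' : \" [ ] { }",
]


def make_alphabet_page(sets: int = 4) -> str:
    """Return the text of an Alphabet Page with the given number of sets."""
    if sets < 1:
        raise ValueError("at least one set is required")
    one_set = "\n".join(ALPHABET_LINES)
    # A blank line separates each repeated set on the page.
    return "\n\n".join([one_set] * sets) + "\n"


if __name__ == "__main__":
    print(make_alphabet_page(4), end="")
```

The output can be redirected to a file, loaded into your word
processor, set in the typeface you want to train, and printed
for scanning. Choose a number of sets in the 3 to 8 range
suggested above.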

Point sizes and Scanning DPI
We suggest using 14 point type when printing an Alphabet
Page using a publishing system and doing your scanning at
the highest available resolution to get the best possible
typeface rendering. Since OCRSHARE has OMNISIZE we are
training on a typeface, not a point size.

Note: You need to press [character] [enter] to enter a
symbol into OCRSHARE's memory. If [enter] alone is pressed,
no training for the current symbol is accomplished.

Note: Should you inadvertently strike the wrong key(s), you
may untrain the mistrained symbol. See Untrain Symbol
described later for details.

About Autoskip and Manual Training Modes
As shipped OCRSHARE comes up in Manual Training mode. This
means that it stops on every symbol encountered. If you
switch to AutoSkip training mode OCRSHARE will, when at
least the minimum number of symbols have been trained, stop
only on symbols which do not meet minimum acceptable
criteria for matching symbols currently in the OCR font data
base being trained.

More on Autoskip Training
Autoskip means that OCRSHARE will automatically skip over
any character during training that meets the following
default criteria. The menu fields within the parentheses
below are found in the Convert Settings Control Panel menu,
which can be accessed by pressing the [CTRL-C] key.

- The training count for a character is at least 2 (per
Min Training Count)

- The confidence level is between 0-40 (per Confidence
Threshold)

- The confidence level of the second choice is at least
15 points more than the first choice (per Similarity
Threshold)

Autoskip will allow you to train a new font quickly and
accurately by ignoring known characters and concentrating on
unknown or unclear characters.
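
The three default criteria above can be read as a single
test. The following Python sketch shows one way to express
it. It is an interpretation of the criteria as written, not
OCRSHARE's internal logic: the names ConvertSettings and
should_autoskip are hypothetical, and the Max Training Count
setting is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ConvertSettings:
    """Default values taken from the three criteria listed above."""
    min_training_count: int = 2
    confidence_threshold: int = 40
    similarity_threshold: int = 15


def should_autoskip(train_count, first_confidence, second_confidence,
                    settings=None):
    """Return True when a symbol meets all three autoskip criteria.

    Lower confidence numbers mean a better match, as on the
    Interactive Training screen.
    """
    if settings is None:
        settings = ConvertSettings()
    trained_enough = train_count >= settings.min_training_count
    good_match = 0 <= first_confidence <= settings.confidence_threshold
    clear_winner = (second_confidence - first_confidence
                    >= settings.similarity_threshold)
    return trained_enough and good_match and clear_winner


if __name__ == "__main__":
    print(should_autoskip(3, 20, 60))  # trained, confident, distinct
    print(should_autoskip(3, 20, 30))  # second choice is too close
```

With this reading, a Confidence Threshold of 0 skips only on
a confidence of exactly 0, and a Similarity Threshold of 0
makes the last test always pass.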

About the Interactive Training screen.
Understanding this information is very important to properly
train a font. The categories are as follows:

Train Count: Indicates the number of times OCRSHARE has
been trained on the particular character it is displaying
in its Closest Choice List. You rarely need to train
OCRSHARE on any single character more than 3 to 8 times.

Confidence: OCRSHARE is numerically telling you how
confident it is that the Closest Choice List characters
match the character it has drawn a box around at the
top of the screen. A good match falls in the range
between 0-40. A number higher than 100 indicates that
OCRSHARE has no confidence it recognizes the boxed
character at the top of the screen properly.

In Autoskip training mode, as the training progresses,
OCRSHARE will start to see the characters for a second time,
and the confidence rises (numbers start to drop
dramatically) as the correct closest choices are displayed.

Font Name: Indicates what font in memory OCRSHARE has
chosen its closest choice list characters from. This
indicator is only important when more than one font has
been loaded in memory.

Closest Choice List: Indicates the best matches OCRSHARE
has for the boxed character at the top of the screen.
The Closest Choice characters always appear in descending
order with the best choice at the top of the list and
the fourth best at the bottom of the list.

Trained Alphabet: The list of characters in this box
indicates which characters have been trained at least
one time, and which characters OCRSHARE has yet to see.
Every time you enter a new character into OCRSHARE's
memory, it will darken that character on the Trained
Alphabet list. Characters OCRSHARE has yet to see will
remain halftone. An effectively trained font will have
at least all upper case letters (A-Z), lower case
letters (a-z), and punctuation marks darkened.

Translation Progress Window: This window will give you an
indication as to where the translation is in relationship
to the scanned page. The Translation Progress Window is
NOT to scale. Do not be concerned if the spacing of
characters and words is not exact, or if extra carriage
returns appear on the screen.

Training Your Own Fonts for Recognition
Although we have provided 3 pre-trained fonts, you may find
that you get best results when you train your own pages for
recognition. This is because there are so many fonts in
existence and therefore we cannot train them all.

Using Your New OCR Font
You can use NEWFONT alone or in conjunction with other
existing fonts simply by loading it into memory using the
"Font Settings..." function menu.

Multi font Capability
It is important to note that OCRSHARE can utilize more than
one font at a time. Multiple font capability will enable you
to properly translate a very broad range of printed material
containing mixed fonts on the same page.

Note: Very small type or low DPI scanning resolution may
require you to train special fonts as the height of the
characters may approach the lower mathematical limits of
OCRSHARE's OMNISIZE capabilities.

About the Convert Settings Control Panel
The options found on the Convert Settings Control Panel
allow you to modify the operation of OCRSHARE's OCR engine
and identically affect both training a font and translating
the character images to text. Normally you will not have to
change any of these settings.

Zone - This parameter determines which area of the image
will be trained, or converted to text.

"Whole page" will cause the training or conversion to
start at the left topmost character on the page
and will proceed to the right bottom character.

"Select area" will cause training to begin with the
top left most character inside a box drawn by
Select Area and will proceed to the right bottom
character in the capture box.

"Page mask" behaves in the same way as Select Area,
except that it uses a Select Area which remains
fixed, page after page. This is very useful when
converting text from the same zone on many pages.

Training Mode - Allows you to select one of two modes for a
font, Manual Training and Autoskip training. Autoskip
will be the mode that is used most often once you gain
familiarity with OCRSHARE.

"Autoskip" Autoskip training will occur at an
accelerated pace, because as OCRSHARE becomes more
confident of the symbols you are teaching it will
stop to train only characters in which it has
lower confidence.

"Train OCR" Manual training stops on every symbol,
regardless of the confidence level. This allows
for training of characters up to 128 times each
(optimal training is 3-8 times per character).

View Translation - Translation Monitor ON allows you to view
the conversion to text. Translation Monitor off causes
a faster translation to occur since the text does not
have to be displayed. NOTE: The actual text file is
created at the very end of the viewing stage, when the
message Formatting Output Page is displayed.

Set Max Symbol Size - Allows you to determine the largest
character or graphic that OCRSHARE will recognize
during training and translation. When this function is
selected, the dimensions of the capture box (drawn with
Select Area) will automatically be placed in the Edit Max
Symbol Size setting. Any graphic larger than this box
will be ignored. This is extremely handy for processing
pages containing both text and graphics, since the
graphics will be ignored during OCR.

Set Page Mask Bounds - This will determine the area of the
page mask, or the area that will be recognized during
training and translation. When this function is
selected, the dimensions of the capture box (drawn with
Select Area) will automatically be placed in Edit Page
Mask Bounds. Once the Zone is set to Page mask,
OCRSHARE will deal only with images inside this area.
Furthermore, only this area will be active from page to
page while Page mask is selected.

Set Spot Filter Size - Determines the size below which a
graphic is ignored during training and translation.
For example, any graphic less than 15 pixels in size
will be ignored. The shape of the graphic does not
affect the filter in any way.

Edit Max Symbol Size - This feature allows you to edit the
symbol size by entering the number of pixels you wish
the length and width of the symbols to be. However,
this function will rarely be used since the dimensions
shown here are usually set by Set Max Symbol Size.

Edit Page Mask Bounds - This feature allows you to edit the
page mask by entering the number of pixels for each
edge of the capture area. However, this function will
rarely be used since the dimensions shown here are
usually set by Set Page Mask Bounds.

Min Training Count - This number, between 2 and 9 inclusive,
determines the minimum number of times each character
will be trained before OCRSHARE will consider skipping
over it during training.

Max Training Count - This number, between 2 and 9 inclusive,
determines the maximum number of times any character is
trained when Training Mode is set to autoskip.

Confidence Threshold - This number only has effect in the
Manual Train, Autoskip mode and does not usually need
to be changed. OCRSHARE's certainty in identifying
characters depends on a number of parameters, including
the quality of the scan, the consistency and quality of
the type and how well the font is trained. Changing the
Confidence Threshold number determines how sure we want
OCRSHARE to be in identifying characters before
autoskipping to the next character. Setting the
Confidence Threshold to 0, however, means that OCRSHARE
will NEVER autoskip except for PERFECT matches. A
Confidence Threshold of 40 represents allowing about 15%
error before a good match is failed during training.

Similarity Threshold - This number only has effect in the
Manual Train, Autoskip mode and does not usually need
to be changed. Changing the Similarity Threshold
determines how close the first two choices of symbols
must be before OCRSHARE will stop and ask for
confirmation. OCRSHARE wants to have confidence that it
is correctly differentiating its choices before it
skips to the next letter. Setting the Similarity
Threshold to 0 effectively turns the detector off.
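
The two size-based settings in this list, Set Spot Filter
Size and Set Max Symbol Size, both decide whether a piece of
ink is considered at all. The Python sketch below shows the
idea. It rests on assumptions the manual does not confirm:
that the spot filter counts ink pixels, that the maximum size
is compared against a bounding box, and that the example
limits of 60 by 80 pixels are reasonable. The names Blob and
is_candidate_symbol are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Blob:
    """A connected patch of ink found on the page (illustrative only)."""
    width: int       # bounding box width in pixels
    height: int      # bounding box height in pixels
    ink_pixels: int  # number of black pixels in the patch


def is_candidate_symbol(blob, spot_filter_size=15,
                        max_width=60, max_height=80):
    """Return True if the blob should be passed on to training or OCR."""
    if blob.ink_pixels < spot_filter_size:
        return False  # too small: treated as dirt, shape is irrelevant
    if blob.width > max_width or blob.height > max_height:
        return False  # too large: treated as a picture and ignored
    return True


if __name__ == "__main__":
    print(is_candidate_symbol(Blob(width=3, height=3, ink_pixels=7)))
    print(is_candidate_symbol(Blob(width=20, height=30, ink_pixels=180)))
    print(is_candidate_symbol(Blob(width=400, height=300, ink_pixels=9000)))
```

Anything that passes both tests is then subject to the
training counts and thresholds described above.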

Handling Special OCR Problems

Broken Letters
You can lose proper symbol capture if letters are broken in
such a way that a clear vertical white space is visible.
This is usually caused by a light ribbon or an nth order
photocopy. First, try rescanning using your utility
software with a darker contrast setting. Second, reprint
the page with a new ribbon. Third, use manual training and
skip over the broken letters.

Broken Dot Matrix
See Broken Letters (above).

Run together Letters
OCRSHARE can separate letters which overlap if there is at
least a thin white space snaking down between them.
OCRSHARE will, however, clump multiple symbols together
into a single symbol if they physically touch or run
together in any manner, no matter how thin the touch point.
First, try scanning with a lighter contrast setting or a
higher DPI resolution. Second, reprint the script training
page with a space between each letter. Third, use the Erase
Inside function with a one pixel wide box to erase a thin
line between the joined letters. Fourth, use manual training
and either skip over the run together letters or train them
as letter pairs.

Big Dirt Spots
If the page has dirt spots larger than the minimum spot
size, these spots might be treated as valid symbols.
First, try cleaning the scanner glass. Second, inspect
your printed training page and use white out to cover any
spots you may find (these are usually flecks in the paper
itself). Third, try scanning with a lighter contrast
setting to reduce sensitivity to spotting. Finally, use the
Erase Inside and Erase Outside functions to eliminate any
problem areas.

On some printers the underscore symbol is so far under the
normal baseline that it often becomes split off onto a
separate line. If this is occurring, it may cause an
occasional spurious symbol to appear on a line by itself.

The % Symbol
In some fonts, OCRSHARE will (sometimes) separate the %
symbol into a lower case o followed by a /o. If this is
occurring, you may have to train using the Manual training
modes. We suggest tagging the lower case o as %o and the /o
as %/o then using the font editor, redefining %o as the
ignore symbol (~~) and %/o as the % symbol.

For reasons similar to the % symbol, in some italic fonts,
OCRSHARE will separate the dots above i's and j's and the
dots below ?'s and !'s. If this is occurring, you should
train the page using a manual training mode. We suggest
tagging and redefinition in a manner similar to that used
for the % symbol.

Advanced Training Methods

While there are many printed symbols which do not appear on
your keyboard, you will quite often find that you want to
recognize them and translate them into something sensible in
your output text file.

Training Foreign Characters
Foreign alphabets have accented symbols which are not found
in the English alphabet and, as a consequence, do not occur
on US keyboards.

Handling foreign letters and special symbols requires two
steps:

- Train the foreign font database, coding the special
characters as described.

- Edit the font database, redefining all of the specially
coded characters so that it will, when performing OCR,
form the correct letters and output them to your output
text file.

Training from FRENCH.ATX.
1. Load the FRENCH.ATX image file.
2. Use Font Settings CTRL-F to define a new font called
FRENCH.
3. Select Train Font, or press F7.
4. Train normal letters with single keystrokes. When you
encounter a foreign language symbol, refer to the
next section to learn how to type in a two-character
sequence for the symbol. Continue until you have
completed training the page. Note that the double
letter symbols are NOT displayed in the Trained
Alphabet list.

Training Foreign Language Letters
As you train, you will need to name the letters of the
various foreign alphabets which may not appear on your
keyboard. When this occurs, you will have two choices. The
first is to input the proper ASCII extended character
number by depressing the ALT key and typing the number on
your number key pad. A chart with all of the appropriate
ASCII extended character numbers has been supplied in the
Trouble Shooting section of this manual.
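
As a quick cross-check of the numbers you type with the ALT
key, the following Python sketch looks up the code for a few
French letters. It assumes your PC uses the standard US
character set (code page 437); other code pages assign
different numbers, so treat the results as examples rather
than a replacement for the chart. The function name alt_code
is our own.

```python
def alt_code(letter: str) -> int:
    """Return the number to type on the keypad while holding ALT.

    Assumes the machine displays text using code page 437.
    """
    encoded = letter.encode("cp437")
    if len(encoded) != 1:
        raise ValueError("expected a single character")
    return encoded[0]


if __name__ == "__main__":
    for letter in "éèàçêù":
        print(letter, alt_code(letter))
```

For example, holding ALT and typing 130 on the number key pad
produces the forward accented e under this assumption.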

There is an alternative method. You can make up an
equivalent training name for each of these unusual symbols.
The following examples show some of the cases we have
encountered and the practical coding rules we made up to
train these foreign letters on an English keyboard.

Font Editing for correct OCR output
When you are finished training from FRENCH.ATX, you need to
edit how the font outputs the foreign language symbols to
the text file. Select Font Settings... CTRL-F from the OCR
main menu.

1. Select the "Font Settings..." sub menu.
2. Move the cursor bar down to FRENCH.FTF.
3. Move the cursor bar down to Edit and press [enter].
4. Press the [end] key.
5. Move the cursor bar to the first multi letter symbol you
want to redefine (e') and select it by pressing
[enter].
6. Select Redefine from List.
7. Move the cursor down the list with the down arrow key
until it rests on the forward accented e symbol. Press
[enter] to select this e'.
8. Repeat steps 5, 6, and 7 for each of our specially coded
French letters.
9. Press [esc] when you are finished to get back to the
"Font Settings..." menu.
10. Save your newly modified font.

Using your new French OCR font
Provided you have captured all of the letters, numbers and
punctuation of the typeface, you are now ready to use the
OCR function (F4) to convert French documents printed in
that typeface. Let's try it now on the same FRENCH.ATX that
we have loaded. Set an output file using Text Settings...
CTRL-T. Press F4 to activate the conversion.

Training Special Symbols

OCRSHARE's ability to recognize special symbols can be quite
handy when you intelligently redefine them for your
particular needs. In this section you will learn how we code
and redefine a few special symbols in ways that generate
useful output. This section is intended to be used as an
example as, in reality, there are thousands of potential
uses for this technique.

Copyright symbol

Code as (c)


We suggest that you code square bullets as "[-]"
and round bullets as "(-)". For example, in Ventura
Publisher text we use the font editor to redefine both
of our training keystrokes "(-)" and "[-]" to the same
string "@BULLET = ".

The Hand Symbol.

If you want to train OCRSHARE to recognize the Hand
symbol we suggest that you code this as "!H!". For
Ventura Publisher we redefine "!H!" to "@NOTE =".
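
The code-then-redefine technique amounts to a lookup table
from training names to output strings. The Python sketch
below illustrates it using the codes suggested in this manual
(e', the bullet codes, !H!, and the %o and %/o pair). The
table layout and the function name render_output are
assumptions for illustration; they do not describe how the
font editor stores redefinitions.

```python
IGNORE = "~~"  # the ignore symbol named in the % symbol discussion

# Training name -> redefined output, using codes suggested in the manual.
REDEFINITIONS = {
    "e'": "é",            # forward accented e
    "(-)": "@BULLET = ",  # round bullet, Ventura Publisher tag
    "[-]": "@BULLET = ",  # square bullet, same tag
    "!H!": "@NOTE =",     # hand symbol
    "%o": IGNORE,         # first half of a split % symbol
    "%/o": "%",           # second half becomes the % symbol
}


def render_output(trained_names, redefinitions=REDEFINITIONS):
    """Join recognized symbol names into output text."""
    pieces = []
    for name in trained_names:
        text = redefinitions.get(name, name)  # unlisted names pass through
        if text == IGNORE:
            continue  # ignored symbols produce no output
        pieces.append(text)
    return "".join(pieces)


if __name__ == "__main__":
    print(render_output(["c", "a", "f", "e'"]))
    print(render_output(["5", "0", "%o", "%/o"]))
```

Because the table is keyed on the training name, every
occurrence of a symbol must be trained with the same name, as
noted in the training tutorial.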

OCR Tuneup Fonts

Occasionally you will use an existing font which does not
give you a good translation. Often the reason is a variation
in the design of the printed typeface. For example, there
are over 25 significant variations of the Courier typewriter
font. Other frequent causes are letter distortions caused by
poor quality printing or nth order photocopying. Another
frequent cause is the accidental ligatures generated by
either scanning at too low a resolution or by the original
typesetter setting the letters so closely that they
physically touch. In either case this causes a new,
untrained, symbol to be defined.

The easiest way to handle this is to train a tune-up OCR
font which will augment your regular font. A tune-up font is
loaded in addition to your base font to supply the
additional trained symbols you need for this particular
document.

Some tune-up fonts you create will be temporary, meant only
to aid the translation of a specific problem document. These
can be discarded after you use them. Other tune-up fonts
will be variations of a base OCR font. These you should keep
to load again with your base font.

Creating a tune-up font.
Note: You can press the ESC key at any time to back out of
these menus without doing anything.

1. Select the Font Settings...CTRL-F menu.
2. Select Load Existing Font. Select and load your base
font, e.g., TIMES.FTF.
3. Select Create New Font. [enter] TIMES-HP. This will
create the tune-up OCR font TIMES-HP.FTF.
4. Load or scan the page containing the font to be tuned.
5. Interactively train the whole page or a selected area
of the page where the problem symbols are. (OCRSHARE
will only stop on symbols which are problematical in
BOTH the base and the tune-up OCR databases.) Note: type
in multi-character strings when OCRSHARE stops on
run-together characters.
6. When completed, save the tune-up font in the normal
way.

7. To use the tune-up font later, load the base font AND the
tune-up font and proceed normally with your OCR
translation.
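
The way a tune-up font supplements a base font can be sketched in
Python as a lookup across several loaded fonts. This is a simplified
picture under stated assumptions: real recognition matches ink
patterns approximately, whereas here each pattern is stood in for by
an exact string key, and the names recognize, base_font and
tuneup_font are hypothetical.

```python
def recognize(pattern, fonts):
    """Return the text for a pattern from the first font that knows it.

    `fonts` is a list of dicts, base font first, tune-up fonts after.
    Returns None when no loaded font knows the pattern, which is the
    case where interactive training would stop and ask.
    """
    for font in fonts:
        if pattern in font:
            return font[pattern]
    return None


# Exact string keys stand in for trained ink patterns (an assumption).
base_font = {"pat-a": "a", "pat-f": "f", "pat-i": "i"}
tuneup_font = {"pat-ffi": "ffi"}  # run-together group trained as one symbol

loaded = [base_font, tuneup_font]
print(recognize("pat-a", loaded))    # known to the base font
print(recognize("pat-ffi", loaded))  # supplied by the tune-up font
print(recognize("pat-xyz", loaded))  # unknown to both fonts
```

Only the last lookup fails in both databases, so only that symbol
would interrupt training.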

Derivative Fonts

Occasionally you will use an existing font which does not
give you a good translation. When the primary reason is a
variation in the design of the printed typeface, as an
alternative to tuning up a font, you may want to either
correct the existing font or create a derivative font. In
either case the procedure is identical.

Testing an Existing Font.
1. Set up the Convert Settings control panel for normal
interactive training.
2. Load the OCR base font which seems to be giving
problems.
3. Load or scan the sample page which is giving problems
when using the specified base font.

4. Perform a test translation. Write down the symbols
which are not being recognized properly.

Editing the font.
5. You now need to untrain each of the problem characters
in the OCR font. Bring up the Font Editor by selecting
Font Settings...CTRL-F, then highlighting the font and
pressing [enter], and selecting Edit.
6. Use the down arrow key to move to each problem
character noted in the test translation.
7. Select it.
8. Select Untrain Symbol. The definition of that symbol
will be deleted from that font's database in memory. It
is now as if the symbol was never trained in the first
place. The font database on disk has not been modified
because we haven't saved the one in memory yet.
9. Repeat steps 6, 7 and 8 for each problem character.

Retraining your font.
10. Using the newly edited font AND the same image,
activate Train Font F7. OCRSHARE will stop on each of
the symbols you've untrained (and occasionally some of
the other normal letters) so that you can retrain, or
enhance the training of already-trained symbols. This
should go very quickly.
11. Once you've finished training the font, you should save
it. If you save the font under a different name, you
have created a derivative font. If you save the font
under its original name, you have corrected the
existing font.

Trouble Shooting

The following is a checklist of items to verify before running

I installed OCRSHARE according to the instructions, but when I
started it, I did not get the menu display.

a) The first thing to try is to delete the OCRSHARE.CFG file,
if it exists, then restart OCRSHARE. If the graphics
display adaptor shown when you restart is not the one
you have installed then you will have to override
OCRSHARE's selection.

b) If the above fails, check to see if the monitor and
display card you have are really a graphics display. If
they are, make sure that you are selecting the correct
override switch for the graphics display card you have.

c) Delete the OCRSHARE.PRO setup file.

The machine locks when I try to run OCRSHARE

a) Be sure enough memory is available, at least 520K free.
Use CHKDSK to be sure. Also, release all TSRs
(terminate-and-stay-resident programs) you may have loaded
before running OCRSHARE; we need the space.

b) Make sure that a proper screen driver is loaded (see
Selecting a Screen Driver in Installation).

The machine just jumps back to DOS when I run OCRSHARE.

This may be because you are using the EMS simulator and have
very little space left on your hard disk. Be sure to make 2 to
4 megabytes available on your hard disk, or run from another
hard disk or partition with more space on it.

The menu display is up but the keyboard won't work.

Press the NUM LOCK key; the NUM LOCK light on the keyboard
should always be OFF.

The Mouse Won't Work

OCRSHARE does NOT support a mouse at this time unless your
mouse is programmable in such a way that it will emulate
keystrokes such as up arrow, down arrow, enter, Page Down,
etc. Consult your mouse manual for specific instructions for
setting up your mouse in this manner.

When saving an image, the hard disk light "churns forever", and
the on screen status bar barely moves.

There is not enough hard disk space available to save the
image. Press Esc, and either use the File Manager
(INFO menu) to delete some files and make more room, or Exit
to DOS CTRL-X to do the same.

The scanned image is garbage, or no image shows at all.

The most likely problem here is that you are using a scanner
which has its own image buffer mapped into the address space
above 640K and another device such as a graphics display
card, an EMS card or a network card which conflicts with it.
Verify whether an EMS conflict is occurring by
selecting Load Page F5 and loading the EMS test page. If the
cross hair pattern is not clean or has the same garbage as the
scanned image, there is an EMS conflict.

Resolve the address conflict by changing the address of
the scanner card or the address of the conflicting card.
Make sure that you edit the CONFIG.SYS file to tell the

Finally, make sure that the scanner is clean. You can
easily check to see if the unit is clean by scanning a pure
white page. If an image with many spots results, clean the
scanner.

The Quick Start procedure shows nothing/garbage in the
Translation Window.

Your scanner is probably not working properly. Press Scan
Page F3 to scan the WELCOME TO OCRSHARE page. An image of
the page should show up on your monitor in black characters
on white background. If not, walk through the Scanner
Problems, and Scanning sections earlier in Trouble shooting.

When training, nothing happens.

If nothing happens, and you are "kicked" back to the main
menus, make sure that the loaded image consists of dark
characters on light background. If not you must invert the
page using "Invert Page" off the EDIT menu.

OCRSHARE is skipping characters that it has never seen before.

This is probably because the untrained character being
skipped is too much like another that has been trained.

a) Press Esc to exit the training screen.

b) Select Convert Settings... CTRL-C and set the Training
Mode to Train OCR manual.

c) Use Select Area F9 to capture the word(s) in which the
character(s) being skipped are located.

d) Select Train Font F7 and train the character(s) that
were being skipped.

e) When finished, you may reset the Training mode to
Train OCR, autoskip.

I keep seeing "Out of Memory at row XXX. Please set narrower

If your page is very complex, you may need to select an
area of text to train/convert on, rather than the whole
page. More often than not, the cause is skewed, or sloping,
lines of text. Use Straighten Page on the GRAPHICS menu to
horizontally align the image. Also, perform a CHKDSK from
DOS to verify that at least 520K is free before running
OCRSHARE. Be sure to release any TSRs before running
OCRSHARE.

How do I know which font to load when converting to text?

A font guide comes with the registered version of OCRSHARE.
If you have this guide, compare the typestyles on your page
to those of the samples and load the one(s) that are the
most similar. If you are not sure which font is most like
yours, you can load up to five fonts that are most like the
one on your page.

There are too many/too few line feeds and/or spaces in the text

a) View the text (.TXT) file from DOS using the TYPE
command, or use your favorite word processor. The
Translation Window in OCRSHARE is not to scale, meaning
not all spacing is shown correctly.

b) Remember to set Proportional line Spacing if the type
on the page is proportionally spaced, and to set Fixed
line spacing if the type is fixed-spaced or typewritten.

How do I skip a character during training?

Press [enter] only without naming the symbol. A symbol is
trained only when character(s) are entered and return is
pressed.

What happens if I try to train a font when more than one font is
loaded?

OCRSHARE will show you the "best match" for the current
character based on all the loaded fonts, even though only
the font set to "Train" is being modified. Unless you are
training an existing font as a derivative font (see Lesson
5), you should remove all fonts other than the one being
trained for best results. Then simply reload all the desired
fonts for conversion to text after training is finished.

When F7 (Train Font) is selected, nothing happens, and the program
returns to the menus.

a) Make sure that the image is dark characters on light
background. If not, change the Image Background
(Scanner settings... CTRL-S) to the opposite of what is
currently displayed, and re-scan the page.
b) Make sure that Whole page is selected as the Conversion
zone (in Convert Settings... CTRL-C), or that the
Select area or Page mask is of sufficient size to
capture the area of the page you want to train on.

The characters are badly broken or OCRSHARE is stopping on pieces
of a character during training.

1. Try increasing the scanner contrast to a higher setting
(toward Dark) so that the characters darken up and the
broken pieces become continuous (see Lesson 1).
2. Use the following "Tag" coding of the character to
ignore or consolidate parts of the broken symbols.

a. "Tag" the first piece of the character by entering
a two- or three-character sequence, then press
return.
b. Tag the second piece of the character by entering
the correct symbol. For example, if the character
is supposed to be an "A", but appears in two
pieces, identify the first piece as "AA", and the
second as "A".
c. When training is completed, prepare to Edit the
font per the instructions under Editing the Font.
d. Once the Font list window is displayed, Page Down
to the end of the table and locate the two- and
three-character symbols. Highlight the row with
"AA" and press return.
e. Select Redefine as Text and enter the ignore
marker, i.e., two tilde marks (~~).
f. Repeat steps d and e for each of the remaining
multi-character tags.
g. During conversion to text, the first piece will
now be recognized, but nothing will be output; the
second piece will be translated to the symbol
assigned to it.
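
The outcome of step g can be illustrated with a small Python sketch.
It assumes, purely for illustration, that conversion walks a list of
recognized symbol codes and drops any code redefined as the ignore
marker; the names convert and redefs are hypothetical and do not
describe OCRSHARE's internals.

```python
IGNORE_MARKER = "~~"  # the two-tilde ignore marker from step e


def convert(symbols, redefinitions):
    """Turn recognized symbol codes into output text.

    A code redefined as the ignore marker is recognized but
    contributes nothing to the output.
    """
    out = []
    for code in symbols:
        text = redefinitions.get(code, code)
        if text == IGNORE_MARKER:
            continue  # first piece of a broken character: output nothing
        out.append(text)
    return "".join(out)


# The broken "A" was trained as two pieces: "AA" then "A".
redefs = {"AA": IGNORE_MARKER}
print(convert(["C", "AA", "A", "T"], redefs))
```

The tagged first piece disappears and the second piece supplies the
letter, so the word comes out with a single "A".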

More than one character is included in the training box on the
screen.

a) Try setting the scanner resolution to its highest DPI
setting. This ensures that the scanner has the best
chance of seeing white space (if any) between adjacent
characters (see Lesson 2).
b) If the scanner is already at its highest DPI setting
and you still have excessive problems with run-together
letters, try decreasing the scanner contrast a little
bit towards lighter. This gives the scanner a better
chance of seeing white space (if any) between
adjacent characters.
c) Some characters may actually be run together. Train
these groups of run-together characters as
multi-letter (ligatured) symbols. You will usually find that
the typesetter consistently ran certain letter pairs
together. As long as the sequence is 40 characters or
less, the corresponding sequence can be entered. For
example, if the sequence "iff" should appear in the
training window instead of "i", "f", and "f", enter
"iff" from the keyboard, then press return.

The dots on the i's and j's are not included in the box during
training.

Try changing the Spot Filter (in Convert Settings... CTRL-C)
from 15 to 8 dots. OCRSHARE is not finding the dot since its
size is smaller than the one specified in the Spot Filter
setting.

More than one character, vertically, or more than one full line,
are being displayed during training.

a) Occasionally, more than one line of text will be so
closely spaced that descending characters (e.g. "y")
will hang into the next line of text. In these
instances, OCRSHARE cannot determine where one
character ends and the other begins, so it will capture
both as one character. Do not train on any such symbol,
just press return only.

b) Another possibility is that two lines have some dirt,
malformed or particularly large characters invading the
interline space which causes a lack of line separation.
The trick here is to provide OCRSHARE with just enough
space between the lines of text so that it can
distinguish among them. A very simple method for doing
this is to draw a box (using Select Area, F9) between
the two lines whose dimensions are the length of the
line horizontally, and about 1/32 to 1/16 inch in
height (a very short rectangular box is needed). Once
the box is drawn, select Erase Inside (GRAPHICS menu)
and all marks within this box causing poor line
separation will be removed.
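
The box-and-erase trick in b) can be sketched in Python on a toy
bitmap. The sketch assumes an image held as rows of 0 (white) and 1
(ink); the function names inches_to_rows and erase_inside are
hypothetical stand-ins for the Select Area and Erase Inside
operations, not OCRSHARE functions.

```python
def inches_to_rows(height_in, dpi):
    """Convert a height in inches to a whole number of pixel rows (at least 1)."""
    return max(1, round(height_in * dpi))


def erase_inside(image, top, left, bottom, right):
    """Set pixels in rows top..bottom-1, columns left..right-1 to 0 (white)."""
    for row in range(top, bottom):
        for col in range(left, right):
            image[row][col] = 0


# Toy bitmap: 1 = ink, 0 = white.
image = [
    [1, 1, 0, 1, 1, 1],  # bottom of the upper text line
    [0, 1, 0, 0, 0, 0],  # stray marks bridging the interline gap
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 1],  # top of the lower text line
]

# Erase a thin full-width box covering only the gap rows.
erase_inside(image, top=1, left=0, bottom=3, right=6)
for row in image:
    print(row)

# Box heights of 1/32 and 1/16 inch expressed in rows at 300 DPI.
print(inches_to_rows(1 / 32, 300), inches_to_rows(1 / 16, 300))
```

After the erase, the two gap rows are entirely white, which is the
clean separation between lines that the procedure aims for.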

Some lines of text are accurately translated, yet others are
completely mis-translated; or, characters from two consecutive
lines are appearing in the training window.

a) This usually happens when OCRSHARE is having trouble
with line separation. That is, OCRSHARE cannot
distinguish the bottom of one line from the top of the
next. Check the physical page to make sure there are no
stray marks between the lines that are displayed
simultaneously. Also check for any underlined words.
Try erasing or masking these marks and re-scan the
page.

b) Make sure the image is straight. If it is not, re-scan
the page, or Straighten the image (see Lesson 1).

The larger characters, and sometimes normal-sized characters, are
being ignored during training and/or translation.

Remember that any symbol larger than the Max Symbol Size
setting (Convert Settings...) will be ignored. This will
include any large single characters and any multi-character
symbols (e.g. ARL) whose combined size is larger than the
specified symbol size. Even though the characters A,R,L are
normal-sized, their being so close makes them appear as one
character to OCRSHARE.

Can OCRSHARE handle columnar material?

Yes. It is best to convert columns to text one at a time,
using Select area to draw a box around each column in
turn.

What is the best way to OCR a complex page, i.e. one with both
pictures and text on it?

There are actually three ways a complex page can be
handled:

a) Perhaps the best way is to do nothing, as OCRSHARE will
automatically ignore graphics (pictures) as it performs
OCR on the page.

b) Select Convert to Layers (GRAPHICS menu). This feature
will separate the image into three files. The first,
LARGE.RAS will contain all images (usually pictures)
greater than the Max Symbol Size (Convert Settings...).
MEDIUM.RAS will contain all the images (usually text
images) greater than the spot filter setting and less
than or equal to the Max Symbol Size. SMALL.RAS will
contain most of the stray marks and spots on the page.
Converting to layers is a convenient way of
automatically separating the page into its various
components.

c) Use Erase Inside on picture portions of the page. If
you do not wish to keep the pictures, you may decide to
remove them entirely from the image so that you can see
what you are doing with the text more readily.
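
The size rules behind Convert to Layers in b) can be written out as a
short Python sketch. It assumes each ink pattern has a single size
value in dots that is compared against the two settings; how OCRSHARE
actually measures size is not stated here, the function name
layer_for is hypothetical, and the Max Symbol Size of 100 used in the
example is an arbitrary illustrative value.

```python
def layer_for(size, spot_filter, max_symbol_size):
    """Pick the layer file for one ink pattern, following the rules in b).

    size            -- size of the pattern in dots (assumed measure)
    spot_filter     -- the Spot Filter setting
    max_symbol_size -- the Max Symbol Size setting
    """
    if size > max_symbol_size:
        return "LARGE.RAS"    # usually pictures
    if size > spot_filter:
        return "MEDIUM.RAS"   # usually text
    return "SMALL.RAS"        # stray marks and spots


# Spot filter of 15 dots; the Max Symbol Size of 100 is made up.
for size in (500, 40, 15, 3):
    print(size, layer_for(size, spot_filter=15, max_symbol_size=100))
```

A pattern exactly equal to the Max Symbol Size still counts as text,
and one exactly equal to the Spot Filter setting counts as a spot.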

Can OCRSHARE handle landscape-oriented material?

At this time, OCRSHARE does not have a 180-degree rotate
function, but one will be added in the future.

Can OCRSHARE run under Windows?

Yes, conditionally. OCRSHARE is NOT a Microsoft Windows
compatible program. Either run OCRSHARE without a .PIF file
and use all of the Windows defaults, or set up a .PIF file which
gives OCRSHARE the MAXIMUM amount of memory and indicates
that it directly modifies the screen and keyboard. Keep in
mind that OCRSHARE, like Windows, is a large program and
thus will probably run best as a stand-alone process.

Can fonts trained at one resolution be used to translate text
scanned at another?

Yes, but the accuracy can be lower as compared to the
accuracy when using a font trained at the same resolution as
the scanned page.

Which TIFF format does OCRSHARE read and write to?

Our TIFF format is essentially the same as the one
generated and read by Hewlett-Packard Scan Gallery software, and
it has been tested against a number of other software
packages.

How do you train if more than one font is present on a page?

It is best to train all the characters of one font and save
it to a file, remove this font, and then to train all the
characters of another font and save this training in another
file. Try not to mix characters of different fonts during
training of any given font. Remember that Select Area (F9)
can be used to capture various words and lines of text to
augment the training process should characters of one font
be intermixed with characters of another font.

Can OCRSHARE be trained to read quotation marks?

No. Each piece of the mark must be assigned a ' since
OCRSHARE captures only one ink pattern at a time.

Can more than one font be trained at the same time?

No. If more than one font is loaded (Font Settings), then
only one of these can be set to Train mode at any given time;
the rest are set to Use.

How many bytes of (EMS) memory do scanned images require?

It depends on the resolution, but the number can be
determined by simple calculation. At a resolution of 200 DPI,
there are 200 bits by 200 bits, or 40,000 bits of
information per square inch. Therefore, the formula for
determining the number of bytes required is (Resolution x Resolution
x length x width) / 8 bits per byte. On an 8 1/2 x 11 page,
then, there are (200 x 200 x 8 1/2 x 11) / 8 or 467,500
bytes of memory required to store this image.
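
The same calculation can be written as a small Python function for
trying other resolutions and page sizes. The function name
image_bytes is hypothetical; it assumes one bit per dot, as in the
formula above, and rounds up to a whole byte.

```python
import math


def image_bytes(dpi, width_in, length_in):
    """Bytes needed for a 1-bit-per-dot image of the given size.

    Implements (Resolution x Resolution x length x width) / 8,
    rounded up to a whole number of bytes.
    """
    bits = dpi * dpi * width_in * length_in
    return math.ceil(bits / 8)


# The worked example above: an 8 1/2 x 11 page at 200 DPI.
print(image_bytes(200, 8.5, 11))

# The same page at 300 DPI needs more than twice the memory.
print(image_bytes(300, 8.5, 11))
```

Memory grows with the square of the resolution, so a modest increase
in DPI has a large effect on the EMS memory an image requires.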

OCRSHARE V2.2 Registration Form

Name: ______________________________________________________

Address: ___________________________________________________



State (or Province + Country):______________________________

Zip/Postal Code:_____________

Phone: _______________________



____ Registration(s) of OCRSHARE $45.00 each $__________

____ OCR Font Disks $25.00 each $__________

____ Advantex OCR $395.00 each $__________

ATXSHARE/Advantex Media Selection

[ ] 5.25 DSDD (640K)
[ ] 5.25 DSHD (1.2Meg)
[ ] 3.5 DSDD (720K)

TOTAL $__________

Make check or money order payable to the address below. UPS COD
is available on request:

Solution Technology, Inc
PO Box 273372
Boca Raton, Florida 33487


For more product information on OCRSHARE, ATXSHARE or
Advantex, retail dealers should contact Solution Technology, Inc.
directly at the address and phone number given on the front
cover of this manual.


Shareware Dealers must contact STI directly to apply for
distribution. Any request for shareware catalog inclusion
requires that you send a current copy of your product
catalog with your application, along with a blank diskette and
mailer. Once you have been approved, you will receive
written permission from Solution Technology to include
OCRSHARE in your library for distribution. Additionally,
you will receive your diskette back containing the latest
OCRSHARE release. In addition, you must send each issue of
your catalog as it is published so that we can update our
mailing list of active vendors. STI's continued receipt of your
catalog will enable us to verify both that you have the most
recent OCRSHARE release (within reason) and that your
business is still active.

We will send authorized shareware dealers each MAJOR
revision of OCRSHARE as it is released so that your library is
kept up to date.
