Category : Various Text files
Archive   : NLM-INFO.ZIP

Output of file : MEDINDEX.TXT contained in archive : NLM-INFO.ZIP
National Library of Medicine
APRIL 1990
MedIndEx Project

The objective of the MedIndEx Project is to develop and test
interactive knowledge-based systems for computer-assisted indexing of
the medical literature currently indexed in the MEDLINE database using
terms from the Medical Subject Headings (MeSH) thesaurus. The
system is expected to facilitate expert indexing by, in effect,
combining MeSH and indexing rules into a knowledge base (KB) and
using the KB to assist indexers at computer workstations.

The MedIndEx (Medical Indexing Expert) prototype is written in a
frame language, a type of object-oriented language where objects,
known as frames, are used for representing concepts. In a frame, a
concept (the frame name) is described as a list of pairs of slots and
values, where a slot is a relation, and a value is another frame name
that completes the relationship. For example, Heart LOCATION Thorax
describes the heart (frame) in terms of its location (slot) in the
thorax (value). Most frames contain a number of slot-value pairs (up
to eight slots per frame in the current prototype). Determining
these relations for indexing, which do not exist as such in MeSH or
any other suitable thesaurus, is part of the research of developing
the KB. Frame descriptions contain not only this factual knowledge,
but also procedural knowledge. Specifically, slots contain not only
values but also executable procedures that enable the system to
assist indexers interactively, as discussed further on.
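
The frame structure described above can be sketched in a few lines of
Python. This is only an illustration, not the prototype's Lisp code:
the class names Slot and Frame, the fill method, and the known_frames
check are assumptions made for the example. Only the Heart LOCATION
Thorax triple comes from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Slot:
    """One relation of a frame: a value plus optional attached procedures."""
    value: Optional[str] = None  # name of another frame, e.g. "Thorax"
    procedures: List[Callable[[str], bool]] = field(default_factory=list)


@dataclass
class Frame:
    """A concept (the frame name) described by slot-value pairs."""
    name: str
    slots: Dict[str, Slot] = field(default_factory=dict)

    def fill(self, slot_name: str, value: str) -> bool:
        """Store the value only if every attached procedure accepts it."""
        slot = self.slots.setdefault(slot_name, Slot())
        if all(check(value) for check in slot.procedures):
            slot.value = value
            return True
        return False


# Hypothetical procedure: accept only names of frames already in the KB.
known_frames = {"Heart", "Thorax"}
heart = Frame(
    "Heart",
    {"LOCATION": Slot(procedures=[lambda v: v in known_frames])},
)

accepted = heart.fill("LOCATION", "Thorax")   # known frame, so it is stored
rejected = heart.fill("LOCATION", "Thoraxx")  # unknown frame, so it is refused
```

Keeping the procedures on the slot, next to the value, is what lets a
frame carry both factual and procedural knowledge in one place.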

An important relation in the KB is known as inherits-from, which
links the entire KB into a single classification. For example, Knee
Injuries INHERITS-FROM Leg Injuries establishes this hierarchy
between the higher-level frame Leg Injuries and the lower-level frame
Knee Injuries. Inheritance, whereby lower-level frames automatically
assume descriptions of higher-level frames to which they are linked
by this inherits-from relation, achieves a number of important KB
functions. These include maintaining consistency of the KB, detecting
redundancy in the KB, and simplifying algorithms for accessing frames
based on these explicit hierarchical paths from higher-level frames
to lower-level frames.
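
A minimal sketch of inheritance along inherits-from links, assuming
the KB is held as a Python dictionary. The Injuries frame, the slot
values, and the function names lookup and is_redundant are invented
for illustration; only the Knee Injuries and Leg Injuries link is
taken from the text.

```python
from typing import Dict, Optional

# Each frame maps slot names to values; INHERITS-FROM names the parent frame.
kb: Dict[str, Dict[str, str]] = {
    "Injuries": {"PROCEDURE": "Diagnosis"},  # hypothetical top-level frame
    "Leg Injuries": {"INHERITS-FROM": "Injuries", "LOCATION": "Leg"},
    "Knee Injuries": {"INHERITS-FROM": "Leg Injuries", "LOCATION": "Knee"},
}


def lookup(frame_name: str, slot: str) -> Optional[str]:
    """Return the nearest value for slot, walking up the inherits-from chain."""
    seen = set()
    current: Optional[str] = frame_name
    while current is not None and current not in seen:
        seen.add(current)  # guard against accidental cycles in the KB
        frame = kb[current]
        if slot in frame:
            return frame[slot]
        current = frame.get("INHERITS-FROM")
    return None


def is_redundant(frame_name: str, slot: str) -> bool:
    """True if a frame restates a value it would inherit from its parent."""
    frame = kb[frame_name]
    parent = frame.get("INHERITS-FROM")
    if slot not in frame or parent is None:
        return False
    return lookup(parent, slot) == frame[slot]


inherited = lookup("Knee Injuries", "PROCEDURE")  # found two levels up
local = lookup("Knee Injuries", "LOCATION")       # found on the frame itself
```

A check such as is_redundant is one way the redundancy detection
mentioned above could work: a lower-level frame that repeats an
inherited value adds nothing and can be flagged.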

Indexers, with system help and guidance coming from the KB, would
create for each document indexed a set of indexing frames patterned
after KB frames. Indexing frames are descriptions of instances of KB
frames. These instances correspond to objects, events, procedures,
and other specific descriptions as discussed in documents being
indexed. The name of an indexing frame is a concatenation of its KB
frame name and the unique document number of the document being
indexed. Each indexing frame is linked to its corresponding KB frame
by the same inherits-from relation that is used for linking frames in
the KB classification. Indexing frames inherit slots from these KB
frames, and since KB frames include executable procedures (indexing
rules), this is how the indexing system can give help that is quite
specific to the concept being indexed.
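
The naming and linking of indexing frames might look like the
following sketch. The kb_slots table and the functions
make_indexing_frame and prompts_for are hypothetical; the text
specifies only that the name concatenates the KB frame name with the
document number and that the instance is linked by inherits-from.

```python
from typing import Dict, List

# Hypothetical KB: each frame lists the slots it offers as prompts.
kb_slots: Dict[str, List[str]] = {
    "Leg Injuries": ["PROCEDURE", "LOCATION"],
}


def make_indexing_frame(kb_frame: str, document_number: str) -> Dict[str, object]:
    """Create an instance of a KB frame for one document."""
    if kb_frame not in kb_slots:
        raise KeyError(f"no KB frame named {kb_frame!r}")
    return {
        "name": f"{kb_frame} {document_number}",  # KB frame name + document number
        "INHERITS-FROM": kb_frame,                # same relation used inside the KB
        "values": {},                             # slots filled in by the indexer
    }


def prompts_for(indexing_frame: Dict[str, object]) -> List[str]:
    """Slots are not copied into the instance; they come from the KB frame."""
    return kb_slots[str(indexing_frame["INHERITS-FROM"])]


frame = make_indexing_frame("Leg Injuries", "86365451")
prompts = prompts_for(frame)
```

Because the prompts are looked up through the link rather than copied,
a change to the KB frame is seen by every indexing frame that inherits
from it.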

For example, the following shows a slot that has been filled in an
indexing frame for document #86365451 which is about leg injuries.
The value Angiography, entered by the indexer in response to the slot
prompt PROCEDURE, indicates that this document discusses angiography
(radiography of blood vessels) for leg injuries. Also shown is the
inherits-from relation between this indexing frame and the Leg
Injuries frame in the knowledge base.

Leg Injuries 86365451
    INHERITS-FROM   Leg Injuries
    PROCEDURE       > Angiography

Indexing assistance includes slot names as prompts for indexers to
consider indexable aspects of a document (PROCEDURE slot in the above
example); validating indexers' input (accepting the value
Angiography); prescribing or suggesting slot values based on KB
rules; and hierarchical KB displays for browsing permissible values
for the current slot. Other important features include automatic
detection of inconsistencies in previously-stored indexing frames,
retention of canceled frames for possible re-use, and a word-level
aliasing technique to permit truncation of individual words in a
term, which then would be recognized by the system as lead-in
vocabulary for frame terms. Hash tables and caching are used for
quick access to data.
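
Word-level aliasing can be illustrated as prefix matching on each word
of a term. The source does not describe the prototype's actual
matching rules, so the sketch below is an assumption: every typed word
must be a truncation of the word in the same position of a frame term,
and a dictionary (hash table) handles exact entries first. The names
FRAME_TERMS, EXACT, matches, and resolve are invented.

```python
from typing import Dict, List

FRAME_TERMS = ["Leg Injuries", "Knee Injuries", "Angiography"]

# Hash table for exact, case-insensitive lookups.
EXACT: Dict[str, str] = {term.lower(): term for term in FRAME_TERMS}


def matches(entry: str, term: str) -> bool:
    """True if each typed word is a prefix of the term's word in that position."""
    typed, full = entry.lower().split(), term.lower().split()
    if len(typed) != len(full):
        return False
    return all(f.startswith(t) for t, f in zip(typed, full))


def resolve(entry: str) -> List[str]:
    """Return the frame terms that a possibly truncated entry could mean."""
    key = " ".join(entry.lower().split())
    exact = EXACT.get(key)
    if exact is not None:
        return [exact]
    return [term for term in FRAME_TERMS if matches(entry, term)]


one = resolve("leg inj")  # both words truncated
two = resolve("angio")    # single-word term truncated
none = resolve("inj")     # wrong word count, so nothing matches
```

An entry that resolves to more than one term would need a follow-up
prompt asking the indexer to choose.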

The KB contains rules not only for creating and filling indexing
frames, but also for generating in the background conventional MeSH
indexing terms at the level of expert indexing. For example, the
above simplified indexing frame would automatically generate the MeSH
indexing terms Leg Injuries/RADIOGRAPHY and Angiography. This output
could be compared to conventional indexing output for testing the
prototype, as well as provide actual MEDLINE indexing for current
retrieval if the prototype were adopted.
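
The term generation in this example can be sketched as a table-driven
rule. The table SUBHEADING_FOR_PROCEDURE and the function mesh_terms
are hypothetical; the actual KB rules are not given in the text, and
only the Angiography to RADIOGRAPHY pairing is taken from it.

```python
from typing import Dict, List

# Hypothetical rule table: a PROCEDURE value mapped to the MeSH subheading
# it implies for the main heading. Only this one pairing is from the text.
SUBHEADING_FOR_PROCEDURE: Dict[str, str] = {"Angiography": "RADIOGRAPHY"}


def mesh_terms(kb_frame: str, slots: Dict[str, str]) -> List[str]:
    """Derive conventional indexing terms from one filled indexing frame."""
    procedure = slots.get("PROCEDURE")
    if procedure is None:
        return [kb_frame]  # no procedure: the heading alone
    subheading = SUBHEADING_FOR_PROCEDURE.get(procedure)
    main = f"{kb_frame}/{subheading}" if subheading else kb_frame
    return [main, procedure]


terms = mesh_terms("Leg Injuries", {"PROCEDURE": "Angiography"})
```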

Indexing frame output, which consists of a network of linked indexing
frames, would describe a document more precisely than conventional
indexing. Since there exists no standard retrieval language for
searching frame databases, Project staff have successfully
demonstrated converting a sample frame database into a commercial
Relational Database Management System (RDBMS) supporting a standard
retrieval language, SQL, and retrieval from this database using
relations derived from indexing frames. In addition, the MedIndEx
prototype can be adapted as a knowledge-based front-end for an
intelligent searching assistant. The assistant guides searchers in
creating query frames, and develops a nested search statement using
Boolean algebra and expansion of terms based on the MeSH
classification. It
then automatically transforms the search strategy into the syntax of
the query language of a retrieval service, connects to the service
over the Internet, and runs the search returning postings to matching
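
The conversion of indexing frames into relational form can be
illustrated with a single table of frame, slot, and value rows. This
is a sketch under assumptions, not the schema Project staff used: the
table name frame_slot is invented, the second frame and its document
number are made up for contrast, and SQLite stands in for the
commercial RDBMS.

```python
import sqlite3

# Indexing frames to be flattened into (frame, slot, value) rows.
# The second frame is invented so the query has something to exclude.
frames = {
    "Leg Injuries 86365451": {
        "INHERITS-FROM": "Leg Injuries",
        "PROCEDURE": "Angiography",
    },
    "Knee Injuries 00000001": {
        "INHERITS-FROM": "Knee Injuries",
        "PROCEDURE": "Arthroscopy",
    },
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE frame_slot (frame TEXT, slot TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO frame_slot VALUES (?, ?, ?)",
    [
        (name, slot, value)
        for name, slots in frames.items()
        for slot, value in slots.items()
    ],
)

# Retrieve frames that are instances of Leg Injuries with PROCEDURE Angiography.
rows = conn.execute(
    """
    SELECT a.frame
    FROM frame_slot AS a
    JOIN frame_slot AS b ON a.frame = b.frame
    WHERE a.slot = 'INHERITS-FROM' AND a.value = ?
      AND b.slot = 'PROCEDURE' AND b.value = ?
    """,
    ("Leg Injuries", "Angiography"),
).fetchall()
```

Each slot relation becomes a condition in the WHERE clause, so a query
can ask for a procedure applied to a particular condition rather than
for two terms that merely appear on the same document.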

An important development in the MedIndEx System is the KB manager.
This software, designed for use by knowledge engineers managing the
knowledge base, serves to ensure a consistent, compact, and
syntactically correct KB, utilizing the inheritance feature of frame
languages, and special scripts employing menu and cut & paste
interfaces. Graphical display of hierarchies and creation of frames
in batch mode are additional enhancements.

The prototype is written in Lucid Common Lisp 4.0 and runs on the
SPARCstation 2 workstation under the Unix operating system. An X
Window interface has been written using the X11 Release 5 library;
CLX, CLUE, and CLOS are also used. Project staff have experimented
with access to the prototype from a PC connected to the workstation
by Ethernet, using X server software on the PC. Access is also
possible over the Internet from remote sites.

The development of an evaluation design for testing the MedIndEx
prototype is in progress.

For more detail, the following publication may be consulted: Susanne
M. Humphrey. Indexing biomedical documents: from thesaural to
knowledge-based retrieval systems. Artificial Intelligence in
Medicine 1992;4(5):343-71.

For further information, contact:

MedIndEx Project
Lister Hill National Center
for Biomedical Communications
National Library of Medicine
8600 Rockville Pike
Bethesda, Maryland 20894
