Category : Files from Magazines
Archive   : AWKWARD.ZIP
Filename : AWKWARD.TXT

 
Output of file : AWKWARD.TXT contained in archive : AWKWARD.ZIP


AI Expert - The Magazine of Artificial Intelligence in Practice
October 1989 pp 40-46
All Rights Reserved

AWKWARD: Toward DATABASE REASONING

by Yuval Lirov and Swaminathan Ravikumar

Combining expert systems and database applications technology results
in enhanced information use, maximizing productivity and quality.[1] A
combined system can process large fact and knowledge bases. Denver
Works uses a factory test-repair history database equipped with
deductive capabilities as a troubleshooting expert-system prototype.
The system is also used as a knowledge-engineering guidance device,
focusing it's efforts exclusively on the acquisition of troubleshooting
knowledge about manufacturing. Error recovery, persistency,
concurrency, and freedom from knowledge-size limitations are the main
benefits of deductive databases. To develop such intelligent databases,
we have written several awk[2] queries to extract circuit-pack
troubleshooting knowledge from the history database and an awk-based
forward chainer to emulate troubleshooting reasoning. Our technique's
simplicity is enhanced by awk's standard availability.

Traditionally, database and expert-system technologies are merged by
providing an interface between the expert system and the
database-management system. Such interfaces allow query and update of
external databases. Further, a translation program is often
constructed, converting the production rules to sequences of database
operations that can emulate human reasoning.[3] A different way of
integrating databases and expert systems is the use of a database to
alleviate the knowledge-acquisition bottleneck. This technique belongs
to the bottom-up class, where raw data is used to obtain a qualitative
description of a phenomenon; the database supplies the information
needed to construct the rules. A classic example of such a merger is
Quinlan's ID3 algorithm.[4] Both ways of integrating databases and
expert systems allow for independent development and building of
reasoning capabilities on top of existing databases.

In this article we combine the two approaches of database and
expert-systems integration. We use the same query language to achieve
both tasks: rule creation and rule chaining (reasoning). In particular,
we use awk queries to develop an intelligent user interface to an
existing database. The database contains data about the observed faults
and test-and-replace activities of several troubleshooters working on
an electronic circuit-pack manufacturing line. The intelligent user
interface provides guidance to novice troubleshooters by consulting the
database and directing the process of replacing faulty components.

Our choice of awk is motivated by several considerations. First, awk is
a general purpose programming tool. Using it to create rules is
appropriate since awk was originally designed for data retrieval,
transformation, reduction, and validation. Second, awk is a standard
part of UNIX System v. 3.1 and it also runs under MS-DOS, making it a
standard, easily accessible tool. And finally, an awk program is a
sequence of pattern-action statements. The pattern part of a statement
is a multicondition declaration, matched by a line in the database. The
action part is a multioperation procedure describing the action to be
performed when the pattern part is matched. Thus, an awk program can be
regarded as a rule based program.

Awk, however, is not an expert-system shell. It does not have working
memory or conflict resolution strategies to emulate human reasoning. In
the following sections we will describe our method of emulating human
reasoning in awk using the circuit-pack troubleshooting database.

CIRCUIT-PACK EXPERT SYSTEMS

Using expert-system technology for electronic circuit-pack
troubleshooting is justified primarily by economic reasons. Even a
novice troubleshooter, equipped with such a system, is expected to
halve the time needed to diagnose a circuit, reduce work in process,
and minimize the number of test-replace cycles.[5]

Early troubleshooting expert systems represented the knowledge of
experienced analysts as rules. A rule has two sides: the left-hand side
describes a particular situation characterized by current observations,
and the right-hand side describes the next measurement or replacement.
The rules-collection process (knowledge acquisition) is the most
difficult part of constructing such expert systems. The analysts often
fail to explain their actions, the completeness and consistency of the
knowledge base is difficult to verify, and the expert system requires
continous upgrading to reflect the increments in the troubleshooting
knowledge. A better alternative is acquiring troubleshooting knowledge
from a variety of sources.

STAREX[6] is an expert system that has about 90% of it's rules
constructed artificially from sources other than human experts. STAREX
features automatic knowledgebase derivation from the circuit-pack
schematic, component-replacement and measurement costs, and
component-failure probabilities. While the schematic is available from
the circuit-pack design files, the component failure rates are more
difficult to acquire.

Besides being the necessary data for the automatic rule derivation,
component failure distribution information can also be used as a
completness verification tool of a shallow knowledge base. In fact,
this data can be used to supplement the missing rules. In the extreme
situation, the troubleshooting knowledge base, consisting of only the
replacement rules according to the component rates, can still be a
useful repair tool[7], although without understanding the circuit-pack
functionality. We will describe our approach to derive such
troubleshooting knowledge base automatically; but first, a description
of the expert system shell that will be used to encode that knowledge
is necessary.

AWKWARD EXPERT-SYSTEM SHELL

Possible strategies for integrating database management and
knowledge-base systems can be classified as enhancement and coupling of
independent systems. System enhancements include either adding
functionality to database management systems (DBMSs) to support
knowlwdge-based inference or extending expert systems with database
management technology. Independent systems coupling can be performed
under either tight or loose protocols.

A loose coupling implies that all the required data is retrieved from
the DBMS before the inference phase. Under a loose protocol, an expert
system-shell operates the knowledge base and DBMS seperatly. The
knowledge bases are unshared between user and transient. (When an
expert system goes away, it takes the rule base with it.) Databases,
however, are both shared and persistent (not transient by their
design). Loose coupling has two main deficiencies; an inability to
reason consistently using dynamic, time-varying data, and an inability
to partition the database before the reasoning to query efficiently.
Loose coupling's deficiency in handling time-varying data results in
inefficient polling queries by the expert system. The inability to
partition the database results either in the need to retrieve all the
data or develop a seperate module that guesses what to retrieve.

In tight coupling, an expert system interacts with the database during
the inference by constructing queires and interpreting the data at
runtime. Under the tight coupling protocol, the rules ordering dynamic
or nonpartitionable data are stored in the database. The prevalent
technique is to turn DBMS commands into rules by creating an illusion
that these commands are always running. The DBMS then triggers the
reasoning asynchronously when a change in data occurs. Using this
mechanism, rules residing in the database can be evaluated either early
or late with respect to the query, enabling forward or backward
chaining respectively. Tight-coupling schemes are deficient in certain
reasoning techniques such as explanation, deferred activation,
nonmonotonic reasoning, rule partitioning, and other more advanced
expert-system facilities.

What is really needed is the ability to emulate expert-system reasoning
entirely in the database's terms. In this section we'll describe a
system that falls into the system enhancements catagory: we extend an
existing database with several features to support inferencing at the
reasoning level of a human troubleshooter. Our awk-based expert-system
shell consists of two modules: the translation module and the inference
engine.

The translator is responsible for translating the knowledge
(represented in the rule form) to the tabulated format. For clarity's
sake we'll assume the rules have a simple syntax:

::= IF THEN
::= IS
::=
::=
::= high | low | integer
::= IS
::= measure | replace
::=
::=
::= integer
::= integer

The translation removes the keywords, IF, THEN, and IS from the rules
and populates the condition and action tables. The condition table's
format is variable : value : rule : identifier and the action table's
format is operation : operand : rule identifier where : is the field
seperation. The awk translation script (Listing 1) maps the following
rule base:

01 IF IC1-10 IS -5v
02 THEN measure with voltmeter IS IC2-25
03 IF IC2-25 IS +5v
04 THEN replace IS IC6

to the following condition and action tables:

Condition table
01 IC1-10:-5v:rule0
02 IC2-25:+5v:rule1

Action table
01 measure with voltmeter:IC2-25:rule0
02 replace:IC6:rule1

Inference engines emulate human reasoning by chaining the
knowledge-base rules and interacting with the user when additional
information is required. Two kinds of rule chaining are used in most
expert-system shells: forward and backward chaining. In backward
chaining systems, the right-hand side of a rule is considered a goal
that requires verification that the left-hand side is satisfied. This
requirement is considered a new goal, and another rule, having it's
right-hand side matching the new goal, is found. The process continues
until a goal corresponding to a verified fact is obtained. The initial
goal is proved by backward chaining (from conclusion back to facts).
PROLOG is an example of a backward-chaining system.

In contrast to backward-chaining systems, forward-chaining ones start
from the facts and match the conclusions of the satisfied rules with
the conditions of the new rules until a satisfactory conclusion is
obtained. OPS58 is an example of a forward-chaining system.

It is possible to emulate forward-chaining systems using a
backward-chaining system and vice versa. Therefore, we restrict
ourselves in this article to a demonstration of a forward-chaining
inference engine emulating human reasoning on a database. A
forward-chainer cycles through the following three stages:

o Matches ruls to the facts in the working memory

o Resolves conflict of rules to find the "most important" rule

o Fires the rule and updates the working memory

Rules are stored so they can be accessed and matched to the current
state of the working memory using two approaches. One approach is to
read all the rules into an array and manipulate that array in internal
memory. While this approach is time-efficient, it's memory requirements
are quite demanding. For example, any contemporary PROLOG system will
crash if required to handle more than 600,000 clauses.[3]

To alleviate this problem, our inference engine emulates working memory
in ASCII files. Listing 2 depicts the inference engine awk script. The
entire inference engine is described in BEGIN (line one to 25), where
lines two to seven correspond to initialization and lines nine to 25 to
inference. Note that the two functions in lines 27 to 34 and 36 to 43
used to query the action and condition tables shown previously. Examine
this sample interaction with the user:

$nawk -f rtrans.awk rules.data
$nawk -f fchain1.awk
Enter initial condition: IC1-10 -5v
measure with voltmeter IC2-25: +5v
measure signal K1-out: good
visually inspect C20: bad
replace C20:
$

AWKWARD IMPLEMENTATION

The AWKWARD system comprises four scripts that successively fetch and
transform the accumulated data until the required troublshooting advice
is constructured. The initial database contains records describing the
events that take place in the circuit-pack manufacturing. Each event
has a unique tag and value, and are of types passed test, failed test,
response, and so on. The record always contains the serial number of
the pack and time when the event took place. For test failure, the
record also contains the test step where the failure was observed. For
repair, the record contains the replaced component description.

Our goal is to build a system that, when queried with a test step, will
provide troubleshooting advice describing the most likely cause for
failure. At a later query with the same test step parameter and serial
number, the system will provide the next most likely cause for failure.

Shop-floor data is collected as ASCII files, each containing records
about different pack codes. Here is an sample data file:

01 BT 12-18-88 16:05:38
02 TS DAISY1
03 BC TN754 V12
04 SN 8DC379700
05 BR P
06 LC D02
07 OC SHIPPED FROM POS. 101, 16:05:38 SUN. DEC.18
08 TO 1
09 ET 12-18-88 16:05:38
10 BT 12-18-88 17:12:27
11 TS FR1
12 BC TN742 V10
13 SN 6D4316300
14 BR P
15 LC B04
16 OC SHIPPED FROM POS. 216, 17:12:27 SUN. DEC.18
17 TO 1
18 ET 12-18-88 17:12:27

The two records starting from line one and line 10 correspond to two
different pack codes as indicated by their BC field (line three and
line 12). First script traverses all the records and classifies them
into seperate files based on the pack code of each record:

01 BEGIN {
02 RS = "";
03 FS = "bn";
04 }
05
06 $1 ~ /^BC/ {
07 filename = 2; # field two is pack code
08 for (i = 1; i <= NF; i++)
09 printf("%s\n", $i) >> filename # append data to filename
10 close(filename); # cannot have too many open files
11 }

The script's output is a set of records files belonging to a unique
pack code (note the BC field):

01 BT 01-03-89 09:47:41
02 TS DAISY2
03 BC TN742 V17
04 SN 8DC379800
05 BR P
06 LC C11
07 OC SHIPPED FROM POS. 201, 9:47:41 TUE. JAN. 3
08 TO 1
09 ET 01-03-89 09:47:41
10
11 BT 01-03-89 13:08:35
12 TS FR1
13 BC TN742 V16
14 SN 7D7320600
15 BR P
16 LC C12
17 OC SHIPPED FROM POS. 216, 13:08:34 TUE. JAN. 3
18 TO 1
19 ET 01-03-89 13:08:35
20

The second script has two purposes: to remove unwanted fields from each
record and collapse each multiline record into a single line so the
UNIX sort utility can be used. The second script collects the serial
number, modification time, test step, action taken, and replaced part
from each multiline record in the files created by the first script as
a line. The lines are sorted by the serial number and modification
time, producing this script output:

01 4DB198010 88-12-02 200
02 4DB198010 88-12-06 default REPLACED IC17.4
03 4DC137807 88-10-02 200
04 4DC137807 88-10-04 default REPLACED C22.7
05 5D3334309 88-12-10 47
06 5D3334309 88-12-13 default REPLACED IC16.7
07 5D3388705 88-10-15 51
08 5D3388705 88-10-27 default REPLACED C32

The third script operates on the second script's output to create a
failure step, replacement relation. The script has no pattern to match
and hence the action taken is performed on every line. The data is
output in a different order: test step followed by replacement
information and serial number. The output is sorted by the serial
number. For the script to output the replaced part and fix the failing
test step on a particular serial number, there should be a line in the
input file for that serial number failing the same test step. The time
of failure should be before the replacement time (note line seven and
line eight of the first script's output). The third script's output is:

01 200 REPLACED C22.7 4DC148913
02 200 REPLACED IC17.4 4DC137811
03 200 REPLACED IC17.3 4DC148510
04 200 REPLACED IC17.3 5D6362208
05 47 REPLACED IC1 7D6311409
06 47 REPLACED IC1 8D9069509
07 47 REPLACED IC10 7D5383800
08 47 REPLACED IC11 6DC374112
09 47 REPLACED IC11 8DC375604
10 47 REPLACED IC11 8DC379805

The third script's output is the history of replacements for the same
test step collected across different serial numbers. The fourth script
(Listing 3) produces the following advice table; the output is reverse
sorted, listing the replacement with the highest frequency first:

Test Step Frequency Component
51 9 U1
51 3 K2
51 1 K3
51 1 IC17.7
51 1 C32
51 1 C20
200 2 IC17.4
200 1 IC17.3
200 1 C22.7
202 4 IC17.0
202 3 IC17.7
202 1 U4, U5
202 1 U3
202 1 U2
202 1 IC17.6
202 1 IC17.3, IC17.5

Finally, the fifth script (Listing 4) constructs the rule base from the
advice table, queries the user, matches the rule, and displays the
advice. Each cycle matches the current data with one or more of the
rules in the rule array. The appropriate rules fire, therby generating
new data that the next cycle uses.

STRENGTHS AND WEAKNESSES

Rules are stored in the relational database. The match-resolve-fire
cycle of forward-chaining is enabled by the appropriate queries. The
data, matching the left-hand side of the patterns in the queries,
enable the firing -- new data is created, matching the new patterns.
This process continues until no pattern matches any available data. The
awk-based expert-system shell is written in as few as 60 lines.

The main benefit of our approach is the ability to process unlimited
numbers of rules and data. Further, when the reasoning process is
interrupted, it can continue from the interrupted place since all of
the interim data is written on disk. This method's weakness is its slow
performance. More research is needed to devise smart techniques to
improve the time performance of reasoning on databases.


REFERENCES

1. Schur, S. "Intelligent Databases," Database Programming and Design,
June 1988, pp. 46-53.

2. Aho, A., B.W. Kernighan, and P.J. Weinberger. The Awk Programming
Language, Reading, Mass.: Addison-Wesley, 1988.

3. Tzvieli, A. "On the Coupling of a Production System Shell and a
DBMS." In Beeri, C.J.W. Schmidt, and A. Dayal, eds. Proceedings of the
Third International Conference on Data and Knowledge Based Systems,
Jersualem, Israel, 1988.

4. Quinlan, J.R. "Semi-Autonomous Acquisition of Pattern-Based
Knowledge." In Donald Michie, ed. Introductory Readings on Expert
Systems. Reading, Mass: Gordon and Breach, 1982.

5. Lirov, Y. "A Survey of Artificial Intelligence Techniques in Circuit
Pack Diagnostics." In Computers and Mathamatics with Applications 18,
pp. 381-398.

6. Lirov, Y. "STAREX-Simultaneous Test and Replace Circuit Pack
Troubleshooting Expert Systems Prototyping and Implementation."
Forthcoming in Engineering Applications of Artificial Intelligence.

7. Yau, C., "ILIAD - I Learn I Aid Diagnosis." Proceedings of the IEEE
Test Conference, 1986.

8. Forgy, C.L. OPS5 User's Manual, Technical Report CMU-CS-81-136, Dept
of Computer Science, Carnegie-Mellon University, Pittsburgh, Pa., 1981.

9. Levy, Leon S. Taming the Tiger, Berlin, W. Germany: Springer-Verlag,
1987.

Yuval Lirov and Swaminathan Ravikumar are knowledge engineers at AT&T
Bell Laboratories, Holmdel, N.J.



  3 Responses to “Category : Files from Magazines
Archive   : AWKWARD.ZIP
Filename : AWKWARD.TXT

  1. Very nice! Thank you for this wonderful archive. I wonder why I found it only now. Long live the BBS file archives!

  2. This is so awesome! 😀 I’d be cool if you could download an entire archive of this at once, though.

  3. But one thing that puzzles me is the “mtswslnkmcjklsdlsbdmMICROSOFT” string. There is an article about it here. It is definitely worth a read: http://www.os2museum.com/wp/mtswslnk/