Category : Miscellaneous Language Source Code
Archive   : MATCHLSP.ZIP
Filename : MATCH.LSP

 
Output of file : MATCH.LSP contained in archive : MATCHLSP.ZIP
;;
;; By: Scott Cothrell
;; Severn, Maryland 21144
;; Notes:
;; This implementation of the pattern matching function contains less
;; than one page of acutal code (not counting comments) and is completely
;; recursive. Match, and its sub-functions are written entirely in simple
;; lisp functions and should be completely portable (it was written using
;; PC-LISP 3.0 and run on XLISP 2.0 for ease of output. The only change
;; needed related to the fact that XLISP returns a list structure from the
;; function LAST and PC-LISP returns an atom if the last item is atomic).
;; Additionally, this implementation is not restricted to only atoms in
;; the Target list. It will work equally well with atoms and lists or
;; combinations of the two in the Target list.
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; Function: MATCH
;; Arguments: PATTERN, TARGET
;; Descripton:
;; MATCH takes two lists, PATTERN and TARGET, as input. PATTERN is a
;; list consisting of atoms and wild-card constructs (lists). Two special
;; atoms are defined, ? and *. These two atoms are defined to be wildcards
;; with the following properties:
;;
;; ? -- a wild card atom that will match any atom in TARGET.
;; * -- a wild card atom that will match one or more atoms in the TARGET.
;;
;; The wild-card constructs are lists consisting of a wild-card
;; indicator, $ or !, and a variable to assign to the wild-card matches.
;; The wild-card indicator is the car of the wild-card construct and the
;; variable is the cadr of the construct. All other entries are ignored.
;; The constructs are defined as follows:
;;
;; ($ ) -- a wild card form that matches any atom in the TARGET and
;; assigns that atom to .
;; (! ) -- a wild card form that matches one or more atoms in the
;; TARGET and assigns the list of the matched atoms to .
;;
;; MATCH has the form (MATCH PATTERN TARGET) where PATTERN has the above
;; described form. TARGET is a list containing atoms and lists in any
;; combination.
;;
;; Operation:
;; MATCH is basically a high level shell for its major function, MATCH-1.
;; MATCH calls MATCH-1 with PATTERN and TARGET, expecting MATCH-1 to return
;; nil if the pattern is not found in TARGET within the length of PATTERN.
;; In this case, MATCH recursively calls itself with PATTERN and the cdr of
;; TARGET. This action is repeated until either MATCH-1 returns a non-nil
;; value or TARGET is exhausted. If a non-nil value is returned by MATCH-1,
;; then MATCH returns true, t. Otherwise, MATCH returns nil.
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(defun match (pattern target)
(cond
((null target) nil) ;; a little error checking, pattern and target
((null pattern) nil) ;; must be non-nil
((match-1 pattern target) t) ;; first call to match-1. If it returns a
;; non-nil value, then return t.
(t (match pattern (cdr target))) ;; Otherwise recurse on match with
;; pattern and the cdr of target. In effect
;; this slides the search window to the
;; right by one atom.
) ;; cond
) ;; match
;;
;;
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Function: MATCH-1
;; Arguments: two lists, PATTERN and TARGET, conforming to the rules for
;; the function MATCH.
;; Returns: nil -- if no match was found between the current pattern and target
;; t or a non-nil value -- if a pattern match was found with pattern
;; and target.
;; Operation:
;; MATCH-1 is the evaluating routine. Pattern is checked to see if it is
;; null. If null then a t is returned indicating that the pattern was found
;; in the supplied target. It can be shown that, due to the checking of the
;; input in MATCH, this function cannot be called with a nil pattern unless
;; it is due to some recursion. If pattern and target are both called with
;; nil assignments, the previous iteration had a match, wild-card or otherwise
;; that remove the only atom from both the pattern and target lists. This
;; indicates that all items for pattern were matched and the function
;; returns true.
;; After Pattern is checked, Target is checked to see if it is null. In
;; this case, the pattern is not yet exhausted but the target is. This is
;; a clear case to return nil. The routine could have returned some sort of
;; flag to indicate specifically that Target was exhausted, allowing
;; controlling software to terminate the search early, since subsequent calls
;; would use the cdr of Target which, being smaller, would most likely fail
;; also. It was determined to NOT do this to keep the complexity down and
;; instead rely on a simple true/false boolean style response.
;; Assuming that neither Pattern nor Target are null, the car of pattern
;; is next checked to see if it is an atomic element. If atomic, the car
;; of pattern is compared to the wild-card symbols and to the car of the
;; Target. A match is processed by calling MATCH-1 recursively with the
;; remainder of the Pattern and Target. A mis-match will return nil.
;; If the car of Pattern is not atomic then it is assumed to be a list
;; (since it cannot be the nil list which was checked for earlier) construct
;; that denotes a wild-card. The first element of this construct is tested
;; for equality with the wild-card list construct delimiters. A match is
;; processed in a recursive fashion using Pattern and Target. A mis-match
;; is assumed to be an error in the Pattern and will result in a nil return
;; value.
;; The actual details of each step can be found in the comments below.
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun match-1 (pattern target)
(cond

;; if pattern is exhausted then all search parameters have been satisfied.
;;
((null pattern) t)

;; if target is exhausted then we can't satisfy the search, return nil.
;;
((null target) nil)

;; the car of pattern is an atom, then process as an atom
;; and not a wild-card construct.
;;
((atom (car pattern))
(cond

;; if the first atom of pattern is a '?', then discard the '?' and
;; the car of target. ? matched the car of target.
;; recurse using cdr pattern and cdr target to check the rest of
;; the pattern against the rest of the target.
;;
((equal (car pattern) '?)
(match-1 (cdr pattern) (cdr target)))

;; if the first atom of pattern matches the first atom of target.
;; then recurse using cdr pattern and cdr target to check the rest
;; of the pattern against the rest of the target.
;;
((equal (car pattern) (car target))
(match-1 (cdr pattern) (cdr target)))

;; if the first atom of pattern is a '*', then call match-3 to
;; serve as a higher level control routine for recursion on match-1.
;; Match-3 will keep the history of *. If the history ends in 'end'
;; then match-3 found a failure after the wild card expansion, so
;; return nil for this iteration of the search. Otherwise, return
;; the list that contains the expansion of the wild-card. This is a
;; non-nil value, so match will interpret it as true.
;;
((equal (car pattern) '*)
(cond
((equal (last (match-3 (cdr pattern) (cdr target))) '(end)) nil)
(t (cons (car target) (match-3 (cdr pattern) (cdr target))))
) ;; cond
) ;; equal

;; if the atom didn't match one of the above cases then we have a
;; complete mis-match, so return nil.
;;
(t nil)
) ;; cond
) ;; atom

;; At this point, we should have only a list of sort as the first element
;; of pattern.
;;
;; Check the first element of the first element of pattern,
;; (car (car pattern)). If this element is '$' then we have a single
;; atom wild-card. Set the value of the 2nd atom of the list to the value
;; of the first element of target. Recurse with the remainder of pattern
;; and target. The variable will retain its value even if a match is not
;; found. The variable may be changed by later invocations of the same
;; wild-card form (it is not expanded or substituted for later occurances
;;
((equal (caar pattern) '$)
(set (cadar pattern) (car target))
(match-1 (cdr pattern) (cdr target)) )

;; Check the list for the '!' wild-card construct.
;; Call match-3 to act as a history mechanism for repeated calls to
;; match-1. Match-3 will return the list which contains the atoms
;; to be assigned to the wild-card variable. If the list has the last
;; element of 'end', then the call to match-3 in invalidated and nil is
;; returned. If the last element is not 'end', then the call was
;; successful. The same call is repeated within a cons'ing structure
;; to build the complete list of atoms (the car of target was discarded
;; in the call to match-3) and the result is 'set' as the value of the
;; wild-card variable.
;;
((equal (caar pattern) '!)
(cond
((equal (last (match-3 (cdr pattern) (cdr target))) '(end)) nil)
(t (set (cadar pattern) (cons (car target) (match-3
(cdr pattern) (cdr target)))))
) ;; cond
) ;; equal

;; All other constructs are not supported and return nil
;;
(t nil)

) ;; cond
) ;; match-1
;;
;;
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Function: MATCH-3
;; Arguments: two lists, PATTERN and TARGET, conforming to the rules for
;; MATCH.
;; Returns: a list. The last element of list is a flag for the entire
;; list's validity. If the last element is 'end', then the list is
;; not valid and should be discarded. Otherwise the entire list,
;; including the last element is the list of atoms matched by the
;; wild-card.
;; Operation:
;; MATCH-3 is a control and memory routine for MATCH-1. It operates
;; similar to MATCH with the addition of a history mechanism. It is invoked
;; by MATCH-1 to controlt the recursion and collect the history for the
;; expansion of multi-atom wild-cards. This routine, given a pattern and a
;; target, will call MATCH-1. If MATCH-1 returns nil, then MATCH-3 will
;; call itself recursively, ala MATCH, with the cdr of target. This will
;; continue until target is exhausted or MATCH-1 finds pattern in target.
;; If target is exhausted, then MATCH-3 constructs the list (end) at that
;; level. Unwinding the recursion levels will cons additional atoms to
;; (end) but the top level at MATCH-1 will throw the list away when it
;; checks for 'end as the last atom of the list. In this case, the atom
;; 'end is being used as a flag to indicate that MATCH-3 was not successful.
;; If, on the other hand, at some level of recursion, MATCH-1 finds a match,
;; MATCH-3 will return nil. As the levels of recursion unwind, higher
;; levels will cons elements to (), producing the list containing the atoms
;; matched by the wild-card.
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun match-3 (pattern target)
(cond
((null target) (cons 'end ()) ) ;; if target is empty, put 'end at the
;; end of the return list.
((null (match-1 pattern target)) ;; use match-1 to check for a match
;; from the current point.
;; if there was no match, then save the
;; first element of target as a match to
;; the wild-card and try with the rest of
;; the target list.
;;
(cons (car target) (match-3 pattern (cdr target))))

;; at this point, target is not null and the last call to match-1 was
;; successful. Return nil to cons the wild-card matching atoms to in
;; order to form a list of those atoms matched by the wild-card.
;;
(t nil)
) ;; cond
) ;; match-3




  3 Responses to “Category : Miscellaneous Language Source Code
Archive   : MATCHLSP.ZIP
Filename : MATCH.LSP

  1. Very nice! Thank you for this wonderful archive. I wonder why I found it only now. Long live the BBS file archives!

  2. This is so awesome! 😀 I’d be cool if you could download an entire archive of this at once, though.

  3. But one thing that puzzles me is the “mtswslnkmcjklsdlsbdmMICROSOFT” string. There is an article about it here. It is definitely worth a read: http://www.os2museum.com/wp/mtswslnk/