source: project/release/4/matchable/trunk/match.scm @ 33292

Last change on this file since 33292 was 33292, checked in by Alex Shinn, 5 years ago

Adding missing files (fixes issue #1275).

;;;; match.scm -- portable hygienic pattern matcher -*- coding: utf-8 -*-
;;
;; This code is written by Alex Shinn and placed in the
;; Public Domain.  All warranties are disclaimed.

;;> \example-import[(srfi 9)]

;;> A portable hygienic pattern matcher.

;;> This is a full superset of the popular \hyperlink[
;;> "http://www.cs.indiana.edu/scheme-repository/code.match.html"]{match}
;;> package by Andrew Wright, written in fully portable \scheme{syntax-rules}
;;> and thus preserving hygiene.

;;> The most notable extensions are the ability to use \emph{non-linear}
;;> patterns - patterns in which the same identifier occurs multiple
;;> times, tail patterns after ellipsis, and the experimental tree patterns.

;;> \section{Patterns}

;;> Patterns are written to look like the printed representation of
;;> the objects they match.  The basic usage is

;;> \scheme{(match expr (pat body ...) ...)}

;;> where the result of \var{expr} is matched against each pattern in
;;> turn, and the corresponding body is evaluated for the first to
;;> succeed.  Thus, a list of three elements matches a list of three
;;> elements.

;;> \example{(let ((ls (list 1 2 3))) (match ls ((1 2 3) #t)))}

;;> If no patterns match an error is signalled.

;;> Identifiers will match anything, and make the corresponding
;;> binding available in the body.

;;> \example{(match (list 1 2 3) ((a b c) b))}

;;> If the same identifier occurs multiple times, the first instance
;;> will match anything, but subsequent instances must match a value
;;> which is \scheme{equal?} to the first.

;;> \example{(match (list 1 2 1) ((a a b) 1) ((a b a) 2))}

;;> The special identifier \scheme{_} matches anything, no matter how
;;> many times it is used, and does not bind the result in the body.

;;> \example{(match (list 1 2 1) ((_ _ b) 1) ((a b a) 2))}

;;> To match a literal identifier (or list or any other literal), use
;;> \scheme{quote}.

;;> \example{(match 'a ('b 1) ('a 2))}

;;> Analogous to its normal usage in scheme, \scheme{quasiquote} can
;;> be used to quote a mostly literally matching object with selected
;;> parts unquoted.

;;> \example|{(match (list 1 2 3) (`(1 ,b ,c) (list b c)))}|

;;> Often you want to match any number of a repeated pattern.  Inside
;;> a list pattern you can append \scheme{...} after an element to
;;> match zero or more of that pattern (like a regexp Kleene star).

;;> \example{(match (list 1 2) ((1 2 3 ...) #t))}
;;> \example{(match (list 1 2 3) ((1 2 3 ...) #t))}
;;> \example{(match (list 1 2 3 3 3) ((1 2 3 ...) #t))}

;;> Pattern variables matched inside the repeated pattern are bound to
;;> a list of each matching instance in the body.

;;> \example{(match (list 1 2) ((a b c ...) c))}
;;> \example{(match (list 1 2 3) ((a b c ...) c))}
;;> \example{(match (list 1 2 3 4 5) ((a b c ...) c))}

;;> More than one \scheme{...} may not be used in the same list, since
;;> this would require exponential backtracking in the general case.
;;> However, \scheme{...} need not be the final element in the list,
;;> and may be succeeded by a fixed number of patterns.

;;> \example{(match (list 1 2 3 4) ((a b c ... d e) c))}
;;> \example{(match (list 1 2 3 4 5) ((a b c ... d e) c))}
;;> \example{(match (list 1 2 3 4 5 6 7) ((a b c ... d e) c))}

;;> \scheme{___} is provided as an alias for \scheme{...} when it is
;;> inconvenient to use the ellipsis (as in a syntax-rules template).

;;> The \scheme{..1} syntax is exactly like the \scheme{...} except
;;> that it matches one or more repetitions (like a regexp "+").

;;> \example{(match (list 1 2) ((a b c ..1) c))}
;;> \example{(match (list 1 2 3) ((a b c ..1) c))}

;;> The boolean operators \scheme{and}, \scheme{or} and \scheme{not}
;;> can be used to group and negate patterns analogously to their
;;> Scheme counterparts.

;;> The \scheme{and} operator ensures that all subpatterns match.
;;> This operator is often used with the idiom \scheme{(and x pat)} to
;;> bind \var{x} to the entire value that matches \var{pat}
;;> (cf. "as-patterns" in ML or Haskell).  Another common use is in
;;> conjunction with \scheme{not} patterns to match a general case
;;> with certain exceptions.

;;> \example{(match 1 ((and) #t))}
;;> \example{(match 1 ((and x) x))}
;;> \example{(match 1 ((and x 1) x))}
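
;;> As a hedged illustration of the general-case-with-exceptions
;;> idiom (the input values are made up for this example),
;;> \scheme{(and x (not 0))} binds \var{x} to any value other than
;;> \scheme{0}, so the first form below should divide and the second
;;> should fall through to the catch-all clause:

;;> \example{(match 5 ((and x (not 0)) (/ 10 x)) (_ 'undefined))}
;;> \example{(match 0 ((and x (not 0)) (/ 10 x)) (_ 'undefined))}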

;;> The \scheme{or} operator ensures that at least one subpattern
;;> matches.  If the same identifier occurs in different subpatterns,
;;> it is matched independently.  All identifiers from all subpatterns
;;> are bound if the \scheme{or} operator matches, but the binding is
;;> only defined for identifiers from the subpattern which matched.

;;> \example{(match 1 ((or) #t) (else #f))}
;;> \example{(match 1 ((or x) x))}
;;> \example{(match 1 ((or x 2) x))}
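
;;> A small sketch of an identifier shared between subpatterns (the
;;> lists are chosen only for illustration): \var{a} appears in both
;;> alternatives and is bound by whichever one succeeds.

;;> \example{(match (list 1 2) ((or (a 1) (1 a)) a))}
;;> \example{(match (list 3 1) ((or (a 1) (1 a)) a))}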

;;> The \scheme{not} operator succeeds if the given pattern doesn't
;;> match.  None of the identifiers used are available in the body.

;;> \example{(match 1 ((not 2) #t))}

;;> The more general operator \scheme{?} can be used to provide a
;;> predicate.  The usage is \scheme{(? predicate pat ...)} where
;;> \var{predicate} is a Scheme expression evaluating to a predicate
;;> called on the value to match, and any optional patterns after the
;;> predicate are then matched as in an \scheme{and} pattern.

;;> \example{(match 1 ((? odd? x) x))}
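
;;> A brief sketch of the optional trailing patterns (inputs are
;;> illustrative): the predicate guards the value first, and the
;;> remaining patterns then destructure it.  A failed predicate
;;> simply moves on to the next clause.

;;> \example{(match (list 1 2) ((? list? (a b)) (+ a b)))}
;;> \example{(match "abc" ((? number? n) n) ((? string? s) (string-length s)))}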

;;> The field operator \scheme{=} is used to extract an arbitrary
;;> field and match against it.  It is useful for more complex or
;;> conditional destructuring that can't be more directly expressed in
;;> the pattern syntax.  The usage is \scheme{(= field pat)}, where
;;> \var{field} can be any expression, and should result in a
;;> procedure of one argument, which is applied to the value to match
;;> to generate a new value to match against \var{pat}.

;;> Thus the pattern \scheme{(and (= car x) (= cdr y))} is equivalent
;;> to \scheme{(x . y)}, except it will result in an immediate error
;;> if the value isn't a pair.

;;> \example{(match '(1 . 2) ((= car x) x))}
;;> \example{(match 4 ((= square x) x))}
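
;;> As a further sketch of conditional destructuring (the strings are
;;> illustrative), \scheme{=} combines naturally with \scheme{?}: the
;;> value is first converted with \scheme{string->number}, and the
;;> converted value is then tested and bound.  When the conversion
;;> yields \scheme{#f} the predicate fails and the next clause is tried.

;;> \example{(match "42" ((= string->number (? number? n)) n) (_ #f))}
;;> \example{(match "abc" ((= string->number (? number? n)) n) (_ #f))}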

;;> The record operator \scheme{$} is used as a concise way to match
;;> records defined by SRFI-9 (or SRFI-99).  The usage is
;;> \scheme{($ rtd field ...)}, where \var{rtd} should be the record
;;> type descriptor specified as the first argument to
;;> \scheme{define-record-type}, and each \var{field} is a subpattern
;;> matched against the fields of the record in order.  Not all fields
;;> must be present.

;;> \example{
;;> (let ()
;;>   (define-record-type employee
;;>     (make-employee name title)
;;>     employee?
;;>     (name get-name)
;;>     (title get-title))
;;>   (match (make-employee "Bob" "Doctor")
;;>     (($ employee n t) (list t n))))
;;> }

;;> For records with more fields it can be helpful to match them by
;;> name rather than position.  For this you can use the \scheme{@}
;;> operator, originally a Gauche extension:

;;> \example{
;;> (let ()
;;>   (define-record-type employee
;;>     (make-employee name title)
;;>     employee?
;;>     (name get-name)
;;>     (title get-title))
;;>   (match (make-employee "Bob" "Doctor")
;;>     ((@ employee (title t) (name n)) (list t n))))
;;> }

;;> The \scheme{set!} and \scheme{get!} operators are used to bind an
;;> identifier to the setter and getter of a field, respectively.  The
;;> setter is a procedure of one argument, which mutates the field to
;;> that argument.  The getter is a procedure of no arguments which
;;> returns the current value of the field.

;;> \example{(let ((x (cons 1 2))) (match x ((1 . (set! s)) (s 3) x)))}
;;> \example{(match '(1 . 2) ((1 . (get! g)) (g)))}

;;> The new operator \scheme{***} can be used to search a tree for
;;> subpatterns.  A pattern of the form \scheme{(x *** y)} represents
;;> the subpattern \var{y} located somewhere in a tree where the path
;;> from the current object to \var{y} can be seen as a list of the
;;> form \scheme{(x ...)}.  \var{y} can immediately match the current
;;> object in which case the path is the empty list.  In a sense it's
;;> a 2-dimensional version of the \scheme{...} pattern.

;;> As a common case the pattern \scheme{(_ *** y)} can be used to
;;> search for \var{y} anywhere in a tree, regardless of the path
;;> used.

;;> \example{(match '(a (a (a b))) ((x *** 'b) x))}
;;> \example{(match '(a (b) (c (d e) (f g))) ((x *** 'g) x))}
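
;;> A minimal sketch of the \scheme{(_ *** y)} form used as a search
;;> test (the tree is made up for illustration).  Judging from the
;;> implementation, the head of each enclosing list is consumed as the
;;> path element, and the search descends into the remaining elements:

;;> \example{(match '(a (b (c d))) ((_ *** 'd) #t) (_ #f))}
;;> \example{(match '(a (b (c d))) ((_ *** 'z) #t) (_ #f))}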

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Notes

;; The implementation is a simple generative pattern matcher - each
;; pattern is expanded into the required tests, calling a failure
;; continuation if the tests fail.  This makes the logic easy to
;; follow and extend, but produces sub-optimal code in cases where you
;; have many similar clauses due to repeating the same tests.
;; Nonetheless a smart compiler should be able to remove the redundant
;; tests.  For MATCH-LET and DESTRUCTURING-BIND type uses there is no
;; performance hit.

;; The original version was written on 2006/11/29 and described in the
;; following Usenet post:
;;   http://groups.google.com/group/comp.lang.scheme/msg/0941234de7112ffd
;; and is still available at
;;   http://synthcode.com/scheme/match-simple.scm
;; It's just 80 lines for the core MATCH, and an extra 40 lines for
;; MATCH-LET, MATCH-LAMBDA and other syntactic sugar.
;;
;; A variant of this file which uses COND-EXPAND in a few places for
;; performance can be found at
;;   http://synthcode.com/scheme/match-cond-expand.scm
;;
;; 2016/03/06 - fixing named match-let (thanks to Stefan Israelsson Tampe)
;; 2015/05/09 - fixing bug in var extraction of quasiquote patterns
;; 2014/11/24 - adding Gauche's `@' pattern for named record field matching
;; 2012/12/26 - wrapping match-let&co body in lexical closure
;; 2012/11/28 - fixing typo s/vetor/vector in largely unused set! code
;; 2012/05/23 - fixing combinatorial explosion of code in certain or patterns
;; 2011/09/25 - fixing bug when directly matching an identifier repeated in
;;              the pattern (thanks to Stefan Israelsson Tampe)
;; 2011/01/27 - fixing bug when matching tail patterns against improper lists
;; 2010/09/26 - adding `..1' patterns (thanks to Ludovic Courtès)
;; 2010/09/07 - fixing identifier extraction in some `...' and `***' patterns
;; 2009/11/25 - adding `***' tree search patterns
;; 2008/03/20 - fixing bug where (a ...) matched non-lists
;; 2008/03/15 - removing redundant check in vector patterns
;; 2008/03/06 - you can use `...' portably now (thanks to Taylor Campbell)
;; 2007/09/04 - fixing quasiquote patterns
;; 2007/07/21 - allowing ellipsis patterns in non-final list positions
;; 2007/04/10 - fixing potential hygiene issue in match-check-ellipsis
;;              (thanks to Taylor Campbell)
;; 2007/04/08 - clean up, commenting
;; 2006/12/24 - bugfixes
;; 2006/12/01 - non-linear patterns, shared variables in OR, get!/set!

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; force compile-time syntax errors with useful messages

(define-syntax match-syntax-error
  (syntax-rules ()
    ((_) (match-syntax-error "invalid match-syntax-error usage"))))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;> \section{Syntax}

;;> \macro{(match expr (pattern . body) ...)\br{}
;;> (match expr (pattern (=> failure) . body) ...)}

;;> The result of \var{expr} is matched against each \var{pattern} in
;;> turn, according to the pattern rules described in the previous
;;> section, until the first \var{pattern} matches.  When a match is
;;> found, the corresponding \var{body}s are evaluated in order,
;;> and the result of the last expression is returned as the result
;;> of the entire \scheme{match}.  If a \var{failure} is provided,
;;> then it is bound to a procedure of no arguments which continues
;;> processing at the next \var{pattern}.  If no \var{pattern} matches,
;;> an error is signalled.
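
;;> A small sketch of the \scheme{(=> failure)} form (the name
;;> \var{next} and the data are illustrative): the first clause
;;> matches structurally, but its body rejects the value by calling
;;> \var{next}, which resumes matching at the following clause.

;;> \example{
;;> (match (list 1 2)
;;>   ((a b) (=> next) (if (> a b) 'descending (next)))
;;>   (_ 'other))
;;> }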

;; The basic interface.  MATCH just performs some basic syntax
;; validation, binds the match expression to a temporary variable `v',
;; and passes it on to MATCH-NEXT.  It's an invariant throughout the
;; code below that the binding `v' is a direct variable reference, not
;; an expression.

(define-syntax match
  (syntax-rules ()
    ((match)
     (match-syntax-error "missing match expression"))
    ((match atom)
     (match-syntax-error "no match clauses"))
    ((match (app ...) (pat . body) ...)
     (let ((v (app ...)))
       (match-next v ((app ...) (set! (app ...))) (pat . body) ...)))
    ((match #(vec ...) (pat . body) ...)
     (let ((v #(vec ...)))
       (match-next v (v (set! v)) (pat . body) ...)))
    ((match atom (pat . body) ...)
     (let ((v atom))
       (match-next v (atom (set! atom)) (pat . body) ...)))
    ))

;; MATCH-NEXT passes each clause to MATCH-ONE in turn with its failure
;; thunk, which is expanded by recursing MATCH-NEXT on the remaining
;; clauses.  `g+s' is a list of two elements, the get! and set!
;; expressions respectively.

(define-syntax match-next
  (syntax-rules (=>)
    ;; no more clauses, the match failed
    ((match-next v g+s)
     (error 'match "no matching pattern"))
    ;; named failure continuation
    ((match-next v g+s (pat (=> failure) . body) . rest)
     (let ((failure (lambda () (match-next v g+s . rest))))
       ;; match-one analyzes the pattern for us
       (match-one v pat g+s (match-drop-ids (begin . body)) (failure) ())))
    ;; anonymous failure continuation, give it a dummy name
    ((match-next v g+s (pat . body) . rest)
     (match-next v g+s (pat (=> failure) . body) . rest))))
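
;; As a rough, hand-simplified illustration (not literal expander
;; output; hygienic renaming and the MATCH-DROP-IDS wrappers are
;; omitted), a use such as
;;
;;   (match x ((a) a) (_ 'none))
;;
;; unfolds into nested failure thunks along these lines:
;;
;;   (let ((v x))
;;     (let ((failure              ; thunk that tries the second clause
;;            (lambda ()
;;              (let ((failure (lambda ()
;;                               (error 'match "no matching pattern"))))
;;                'none))))
;;       (if (and (pair? v) (null? (cdr v)))
;;           (let ((w (car v)))
;;             (let ((a w)) a))
;;           (failure))))
;;
;; Each clause gets its own thunk, so a failing test anywhere inside a
;; pattern only has to call the innermost FAILURE to resume with the
;; remaining clauses.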

;; MATCH-ONE first checks for ellipsis patterns, otherwise passes on to
;; MATCH-TWO.

(define-syntax match-one
  (syntax-rules ()
    ;; If it's a list of two or more values, check to see if the
    ;; second one is an ellipsis and handle accordingly, otherwise go
    ;; to MATCH-TWO.
    ((match-one v (p q . r) g+s sk fk i)
     (match-check-ellipsis
      q
      (match-extract-vars p (match-gen-ellipsis v p r g+s sk fk i) i ())
      (match-two v (p q . r) g+s sk fk i)))
    ;; Go directly to MATCH-TWO.
    ((match-one . x)
     (match-two . x))))

;; This is the guts of the pattern matcher.  We are passed a lot of
;; information in the form:
;;
;;   (match-two var pattern (getter setter) success-k fail-k (ids ...))
;;
;; usually abbreviated
;;
;;   (match-two v p g+s sk fk i)
;;
;; where VAR is the symbol name of the current variable we are
;; matching, PATTERN is the current pattern, getter and setter are the
;; corresponding accessors (e.g. CAR and SET-CAR! of the pair holding
;; VAR), SUCCESS-K is the success continuation, FAIL-K is the failure
;; continuation (which is just a thunk call and is thus safe to expand
;; multiple times) and IDS are the list of identifiers bound in the
;; pattern so far.

(define-syntax match-two
  (syntax-rules (_ ___ ..1 *** quote quasiquote ? $ struct @ object = and or not set! get!)
    ((match-two v () g+s (sk ...) fk i)
     (if (null? v) (sk ... i) fk))
    ((match-two v (quote p) g+s (sk ...) fk i)
     (if (equal? v 'p) (sk ... i) fk))
    ((match-two v (quasiquote p) . x)
     (match-quasiquote v p . x))
    ((match-two v (and) g+s (sk ...) fk i) (sk ... i))
    ((match-two v (and p q ...) g+s sk fk i)
     (match-one v p g+s (match-one v (and q ...) g+s sk fk) fk i))
    ((match-two v (or) g+s sk fk i) fk)
    ((match-two v (or p) . x)
     (match-one v p . x))
    ((match-two v (or p ...) g+s sk fk i)
     (match-extract-vars (or p ...) (match-gen-or v (p ...) g+s sk fk i) i ()))
    ((match-two v (not p) g+s (sk ...) fk i)
     (match-one v p g+s (match-drop-ids fk) (sk ... i) i))
    ((match-two v (get! getter) (g s) (sk ...) fk i)
     (let ((getter (lambda () g))) (sk ... i)))
    ((match-two v (set! setter) (g (s ...)) (sk ...) fk i)
     (let ((setter (lambda (x) (s ... x)))) (sk ... i)))
    ((match-two v (? pred . p) g+s sk fk i)
     (if (pred v) (match-one v (and . p) g+s sk fk i) fk))
    ((match-two v (= proc p) . x)
     (let ((w (proc v))) (match-one w p . x)))
    ((match-two v (p ___ . r) g+s sk fk i)
     (match-extract-vars p (match-gen-ellipsis v p r g+s sk fk i) i ()))
    ((match-two v (p) g+s sk fk i)
     (if (and (pair? v) (null? (cdr v)))
         (let ((w (car v)))
           (match-one w p ((car v) (set-car! v)) sk fk i))
         fk))
    ((match-two v (p *** q) g+s sk fk i)
     (match-extract-vars p (match-gen-search v p q g+s sk fk i) i ()))
    ((match-two v (p *** . q) g+s sk fk i)
     (match-syntax-error "invalid use of ***" (p *** . q)))
    ((match-two v (p ..1) g+s sk fk i)
     (if (pair? v)
         (match-one v (p ___) g+s sk fk i)
         fk))
    ((match-two v ($ rec p ...) g+s sk fk i)
     (if (is-a? v rec)
         (match-record-refs v rec 0 (p ...) g+s sk fk i)
         fk))
    ((match-two v (struct rec p ...) g+s sk fk i)
     (if (is-a? v rec)
         (match-record-refs v rec 0 (p ...) g+s sk fk i)
         fk))
    ((match-two v (@ rec p ...) g+s sk fk i)
     (if (is-a? v rec)
         (match-record-named-refs v rec (p ...) g+s sk fk i)
         fk))
    ((match-two v (object rec p ...) g+s sk fk i)
     (if (is-a? v rec)
         (match-record-named-refs v rec (p ...) g+s sk fk i)
         fk))
    ((match-two v (p . q) g+s sk fk i)
     (if (pair? v)
         (let ((w (car v)) (x (cdr v)))
           (match-one w p ((car v) (set-car! v))
                      (match-one x q ((cdr v) (set-cdr! v)) sk fk)
                      fk
                      i))
         fk))
    ((match-two v #(p ...) g+s . x)
     (match-vector v 0 () (p ...) . x))
    ((match-two v _ g+s (sk ...) fk i) (sk ... i))
    ;; Not a pair or vector or special literal, test to see if it's a
    ;; new symbol, in which case we just bind it, or if it's an
    ;; already bound symbol or some other literal, in which case we
    ;; compare it with EQUAL?.
    ((match-two v x g+s (sk ...) fk (id ...))
     (let-syntax
         ((new-sym?
           (syntax-rules (id ...)
             ((new-sym? x sk2 fk2) sk2)
             ((new-sym? y sk2 fk2) fk2))))
       (new-sym? random-sym-to-match
                 (let ((x v)) (sk ... (id ... x)))
                 (if (equal? v x) (sk ... (id ...)) fk))))
    ))
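
;; The NEW-SYM? trick in the last clause can be isolated as a toy
;; macro.  This is an untested sketch: IF-FRESH-IDENTIFIER and PROBE
;; are hypothetical names that appear nowhere else in this file.
;;
;;   (define-syntax if-fresh-identifier
;;     (syntax-rules ()
;;       ((_ x (id ...) yes no)
;;        (let-syntax
;;            ((probe
;;              (syntax-rules (id ...)   ; known ids become literals
;;                ((probe x y n) y)      ; x acts as a pattern variable
;;                ((probe z y n) n))))   ; x was a literal or a datum
;;          (probe random-sym-to-match yes no)))))
;;
;; Reasoning by hand from the syntax-rules matching rules, with the
;; known ids taken to be (a):
;;
;;   (if-fresh-identifier b  (a) 'fresh 'known)  ; should select 'fresh
;;   (if-fresh-identifier a  (a) 'fresh 'known)  ; should select 'known
;;   (if-fresh-identifier 42 (a) 'fresh 'known)  ; should select 'known
;;
;; An unlisted identifier is a pattern variable and so matches
;; RANDOM-SYM-TO-MATCH; a listed identifier or a datum such as 42 only
;; matches itself, so the second rule fires instead.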

;; QUASIQUOTE patterns

(define-syntax match-quasiquote
  (syntax-rules (unquote unquote-splicing quasiquote)
    ((_ v (unquote p) g+s sk fk i)
     (match-one v p g+s sk fk i))
    ((_ v ((unquote-splicing p) . rest) g+s sk fk i)
     (if (pair? v)
       (match-one v
                  (p . tmp)
                  (match-quasiquote tmp rest g+s sk fk)
                  fk
                  i)
       fk))
    ((_ v (quasiquote p) g+s sk fk i . depth)
     (match-quasiquote v p g+s sk fk i #f . depth))
    ((_ v (unquote p) g+s sk fk i x . depth)
     (match-quasiquote v p g+s sk fk i . depth))
    ((_ v (unquote-splicing p) g+s sk fk i x . depth)
     (match-quasiquote v p g+s sk fk i . depth))
    ((_ v (p . q) g+s sk fk i . depth)
     (if (pair? v)
       (let ((w (car v)) (x (cdr v)))
         (match-quasiquote
          w p g+s
          (match-quasiquote-step x q g+s sk fk depth)
          fk i . depth))
       fk))
    ((_ v #(elt ...) g+s sk fk i . depth)
     (if (vector? v)
       (let ((ls (vector->list v)))
         (match-quasiquote ls (elt ...) g+s sk fk i . depth))
       fk))
    ((_ v x g+s sk fk i . depth)
     (match-one v 'x g+s sk fk i))))

(define-syntax match-quasiquote-step
  (syntax-rules ()
    ((match-quasiquote-step x q g+s sk fk depth i)
     (match-quasiquote x q g+s sk fk i . depth))))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Utilities

;; Takes an expression followed by any number of ids and just expands
;; into the expression.
(define-syntax match-drop-ids
  (syntax-rules ()
    ((_ expr ids ...) expr)))

(define-syntax match-tuck-ids
  (syntax-rules ()
    ((_ (letish args (expr ...)) ids ...)
     (letish args (expr ... ids ...)))))

(define-syntax match-drop-first-arg
  (syntax-rules ()
    ((_ arg expr) expr)))

;; To expand an OR group we try each clause in succession, passing the
;; first that succeeds to the success continuation.  On failure for
;; any clause, we just try the next clause, finally resorting to the
;; failure continuation fk if all clauses fail.  The only trick is
;; that we want to unify the identifiers, so that the success
;; continuation can refer to a variable from any of the OR clauses.

(define-syntax match-gen-or
  (syntax-rules ()
    ((_ v p g+s (sk ...) fk (i ...) ((id id-ls) ...))
     (let ((sk2 (lambda (id ...) (sk ... (i ... id ...)))))
       (match-gen-or-step v p g+s (match-drop-ids (sk2 id ...)) fk (i ...))))))

(define-syntax match-gen-or-step
  (syntax-rules ()
    ((_ v () g+s sk fk . x)
     ;; no OR clauses, call the failure continuation
     fk)
    ((_ v (p) . x)
     ;; last (or only) OR clause, just expand normally
     (match-one v p . x))
    ((_ v (p . q) g+s sk fk i)
     ;; match one and try the remaining on failure
     (let ((fk2 (lambda () (match-gen-or-step v q g+s sk fk i))))
       (match-one v p g+s sk (fk2) i)))
    ))
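
;; As a rough hand-expansion (simplified; FAIL stands for the
;; enclosing failure continuation and BODY for the clause body, both
;; placeholders), the pattern (or 1 2) matched against V comes out
;; along these lines:
;;
;;   (let ((sk2 (lambda () body)))
;;     (let ((fk2 (lambda () (if (equal? v 2) (sk2) (fail)))))
;;       (if (equal? v 1) (sk2) (fk2))))
;;
;; The body is wrapped once in SK2, so each extra alternative adds a
;; test and a small thunk but does not duplicate the body.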

;; We match a pattern (p ...) by matching the pattern p in a loop on
;; each element of the variable, accumulating the bound ids into lists.

;; Look at the body of the simple case - it's just a named let loop,
;; matching each element in turn to the same pattern.  The only trick
;; is that we want to keep track of the lists of each extracted id, so
;; when the loop recurses we cons the ids onto their respective list
;; variables, and on success we bind the ids (what the user input and
;; expects to see in the success body) to the reversed accumulated
;; list IDs.

(define-syntax match-gen-ellipsis
  (syntax-rules ()
    ((_ v p () g+s (sk ...) fk i ((id id-ls) ...))
     (match-check-identifier p
       ;; simplest case equivalent to (p ...), just bind the list
       (let ((p v))
         (if (list? p)
             (sk ... i)
             fk))
       ;; simple case, match all elements of the list
       (let loop ((ls v) (id-ls '()) ...)
         (cond
           ((null? ls)
            (let ((id (reverse id-ls)) ...) (sk ... i)))
           ((pair? ls)
            (let ((w (car ls)))
              (match-one w p ((car ls) (set-car! ls))
                         (match-drop-ids (loop (cdr ls) (cons id id-ls) ...))
                         fk i)))
           (else
            fk)))))
    ((_ v p r g+s (sk ...) fk i ((id id-ls) ...))
     ;; general case, trailing patterns to match, keep track of the
     ;; remaining list length so we don't need any backtracking
     (match-verify-no-ellipsis
      r
      (let* ((tail-len (length 'r))
             (ls v)
             (len (and (list? ls) (length ls))))
        (if (or (not len) (< len tail-len))
            fk
            (let loop ((ls ls) (n len) (id-ls '()) ...)
              (cond
                ((= n tail-len)
                 (let ((id (reverse id-ls)) ...)
                   (match-one ls r (#f #f) (sk ...) fk i)))
                ((pair? ls)
                 (let ((w (car ls)))
                   (match-one w p ((car ls) (set-car! ls))
                              (match-drop-ids
                               (loop (cdr ls) (- n 1) (cons id id-ls) ...))
                              fk
                              i)))
                (else
                 fk)))))))))
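
;; For intuition, the loop generated in the simple case for a pattern
;; like ((k . x) ...) behaves much like the ordinary procedure below.
;; This is a hand-written sketch, not expander output; COLLECT-PAIRS,
;; SK and FK are hypothetical names, with SK and FK taken to be
;; procedures rather than macro continuations.
;;
;;   (define (collect-pairs v sk fk)
;;     (let loop ((ls v) (k-ls '()) (x-ls '()))
;;       (cond
;;         ((null? ls)                     ; end of list: success
;;          (sk (reverse k-ls) (reverse x-ls)))
;;         ((and (pair? ls) (pair? (car ls)))
;;          (loop (cdr ls)                 ; cons each id onto its list
;;                (cons (car (car ls)) k-ls)
;;                (cons (cdr (car ls)) x-ls)))
;;         (else (fk)))))                  ; improper list or bad element
;;
;;   (collect-pairs '((a . 1) (b . 2)) list (lambda () #f))
;;   ;; => ((a b) (1 2))
;;
;; The accumulators are built in reverse and reversed once at the end,
;; which is what the (reverse id-ls) bindings above do.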

;; This is just a safety check.  Although unlike syntax-rules we allow
;; trailing patterns after an ellipsis, we explicitly disable multiple
;; ellipses at the same level.  This is because in the general case
;; such patterns are exponential in the number of ellipses, and we
;; don't want to make it easy to construct very expensive operations
;; with simple looking patterns.  For example, it would be O(n^2) for
;; patterns like (a ... b ...) because we must consider every trailing
;; element for every possible break for the leading "a ...".

(define-syntax match-verify-no-ellipsis
  (syntax-rules ()
    ((_ (x . y) sk)
     (match-check-ellipsis
      x
      (match-syntax-error
       "multiple ellipsis patterns not allowed at same level")
      (match-verify-no-ellipsis y sk)))
    ((_ () sk)
     sk)
    ((_ x sk)
     (match-syntax-error "dotted tail not allowed after ellipsis" x))))

;; To implement the tree search, we use two recursive procedures.  TRY
;; attempts to match Y once, and on success it calls the normal SK on
;; the accumulated list ids as in MATCH-GEN-ELLIPSIS.  On failure, we
;; call NEXT which first checks if the current value is a list
;; beginning with X, then calls TRY on each remaining element of the
;; list.  Since TRY will recursively call NEXT again on failure, this
;; effects a full depth-first search.
;;
;; The failure continuation throughout is a jump to the next step in
;; the tree search, initialized with the original failure continuation
;; FK.

(define-syntax match-gen-search
  (syntax-rules ()
    ((match-gen-search v p q g+s sk fk i ((id id-ls) ...))
     (letrec ((try (lambda (w fail id-ls ...)
                     (match-one w q g+s
                                (match-tuck-ids
                                 (let ((id (reverse id-ls)) ...)
                                   sk))
                                (next w fail id-ls ...) i)))
              (next (lambda (w fail id-ls ...)
                      (if (not (pair? w))
                          (fail)
                          (let ((u (car w)))
                            (match-one
                             u p ((car w) (set-car! w))
                             (match-drop-ids
                              ;; accumulate the head variables from
                              ;; the p pattern, and loop over the tail
                              (let ((id-ls (cons id id-ls)) ...)
                                (let lp ((ls (cdr w)))
                                  (if (pair? ls)
                                      (try (car ls)
                                           (lambda () (lp (cdr ls)))
                                           id-ls ...)
                                      (fail)))))
                             (fail) i))))))
       ;; the initial id-ls binding here is a dummy to get the right
       ;; number of '()s
       (let ((id-ls '()) ...)
         (try v (lambda () fk) id-ls ...))))))

;; Vector patterns are just more of the same, with the slight
;; exception that we pass around the current vector index being
;; matched.

(define-syntax match-vector
  (syntax-rules (___)
    ((_ v n pats (p q) . x)
     (match-check-ellipsis q
                          (match-gen-vector-ellipsis v n pats p . x)
                          (match-vector-two v n pats (p q) . x)))
    ((_ v n pats (p ___) sk fk i)
     (match-gen-vector-ellipsis v n pats p sk fk i))
    ((_ . x)
     (match-vector-two . x))))

;; Check the exact vector length, then check each element in turn.

(define-syntax match-vector-two
  (syntax-rules ()
    ((_ v n ((pat index) ...) () sk fk i)
     (if (vector? v)
         (let ((len (vector-length v)))
           (if (= len n)
               (match-vector-step v ((pat index) ...) sk fk i)
               fk))
         fk))
    ((_ v n (pats ...) (p . q) . x)
     (match-vector v (+ n 1) (pats ... (p n)) q . x))))

(define-syntax match-vector-step
  (syntax-rules ()
    ((_ v () (sk ...) fk i) (sk ... i))
    ((_ v ((pat index) . rest) sk fk i)
     (let ((w (vector-ref v index)))
       (match-one w pat ((vector-ref v index) (vector-set! v index))
                  (match-vector-step v rest sk fk)
                  fk i)))))

;; With a vector ellipsis pattern we first check to see if the vector
;; length is at least the required length.

(define-syntax match-gen-vector-ellipsis
  (syntax-rules ()
    ((_ v n ((pat index) ...) p sk fk i)
     (if (vector? v)
       (let ((len (vector-length v)))
         (if (>= len n)
           (match-vector-step v ((pat index) ...)
                              (match-vector-tail v p n len sk fk)
                              fk i)
           fk))
       fk))))

(define-syntax match-vector-tail
  (syntax-rules ()
    ((_ v p n len sk fk i)
     (match-extract-vars p (match-vector-tail-two v p n len sk fk i) i ()))))

(define-syntax match-vector-tail-two
  (syntax-rules ()
    ((_ v p n len (sk ...) fk i ((id id-ls) ...))
     (let loop ((j n) (id-ls '()) ...)
       (if (>= j len)
         (let ((id (reverse id-ls)) ...) (sk ... i))
         (let ((w (vector-ref v j)))
           (match-one w p ((vector-ref v j) (vector-set! v j))
                      (match-drop-ids (loop (+ j 1) (cons id id-ls) ...))
                      fk i)))))))
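
;; A usage sketch for the vector macros above (the inputs are
;; illustrative):
;;
;;   (match (vector 1 2 3) (#(a b c) (+ a b c)))   ; exact length
;;   (match (vector 1 2 3 4) (#(a b ...) b))       ; trailing ellipsis
;;
;; Tracing the macros by hand, the first form goes through
;; MATCH-VECTOR-TWO, which requires the length to be exactly 3, and
;; should give 6.  The second goes through MATCH-GEN-VECTOR-ELLIPSIS,
;; which only requires at least one element, and should collect the
;; rest as (2 3 4).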

(define-syntax match-record-refs
  (syntax-rules ()
    ((_ v rec n (p . q) g+s sk fk i)
     (let ((w (slot-ref rec v n)))
       (match-one w p ((slot-ref rec v n) (slot-set! rec v n))
                  (match-record-refs v rec (+ n 1) q g+s sk fk) fk i)))
    ((_ v rec n () g+s (sk ...) fk i)
     (sk ... i))))

(define-syntax match-record-named-refs
  (syntax-rules ()
    ((_ v rec ((f p) . q) g+s sk fk i)
     (let ((w (slot-ref rec v 'f)))
       (match-one w p ((slot-ref rec v 'f) (slot-set! rec v 'f))
                  (match-record-named-refs v rec q g+s sk fk) fk i)))
    ((_ v rec () g+s (sk ...) fk i)
     (sk ... i))))

;; Extract all identifiers in a pattern.  This is a little more
;; complicated than just looking for symbols: we need to ignore special
;; keywords and non-pattern forms (such as the predicate expression in
;; ? patterns), and also ignore previously bound identifiers.
;;
;; Calls the continuation with all new vars as a list of the form
;; ((orig-var tmp-name) ...), where tmp-name can be used to uniquely
;; pair with the original variable (e.g. it's used in the ellipsis
;; generation for list variables).
;;
;; (match-extract-vars pattern continuation (ids ...) (new-vars ...))

(define-syntax match-extract-vars
  (syntax-rules (_ ___ ..1 *** ? $ struct @ object = quote quasiquote and or not get! set!)
    ((match-extract-vars (? pred . p) . x)
     (match-extract-vars p . x))
    ((match-extract-vars ($ rec . p) . x)
     (match-extract-vars p . x))
    ((match-extract-vars (struct rec . p) . x)
     (match-extract-vars p . x))
    ((match-extract-vars (@ rec (f p) ...) . x)
     (match-extract-vars (p ...) . x))
    ((match-extract-vars (object rec (f p) ...) . x)
     (match-extract-vars (p ...) . x))
    ((match-extract-vars (= proc p) . x)
     (match-extract-vars p . x))
    ((match-extract-vars (quote x) (k ...) i v)
     (k ... v))
    ((match-extract-vars (quasiquote x) k i v)
     (match-extract-quasiquote-vars x k i v (#t)))
    ((match-extract-vars (and . p) . x)
     (match-extract-vars p . x))
    ((match-extract-vars (or . p) . x)
     (match-extract-vars p . x))
    ((match-extract-vars (not . p) . x)
     (match-extract-vars p . x))
    ;; A non-keyword pair, expand the CAR with a continuation to
    ;; expand the CDR.
    ((match-extract-vars (p q . r) k i v)
     (match-check-ellipsis
      q
      (match-extract-vars (p . r) k i v)
      (match-extract-vars p (match-extract-vars-step (q . r) k i v) i ())))
    ((match-extract-vars (p . q) k i v)
     (match-extract-vars p (match-extract-vars-step q k i v) i ()))
    ((match-extract-vars #(p ...) . x)
     (match-extract-vars (p ...) . x))
    ((match-extract-vars _ (k ...) i v)    (k ... v))
    ((match-extract-vars ___ (k ...) i v)  (k ... v))
    ((match-extract-vars *** (k ...) i v)  (k ... v))
    ((match-extract-vars ..1 (k ...) i v)  (k ... v))
    ;; This is the main part, the only place where we might add a new
    ;; var if it's an unbound symbol.
783    ((match-extract-vars p (k ...) (i ...) v)
784     (let-syntax
785         ((new-sym?
786           (syntax-rules (i ...)
787             ((new-sym? p sk fk) sk)
788             ((new-sym? any sk fk) fk))))
789       (new-sym? random-sym-to-match
790                 (k ... ((p p-ls) . v))
791                 (k ... v))))
792    ))
793
794;; Stepper used in the above so it can expand the CAR and CDR
795;; separately.
796
797(define-syntax match-extract-vars-step
798  (syntax-rules ()
799    ((_ p k i v ((v2 v2-ls) ...))
800     (match-extract-vars p k (v2 ... . i) ((v2 v2-ls) ... . v)))
801    ))
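
;; A sketch of how the continuation receives the collected variables.
;; `show-vars' is a hypothetical helper used only for illustration; it
;; is not defined anywhere in this file.  Reading the stepper above,
;; each newly found identifier is consed onto the front of the
;; accumulator, so the list should arrive most-recent-first, and `_'
;; should contribute nothing.
;;
;;   (define-syntax show-vars
;;     (syntax-rules ()
;;       ((_ ((id tmp) ...)) '(id ...))))
;;
;;   (match-extract-vars (a (b _) . c) (show-vars) () ())
;;   ;; should evaluate to (c b a)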

(define-syntax match-extract-quasiquote-vars
  (syntax-rules (quasiquote unquote unquote-splicing)
    ((match-extract-quasiquote-vars (quasiquote x) k i v d)
     (match-extract-quasiquote-vars x k i v (#t . d)))
    ((match-extract-quasiquote-vars (unquote-splicing x) k i v d)
     (match-extract-quasiquote-vars (unquote x) k i v d))
    ((match-extract-quasiquote-vars (unquote x) k i v (#t))
     (match-extract-vars x k i v))
    ((match-extract-quasiquote-vars (unquote x) k i v (#t . d))
     (match-extract-quasiquote-vars x k i v d))
    ((match-extract-quasiquote-vars (x . y) k i v d)
     (match-extract-quasiquote-vars
      x
      (match-extract-quasiquote-vars-step y k i v d) i () d))
    ((match-extract-quasiquote-vars #(x ...) k i v d)
     (match-extract-quasiquote-vars (x ...) k i v d))
    ((match-extract-quasiquote-vars x (k ...) i v d)
     (k ... v))
    ))

(define-syntax match-extract-quasiquote-vars-step
  (syntax-rules ()
    ((_ x k i v d ((v2 v2-ls) ...))
     (match-extract-quasiquote-vars x k (v2 ... . i) ((v2 v2-ls) ... . v) d))
    ))


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Gimme some sugar baby.

;;> Shortcut for \scheme{lambda} + \scheme{match}.  Creates a
;;> procedure of one argument, and matches that argument against each
;;> clause.
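
;;> For instance, as a small sketch that assumes the macros from this
;;> file are in scope, a two-clause procedure applied to a two-element
;;> list selects its first clause:

;;> \example{((match-lambda ((a b) (+ a b)) ((a b c) (* a b c))) (list 2 3))}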

(define-syntax match-lambda
  (syntax-rules ()
    ((_ (pattern . body) ...) (lambda (expr) (match expr (pattern . body) ...)))))

;;> Similar to \scheme{match-lambda}.  Creates a procedure of any
;;> number of arguments, and matches the argument list against each
;;> clause.
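
;;> For instance, as a small sketch that assumes the macros from this
;;> file are in scope, calling the procedure with three arguments
;;> selects the three-element clause:

;;> \example{((match-lambda* ((a b) (+ a b)) ((a b c) (* a b c))) 2 3 4)}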

(define-syntax match-lambda*
  (syntax-rules ()
    ((_ (pattern . body) ...) (lambda expr (match expr (pattern . body) ...)))))

;;> Matches each var to the corresponding expression, and evaluates
;;> the body with all match variables in scope.  Raises an error if
;;> any of the expressions fail to match.  Syntax analogous to named
;;> let can also be used for recursive functions which match on their
;;> arguments as in \scheme{match-lambda*}.
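
;;> Two small sketches, assuming the macros from this file are in
;;> scope.  The first destructures a list and a vector side by side;
;;> the second uses the named-let form, swapping the halves of a pair
;;> on each iteration until the counter reaches zero.

;;> \example{(match-let (((a b) (list 1 2)) (#(c d) (vector 3 4))) (list a b c d))}

;;> \example{
;;> (match-let loop (((a . b) (cons 1 2)) (n 3))
;;>   (if (zero? n)
;;>       (list a b)
;;>       (loop (cons b a) (- n 1))))
;;> }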

(define-syntax match-let
  (syntax-rules ()
    ((_ ((var value) ...) . body)
     (match-let/helper let () () ((var value) ...) . body))
    ((_ loop ((var init) ...) . body)
     (match-named-let loop () ((var init) ...) . body))))

;;> Similar to \scheme{match-let}, but analogously to \scheme{letrec}
;;> matches and binds the variables with all match variables in scope.
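
;;> A small sketch with mutually recursive procedures, assuming the
;;> macros from this file are in scope.  It sticks to plain identifier
;;> bindings, since the helper in this version appears to bind
;;> pattern variables only inside the \scheme{letrec} body.  The names
;;> \scheme{ev?} and \scheme{od?} are illustrative.

;;> \example{
;;> (match-letrec ((ev? (lambda (n) (if (zero? n) #t (od? (- n 1)))))
;;>                (od? (lambda (n) (if (zero? n) #f (ev? (- n 1))))))
;;>   (ev? 10))
;;> }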

(define-syntax match-letrec
  (syntax-rules ()
    ((_ ((var value) ...) . body)
     (match-let/helper letrec () () ((var value) ...) . body))))

(define-syntax match-let/helper
  (syntax-rules ()
    ((_ let ((var expr) ...) () () . body)
     (let ((var expr) ...) . body))
    ((_ let ((var expr) ...) ((pat tmp) ...) () . body)
     (let ((var expr) ...)
       (match-let* ((pat tmp) ...)
         . body)))
    ((_ let (v ...) (p ...) (((a . b) expr) . rest) . body)
     (match-let/helper
      let (v ... (tmp expr)) (p ... ((a . b) tmp)) rest . body))
    ((_ let (v ...) (p ...) ((#(a ...) expr) . rest) . body)
     (match-let/helper
      let (v ... (tmp expr)) (p ... (#(a ...) tmp)) rest . body))
    ((_ let (v ...) (p ...) ((a expr) . rest) . body)
     (match-let/helper let (v ... (a expr)) (p ...) rest . body))))

(define-syntax match-named-let
  (syntax-rules ()
    ((_ loop ((pat expr var) ...) () . body)
     (let loop ((var expr) ...)
       (match-let ((pat var) ...)
         . body)))
    ((_ loop (v ...) ((pat expr) . rest) . body)
     (match-named-let loop (v ... (pat expr tmp)) rest . body))))

;;> \macro{(match-let* ((var value) ...) body ...)}

;;> Similar to \scheme{match-let}, but analogously to \scheme{let*}
;;> matches and binds the variables in sequence, with preceding match
;;> variables in scope.
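
;;> For instance, as a small sketch that assumes the macros from this
;;> file are in scope, the second binding can refer to \scheme{a} and
;;> \scheme{b} from the first:

;;> \example{(match-let* (((a b) (list 1 2)) ((c d) (list b (+ a b)))) (list a b c d))}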

(define-syntax match-let*
  (syntax-rules ()
    ((_ () . body)
     (let () . body))
    ((_ ((pat expr) . rest) . body)
     (match expr (pat (match-let* rest . body))))))


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Otherwise COND-EXPANDed bits.

(cond-expand
 (chibi
  (define-syntax match-check-ellipsis
    (er-macro-transformer
     (lambda (expr rename compare)
       (if (compare '... (cadr expr))
           (car (cddr expr))
           (cadr (cddr expr))))))
  (define-syntax match-check-identifier
    (er-macro-transformer
     (lambda (expr rename compare)
       (if (identifier? (cadr expr))
           (car (cddr expr))
           (cadr (cddr expr)))))))

 (else
  ;; Portable versions
  ;;
  ;; This *should* work, but doesn't :(
  ;;   (define-syntax match-check-ellipsis
  ;;     (syntax-rules (...)
  ;;       ((_ ... sk fk) sk)
  ;;       ((_ x sk fk) fk)))
  ;;
  ;; This is a little more complicated, and introduces a new let-syntax,
  ;; but should work portably in any R[56]RS Scheme.  Taylor Campbell
  ;; originally came up with the idea.
  (define-syntax match-check-ellipsis
    (syntax-rules ()
      ;; these two aren't necessary but provide fast-case failures
      ((match-check-ellipsis (a . b) success-k failure-k) failure-k)
      ((match-check-ellipsis #(a ...) success-k failure-k) failure-k)
      ;; matching an atom
      ((match-check-ellipsis id success-k failure-k)
       (let-syntax ((ellipsis? (syntax-rules ()
                                 ;; iff `id' is `...' here then this will
                                 ;; match a list of any length
                                 ((ellipsis? (foo id) sk fk) sk)
                                 ((ellipsis? other sk fk) fk))))
         ;; this list of three elements will only match the (foo id) list
         ;; above if `id' is `...'
         (ellipsis? (a b c) success-k failure-k)))))
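
  ;; A sketch of the intended behaviour when the macro is invoked
  ;; directly, assuming the host's syntax-rules treats the substituted
  ;; `...' as an ellipsis in the inner pattern (the trick relies on
  ;; this): only a literal `...' should select the success branch.
  ;;
  ;;   (match-check-ellipsis ... 'ellipsis 'other)  ; should give ellipsis
  ;;   (match-check-ellipsis x   'ellipsis 'other)  ; should give other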

  ;; This is portable but can be more efficient with non-portable
  ;; extensions.  This trick was originally discovered by Oleg Kiselyov.
  (define-syntax match-check-identifier
    (syntax-rules ()
      ;; fast-case failures, lists and vectors are not identifiers
      ((_ (x . y) success-k failure-k) failure-k)
      ((_ #(x ...) success-k failure-k) failure-k)
      ;; x is an atom
      ((_ x success-k failure-k)
       (let-syntax
           ((sym?
             (syntax-rules ()
               ;; if the symbol `abracadabra' matches x, then x is a
               ;; symbol
               ((sym? x sk fk) sk)
               ;; otherwise x is a non-symbol datum
               ((sym? y sk fk) fk))))
         (sym? abracadabra success-k failure-k)))))))
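
;; A sketch of the intended behaviour of match-check-identifier when
;; invoked directly: an identifier becomes a pattern variable in the
;; inner `sym?' rule and so matches `abracadabra', while any other
;; datum is compared literally and falls through.
;;
;;   (match-check-identifier foo   'id 'other)  ; should give id
;;   (match-check-identifier 42    'id 'other)  ; should give other
;;   (match-check-identifier (a b) 'id 'other)  ; should give other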