[[tags: eggs]]
[[toc:]]

== lexgen

=== Description

{{lexgen}} is a lexer generator whose core consists of only four
small procedures. The programmer combines these procedures into
regular expression pattern matchers.

A pattern matcher procedure takes a list of streams and returns a
new list of streams, advanced by every combination that the pattern
allows. A stream is a list that contains a list of the characters
consumed so far by the pattern matcher, and a list of the characters
not yet consumed.

Note that the number of streams returned by a pattern matcher
typically won't match the number of streams passed in. If the pattern
doesn't match at all, the empty list is returned.
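
The stream representation described above can be modelled in a few lines of plain Scheme. This is only an illustrative sketch: the helpers {{make-stream}}, {{stream-consumed}}, {{stream-remaining}}, {{string->streams}} and {{match-char}} are hypothetical names introduced here and are not part of the {{lexgen}} API.

```scheme
;; Hypothetical helpers, not part of lexgen.
;; A stream is modelled as a two-element list: (CONSUMED REMAINING).
(define (make-stream consumed remaining) (list consumed remaining))
(define (stream-consumed s) (car s))
(define (stream-remaining s) (cadr s))

;; Wrap a string as a list that holds one fresh stream.
(define (string->streams str)
  (list (make-stream '() (string->list str))))

;; A hand-written matcher for the single character C. Every stream
;; whose next character is C is advanced by one character; every
;; other stream is dropped from the result.
(define (match-char c)
  (lambda (streams)
    (let loop ((ss streams) (acc '()))
      (cond ((null? ss) (reverse acc))
            ((and (pair? (stream-remaining (car ss)))
                  (char=? c (car (stream-remaining (car ss)))))
             (let ((s (car ss)))
               (loop (cdr ss)
                     (cons (make-stream (cons c (stream-consumed s))
                                        (cdr (stream-remaining s)))
                           acc))))
            (else (loop (cdr ss) acc))))))

((match-char #\a) (string->streams "ab"))   ; => (((#\a) (#\b)))
((match-char #\x) (string->streams "ab"))   ; => ()

;; Two streams go in, only one comes out.
((match-char #\a)
 (append (string->streams "ab") (string->streams "ba")))
;; => (((#\a) (#\b)))
```

The last call shows why the number of streams returned need not equal the number passed in: streams that cannot be advanced simply disappear from the result.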


=== Library Procedures

Every combinator procedure in this library returns a procedure that
takes a list of streams as its argument.

==== Basic procedures

<procedure>(tok TOKEN PROC) => MATCHER</procedure>

Procedure {{tok}} builds a pattern matcher that, for each stream
given, applies {{PROC}} to the token {{TOKEN}} and the next input
character. If {{PROC}} returns a true value, that value is
prepended to the list of consumed elements, and the input character is
removed from the list of input elements.
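
The behaviour described for {{tok}} can be approximated by the following direct-style model. It is a sketch under assumptions, not the egg's implementation (the version history states that the 2.x core is written in continuation-passing style). The name {{tok-model}} is hypothetical, and a stream is assumed to be a two-element list of consumed and remaining characters.

```scheme
;; Hypothetical direct-style model of tok; not the lexgen source.
;; A stream is assumed to be (CONSUMED REMAINING).
(define (tok-model token proc)
  (lambda (streams)
    (let loop ((ss streams))
      (if (null? ss)
          '()
          (let* ((consumed  (car (car ss)))
                 (remaining (cadr (car ss)))
                 ;; Apply PROC to the token and the next input
                 ;; character, if there is one.
                 (v (and (pair? remaining)
                         (proc token (car remaining)))))
            (if v
                ;; The value returned by PROC (not necessarily the
                ;; input character) is prepended to CONSUMED.
                (cons (list (cons v consumed) (cdr remaining))
                      (loop (cdr ss)))
                (loop (cdr ss))))))))

(define start (list (list '() (string->list "abc"))))

;; PROC returns the matched character itself.
((tok-model #\a (lambda (t c) (and (char=? t c) c))) start)
;; => (((#\a) (#\b #\c)))

;; PROC returns a different true value, which is what gets consumed.
((tok-model #\a (lambda (t c) (and (char=? t c) (char-upcase c)))) start)
;; => (((#\A) (#\b #\c)))

;; No match: the stream is dropped.
((tok-model #\z (lambda (t c) (and (char=? t c) c))) start)
;; => ()
```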

<procedure>(seq MATCHER1 MATCHER2) => MATCHER</procedure>

{{seq}} builds a matcher that matches a sequence of two patterns:
{{MATCHER1}} followed by {{MATCHER2}}.

<procedure>(bar MATCHER1 MATCHER2) => MATCHER</procedure>

{{bar}} matches either of two patterns. It is analogous to patterns
separated by {{|}} in traditional regular expressions.

<procedure>(star MATCHER) => MATCHER</procedure>

{{star}} is an implementation of the Kleene closure. It is analogous
to {{*}} in traditional regular expressions.
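
One way to picture {{seq}}, {{bar}} and {{star}} is as operations on lists of streams. The sketch below is a simplified direct-style model, not the egg's continuation-passing implementation. The names {{seq-model}}, {{bar-model}}, {{star-model}} and {{one-char}} are hypothetical, and a stream is assumed to be a two-element list of consumed and remaining characters.

```scheme
;; Hypothetical direct-style models; not the lexgen source.

;; Minimal single-character matcher, used only to have something
;; to combine.
(define (one-char c)
  (lambda (streams)
    (let loop ((ss streams))
      (cond ((null? ss) '())
            ((and (pair? (cadr (car ss)))
                  (char=? c (car (cadr (car ss)))))
             (cons (list (cons c (car (car ss))) (cdr (cadr (car ss))))
                   (loop (cdr ss))))
            (else (loop (cdr ss)))))))

;; Sequence: feed the streams produced by M1 into M2.
(define (seq-model m1 m2)
  (lambda (streams) (m2 (m1 streams))))

;; Alternation: collect the results of both matchers.
(define (bar-model m1 m2)
  (lambda (streams) (append (m1 streams) (m2 streams))))

;; Kleene closure: keep the streams as they are (zero repetitions)
;; and add every further repetition of M. This loops forever if M
;; can succeed without consuming input.
(define (star-model m)
  (lambda (streams)
    (if (null? streams)
        '()
        (append streams ((star-model m) (m streams))))))

(define start (list (list '() (string->list "aab"))))

((seq-model (one-char #\a) (one-char #\a)) start)
;; => (((#\a #\a) (#\b)))

((bar-model (one-char #\a) (one-char #\b)) start)
;; => (((#\a) (#\a #\b)))

((star-model (one-char #\a)) start)
;; => ((() (#\a #\a #\b)) ((#\a) (#\a #\b)) ((#\a #\a) (#\b)))
```

In this model {{star-model}} turns one input stream into three, one for each possible number of repetitions. Consumed characters appear most recent first because each match is prepended.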

==== Convenience procedures

These procedures are built from the previous four and are provided
for convenience.

<procedure>(try PROC) => PROC</procedure>

Converts a binary predicate procedure to a binary procedure that
returns its right argument when the predicate is true, and {{#f}}
otherwise.
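
Read literally, the description of {{try}} corresponds to a one-line wrapper. The following is a sketch of that reading; {{try-model}} is a hypothetical name, not the egg's definition.

```scheme
;; Hypothetical model of try: wrap a binary predicate so that it
;; returns its right argument on success and #f on failure.
(define (try-model pred)
  (lambda (left right)
    (and (pred left right) right)))

((try-model char=?) #\a #\a)   ; => #\a
((try-model char=?) #\a #\b)   ; => #f
((try-model <) 1 2)            ; => 2
```

Returning the right argument matters when the result is handed to a matcher builder such as {{tok}}, which prepends whatever true value it receives to the consumed list.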

<procedure>(char CHAR) => MATCHER</procedure>

Matches a single character.

<procedure>(lst MATCHER-LIST) => MATCHER</procedure>

Constructs a matcher for the sequence of matchers in {{MATCHER-LIST}}.

<procedure>(pos MATCHER) => MATCHER</procedure>

Positive closure. Analogous to {{+}}.

<procedure>(opt MATCHER) => MATCHER</procedure>

Optional pattern. Analogous to {{?}}.

<procedure>(set CHAR-SET) => MATCHER</procedure>

Matches any of a SRFI-14 set of characters.

<procedure>(range CHAR CHAR) => MATCHER</procedure>

Matches a range of characters. Analogous to character class {{[]}}.

<procedure>(lit STRING) => MATCHER</procedure>

Matches the literal string {{STRING}}.
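
To illustrate how convenience procedures can be derived from the four basic ones, here is a direct-style sketch. It is an assumption about how such derivations could look, not the egg's source. All names ending in {{-model}} and the helper {{pass}} are hypothetical, and a stream is assumed to be a two-element list of consumed and remaining characters.

```scheme
;; Hypothetical direct-style models of the four basic procedures.
(define (tok-model token proc)
  (lambda (streams)
    (let loop ((ss streams))
      (if (null? ss)
          '()
          (let* ((consumed  (car (car ss)))
                 (remaining (cadr (car ss)))
                 (v (and (pair? remaining)
                         (proc token (car remaining)))))
            (if v
                (cons (list (cons v consumed) (cdr remaining))
                      (loop (cdr ss)))
                (loop (cdr ss))))))))
(define (seq-model m1 m2)
  (lambda (streams) (m2 (m1 streams))))
(define (bar-model m1 m2)
  (lambda (streams) (append (m1 streams) (m2 streams))))
(define (star-model m)
  (lambda (streams)
    (if (null? streams)
        '()
        (append streams ((star-model m) (m streams))))))

;; A matcher that accepts everything without consuming input.
(define (pass streams) streams)

;; Convenience procedures expressed through the basic models.
(define (char-model c)                 ; single character
  (tok-model c (lambda (t ch) (and (char=? t ch) ch))))
(define (pos-model m)                  ; one or more, like +
  (seq-model m (star-model m)))
(define (opt-model m)                  ; zero or one, like ?
  (bar-model m pass))
(define (lst-model ms)                 ; sequence of a list of matchers
  (if (null? ms)
      pass
      (seq-model (car ms) (lst-model (cdr ms)))))
(define (lit-model str)                ; literal string
  (lst-model (map char-model (string->list str))))

(define start (list (list '() (string->list "aab"))))

((pos-model (char-model #\a)) start)
;; => (((#\a) (#\a #\b)) ((#\a #\a) (#\b)))
((opt-model (char-model #\b)) start)
;; => ((() (#\a #\a #\b)))
((lit-model "aa") start)
;; => (((#\a #\a) (#\b)))
```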


==== Lexer procedures

<procedure>(longest STREAM-LIST) => STREAM</procedure>

Takes the streams produced by applying a pattern to a stream (or
streams) and selects the longest match if one exists. If
{{STREAM-LIST}} is empty, it returns {{#f}}.
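
Selecting the longest match amounts to picking the stream with the most consumed elements. The following sketch shows one way to do that. {{longest-model}} is a hypothetical name, a stream is assumed to be a two-element list of consumed and remaining characters, and ties are resolved in favour of the earliest stream, which is an assumption of this sketch rather than documented behaviour.

```scheme
;; Hypothetical model of longest; not the lexgen source.
(define (longest-model streams)
  (if (null? streams)
      #f
      (let loop ((best (car streams)) (rest (cdr streams)))
        (cond ((null? rest) best)
              ;; Compare by the number of consumed elements.
              ((> (length (car (car rest))) (length (car best)))
               (loop (car rest) (cdr rest)))
              (else (loop best (cdr rest)))))))

(longest-model
 (list (list '() (string->list "aab"))
       (list (list #\a) (string->list "ab"))
       (list (list #\a #\a) (string->list "b"))))
;; => ((#\a #\a) (#\b))

(longest-model '())
;; => #f
```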


<procedure>(lex MATCHER ERROR STRING) => CHAR-LIST</procedure>

{{lex}} takes a pattern and a string, turns the string into a list
containing a single stream, applies the pattern, and returns the
longest match. Argument {{ERROR}} is a single-argument procedure that
is called when the pattern does not match anything.
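
The steps listed for {{lex}} can be sketched end to end as follows. This is a model under stated assumptions, not the egg's code. All {{-model}} names are hypothetical. The sketch assumes that the result is the list of consumed characters in input order and that the error procedure receives the initial stream list; the text above does not specify either detail.

```scheme
;; Hypothetical direct-style models; not the lexgen source.
(define (tok-model token proc)
  (lambda (streams)
    (let loop ((ss streams))
      (if (null? ss)
          '()
          (let* ((consumed  (car (car ss)))
                 (remaining (cadr (car ss)))
                 (v (and (pair? remaining)
                         (proc token (car remaining)))))
            (if v
                (cons (list (cons v consumed) (cdr remaining))
                      (loop (cdr ss)))
                (loop (cdr ss))))))))
(define (seq-model m1 m2)
  (lambda (streams) (m2 (m1 streams))))
(define (star-model m)
  (lambda (streams)
    (if (null? streams)
        '()
        (append streams ((star-model m) (m streams))))))
(define (longest-model streams)
  (if (null? streams)
      #f
      (let loop ((best (car streams)) (rest (cdr streams)))
        (cond ((null? rest) best)
              ((> (length (car (car rest))) (length (car best)))
               (loop (car rest) (cdr rest)))
              (else (loop best (cdr rest)))))))

;; Model of lex: string -> one stream -> matcher -> longest match.
;; Assumption: the consumed characters are returned in input order.
(define (lex-model matcher on-error str)
  (let* ((streams (list (list '() (string->list str))))
         (results (matcher streams)))
    (if (null? results)
        (on-error streams)
        (reverse (car (longest-model results))))))

;; One or more decimal digits; the token is a (LOW . HIGH) pair.
(define digit
  (tok-model (cons #\0 #\9)
             (lambda (r c)
               (and (char<=? (car r) c) (char<=? c (cdr r)) c))))
(define digits (seq-model digit (star-model digit)))

(lex-model digits (lambda (s) '()) "42x")   ; => (#\4 #\2)
(lex-model digits (lambda (s) '()) "x")     ; => ()
```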

=== Examples

  (require-extension lexgen)

  ;; A pattern to match floating point numbers.
  ;; "-"?(([0-9]+(\\.[0-9]+)?)|(\\.[0-9]+))([eE][+-]?[0-9]+)?

  ;; Error procedure, called when the pattern matches nothing.
  (define (err s)
    (print "lexical error on stream: " s)
    (list))

  ;; seq and bar take exactly two matchers each, so longer
  ;; sequences are written as nested calls.
  (define numpat
    (let* ((digit        (range #\0 #\9))
           (digits       (pos digit))
           (fraction     (seq (char #\.) digits))
           (significand  (bar (seq digits (opt fraction)) fraction))
           (exp          (seq (set "eE") (seq (opt (set "+-")) digits)))
           (sign         (opt (char #\-))))
      (seq sign (seq significand (opt exp)))))

  (print (lex numpat err "3.45e-6"))

=== Requires

* [[matchable]]

=== Version History

* 2.2 Bug fix in procedure star
* 2.1 Added procedure lst
* 2.0 Core procedures rewritten in continuation-passing style
* 1.5 Using (require-extension srfi-1)
* 1.4 Ported to Chicken 4
* 1.2 Added procedures try and tok (supersedes pred)
* 1.0 Initial release

=== License

Based on the [[http://www.standarddeviance.com/projects/combinators/combinators.html|SML lexer generator by Thant Tessman]].

  Copyright 2009 Ivan Raikov.
  All rights reserved.

  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions are
  met:

  Redistributions of source code must retain the above copyright
  notice, this list of conditions and the following disclaimer.

  Redistributions in binary form must reproduce the above copyright
  notice, this list of conditions and the following disclaimer in the
  documentation and/or other materials provided with the distribution.

  Neither the name of the author nor the names of its contributors may
  be used to endorse or promote products derived from this software
  without specific prior written permission.

  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
  FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
  COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
  INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
  (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
  SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
  STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
  OF THE POSSIBILITY OF SUCH DAMAGE.