source: project/wiki/eggref/4/comparse @ 30703

Last change on this file since 30703 was 30703, checked in by Moritz Heidkamp, 7 years ago

comparse: Remove mentions of lazy-seq from introduction, use parser-input type instead

File size: 10.4 KB
Line 
1[[tags: egg]]
2
3== comparse
4
5Comparse is a library of
6[[http://en.wikipedia.org/wiki/Parser_combinator|parser combinators]]
7similar to [[http://www.haskell.org/haskellwiki/Parsec|Haskell's
8Parsec]]. Its implementation is based on
9[[http://common-lisp.net/~dcrampsie/smug.html|Drew Crampsie's smug for
10Common Lisp]] but slightly deviates in naming and provides a number of
11additional convenience combinators.
12
13
14[[toc:]]
15
16=== Caveat
17
18As of now the API documentation is still incomplete, notably missing
19the main entry point procedure {{parse}}. Also, the
20[[#examples|Examples section]] is yet to be written.
21
22
23=== Introduction
24
25Comparse provides a library of ''parsers'' which can be applied to
26''inputs'' in order to produce ''parse results''. It also provides
27''parser combinators'' to create new parsers. In Comparse inputs are
28represented by a {{parser-input}} type. It can hold a sequence of any
29kinds of ''items'' (often characters). For example:
30
31<enscript language="scheme">
32(use comparse)
33
34(define some-input
35  (->parser-input "hello world"))
36</enscript>
37
38A ''parser'' then is a function which takes an ''input'' as its
39argument and, in case the parse succeeds, returns a ''parse result''
40(which can be a value of any kind) and the ''remainder'' of the input
41(which is a {{parser-input}} again). In case the parse fails it
42returns {{#f}}.
43
44A very basic parser provided by Comparse is the {{item}} parser which
45consumes one item from the given input and returns it as its parse
46result. If the input is empty, {{item}} fails. We can apply it to the
47input we defined above using the {{parse}} procedure
48
49<enscript language="scheme">
50(parse item some-input)
51</enscript>
52
53Which gives us these results:
54
55<enscript language="scheme">
56#\h
57#<parser-input #\e #\l #\l #\o #\space #\w #\o #\r #\l #\d ...>
58</enscript>
59
60There is another basic parser called {{fail}} which always fails, no
61matter what's passed to it. These primitive parsers are not very
62useful on their own, of course. In order to do more interesting things
63Comparse provides a selection of ''parser combinators''. Just like
64regular function combinators (such as {{compose}} or {{complement}}),
65parser combinators are functions which take one or more parsers as
66their arguments and combine them into a new parser. One of the basic
67parser combinators in Comparse is {{bind}}. It takes a parser and a
68procedure as its arguments and returns a parser which first applies
69the passed parser and, if it succeeds, calls the procedure with its
70parse result. The procedure then must return another parser which will
71be applied to the remainder of the parser that was passed to
72{{bind}}. Okay, that sounds pretty dazzling, so an example is in
73order:
74
75<enscript language="scheme">
76(define double-item
77  (bind item (lambda (x)
78               (lambda (remainder)
79                 (cons (list x x) remainder)))))
80
81(parse double-item some-input)
82</enscript>
83
84Which gives us:
85
86<enscript language="scheme">
87(#\h #\h)
88#<parser-input #\e #\l #\l #\o #\space #\w #\o #\r #\l #\d ...>
89</enscript>
90
91So the procedure we pass to {{bind}} returns a parser which constructs
92a two-item list from the parse result and returns it with the
93untouched remainder. As returning a result from a parser without
94touching the remainder is needed quite frequently, Comparse provides
95the {{result}} parser combinator which does just that. Using it we can
96write {{double-item}} like this:
97
98
99<enscript language="scheme">
100(define double-item
101  (bind item (lambda (x)
102               (result (list x x)))))
103</enscript>
104
105Comparse provides many convenient and more high-level parser
106combinators than the ones presented in this introdution. See the
107[[#examples|Examples section]] on how to use them to create parsers
108for practical tasks. Also note that you would usually not manually
109convert raw inputs (like strings or input ports) into a
110{{parser-input}} object like we did above. Instead you can pass the
111raw input object directly to the {{parse}} procedure which will
112automatically turn it into a {{parser-input}} for you.
113
114
115=== A note on variadic combinators
116
117Some parser combinators accept one or more parsers as their
118arguments. As a convenience when just one argument is passed and it is
119a list instead of a parser, that list is treated as the argument
120list. In other words those combinators implicitly {{apply}} themselves
121to the given (single) list argument.
122
123
124=== API
125
126<procedure>(result value)</procedure>
127
128Returns a parser which always succeeds, leaves the input untouched and
129returns {{value}}.
130
131
132<procedure>(fail input)</procedure>
133
134A parser which always fails.
135
136
137<procedure>(item input)</procedure>
138
139A parser which consumes and returns the first item of {{input}},
140i.e. it returns that item and the tail of {{input}}. Fails if
141{{input}} is empty.
142
143
144<procedure>(bind parser proc)</procedure>
145
146Returns a parser which applies {{parser}} and calls {{proc}} with its
147parse result if it succeeds. {{proc}} must return a parser which will
148be applied to the remainder returned by {{parser}}.
149
150
151<procedure>(satisfies condition . args)</procedure>
152
153Returns a parser which consumes the next {{item}} of its input and
154applies {{condition}} to it like this: {{(apply condition item
155args)}}. If that call returns {{#f}}, the parser fails. Otherwise it
156succeeds and returns the item as its result.
157
158
159<procedure>(is x)</procedure>
160
161Returns a parser which succeeds when the next {{item}} of its input is
162{{eq?}} to {{x}}.
163
164
165<procedure>(in collection . items)</procedure>
166
167Returns a parser which succeeds when the next item of the given input
168is contained in {{collection}}. Collection can be a
169[[http://srfi.schemers.org/srfi-14/srfi-14.html|SRFI 14 char-set]] or
170a list which is checked for membership with {{memq}}. Given more than
171one argument, membership of the next item is checked for with {{memq}}
172in {{(cons collection items)}}.
173
174
175<procedure>(sequence parser . parsers)</procedure>
176
177Returns a parser which applies the given parsers from left to right
178and returns the list of their results when all succeed. See also the
179[[#a-note-on-variadic-combinators|note on variadic combinators]].
180
181
182<procedure>(char-seq str)</procedure>
183
184Returns a parser which consumes the chars of the given string {{str}}
185in the order contained therein, i.e. it parses a literal string. On
186success the parser returns {{str}}.
187
188
189<procedure>(any-of parser . parsers)</procedure>
190
191Returns a parser which applies the given parsers from left to right
192and returns the result of the first one that succeeds. See also the
193[[#a-note-on-variadic-combinators|note on variadic combinators]].
194
195
196<procedure>(all-of parser . parsers)</procedure>
197
198Returns a parser which applies the given parsers from left to right
199and returns the result of the last one if all succeed. See also the
200[[#a-note-on-variadic-combinators|note on variadic combinators]].
201
202
203<procedure>(none-of parser . parsers)</procedure>
204
205Returns a parser which applies the given parsers from left to right
206and returns {{#t}} as its result if none of them succeed while it
207doesn't consume any items from the input. See also the
208[[#a-note-on-variadic-combinators|note on variadic combinators]].
209
210
211<procedure>(none-of* parser [parsers ...] but)</procedure>
212
213Returns a parser which applies the given parsers from left to
214right. If none of them succeed, it applies the parser {{but}} and
215returns its result.
216
217
218<procedure>(preceded-by parser . parsers)</procedure>
219
220Returns a parser which applies the given parsers from left to right to
221the input or the remainder of the preceding parser, respectively, and
222returns the result of the last one. See also the
223[[#a-note-on-variadic-combinators|note on variadic combinators]].
224
225
226<procedure>(followed-by parser following-parser . following-parsers)</procedure>
227
228Returns a parser which applies {{parser}} if the given
229{{following-parser[s]}} succeed on its remainder from left to
230right. See also the [[#a-note-on-variadic-combinators|note on variadic
231combinators]] which applies to the {{following-parser[s]}} argument(s)
232in this case.
233
234
235<procedure>(enclosed-by open content close)</procedure>
236
237Returns a parser which applies the given parsers {{open}},
238{{content}}, and {{close}} in that order to the input or the remainder
239of the preceding parser, respectively, and returns the result of
240{{content}} with the remainder of {{close}}.
241
242
243=== Examples
244
245Yet to be written.
246
247
248=== About this egg
249
250==== Source
251
252The [[https://bitbucket.org/DerGuteMoritz/comparse|source code is
253hosted at Bitbucket]]. Feel free to fork it and send pull requests
254there.
255
256==== Author
257
258[[/users/moritz-heidkamp|Moritz Heidkamp]]
259
260==== Version history
261
262; 0.1.0 : Add {{parser-input}} wrapper type so as not to leak {{lazy-seq}} implementation detail. Add {{end-of-input}} parser. Improve performance of {{as-string}} and {{repeated}} combinators.
263; 0.0.5 : Turn {{latch}} into compile-time dependency
264; 0.0.4 : Fix potential apply limit exceedance in {{as-string}}. Extend {{repeated}} to parse exactly {{n}} repetitions.
265; 0.0.3 : Correct dependencies
266; 0.0.2 : Add memoization support
267; 0.0.1 : Initial release
268
269==== License
270
271  Copyright (c) 2013, Moritz Heidkamp
272  All rights reserved.
273 
274  Redistribution and use in source and binary forms, with or without
275  modification, are permitted provided that the following conditions are
276  met:
277 
278  Redistributions of source code must retain the above copyright
279  notice, this list of conditions and the following disclaimer.
280 
281  Redistributions in binary form must reproduce the above copyright
282  notice, this list of conditions and the following disclaimer in the
283  documentation and/or other materials provided with the distribution.
284 
285  Neither the name of the author nor the names of its contributors may
286  be used to endorse or promote products derived from this software
287  without specific prior written permission.
288 
289  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
290  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
291  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
292  FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
293  COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
294  INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
295  (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
296  SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
297  HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
298  STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
299  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
300  OF THE POSSIBILITY OF SUCH DAMAGE.
301
Note: See TracBrowser for help on using the repository browser.