[[tags: egg]]

== expat

[[toc:]]

=== Description

An interface to James Clark's [[http://www.libexpat.org|Expat]] XML parser.

=== Author

[[/users/felix winkelmann|felix winkelmann]]; ported to Chicken 4 by [[/users/ecloud|Shawn Rutledge]]

=== Requirements

* [[silex]]
* [[easyffi]]

=== Documentation

Expat is a stream-oriented parser. You register callback (or handler)
functions with the parser and then start feeding it the document. As
the parser recognizes parts of the document, it calls the
appropriate handler for that part (if you have registered one). The
document is fed to the parser in pieces, so you can start parsing
before you have the whole document. This also allows you to parse
huge documents that would not fit into memory.

If you want to parse an entire document into memory or if you need
more bells and whistles, you should take a look at Oleg Kiselyov's
[[ssax|SSAX]] parser.

==== expat:make-parser

<procedure>(expat:make-parser #!key (encoding #f) (namespaces #f) (namespace-separator #\:))</procedure>

Creates a parser object with the specified attributes. {{encoding}}
should be a string designating the encoding of the document and should
be one of the following:

* UTF-8
* UTF-16
* ISO-8859-1
* US-ASCII

If no encoding or {{#f}} is given, then the encoding specified in the
document is used. Note that the strings passed to the handlers are
always UTF-8 encoded.

If {{namespaces}} is true, then namespace declarations are properly
recognized and tags belonging to a namespace will be prefixed with the
namespace string and the character given in {{namespace-separator}}.
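
As a minimal sketch of the keyword arguments (the namespace URI and the choice of {{#\|}} as separator are made up for illustration), a namespace-aware parser could be set up as follows. With namespaces enabled, the tag passed to the start handler should be the namespace URI, the separator and the local name joined together.

<enscript highlight="scheme">
(use expat)

;; Tags seen by the start handler, newest first.
(define seen '())

;; Namespace-aware parser; "|" is an arbitrary separator chosen for this sketch.
(define p (expat:make-parser encoding: "UTF-8" namespaces: #t namespace-separator: #\|))
(expat:set-start-handler! p (lambda (tag attrs) (set! seen (cons tag seen))))

;; The URI below is a placeholder, not a real namespace.
(expat:parse p "<a:doc xmlns:a='http://example.org/ns'/>")
(expat:destroy-parser p)

(print seen)
</enscript>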

==== expat:make-external-entity-parser

<procedure>(expat:make-external-entity-parser PARSER CONTEXT #!key (encoding #f))</procedure>

Creates a parser to recursively process external entities. {{CONTEXT}}
is the context string passed to the handler registered with
{{expat:set-external-entity-ref-handler!}}.

==== expat:destroy-parser

<procedure>(expat:destroy-parser PARSER)</procedure>

Releases the memory resources associated with {{PARSER}}.

==== expat:parse

<procedure>(expat:parse PARSER STRING #!key length (final #t) (external-entities #f))</procedure>

Parses a piece of an XML document given in {{STRING}}. If {{length}} is
given, it specifies the number of bytes to parse; it defaults to
{{(string-length STRING)}}. If {{final}} is true, then the string is
the last piece of the document.

Returns {{#t}} on success, or signals an exception of the kind
{{(exn expat)}}. {{external-entities}} controls whether parsing of
external entities is enabled and can be any of the symbols {{never}},
{{always}} or {{unless-standalone}}. {{#f}} and {{#t}} are synonyms
for {{never}} and {{always}}.
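
Feeding a document in pieces might look like the following sketch. The list of chunks is hypothetical and stands in for data read from a port or socket; {{final}} is only true for the last piece. The {{message}} property looked up in the error clause is an assumption about the condition object, so a default value is supplied.

<enscript highlight="scheme">
(use expat)

(define p (expat:make-parser))
(define events '())   ; recorded handler calls, newest first

(expat:set-start-handler! p
  (lambda (tag attrs) (set! events (cons (list 'start tag) events))))
(expat:set-end-handler! p
  (lambda (tag) (set! events (cons (list 'end tag) events))))

;; Hypothetical chunks; note that tags are split across chunk boundaries.
(define chunks '("<doc><it" "em/></d" "oc>"))

(condition-case
  (let loop ((cs chunks))
    (unless (null? cs)
      ;; final: is #t only for the last chunk
      (expat:parse p (car cs) final: (null? (cdr cs)))
      (loop (cdr cs))))
  (e (exn expat)
     ;; assumption: the condition may carry a standard 'message property
     (print "parse error: "
            ((condition-property-accessor 'exn 'message "unknown") e))))

(expat:destroy-parser p)
(print (reverse events))
</enscript>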

==== expat:set-start-handler!

<procedure>(expat:set-start-handler! PARSER PROCEDURE)</procedure>

Sets the handler to process start (and empty) tags. {{PROCEDURE}} will
be called with two arguments: the tag (a string) and a list of pairs,
where each pair is of the form {{(ATTRIBUTENAME . ATTRIBUTEVALUE)}}
(both strings).
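
Because the attributes arrive as an association list with string keys, a single attribute can be looked up with {{assoc}}. The sketch below collects the {{href}} attribute of every tag that has one; the document and the attribute name are made up for illustration.

<enscript highlight="scheme">
(use expat)

(define links '())   ; collected href values, newest first

(define p (expat:make-parser))
(expat:set-start-handler! p
  (lambda (tag attrs)
    ;; attrs is an alist with string keys, so assoc (which uses equal?) works
    (let ((href (assoc "href" attrs)))
      (when href (set! links (cons (cdr href) links))))))

(expat:parse p "<p><a href='http://example.org/'>one</a><a name='x'>two</a></p>")
(expat:destroy-parser p)

(print links)
</enscript>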

==== expat:set-end-handler!

<procedure>(expat:set-end-handler! PARSER PROCEDURE)</procedure>

Sets the handler to process end (and empty) tags. {{PROCEDURE}} will
be called with one argument: the tag (a string).

==== expat:set-character-data-handler!

<procedure>(expat:set-character-data-handler! PARSER PROCEDURE)</procedure>

Sets the handler to process text. {{PROCEDURE}} will be called with one
argument: a string containing a piece of text. Note that a single
block of contiguous text free of markup may still result in a sequence
of calls to this handler.
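
Since one run of text may be delivered in several calls, code that needs the complete text of an element has to collect the pieces itself. One possible approach, sketched here with a made-up document, is to accumulate the fragments in a list and join them in the end handler. This simple version only suits elements without child elements.

<enscript highlight="scheme">
(use expat)

(define pieces '())   ; fragments of the current text run, newest first
(define titles '())   ; completed element texts, newest first

(define p (expat:make-parser))
(expat:set-start-handler! p (lambda (tag attrs) (set! pieces '())))
(expat:set-character-data-handler! p
  (lambda (text) (set! pieces (cons text pieces))))
(expat:set-end-handler! p
  (lambda (tag)
    ;; join the fragments in document order
    (set! titles (cons (apply string-append (reverse pieces)) titles))
    (set! pieces '())))

(expat:parse p "<title>Tom &amp; Jerry</title>")
(expat:destroy-parser p)

(print titles)
</enscript>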

==== expat:set-processing-instruction-handler!

<procedure>(expat:set-processing-instruction-handler! PARSER PROCEDURE)</procedure>

Sets the handler for processing instructions. {{PROCEDURE}} will be
called with two arguments: target and data (both strings). The target
is the first word in the processing instruction. The data is the rest
of the characters in it after skipping all whitespace after the
initial word.

==== expat:set-comment-handler!

<procedure>(expat:set-comment-handler! PARSER PROCEDURE)</procedure>

Sets the handler to process comments. {{PROCEDURE}} will be called with
all the text inside the comment delimiters.

==== expat:set-external-entity-ref-handler!

<procedure>(expat:set-external-entity-ref-handler! PARSER PROCEDURE)</procedure>

Sets the handler for references to external entities. {{PROCEDURE}}
will be called with four arguments: context, URI base, system ID and
public ID (all strings). To parse the external entity, create a parser
with {{expat:make-external-entity-parser}}.

=== Examples

A silly example:

<enscript highlight="scheme">
(use expat)

(define text #<<EOF
<?xml version='1.0'?>
<!-- a comment -->
<?pi1 yepyepyep?>
<yo:this yo='abc' xmlns:yo="http://www.yo.com">
&gt;&lt;&#x100;
<yo:test>yes, no, &#33;<is/><a/>
</yo:test>some more text
</yo:this>
EOF
)

(define p (expat:make-parser namespaces: #t))
(expat:set-start-handler! p (lambda (tag attrs) (print "Start: " tag " - " attrs)))
(expat:set-end-handler! p (lambda (tag) (print "End: " tag)))
(expat:set-character-data-handler! p (lambda (text) (pp (string->list text))))
(expat:set-processing-instruction-handler! p (lambda (target text) (print "PI: " target " - " text)))
(expat:set-comment-handler! p (lambda (text) (print "Comment: " text)))
(expat:parse p text)
(expat:destroy-parser p)
</enscript>

This will output:

  Comment:  a comment
  PI: pi1 - yepyepyep
  Start: http://www.yo.com:this - ((yo . abc))
  (#\newline)
  (#\>)
  (#\<)
  (#\Ä #\)
  (#\newline)
  (#\space)
  Start: http://www.yo.com:test - ()
  (#\y #\e #\s #\, #\space #\n #\o #\, #\space)
  (#\!)
  Start: is - ()
  End: is
  Start: a - ()
  End: a
  (#\newline)
  (#\space)
  End: http://www.yo.com:test
  (#\s #\o #\m #\e #\space #\m #\o #\r #\e #\space #\t #\e #\x #\t)
  (#\newline)
  End: http://www.yo.com:this

Another example, this one using a DTD. Say we have a file {{foo.xml}}:

  <?xml version="1.0"?>
  <!DOCTYPE foo SYSTEM "foo.dtd">
  <foo>
  &abcdef;
  </foo>

and another one called {{foo.dtd}}:

  <!ENTITY abcdef "this is a test">

<enscript highlight="scheme">
(use utils expat)

(define p (expat:make-parser))
(expat:set-start-handler! p (lambda (tag attrs) (print "Start: " tag " - " attrs)))
(expat:set-end-handler! p (lambda (tag) (print "End: " tag)))
(expat:set-character-data-handler! p (lambda (text) (pp (string->list text))))

(expat:set-external-entity-ref-handler!
 p
 (lambda (context base sys pub)
   (print "external: " sys)
   (let* ([p2 (expat:make-external-entity-parser p context)]
          [s (expat:parse p2 (read-all "foo.dtd"))] )
     (expat:destroy-parser p2)
     s) ) )

(expat:parse p (read-all "foo.xml") external-entities: #t)
(expat:destroy-parser p)
</enscript>

=== Changelog

* 1.4 Ported to Chicken 4
* 1.3 Removed use of {{___callback}}
* 1.2 Works with externalized easyffi extension
* 1.1 Added support for parsing external entities; optional arguments to {{expat:parse}} are now keyword arguments.
* 1.0 Initial release

=== License

  Copyright (c) 2005, Felix L. Winkelmann
  All rights reserved.
 
  Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following
  conditions are met:
 
    Redistributions of source code must retain the above copyright notice, this list of conditions and the following
      disclaimer.
    Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following
      disclaimer in the documentation and/or other materials provided with the distribution.
    Neither the name of the author nor the names of its contributors may be used to endorse or promote
      products derived from this software without specific prior written permission.
 
  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS
  OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
  AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR
  CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
  SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
  OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
  POSSIBILITY OF SUCH DAMAGE.