<html>
<head><title>Eggs Unlimited - expat</title>
<style type="text/css">
  <!--
      CODE {
             color: #666666;
           }
      EM {
           font-weight: bold;
           font-style: normal;
         }
      DT.function {
                    background: #f5f5f5;
                    color: black;
                    padding: 0.1em;
                    border: 1px solid #bbbaaf;
                    font-family: monospace;
                  }
      PRE {
        background: #efeee0;
        padding: 0.1em;
        border: 1px solid #bbbaaf;
      }
      TABLE {
        background: #f5f5f5;
        padding: 0.2em;
      }
      TH {
        border-bottom: 1px solid black;
      }
    -->
</style>
</head>
<body>

<center><img src="egg.jpg"></center>
<center><a href="index.html">back</a></center>

<h2>expat</h2>

<h3>Description:</h3>
An interface to James Clark's <a href="http://www.libexpat.org/">Expat</a> XML parser.

<h3>Author:</h3>
<a href="mailto:felix@call-with-current-continuation.org">felix</a>

<h3>Version:</h3>
<ul>
<li>1.3
Removed use of <tt>___callback</tt>
<li>1.2
Works with externalized easyffi extension
<li>1.1
Added support for parsing external entities; optional arguments to <code>expat:parse</code> are now keyword arguments.
<li>1.0
</ul>

<h3>Usage:</h3>

<pre>
(require-extension expat)
</pre>

<h3>Download:</h3>
<a href="expat.egg">expat.egg</a>

<h3>Documentation:</h3>

Expat is a stream-oriented parser. You register callback (or handler) functions with the parser and then start feeding it the document. As the parser recognizes parts of the document, it calls the appropriate handler for that part (if you have registered one). The document is fed to the parser in pieces, so you can start parsing before you have the whole document. This also allows you to parse huge documents that would not fit into memory.

<p>If you want to parse an entire document into memory or if you need more bells and whistles, you should take a look
at Oleg Kiselyov's <a href="ssax.html">SSAX</a> parser.

<dl>
<dt class="function"><em>[procedure]</em> (expat:make-parser #!key (encoding #f) (namespaces #f) (namespace-separator #\:))</dt>
<dd>
<p>Creates a parser object with the specified attributes. <code>encoding</code> should be a string designating the encoding
of the document and should be one of the following:

<pre>
UTF-8
UTF-16
ISO-8859-1
US-ASCII
</pre>

If no encoding or <code>#f</code> is given, then the encoding specified in the document is used.
Note that the strings passed to the handlers are always UTF-8 encoded.

<p>If <code>namespaces</code> is true, then namespace declarations are properly recognized and tags belonging to a namespace will be
prefixed with the namespace string and the character given in <code>namespace-separator</code>.

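<p>A minimal sketch of choosing a different separator, assuming only the keyword arguments listed above. With the default
<code>#\:</code> the separator also occurs inside typical namespace URIs, so a character such as <code>#\|</code> may make the
reported tag easier to split. The namespace URI and the tag name are made up for illustration.

<pre>
(use expat)

;; namespace-aware parser with a custom separator character
(define p (expat:make-parser namespaces: #t namespace-separator: #\|))

;; the tag should arrive as namespace string + separator + local name
(expat:set-start-handler! p (lambda (tag attrs) (print "Start: " tag)))

(expat:parse p "&lt;yo:this xmlns:yo='http://www.yo.com'/&gt;")
(expat:destroy-parser p)
</pre>
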
</dd>

<dt class="function"><em>[procedure]</em> (expat:make-external-entity-parser PARSER CONTEXT #!key (encoding #f))</dt>
<dd>
<p>Creates a parser to recursively process external entities.
</dd>

<dt class="function"><em>[procedure]</em> (expat:destroy-parser PARSER)</dt>
<dd>
<p>Releases the memory resources associated with PARSER.
</dd>

<dt class="function"><em>[procedure]</em> (expat:parse PARSER STRING #!key length (final #t) (external-entities #f))</dt>
<dd>
<p>Parses a piece of an XML document given in STRING. If <code>length</code> is given, it specifies the number of bytes to parse;
it defaults to <code>(string-length STRING)</code>. If <code>final</code> is true, the string is the last piece of the document.
<br>Returns <code>#t</code> on success, or signals an exception of the kind <code>(exn expat)</code>.
<code>external-entities</code> controls whether parsing of external entities is enabled and can be any of the symbols
<code>never</code>, <code>always</code> or <code>unless-standalone</code>. <code>#f</code> and <code>#t</code> are
synonyms for <code>never</code> and <code>always</code>, respectively.
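<p>A minimal sketch of feeding a document in pieces and catching parse errors, assuming the behaviour described above.
<code>parse-chunks</code> is a hypothetical helper, not part of this extension; it passes <code>final: #f</code> for every
piece except the last one. The error clause uses CHICKEN's <code>condition-case</code>.

<pre>
(use expat)

(define p (expat:make-parser))
(expat:set-start-handler! p (lambda (tag attrs) (print "Start: " tag)))

;; Hypothetical helper: CHUNKS must be a non-empty list of strings.
;; Only the last chunk is marked as final.
(define (parse-chunks parser chunks)
  (let loop ([chunks chunks])
    (let ([last? (null? (cdr chunks))])
      (expat:parse parser (car chunks) final: last?)
      (unless last? (loop (cdr chunks))))))

;; The pieces may break in the middle of a tag.
(condition-case
    (parse-chunks p '("&lt;doc&gt;&lt;it" "em/&gt;&lt;/d" "oc&gt;"))
  [(exn expat) (print "XML parse error")])

(expat:destroy-parser p)
</pre>
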
</dd>

<dt class="function"><em>[procedure]</em> (expat:set-start-handler! PARSER PROCEDURE)</dt>
<dd>
<p>Sets the handler to process start (and empty) tags. PROCEDURE will be called with two arguments: the tag (a string)
and a list of pairs, where each pair is of the form <code>(ATTRIBUTENAME . ATTRIBUTEVALUE)</code> (both strings).
</dd>

<dt class="function"><em>[procedure]</em> (expat:set-end-handler! PARSER PROCEDURE)</dt>
<dd>
<p>Sets the handler to process end (and empty) tags. PROCEDURE will be called with one argument: the tag (a string).
</dd>

<dt class="function"><em>[procedure]</em> (expat:set-character-data-handler! PARSER PROCEDURE)</dt>
<dd>
<p>Sets the handler to process text. PROCEDURE will be called with one argument: a string containing a piece
of text. Note that a single block of contiguous text free of markup may still result in a sequence of calls to this handler.
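<p>Because text may arrive in several pieces, a handler that needs the complete text usually has to collect the pieces itself.
A minimal sketch, assuming a flat document without nested elements; the variable <code>pieces</code> is made up for illustration.

<pre>
(use expat)

(define p (expat:make-parser))
(define pieces '())   ; text fragments seen so far, most recent first

;; remember every fragment instead of processing it right away
(expat:set-character-data-handler! p
  (lambda (text) (set! pieces (cons text pieces))))

;; at the end tag, join the fragments in document order and reset
(expat:set-end-handler! p
  (lambda (tag)
    (print tag ": " (apply string-append (reverse pieces)))
    (set! pieces '())))

(expat:parse p "&lt;a&gt;one &amp;amp; two&lt;/a&gt;")
(expat:destroy-parser p)
</pre>
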
</dd>

<dt class="function"><em>[procedure]</em> (expat:set-processing-instruction-handler! PARSER PROCEDURE)</dt>
<dd>
<p>Sets the handler for processing instructions. PROCEDURE will be called with two arguments: target and
data (both strings). The target is the first word in the processing instruction.
The data is the rest of the characters in it after skipping all whitespace after the initial word.
</dd>

<dt class="function"><em>[procedure]</em> (expat:set-comment-handler! PARSER PROCEDURE)</dt>
<dd>
<p>Sets the handler to process comments. PROCEDURE will be called with all the text inside the comment delimiters.
</dd>

<dt class="function"><em>[procedure]</em> (expat:set-external-entity-ref-handler! PARSER PROCEDURE)</dt>
<dd>
<p>Sets the handler for references to external entities. PROCEDURE will be called with four arguments: context, URI base,
system ID and public ID (all strings). To parse the external entity, create a parser with
<code>expat:make-external-entity-parser</code>.
</dd>

</dl>

<h3>Example:</h3>

A silly example:

<pre>
(use expat)

(define text #&lt;&lt;EOF
&lt;?xml version='1.0'?&gt;
&lt;!-- a comment --&gt;
&lt;?pi1 yepyepyep?&gt;
&lt;yo:this yo='abc' xmlns:yo="http://www.yo.com"&gt;
&amp;gt;&amp;lt;&amp;#x100;
 &lt;yo:test&gt;yes, no, &amp;#33;&lt;is/&gt;&lt;a/&gt;
 &lt;/yo:test&gt;some more text
&lt;/yo:this&gt;
EOF
)

(define p (expat:make-parser namespaces: #t))
(expat:set-start-handler! p (lambda (tag attrs) (print "Start: " tag " - " attrs)))
(expat:set-end-handler! p (lambda (tag) (print "End: " tag)))
(expat:set-character-data-handler! p (lambda (text) (pp (string->list text))))
(expat:set-processing-instruction-handler! p (lambda (target text) (print "PI: " target " - " text)))
(expat:set-comment-handler! p (lambda (text) (print "Comment: " text)))
(expat:parse p text)
(expat:destroy-parser p)
</pre>

This will output:

<pre>
Comment:  a comment
PI: pi1 - yepyepyep
Start: http://www.yo.com:this - ((yo . abc))
(#\newline)
(#\&gt;)
(#\&lt;)
(#\Ä #\)
(#\newline)
(#\space)
Start: http://www.yo.com:test - ()
(#\y #\e #\s #\, #\space #\n #\o #\, #\space)
(#\!)
Start: is - ()
End: is
Start: a - ()
End: a
(#\newline)
(#\space)
End: http://www.yo.com:test
(#\s #\o #\m #\e #\space #\m #\o #\r #\e #\space #\t #\e #\x #\t)
(#\newline)
End: http://www.yo.com:this
</pre>

<p>Another example that uses DTDs:

<p>Say we have a file <code>foo.xml</code>:

<pre>
&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE foo SYSTEM "foo.dtd"&gt;
&lt;foo&gt;
  &amp;abcdef;
&lt;/foo&gt;
</pre>

and another one called <code>foo.dtd</code>:

<pre>
&lt;!ENTITY abcdef "this is a test"&gt;
</pre>

<p>The following code parses <code>foo.xml</code> and processes the DTD in the external entity handler:

<pre>
(use utils expat)

(define p (expat:make-parser))
(expat:set-start-handler! p (lambda (tag attrs) (print "Start: " tag " - " attrs)))
(expat:set-end-handler! p (lambda (tag) (print "End: " tag)))
(expat:set-character-data-handler! p (lambda (text) (pp (string->list text))))

(expat:set-external-entity-ref-handler!
 p
 (lambda (context base sys pub)
   (print "external: " sys)
   ;; parse the DTD with a sub-parser created from p and the given context
   (let* ([p2 (expat:make-external-entity-parser p context)]
          [s (expat:parse p2 (read-all "foo.dtd"))] )
     (expat:destroy-parser p2)
     s) ) )

;; external-entities: #t is a synonym for 'always
(expat:parse p (read-all "foo.xml") external-entities: #t)
(expat:destroy-parser p)
</pre>

<h3>License:</h3>
<pre>
Copyright (c) 2005, Felix L. Winkelmann
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following
conditions are met:

  Redistributions of source code must retain the above copyright notice, this list of conditions and the following
    disclaimer.
  Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following
    disclaimer in the documentation and/or other materials provided with the distribution.
  Neither the name of the author nor the names of its contributors may be used to endorse or promote
    products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
</pre>


<hr><a href="index.html">back</a>

</body>
</html>