Last change on this file since 12880 was 12880, checked in by sjamaan, 11 years ago

Temporary hack to fix example sections in wiki eggdoc pages

[[tags:eggs]]

This is version 1.2 of the '''sxml-transforms''' extension library for Chicken Scheme.

[[toc:]]

== Description

The [[http://cvs.sourceforge.net/viewcvs.py/ssax/SSAX|SXML transformations]] (to XML, SXML, and HTML) from the [[http://ssax.sf.net|SSAX project]].

== Documentation


This egg provides the SXML transforms available in the SSAX/SXML Sourceforge project.  It incorporates one main module, and an auxiliary one:

=== sxml-transforms

==== SRV:send-reply

<procedure>(SRV:send-reply . fragments)</procedure>

Output the FRAGMENTS to the current output port.

The fragments are a list of strings, characters, numbers, thunks, {{#f}},
{{#t}}, and other (nested) fragments.  The function traverses the tree
depth-first, writes out strings, characters and numbers, executes thunks, and
ignores {{#f}} and {{'()}}.  The function returns {{#t}} if anything was written
at all; otherwise the result is {{#f}}.  If {{#t}} occurs among the fragments,
it is not written out but causes the result of {{SRV:send-reply}} to be {{#t}}.

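As a minimal sketch of the behaviour described above, the call below writes a nested list of fragments. It assumes the egg is installed and uses CHICKEN 5 import syntax; on CHICKEN 4, {{(use sxml-transforms)}} would take the place of the import. The name {{wrote?}} is only illustrative.

```scheme
;; Assumption: CHICKEN 5 import syntax; on CHICKEN 4 use (use sxml-transforms).
(import sxml-transforms)

;; Nested fragments are flattened depth-first; #f and '() are skipped,
;; and the thunk is called at the point where it occurs.
(define wrote?
  (SRV:send-reply
    "<p>"
    (list "count: " 42 #f '())
    (lambda () (display "!"))
    "</p>"
    #\newline))
;; prints: <p>count: 42!</p>

;; wrote? is #t here because at least one fragment was written.
```

Because thunks run in place, they are a convenient way to defer expensive output until the surrounding fragments have already been written.
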
==== pre-post-order

<procedure>(pre-post-order tree bindings)</procedure>

Traversal of an SXML tree or a grove: a <Node> or a <Nodelist>.

A <Node> and a <Nodelist> are mutually-recursive datatypes that
underlie the SXML tree:
     <Node> ::= (name . <Nodelist>) | "text string"
An (ordered) set of nodes is just a list of the constituent nodes:
     <Nodelist> ::= (<Node> ...)
Nodelists, and Nodes other than text strings, are both lists. A
<Nodelist>, however, is either an empty list or a list whose head is
not a symbol (an atom in general). A symbol at the head of a node is
either an XML name (in which case it's a tag of an XML element), or
an administrative name such as '@'.
See SXPath.scm and SSAX.scm for more information on SXML.

Pre-Post-order traversal of a tree and creation of a new tree:
    pre-post-order:: <tree> x <bindings> -> <new-tree>
where
    <bindings> ::= (<binding> ...)
    <binding> ::= (<trigger-symbol> *preorder* . <handler>) |
                  (<trigger-symbol> *macro* . <handler>) |
                  (<trigger-symbol> <new-bindings> . <handler>) |
                  (<trigger-symbol> . <handler>)
    <trigger-symbol> ::= XMLname | *text* | *default*
    <handler> :: <trigger-symbol> x [<tree>] -> <new-tree>

The pre-post-order function visits the nodes and nodelists
pre-post-order (depth-first).  For each <Node> of the form (name
<Node> ...) it looks up an association with the given 'name' among
its <bindings>. If that lookup fails, pre-post-order tries to locate a
*default* binding. It's an error if the latter attempt fails as
well.  Having found a binding, the pre-post-order function first
checks to see if the binding is of the form
  (<trigger-symbol> *preorder* . <handler>)
If it is, the handler is 'applied' to the current node. Otherwise,
the pre-post-order function first calls itself recursively for each
child of the current node, with <new-bindings> (if any) prepended to the
<bindings> in effect. The result of these calls is passed to the
<handler> (along with the head of the current <Node>). To be more
precise, the handler is _applied_ to the head of the current node
and its processed children. The result of the handler, which should
also be a <tree>, replaces the current <Node>. If the current <Node>
is a text string or other atom, a special binding with a symbol
*text* is looked up.

A binding can also be of the form
     (<trigger-symbol> *macro* . <handler>)
This is equivalent to *preorder* described above. However, the result
is then re-processed with the current stylesheet.

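The binding forms above can be illustrated with a small stylesheet. This is a sketch under assumptions: the element names {{em}}, {{raw}} and {{shout}} and the name {{demo-bindings}} are made up for the example, and CHICKEN 5 import syntax is used (on CHICKEN 4, {{(use sxml-transforms)}}).

```scheme
;; Assumption: CHICKEN 5 import syntax; on CHICKEN 4 use (use sxml-transforms).
(import sxml-transforms)

;; Hypothetical stylesheet exercising each kind of binding.
(define demo-bindings
  `(;; Plain binding: children are processed first, then the handler runs.
    (em . ,(lambda (tag . kids) (cons 'i kids)))
    ;; *preorder*: the handler gets the node as-is; children are not visited.
    (raw *preorder* . ,(lambda (tag . kids) (cons 'pre kids)))
    ;; *macro*: like *preorder*, but the result is traversed again.
    (shout *macro* . ,(lambda (tag . kids) `(strong (em ,@kids))))
    ;; Text nodes and other atoms.
    (*text* . ,(lambda (trigger x)
                 (if (string? x) (string-append "[" x "]") x)))
    ;; Anything else is copied with its processed children.
    (*default* . ,(lambda (tag . kids) (cons tag kids)))))

(pre-post-order '(p "a" (em "b") (raw "c" (em "d"))) demo-bindings)
;; => (p "[a]" (i "[b]") (pre "c" (em "d")))

(pre-post-order '(shout "x") demo-bindings)
;; => (strong (i "[x]"))
```

Note how the {{em}} nested inside {{raw}} is left alone, while the {{em}} produced by the {{shout}} macro is picked up on the second pass.
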
==== post-order

<procedure>(post-order tree bindings)</procedure>

Deprecated. This was a version of pre-post-order that did not accept
{{*macro*}} or {{*preorder*}} directives.

==== foldts

<procedure>(foldts fdown fup fhere seed tree)</procedure>

Tree fold operator.

    tree = atom | (node-name tree ...)

    foldts fdown fup fhere seed (Leaf str) = fhere seed str
    foldts fdown fup fhere seed (Nd kids) =
          fup seed $ foldl (foldts fdown fup fhere) (fdown seed) kids

    procedure fhere: seed -> atom -> seed
    procedure fdown: seed -> node -> seed
    procedure fup: parent-seed -> last-kid-seed -> node -> seed

foldts returns the final seed.

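To make the roles of the three procedures concrete, here is a small sketch that collects the text leaves of a tree in document order. The helper name {{collect-text}} is hypothetical, and CHICKEN 5 import syntax is assumed (on CHICKEN 4, {{(use sxml-transforms)}}).

```scheme
;; Assumption: CHICKEN 5 import syntax; on CHICKEN 4 use (use sxml-transforms).
(import sxml-transforms)

;; Hypothetical helper: gather all string leaves, left to right.
(define (collect-text tree)
  (reverse
    (foldts
      (lambda (seed node) seed)                      ; fdown: hand the seed to the kids
      (lambda (parent-seed kid-seed node) kid-seed)  ; fup: keep what the kids collected
      (lambda (seed atom)                            ; fhere: record string leaves
        (if (string? atom) (cons atom seed) seed))
      '()
      tree)))

(collect-text '(p "Hello, " (em "SXML") "!"))
;; => ("Hello, " "SXML" "!")
```

The seed is threaded through the children from left to right, so accumulating with {{cons}} and reversing once at the end preserves document order.
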
==== replace-range

<procedure>(replace-range beg-pred end-pred forest)</procedure>

Traverse a forest depth-first and cut/replace ranges of nodes.

The nodes that define a range don't have to have the same immediate
parent, don't have to be on the same level, and the end node of a
range doesn't even have to exist. The replace-range procedure removes
nodes from the beginning node of the range up to (but not including)
the end node of the range.  In addition, the beginning node of the
range can be replaced by a node or a list of nodes. The range of
nodes is cut while depth-first traversing the forest. If all
branches of a node are cut, the node is cut as well.  The procedure
can cut several non-overlapping ranges from a forest.

    replace-range:: BEG-PRED x END-PRED x FOREST -> FOREST
where
    type FOREST = (NODE ...)
    type NODE = Atom | (Name . FOREST) | FOREST

The range of nodes is specified by two predicates, beg-pred and end-pred.
    beg-pred:: NODE -> #f | FOREST
    end-pred:: NODE -> #f | FOREST
The beg-pred predicate decides on the beginning of the range. The node
for which the predicate yields non-#f marks the beginning of the range.
The non-#f value of the predicate replaces the node. The value can be a
list of nodes. The replace-range procedure then traverses the tree and skips
all the nodes, until the end-pred yields non-#f. The value of the end-pred
replaces the end-range node. The new end node and its siblings will be
re-scanned.
The predicates are evaluated pre-order. We do not descend into a node that
is marked as the beginning of the range.

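A minimal sketch of cutting a range, based on the description above. The marker elements {{(cut)}} and {{(end-cut)}} and the helpers {{marker?}} and {{trim}} are hypothetical; CHICKEN 5 import syntax is assumed (on CHICKEN 4, {{(use sxml-transforms)}}).

```scheme
;; Assumption: CHICKEN 5 import syntax; on CHICKEN 4 use (use sxml-transforms).
(import sxml-transforms)

;; Hypothetical markers: (cut) opens a range and (end-cut) closes it.
(define (marker? node name)
  (and (pair? node) (eq? (car node) name)))

(define (trim forest)
  (replace-range
    ;; beg-pred: replace the start marker with an ellipsis string.
    (lambda (node) (and (marker? node 'cut) '("...")))
    ;; end-pred: replace the end marker with nothing ('() is non-#f).
    (lambda (node) (and (marker? node 'end-cut) '()))
    forest))

(trim '("a" (cut) "b" (i "c") (end-cut) "d"))
;; => ("a" "..." "d")
```

Both predicates must accept atoms as well as lists, which is why {{marker?}} checks {{pair?}} before looking at the head of the node.
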
==== SXML->HTML

<procedure>(SXML->HTML tree)</procedure>

This procedure is the most generic transformation of SXML
into the corresponding HTML document. The SXML tree is traversed
post-order (depth-first) and transformed into another tree, which,
written in a depth-first fashion, results in an HTML document that
is sent to {{current-output-port}}.

It is essentially {{pre-post-order}} with the {{universal-conversion-rules}}
hardcoded and a call to {{SRV:send-reply}} wrapped around it.
Besides the {{universal-conversion-rules}} it also knows about a rule
{{html:begin}}, which translates the HTML code to old-style uppercase
HTML 3 code preceded by a Content-Type header.

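A minimal usage sketch follows; the document content is invented, and CHICKEN 5 import syntax is assumed (on CHICKEN 4, {{(use sxml-transforms)}}). The exact whitespace of the output depends on the egg's tag-writing rules, so none is shown here.

```scheme
;; Assumption: CHICKEN 5 import syntax; on CHICKEN 4 use (use sxml-transforms).
(import sxml-transforms)

;; Writes the HTML rendering of the tree to (current-output-port).
;; Attribute lists are introduced by @, and the "&" in the text
;; is encoded as a character entity on output.
(SXML->HTML
  '(html
     (head (title "Greeting"))
     (body
       (p "Hello, "
          (a (@ (href "http://example.com/")) "world")
          " & friends"))))
```

To capture the result as a string instead of printing it, the call can be wrapped in {{with-output-to-string}}.
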
==== entag

<procedure>(entag tag elems)</procedure>

Create the HTML markup fragments for tags. TAG is the name of the tag (a symbol) and ELEMS is the tree of elements that form the contents of this tag (''not'' recursively processed).
This is used in the node handlers for the (pre-)post-order function, to prepare it for output by {{SRV:send-reply}}.
This is an alias for {{entag-xhtml}} (see below, in the section about Chicken-specific modifications).

==== enattr

<procedure>(enattr attr-key value)</procedure>

Create the HTML markup fragments for attributes. The ATTR-KEY is the name of the attribute (a symbol) and VALUE is the value it should have.
This is used in the node handlers for the (pre-)post-order function, to prepare it for output by {{SRV:send-reply}}.

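The sketch below shows how {{entag}} and {{enattr}} might be used from hand-written node handlers. It is modelled on the upstream SSAX conversion rules rather than taken from this egg, the name {{tag-rules}} is hypothetical, and CHICKEN 5 import syntax is assumed (on CHICKEN 4, {{(use sxml-transforms)}}).

```scheme
;; Assumption: CHICKEN 5 import syntax; on CHICKEN 4 use (use sxml-transforms).
(import sxml-transforms)

;; Hypothetical handler set: attributes go through enattr,
;; elements through entag, and text is passed along unencoded.
(define tag-rules
  `((@ ((*default* . ,(lambda (attr-key . value) (enattr attr-key value))))
       . ,(lambda (trigger . value) (cons '@ value)))
    (*default* . ,(lambda (tag . elems) (entag tag elems)))
    (*text* . ,(lambda (trigger str) str))))

(SRV:send-reply
  (pre-post-order '(a (@ (href "index.html")) "Home") tag-rules))
;; writes markup along the lines of: <a href="index.html">Home</a>
```

The local {{*default*}} binding inside the {{@}} rule applies only to the children of the attribute list, which is how attribute names are kept apart from element names.
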
==== string->goodHTML

<procedure>(string->goodHTML html)</procedure>

Given a string, check to make sure it does not contain characters
such as '<' or '&' that require encoding. Return either the original
string, or a list of string fragments with special characters
replaced by appropriate character entities.

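A short sketch of both outcomes, assuming CHICKEN 5 import syntax (on CHICKEN 4, {{(use sxml-transforms)}}):

```scheme
;; Assumption: CHICKEN 5 import syntax; on CHICKEN 4 use (use sxml-transforms).
(import sxml-transforms)

;; A string without special characters is returned as-is.
(string->goodHTML "plain text")
;; => "plain text"

;; A string with special characters comes back as a list of fragments,
;; which SRV:send-reply can write out directly.
(SRV:send-reply (string->goodHTML "1 < 2 & 3 > 2") #\newline)
;; prints: 1 &lt; 2 &amp; 3 &gt; 2
```

Since the result may be either a string or a list, it is best treated as a fragment for {{SRV:send-reply}} rather than passed to string procedures.
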
==== universal-conversion-rules

<constant>universal-conversion-rules</constant>

Bindings for the (pre-)post-order function, which traverses the SXML tree
and converts it to a tree of fragments. It contains rules to call
{{string->goodHTML}}, {{enattr}} and {{entag}} on all text, attributes and
tags. In normal situations you always append these rules to your own rules,
or add a final pre-post-order processing step with just these bindings.

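Appending these rules to your own might look like the following sketch. The {{note}} element and the name {{note-rules}} are hypothetical, and CHICKEN 5 import syntax is assumed (on CHICKEN 4, {{(use sxml-transforms)}}).

```scheme
;; Assumption: CHICKEN 5 import syntax; on CHICKEN 4 use (use sxml-transforms).
(import sxml-transforms)

;; Hypothetical custom element `note`, expanded into a classed paragraph;
;; everything else falls through to universal-conversion-rules.
(define note-rules
  `((note *macro* . ,(lambda (tag . kids)
                       `(p (@ (class "note")) ,@kids)))
    ,@universal-conversion-rules))

(SRV:send-reply
  (pre-post-order '(div (note "Mind the <gap>")) note-rules))
;; writes markup along the lines of:
;; <div><p class="note">Mind the &lt;gap&gt;</p></div>
```

Using {{*macro*}} for the custom rule keeps it in SXML terms: it returns plain SXML and leaves the conversion to markup fragments to the universal rules.
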
==== universal-protected-rules

<constant>universal-protected-rules</constant>

A variation of universal-conversion-rules which keeps
{{'<'}}, {{'>'}}, {{'&'}} and similar characters intact (i.e., it
skips calling {{string->goodHTML}}).
The {{universal-protected-rules}} are useful when the tree of
fragments has to be traversed one more time.

==== alist-conv-rules

<constant>alist-conv-rules</constant>

These rules define the identity transformation. You will usually need
to append these rules to all of the bindings you use with {{pre-post-order}},
unless you explicitly define your own conversion rules for {{*default*}}
and {{*text*}}.

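As a sketch of an SXML-to-SXML rewrite that relies on these rules for everything it does not handle itself (the element names are invented; CHICKEN 5 import syntax is assumed, on CHICKEN 4 {{(use sxml-transforms)}}):

```scheme
;; Assumption: CHICKEN 5 import syntax; on CHICKEN 4 use (use sxml-transforms).
(import sxml-transforms)

;; Rename the hypothetical `bold` element to `strong`; alist-conv-rules
;; supplies the *default* and *text* handlers that copy the rest unchanged.
(pre-post-order
  '(doc (title "Hi") (para "Some " (bold "text")))
  `((bold . ,(lambda (tag . kids) (cons 'strong kids)))
    ,@alist-conv-rules))
;; => (doc (title "Hi") (para "Some " (strong "text")))
```

Without the appended rules, the traversal would signal an error at {{doc}}, because no binding and no {{*default*}} would be found for it.
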
==== make-char-quotator

<procedure>(make-char-quotator quot-rules)</procedure>

Given QUOT-RULES, an assoc list of (char . string) pairs, return
a quotation procedure. The returned quotation procedure takes a string
and returns either a string or a list of strings. The quotation procedure
checks to see if its argument string contains any instance of a character
that needs to be encoded (quoted). If the argument string is "clean",
it is returned unchanged. Otherwise, the quotation procedure will
return a list of string fragments. The input string will be broken
at the places where the special characters occur. Each special character
will be replaced by its corresponding encoding string.

For example, to make a procedure that quotes special HTML characters, do:
<examples><example>
(make-char-quotator
    '((#\< . "&lt;") (#\> . "&gt;") (#\& . "&amp;") (#\" . "&quot;")))
</example></examples>

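A brief usage sketch of a returned quotation procedure. The name {{quote-attr}} and its reduced rule set are made up for illustration; CHICKEN 5 import syntax is assumed (on CHICKEN 4, {{(use sxml-transforms)}}).

```scheme
;; Assumption: CHICKEN 5 import syntax; on CHICKEN 4 use (use sxml-transforms).
(import sxml-transforms)

;; Hypothetical quotator for attribute values: only & and " are encoded.
(define quote-attr
  (make-char-quotator '((#\& . "&amp;") (#\" . "&quot;"))))

;; A clean string is returned unchanged.
(quote-attr "clean")
;; => "clean"

;; Otherwise the result is a list of fragments, written here with SRV:send-reply.
(SRV:send-reply (quote-attr "say \"hi\" & wave") #\newline)
;; prints: say &quot;hi&quot; &amp; wave
```

Characters that have no entry in the rules, such as {{<}} here, pass through untouched, so each quotator encodes exactly the set it was built with.
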
==== Chicken-specific modifications

==== entag-xhtml

<procedure>(entag-xhtml tag elems)</procedure>

{{entag-xhtml}} closes XHTML tags properly in an HTML compatible way.  {{entag}} is now an alias for {{entag-xhtml}}, so this behaviour is the default.

Newlines before open tags in the rendered HTML output are omitted for inline elements, such as {{tt}} and {{strong}}.  This prevents the introduction of extraneous whitespace.

==== entag-html

<procedure>(entag-html tag elems)</procedure>

{{entag-html}} is an alias for the original {{entag}}.

==== universal-conversion-rules

The {{universal-conversion-rules}} have been augmented a bit.

The following rule has been added:
    (& ENTITY-NAME ...)

Quotes character references given by strings {{ENTITY-NAME ...}}.

Example:
<examples><example>
<expr>(& "ndash" "quot")</expr>
<result>"&ndash;&quot;"</result>
</example></examples>


=== sxml-to-sxml

<procedure>(pre-post-order tree bindings)</procedure>

This module's version of {{pre-post-order}} is a variant which always outputs strictly-conformant SXML. It unnests lists that do not have a tag as their {{car}} until they do.
This comes from {{sxml-to-sxml.scm}}. If you import it, be sure to rename or omit the one from the {{sxml-transforms}} module.


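One way to avoid the name clash is sketched below. It assumes that the auxiliary module is importable under the name {{sxml-to-sxml}}, as the section heading suggests, and uses CHICKEN 5 import syntax; the {{strict:}} prefix is hypothetical.

```scheme
;; Assumption: the auxiliary module is importable as sxml-to-sxml,
;; and CHICKEN 5 import syntax is in use.
(import (except sxml-transforms pre-post-order) ; omit the default traversal
        sxml-to-sxml)                           ; pre-post-order is now the strict variant

;; Alternative (not run here): keep both, with the strict variant
;; available under a hypothetical prefix as strict:pre-post-order.
;; (import sxml-transforms (prefix sxml-to-sxml strict:))
```

The first form suits code that only ever produces SXML; the prefixed form is handy when one program both rewrites SXML and renders it to markup.
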
== Examples

[[http://okmij.org/ftp/Scheme/xml.html|Oleg's site]] is the main resource.  Be sure to read his examples and the ones in the SSAX repository (also included in the egg).  The following papers were of great help:
* [[http://okmij.org/ftp/papers/SXs-talk.pdf]]
* [[http://okmij.org/ftp/papers/SXs.pdf]]
* [[http://okmij.org/ftp/papers/SXSLT-talk.pdf]]
Also, the [[eggdoc.html|eggdoc]] extension makes heavy use of sxml-transforms.

There's also a more friendly [[http://sjamaan.ath.cx/docs/scheme/sxslt.pdf|SXML tutorial]] available.

The initial documentation on this wiki page came straight from the comments in the extremely well-documented source code. It's recommended you read the code if you want to learn more.

== About this egg


=== Author


[[http://okmij.org/ftp/|Oleg Kiselyov]]. Port by [[http://3e8.org/zb/|Zbigniew]]

=== Version history

; 1.2 : Port to hygienic chicken
; 1.1 : Improve inline element whitespace handling; add '&' rule.
; 1.0 : Initial release

=== License


The sxml-transforms code is in the public domain.
