source: project/wiki/eggref/4/sxml-tools @ 13334

Last change on this file since 13334 was 13334, checked in by sjamaan, 11 years ago

Add documentation for the sxpath-ext bit that was added to sxml-tools egg in release 4 branch

File size: 32.1 KB
1[[tags:eggs]]
2
3This is version 1.3 of the '''sxml-tools''' extension library for Chicken Scheme.
4
5[[toc:]]
6
7== Description
8
9The [[http://cvs.sourceforge.net/viewcvs.py/ssax/sxml-tools/|sxml-tools]] from the [[http://ssax.sf.net|SSAX project]] at Sourceforge.
10
11== Documentation
12
13This egg provides some utilities from the sxml-tools available in the
14SSAX/SXML Sourceforge project.  It consists of the extensions defined
15in {{sxml-tools.scm}} plus {{sxpathlib}} and {{sxpath-ext}}.  This is
16equivalent to the "low-level sxpath interface" described at
17[[http://www196.pair.com/lisovsky/query/sxpath/|the introduction to SXPath]].
18
19These utilities are useful when you want to query SXML document trees,
20but full sxpath would be overkill.  Most of these procedures are
21faster than their sxpath equivalent, because they are very specific.
22But this also means they are very low-level, so you should use them
23only if you know what you're doing.
24
25Much documentation is available at
26[[http://www196.pair.com/lisovsky/xml/index.html|Lisovsky's XML page]]
27and the [[http://ssax.sf.net|SSAX homepage]].
28
29The initial documentation on this wiki page came straight from the
30comments in the extremely well-documented source code. It's
31recommended you read the code if you want to learn more.
32
33=== sxml-tools
34
35This section documents the procedures that come from sxml-tools. These
36include mostly-generic list and SXML operators.
37
38==== Predicates
39
40<procedure>(sxml:empty-element? obj)</procedure>
41
42Predicate which returns {{#t}} if given element {{obj}} is empty.
43Empty elements have no nested elements, text nodes, PIs, Comments or
44entities, but may contain attributes or a namespace-id.  It is the SXML
45counterpart of XML {{empty-element}}.
46
47<procedure>(sxml:shallow-normalized? obj)</procedure>
48
49Returns {{#t}} if the given {{obj}} is a shallow-normalized SXML
50element.  The element itself has to be normalized but its nested
51elements are not tested.
52
53<procedure>(sxml:normalized? obj)</procedure>
54
55Returns {{#t}} if the given {{obj}} is a normalized SXML element.  The element
56itself and all its nested elements have to be normalized.
57
58<procedure>(sxml:shallow-minimized? obj)</procedure>
59
60Returns {{#t}} if the given {{obj}} is a shallow-minimized SXML
61element.  The element itself has to be minimized but its nested
62elements are not tested.
63
64<procedure>(sxml:minimized? obj)</procedure>
65
66Returns {{#t}} if the given {{obj}} is a minimized SXML element.  The
67element itself and all its nested elements have to be minimized.
68
69==== Accessors
70
71These procedures obtain information about nodes, or their direct
72children.  They don't traverse subtrees.
73
74===== Normalization-independent accessors
75
76These accessors can be used on arbitrary, non-normalized SXML trees.
77Because of this, they are generally slower than the
78normalization-dependent variants listed in the next section.
79
80<procedure>(sxml:name node)</procedure>
81
82Returns the name of the given SXML node.  It is introduced for the sake of
83encapsulation.
84
85<procedure>(sxml:element-name obj)</procedure>
86
87A checked version of {{sxml:name}}, which returns {{#f}} if the given
88{{obj}} is not an SXML element.  Otherwise returns its name.
89
90<procedure>(sxml:node-name obj)</procedure>
91
92A safe version of {{sxml:name}}, which returns {{#f}} if the given {{obj}}
93is not an SXML node.  Otherwise returns its name.
94
95The difference between this and {{sxml:element-name}} is that a node
96can be one of {{@}}, {{@@}}, {{*PI*}}, {{*COMMENT*}} or {{*ENTITY*}}
97while an element must be a real element (any symbol not in that set is
98considered to be an element).
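
The distinction between the two checked accessors can be sketched as follows.  This is only an illustration: it assumes the egg can be loaded with {{(require-extension sxml-tools)}}, and the sample nodes are made up.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

;; An element has an ordinary name, so both accessors return it.
(sxml:element-name '(div "hi"))        ; => div
(sxml:node-name '(div "hi"))           ; => div

;; An attribute list is a node, but not an element.
(sxml:element-name '(@ (class "x")))   ; => #f
(sxml:node-name '(@ (class "x")))      ; => @

;; A string (text node) has no name at all.
(sxml:node-name "just text")           ; => #f
```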
99
100<procedure>(sxml:ncname node)</procedure>
101
102Like {{sxml:name}}, except returns only the local part of the name
103(called an "NCName" in the
104[[http://www.w3.org/TR/xml-names/|XML namespaces spec]]).
105
106The node's name is interpreted as a "Qualified Name": a
107colon-separated name whose last component is considered to be the
108local part.  If the name contains no colons, the name itself is
109returned.
110
111'''Important:''' Please note that while an SXML name is a symbol, this
112function returns a string.
113
114<procedure>(sxml:name->ns-id sxml-name)</procedure>
115
116Given a node name, return the namespace part of the name (called a
117{{namespace-id}}).  If the name contains no colons, returns {{#f}}.  See
118{{sxml:ncname}} for more info.
119
120'''Important:''' Please note that while an SXML name is a symbol, this
121function returns a string.
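
A brief sketch of how the name accessors split a qualified name.  It assumes the egg can be loaded with {{(require-extension sxml-tools)}}; the element {{c:part}} is made up for illustration.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define part '(c:part (@ (id "p1")) "wheel"))  ; hypothetical element

(sxml:name part)            ; => c:part  (a symbol)
(sxml:ncname part)          ; => "part"  (a string)
(sxml:name->ns-id 'c:part)  ; => "c"     (a string)
(sxml:name->ns-id 'part)    ; => #f      (no colon, so no namespace-id)
```

Note that {{sxml:ncname}} takes the node itself, while {{sxml:name->ns-id}} takes only the name.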
122
123<procedure>(sxml:content obj)</procedure>
124
125Retrieve the contents of an SXML element or nodeset.  Any non-element
126nodes (attributes, processing instructions, etc) are discarded,
127while the elements and text nodes are returned as a list of strings
128and nested elements in document order.  This list is empty if {{obj}}
129is an empty element or empty list.
130
131The inner elements are unmodified so they still contain attributes,
132but also comments or other non-element nodes.
133
134<examples>
135<example>
136<expr>
137(sxml:content
138  '(div (@ (class "content"))
139        (*COMMENT* "main contents start here")
140         "The document moved "
141         (a (@ (href "/other.xml")) "here")))
142</expr>
143<result>("The document moved " (a (@ (href "/other.xml")) "here"))</result>
144</example>
145</examples>
146
147<procedure>(sxml:text node)</procedure>
148
149Returns a string which combines all the character data from text node
150children of the given SXML element or "" if there are no text node
151children.  Note that it does not include text from descendant nodes,
152only direct children.
153
154<examples>
155<example>
156<expr>
157(sxml:text
158  '(div (@ (class "content"))
159        (*COMMENT* "main contents start here")
160         "The document moved "
161         (a (@ (href "/other.xml")) "here")))
162</expr>
163<result>"The document moved "</result>
164</example>
165</examples>
166
167===== Normalization-dependent accessors
168
169"Universal" accessors are less efficient but may be used for
170non-normalized SXML.  These safe accessors are named with suffix '-u'
171for "universal".
172
173"Fast" accessors are optimized for normalized SXML data.  They are not
174applicable to arbitrary non-normalized SXML data.  Their names have no
175specific suffixes.
176
177<procedure>(sxml:content-raw obj)</procedure>
178
179Returns all the content of a normalized SXML element except attr-list
180and aux-list.  Thus it includes {{PI}}, {{COMMENT}} and {{ENTITY}}
181nodes as well as {{TEXT}} and {{ELEMENT}} nodes returned by
182{{sxml:content}}.  Returns a list of nodes in document order or empty
183list if {{obj}} is an empty element or an empty list.
184
185This function is faster than {{sxml:content}}.
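
To make the difference with {{sxml:content}} concrete, here is a small sketch on a normalized element (attribute list in second position).  The loading form {{(require-extension sxml-tools)}} is an assumption; the tree is a shortened version of the example used above.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define elem
  '(div (@ (class "content"))
        (*COMMENT* "main contents start here")
        "The document moved "))

;; sxml:content drops the comment node...
(sxml:content elem)      ; => ("The document moved ")

;; ...while sxml:content-raw only skips the attribute list.
(sxml:content-raw elem)  ; => ((*COMMENT* "main contents start here")
                         ;     "The document moved ")
```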
186
187<procedure>(sxml:attr-list-u obj)</procedure>
188
189Returns the list of attributes for given element or nodeset.  Analog
190of {{((sxpath '(@ *)) obj)}}.  Empty list is returned if there is no
191list of attributes.
192
193<procedure>(sxml:aux-list obj)</procedure>
194<procedure>(sxml:aux-list-u obj)</procedure>
195
196Returns the list of auxiliary nodes for given element or nodeset.
197Analog of {{((sxpath '(@@ *)) obj)}}.  Empty list is returned if a
198list of auxiliary nodes is absent.
199
200<procedure>(sxml:aux-node obj aux-name)</procedure>
201
202Return the first aux-node with the name {{aux-name}} in SXML element
203{{obj}}, or {{#f}} if such a node is absent.
204
205'''NOTE:''' it returns just the ''first'' node found even if multiple
206nodes are present, so it's mostly intended for nodes with unique names.
207Use {{sxml:aux-nodes}} if you want all of them.
208
209<procedure>(sxml:aux-nodes obj aux-name)</procedure>
210   
211Return a list of aux-nodes with {{aux-name}} given in SXML element
212{{obj}} or {{'()}} if such a node is absent.
213
214<procedure>(sxml:attr obj attr-name)</procedure>
215
216Returns the value of the attribute with name {{attr-name}} in the
217given SXML element {{obj}}, or {{#f}} if no such attribute exists.
218
219<procedure>(sxml:attr-from-list attr-list name)</procedure>
220
221Returns the value of the attribute with name {{name}} in the
222given list of attributes {{attr-list}}, or {{#f}} if no such attribute
223exists.  The list of attributes can be obtained from an element using
224the {{sxml:attr-list}} procedure.
225
226<procedure>(sxml:num-attr obj attr-name)</procedure>
227
228Returns the value of the numerical attribute with name {{attr-name}}
229in the given SXML element {{obj}}, or {{#f}} if no such attribute
230exists.  This value is converted from a string to a number.
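
The attribute accessors above could be used roughly like this.  The loading form and the {{img}} element are assumptions made for the sake of the example.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define img '(img (@ (src "logo.png") (width "120"))))  ; hypothetical element

(sxml:attr img 'src)        ; => "logo.png"
(sxml:attr img 'alt)        ; => #f   (no such attribute)
(sxml:num-attr img 'width)  ; => 120  (a number, not the string "120")

;; Look up an attribute in a list that was fetched earlier.
(sxml:attr-from-list (sxml:attr-list img) 'src)  ; => "logo.png"
```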
231
232<procedure>(sxml:attr-u obj attr-name)</procedure>
233
234Accessor for an attribute {{attr-name}} of given SXML element {{obj}},
235which may also be an attributes-list or a nodeset (usually content of
236an SXML element).
237
238<procedure>(sxml:ns-list obj)</procedure>
239
240Returns the list of namespaces for given element.  Analog of
241{{((sxpath '(@@ *NAMESPACES* *)) obj)}}.  The empty list is returned
242if there are no namespaces.
243
244<procedure>(sxml:ns-id->nodes obj namespace-id)</procedure>
245
246Returns a list of namespace information lists that match the given
247{{namespace-id}} in SXML element {{obj}}.  Analog of
248{{((sxpath '(@@ *NAMESPACES* namespace-id)) obj)}}.
249The empty list is returned if there is no namespace with the given
250{{namespace-id}}.
251
252<examples>
253<example>
254<expr>
255(sxml:ns-id->nodes
256  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")))) 'c)
257</expr>
258<result>((c "http://www.cars.com/xml"))</result>
259</example>
260</examples>
261
262<procedure>(sxml:ns-id->uri obj namespace-id)</procedure>
263
264Returns the URI for the (first) namespace matching the given
265{{namespace-id}}, or {{#f}} if no namespace matches the given
266{{namespace-id}}.
267
268<examples>
269<example>
270<expr>
271(sxml:ns-id->uri
272  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")))) 'c)
273</expr>
274<result>"http://www.cars.com/xml"</result>
275</example>
276</examples>
277
278<procedure>(sxml:ns-uri->nodes obj uri)</procedure>
279
280Returns a list of namespace information lists that match the given
281{{uri}} in SXML element {{obj}}.
282
283<examples>
284<example>
285<expr>
286(sxml:ns-uri->nodes
287  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")
288                                 (d "http://www.cars.com/xml"))))
289  "http://www.cars.com/xml")
290</expr>
291<result>((c "http://www.cars.com/xml") (d "http://www.cars.com/xml"))</result>
292</example>
293</examples>
294
295<procedure>(sxml:ns-uri->id obj uri)</procedure>
296
297Returns the namespace id for the (first) namespace matching the given
298{{uri}}, or {{#f}} if no namespace matches the given {{uri}}.
299
300<examples>
301<example>
302<expr>
303(sxml:ns-uri->id
304  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")
305                                 (d "http://www.cars.com/xml"))))
306  "http://www.cars.com/xml")
307</expr>
308<result>c</result>
309</example>
310</examples>
311
312<procedure>(sxml:ns-id ns-list)</procedure>
313
314Given a namespace information list {{ns-list}}, returns the namespace ID.
315
316<procedure>(sxml:ns-uri ns-list)</procedure>
317
318Given a namespace information list {{ns-list}}, returns the namespace URI.
319
320<procedure>(sxml:ns-prefix ns-list)</procedure>
321
322Given a namespace information list {{ns-list}}, returns the namespace
323prefix if it is present in the list.  If it's not present, returns the
324namespace ID.
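
A sketch of these three accessors applied to namespace information lists.  The loading form is an assumption, and the {{cars}} prefix in the second list is invented for illustration.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define ns-plain  '(c "http://www.cars.com/xml"))       ; id and URI only
(define ns-prefix '(c "http://www.cars.com/xml" cars))  ; hypothetical original prefix

(sxml:ns-id ns-plain)       ; => c
(sxml:ns-uri ns-plain)      ; => "http://www.cars.com/xml"
(sxml:ns-prefix ns-prefix)  ; => cars
(sxml:ns-prefix ns-plain)   ; => c  (falls back to the namespace ID)
```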
325
326==== Data modification procedures
327
328Constructors and mutators for normalized SXML data.
329 
330'''Important:''' These functions are optimized for normalized SXML
331data.  They are ''not'' applicable to arbitrary non-normalized SXML
332data.
333
334Most of the functions are provided in two variants:
335
336# Side-effecting functions for linear update of the given elements.  Their names end with an exclamation mark.
337# Pure functions without side-effects which return modified elements.
338
339
340<procedure>(sxml:change-content! obj new-content)</procedure>
341<procedure>(sxml:change-content obj new-content)</procedure>
342
343Change the content of given SXML element {{obj}} to {{new-content}}.
344If {{new-content}} is an empty list then the {{obj}} is transformed to
345an empty element.  The resulting SXML element is normalized.
346
347<procedure>(sxml:change-attrlist obj new-attrlist)</procedure>
348<procedure>(sxml:change-attrlist! obj new-attrlist)</procedure>
349
350Change the attribute list of the given SXML element {{obj}} to
351{{new-attrlist}}.
352
353<procedure>(sxml:change-name obj new-name)</procedure>
354<procedure>(sxml:change-name! obj new-name)</procedure>
355
356Change the name of the given SXML element {{obj}} to {{new-name}}.
357
358<procedure>(sxml:add-attr obj attr)</procedure>
359<procedure>(sxml:add-attr! obj attr)</procedure>
360
361Returns the given SXML element {{obj}} with the attribute {{attr}}
362added to the attribute list, or {{#f}} if the attribute already exists.
363
364<procedure>(sxml:change-attr obj attr)</procedure>
365<procedure>(sxml:change-attr! obj attr)</procedure>
366
367Returns SXML element {{obj}} with changed value of attribute {{attr}}
368or {{#f}} if there is no attribute with the given name.
369
370{{attr}} is a list like it would occur as a member of an attribute
371list: {{(attr-name attr-value)}}.
372   
373<procedure>(sxml:set-attr obj attr)</procedure>
374<procedure>(sxml:set-attr! obj attr)</procedure>
375
376Returns SXML element {{obj}} with changed value of attribute {{attr}}.
377If there is no such attribute the new one is added.
378
379{{attr}} is a list like it would occur as a member of an attribute
380list: {{(attr-name attr-value)}}.
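
The pure variants described so far can be tried out as sketched below.  The results are inspected through accessors rather than printed whole, because the exact shape of the returned element (for instance, whether an empty attribute list is kept) is not spelled out here.  The loading form and the sample element are assumptions.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

;; A normalized element: the attribute list is in second position.
(define link '(a (@ (href "/old")) "link"))

(sxml:name (sxml:change-name link 'span))            ; => span
(sxml:content (sxml:change-content link '("new")))   ; => ("new")

;; add-attr refuses to overwrite, change-attr refuses to create.
(sxml:add-attr link '(href "/dup"))                  ; => #f
(sxml:change-attr link '(title "t"))                 ; => #f

;; set-attr does whichever of the two applies.
(sxml:attr (sxml:set-attr link '(href "/new")) 'href)  ; => "/new"
(sxml:attr (sxml:set-attr link '(title "t")) 'title)   ; => "t"

;; The pure variants leave the original element untouched.
(sxml:attr link 'href)                               ; => "/old"
```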
381
382<procedure>(sxml:add-aux obj aux-node)</procedure>
383<procedure>(sxml:add-aux! obj aux-node)</procedure>
384
385Returns SXML element {{obj}} with an auxiliary node {{aux-node}} added.
386
387<procedure>(sxml:squeeze obj)</procedure>
388<procedure>(sxml:squeeze! obj)</procedure>
389
390Returns a minimized and normalized SXML element {{obj}} with empty
391lists of attributes and aux-lists eliminated, in {{obj}} and all its
392descendants.
393   
394<procedure>(sxml:clean obj)</procedure>
395
396Returns a minimized and normalized SXML element {{obj}} with empty
397lists of attributes and '''all''' aux-lists eliminated, in {{obj}} and
398all its descendants.
399
400
401==== Sxpath-related procedures
402
403<procedure>(select-first-kid test-pred?)</procedure>
404
405Given a node, return the first child that satisfies the
406{{test-pred?}}.  Given a nodeset, traverse the set until a node is
407found whose first child matches the predicate.  Returns {{#f}} if
408there is no such child to be found.
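
A minimal sketch of the single-node case, assuming the egg is loaded with {{(require-extension sxml-tools)}}; the paragraph element is made up.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define para '(p (a "1") (b "2") (b "3")))  ; hypothetical element

;; Only the first matching child is returned, as a node (not a nodelist).
((select-first-kid (ntype?? 'b)) para)  ; => (b "2")

;; No child matches, so the result is #f rather than '().
((select-first-kid (ntype?? 'i)) para)  ; => #f
```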
409
410<procedure>(sxml:node-parent rootnode)</procedure>
411
412Returns a function of one argument - an SXML element - which returns
413its parent node using {{*PARENT*}} pointer in the aux-list.
414{{'*TOP-PTR*}} may be used as a pointer to root node.  It returns an
415empty list when applied to the root node.
416
417<procedure>(sxml:add-parents obj [top-ptr])</procedure>
418
419Returns the SXML element {{obj}} annotated with {{*PARENT*}} pointers
420for {{obj}} and all its descendants.  If {{obj}} is not the root node
421(a node with a name of {{*TOP*}}), you must pass in the parent pointer
422for {{obj}} as {{top-ptr}}.
423
424'''Warning:''' This procedure mutates its {{obj}} argument.
425
426<procedure>(sxml:lookup id index)</procedure>
427
428Lookup an element using its ID.  {{index}} should be an alist of
429{{(id . element)}}.
430
431==== Markup generation
432
433===== XML
434
435<procedure>(sxml:attr->xml attr)</procedure>
436
437Returns a list containing tokens that when joined together form the
438attribute's XML output.
439
440'''Warning:''' This procedure assumes that the attribute's values have
441already been escaped (ie, {{sxml:string->xml}} has been called on the
442strings inside it).
443
444<examples>
445<example>
446<expr>(sxml:attr->xml '(href "http://example.com"))</expr>
447<result>(" " "href" "='" "http://example.com" "'")</result>
448</example>
449</examples>
450
451<procedure>(sxml:string->xml string)</procedure>
452
453Escape the {{string}} so it can be used anywhere in XML output.  This
454converts the {{<}}, {{>}}, {{'}}, {{"}} and {{&}} characters to their
455respective entities.
456
457<procedure>(sxml:sxml->xml tree)</procedure>
458
459Convert the {{tree}} of SXML nodes to a nested list of XML fragments.
460These fragments can be output by flattening the list and concatenating
461the strings inside it.
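
One possible way to turn the nested fragment list into a single string is sketched below.  {{fragments->string}} is a hypothetical helper written for this example and is not part of the egg; the loading form is likewise an assumption.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

;; Hypothetical helper (not part of the egg): walk the nested list of
;; fragments depth-first and concatenate every string found in it.
(define (fragments->string fragments)
  (cond ((string? fragments) fragments)
        ((pair? fragments)
         (string-append (fragments->string (car fragments))
                        (fragments->string (cdr fragments))))
        (else "")))  ; the empty list (or anything else) adds nothing

(fragments->string (sxml:sxml->xml '(p "hi")))
;; => "<p>hi</p>"

(fragments->string (sxml:sxml->xml '(a (@ (href "/other.xml")) "here")))
;; => "<a href='/other.xml'>here</a>"
```

When writing to a port it is usually cheaper to walk the fragments and {{display}} each string directly instead of building one large string first.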
462
463===== HTML
464
465<procedure>(sxml:attr->html attr)</procedure>
466
467Returns a list containing tokens that when joined together form the
468attribute's HTML output.  The difference with the XML variant is that
469this encodes empty attribute values to attributes with no value (think
470{{selected}} in option elements, or {{checked}} in checkboxes).
471
472'''Warning:''' This procedure assumes that the attribute's values have
473already been escaped (ie, {{sxml:string->html}} has been called on the
474strings inside it).
475
476<procedure>(sxml:string->html string)</procedure>
477
478Escape the {{string}} so it can be used anywhere in HTML output.  This
479converts the {{<}}, {{>}}, {{"}} and {{&}} characters to their
480respective entities.
481
482<procedure>(sxml:non-terminated-html-tag? tag)</procedure>
483
484Is the named {{tag}} one that is "self-closing" (ie, does not need to
485be terminated) in HTML 4.0?
486
487<procedure>(sxml:sxml->html tree)</procedure>
488
489Convert the {{tree}} of SXML nodes to a nested list of HTML fragments.
490These fragments can be output by flattening the list and concatenating
491the strings inside it.
492
493
494=== Procedures from sxpathlib
495
496==== Basic converters and applicators
497
498A converter is a function
499
500  type Converter = Node|Nodelist -> Nodelist
501
502A converter can also play a role of a predicate: in that case, if a
503converter, applied to a node or a nodelist, yields a non-empty
504nodelist, the converter-predicate is deemed satisfied. Throughout this
505file a nil nodelist is equivalent to {{#f}} in denoting a failure.
506
507<procedure>(nodeset? obj)</procedure>
508
509Returns {{#t}} if {{obj}} is a nodelist.
510
511<procedure>(as-nodeset obj)</procedure>
512
513If {{obj}} is a nodelist, returns it as is; otherwise wraps it in a
514list.
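
The difference between a node and a nodelist is easy to get wrong, so here is a short sketch.  It assumes the egg is loaded with {{(require-extension sxml-tools)}}.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

;; A list whose first member is not a symbol is a nodelist.
(nodeset? '((div "a") (span "b")))  ; => #t
(nodeset? '())                      ; => #t  (the empty nodelist)

;; A list that starts with a symbol is a single node.
(nodeset? '(div "a"))               ; => #f

;; as-nodeset wraps single nodes and leaves nodelists alone.
(as-nodeset '(div "a"))             ; => ((div "a"))
(as-nodeset '((div "a")))           ; => ((div "a"))
(as-nodeset "text")                 ; => ("text")
```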
515
516==== Node test
517
518The following functions implement 'Node test's as defined in Sec. 2.3
519of the XPath document.  A node test is one of the components of a
520location step.  It is also a converter-predicate in SXPath.
521
522<procedure>(sxml:element? obj)</procedure>
523
524Predicate which returns {{#t}} if {{obj}} is an SXML element, otherwise {{#f}}.
525
526<procedure>(ntype-names?? crit)</procedure>
527
528Takes a list of acceptable node names as a criterion and returns a
529function, which, when applied to a node, will return {{#t}} if the
530node name is present in criterion list and {{#f}} otherwise.
531
532   ntype-names?? :: ListOfNames -> Node -> Boolean
533
534<procedure>(ntype?? crit)</procedure>
535
536Takes a type criterion and returns a function, which, when applied to
537a node, will tell if the node satisfies the test.
538
539  ntype?? :: Crit -> Node -> Boolean
540
541The criterion {{crit}} is one of the following symbols:
542
543; {{@}} : tests if the Node is an {{attributes-list}}
544; {{*}} : tests if the Node is an {{Element}}
545; {{*text*}} : tests if the Node is a text node
546; {{*data*}} : tests if the Node is a data node  (text, number, boolean, etc., but not pair)
547; {{*PI*}} : tests if the Node is a processing instructions node
548; {{*COMMENT*}} : tests if the Node is a comment node
549; {{*ENTITY*}} : tests if the Node is an entity node
550; {{*any*}} : {{#t}} for any type of Node
551; other symbol : tests if the Node has the right name given by the symbol
552
553<examples>
554<example>
555<expr>
556((ntype?? 'div) '(div (@ (class "greeting")) "hi"))
557</expr>
558<result>
559#t
560</result>
561</example>
562<example>
563<expr>
564((ntype?? 'div) '(span (@ (class "greeting")) "hi"))
565</expr>
566<result>
567#f
568</result>
569</example>
570<example>
571<expr>
572((ntype?? '*) '(span (@ (class "greeting")) "hi"))
573</expr>
574<result>
575#t
576</result>
577</example>
578</examples>
579   
580<procedure>(ntype-namespace-id?? ns-id)</procedure>
581
582This function takes a namespace-id, and returns a predicate
583{{Node -> Boolean}}, which is {{#t}} for nodes with the given
584namespace id. {{ns-id}} is a string.
585{{(ntype-namespace-id?? #f)}} will be {{#t}} for nodes with
586non-qualified names.
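
For illustration, a sketch of the namespace test on a qualified and a non-qualified name.  The loading form and the sample elements are assumptions.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define in-c-namespace? (ntype-namespace-id?? "c"))  ; note: a string
(define unqualified?    (ntype-namespace-id?? #f))

(in-c-namespace? '(c:part "wheel"))  ; => #t
(in-c-namespace? '(part "wheel"))    ; => #f
(unqualified?    '(part "wheel"))    ; => #t
(unqualified?    '(c:part "wheel"))  ; => #f
```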
587
588<procedure>(sxml:complement pred)</procedure>
589
590This function takes a predicate and returns it complemented, that is
591if the given predicate yields {{#f}} or {{'()}} the complemented one
592yields the given node and vice versa.
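
A complemented predicate is mostly useful together with {{sxml:filter}}, as in this sketch (the loading form is an assumption):

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define not-div? (sxml:complement (ntype?? 'div)))

;; Applied directly, it yields the node itself or #f.
(not-div? '(span "hello"))  ; => (span "hello")
(not-div? '(div "hi"))      ; => #f

;; Keep everything that is *not* a div.
((sxml:filter not-div?)
 '((div "hi") (span "hello") (div "still here?")))
;; => ((span "hello"))
```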
593
594<procedure>(node-eq? other)</procedure>
595
596Returns a predicate procedure that, given a node, returns {{#t}} if
597the node is the exact same as {{other}}.
598
599<procedure>(node-equal? other)</procedure>
600
601Returns a predicate procedure that, given a node, returns {{#t}} if
602the node has the same contents as {{other}}.
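
The difference between identity and structural equality, sketched under the assumption that the egg is loaded with {{(require-extension sxml-tools)}}:

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define original '(div "hi"))
(define copy     (list 'div "hi"))  ; same contents, different object

((node-eq? original) original)     ; => #t
((node-eq? original) copy)         ; => #f
((node-equal? original) copy)      ; => #t
```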
603
604<procedure>(node-pos n)</procedure>
605
606Returns a procedure that, given a nodelist, returns a new nodelist
607containing only the {{n}}th element, counting from 1.  If {{n}} is
608negative, it returns a nodelist with the {{n}}th element counting from
609the right.  If no such node exists, returns the empty list.  {{n}} may
610not equal zero.
611
612<examples>
613<example>
614<expr>
615((node-pos 1) '((div "hi") (span "hello") (em "really, hi!")))
616</expr>
617<result>
618((div "hi"))
619</result>
620</example>
621<example>
622<expr>
623((node-pos 6) '((div "hi") (span "hello") (em "really, hi!")))
624</expr>
625<result>
626()
627</result>
628</example>
629<example>
630<expr>
631((node-pos -1) '((div "hi") (span "hello") (em "is this thing on?")))
632</expr>
633<result>
634((em "is this thing on?"))
635</result>
636</example>
637</examples>
638
639<procedure>(sxml:filter pred?)</procedure>
640
641Returns a procedure that accepts a nodelist or a node (which will be
642converted to a one-element nodelist) and returns only those nodes for
643which the predicate {{pred?}} does not return {{#f}} or {{'()}}.
644
645<examples>
646<example>
647<expr>
648((sxml:filter (ntype?? 'div)) '((div "hi") (span "hello") (div "still here?")))
649</expr>
650<result>
651((div "hi") (div "still here?"))
652</result>
653</example>
654</examples>
655
656<procedure>(take-until pred?)</procedure>
657<procedure>(take-after pred?)</procedure>
658
659Returns a procedure that accepts a node or a nodelist.
660
661The {{take-until}} variant returns everything ''before'' the first
662node for which the predicate {{pred?}} returns anything but {{#f}} or
663{{'()}}.  In other words, it returns the longest prefix for which the
664predicate returns {{#f}} or {{'()}}.
665
666The {{take-after}} variant returns everything ''after'' the first node
667for which the predicate {{pred?}} returns anything besides {{#f}} or
668{{'()}}.
669
670<examples>
671<example>
672<expr>
673((take-until (ntype?? 'span)) '((div "hi") (span "hello") (span "there") (div "still here?")))
674</expr>
675<result>
676((div "hi"))
677</result>
678</example>
679<example>
680<expr>
681((take-after (ntype?? 'span)) '((div "hi") (span "hello") (span "there") (div "still here?")))
682</expr>
683<result>
684((span "there") (div "still here?"))
685</result>
686</example>
687</examples>
688
689<procedure>(map-union proc list)</procedure>
690
691Apply {{proc}} to each element of the nodelist {{list}} and return the
692list of results.  If {{proc}} returns a nodelist, splice it into the
693result (essentially returning a flattened nodelist).
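
The splicing behaviour can be seen in this small sketch (the loading form is an assumption):

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

;; sxml:content returns a nodelist for each node, so the results are
;; spliced into one flat nodelist.
(map-union sxml:content '((p "a" "b") (p "c")))  ; => ("a" "b" "c")

;; sxml:name returns a symbol, which is not a nodelist, so the results
;; are simply collected as with map.
(map-union sxml:name '((p "a") (q "b")))         ; => (p q)
```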
694
695<procedure>(node-reverse node-or-nodelist)</procedure>
696
697Accepts a nodelist and reverses the nodes inside.  If a node is passed
698to this procedure, it returns a nodelist containing just that node
699(it does not change the order of the children).
700
701==== Converter combinators
702
703Combinators are higher-order functions that transmogrify a converter
704or glue a sequence of converters into a single, non-trivial
705converter. The goal is to arrive at converters that correspond to
706XPath location paths.
707
708From a different point of view, a combinator is a fixed, named
709''pattern'' of applying converters. Given below is a complete set of
710such patterns that together implement XPath location path
711specification. As it turns out, all these combinators can be built
712from a small number of basic blocks: regular functional composition,
713{{map-union}} and filter applicators, and the nodelist union.
714
715<procedure>(select-kids pred?)</procedure>
716
717Returns a procedure that accepts a node and returns a nodelist of the
718node's children that satisfy {{pred?}} (ie, {{pred?}} returns anything
719but {{#f}} or {{'()}}).
720
721<procedure>(node-self pred?)</procedure>
722
723Similar to {{select-kids}} but applies to the node itself rather than
724to its children. The resulting Nodelist will contain either one
725component (the node), or will be empty (if the node failed the
726predicate).
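
A sketch contrasting the two selectors on the same node.  The loading form and the list element are assumptions.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define whiskies '(ul (li "Ardbeg") (p "interlude") (li "Springbank")))

;; select-kids looks at the children of the node.
((select-kids (ntype?? 'li)) whiskies)
;; => ((li "Ardbeg") (li "Springbank"))

;; node-self looks at the node itself and always returns a nodelist.
((node-self (ntype?? 'ul)) whiskies)
;; => ((ul (li "Ardbeg") (p "interlude") (li "Springbank")))
((node-self (ntype?? 'ol)) whiskies)
;; => ()
```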
727
728<procedure>(node-join . selectors)</procedure>
729
730Returns a procedure that accepts a nodelist or a node, and returns a
731nodelist with all the selectors applied to every node in sequence.
732The selectors must function as converter combinators, ie they must
733accept a ''node'' and output a ''nodelist''.
734
735<examples>
736<example>
737<expr>
738((node-join
739  (select-kids (ntype?? 'li))
740  sxml:content)
741 '((ul (@ (class "whiskies"))
742       (li "Ardbeg")
743       (li "Glenfarclas")
744       (li "Springbank"))))
745</expr>
746<result>
747("Ardbeg" "Glenfarclas" "Springbank")
748</result>
749</example>
750</examples>
751
752<procedure>(node-reduce . converters)</procedure>
753
754A regular functional composition of converters.
755
756From a different point of view,
757  ((apply node-reduce converters) nodelist)
758is equivalent to
759  (fold apply nodelist converters)
760i.e., folding, or reducing, a list of converters with the nodelist
761as a seed.
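
As an illustration, composing two child selectors gives something like the XPath location path {{ul/li}}.  The loading form and the document fragment are assumptions.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define page '(div (ul (li "a") (li "b")) (p "not a list")))

;; The converters are applied left to right: first select the ul
;; children of the div, then the li children of those.
((node-reduce (select-kids (ntype?? 'ul))
              (select-kids (ntype?? 'li)))
 page)
;; => ((li "a") (li "b"))
```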
762
763
764<procedure>(node-or . converters)</procedure>
765
766This combinator applies all converters to a given node and produces
767the union of their results.  This combinator corresponds to a union,
768"{{|}}" operation for XPath location paths.
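
A sketch of a union of two child selections, roughly {{h1 | p}} in XPath terms.  The loading form and the document fragment are assumptions.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define body '(body (h1 "Title") (div "skipped") (p "Text")))

((node-or (select-kids (ntype?? 'h1))
          (select-kids (ntype?? 'p)))
 body)
;; => ((h1 "Title") (p "Text"))
```

The results appear to be concatenated in the order the converters are given, so the outcome is only in document order if the converters are listed that way.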
769
770<procedure>(node-closure test-pred?)</procedure>
771
772Select all ''descendants'' of a node that satisfy a
773converter-predicate.  This combinator is similar to {{select-kids}}
774but applies to grandchildren as well.
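
The following sketch shows matches being collected level by level, so a shallow match is reported before a deeper one even if it comes later in the document.  The loading form and the tree are assumptions.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define tree '(div (section (em "deep")) (em "shallow")))

((node-closure (ntype?? 'em)) tree)
;; => ((em "shallow") (em "deep"))
```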
775
776<procedure>(node-trace title)</procedure>
777
778Returns a procedure that accepts a node or a nodelist, which it
779pretty-prints to the current output port, preceded by {{title}}.  It
780returns the node or the nodelist unchanged.  This is a useful
781debugging aid, since it doesn't really do anything besides print its
782argument and pass it on.
783
784<procedure>(sxml:node? obj)</procedure>
785
786Returns {{#t}} if the given {{obj}} is an SXML node, {{#f}} otherwise.
787A node is anything except an attribute list or an auxiliary list.
788
789<procedure>(sxml:attr-list node)</procedure>
790
791Returns the list of attributes for a given SXML node.  The empty list
792is returned if the given node is not an element, or if it has no list
793of attributes.
794
795This differs from {{sxml:attr-list-u}} in that this procedure accepts
796any SXML node while {{sxml:attr-list-u}} only accepts nodelists or
797elements.  This means that sxml:attr-list-u will throw an error if you
798pass it a text node (a string), while sxml:attr-list will not.
799
800<procedure>(sxml:attribute test-pred?)</procedure>
801
802Like {{sxml:filter}}, but considers the attributes instead of the
803nodes.  Returns a nodelist of attributes that match {{test-pred?}}.
804
805<examples>
806<example>
807<expr>
808((sxml:attribute (ntype?? 'id))
809 '((div (@ (id "navigation")) "navigation here")
810   (div (@ (class "pullquote")) "random stuff")
811   (div (@ (id "main-content")) "lorem ipsum ...")))
812</expr>
813<result>
814((id "navigation") (id "main-content"))
815</result>
816</example>
817</examples>
818
819<procedure>(sxml:child test-pred?)</procedure>
820
821This procedure is similar to {{select-kids}}, but it returns an empty
822child-list for PI, Comment and Entity nodes.
823
824<procedure>(sxml:parent test-pred?)</procedure>
825
826Returns a procedure that accepts a root-node, and returns another
827procedure.  This second procedure accepts a nodeset (or a node) and
828returns the immediate parents of the nodes in the set, but only
829those parents that match the predicate.
830
831The root-node does not have to be the root node of the
832whole SXML tree -- it may be a root node of a branch of interest.
833
834This procedure can be used with any SXML node.
835
836==== Useful shortcuts
837
838<procedure>(node-parent rootnode)</procedure>
839
840{{(node-parent rootnode)}} yields a converter that returns the parent of a
841node it is applied to.  If applied to a nodelist, it returns the list
842of parents of nodes in the nodelist.
843
844This is equivalent to {{((sxml:parent (ntype?? '*any*)) rootnode)}}.
845
846<procedure>(sxml:child-nodes node)</procedure>
847
848Returns all the child nodes of the given {{node}}.
849
850This is equivalent to {{((sxml:child sxml:node?) node)}}.
851
852<procedure>(sxml:child-elements node)</procedure>
853
854Returns all the child ''elements'' of the given {{node}}. (ie,
855excludes any textnodes).
856
857This is equivalent to {{((select-kids sxml:element?) node)}}.
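
A sketch of the two child shortcuts side by side (the loading form and the sample paragraph are assumptions):

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(define para '(p (@ (id "x")) "some " (em "emphasised") " text"))

;; All child nodes: text and elements, but not the attribute list.
(sxml:child-nodes para)     ; => ("some " (em "emphasised") " text")

;; Child elements only.
(sxml:child-elements para)  ; => ((em "emphasised"))
```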
858
859=== Procedures from sxpath-ext
860
861==== SXML counterparts to W3C XPath Core Functions Library
862
863<procedure>(sxml:string object)</procedure>
864
865The counterpart to XPath 'string' function (section 4.2 XPath 1.0 Rec.).
866Converts a given object to a string.
867
868Notes:
869# When converting a nodeset, document order is not preserved
870# {{number->string}} returns the result in a form which is slightly different from XPath Rec. specification
871
872<procedure>(sxml:boolean object)</procedure>
873
874The counterpart to XPath 'boolean' function (section 4.3 XPath Rec.).
875Converts its argument to a boolean.
876
877<procedure>(sxml:number object)</procedure>
878
879The counterpart to XPath 'number' function (section 4.4 XPath Rec.).
880Converts its argument to a number.
881
882Notes:
883# The argument is not optional (yet?)
884# string->number conversion is not IEEE 754 round-to-nearest
885# NaN is represented as 0
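
The three conversions can be sketched as follows.  This assumes the egg is loaded with {{(require-extension sxml-tools)}}; the values are arbitrary.

```scheme
(require-extension sxml-tools)  ; assumed way of loading this egg

(sxml:string 42)        ; => "42"
(sxml:string #t)        ; => "true"
(sxml:string '())       ; => ""   (empty nodeset)

(sxml:boolean "")       ; => #f   (empty string)
(sxml:boolean "false")  ; => #t   (any non-empty string is true)
(sxml:boolean 0)        ; => #f
(sxml:boolean '())      ; => #f   (empty nodeset)

(sxml:number "42")      ; => 42
(sxml:number #t)        ; => 1
(sxml:number "abc")     ; => 0    (stands in for NaN, see the notes above)
```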
886
887<procedure>(sxml:string-value node)</procedure>
888
889Returns a string value for a given node in accordance with
890XPath Rec. 5.1 - 5.7.
891
892<procedure>(sxml:id id-index)</procedure>
893
894Returns a procedure that accepts a nodeset and returns a nodeset
895containing the elements in the id-index that match the string-values
896of each entry of the nodeset.  XPath Rec. 4.1
897
898The {{id-index}} is an alist with unique IDs as key, and elements as
899values:
900
901  id-index = ( (id-value . element) (id-value . element) ... )
902
==== Comparators for XPath objects

<procedure>(sxml:list-head list n)</procedure>

Returns the first {{n}} members of {{list}}.  Mostly equivalent to
SRFI-1's {{take}} procedure, except that it returns the whole {{list}}
if {{n}} is larger than the length of the list, instead of signalling an error.

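The difference from {{take}} shows up only when {{n}} exceeds the list
length. A small sketch, assuming the egg is loaded with
{{(use sxml-tools)}} (an assumed module name):

  (use sxml-tools) ; assumed module name
  (sxml:list-head '(a b c d) 2)  ; => (a b)
  ;; n is larger than the list: the whole list comes back, no error.
  (sxml:list-head '(a b) 5)      ; => (a b)
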
<procedure>(sxml:merge-sort less-than? list)</procedure>

Returns the sorted list, smallest member first.

  less-than? ::= (lambda (obj1 obj2) ...)

{{less-than?}} returns {{#t}} if {{obj1 < obj2}} with respect to the
given ordering.

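Any two-argument ordering predicate can be passed in. A minimal sketch,
assuming the egg is loaded with {{(use sxml-tools)}} (an assumed module
name):

  (use sxml-tools) ; assumed module name
  (sxml:merge-sort < '(3 1 2))
  ;; => (1 2 3)
  (sxml:merge-sort string<? '("pear" "apple" "fig"))
  ;; => ("apple" "fig" "pear")
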
<procedure>(sxml:equality-cmp bool=? number=? string=?)</procedure>

A helper for the XPath equality operations {{=}} and {{!=}}.  The
{{bool=?}}, {{number=?}} and {{string=?}} arguments are comparison
operations for booleans, numbers and strings respectively.

Returns a procedure that accepts two objects, looks at the first
object's type and applies the correct comparison predicate to it.
Type coercion takes place according to the rules described in the
XPath 1.0 spec, section 3.4 ("Booleans").

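As an illustration, a case-insensitive equality operator can be built by
swapping in a different string comparison. This is a sketch, assuming
the egg is loaded with {{(use sxml-tools)}} (an assumed module name);
{{ci-equal?}} is a name invented for this example:

  (use sxml-tools) ; assumed module name
  ;; ci-equal? is a hypothetical helper, not part of the egg.
  (define ci-equal? (sxml:equality-cmp eq? = string-ci=?))
  (ci-equal? "ABC" "abc")  ; => #t  (two strings: string-ci=? is used)
  (ci-equal? 1 "1")        ; => #t  ("1" is coerced to a number, then = is used)
  (ci-equal? #t "")        ; => #f  ("" is coerced to #f, then eq? is used)
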
<procedure>(sxml:equal? obj1 obj2)</procedure>
<procedure>(sxml:not-equal? obj1 obj2)</procedure>

Equality procedures that use the default comparison operators {{eq?}},
{{=}} and {{string=?}}, or their inverses, respectively.

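A few calls showing the coercions at work. This is a sketch, assuming
the egg is loaded with {{(use sxml-tools)}} (an assumed module name):

  (use sxml-tools) ; assumed module name
  (sxml:equal? 1 "1")          ; => #t  (string coerced to a number)
  (sxml:equal? #t "nonempty")  ; => #t  (string coerced to a boolean)
  (sxml:equal? "a" "b")        ; => #f
  (sxml:not-equal? "a" "b")    ; => #t
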
<procedure>(sxml:relational-cmp op)</procedure>

A helper for the XPath relational operations {{<}}, {{>}}, {{<=}} and {{>=}}
on two XPath objects.  {{op}} is one of these operators.

Returns a procedure that accepts two objects and returns the result of
applying {{op}} to these objects, after converting them according to the
coercion rules described in the XPath 1.0 spec, section 3.4
("Booleans").

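For two non-nodeset arguments, both sides are converted to numbers
before {{op}} is applied. A minimal sketch, assuming the egg is loaded
with {{(use sxml-tools)}} (an assumed module name); {{xpath<}} is a name
invented for this example:

  (use sxml-tools) ; assumed module name
  ;; xpath< is a hypothetical helper, not part of the egg.
  (define xpath< (sxml:relational-cmp <))
  (xpath< "3" 10)  ; => #t  ("3" is converted to the number 3)
  (xpath< #t 1)    ; => #f  (#t is converted to 1, and 1 < 1 is false)
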
==== XPath axes

<procedure>(sxml:ancestor test-pred?)</procedure>

Like {{sxml:parent}}, except it returns all the ancestors that match
{{test-pred?}}, not just the immediate parent.

<procedure>(sxml:ancestor-or-self test-pred?)</procedure>

Like {{sxml:ancestor}}, except it also allows the node itself to match
the predicate.

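Because SXML nodes carry no parent pointers, these axes are assumed to
work like {{sxml:parent}}: the result of {{(sxml:ancestor test-pred?)}}
is first applied to a root node and then to the node of interest. The
sketch below assumes the egg is loaded with {{(use sxml-tools)}} (an
assumed module name) and uses a made-up document:

  (use sxml-tools) ; assumed module name
  ;; Hypothetical sample document.
  (define doc '(*TOP* (a (b (c "text")))))
  ;; Fetch the node from the tree itself, so that it is the same
  ;; object as the one reachable from the root.
  (define c-node (car ((sxml:descendant (ntype?? 'c)) doc)))
  (map car (((sxml:ancestor (ntype?? 'a)) doc) c-node))
  ;; => (a)
  (map car (((sxml:ancestor-or-self (ntype?? 'c)) doc) c-node))
  ;; => (c)
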
<procedure>(sxml:descendant test-pred?)</procedure>

Like {{node-closure}}, except the resulting nodeset is in depth-first
order instead of breadth-first.

<procedure>(sxml:descendant-or-self test-pred?)</procedure>

Like {{sxml:descendant}}, except it also allows the node itself to match
the predicate.

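The depth-first order is easiest to see on a small tree. This is a
sketch, assuming the egg is loaded with {{(use sxml-tools)}} (an assumed
module name); the document is made up for illustration:

  (use sxml-tools) ; assumed module name
  ;; Hypothetical sample document: a has children b and c; d is a's sibling.
  (define doc '(root (a (b) (c)) (d)))
  ;; The descendants of a are visited before its sibling d.
  (map car ((sxml:descendant sxml:element?) doc))
  ;; => (a b c d)
  (map car ((sxml:descendant-or-self sxml:element?) doc))
  ;; => (root a b c d)
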
<procedure>(sxml:following test-pred?)</procedure>

Returns a procedure that accepts a root node and returns a new
procedure that accepts a node and returns all nodes matching the
predicate that follow this node in the document source.

<procedure>(sxml:following-sibling test-pred?)</procedure>

Like {{sxml:following}}, except only siblings (nodes at the same level
under the same parent) are returned.

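A sketch of the two-step call (root node first, then the node of
interest). It assumes the egg is loaded with {{(use sxml-tools)}} (an
assumed module name) and uses a made-up document:

  (use sxml-tools) ; assumed module name
  ;; Hypothetical sample document.
  (define doc '(root (a (b) (c)) (d)))
  ;; Take the b element from the tree itself, not a fresh copy.
  (define b-node (car ((sxml:descendant (ntype?? 'b)) doc)))
  ;; Everything after b in document order.
  (map car (((sxml:following sxml:element?) doc) b-node))
  ;; => (c d)
  ;; Only the later children of b's parent.
  (map car (((sxml:following-sibling sxml:element?) doc) b-node))
  ;; => (c)
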
<procedure>(sxml:preceding test-pred?)</procedure>

Returns a procedure that accepts a root node and returns a new
procedure that accepts a node and returns all nodes matching the
predicate that precede this node in the document source.

<procedure>(sxml:preceding-sibling test-pred?)</procedure>

Like {{sxml:preceding}}, except only siblings (nodes at the same level
under the same parent) are returned.

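The preceding axes are called the same way, with the root node first and
the node of interest second. This is a sketch, assuming the egg is
loaded with {{(use sxml-tools)}} (an assumed module name); the document
is made up for illustration:

  (use sxml-tools) ; assumed module name
  ;; Hypothetical sample document.
  (define doc '(root (a (b) (c)) (d)))
  ;; Take the nodes from the tree itself, not fresh copies.
  (define c-node (car ((sxml:descendant (ntype?? 'c)) doc)))
  (define d-node (car ((sxml:descendant (ntype?? 'd)) doc)))
  ;; Ancestors (a and root) are not part of the preceding axis.
  (map car (((sxml:preceding sxml:element?) doc) c-node))
  ;; => (b)
  (map car (((sxml:preceding-sibling sxml:element?) doc) d-node))
  ;; => (a)
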
<procedure>(sxml:namespace test-pred?)</procedure>

Returns a procedure that accepts a nodeset and returns the namespace
lists of the nodes matching {{test-pred?}}.

== About this egg

=== Author

[[http://okmij.org/ftp/|Oleg Kiselyov]], [[http://www196.pair.com/lisovsky/|Kirill Lisovsky]], [[http://modis.ispras.ru/Lizorkin/index.html|Dmitry Lizorkin]].

=== Version history

; 1.3 : Port to Chicken 4, separation of sxml-tools/sxpathlib from sxpath
; 1.2 : uses {{string-intersperse}} and {{concatenate}} instead of {{apply string-append}} and {{apply append}} to circumvent argument count limit [felix]
; 1.1 : exports/imports support, split stx-engine into syntax & support files [Kon Lovett]
; 1.0 : Initial release [zbigniew]

=== License

The sxml-tools are in the public domain.