source: project/wiki/eggref/4/sxpath @ 13426

Last change on this file since 13426 was 13426, checked in by sjamaan, 12 years ago

Document the sxpath function library procedures

1[[tags:eggs]]
2
3This is version 0.1 of the '''sxpath''' extension library for Chicken Scheme.
4
5[[toc:]]
6
7== Description
8
This egg packages the sxpath parts of the [[http://cvs.sourceforge.net/viewcvs.py/ssax/sxml-tools/|sxml-tools]] from the [[http://ssax.sf.net|SSAX project]] at Sourceforge.
Because txpath and sxpath are interwoven, this egg also includes the txpath parts.
11
12== Documentation
13
14This egg provides the sxpath-related tools from the sxml-tools available
15in the SSAX/SXML Sourceforge project.
16
It is split up into three modules: [[#sxpath|sxpath]], [[#txpath|txpath]]
and [[#sxpath-lolevel|sxpath-lolevel]]. {{sxpath}} depends on {{txpath}} and both
modules depend on {{sxpath-lolevel}}.
20
21Much documentation is available at
22[[http://www196.pair.com/lisovsky/xml/index.html|Lisovsky's XML page]]
23and the [[http://ssax.sf.net|SSAX homepage]].
24
25The initial documentation on this wiki page came straight from the
26comments in the extremely well-documented source code. It's
27recommended you read the code if you want to learn more.
28
29== sxpath
30
31This is the preferred interface to use.  It allows you to query the
32SXML document tree using an s-expression based language, in which you
33can also use arbitrary procedures and even "classic" textual XPath
34(see [[#txpath|below]] for docs on that).
35
A complete description of how to use this is outside the scope of this
egg documentation. See
[[http://www196.pair.com/lisovsky/query/sxpath/|the introduction to SXPath]]
for that.
40
41<procedure>(sxpath path [ns-binding])</procedure>
42
43Returns a procedure that accepts an SXML document tree and returns a
44nodeset (list of nodes) that match the {{path}} expression.
45
46The optional {{ns-binding}} argument is an alist of namespace
47bindings.  It is used to map abbreviated namespace prefixes to full
48URI strings but ''only for textual XPath strings'' embedded in the
49{{path}} expression.
50
51It can be useful to compare the following examples to those for
52[[#txpath|txpath]].
53
54<examples>
55<example>
56<expr>
57;; selects all the 'item' elements that have an 'olist' parent
58;; (which is not root) and that are in the same document as the context node
59((sxpath `(// olist item))
60 '(doc (olist (item "1")) (item "2") (nested (olist (item "3")))))
61</expr>
62<result>
63((item "1") (item "3"))
64</result>
65</example>
66<example>
67<expr>
68;; selects the 'chapter' children of the context node that have one or
69;; more 'title' children with string-value equal to 'Introduction'
70((sxpath '((chapter ((equal? (title "Introduction"))))))
71 '(text  (chapter (title "Introduction"))  (chapter "No title for this chapter")  (chapter (title "Conclusion"))))
72</expr>
73<result>
74((chapter (title "Introduction")))
75</result>
76</example>
77<example>
78<expr>
79;; (sxpath string-expr) is equivalent to (txpath string-expr)
80((sxpath "chapter[title='Introduction']")
81 '(text  (chapter (title "Introduction"))  (chapter "No title for this chapter")  (chapter (title "Conclusion"))))
82</expr>
83<result>
84((chapter (title "Introduction")))
85</result>
86</example>
87</examples>
88
89
90<procedure>(if-sxpath path)</procedure>
91
92Like {{sxpath}}, only returns {{#f}} instead of the empty list if
93nothing matches (so it does ''not'' always return a nodeset).
94
95<procedure>(car-sxpath path)</procedure>
96
97Like {{sxpath}}, only instead of a nodeset it returns the first node
98found.  If no node was found, return '''an empty list'''.
99
100<procedure>(if-car-sxpath path)</procedure>
101
102Like {{car-sxpath}}, only returns {{#f}} instead of the empty list if
103nothing matches.
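
The four variants differ only in what they return when nothing matches (and in whether they return a nodeset or a single node).  The following sketch puts them side by side; it assumes the egg is loaded with {{(use sxpath)}}, and the {{doc}} tree is made up for illustration.

<examples>
<example>
<expr>
(use sxpath)

;; A small, made-up document used only for illustration.
(define doc '(doc (item "1") (item "2")))

((sxpath '(item)) doc)            ; => ((item "1") (item "2"))
((sxpath '(missing)) doc)         ; => ()
((if-sxpath '(missing)) doc)      ; => #f
((car-sxpath '(item)) doc)        ; => (item "1")
((car-sxpath '(missing)) doc)     ; => ()
((if-car-sxpath '(item)) doc)     ; => (item "1")
((if-car-sxpath '(missing)) doc)  ; => #f
</expr>
</example>
</examples>

The {{if-}} variants are convenient in {{cond}} or {{and}} chains, because the empty list counts as true in Scheme.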
104
105<procedure>(sxml:id-alist node . lpaths)</procedure>
106
107Builds an index as a list of {{(ID_value . element)}} pairs for given
108{{node}}. {{lpaths}} are location paths for attributes of type ID (ie,
109sxpath expressions that tell it how to find the ID attribute).
110
111Note: location paths ''must'' be of the form {{(expr '@ attrib-name)}}.
112
113See also {{sxml:lookup}} below, in {{sxpath-lolevel}}, which can use
114this index.
115
116<examples>
117<example>
118<expr>
119;; TODO: find out why location paths must be of the form (expr '@ symbol)
120;;       or if this description is incorrect
121(sxml:id-alist
122 '(div (span (@ (id "hi")) "there")
123       (div (@ (id "hello")) "dude")
124       (a (@ (id "link")) "click here"))
125 '(span @ id) '(a @ id))
126</expr>
127<result>
128(("hi" . (span (@ (id "hi")) "there"))
129 ("link" . (a (@ (id "link")) "click here")))
130</result>
131</example>
132</examples>
133
134== txpath
135
136This section documents the txpath interface. This interface is mostly
137useful for programs that deal exclusively with "legacy" textual XPath
138queries.
139
140=== Primary interface
141
142The following procedures are the main interface one would use in
143practice. There are also more low-level procedures (see next section),
144which one could use to build txpath extensions.
145
146<procedure>(sxml:xpath string . ns-binding)</procedure>
147<procedure>(txpath string . ns-binding)</procedure>
148<procedure>(sxml:xpath+root string . ns-binding)</procedure>
149<procedure>(sxml:xpath+root+vars string . ns-binding)</procedure>
150
151Returns a procedure that accepts an SXML document tree and returns a
152nodeset (list of nodes) that match the XPath expression {{string}}.
153
154The optional {{ns-binding}} argument is an alist of namespace
155bindings.  It is used to map abbreviated namespace prefixes to full
156URI strings.
157
158{{(txpath x)}} is equivalent to {{(sxpath x)}} whenever {{x}} is a
159string.  The {{txpath}}, {{sxml:xpath+root}} and
160{{sxml:xpath+root+vars}} procedures are currently all aliases for
161{{sxml:xpath}}, which exist for backwards compatibility reasons.
162
163It's useful to compare the following examples to the above examples
164for [[#sxpath|sxpath]].
165
166<examples>
167<example>
168<expr>
169;; selects all the 'item' elements that have an 'olist' parent
170;; (which is not root) and that are in the same document as the context node
171((txpath "//olist/item")
172 '(doc (olist (item "1")) (item "2") (nested (olist (item "3")))))
173</expr>
174<result>
175((item "1") (item "3"))
176</result>
177</example>
178<example>
179<expr>
180;; Same example as above, but now with a namespace prefix of 'x',
181;; which is bound to the namespace "bar" in the ns-binding parameter.
182((txpath "//x:olist/item" '((x . "bar")))
183 '(doc (bar:olist (item "1")) (item "2") (nested (olist (item "3")))))
184</expr>
185<result>
186((item "1"))
</result>
</example>
188<example>
189<expr>
190;; selects the 'chapter' children of the context node that have one or
191;; more 'title' children with string-value equal to 'Introduction'
192((txpath "chapter[title='Introduction']")
193 '(text  (chapter (title "Introduction"))  (chapter "No title for this chapter")  (chapter (title "Conclusion"))))
194</expr>
195<result>
196((chapter (title "Introduction")))
197</result>
198</example>
199</examples>
200
201<procedure>(sxml:xpath+index string . ns-binding)</procedure>
202
203This procedure returns the result of {{sxml:xpath}} consed onto
204{{#t}}.  If the {{sxml:xpath}} would return {{#f}}, this returns
205{{#f}} instead.
206
207It is provided solely for backwards compatibility.
208
209
210<procedure>(sxml:xpointer string . ns-binding)</procedure>
211<procedure>(sxml:xpointer+root+vars string . ns-binding)</procedure>
212
213Returns a procedure that accepts an SXML document tree and returns a
214nodeset (list of nodes) that match the XPointer expression {{string}}.
215
216The optional {{ns-binding}} argument is an alist of namespace
217bindings.  It is used to map abbreviated namespace prefixes to full
218URI strings.
219
Currently, only the XPointer {{xmlns()}} and {{xpointer()}} schemes
are implemented; the {{element()}} scheme is not.
222
223<examples>
224<example>
225<expr>
226;; selects all the 'item' elements that have an 'olist' parent
227;; (which is not root) and that are in the same document as the context node.
228;; Equivalent to (txpath "//olist/item").
229((sxml:xpointer "xpointer(//olist/item)")
230 '(doc (olist (item "1")) (item "2") (nested (olist (item "3")))))
231</expr>
232<result>
233((item "1") (item "3"))
234</result>
235</example>
236<example>
237<expr>
;; An example with a namespace prefix, now using the XPointer xmlns()
;; scheme instead of the ns-binding parameter. xmlns() always takes a full
;; namespace name on its right-hand side, never a bound shortcut.
241((sxml:xpointer "xmlns(x=bar)xpointer(//x:olist/item)")
242 '(doc (bar:olist (item "1")) (item "2") (nested (olist (item "3")))))
243</expr>
244<result>
245((item "1"))
246</result>
247</example>
248</examples>
249
250<procedure>(sxml:xpointer+index string . ns-binding)</procedure>
251
252This procedure returns the result of {{sxml:xpointer}} consed onto
253{{#t}}.  If the {{sxml:xpointer}} would return {{#f}}, this returns
254{{#f}} instead.
255
256It is provided solely for backwards compatibility.
257
258
259<procedure>(sxml:xpath-expr string . ns-binding)</procedure>
260
261Returns a procedure that accepts an SXML node and returns {{#t}} if
262the node matches the {{string}} expression.  This is an expression of
263type {{Expr}}, which is whatever you can put in a predicate (between
264square brackets after a node name).
265
266The optional {{ns-binding}} argument is an alist of namespace
267bindings.  It is used to map abbreviated namespace prefixes to full
268URI strings.
269
270<examples>
271<example>
272<expr>
273;; Does the node have a class attribute with "content" as value?
274((sxml:xpath-expr "@class=\"content\"")
275 '(div (@ (class "content")) (p "Lorem ipsum")))
276</expr>
277<result>
278#t
279</result>
280</example>
281<example>
282<expr>
283;; Does the node have a paragraph with string value of "Lorem ipsum"?
284((sxml:xpath-expr "p=\"Lorem ipsum\"")
285 '(div (@ (class "content")) (p "Lorem ipsum")))
286</expr>
287<result>
288#t
289</result>
290</example>
291<example>
292<expr>
293;; Does the node have a "p" child node with string value of "Blah"?
294((sxml:xpath-expr "p=\"Blah\"")
295 '(div (@ (class "content")) (p "Lorem ipsum")))
296</expr>
297<result>
298#f
299</result>
300</example>
301</examples>
302
303
304=== XPath function library
305
306The procedures documented in this section can be used to implement a
307custom xpath traverser.  Unlike the sxpath low-level procedures, they
308are not in a separate library because they are in the same file as the
309high-level procedures, so the library size is not impacted by
310splitting them up.  When importing the {{txpath}} module you can
311simply leave these procedures out, so splitting them up into a
312separate library would provide no benefits.
313
314These procedures implement the core XPath functions, as described in
315[[http://www.w3.org/TR/xpath#corelib|The XPath specification, section 4]].
316
317All of the following procedures return procedures that accept 4
318arguments, which together make up (part of) the XPath context:
319
320  (lambda (nodeset root-node context var-binding) ...)
321
322The {{nodeset}} argument is the nodeset (a list of nodes) that is
323currently under consideration.  The {{root-node}} argument is a
324nodeset containing only one element: the root node of the document.
325The {{context}} argument is a list of two numbers; the position and
326size of the context.  The {{var-binding}} argument is an alist of
327XPath variable bindings.
328
329The arguments to each of these core procedures, if any, are all
330procedures of the same type as they return.  For example,
331{{sxml:core-local-name}} accepts an optional procedure which accepts a
332nodeset, a root-node, a context, a var-binding and returns a nodeset.
333Of this nodeset, the local part of the name of the first node (if any)
334is returned.  The values for each of these arguments are just those
335passed to {{sxml:core-local-name}}.
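
As a rough sketch of this calling convention, the following applies two of the core procedures by hand.  It assumes the procedures are imported with {{(use txpath)}}; the helper {{current-nodeset}} and all context values are made up for illustration, and neither procedure shown here looks at the {{context}} argument.

<examples>
<example>
<expr>
(use txpath)

;; Hypothetical argument procedure: it ignores the rest of the context
;; and simply hands back the nodeset it was given.
(define (current-nodeset nodeset root-node context var-binding)
  nodeset)

;; Made-up context values, for illustration only.
(define nodes '((item "1") (item "2") (item "3")))
(define root  '((doc (item "1") (item "2") (item "3"))))

;; count() applied to the nodeset produced by the argument procedure
((sxml:core-count current-nodeset) nodes root '(1 3) '())  ; => 3

;; true() takes no arguments and ignores the context entirely
((sxml:core-true) nodes root '(1 3) '())                   ; => #t
</expr>
</example>
</examples>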
336
337==== Node set functions
338
339* <procedure>(sxml:core-last)</procedure>
340* <procedure>(sxml:core-position)</procedure>
341* <procedure>(sxml:core-count node-set)</procedure>
342* <procedure>(sxml:core-id object)</procedure>
343* <procedure>(sxml:core-local-name [node-set])</procedure>
344* <procedure>(sxml:core-namespace-uri [node-set])</procedure>
345* <procedure>(sxml:core-name [node-set])</procedure>
346
347==== String functions
348
349* <procedure>(sxml:core-string [object])</procedure>
350* <procedure>(sxml:core-concat [string ...])</procedure>
351* <procedure>(sxml:core-starts-with string prefix)</procedure>
352* <procedure>(sxml:core-contains string substring)</procedure>
353* <procedure>(sxml:core-substring-before string separator)</procedure>
354* <procedure>(sxml:core-substring-after string separator)</procedure>
355* <procedure>(sxml:core-substring string numeric-offset [length])</procedure>
356* <procedure>(sxml:core-string-length [string])</procedure>
357* <procedure>(sxml:core-normalize-space [string])</procedure>
358* <procedure>(sxml:core-translate string from to)</procedure>
359
360==== Boolean functions
361
362* <procedure>(sxml:core-boolean object)</procedure>
363* <procedure>(sxml:core-not boolean)</procedure>
364* <procedure>(sxml:core-true)</procedure>
365* <procedure>(sxml:core-false)</procedure>
366* <procedure>(sxml:core-lang lang-code)</procedure>
367
368==== Number functions
369
370* <procedure>(sxml:core-number [object])</procedure>
371* <procedure>(sxml:core-sum node-set)</procedure>
372* <procedure>(sxml:core-floor number)</procedure>
373* <procedure>(sxml:core-ceiling number)</procedure>
374* <procedure>(sxml:core-round number)</procedure>
375
376==== Parameter list
377
378<constant>sxml:classic-params</constant>
379
380This is a very long list of parameters containing parser and traversal
381information for the textual xpath parser engine.  This corresponds to
382the "function library" mentioned in the
383[[http://www.w3.org/TR/xpath#section-Introduction|introduction of the XPath spec]].
You will have to read the source code for details on how exactly to use it.
385
386
387== sxpath-lolevel
388
389This section documents the low-level sxpath interface. It includes
390mostly-generic list and SXML operators.
391
392It consists of the extensions defined in {{sxml-tools.scm}} plus
393{{sxpathlib}} and {{sxpath-ext}}.  This is equivalent to the
394"low-level sxpath interface" described at
395[[http://www196.pair.com/lisovsky/query/sxpath/|the introduction to SXPath]].
396
397These utilities are useful when you want to query SXML document trees,
398but full sxpath would be overkill.  Most of these procedures are
399faster than their sxpath equivalent, because they are very specific.
400But this also means they are very low-level, so you should use them
401only if you know what you're doing.
402
403
404==== Predicates
405
406<procedure>(sxml:empty-element? obj)</procedure>
407
Predicate which returns {{#t}} if the given element {{obj}} is empty.
Empty elements have no nested elements, text nodes, PIs, comments or
entities, but may contain attributes or a namespace-id.  It is the SXML
counterpart of the XML {{empty-element}}.
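
A minimal sketch of what counts as empty, assuming the module is loaded with {{(use sxpath-lolevel)}}; the elements are made up for illustration.

<examples>
<example>
<expr>
(use sxpath-lolevel)

;; Attributes alone do not make an element non-empty
(sxml:empty-element? '(br (@ (class "spacer"))))       ; => #t

;; A text node child does
(sxml:empty-element? '(p "Hello"))                     ; => #f

;; So does a comment
(sxml:empty-element? '(p (*COMMENT* "nothing here")))  ; => #f
</expr>
</example>
</examples>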
412
413<procedure>(sxml:shallow-normalized? obj)</procedure>
414
415Returns {{#t}} if the given {{obj}} is a shallow-normalized SXML
416element.  The element itself has to be normalised but its nested
417elements are not tested.
418
419<procedure>(sxml:normalized? obj)</procedure>
420
421Returns {{#t}} if the given {{obj}} is a normalized SXML element.  The element
422itself and all its nested elements have to be normalised.
423
424<procedure>(sxml:shallow-minimized? obj)</procedure>
425
426Returns {{#t}} if the given {{obj}} is a shallow-minimized SXML
427element.  The element itself has to be minimised but its nested
428elements are not tested.
429
430<procedure>(sxml:minimized? obj)</procedure>
431
432Returns {{#t}} if the given {{obj}} is a minimized SXML element.  The
433element itself and all its nested elements have to be minimised.
434
435==== Accessors
436
437These procedures obtain information about nodes, or their direct
438children.  They don't traverse subtrees.
439
440===== Normalization-independent accessors
441
442These accessors can be used on arbitrary, non-normalized SXML trees.
443Because of this, they are generally slower than the
444normalization-dependent variants listed in the next section.
445
446<procedure>(sxml:name node)</procedure>
447
Returns the name of the given SXML node. It is introduced for the sake of
encapsulation.
450
451<procedure>(sxml:element-name obj)</procedure>
452
A checked version of {{sxml:name}}, which returns {{#f}} if the given
{{obj}} is not an SXML element. Otherwise returns its name.

<procedure>(sxml:node-name obj)</procedure>

Safe version of {{sxml:name}}, which returns {{#f}} if the given {{obj}}
is not an SXML node.  Otherwise returns its name.

The difference between this and {{sxml:element-name}} is that a node
can be one of {{@}}, {{@@}}, {{*PI*}}, {{*COMMENT*}} or {{*ENTITY*}}
while an element must be a real element (any symbol not in that set is
considered to be an element).
465
466<procedure>(sxml:ncname node)</procedure>
467
468Like {{sxml:name}}, except returns only the local part of the name
469(called an "NCName" in the
470[[http://www.w3.org/TR/xml-names/|XML namespaces spec]]).
471
The node's name is interpreted as a "Qualified Name": a
colon-separated name whose last component is considered to be the
local part.  If the name contains no colons, the name itself is
returned.
476
477'''Important:''' Please note that while an SXML name is a symbol, this
478function returns a string.
479
480<procedure>(sxml:name->ns-id sxml-name)</procedure>
481
482Given a node name, return the namespace part of the name (called a
483{{namespace-id}}).  If the name contains no colons, returns {{#f}}.  See
484{{sxml:ncname}} for more info.
485
486'''Important:''' Please note that while an SXML name is a symbol, this
487function returns a string.
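
The name accessors above can be compared on a few made-up nodes.  This sketch assumes the module is loaded with {{(use sxpath-lolevel)}}.

<examples>
<example>
<expr>
(use sxpath-lolevel)

(sxml:name '(c:part (@ (id "p1"))))          ; => c:part
(sxml:element-name '(c:part (@ (id "p1"))))  ; => c:part
(sxml:element-name '(@ (id "p1")))           ; => #f, an attribute list is not an element
(sxml:node-name '(@ (id "p1")))              ; => @
(sxml:node-name "just text")                 ; => #f

;; The next two return strings, not symbols
(sxml:ncname '(c:part (@ (id "p1"))))        ; => "part"
(sxml:name->ns-id 'c:part)                   ; => "c"
(sxml:name->ns-id 'part)                     ; => #f
</expr>
</example>
</examples>

Note that {{sxml:ncname}} takes a node while {{sxml:name->ns-id}} takes the name symbol itself.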
488
489<procedure>(sxml:content obj)</procedure>
490
491Retrieve the contents of an SXML element or nodeset.  Any non-element
492nodes (attributes, processing instructions, etc) are discarded,
493while the elements and text nodes are returned as a list of strings
494and nested elements in document order.  This list is empty if {{obj}}
495is an empty element or empty list.
496
The inner elements are unmodified, so they still contain their attributes
as well as any comments or other non-element nodes.
499
500<examples>
501<example>
502<expr>
503(sxml:content
504  '(div (@ (class "content"))
505        (*COMMENT* "main contents start here")
506         "The document moved "
507         (a (@ (href "/other.xml")) "here")))
508</expr>
509<result>("The document moved " (a (@ (href "/other.xml")) "here"))</result>
510</example>
511</examples>
512
513<procedure>(sxml:text node)</procedure>
514
515Returns a string which combines all the character data from text node
516children of the given SXML element or "" if there are no text node
517children.  Note that it does not include text from descendant nodes,
518only direct children.
519
520<examples>
521<example>
522<expr>
523(sxml:text
524  '(div (@ (class "content"))
525        (*COMMENT* "main contents start here")
526         "The document moved "
527         (a (@ (href "/other.xml")) "here")))
528</expr>
<result>"The document moved "</result>
530</example>
531</examples>
532
===== Normalization-dependent accessors

"Universal" accessors are less efficient but may be used on
non-normalized SXML.  These safe accessors are named with the suffix
{{-u}}, for "universal".

"Fast" accessors are optimized for normalized SXML data.  They are not
applicable to arbitrary non-normalized SXML data.  Their names have no
specific suffix.
542
543<procedure>(sxml:content-raw obj)</procedure>
544
Returns all the content of a normalized SXML element except the attr-list
and aux-list.  Thus it includes {{PI}}, {{COMMENT}} and {{ENTITY}}
nodes as well as the {{TEXT}} and {{ELEMENT}} nodes returned by
{{sxml:content}}.  Returns a list of nodes in document order, or the empty
list if {{obj}} is an empty element or an empty list.
550
551This function is faster than {{sxml:content}}.
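
For comparison with the {{sxml:content}} example above, here is the same (normalized) element passed to {{sxml:content-raw}}.  The sketch assumes the module is loaded with {{(use sxpath-lolevel)}}.

<examples>
<example>
<expr>
(use sxpath-lolevel)

(define elem
  '(div (@ (class "content"))
        (*COMMENT* "main contents start here")
        "The document moved "
        (a (@ (href "/other.xml")) "here")))

;; The attribute list is dropped, but the comment node is kept
(sxml:content-raw elem)
;; => ((*COMMENT* "main contents start here")
;;     "The document moved "
;;     (a (@ (href "/other.xml")) "here"))
</expr>
</example>
</examples>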
552
553<procedure>(sxml:attr-list-u obj)</procedure>
554
555Returns the list of attributes for given element or nodeset.  Analog
556of {{((sxpath '(@ *)) obj)}}.  Empty list is returned if there is no
557list of attributes.
558
559<procedure>(sxml:aux-list obj)</procedure>
560<procedure>(sxml:aux-list-u obj)</procedure>
561
562Returns the list of auxiliary nodes for given element or nodeset.
563Analog of {{((sxpath '(@@ *)) obj)}}.  Empty list is returned if a
564list of auxiliary nodes is absent.
565
566<procedure>(sxml:aux-node obj aux-name)</procedure>
567
Returns the first aux-node named {{aux-name}} in the SXML element
{{obj}}, or {{#f}} if such a node is absent.
570
571'''NOTE:''' it returns just the ''first'' node found even if multiple
572nodes are present, so it's mostly intended for nodes with unique names.
573Use {{sxml:aux-nodes}} if you want all of them.
574
575<procedure>(sxml:aux-nodes obj aux-name)</procedure>
576   
Returns a list of the aux-nodes named {{aux-name}} in the SXML element
{{obj}}, or {{'()}} if there are none.
579
580<procedure>(sxml:attr obj attr-name)</procedure>
581
582Returns the value of the attribute with name {{attr-name}} in the
583given SXML element {{obj}}, or {{#f}} if no such attribute exists.
584
585<procedure>(sxml:attr-from-list attr-list name)</procedure>
586
Returns the value of the attribute with name {{name}} in the
given list of attributes {{attr-list}}, or {{#f}} if no such attribute
exists.  The list of attributes can be obtained from an element using
the {{sxml:attr-list}} procedure.
591
592<procedure>(sxml:num-attr obj attr-name)</procedure>
593
594Returns the value of the numerical attribute with name {{attr-name}}
595in the given SXML element {{obj}}, or {{#f}} if no such attribute
596exists.  This value is converted from a string to a number.
597
598<procedure>(sxml:attr-u obj attr-name)</procedure>
599
Accessor for an attribute {{attr-name}} of the given SXML element {{obj}},
which may also be an attributes-list or a nodeset (usually the content of
an SXML element).
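
The attribute accessors can be sketched on a made-up, normalized element as follows, assuming the module is loaded with {{(use sxpath-lolevel)}}.

<examples>
<example>
<expr>
(use sxpath-lolevel)

(define link '(a (@ (href "/other.xml") (tabindex "3")) "here"))

(sxml:attr link 'href)          ; => "/other.xml"
(sxml:attr link 'title)         ; => #f
(sxml:num-attr link 'tabindex)  ; => 3  (a number, not the string "3")

;; Looking up in an attribute list obtained separately
(sxml:attr-from-list (sxml:attr-list link) 'href)  ; => "/other.xml"

;; The universal variant also accepts an attributes-list node
(sxml:attr-u '(@ (href "/other.xml")) 'href)       ; => "/other.xml"
</expr>
</example>
</examples>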
603
604<procedure>(sxml:ns-list obj)</procedure>
605
606Returns the list of namespaces for given element.  Analog of
607{{((sxpath '(@@ *NAMESPACES* *)) obj)}}.  The empty list is returned
608if there are no namespaces.
609
610<procedure>(sxml:ns-id->nodes obj namespace-id)</procedure>
611
612Returns a list of namespace information lists that match the given
613{{namespace-id}} in SXML element {{obj}}.  Analog of
614{{((sxpath '(@@ *NAMESPACES* namespace-id)) obj)}}.
615The empty list is returned if there is no namespace with the given
616{{namespace-id}}.
617
618<examples>
619<example>
620<expr>
621(sxml:ns-id->nodes
622  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")))) 'c)
623</expr>
624<result>((c "http://www.cars.com/xml"))</result>
625</example>
626</examples>
627
628<procedure>(sxml:ns-id->uri obj namespace-id)</procedure>
629
630Returns the URI for the (first) namespace matching the given
631{{namespace-id}}, or {{#f}} if no namespace matches the given
632{{namespace-id}}.
633
634<examples>
635<example>
636<expr>
637(sxml:ns-id->uri
638  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")))) 'c)
639</expr>
640<result>"http://www.cars.com/xml"</result>
641</example>
642</examples>
643
644<procedure>(sxml:ns-uri->nodes obj uri)</procedure>
645
646Returns a list of namespace information lists that match the given
647{{uri}} in SXML element {{obj}}.
648
649<examples>
650<example>
651<expr>
652(sxml:ns-uri->nodes
653  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")
654                                 (d "http://www.cars.com/xml"))))
655  "http://www.cars.com/xml")
656</expr>
657<result>((c "http://www.cars.com/xml") (d "http://www.cars.com/xml"))</result>
658</example>
659</examples>
660
661<procedure>(sxml:ns-uri->id obj uri)</procedure>
662
663Returns the namespace id for the (first) namespace matching the given
664{{uri}}, or {{#f}} if no namespace matches the given {{uri}}.
665
666<examples>
667<example>
668<expr>
669(sxml:ns-uri->id
670  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")
671                                 (d "http://www.cars.com/xml"))))
672  "http://www.cars.com/xml")
673</expr>
674<result>c</result>
675</example>
676</examples>
677
678<procedure>(sxml:ns-id ns-list)</procedure>
679
680Given a namespace information list {{ns-list}}, returns the namespace ID.
681
682<procedure>(sxml:ns-uri ns-list)</procedure>
683
684Given a namespace information list {{ns-list}}, returns the namespace URI.
685
686<procedure>(sxml:ns-prefix ns-list)</procedure>
687
688Given a namespace information list {{ns-list}}, returns the namespace
689prefix if it is present in the list.  If it's not present, returns the
690namespace ID.
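
A short sketch of these three accessors, assuming the module is loaded with {{(use sxpath-lolevel)}}.  It also assumes that an optional prefix is stored as the third element of the namespace information list; the {{cars}} prefix is made up for illustration.

<examples>
<example>
<expr>
(use sxpath-lolevel)

(sxml:ns-id  '(c "http://www.cars.com/xml"))          ; => c
(sxml:ns-uri '(c "http://www.cars.com/xml"))          ; => "http://www.cars.com/xml"

;; No prefix stored: falls back to the namespace ID
(sxml:ns-prefix '(c "http://www.cars.com/xml"))       ; => c

;; Assumed layout with an explicit prefix as third element
(sxml:ns-prefix '(c "http://www.cars.com/xml" cars))  ; => cars
</expr>
</example>
</examples>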
691
692==== Data modification procedures
693
Constructors and mutators for normalized SXML data.
695 
696'''Important:''' These functions are optimized for normalized SXML
697data.  They are ''not'' applicable to arbitrary non-normalized SXML
698data.
699
700Most of the functions are provided in two variants:
701
# Destructive functions, which update the given element in place (linear update).  Their names end with an exclamation mark.
703# Pure functions without side-effects which return modified elements.
704
705
706<procedure>(sxml:change-content! obj new-content)</procedure>
707<procedure>(sxml:change-content obj new-content)</procedure>
708
709Change the content of given SXML element {{obj}} to {{new-content}}.
710If {{new-content}} is an empty list then the {{obj}} is transformed to
711an empty element.  The resulting SXML element is normalized.
712
713<procedure>(sxml:change-attrlist obj new-attrlist)</procedure>
714<procedure>(sxml:change-attrlist! obj new-attrlist)</procedure>
715
716Change the attribute list of the given SXML element {{obj}} to
717{{new-attrlist}}.
718
719<procedure>(sxml:change-name obj new-name)</procedure>
720<procedure>(sxml:change-name! obj new-name)</procedure>
721
722Change the name of the given SXML element {{obj}} to {{new-name}}.
723
724<procedure>(sxml:add-attr obj attr)</procedure>
725<procedure>(sxml:add-attr! obj attr)</procedure>
726
727Returns the given SXML element {{obj}} with the attribute {{attr}}
728added to the attribute list, or {{#f}} if the attribute already exists.
729
730<procedure>(sxml:change-attr obj attr)</procedure>
731<procedure>(sxml:change-attr! obj attr)</procedure>
732
Returns the SXML element {{obj}} with the value of attribute {{attr}} changed,
or {{#f}} if there is no attribute with the given name.
735
736{{attr}} is a list like it would occur as a member of an attribute
737list: {{(attr-name attr-value)}}.
738   
<procedure>(sxml:set-attr obj attr)</procedure>
<procedure>(sxml:set-attr! obj attr)</procedure>
741
742Returns SXML element {{obj}} with changed value of attribute {{attr}}.
743If there is no such attribute the new one is added.
744
745{{attr}} is a list like it would occur as a member of an attribute
746list: {{(attr-name attr-value)}}.
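
The difference between {{sxml:add-attr}}, {{sxml:change-attr}} and {{sxml:set-attr}} can be sketched with the pure variants.  This assumes the module is loaded with {{(use sxpath-lolevel)}}; the {{link}} element is made up, and results are inspected through {{sxml:attr}} so the sketch does not rely on the order of the resulting attribute list.

<examples>
<example>
<expr>
(use sxpath-lolevel)

(define link '(a (@ (href "/old.xml")) "here"))

;; add-attr only succeeds for attributes that are not there yet
(sxml:attr (sxml:add-attr link '(title "Moved")) 'title)      ; => "Moved"
(sxml:add-attr link '(href "/new.xml"))                       ; => #f

;; change-attr only succeeds for attributes that already exist
(sxml:attr (sxml:change-attr link '(href "/new.xml")) 'href)  ; => "/new.xml"
(sxml:change-attr link '(title "Moved"))                      ; => #f

;; set-attr handles both cases
(sxml:attr (sxml:set-attr link '(href "/new.xml")) 'href)     ; => "/new.xml"
(sxml:attr (sxml:set-attr link '(title "Moved")) 'title)      ; => "Moved"

;; The pure variants leave the original element untouched
(sxml:attr link 'href)                                        ; => "/old.xml"
</expr>
</example>
</examples>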
747
748<procedure>(sxml:add-aux obj aux-node)</procedure>
749<procedure>(sxml:add-aux! obj aux-node)</procedure>
750
751Returns SXML element {{obj}} with an auxiliary node {{aux-node}} added.
752
753<procedure>(sxml:squeeze obj)</procedure>
754<procedure>(sxml:squeeze! obj)</procedure>
755
756Returns a minimized and normalized SXML element {{obj}} with empty
757lists of attributes and aux-lists eliminated, in {{obj}} and all its
758descendants.
759   
760<procedure>(sxml:clean obj)</procedure>
761
762Returns a minimized and normalized SXML element {{obj}} with empty
763lists of attributes and '''all''' aux-lists eliminated, in {{obj}} and
764all its descendants.
765
766
767==== Sxpath-related procedures
768
769<procedure>(select-first-kid test-pred?)</procedure>
770
Given a node, returns the first child that satisfies
{{test-pred?}}.  Given a nodeset, traverses the set until a node is
found whose first child matches the predicate.  Returns {{#f}} if
no such child is found.
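
A minimal sketch of the single-node case, assuming the module is loaded with {{(use sxpath-lolevel)}} and using the standard {{string?}} predicate to pick out text nodes.

<examples>
<example>
<expr>
(use sxpath-lolevel)

;; The first child that is a string, skipping the nested element
((select-first-kid string?) '(p (b "bold") "plain" "more"))  ; => "plain"

;; No direct child satisfies the predicate
((select-first-kid string?) '(p (b "bold")))                 ; => #f
</expr>
</example>
</examples>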
775
776<procedure>(sxml:node-parent rootnode)</procedure>
777
778Returns a function of one argument - an SXML element - which returns
779its parent node using {{*PARENT*}} pointer in the aux-list.
780{{'*TOP-PTR*}} may be used as a pointer to root node.  It returns an
781empty list when applied to the root node.
782
783<procedure>(sxml:add-parents obj [top-ptr])</procedure>
784
785Returns the SXML element {{obj}} annotated with {{*PARENT*}} pointers
786for {{obj}} and all its descendants.  If {{obj}} is not the root node
787(a node with a name of {{*TOP*}}), you must pass in the parent pointer
788for {{obj}} as {{top-ptr}}.
789
790'''Warning:''' This procedure mutates its {{obj}} argument.
791
792<procedure>(sxml:lookup id index)</procedure>
793
794Lookup an element using its ID.  {{index}} should be an alist of
795{{(id . element)}}.
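
As a sketch, the following looks up an element in a hand-written index that has the same shape as the result of {{sxml:id-alist}}.  It assumes the module is loaded with {{(use sxpath-lolevel)}}.

<examples>
<example>
<expr>
(use sxpath-lolevel)

;; A hand-written index in the shape produced by sxml:id-alist
(define index
  '(("hi"   . (span (@ (id "hi")) "there"))
    ("link" . (a (@ (id "link")) "click here"))))

(sxml:lookup "link" index)  ; => (a (@ (id "link")) "click here")
</expr>
</example>
</examples>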
796
797==== Markup generation
798
799===== XML
800
801<procedure>(sxml:attr->xml attr)</procedure>
802
803Returns a list containing tokens that when joined together form the
804attribute's XML output.
805
'''Warning:''' This procedure assumes that the attribute's values have
already been escaped (ie, {{sxml:string->xml}} has been called on the
strings inside it).
809
810<examples>
811<example>
812<expr>(sxml:attr->xml '(href "http://example.com"))</expr>
813<result>(" " "href" "='" "http://example.com" "'")</result>
814</example>
815</examples>
816
817<procedure>(sxml:string->xml string)</procedure>
818
819Escape the {{string}} so it can be used anywhere in XML output.  This
820converts the {{<}}, {{>}}, {{'}}, {{"}} and {{&}} characters to their
821respective entities.
822
823<procedure>(sxml:sxml->xml tree)</procedure>
824
825Convert the {{tree}} of SXML nodes to a nested list of XML fragments.
826These fragments can be output by flattening the list and concatenating
827the strings inside it.
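
The flattening and concatenating step could be sketched as follows.  The helpers {{flatten-fragments}} and {{sxml->xml-string}} are hypothetical names introduced here, not part of the egg, and the sketch assumes the module is loaded with {{(use sxpath-lolevel)}} and that every fragment is a string.

<examples>
<example>
<expr>
(use sxpath-lolevel)

;; Hypothetical helper: collect the strings of a nested fragment list
;; into one flat list, left to right.
(define (flatten-fragments frags)
  (cond ((null? frags) '())
        ((pair? frags)
         (append (flatten-fragments (car frags))
                 (flatten-fragments (cdr frags))))
        (else (list frags))))

;; Hypothetical helper: serialize an SXML tree to a single string.
(define (sxml->xml-string tree)
  (apply string-append (flatten-fragments (sxml:sxml->xml tree))))

;; The flattening works on any nested list of strings:
(apply string-append
       (flatten-fragments '("<" "p" (">" ("a " "&amp;" " b")) "</p>")))
;; => "<p>a &amp; b</p>"

;; Print the serialized form of a small, made-up tree
(display (sxml->xml-string '(p (@ (class "note")) "a & b")))
</expr>
</example>
</examples>

For large documents it is cheaper to walk the nested list and {{display}} each string directly to a port, which avoids building the intermediate flat list.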
828
829==== HTML
830
831<procedure>(sxml:attr->html attr)</procedure>
832
833Returns a list containing tokens that when joined together form the
834attribute's HTML output.  The difference with the XML variant is that
835this encodes empty attribute values to attributes with no value (think
836{{selected}} in option elements, or {{checked}} in checkboxes).
837
'''Warning:''' This procedure assumes that the attribute's values have
already been escaped (ie, {{sxml:string->html}} has been called on the
strings inside it).
841
842<procedure>(sxml:string->html string)</procedure>
843
Escape the {{string}} so it can be used anywhere in HTML output.  This
converts the {{<}}, {{>}}, {{"}} and {{&}} characters to their
respective entities.
847
848<procedure>(sxml:non-terminated-html-tag? tag)</procedure>
849
850Is the named {{tag}} one that is "self-closing" (ie, does not need to
851be terminated) in HTML 4.0?
852
853<procedure>(sxml:sxml->html tree)</procedure>
854
855Convert the {{tree}} of SXML nodes to a nested list of HTML fragments.
856These fragments can be output by flattening the list and concatenating
857the strings inside it.
858
859
860=== Procedures from sxpathlib
861
862==== Basic converters and applicators
863
864A converter is a function
865
866  type Converter = Node|Nodelist -> Nodelist
867
868A converter can also play a role of a predicate: in that case, if a
869converter, applied to a node or a nodelist, yields a non-empty
870nodelist, the converter-predicate is deemed satisfied. Throughout this
871file a nil nodelist is equivalent to {{#f}} in denoting a failure.
872
873<procedure>(nodeset? obj)</procedure>
874
875Returns {{#t}} if {{obj}} is a nodelist.
876
877<procedure>(as-nodeset obj)</procedure>
878
879If {{obj}} is a nodelist - returns it as is, otherwise wrap it in a
880list.
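
A minimal sketch of how these two procedures treat nodes and nodelists,
assuming they are imported from the {{sxpath-lolevel}} module (the
sample data is made up for illustration).  Note that a single element
is not a nodelist, because its first member is a symbol:

<examples>
<example>
<expr>
(nodeset? '((div "hi") (span "hello")))
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
(nodeset? '(div "hi"))
</expr>
<result>
#f
</result>
</example>
<example>
<expr>
(as-nodeset '(div "hi"))
</expr>
<result>
((div "hi"))
</result>
</example>
</examples>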
881
882==== Node test
883
884The following functions implement 'Node test's as defined in Sec. 2.3
885of the XPath document.  A node test is one of the components of a
886location step.  It is also a converter-predicate in SXPath.
887
888<procedure>(sxml:element? obj)</procedure>
889
890Predicate which returns {{#t}} if {{obj}} is an SXML element, otherwise {{#f}}.
891
892<procedure>(ntype-names?? crit)</procedure>
893
894Takes a list of acceptable node names as a criterion and returns a
895function, which, when applied to a node, will return {{#t}} if the
896node name is present in criterion list and {{#f}} otherwise.
897
898   ntype-names?? :: ListOfNames -> Node -> Boolean
899
900<procedure>(ntype?? crit)</procedure>
901
902Takes a type criterion and returns a function, which, when applied to
903a node, will tell if the node satisfies the test.
904
905  ntype?? :: Crit -> Node -> Boolean
906
907The criterion {{crit}} is  one of the following symbols:
908
909; {{@}} : tests if the Node is an {{attributes-list}}
910; {{*}} : tests if the Node is an {{Element}}
911; {{*text*}} : tests if the Node is a text node
912; {{*data*}} : tests if the Node is a data node  (text, number, boolean, etc., but not pair)
913; {{*PI*}} : tests if the Node is a processing instructions node
914; {{*COMMENT*}} : tests if the Node is a comment node
915; {{*ENTITY*}} : tests if the Node is an entity node
916; {{*any*}} : {{#t}} for any type of Node
917; other symbol : tests if the Node has the right name given by the symbol
918
919<examples>
920<example>
921<expr>
922((ntype?? 'div) '(div (@ (class "greeting")) "hi"))
923</expr>
924<result>
925#t
926</result>
927</example>
928<example>
929<expr>
930((ntype?? 'div) '(span (@ (class "greeting")) "hi"))
931</expr>
932<result>
933#f
934</result>
935</example>
936<example>
937<expr>
938((ntype?? '*) '(span (@ (class "greeting")) "hi"))
939</expr>
940<result>
941#t
942</result>
943</example>
944</examples>
945   
946<procedure>(ntype-namespace-id?? ns-id)</procedure>
947
948This function takes a namespace-id, and returns a predicate
949{{Node -> Boolean}}, which is {{#t}} for nodes with the given
950namespace id. {{ns-id}} is a string.
951{{(ntype-namespace-id?? #f)}} will be {{#t}} for nodes with
952non-qualified names.
953
954<procedure>(sxml:complement pred)</procedure>
955
956This function takes a predicate and returns it complemented, that is
957if the given predicate yields {{#f}} or {{'()}} the complemented one
958yields the given node and vice versa.
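
As an illustration (a sketch with made-up data, assuming the procedures
are imported from {{sxpath-lolevel}}), a complemented predicate can be
handed to {{sxml:filter}} to keep everything that is ''not'' a
{{div}}:

<examples>
<example>
<expr>
((sxml:complement (ntype?? 'div)) '(span "hello"))
</expr>
<result>
(span "hello")
</result>
</example>
<example>
<expr>
((sxml:filter (sxml:complement (ntype?? 'div)))
 '((div "hi") (span "hello") (div "still here?")))
</expr>
<result>
((span "hello"))
</result>
</example>
</examples>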
959
960<procedure>(node-eq? other)</procedure>
961
962Returns a predicate procedure that, given a node, returns {{#t}} if
963the node is the exact same as {{other}}.
964
965<procedure>(node-equal? other)</procedure>
966
967Returns a predicate procedure that, given a node, returns {{#t}} if
968the node has the same contents as {{other}}.
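
The difference between the two predicates can be sketched as follows
(made-up data, assuming an import from {{sxpath-lolevel}}).  The nodes
are built with {{list}} so that each one is a fresh object: the first
test compares a node with itself, the second and third compare it with
a distinct node that has the same contents.

<examples>
<example>
<expr>
(let ((node (list 'div "hi")))
  (list ((node-eq? node) node)
        ((node-eq? node) (list 'div "hi"))
        ((node-equal? node) (list 'div "hi"))))
</expr>
<result>
(#t #f #t)
</result>
</example>
</examples>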
969
970<procedure>(node-pos n)</procedure>
971
972Returns a procedure that, given a nodelist, returns a new nodelist
973containing only the {{n}}th element, counting from 1.  If {{n}} is
974negative, it returns a nodelist with the {{n}}th element counting from
975the right.  If no such node exists, returns the empty list.  {{n}} may
976not equal zero.
977
978<examples>
979<example>
980<expr>
981((node-pos 1) '((div "hi") (span "hello") (em "really, hi!")))
982</expr>
983<result>
984((div "hi"))
985</result>
986</example>
987<example>
988<expr>
989((node-pos 6) '((div "hi") (span "hello") (em "really, hi!")))
990</expr>
991<result>
992()
993</result>
994</example>
995<example>
996<expr>
997((node-pos -1) '((div "hi") (span "hello") (em "is this thing on?")))
998</expr>
999<result>
1000((em "is this thing on?"))
1001</result>
1002</example>
1003</examples>
1004
1005<procedure>(sxml:filter pred?)</procedure>
1006
1007Returns a procedure that accepts a nodelist or a node (which will be
1008converted to a one-element nodelist) and returns only those nodes for
1009which the predicate {{pred?}} does not return {{#f}} or {{'()}}.
1010
1011<examples>
1012<example>
1013<expr>
1014((sxml:filter (ntype?? 'div)) '((div "hi") (span "hello") (div "still here?")))
1015</expr>
1016<result>
1017((div "hi") (div "still here?"))
1018</result>
1019</example>
1020</examples>
1021
1022<procedure>(take-until pred?)</procedure>
1023<procedure>(take-after pred?)</procedure>
1024
1025Returns a procedure that accepts a node or a nodelist.
1026
1027The {{take-until}} variant returns everything ''before'' the first
1028node for which the predicate {{pred?}} returns anything but {{#f}} or
1029{{'()}}.  In other words, it returns the longest prefix for which the
1030predicate returns {{#f}} or {{'()}}.
1031
1032The {{take-after}} variant returns everything ''after'' the first node
1033for which the predicate {{pred?}} returns anything besides {{#f}} or
1034{{'()}}.
1035
1036<examples>
1037<example>
1038<expr>
1039((take-until (ntype?? 'span)) '((div "hi") (span "hello") (span "there") (div "still here?")))
1040</expr>
1041<result>
1042((div "hi"))
1043</result>
1044</example>
1045<example>
1046<expr>
1047((take-after (ntype?? 'span)) '((div "hi") (span "hello") (span "there") (div "still here?")))
1048</expr>
1049<result>
1050((span "there") (div "still here?"))
1051</result>
1052</example>
1053</examples>
1054
1055<procedure>(map-union proc list)</procedure>
1056
1057Apply {{proc}} to each element of the nodelist {{list}} and return the
1058list of results.  If {{proc}} returns a nodelist, splice it into the
1059result (essentially returning a flattened nodelist).
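
A small sketch of the splicing behaviour, using made-up data and
assuming an import from {{sxpath-lolevel}}.  Here {{cdr}} stands in
for a converter that returns the contents of each element as a list:

<examples>
<example>
<expr>
(map-union cdr '((div "hi") (span "hello" "there")))
</expr>
<result>
("hi" "hello" "there")
</result>
</example>
</examples>

A plain {{map}} would have returned {{(("hi") ("hello" "there"))}}
instead.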
1060
1061<procedure>(node-reverse node-or-nodelist)</procedure>
1062
1063Accepts a nodelist and reverses the nodes inside.  If a single node is
1064passed to this procedure, it returns a nodelist containing just that
1065node (it does not change the order of the node's children).
1066
1067==== Converter combinators
1068
1069Combinators are higher-order functions that transmogrify a converter
1070or glue a sequence of converters into a single, non-trivial
1071converter. The goal is to arrive at converters that correspond to
1072XPath location paths.
1073
1074From a different point of view, a combinator is a fixed, named
1075''pattern'' of applying converters. Given below is a complete set of
1076such patterns that together implement XPath location path
1077specification. As it turns out, all these combinators can be built
1078from a small number of basic blocks: regular functional composition,
1079{{map-union}} and filter applicators, and the nodelist union.
1080
1081<procedure>(select-kids pred?)</procedure>
1082
1083Returns a procedure that accepts a node and returns a nodelist of the
1084node's children that satisfy {{pred?}} (ie, {{pred?}} returns anything
1085but {{#f}} or {{'()}}).
1086
1087<procedure>(node-self pred?)</procedure>
1088
1089Similar to {{select-kids}} but applies to the node itself rather than
1090to its children. The resulting Nodelist will contain either one
1091component (the node), or will be empty (if the node failed the
1092predicate).
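
The following sketch contrasts the two selectors on the same made-up
element (assuming an import from {{sxpath-lolevel}}): {{select-kids}}
tests the children, {{node-self}} tests the node itself.

<examples>
<example>
<expr>
((select-kids (ntype?? 'li)) '(ul (li "a") (p "b") (li "c")))
</expr>
<result>
((li "a") (li "c"))
</result>
</example>
<example>
<expr>
((node-self (ntype?? 'ul)) '(ul (li "a")))
</expr>
<result>
((ul (li "a")))
</result>
</example>
<example>
<expr>
((node-self (ntype?? 'li)) '(ul (li "a")))
</expr>
<result>
()
</result>
</example>
</examples>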
1093
1094<procedure>(node-join . selectors)</procedure>
1095
1096Returns a procedure that accepts a nodelist or a node, and returns a
1097nodelist with all the selectors applied to every node in sequence.
1098The selectors must function as converter combinators, ie they must
1099accept a ''node'' and output a ''nodelist''.
1100
1101<examples>
1102<example>
1103<expr>
1104((node-join
1105  (select-kids (ntype?? 'li))
1106  sxml:content)
1107 '((ul (@ (class "whiskies"))
1108       (li "Ardbeg")
1109       (li "Glenfarclas")
1110       (li "Springbank"))))
1111</expr>
1112<result>
1113("Ardbeg" "Glenfarclas" "Springbank")
1114</result>
1115</example>
1116</examples>
1117
1118<procedure>(node-reduce . converters)</procedure>
1119
1120A regular functional composition of converters.
1121
1122From a different point of view,
1123  ((apply node-reduce converters) nodelist)
1124is equivalent to
1125  (fold apply nodelist converters)
1126i.e., folding, or reducing, a list of converters with the nodelist
1127as a seed.
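
For instance (a sketch with made-up data, assuming an import from
{{sxpath-lolevel}}), selecting the {{li}} children first and then
taking the second node of the resulting nodelist:

<examples>
<example>
<expr>
((node-reduce
  (select-kids (ntype?? 'li))
  (node-pos 2))
 '(ul (li "a") (li "b") (li "c")))
</expr>
<result>
((li "b"))
</result>
</example>
</examples>

Unlike {{node-join}}, each converter here receives the whole nodelist
produced by the previous one, which is why {{node-pos}} can be used as
a step.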
1128
1129
1130<procedure>(node-or . converters)</procedure>
1131
1132This combinator applies all converters to a given node and produces
1133the union of their results.  This combinator corresponds to a union,
1134"{{|}}" operation for XPath location paths.
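
A short sketch with made-up data (assuming an import from
{{sxpath-lolevel}}).  In this example the results are grouped by
converter, in the order the converters were given, rather than in
document order:

<examples>
<example>
<expr>
((node-or
  (select-kids (ntype?? 'em))
  (select-kids (ntype?? 'strong)))
 '(p (strong "a") "b" (em "c")))
</expr>
<result>
((em "c") (strong "a"))
</result>
</example>
</examples>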
1135
1136<procedure>(node-closure test-pred?)</procedure>
1137
1138Select all ''descendants'' of a node that satisfy a
1139converter-predicate.  This combinator is similar to {{select-kids}}
1140but applies to grandchildren as well.
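
A sketch with made-up data, assuming an import from
{{sxpath-lolevel}}.  The tree is searched level by level, so in this
example the shallower match is listed before the deeper one even
though it comes later in the document:

<examples>
<example>
<expr>
((node-closure (ntype?? 'em))
 '(div (p (em "deep")) (em "shallow")))
</expr>
<result>
((em "shallow") (em "deep"))
</result>
</example>
</examples>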
1141
1142<procedure>(node-trace title)</procedure>
1143
1144Returns a procedure that accepts a node or a nodelist, which it
1145pretty-prints to the current output port, preceded by {{title}}.  It
1146returns the node or the nodelist unchanged.  This is a useful
1147debugging aid, since it doesn't really do anything besides print its
1148argument and pass it on.
1149
1150<procedure>(sxml:node? obj)</procedure>
1151
1152Returns {{#t}} if the given {{obj}} is an SXML node, {{#f}} otherwise.
1153A node is anything except an attribute list or an auxiliary list.
1154
1155<procedure>(sxml:attr-list node)</procedure>
1156
1157Returns the list of attributes for a given SXML node.  The empty list
1158is returned if the given node is not an element, or if it has no list
1159of attributes.
1160
1161This differs from {{sxml:attr-list-u}} in that this procedure accepts
1162any SXML node while {{sxml:attr-list-u}} only accepts nodelists or
1163elements.  This means that {{sxml:attr-list-u}} will throw an error if you
1164pass it a text node (a string), while {{sxml:attr-list}} will not.
1165
1166<procedure>(sxml:attribute test-pred?)</procedure>
1167
1168Like {{sxml:filter}}, but considers the attributes instead of the
1169nodes.  Returns a nodelist of attributes that match {{test-pred?}}.
1170
1171<examples>
1172<example>
1173<expr>
1174((sxml:attribute (ntype?? 'id))
1175 '((div (@ (id "navigation")) "navigation here")
1176   (div (@ (class "pullquote")) "random stuff")
1177   (div (@ (id "main-content")) "lorem ipsum ...")))
1178</expr>
1179<result>
1180((id "navigation") (id "main-content"))
1181</result>
1182</example>
1183</examples>
1184
1185<procedure>(sxml:child test-pred?)</procedure>
1186
1187This procedure is similar to {{select-kids}}, but it returns an empty
1188child-list for PI, Comment and Entity nodes.
1189
1190<procedure>(sxml:parent test-pred?)</procedure>
1191
1192Returns a procedure that accepts a root-node, and returns another
1193procedure.  This second procedure accepts a nodeset (or a node) and
1194returns the immediate parents of the nodes in the set, but only
1195those parents that match the predicate.
1196
1197The root-node does not have to be the root node of the
1198whole SXML tree -- it may be a root node of a branch of interest.
1199
1200This procedure can be used with any SXML node.
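
The two-step calling convention can be sketched as follows (made-up
data, assuming an import from {{sxpath-lolevel}}).  The node passed to
the innermost procedure has to be the very same object that occurs
inside the root node, not a copy of it, so the tree is built with
{{list}} around an existing child:

<examples>
<example>
<expr>
(let* ((kid (list 'li "Ardbeg"))
       (root (list 'ul kid (list 'li "Springbank"))))
  (((sxml:parent (ntype?? 'ul)) root) kid))
</expr>
<result>
((ul (li "Ardbeg") (li "Springbank")))
</result>
</example>
</examples>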
1201
1202==== Useful shortcuts
1203
1204<procedure>(node-parent node)</procedure>
1205
1206{{(node-parent node)}} takes a root node and yields a converter that returns
1207the parent of a node it is applied to. If applied to a nodelist, it returns
1208the list of parents of the nodes in the nodelist.
1209
1210This is equivalent to {{((sxml:parent (ntype?? '*any*)) node)}}.
1211
1212<procedure>(sxml:child-nodes node)</procedure>
1213
1214Returns all the child nodes of the given {{node}}.
1215
1216This is equivalent to {{((sxml:child sxml:node?) node)}}.
1217
1218<procedure>(sxml:child-elements node)</procedure>
1219
1220Returns all the child ''elements'' of the given {{node}} (ie, it
1221excludes any text nodes).
1222
1223This is equivalent to {{((select-kids sxml:element?) node)}}.
1224
1225=== Procedures from sxpath-ext
1226
1227==== SXML counterparts to W3C XPath Core Functions Library
1228
1229<procedure>(sxml:string object)</procedure>
1230
1231The counterpart to XPath 'string' function (section 4.2 XPath 1.0 Rec.).
1232Converts a given object to a string.
1233
1234Notes:
1235# When converting a nodeset, document order is not preserved
1236# {{number->string}} returns the result in a form which is slightly different from XPath Rec. specification
1237
1238<procedure>(sxml:boolean object)</procedure>
1239
1240The counterpart to XPath 'boolean' function (section 4.3 XPath Rec.).
1241Converts its argument to a boolean.
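
A few conversions, sketched on made-up values and assuming an import
from {{sxpath-lolevel}}: empty strings, the number zero and empty
nodesets count as false, while a non-empty nodeset counts as true.

<examples>
<example>
<expr>
(map sxml:boolean (list "" "text" 0 '() '((div "hi"))))
</expr>
<result>
(#f #t #f #f #t)
</result>
</example>
</examples>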
1242
1243<procedure>(sxml:number object)</procedure>
1244
1245The counterpart to XPath 'number' function (section 4.4 XPath Rec.).
1246Converts its argument to a number.
1247
1248Notes:
1249# The argument is not optional (yet?)
1250# string->number conversion is not IEEE 754 round-to-nearest
1251# NaN is represented as 0
1252
1253<procedure>(sxml:string-value node)</procedure>
1254
1255Returns the string value for a given node in accordance with
1256XPath Rec. 5.1 - 5.7.
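
For an element this amounts to concatenating the text of its
descendants, as in this sketch (made-up data, assuming an import from
{{sxpath-lolevel}}):

<examples>
<example>
<expr>
(sxml:string-value '(p "Hello, " (em "world") "!"))
</expr>
<result>
"Hello, world!"
</result>
</example>
</examples>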
1257
1258<procedure>(sxml:id id-index)</procedure>
1259
1260Returns a procedure that accepts a nodeset and returns a nodeset
1261containing the elements in the id-index that match the string-values
1262of each entry of the nodeset.  XPath Rec. 4.1
1263
1264The {{id-index}} is an alist with unique IDs as key, and elements as
1265values:
1266
1267  id-index = ( (id-value . element) (id-value . element) ... )
1268
1269==== Comparators for XPath objects
1270
1271<procedure>(sxml:list-head list n)</procedure>
1272
1273Returns the {{n}} first members of {{list}}.  Mostly equivalent to
1274SRFI-1's {{take}} procedure, except it returns the {{list}} if {{n}}
1275is larger than the length of said list, instead of throwing an error.
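
A quick sketch of both cases, assuming an import from
{{sxpath-lolevel}}:

<examples>
<example>
<expr>
(sxml:list-head '(a b c) 2)
</expr>
<result>
(a b)
</result>
</example>
<example>
<expr>
(sxml:list-head '(a b c) 5)
</expr>
<result>
(a b c)
</result>
</example>
</examples>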
1276
1277<procedure>(sxml:merge-sort less-than? list)</procedure>
1278
1279Returns the sorted list, the smallest member first.
1280  less-than? ::= (lambda (obj1 obj2) ...)
1281{{less-than?}} returns {{#t}} if {{obj1 < obj2}} with respect to the
1282given ordering.
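
For example (a sketch on made-up lists, assuming an import from
{{sxpath-lolevel}}), any standard ordering predicate can serve as
{{less-than?}}:

<examples>
<example>
<expr>
(sxml:merge-sort < '(3 1 2))
</expr>
<result>
(1 2 3)
</result>
</example>
<example>
<expr>
(sxml:merge-sort string<? '("pear" "apple" "fig"))
</expr>
<result>
("apple" "fig" "pear")
</result>
</example>
</examples>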
1283
1284<procedure>(sxml:equality-cmp bool=? number=? string=?)</procedure>
1285
1286A helper for XPath equality operations: {{=}} , {{!=}}.  The
1287{{bool=?}}, {{number=?}} and {{string=?}} arguments are comparison
1288operations for booleans, numbers and strings respectively.
1289
1290Returns a procedure that accepts two objects, looks at the first
1291object's type and applies the correct comparison predicate to it.
1292Type coercion takes place depending on the rules described in the
1293XPath 1.0 spec, section 3.4 ("Booleans").
1294
1295<procedure>(sxml:equal? obj1 obj2)</procedure>
1296<procedure>(sxml:not-equal? obj1 obj2)</procedure>
1297
1298Equality procedures with the default comparison operators {{eq?}},
1299{{=}} and {{string=?}}, or their inverse, respectively.
1300
1301<procedure>(sxml:relational-cmp op)</procedure>
1302
1303A helper for XPath relational operations: {{<}}, {{>}}, {{<=}}, {{>=}}
1304for two XPath objects.  {{op}} is one of these operators.
1305
1306Returns a procedure that accepts two objects and returns the value of
1307the procedure applied to these objects, converted according to the
1308coercion rules described in the XPath 1.0 spec, section 3.4
1309("Booleans").
1310
1311==== XPath axes
1312
1313<procedure>(sxml:ancestor test-pred?)</procedure>
1314
1315Like {{sxml:parent}}, except it returns all the ancestors that match
1316{{test-pred?}}, not just the immediate parent.
1317
1318<procedure>(sxml:ancestor-or-self test-pred?)</procedure>
1319
1320Like {{sxml:ancestor}}, except also allows the node itself to match
1321the predicate.
1322
1323<procedure>(sxml:descendant test-pred?)</procedure>
1324
1325Like {{node-closure}}, except the resulting nodeset is in depth-first
1326order instead of breadth-first.
1327
1328<procedure>(sxml:descendant-or-self test-pred?)</procedure>
1329
1330Like {{sxml:descendant}}, except also allows the node itself to match
1331the predicate.
1332
1333<procedure>(sxml:following test-pred?)</procedure>
1334
1335Returns a procedure that accepts a root node and returns a new
1336procedure that accepts a node and returns all nodes following this
1337node in the document source matching the predicate.
1338
1339<procedure>(sxml:following-sibling test-pred?)</procedure>
1340
1341Like {{sxml:following}}, except only siblings (nodes at the same level
1342under the same parent) are returned.
1343
1344<procedure>(sxml:preceding test-pred?)</procedure>
1345
1346Returns a procedure that accepts a root node and returns a new
1347procedure that accepts a node and returns all nodes preceding this
1348node in the document source matching the predicate.
1349
1350<procedure>(sxml:preceding-sibling test-pred?)</procedure>
1351
1352Like {{sxml:preceding}}, except only siblings (nodes at the same level
1353under the same parent) are returned.
1354
1355<procedure>(sxml:namespace test-pred?)</procedure>
1356
1357Returns a procedure that accepts a nodeset and returns the namespace
1358lists of the nodes matching {{test-pred?}}.
1359
1360
1361== About this egg
1362
1363=== Author
1364
1365[[http://okmij.org/ftp/|Oleg Kiselyov]], [[http://www196.pair.com/lisovsky/|Kirill Lisovsky]], [[http://modis.ispras.ru/Lizorkin/index.html|Dmitry Lizorkin]].
1366
1367=== Version history
1368
1369; 0.1 : Split up the old sxml-tools egg into sxpath
1370
1371=== License
1372
1373The sxml-tools are in the public domain.