source: project/wiki/eggref/4/sxpath @ 13415

Last change on this file since 13415 was 13415, checked in by sjamaan, 11 years ago

Document more of the high-level interface that txpath offers

1[[tags:eggs]]
2
3This is version 0.1 of the '''sxpath''' extension library for Chicken Scheme.
4
5[[toc:]]
6
7== Description
8
9The sxpath parts of the [[http://cvs.sourceforge.net/viewcvs.py/ssax/sxml-tools/|sxml-tools]] from the [[http://ssax.sf.net|SSAX project]] at Sourceforge.
10Because txpath and sxpath are interwoven, this egg also includes txpath parts.
11
12== Documentation
13
14This egg provides the sxpath-related tools from the sxml-tools available
15in the SSAX/SXML Sourceforge project.
16
17It is split up in three modules: [[#sxpath|sxpath]], [[#txpath|txpath]]
18and [[#sxpath-lolevel]]. {{sxpath}} depends on {{txpath}} and both
19modules depend on {{sxpath-lolevel}}.
20
21Much documentation is available at
22[[http://www196.pair.com/lisovsky/xml/index.html|Lisovsky's XML page]]
23and the [[http://ssax.sf.net|SSAX homepage]].
24
25The initial documentation on this wiki page came straight from the
26comments in the extremely well-documented source code. It's
27recommended you read the code if you want to learn more.
28
29== sxpath
30
This is the preferred interface to use.  It allows you to query the
SXML document tree using an s-expression based language, in which you
can also use arbitrary procedures and even "classic" textual XPath
(see [[#txpath|below]] for docs on that).
35
36A complete description on how to use this is outside the scope of this
37egg documentation. See
38[[http://www196.pair.com/lisovsky/query/sxpath/|the introduction to SXPath]]
39for that.
40
41<procedure>(sxpath path [ns-binding])</procedure>
42
43Returns a procedure that accepts an SXML document tree and returns a
44nodeset (list of nodes) that match the {{path}} expression.
45
46The optional {{ns-binding}} argument is an alist of namespace
47bindings.  It is used to map abbreviated namespace prefixes to full
48URI strings.
49
It can be useful to compare the following examples to those for
[[#txpath|txpath]].
52
53<examples>
54<example>
55<expr>
56;; selects all the 'item' elements that have an 'olist' parent
57;; (which is not root) and that are in the same document as the context node
58((sxpath `(// olist item))
59 '(doc (olist (item "1")) (item "2") (nested (olist (item "3")))))
60</expr>
61<result>
62((item "1") (item "3"))
63</result>
64</example>
65<example>
66<expr>
67;; selects the 'chapter' children of the context node that have one or
68;; more 'title' children with string-value equal to 'Introduction'
((sxpath '((chapter ((equal? (title "Introduction"))))))
70 '(text  (chapter (title "Introduction"))  (chapter "No title for this chapter")  (chapter (title "Conclusion"))))
71</expr>
72<result>
73((chapter (title "Introduction")))
74</result>
75</example>
76<example>
77<expr>
;; (sxpath string-expr) is equivalent to (txpath string-expr)
((sxpath "chapter[title='Introduction']")
 '(text  (chapter (title "Introduction"))  (chapter "No title for this chapter")  (chapter (title "Conclusion"))))
</expr>
<result>
((chapter (title "Introduction")))
</result>
84</example>
85</examples>
86
87TODO: find out how ns-binding works and give an example that uses this.
88
89<procedure>(if-sxpath path)</procedure>
90
91Like {{sxpath}}, only returns {{#f}} instead of the empty list if
92nothing matches (so it does ''not'' always return a nodeset).
93
94<procedure>(car-sxpath path)</procedure>
95
Like {{sxpath}}, but instead of a nodeset it returns the first node
found.  If no node was found, it returns '''an empty list'''.
98
99<procedure>(if-car-sxpath path)</procedure>
100
101Like {{car-sxpath}}, only returns {{#f}} instead of the empty list if
102nothing matches.
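
The four variants differ only in what they return when nothing matches.  The following is a minimal sketch, assuming CHICKEN 4 where {{(use sxpath)}} loads and imports the egg; the sample document {{doc}} is made up for illustration.

```scheme
(use sxpath)

;; Hypothetical sample document, not taken from the egg's sources.
(define doc '(doc (olist (item "1")) (item "2")))

((sxpath '(olist item)) doc)        ; => ((item "1"))
((sxpath '(missing)) doc)           ; => ()
((if-sxpath '(missing)) doc)        ; => #f
((car-sxpath '(olist item)) doc)    ; => (item "1")
((car-sxpath '(missing)) doc)       ; => ()
((if-car-sxpath '(missing)) doc)    ; => #f
```

The {{if-}} variants are convenient in {{cond}} or {{and}} forms, where the empty list would otherwise count as a true value.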
103
104<procedure>(sxml:id-alist node . lpaths)</procedure>
105
Builds an index as a list of {{(ID_value . element)}} pairs for the given
{{node}}. {{lpaths}} are location paths for attributes of type ID (i.e.,
sxpath expressions that tell it how to find the ID attribute).
109
110Note: location paths ''must'' be of the form {{(expr '@ attrib-name)}}.
111
112See also {{sxml:lookup}} below, in {{sxpath-lolevel}}, which can use
113this index.
114
115<examples>
116<example>
117<expr>
118;; TODO: find out why location paths must be of the form (expr '@ symbol)
119;;       or if this description is incorrect
120(sxml:id-alist
121 '(div (span (@ (id "hi")) "there")
122       (div (@ (id "hello")) "dude")
123       (a (@ (id "link")) "click here"))
124 '(span @ id) '(a @ id))
125</expr>
126<result>
127(("hi" . (span (@ (id "hi")) "there"))
128 ("link" . (a (@ (id "link")) "click here")))
129</result>
130</example>
131</examples>
132
133== txpath
134
135This section documents the txpath interface. This interface is mostly
136useful for programs that deal exclusively with "legacy" textual XPath
137queries.
138
139=== High-level interface
140
141The following procedures are the main interface one would use in
142practice. There are also more low-level procedures (see next section),
143which one could use to build txpath extensions.
144
145<procedure>(sxml:xpath string . ns-binding)</procedure>
146<procedure>(txpath string . ns-binding)</procedure>
147<procedure>(sxml:xpath+root string . ns-binding)</procedure>
148<procedure>(sxml:xpath+root+vars string . ns-binding)</procedure>
149
150Returns a procedure that accepts an SXML document tree and returns a
151nodeset (list of nodes) that match the XPath expression {{string}}.
152
153The optional {{ns-binding}} argument is an alist of namespace
154bindings.  It is used to map abbreviated namespace prefixes to full
155URI strings.
156
157{{(txpath x)}} is equivalent to {{(sxpath x)}} whenever {{x}} is a
158string.  The {{txpath}}, {{sxml:xpath+root}} and
159{{sxml:xpath+root+vars}} procedures are currently all aliases for
160{{sxml:xpath}}, which exist for backwards compatibility reasons.
161
162It's useful to compare the following examples to the above examples
163for [#sxpath|sxpath].
164
165<examples>
166<example>
167<expr>
168;; selects all the 'item' elements that have an 'olist' parent
169;; (which is not root) and that are in the same document as the context node
170((txpath "//olist/item")
171 '(doc (olist (item "1")) (item "2") (nested (olist (item "3")))))
172</expr>
173<result>
174((item "1") (item "3"))
175</result>
176</example>
177<example>
178<expr>
179;; Same example as above, but now with a namespace prefix of 'x',
180;; which is bound to the namespace "bar" in the ns-binding parameter.
181((txpath "//x:olist/item" '((x . "bar")))
182 '(doc (bar:olist (item "1")) (item "2") (nested (olist (item "3")))))
183</expr>
184<result>
185((item "1"))
</result>
</example>
187<example>
188<expr>
189;; selects the 'chapter' children of the context node that have one or
190;; more 'title' children with string-value equal to 'Introduction'
191((txpath "chapter[title='Introduction']")
192 '(text  (chapter (title "Introduction"))  (chapter "No title for this chapter")  (chapter (title "Conclusion"))))
193</expr>
194<result>
195((chapter (title "Introduction")))
196</result>
197</example>
198</examples>
199
200<procedure>(sxml:xpath+index string . ns-binding)</procedure>
201
202This procedure returns the result of {{sxml:xpath}} consed onto
203{{#t}}.  If the {{sxml:xpath}} would return {{#f}}, this returns
204{{#f}} instead.
205
206It is provided solely for backwards compatibility.
207
208
209<procedure>(sxml:xpointer string . ns-binding)</procedure>
210<procedure>(sxml:xpointer+root+vars string . ns-binding)</procedure>
211
212Returns a procedure that accepts an SXML document tree and returns a
213nodeset (list of nodes) that match the XPointer expression {{string}}.
214
215The optional {{ns-binding}} argument is an alist of namespace
216bindings.  It is used to map abbreviated namespace prefixes to full
217URI strings.
218
Currently, only the XPointer {{xmlns()}} and {{xpointer()}} schemes
are implemented; the {{element()}} scheme is not.
221
222<examples>
223<example>
224<expr>
225;; selects all the 'item' elements that have an 'olist' parent
226;; (which is not root) and that are in the same document as the context node.
227;; Equivalent to (txpath "//olist/item").
228((sxml:xpointer "xpointer(//olist/item)")
229 '(doc (olist (item "1")) (item "2") (nested (olist (item "3")))))
230</expr>
231<result>
232((item "1") (item "3"))
233</result>
234</example>
235<example>
236<expr>
;; An example with a namespace prefix, now using the XPointer xmlns()
;; function instead of the ns-binding parameter. An xmlns() part always has
;; a full namespace name on its right-hand side, never a bound shortcut.
240((sxml:xpointer "xmlns(x=bar)xpointer(//x:olist/item)")
241 '(doc (bar:olist (item "1")) (item "2") (nested (olist (item "3")))))
242</expr>
243<result>
244((item "1"))
245</result>
246</example>
247</examples>
248
249<procedure>(sxml:xpointer+index string . ns-binding)</procedure>
250
251This procedure returns the result of {{sxml:xpointer}} consed onto
252{{#t}}.  If the {{sxml:xpointer}} would return {{#f}}, this returns
253{{#f}} instead.
254
255It is provided solely for backwards compatibility.
256
257
258<procedure>(sxml:xpath-expr string . ns-binding)</procedure>
259
260Returns a procedure that accepts an SXML node and returns {{#t}} if
261the node matches the {{string}} expression.  This is an expression of
262type {{Expr}}, which is whatever you can put in a predicate (between
263square brackets after a node name).
264
265The optional {{ns-binding}} argument is an alist of namespace
266bindings.  It is used to map abbreviated namespace prefixes to full
267URI strings.
268
269<examples>
270<example>
271<expr>
272;; Does the node have a class attribute with "content" as value?
273((sxml:xpath-expr "@class=\"content\"")
274 '(div (@ (class "content")) (p "Lorem ipsum")))
275</expr>
276<result>
277#t
278</result>
279</example>
280<example>
281<expr>
282;; Does the node have a paragraph with string value of "Lorem ipsum"?
283((sxml:xpath-expr "p=\"Lorem ipsum\"")
284 '(div (@ (class "content")) (p "Lorem ipsum")))
285</expr>
286<result>
287#t
288</result>
289</example>
290<example>
291<expr>
292;; Does the node have a "p" child node with string value of "Blah"?
293((sxml:xpath-expr "p=\"Blah\"")
294 '(div (@ (class "content")) (p "Lorem ipsum")))
295</expr>
296<result>
297#f
298</result>
299</example>
300</examples>
301
302
303=== Low-level procedures
304
305These procedures can be used to create custom xpath parsers.
306
307TODO: document these
308
309   txp:parameterize-parser
310   sxml:whitespace
311   txp:signal-semantic-error
312   txp:error?
313
314   sxml:core-last
315   sxml:core-position
316   sxml:core-count
317   sxml:core-id
318   sxml:core-local-name
319   sxml:core-namespace-uri
320   sxml:core-name
321   sxml:core-string
322   sxml:core-concat
323   sxml:core-starts-with
324   sxml:core-contains
325   sxml:core-substring-before sxml:core-substring-after
326   sxml:core-substring
327   sxml:core-string-length
328   sxml:core-normalize-space
329   sxml:core-translate
330   sxml:core-boolean
331   sxml:core-not
332   sxml:core-true
333   sxml:core-false
334   sxml:core-lang
335   sxml:core-number
336   sxml:core-sum
337   sxml:core-floor
338   sxml:core-ceiling
339   sxml:core-round
340   sxml:classic-params
341
342
343== sxpath-lolevel
344
345This section documents the low-level sxpath interface. It includes
346mostly-generic list and SXML operators.
347
348It consists of the extensions defined in {{sxml-tools.scm}} plus
349{{sxpathlib}} and {{sxpath-ext}}.  This is equivalent to the
350"low-level sxpath interface" described at
351[[http://www196.pair.com/lisovsky/query/sxpath/|the introduction to SXPath]].
352
353These utilities are useful when you want to query SXML document trees,
354but full sxpath would be overkill.  Most of these procedures are
355faster than their sxpath equivalent, because they are very specific.
356But this also means they are very low-level, so you should use them
357only if you know what you're doing.
358
359
360==== Predicates
361
362<procedure>(sxml:empty-element? obj)</procedure>
363
Predicate which returns {{#t}} if the given element {{obj}} is empty.
Empty elements have no nested elements, text nodes, PIs, comments or
entities, but may contain attributes or a namespace-id.  It is the SXML
counterpart of the XML {{empty-element}}.
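
A few sample calls may make the definition of "empty" more concrete.  This is a sketch assuming CHICKEN 4 and that {{(use sxpath-lolevel)}} makes the predicate available; the elements are invented for illustration.

```scheme
(use sxpath-lolevel)

(sxml:empty-element? '(br))                      ; => #t
(sxml:empty-element? '(br (@ (class "clear"))))  ; => #t, attributes do not count as content
(sxml:empty-element? '(p "some text"))           ; => #f, it has a text node
(sxml:empty-element? '(div (*COMMENT* "note")))  ; => #f, a comment counts as content
```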
368
369<procedure>(sxml:shallow-normalized? obj)</procedure>
370
371Returns {{#t}} if the given {{obj}} is a shallow-normalized SXML
372element.  The element itself has to be normalised but its nested
373elements are not tested.
374
375<procedure>(sxml:normalized? obj)</procedure>
376
377Returns {{#t}} if the given {{obj}} is a normalized SXML element.  The element
378itself and all its nested elements have to be normalised.
379
380<procedure>(sxml:shallow-minimized? obj)</procedure>
381
382Returns {{#t}} if the given {{obj}} is a shallow-minimized SXML
383element.  The element itself has to be minimised but its nested
384elements are not tested.
385
386<procedure>(sxml:minimized? obj)</procedure>
387
388Returns {{#t}} if the given {{obj}} is a minimized SXML element.  The
389element itself and all its nested elements have to be minimised.
390
391==== Accessors
392
393These procedures obtain information about nodes, or their direct
394children.  They don't traverse subtrees.
395
396===== Normalization-independent accessors
397
398These accessors can be used on arbitrary, non-normalized SXML trees.
399Because of this, they are generally slower than the
400normalization-dependent variants listed in the next section.
401
402<procedure>(sxml:name node)</procedure>
403
404Returns a name of a given SXML node. It is introduced for the sake of
405encapsulation.
406
407<procedure>(sxml:element-name obj)</procedure>
408
A checked version of {{sxml:name}}, which returns {{#f}} if the given
{{obj}} is not an SXML element. Otherwise returns its name.
411
412<procedure>(sxml:node-name obj)</procedure>
413
Safe version of {{sxml:name}}, which returns {{#f}} if the given {{obj}}
is not an SXML node.  Otherwise returns its name.

The difference between this and {{sxml:element-name}} is that a node
can be one of {{@}}, {{@@}}, {{*PI*}}, {{*COMMENT*}} or {{*ENTITY*}}
while an element must be a real element (any symbol not in that set is
considered to be an element).
421
422<procedure>(sxml:ncname node)</procedure>
423
Like {{sxml:name}}, except returns only the local part of the name
(called an "NCName" in the
[[http://www.w3.org/TR/xml-names/|XML namespaces spec]]).

The node's name is interpreted as a "Qualified Name", a
colon-separated name whose last component is considered to be the
local part.  If the name contains no colons, the name itself is
returned.
432
433'''Important:''' Please note that while an SXML name is a symbol, this
434function returns a string.
435
436<procedure>(sxml:name->ns-id sxml-name)</procedure>
437
438Given a node name, return the namespace part of the name (called a
439{{namespace-id}}).  If the name contains no colons, returns {{#f}}.  See
440{{sxml:ncname}} for more info.
441
442'''Important:''' Please note that while an SXML name is a symbol, this
443function returns a string.
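
The name accessors above can be compared side by side.  This is a sketch under the assumption that CHICKEN 4's {{(use sxpath-lolevel)}} provides them; the {{c:part}} element is an invented example.

```scheme
(use sxpath-lolevel)

(sxml:name '(c:part (@ (id "p1"))))     ; => c:part
(sxml:element-name '(@ (id "p1")))      ; => #f, an attribute list is not an element
(sxml:node-name '(@ (id "p1")))         ; => @
(sxml:node-name "just text")            ; => #f, a string has no name
(sxml:ncname '(c:part (@ (id "p1"))))   ; => "part" (a string, not a symbol)
(sxml:name->ns-id 'c:part)              ; => "c"    (takes the name, not the node)
(sxml:name->ns-id 'part)                ; => #f, no namespace part
```

Note that {{sxml:ncname}} takes a node while {{sxml:name->ns-id}} takes the name symbol itself.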
444
445<procedure>(sxml:content obj)</procedure>
446
447Retrieve the contents of an SXML element or nodeset.  Any non-element
448nodes (attributes, processing instructions, etc) are discarded,
449while the elements and text nodes are returned as a list of strings
450and nested elements in document order.  This list is empty if {{obj}}
451is an empty element or empty list.
452
453The inner elements are unmodified so they still contain attributes,
454but also comments or other non-element nodes.
455
456<examples>
457<example>
458<expr>
459(sxml:content
460  '(div (@ (class "content"))
461        (*COMMENT* "main contents start here")
462         "The document moved "
463         (a (@ (href "/other.xml")) "here")))
464</expr>
465<result>("The document moved " (a (@ (href "/other.xml")) "here"))</result>
466</example>
467</examples>
468
469<procedure>(sxml:text node)</procedure>
470
471Returns a string which combines all the character data from text node
472children of the given SXML element or "" if there are no text node
473children.  Note that it does not include text from descendant nodes,
474only direct children.
475
476<examples>
477<example>
478<expr>
479(sxml:text
480  '(div (@ (class "content"))
481        (*COMMENT* "main contents start here")
482         "The document moved "
483         (a (@ (href "/other.xml")) "here")))
484</expr>
<result>"The document moved "</result>
486</example>
487</examples>
488
===== Normalization-dependent accessors
490
"Universal" accessors are less efficient but may be used for
non-normalized SXML.  These safe accessors are named with the suffix '-u'
for "universal".
494
495"Fast" accessors are optimized for normalized SXML data.  They are not
496applicable to arbitrary non-normalized SXML data.  Their names have no
497specific suffixes.
498
499<procedure>(sxml:content-raw obj)</procedure>
500
501Returns all the content of normalized SXML element except attr-list
502and aux-list.  Thus it includes {{PI}}, {{COMMENT}} and {{ENTITY}}
503nodes as well as {{TEXT}} and {{ELEMENT}} nodes returned by
504{{sxml:content}}.  Returns a list of nodes in document order or empty
505list if {{obj}} is an empty element or an empty list.
506
507This function is faster than {{sxml:content}}.
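
The difference with {{sxml:content}} shows up when an element contains comments or other non-element nodes.  A small sketch, assuming CHICKEN 4 and {{(use sxpath-lolevel)}}, reusing the element from the {{sxml:content}} example:

```scheme
(use sxpath-lolevel)

;; A normalized element: the attribute list comes directly after the name.
(define el
  '(div (@ (class "content"))
        (*COMMENT* "main contents start here")
        "The document moved "
        (a (@ (href "/other.xml")) "here")))

(sxml:content el)
;; => ("The document moved " (a (@ (href "/other.xml")) "here"))

(sxml:content-raw el)
;; => ((*COMMENT* "main contents start here")
;;     "The document moved "
;;     (a (@ (href "/other.xml")) "here"))
```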
508
509<procedure>(sxml:attr-list-u obj)</procedure>
510
511Returns the list of attributes for given element or nodeset.  Analog
512of {{((sxpath '(@ *)) obj)}}.  Empty list is returned if there is no
513list of attributes.
514
515<procedure>(sxml:aux-list obj)</procedure>
516<procedure>(sxml:aux-list-u obj)</procedure>
517
518Returns the list of auxiliary nodes for given element or nodeset.
519Analog of {{((sxpath '(@@ *)) obj)}}.  Empty list is returned if a
520list of auxiliary nodes is absent.
521
522<procedure>(sxml:aux-node obj aux-name)</procedure>
523
Returns the first aux-node with the name {{aux-name}} in SXML element
{{obj}}, or {{#f}} if such a node is absent.
526
527'''NOTE:''' it returns just the ''first'' node found even if multiple
528nodes are present, so it's mostly intended for nodes with unique names.
529Use {{sxml:aux-nodes}} if you want all of them.
530
531<procedure>(sxml:aux-nodes obj aux-name)</procedure>
532   
533Return a list of aux-nodes with {{aux-name}} given in SXML element
534{{obj}} or {{'()}} if such a node is absent.
535
536<procedure>(sxml:attr obj attr-name)</procedure>
537
538Returns the value of the attribute with name {{attr-name}} in the
539given SXML element {{obj}}, or {{#f}} if no such attribute exists.
540
541<procedure>(sxml:attr-from-list attr-list name)</procedure>
542
Returns the value of the attribute with name {{name}} in the
given list of attributes {{attr-list}}, or {{#f}} if no such attribute
exists.  The list of attributes can be obtained from an element using
the {{sxml:attr-list}} procedure.
547
548<procedure>(sxml:num-attr obj attr-name)</procedure>
549
550Returns the value of the numerical attribute with name {{attr-name}}
551in the given SXML element {{obj}}, or {{#f}} if no such attribute
552exists.  This value is converted from a string to a number.
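
The attribute accessors can be tried on a small normalized element.  This sketch assumes CHICKEN 4 and {{(use sxpath-lolevel)}}; the {{img}} element is invented for illustration.

```scheme
(use sxpath-lolevel)

;; Normalized element: it carries an explicit (@ ...) attribute list.
(define img '(img (@ (src "/logo.png") (width "100"))))

(sxml:attr img 'src)                             ; => "/logo.png"
(sxml:attr img 'alt)                             ; => #f, no such attribute
(sxml:num-attr img 'width)                       ; => 100 (a number, not "100")
(sxml:attr-from-list (sxml:attr-list img) 'src)  ; => "/logo.png"
```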
553
554<procedure>(sxml:attr-u obj attr-name)</procedure>
555
Accessor for an attribute {{attr-name}} of the given SXML element {{obj}},
which may also be an attributes-list or a nodeset (usually the content of
an SXML element).
559
560<procedure>(sxml:ns-list obj)</procedure>
561
562Returns the list of namespaces for given element.  Analog of
563{{((sxpath '(@@ *NAMESPACES* *)) obj)}}.  The empty list is returned
564if there are no namespaces.
565
566<procedure>(sxml:ns-id->nodes obj namespace-id)</procedure>
567
568Returns a list of namespace information lists that match the given
569{{namespace-id}} in SXML element {{obj}}.  Analog of
570{{((sxpath '(@@ *NAMESPACES* namespace-id)) obj)}}.
571The empty list is returned if there is no namespace with the given
572{{namespace-id}}.
573
574<examples>
575<example>
576<expr>
577(sxml:ns-id->nodes
578  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")))) 'c)
579</expr>
580<result>((c "http://www.cars.com/xml"))</result>
581</example>
582</examples>
583
584<procedure>(sxml:ns-id->uri obj namespace-id)</procedure>
585
586Returns the URI for the (first) namespace matching the given
587{{namespace-id}}, or {{#f}} if no namespace matches the given
588{{namespace-id}}.
589
590<examples>
591<example>
592<expr>
593(sxml:ns-id->uri
594  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")))) 'c)
595</expr>
596<result>"http://www.cars.com/xml"</result>
597</example>
598</examples>
599
600<procedure>(sxml:ns-uri->nodes obj uri)</procedure>
601
602Returns a list of namespace information lists that match the given
603{{uri}} in SXML element {{obj}}.
604
605<examples>
606<example>
607<expr>
608(sxml:ns-uri->nodes
609  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")
610                                 (d "http://www.cars.com/xml"))))
611  "http://www.cars.com/xml")
612</expr>
613<result>((c "http://www.cars.com/xml") (d "http://www.cars.com/xml"))</result>
614</example>
615</examples>
616
617<procedure>(sxml:ns-uri->id obj uri)</procedure>
618
619Returns the namespace id for the (first) namespace matching the given
620{{uri}}, or {{#f}} if no namespace matches the given {{uri}}.
621
622<examples>
623<example>
624<expr>
625(sxml:ns-uri->id
626  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")
627                                 (d "http://www.cars.com/xml"))))
628  "http://www.cars.com/xml")
629</expr>
630<result>c</result>
631</example>
632</examples>
633
634<procedure>(sxml:ns-id ns-list)</procedure>
635
636Given a namespace information list {{ns-list}}, returns the namespace ID.
637
638<procedure>(sxml:ns-uri ns-list)</procedure>
639
640Given a namespace information list {{ns-list}}, returns the namespace URI.
641
642<procedure>(sxml:ns-prefix ns-list)</procedure>
643
644Given a namespace information list {{ns-list}}, returns the namespace
645prefix if it is present in the list.  If it's not present, returns the
646namespace ID.
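
A namespace information list is a plain list, so these accessors are easy to try in isolation.  The sketch below assumes CHICKEN 4 and {{(use sxpath-lolevel)}}, and assumes that the optional third element of the list is the original prefix, as in the SXML specification; the prefix {{cars}} is made up.

```scheme
(use sxpath-lolevel)

(sxml:ns-id  '(c "http://www.cars.com/xml"))          ; => c
(sxml:ns-uri '(c "http://www.cars.com/xml"))          ; => "http://www.cars.com/xml"
(sxml:ns-prefix '(c "http://www.cars.com/xml"))       ; => c, falls back to the namespace ID
(sxml:ns-prefix '(c "http://www.cars.com/xml" cars))  ; => cars, the recorded prefix
```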
647
648==== Data modification procedures
649
650Constructors and mutators for normalized SXML data
651 
652'''Important:''' These functions are optimized for normalized SXML
653data.  They are ''not'' applicable to arbitrary non-normalized SXML
654data.
655
656Most of the functions are provided in two variants:
657
# Side-effecting functions for linear update of the given elements.  Their names end with an exclamation mark.
659# Pure functions without side-effects which return modified elements.
660
661
662<procedure>(sxml:change-content! obj new-content)</procedure>
663<procedure>(sxml:change-content obj new-content)</procedure>
664
665Change the content of given SXML element {{obj}} to {{new-content}}.
666If {{new-content}} is an empty list then the {{obj}} is transformed to
667an empty element.  The resulting SXML element is normalized.
668
669<procedure>(sxml:change-attrlist obj new-attrlist)</procedure>
670<procedure>(sxml:change-attrlist! obj new-attrlist)</procedure>
671
672Change the attribute list of the given SXML element {{obj}} to
673{{new-attrlist}}.
674
675<procedure>(sxml:change-name obj new-name)</procedure>
676<procedure>(sxml:change-name! obj new-name)</procedure>
677
678Change the name of the given SXML element {{obj}} to {{new-name}}.
679
680<procedure>(sxml:add-attr obj attr)</procedure>
681<procedure>(sxml:add-attr! obj attr)</procedure>
682
683Returns the given SXML element {{obj}} with the attribute {{attr}}
684added to the attribute list, or {{#f}} if the attribute already exists.
685
686<procedure>(sxml:change-attr obj attr)</procedure>
687<procedure>(sxml:change-attr! obj attr)</procedure>
688
Returns SXML element {{obj}} with changed value of attribute {{attr}},
or {{#f}} if there is no attribute with the given name.
691
692{{attr}} is a list like it would occur as a member of an attribute
693list: {{(attr-name attr-value)}}.
694   
<procedure>(sxml:set-attr obj attr)</procedure>
<procedure>(sxml:set-attr! obj attr)</procedure>
697
698Returns SXML element {{obj}} with changed value of attribute {{attr}}.
699If there is no such attribute the new one is added.
700
701{{attr}} is a list like it would occur as a member of an attribute
702list: {{(attr-name attr-value)}}.
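
The pure variants of the attribute and name modifiers can be compared on one element.  This is a sketch assuming CHICKEN 4 and {{(use sxpath-lolevel)}}; the {{link}} element is invented, and results are inspected through {{sxml:attr}} and {{sxml:name}} rather than by assuming a particular attribute order.

```scheme
(use sxpath-lolevel)

;; Normalized element (explicit attribute list), as these procedures require.
(define link '(a (@ (href "/old")) "click"))

(sxml:name (sxml:change-name link 'b))                    ; => b
(sxml:attr (sxml:add-attr link '(title "Home")) 'title)   ; => "Home"
(sxml:add-attr link '(href "/new"))                       ; => #f, href already exists
(sxml:attr (sxml:change-attr link '(href "/new")) 'href)  ; => "/new"
(sxml:change-attr link '(title "Home"))                   ; => #f, no title to change
(sxml:attr (sxml:set-attr link '(title "Home")) 'title)   ; => "Home", added because absent
(sxml:attr (sxml:set-attr link '(href "/new")) 'href)     ; => "/new", changed because present
(sxml:attr link 'href)                                    ; => "/old", the original is untouched
```

Only the pure variants are used here because {{link}} is a quoted literal; the {{!}} variants should be given freshly constructed lists.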
703
704<procedure>(sxml:add-aux obj aux-node)</procedure>
705<procedure>(sxml:add-aux! obj aux-node)</procedure>
706
707Returns SXML element {{obj}} with an auxiliary node {{aux-node}} added.
708
709<procedure>(sxml:squeeze obj)</procedure>
710<procedure>(sxml:squeeze! obj)</procedure>
711
712Returns a minimized and normalized SXML element {{obj}} with empty
713lists of attributes and aux-lists eliminated, in {{obj}} and all its
714descendants.
715   
716<procedure>(sxml:clean obj)</procedure>
717
718Returns a minimized and normalized SXML element {{obj}} with empty
719lists of attributes and '''all''' aux-lists eliminated, in {{obj}} and
720all its descendants.
721
722
723==== Sxpath-related procedures
724
725<procedure>(select-first-kid test-pred?)</procedure>
726
Given a node, return the first child that satisfies the
{{test-pred?}}.  Given a nodeset, traverse the set until a node is
found whose first child matches the predicate.  Returns {{#f}} if
no such child is found.
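
For the single-node case, a short sketch (assuming CHICKEN 4 and {{(use sxpath-lolevel)}}; the element is invented) looks like this:

```scheme
(use sxpath-lolevel)

(define el '(div (@ (id "x")) (p "a") (span "b") (span "c")))

;; select-first-kid returns a procedure, which is then applied to the node.
((select-first-kid (ntype?? 'span)) el)    ; => (span "b")
((select-first-kid (ntype?? 'table)) el)   ; => #f, no such child
```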
731
732<procedure>(sxml:node-parent rootnode)</procedure>
733
734Returns a function of one argument - an SXML element - which returns
735its parent node using {{*PARENT*}} pointer in the aux-list.
736{{'*TOP-PTR*}} may be used as a pointer to root node.  It returns an
737empty list when applied to the root node.
738
739<procedure>(sxml:add-parents obj [top-ptr])</procedure>
740
741Returns the SXML element {{obj}} annotated with {{*PARENT*}} pointers
742for {{obj}} and all its descendants.  If {{obj}} is not the root node
743(a node with a name of {{*TOP*}}), you must pass in the parent pointer
744for {{obj}} as {{top-ptr}}.
745
746'''Warning:''' This procedure mutates its {{obj}} argument.
747
748<procedure>(sxml:lookup id index)</procedure>
749
750Lookup an element using its ID.  {{index}} should be an alist of
751{{(id . element)}}.
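
A minimal sketch of a lookup, assuming CHICKEN 4 and {{(use sxpath-lolevel)}}.  The index is written out by hand here so the example does not depend on anything else; it has the same {{(id . element)}} shape that {{sxml:id-alist}} is described as producing.

```scheme
(use sxpath-lolevel)

;; Hand-written index for illustration only.
(define index
  '(("hi"   . (span (@ (id "hi")) "there"))
    ("link" . (a (@ (id "link")) "click here"))))

(sxml:lookup "link" index)   ; => (a (@ (id "link")) "click here")
(sxml:lookup "nope" index)   ; => #f, unknown ID
```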
752
753==== Markup generation
754
755===== XML
756
757<procedure>(sxml:attr->xml attr)</procedure>
758
759Returns a list containing tokens that when joined together form the
760attribute's XML output.
761
'''Warning:''' This procedure assumes that the attribute's values have
already been escaped (i.e., {{sxml:string->xml}} has been called on the
strings inside it).
765
766<examples>
767<example>
768<expr>(sxml:attr->xml '(href "http://example.com"))</expr>
769<result>(" " "href" "='" "http://example.com" "'")</result>
770</example>
771</examples>
772
773<procedure>(sxml:string->xml string)</procedure>
774
775Escape the {{string}} so it can be used anywhere in XML output.  This
776converts the {{<}}, {{>}}, {{'}}, {{"}} and {{&}} characters to their
777respective entities.
778
779<procedure>(sxml:sxml->xml tree)</procedure>
780
781Convert the {{tree}} of SXML nodes to a nested list of XML fragments.
782These fragments can be output by flattening the list and concatenating
783the strings inside it.
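
Flattening and concatenating the fragments can be sketched as follows.  The helper {{fragments->string}} is hypothetical (it is not part of the egg) and assumes the nested list contains only strings and other displayable atoms; the sketch also assumes CHICKEN 4 and {{(use sxpath-lolevel)}}.

```scheme
(use sxpath-lolevel)

;; Hypothetical helper, not provided by the egg: walks a nested list of
;; fragments depth-first and writes every atom to a string port.
(define (fragments->string frags)
  (let ((out (open-output-string)))
    (let walk ((x frags))
      (cond ((pair? x) (walk (car x)) (walk (cdr x)))
            ((null? x) #f)              ; nothing to emit for the empty list
            (else (display x out))))
    (get-output-string out)))

;; Using the token list from the sxml:attr->xml example above:
(fragments->string '(" " "href" "='" "http://example.com" "'"))
;; => " href='http://example.com'"

;; The same helper applied to a whole (made-up) tree:
(display (fragments->string
          (sxml:sxml->xml '(a (@ (href "http://example.com")) "link"))))
(newline)
```

Writing the fragments straight to an output port instead of building a string avoids the intermediate concatenation, which may matter for large documents.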
784
===== HTML
786
787<procedure>(sxml:attr->html attr)</procedure>
788
789Returns a list containing tokens that when joined together form the
790attribute's HTML output.  The difference with the XML variant is that
791this encodes empty attribute values to attributes with no value (think
792{{selected}} in option elements, or {{checked}} in checkboxes).
793
'''Warning:''' This procedure assumes that the attribute's values have
already been escaped (i.e., {{sxml:string->html}} has been called on the
strings inside it).
797
798<procedure>(sxml:string->html string)</procedure>
799
Escape the {{string}} so it can be used anywhere in HTML output.  This
converts the {{<}}, {{>}}, {{"}} and {{&}} characters to their
respective entities.
803
804<procedure>(sxml:non-terminated-html-tag? tag)</procedure>
805
Is the named {{tag}} one that is "self-closing" (i.e., does not need to
be terminated) in HTML 4.0?
808
809<procedure>(sxml:sxml->html tree)</procedure>
810
811Convert the {{tree}} of SXML nodes to a nested list of HTML fragments.
812These fragments can be output by flattening the list and concatenating
813the strings inside it.
814
815
816=== Procedures from sxpathlib
817
818==== Basic converters and applicators
819
820A converter is a function
821
822  type Converter = Node|Nodelist -> Nodelist
823
824A converter can also play a role of a predicate: in that case, if a
825converter, applied to a node or a nodelist, yields a non-empty
826nodelist, the converter-predicate is deemed satisfied. Throughout this
827file a nil nodelist is equivalent to {{#f}} in denoting a failure.
828
829<procedure>(nodeset? obj)</procedure>
830
831Returns {{#t}} if {{obj}} is a nodelist.
832
833<procedure>(as-nodeset obj)</procedure>
834
If {{obj}} is a nodelist, returns it as is; otherwise wraps it in a
list.
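
The distinction between a node and a nodelist comes down to whether the first element of the list is a symbol.  A sketch, assuming CHICKEN 4 and that {{(use sxpath-lolevel)}} exports both procedures:

```scheme
(use sxpath-lolevel)

(nodeset? '((div "a") (span "b")))   ; => #t, a list of nodes
(nodeset? '())                       ; => #t, the empty nodelist
(nodeset? '(div "a"))                ; => #f, a single element node

(as-nodeset '(div "a"))              ; => ((div "a")), wrapped in a list
(as-nodeset '((div "a")))            ; => ((div "a")), returned as is
```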
837
838==== Node test
839
840The following functions implement 'Node test's as defined in Sec. 2.3
841of the XPath document.  A node test is one of the components of a
842location step.  It is also a converter-predicate in SXPath.
843
844<procedure>(sxml:element? obj)</procedure>
845
Predicate which returns {{#t}} if {{obj}} is an SXML element, otherwise {{#f}}.
847
848<procedure>(ntype-names?? crit)</procedure>
849
850Takes a list of acceptable node names as a criterion and returns a
851function, which, when applied to a node, will return {{#t}} if the
852node name is present in criterion list and {{#f}} otherwise.
853
854   ntype-names?? :: ListOfNames -> Node -> Boolean
855
856<procedure>(ntype?? crit)</procedure>
857
858Takes a type criterion and returns a function, which, when applied to
859a node, will tell if the node satisfies the test.
860
861  ntype?? :: Crit -> Node -> Boolean
862
863The criterion {{crit}} is  one of the following symbols:
864
865; {{@}} : tests if the Node is an {{attributes-list}}
866; {{*}} : tests if the Node is an {{Element}}
867; {{*text*}} : tests if the Node is a text node
868; {{*data*}} : tests if the Node is a data node  (text, number, boolean, etc., but not pair)
869; {{*PI*}} : tests if the Node is a processing instructions node
870; {{*COMMENT*}} : tests if the Node is a comment node
871; {{*ENTITY*}} : tests if the Node is an entity node
872; {{*any*}} : {{#t}} for any type of Node
873; other symbol : tests if the Node has the right name given by the symbol
874
875<examples>
876<example>
877<expr>
878((ntype?? 'div) '(div (@ (class "greeting")) "hi"))
879</expr>
880<result>
881#t
882</result>
883</example>
884<example>
885<expr>
886((ntype?? 'div) '(span (@ (class "greeting")) "hi"))
887</expr>
888<result>
889#f
890</result>
891</example>
892<example>
893<expr>
894((ntype?? '*) '(span (@ (class "greeting")) "hi"))
895</expr>
896<result>
897#t
898</result>
899</example>
900</examples>
901   
902<procedure>(ntype-namespace-id?? ns-id)</procedure>
903
904This function takes a namespace-id, and returns a predicate
905{{Node -> Boolean}}, which is {{#t}} for nodes with the given
906namespace id. {{ns-id}} is a string.
907{{(ntype-namespace-id?? #f)}} will be {{#t}} for nodes with
908non-qualified names.
909
910<procedure>(sxml:complement pred)</procedure>
911
912This function takes a predicate and returns it complemented, that is
913if the given predicate yields {{#f}} or {{'()}} the complemented one
914yields the given node and vice versa.
915
916<procedure>(node-eq? other)</procedure>
917
918Returns a predicate procedure that, given a node, returns {{#t}} if
919the node is the exact same as {{other}}.
920
921<procedure>(node-equal? other)</procedure>
922
923Returns a predicate procedure that, given a node, returns {{#t}} if
924the node has the same contents as {{other}}.
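
The difference between the two shows up when a tree contains structurally
identical nodes. In this sketch, {{a}} and {{b}} are made-up nodes built
with {{list}} so that they are distinct objects with the same contents:

<examples>
<example>
<expr>
(let ((a (list 'div "hi"))
      (b (list 'div "hi")))
  (list ((node-eq? a) a)
        ((node-eq? a) b)
        ((node-equal? a) b)))
</expr>
<result>
(#t #f #t)
</result>
</example>
</examples>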
925
926<procedure>(node-pos n)</procedure>
927
928Returns a procedure that, given a nodelist, returns a new nodelist
929containing only the {{n}}th element, counting from 1.  If {{n}} is
930negative, it returns a nodelist with the {{n}}th element counting from
931the right.  If no such node exists, returns the empty list.  {{n}} may
932not equal zero.
933
934<examples>
935<example>
936<expr>
937((node-pos 1) '((div "hi") (span "hello") (em "really, hi!")))
938</expr>
939<result>
940((div "hi"))
941</result>
942</example>
943<example>
944<expr>
945((node-pos 6) '((div "hi") (span "hello") (em "really, hi!")))
946</expr>
947<result>
948()
949</result>
950</example>
951<example>
952<expr>
953((node-pos -1) '((div "hi") (span "hello") (em "is this thing on?")))
954</expr>
955<result>
956((em "is this thing on?"))
957</result>
958</example>
959</examples>
960
961<procedure>(sxml:filter pred?)</procedure>
962
963Returns a procedure that accepts a nodelist or a node (which will be
964converted to a one-element nodelist) and returns only those nodes for
965which the predicate {{pred?}} does not return {{#f}} or {{'()}}.
966
967<examples>
968<example>
969<expr>
970((sxml:filter (ntype?? 'div)) '((div "hi") (span "hello") (div "still here?")))
971</expr>
972<result>
973((div "hi") (div "still here?"))
974</result>
975</example>
976</examples>
977
978<procedure>(take-until pred?)</procedure>
979<procedure>(take-after pred?)</procedure>
980
981Returns a procedure that accepts a node or a nodelist.
982
983The {{take-until}} variant returns everything ''before'' the first
984node for which the predicate {{pred?}} returns anything but {{#f}} or
985{{'()}}.  In other words, it returns the longest prefix for which the
986predicate returns {{#f}} or {{'()}}.
987
988The {{take-after}} variant returns everything ''after'' the first node
989for which the predicate {{pred?}} returns anything besides {{#f}} or
990{{'()}}.
991
992<examples>
993<example>
994<expr>
995((take-until (ntype?? 'span)) '((div "hi") (span "hello") (span "there") (div "still here?")))
996</expr>
997<result>
998((div "hi"))
999</result>
1000</example>
1001<example>
1002<expr>
1003((take-after (ntype?? 'span)) '((div "hi") (span "hello") (span "there") (div "still here?")))
1004</expr>
1005<result>
1006((span "there") (div "still here?"))
1007</result>
1008</example>
1009</examples>
1010
1011<procedure>(map-union proc list)</procedure>
1012
Applies {{proc}} to each element of the nodelist {{list}} and returns
the list of results.  If {{proc}} returns a nodelist, it is spliced
into the result (essentially returning a flattened nodelist).
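
For instance, when the mapped procedure is a {{select-kids}} converter
(which returns a nodelist per node), the per-node results end up in one
flat nodelist. The input lists here are made up for illustration:

<examples>
<example>
<expr>
(map-union (select-kids (ntype?? 'li))
           '((ul (li "a") (li "b"))
             (ol (li "c"))))
</expr>
<result>
((li "a") (li "b") (li "c"))
</result>
</example>
</examples>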
1016
1017<procedure>(node-reverse node-or-nodelist)</procedure>
1018
Accepts a nodelist and reverses the nodes inside.  If a node is passed
to this procedure, it returns a nodelist containing just that node
(it does not change the order of the children).
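
A short sketch of both cases, using made-up nodes: a nodelist is
reversed, while a single node is only wrapped in a nodelist.

<examples>
<example>
<expr>
(node-reverse '((div "hi") (span "hello") (em "bye")))
</expr>
<result>
((em "bye") (span "hello") (div "hi"))
</result>
</example>
<example>
<expr>
(node-reverse '(div (span "hello") (em "bye")))
</expr>
<result>
((div (span "hello") (em "bye")))
</result>
</example>
</examples>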
1022
1023==== Converter combinators
1024
1025Combinators are higher-order functions that transmogrify a converter
1026or glue a sequence of converters into a single, non-trivial
1027converter. The goal is to arrive at converters that correspond to
1028XPath location paths.
1029
From a different point of view, a combinator is a fixed, named
''pattern'' of applying converters. Given below is a complete set of
such patterns that together implement the XPath location path
specification. As it turns out, all these combinators can be built
from a small number of basic blocks: regular functional composition,
{{map-union}} and filter applicators, and the nodelist union.
1036
1037<procedure>(select-kids pred?)</procedure>
1038
Returns a procedure that accepts a node and returns a nodelist of the
node's children that satisfy {{pred?}} (i.e., {{pred?}} returns anything
but {{#f}} or {{'()}}).
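
As an illustration on a made-up paragraph node, the same node can be
queried for its text children or for its {{em}} children:

<examples>
<example>
<expr>
((select-kids (ntype?? '*text*)) '(p "Hello, " (em "world") "!"))
</expr>
<result>
("Hello, " "!")
</result>
</example>
<example>
<expr>
((select-kids (ntype?? 'em)) '(p "Hello, " (em "world") "!"))
</expr>
<result>
((em "world"))
</result>
</example>
</examples>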
1042
1043<procedure>(node-self pred?)</procedure>
1044
1045Similar to {{select-kids}} but applies to the node itself rather than
1046to its children. The resulting Nodelist will contain either one
1047component (the node), or will be empty (if the node failed the
1048predicate).
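
A minimal sketch of the two possible outcomes, on made-up nodes:

<examples>
<example>
<expr>
((node-self (ntype?? 'p)) '(p "text"))
</expr>
<result>
((p "text"))
</result>
</example>
<example>
<expr>
((node-self (ntype?? 'p)) '(div "text"))
</expr>
<result>
()
</result>
</example>
</examples>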
1049
1050<procedure>(node-join . selectors)</procedure>
1051
Returns a procedure that accepts a nodelist or a node, and returns a
nodelist with all the selectors applied to every node in sequence.
The selectors must function as converter combinators, i.e., they must
accept a ''node'' and output a ''nodelist''.
1056
1057<examples>
1058<example>
1059<expr>
1060((node-join
1061  (select-kids (ntype?? 'li))
1062  sxml:content)
1063 '((ul (@ (class "whiskies"))
1064       (li "Ardbeg")
1065       (li "Glenfarclas")
1066       (li "Springbank"))))
1067</expr>
1068<result>
1069("Ardbeg" "Glenfarclas" "Springbank")
1070</result>
1071</example>
1072</examples>
1073
1074<procedure>(node-reduce . converters)</procedure>
1075
1076A regular functional composition of converters.
1077
1078From a different point of view,
1079  ((apply node-reduce converters) nodelist)
1080is equivalent to
1081  (fold apply nodelist converters)
1082i.e., folding, or reducing, a list of converters with the nodelist
1083as a seed.
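
As a sketch on a made-up tree, composing two {{select-kids}} converters
first selects the {{ul}} children of the node and then the {{li}}
children of that intermediate result:

<examples>
<example>
<expr>
((node-reduce
  (select-kids (ntype?? 'ul))
  (select-kids (ntype?? 'li)))
 '(div (ul (li "a") (li "b")) (p "c")))
</expr>
<result>
((li "a") (li "b"))
</result>
</example>
</examples>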
1084
1085
1086<procedure>(node-or . converters)</procedure>
1087
This combinator applies all converters to a given node and produces
the union of their results.  It corresponds to the union operation
({{|}}) for XPath location paths.
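
A sketch on a made-up node follows. In this example the matches of the
first converter are listed before those of the second, so the result is
not in document order:

<examples>
<example>
<expr>
((node-or
  (select-kids (ntype?? 'h1))
  (select-kids (ntype?? 'p)))
 '(div (p "intro") (h1 "Title") (p "body")))
</expr>
<result>
((h1 "Title") (p "intro") (p "body"))
</result>
</example>
</examples>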
1091
1092<procedure>(node-closure test-pred?)</procedure>
1093
1094Select all ''descendants'' of a node that satisfy a
1095converter-predicate.  This combinator is similar to {{select-kids}}
1096but applies to grandchildren as well.
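
In the following sketch (the tree is made up), the shallower {{em}}
element is reported before the more deeply nested one, because the tree
is searched level by level:

<examples>
<example>
<expr>
((node-closure (ntype?? 'em))
 '(div (p "a" (em "b")) (em "c")))
</expr>
<result>
((em "c") (em "b"))
</result>
</example>
</examples>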
1097
1098<procedure>(node-trace title)</procedure>
1099
1100Returns a procedure that accepts a node or a nodelist, which it
1101pretty-prints to the current output port, preceded by {{title}}.  It
1102returns the node or the nodelist unchanged.  This is a useful
1103debugging aid, since it doesn't really do anything besides print its
1104argument and pass it on.
1105
1106<procedure>(sxml:node? obj)</procedure>
1107
1108Returns {{#t}} if the given {{obj}} is an SXML node, {{#f}} otherwise.
1109A node is anything except an attribute list or an auxiliary list.
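
For example, mapping the predicate over a made-up list holding an
element, a text node and an attribute list:

<examples>
<example>
<expr>
(map sxml:node? '((div "hi") "text" (@ (id "x"))))
</expr>
<result>
(#t #t #f)
</result>
</example>
</examples>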
1110
1111<procedure>(sxml:attr-list node)</procedure>
1112
1113Returns the list of attributes for a given SXML node.  The empty list
1114is returned if the given node is not an element, or if it has no list
1115of attributes.
1116
This differs from {{sxml:attr-list-u}} in that this procedure accepts
any SXML node while {{sxml:attr-list-u}} only accepts nodelists or
elements.  This means that {{sxml:attr-list-u}} will throw an error if
you pass it a text node (a string), while {{sxml:attr-list}} will not.
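
A brief sketch with made-up nodes, showing an element with attributes
and a text node:

<examples>
<example>
<expr>
(sxml:attr-list '(div (@ (id "main") (class "wide")) "hi"))
</expr>
<result>
((id "main") (class "wide"))
</result>
</example>
<example>
<expr>
(sxml:attr-list "just text")
</expr>
<result>
()
</result>
</example>
</examples>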
1121
1122<procedure>(sxml:attribute test-pred?)</procedure>
1123
Like {{sxml:filter}}, but considers the attributes instead of the
nodes.  Returns a nodelist of attributes that match {{test-pred?}}.
1126
1127<examples>
1128<example>
1129<expr>
1130((sxml:attribute (ntype?? 'id))
1131 '((div (@ (id "navigation")) "navigation here")
1132   (div (@ (class "pullquote")) "random stuff")
1133   (div (@ (id "main-content")) "lorem ipsum ...")))
1134</expr>
1135<result>
1136((id "navigation") (id "main-content"))
1137</result>
1138</example>
1139</examples>
1140
1141<procedure>(sxml:child test-pred?)</procedure>
1142
1143This procedure is similar to {{select-kids}}, but it returns an empty
1144child-list for PI, Comment and Entity nodes.
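
The difference can be seen on a made-up comment node: {{select-kids}}
treats the comment text as a child, while {{sxml:child}} does not.

<examples>
<example>
<expr>
((select-kids (ntype?? '*text*)) '(*COMMENT* "a remark"))
</expr>
<result>
("a remark")
</result>
</example>
<example>
<expr>
((sxml:child (ntype?? '*text*)) '(*COMMENT* "a remark"))
</expr>
<result>
()
</result>
</example>
</examples>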
1145
1146<procedure>(sxml:parent test-pred?)</procedure>
1147
Returns a procedure that accepts a root-node, and returns another
procedure.  This second procedure accepts a nodeset (or a node) and
returns the immediate parents of the nodes in the set, but only
those parents that match the predicate.
1152
1153The root-node does not have to be the root node of the
1154whole SXML tree -- it may be a root node of a branch of interest.
1155
1156This procedure can be used with any SXML node.
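
The two-step calling convention can be sketched as follows. The tree
{{doc}} is made up; note that the {{li}} node is taken from the tree
itself (via {{node-closure}}) rather than written out again as a
separate literal, since the parent has to be found inside {{doc}}:

<examples>
<example>
<expr>
(let* ((doc '(div (p "one") (ul (li "two"))))
       (li (car ((node-closure (ntype?? 'li)) doc))))
  (((sxml:parent (ntype?? '*any*)) doc) li))
</expr>
<result>
((ul (li "two")))
</result>
</example>
</examples>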
1157
1158==== Useful shortcuts
1159
<procedure>(node-parent rootnode)</procedure>

{{(node-parent rootnode)}} yields a converter that returns the parent of
the node it is applied to. If applied to a nodelist, it returns the list
of parents of the nodes in the nodelist.

This is equivalent to {{((sxml:parent (ntype?? '*any*)) rootnode)}}.
1167
1168<procedure>(sxml:child-nodes node)</procedure>
1169
1170Returns all the child nodes of the given {{node}}.
1171
1172This is equivalent to {{((sxml:child sxml:node?) node)}}.
1173
1174<procedure>(sxml:child-elements node)</procedure>
1175
Returns all the child ''elements'' of the given {{node}} (i.e., it
excludes any text nodes).
1178
1179This is equivalent to {{((select-kids sxml:element?) node)}}.
1180
1181=== Procedures from sxpath-ext
1182
1183==== SXML counterparts to W3C XPath Core Functions Library
1184
1185<procedure>(sxml:string object)</procedure>
1186
1187The counterpart to XPath 'string' function (section 4.2 XPath 1.0 Rec.).
1188Converts a given object to a string.
1189
1190Notes:
1191# When converting a nodeset, document order is not preserved
1192# {{number->string}} returns the result in a form which is slightly different from XPath Rec. specification
1193
1194<procedure>(sxml:boolean object)</procedure>
1195
1196The counterpart to XPath 'boolean' function (section 4.3 XPath Rec.).
1197Converts its argument to a boolean.
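
As a sketch of the XPath conversion rules (zero, the empty string and
the empty nodeset are false), applied to a made-up list of values:

<examples>
<example>
<expr>
(map sxml:boolean '(0 1 "" "x" () ((p "x"))))
</expr>
<result>
(#f #t #f #t #f #t)
</result>
</example>
</examples>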
1198
1199<procedure>(sxml:number object)</procedure>
1200
1201The counterpart to XPath 'number' function (section 4.4 XPath Rec.).
1202Converts its argument to a number.
1203
1204Notes:
1205# The argument is not optional (yet?)
1206# string->number conversion is not IEEE 754 round-to-nearest
1207# NaN is represented as 0
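
A sketch of these conversions on made-up inputs. The last value
illustrates the note about NaN being represented as 0:

<examples>
<example>
<expr>
(map sxml:number '(42 "42" #t #f "abc"))
</expr>
<result>
(42 42 1 0 0)
</result>
</example>
</examples>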
1208
1209<procedure>(sxml:string-value node)</procedure>
1210
Returns a string value for a given node in accordance with
sections 5.1 to 5.7 of the XPath Rec.
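
For an element, this amounts to concatenating the text of all its
descendants, as in this sketch on a made-up node:

<examples>
<example>
<expr>
(sxml:string-value '(p "Hello, " (em "world") "!"))
</expr>
<result>
"Hello, world!"
</result>
</example>
</examples>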
1213
1214<procedure>(sxml:id id-index)</procedure>
1215
1216Returns a procedure that accepts a nodeset and returns a nodeset
1217containing the elements in the id-index that match the string-values
1218of each entry of the nodeset.  XPath Rec. 4.1
1219
1220The {{id-index}} is an alist with unique IDs as key, and elements as
1221values:
1222
1223  id-index = ( (id-value . element) (id-value . element) ... )
1224
1225==== Comparators for XPath objects
1226
1227<procedure>(sxml:list-head list n)</procedure>
1228
1229Returns the {{n}} first members of {{list}}.  Mostly equivalent to
1230SRFI-1's {{take}} procedure, except it returns the {{list}} if {{n}}
1231is larger than the length of said list, instead of throwing an error.
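
A short sketch of both cases, on a made-up list:

<examples>
<example>
<expr>
(sxml:list-head '(a b c d) 2)
</expr>
<result>
(a b)
</result>
</example>
<example>
<expr>
(sxml:list-head '(a b c d) 10)
</expr>
<result>
(a b c d)
</result>
</example>
</examples>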
1232
1233<procedure>(sxml:merge-sort less-than? list)</procedure>
1234
1235Returns the sorted list, the smallest member first.
1236  less-than? ::= (lambda (obj1 obj2) ...)
1237{{less-than?}} returns {{#t}} if {{obj1 < obj2}} with respect to the
1238given ordering.
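
For example, with the standard {{<}} and {{string<?}} procedures as the
ordering and made-up input lists:

<examples>
<example>
<expr>
(sxml:merge-sort < '(3 1 4 1 5 9 2 6))
</expr>
<result>
(1 1 2 3 4 5 6 9)
</result>
</example>
<example>
<expr>
(sxml:merge-sort string<? '("pear" "apple" "fig"))
</expr>
<result>
("apple" "fig" "pear")
</result>
</example>
</examples>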
1239
1240<procedure>(sxml:equality-cmp bool=? number=? string=?)</procedure>
1241
1242A helper for XPath equality operations: {{=}} , {{!=}}.  The
1243{{bool=?}}, {{number=?}} and {{string=?}} arguments are comparison
1244operations for booleans, numbers and strings respectively.
1245
1246Returns a procedure that accepts two objects, looks at the first
1247object's type and applies the correct comparison predicate to it.
1248Type coercion takes place depending on the rules described in the
1249XPath 1.0 spec, section 3.4 ("Booleans").
1250
1251<procedure>(sxml:equal? obj1 obj2)</procedure>
1252<procedure>(sxml:not-equal? obj1 obj2)</procedure>
1253
1254Equality procedures with the default comparison operators {{eq?}},
1255{{=}} and {{string=?}}, or their inverse, respectively.
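
The XPath coercion rules mean that values of different types can
compare as equal. In this sketch on made-up values, the string {{"1"}}
is compared as a number, and a non-empty string is compared as the
boolean true:

<examples>
<example>
<expr>
(list (sxml:equal? 1 "1")
      (sxml:equal? #t "non-empty")
      (sxml:not-equal? "a" "b"))
</expr>
<result>
(#t #t #t)
</result>
</example>
</examples>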
1256
1257<procedure>(sxml:relational-cmp op)</procedure>
1258
1259A helper for XPath relational operations: {{<}}, {{>}}, {{<=}}, {{>=}}
1260for two XPath objects.  {{op}} is one of these operators.
1261
1262Returns a procedure that accepts two objects and returns the value of
1263the procedure applied to these objects, converted according to the
1264coercion rules described in the XPath 1.0 spec, section 3.4
1265("Booleans").
1266
1267==== XPath axes
1268
1269<procedure>(sxml:ancestor test-pred?)</procedure>
1270
1271Like {{sxml:parent}}, except it returns all the ancestors that match
1272{{test-pred?}}, not just the immediate parent.
1273
1274<procedure>(sxml:ancestor-or-self test-pred?)</procedure>
1275
1276Like {{sxml:ancestor}}, except also allows the node itself to match
1277the predicate.
1278
1279<procedure>(sxml:descendant test-pred?)</procedure>
1280
1281Like {{node-closure}}, except the resulting nodeset is in depth-first
1282order instead of breadth-first.
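
The difference in ordering can be sketched on a made-up tree in which
one {{em}} element is nested more deeply but occurs earlier in the
document than the other:

<examples>
<example>
<expr>
((sxml:descendant (ntype?? 'em))
 '(div (p "a" (em "b")) (em "c")))
</expr>
<result>
((em "b") (em "c"))
</result>
</example>
<example>
<expr>
((node-closure (ntype?? 'em))
 '(div (p "a" (em "b")) (em "c")))
</expr>
<result>
((em "c") (em "b"))
</result>
</example>
</examples>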
1283
1284<procedure>(sxml:descendant-or-self test-pred?)</procedure>
1285
1286Like {{sxml:descendant}}, except also allows the node itself to match
1287the predicate.
1288
1289<procedure>(sxml:following test-pred?)</procedure>
1290
1291Returns a procedure that accepts a root node and returns a new
1292procedure that accepts a node and returns all nodes following this
1293node in the document source matching the predicate.
1294
1295<procedure>(sxml:following-sibling test-pred?)</procedure>
1296
1297Like {{sxml:following}}, except only siblings (nodes at the same level
1298under the same parent) are returned.
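
A sketch of the calling convention, using a made-up list. The context
node {{first-li}} is taken from the tree {{doc}} itself, and {{doc}} is
passed as the root node:

<examples>
<example>
<expr>
(let* ((doc '(ul (li "a") (li "b") (li "c")))
       (first-li (car ((select-kids (ntype?? 'li)) doc))))
  (((sxml:following-sibling (ntype?? 'li)) doc) first-li))
</expr>
<result>
((li "b") (li "c"))
</result>
</example>
</examples>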
1299
1300<procedure>(sxml:preceding test-pred?)</procedure>
1301
1302Returns a procedure that accepts a root node and returns a new
1303procedure that accepts a node and returns all nodes preceding this
1304node in the document source matching the predicate.
1305
1306<procedure>(sxml:preceding-sibling test-pred?)</procedure>
1307
1308Like {{sxml:preceding}}, except only siblings (nodes at the same level
1309under the same parent) are returned.
1310
1311<procedure>(sxml:namespace test-pred?)</procedure>
1312
1313Returns a procedure that accepts a nodeset and returns the namespace
1314lists of the nodes matching {{test-pred?}}.
1315
1316
1317== About this egg
1318
1319=== Author
1320
1321[[http://okmij.org/ftp/|Oleg Kiselyov]], [[http://www196.pair.com/lisovsky/|Kirill Lisovsky]], [[http://modis.ispras.ru/Lizorkin/index.html|Dmitry Lizorkin]].
1322
1323=== Version history
1324
1325; 0.1 : Split up the old sxml-tools egg into sxpath
1326
1327=== License
1328
1329The sxml-tools are in the public domain.