
Last change on this file since 13421 was 13421, checked in by sjamaan, 12 years ago

Clarify the use of ns-bindings, fix a few examples, fix hyperlink syntax

[[tags:eggs]]

This is version 0.1 of the '''sxpath''' extension library for Chicken Scheme.

[[toc:]]

== Description

This egg contains the sxpath parts of the [[http://cvs.sourceforge.net/viewcvs.py/ssax/sxml-tools/|sxml-tools]] from the [[http://ssax.sf.net|SSAX project]] at Sourceforge.
Because txpath and sxpath are interwoven, this egg also includes txpath parts.

== Documentation

This egg provides the sxpath-related tools from the sxml-tools available
in the SSAX/SXML Sourceforge project.

It is split up into three modules: [[#sxpath|sxpath]], [[#txpath|txpath]]
and [[#sxpath-lolevel]]. {{sxpath}} depends on {{txpath}} and both
modules depend on {{sxpath-lolevel}}.

Much documentation is available at
[[http://www196.pair.com/lisovsky/xml/index.html|Lisovsky's XML page]]
and the [[http://ssax.sf.net|SSAX homepage]].

The initial documentation on this wiki page came straight from the
comments in the extremely well-documented source code. It's
recommended you read the code if you want to learn more.

== sxpath

This is the preferred interface to use.  It allows you to query the
SXML document tree using an s-expression based language, in which you
can also use arbitrary procedures and even "classic" textual XPath
(see [[#txpath|below]] for docs on that).

A complete description of how to use this is outside the scope of this
egg documentation. See
[[http://www196.pair.com/lisovsky/query/sxpath/|the introduction to SXPath]]
for that.

<procedure>(sxpath path [ns-binding])</procedure>

Returns a procedure that accepts an SXML document tree and returns a
nodeset (list of nodes) that match the {{path}} expression.

The optional {{ns-binding}} argument is an alist of namespace
bindings.  It is used to map abbreviated namespace prefixes to full
URI strings but ''only for textual XPath strings'' embedded in the
{{path}} expression.

It can be useful to compare the following examples to those for
[[#txpath|txpath]].

<examples>
<example>
<expr>
;; selects all the 'item' elements that have an 'olist' parent
;; (which is not root) and that are in the same document as the context node
((sxpath `(// olist item))
 '(doc (olist (item "1")) (item "2") (nested (olist (item "3")))))
</expr>
<result>
((item "1") (item "3"))
</result>
</example>
<example>
<expr>
;; selects the 'chapter' children of the context node that have one or
;; more 'title' children with string-value equal to 'Introduction'
((sxpath '((chapter ((equal? (title "Introduction"))))))
 '(text  (chapter (title "Introduction"))  (chapter "No title for this chapter")  (chapter (title "Conclusion"))))
</expr>
<result>
((chapter (title "Introduction")))
</result>
</example>
<example>
<expr>
;; (sxpath string-expr) is equivalent to (txpath string-expr)
((sxpath "chapter[title='Introduction']")
 '(text  (chapter (title "Introduction"))  (chapter "No title for this chapter")  (chapter (title "Conclusion"))))
</expr>
<result>
((chapter (title "Introduction")))
</result>
</example>
</examples>


<procedure>(if-sxpath path)</procedure>

Like {{sxpath}}, but returns {{#f}} instead of the empty list if
nothing matches (so it does ''not'' always return a nodeset).

<procedure>(car-sxpath path)</procedure>

Like {{sxpath}}, but instead of a nodeset it returns the first node
found.  If no node was found, it returns '''an empty list'''.

<procedure>(if-car-sxpath path)</procedure>

Like {{car-sxpath}}, but returns {{#f}} instead of the empty list if
nothing matches.
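
The difference between these variants only shows when a path matches nothing, or when a single node is wanted instead of a nodeset.  The following is a minimal sketch, assuming the egg has been loaded with {{(use sxpath)}}; the sample documents are made up for illustration.

<examples>
<example>
<expr>
;; Assumes (use sxpath).  A non-empty result is returned as a nodeset.
((if-sxpath '(// olist item))
 '(doc (olist (item "1")) (item "2")))
</expr>
<result>
((item "1"))
</result>
</example>
<example>
<expr>
;; There is no 'ulist' element, so #f is returned instead of ().
((if-sxpath '(// ulist item))
 '(doc (olist (item "1")) (item "2")))
</expr>
<result>
#f
</result>
</example>
<example>
<expr>
;; Only the first matching node is returned, not a nodeset.
((car-sxpath '(// olist item))
 '(doc (olist (item "1") (item "2"))))
</expr>
<result>
(item "1")
</result>
</example>
<example>
<expr>
;; Same path as the second example, but through if-car-sxpath.
((if-car-sxpath '(// ulist item))
 '(doc (olist (item "1")) (item "2")))
</expr>
<result>
#f
</result>
</example>
</examples>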

<procedure>(sxml:id-alist node . lpaths)</procedure>

Builds an index as a list of {{(ID_value . element)}} pairs for the given
{{node}}. {{lpaths}} are location paths for attributes of type ID (i.e.,
sxpath expressions that tell it how to find the ID attribute).

Note: location paths ''must'' be of the form {{(expr '@ attrib-name)}}.

See also {{sxml:lookup}} below, in {{sxpath-lolevel}}, which can use
this index.

<examples>
<example>
<expr>
;; TODO: find out why location paths must be of the form (expr '@ symbol)
;;       or if this description is incorrect
(sxml:id-alist
 '(div (span (@ (id "hi")) "there")
       (div (@ (id "hello")) "dude")
       (a (@ (id "link")) "click here"))
 '(span @ id) '(a @ id))
</expr>
<result>
(("hi" . (span (@ (id "hi")) "there"))
 ("link" . (a (@ (id "link")) "click here")))
</result>
</example>
</examples>

== txpath

This section documents the txpath interface. This interface is mostly
useful for programs that deal exclusively with "legacy" textual XPath
queries.

=== High-level interface

The following procedures are the main interface one would use in
practice. There are also more low-level procedures (see next section),
which one could use to build txpath extensions.

<procedure>(sxml:xpath string . ns-binding)</procedure>
<procedure>(txpath string . ns-binding)</procedure>
<procedure>(sxml:xpath+root string . ns-binding)</procedure>
<procedure>(sxml:xpath+root+vars string . ns-binding)</procedure>

Returns a procedure that accepts an SXML document tree and returns a
nodeset (list of nodes) that match the XPath expression {{string}}.

The optional {{ns-binding}} argument is an alist of namespace
bindings.  It is used to map abbreviated namespace prefixes to full
URI strings.

{{(txpath x)}} is equivalent to {{(sxpath x)}} whenever {{x}} is a
string.  The {{txpath}}, {{sxml:xpath+root}} and
{{sxml:xpath+root+vars}} procedures are currently all aliases for
{{sxml:xpath}}, which exist for backwards compatibility reasons.

It's useful to compare the following examples to the above examples
for [[#sxpath|sxpath]].

<examples>
<example>
<expr>
;; selects all the 'item' elements that have an 'olist' parent
;; (which is not root) and that are in the same document as the context node
((txpath "//olist/item")
 '(doc (olist (item "1")) (item "2") (nested (olist (item "3")))))
</expr>
<result>
((item "1") (item "3"))
</result>
</example>
<example>
<expr>
;; Same example as above, but now with a namespace prefix of 'x',
;; which is bound to the namespace "bar" in the ns-binding parameter.
((txpath "//x:olist/item" '((x . "bar")))
 '(doc (bar:olist (item "1")) (item "2") (nested (olist (item "3")))))
</expr>
<result>
((item "1"))
</result>
</example>
<example>
<expr>
;; selects the 'chapter' children of the context node that have one or
;; more 'title' children with string-value equal to 'Introduction'
((txpath "chapter[title='Introduction']")
 '(text  (chapter (title "Introduction"))  (chapter "No title for this chapter")  (chapter (title "Conclusion"))))
</expr>
<result>
((chapter (title "Introduction")))
</result>
</example>
</examples>

<procedure>(sxml:xpath+index string . ns-binding)</procedure>

This procedure returns the result of {{sxml:xpath}} consed onto
{{#t}}.  If {{sxml:xpath}} would return {{#f}}, this returns
{{#f}} instead.

It is provided solely for backwards compatibility.


<procedure>(sxml:xpointer string . ns-binding)</procedure>
<procedure>(sxml:xpointer+root+vars string . ns-binding)</procedure>

Returns a procedure that accepts an SXML document tree and returns a
nodeset (list of nodes) that match the XPointer expression {{string}}.

The optional {{ns-binding}} argument is an alist of namespace
bindings.  It is used to map abbreviated namespace prefixes to full
URI strings.

Currently, only the XPointer {{xmlns()}} and {{xpointer()}} schemes
are implemented; the {{element()}} scheme is not.

<examples>
<example>
<expr>
;; selects all the 'item' elements that have an 'olist' parent
;; (which is not root) and that are in the same document as the context node.
;; Equivalent to (txpath "//olist/item").
((sxml:xpointer "xpointer(//olist/item)")
 '(doc (olist (item "1")) (item "2") (nested (olist (item "3")))))
</expr>
<result>
((item "1") (item "3"))
</result>
</example>
<example>
<expr>
;; An example with a namespace prefix, now using the XPointer xmlns()
;; function instead of the ns-binding parameter. xmlns() bindings always
;; have full namespace names on their right-hand side, never bound shortcuts.
((sxml:xpointer "xmlns(x=bar)xpointer(//x:olist/item)")
 '(doc (bar:olist (item "1")) (item "2") (nested (olist (item "3")))))
</expr>
<result>
((item "1"))
</result>
</example>
</examples>

<procedure>(sxml:xpointer+index string . ns-binding)</procedure>

This procedure returns the result of {{sxml:xpointer}} consed onto
{{#t}}.  If {{sxml:xpointer}} would return {{#f}}, this returns
{{#f}} instead.

It is provided solely for backwards compatibility.


<procedure>(sxml:xpath-expr string . ns-binding)</procedure>

Returns a procedure that accepts an SXML node and returns {{#t}} if
the node matches the {{string}} expression.  This is an expression of
type {{Expr}}, which is whatever you can put in a predicate (between
square brackets after a node name).

The optional {{ns-binding}} argument is an alist of namespace
bindings.  It is used to map abbreviated namespace prefixes to full
URI strings.

<examples>
<example>
<expr>
;; Does the node have a class attribute with "content" as value?
((sxml:xpath-expr "@class=\"content\"")
 '(div (@ (class "content")) (p "Lorem ipsum")))
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
;; Does the node have a paragraph with string value of "Lorem ipsum"?
((sxml:xpath-expr "p=\"Lorem ipsum\"")
 '(div (@ (class "content")) (p "Lorem ipsum")))
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
;; Does the node have a "p" child node with string value of "Blah"?
((sxml:xpath-expr "p=\"Blah\"")
 '(div (@ (class "content")) (p "Lorem ipsum")))
</expr>
<result>
#f
</result>
</example>
</examples>


=== Low-level procedures

These procedures can be used to create custom xpath parsers.

TODO: document these

   txp:parameterize-parser
   sxml:whitespace
   txp:signal-semantic-error
   txp:error?

   sxml:core-last
   sxml:core-position
   sxml:core-count
   sxml:core-id
   sxml:core-local-name
   sxml:core-namespace-uri
   sxml:core-name
   sxml:core-string
   sxml:core-concat
   sxml:core-starts-with
   sxml:core-contains
   sxml:core-substring-before sxml:core-substring-after
   sxml:core-substring
   sxml:core-string-length
   sxml:core-normalize-space
   sxml:core-translate
   sxml:core-boolean
   sxml:core-not
   sxml:core-true
   sxml:core-false
   sxml:core-lang
   sxml:core-number
   sxml:core-sum
   sxml:core-floor
   sxml:core-ceiling
   sxml:core-round
   sxml:classic-params


== sxpath-lolevel

This section documents the low-level sxpath interface. It includes
mostly-generic list and SXML operators.

It consists of the extensions defined in {{sxml-tools.scm}} plus
{{sxpathlib}} and {{sxpath-ext}}.  This is equivalent to the
"low-level sxpath interface" described at
[[http://www196.pair.com/lisovsky/query/sxpath/|the introduction to SXPath]].

These utilities are useful when you want to query SXML document trees,
but full sxpath would be overkill.  Most of these procedures are
faster than their sxpath equivalent, because they are very specific.
But this also means they are very low-level, so you should use them
only if you know what you're doing.


==== Predicates

<procedure>(sxml:empty-element? obj)</procedure>

Predicate which returns {{#t}} if the given element {{obj}} is empty.
Empty elements have no nested elements, text nodes, PIs, comments or
entities but may contain attributes or a namespace-id.  It is the SXML
counterpart of XML {{empty-element}}.
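
As an illustration of this rule: attributes alone do not make an element non-empty, while a text node does.  This is a small sketch, assuming the module has been loaded with {{(use sxpath-lolevel)}}; the elements are made up.

<examples>
<example>
<expr>
;; Assumes (use sxpath-lolevel).  Only an attribute list, no content.
(sxml:empty-element? '(img (@ (src "logo.png"))))
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
;; A text node counts as content.
(sxml:empty-element? '(p "some text"))
</expr>
<result>
#f
</result>
</example>
</examples>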

<procedure>(sxml:shallow-normalized? obj)</procedure>

Returns {{#t}} if the given {{obj}} is a shallow-normalized SXML
element.  The element itself has to be normalized but its nested
elements are not tested.

<procedure>(sxml:normalized? obj)</procedure>

Returns {{#t}} if the given {{obj}} is a normalized SXML element.  The element
itself and all its nested elements have to be normalized.

<procedure>(sxml:shallow-minimized? obj)</procedure>

Returns {{#t}} if the given {{obj}} is a shallow-minimized SXML
element.  The element itself has to be minimized but its nested
elements are not tested.

<procedure>(sxml:minimized? obj)</procedure>

Returns {{#t}} if the given {{obj}} is a minimized SXML element.  The
element itself and all its nested elements have to be minimized.

==== Accessors

These procedures obtain information about nodes, or their direct
children.  They don't traverse subtrees.

===== Normalization-independent accessors

These accessors can be used on arbitrary, non-normalized SXML trees.
Because of this, they are generally slower than the
normalization-dependent variants listed in the next section.

<procedure>(sxml:name node)</procedure>

Returns the name of the given SXML node. It is introduced for the sake of
encapsulation.

<procedure>(sxml:element-name obj)</procedure>

A checked version of {{sxml:name}}, which returns {{#f}} if the given
{{obj}} is not an SXML element. Otherwise returns its name.

<procedure>(sxml:node-name obj)</procedure>

Safe version of {{sxml:name}}, which returns {{#f}} if the given {{obj}}
is not an SXML node.  Otherwise returns its name.

The difference between this and {{sxml:element-name}} is that a node
can be one of {{@}}, {{@@}}, {{*PI*}}, {{*COMMENT*}} or {{*ENTITY*}}
while an element must be a real element (any symbol not in that set is
considered to be an element).

<procedure>(sxml:ncname node)</procedure>

Like {{sxml:name}}, except returns only the local part of the name
(called an "NCName" in the
[[http://www.w3.org/TR/xml-names/|XML namespaces spec]]).

The node's name is interpreted as a "Qualified Name", a
colon-separated name of which the last part is considered to be the
local part.  If the name contains no colons, the name itself is
returned.

'''Important:''' Please note that while an SXML name is a symbol, this
function returns a string.

<procedure>(sxml:name->ns-id sxml-name)</procedure>

Given a node name, return the namespace part of the name (called a
{{namespace-id}}).  If the name contains no colons, returns {{#f}}.  See
{{sxml:ncname}} for more info.

'''Important:''' Please note that while an SXML name is a symbol, this
function returns a string.
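
The name accessors above can be compared side by side.  The following is a minimal sketch, assuming {{(use sxpath-lolevel)}}; the qualified name {{c:part}} and the attribute values are made up.  Note that {{sxml:name->ns-id}} takes a name (a symbol), not a node.

<examples>
<example>
<expr>
;; Assumes (use sxpath-lolevel).  Local part of a qualified name, as a string.
(sxml:ncname '(c:part (@ (id "p1"))))
</expr>
<result>
"part"
</result>
</example>
<example>
<expr>
;; Namespace part of a qualified name, as a string.
(sxml:name->ns-id 'c:part)
</expr>
<result>
"c"
</result>
</example>
<example>
<expr>
;; An attribute list is a node, but not an element.
(sxml:element-name '(@ (id "p1")))
</expr>
<result>
#f
</result>
</example>
<example>
<expr>
(sxml:node-name '(@ (id "p1")))
</expr>
<result>
@
</result>
</example>
</examples>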

<procedure>(sxml:content obj)</procedure>

Retrieve the contents of an SXML element or nodeset.  Any non-element
nodes (attributes, processing instructions, etc) are discarded,
while the elements and text nodes are returned as a list of strings
and nested elements in document order.  This list is empty if {{obj}}
is an empty element or empty list.

The inner elements are unmodified, so they still contain their attributes
as well as any comments or other non-element nodes.

<examples>
<example>
<expr>
(sxml:content
  '(div (@ (class "content"))
        (*COMMENT* "main contents start here")
         "The document moved "
         (a (@ (href "/other.xml")) "here")))
</expr>
<result>("The document moved " (a (@ (href "/other.xml")) "here"))</result>
</example>
</examples>

<procedure>(sxml:text node)</procedure>

Returns a string which combines all the character data from text node
children of the given SXML element or "" if there are no text node
children.  Note that it does not include text from descendant nodes,
only direct children.

<examples>
<example>
<expr>
(sxml:text
  '(div (@ (class "content"))
        (*COMMENT* "main contents start here")
         "The document moved "
         (a (@ (href "/other.xml")) "here")))
</expr>
<result>"The document moved "</result>
</example>
</examples>

===== Normalization-dependent accessors

"Universal" accessors are less efficient but may be used for
non-normalized SXML.  These safe accessors are named with suffix '-u'
for "universal".

"Fast" accessors are optimized for normalized SXML data.  They are not
applicable to arbitrary non-normalized SXML data.  Their names have no
specific suffixes.

<procedure>(sxml:content-raw obj)</procedure>

Returns all the content of a normalized SXML element except the attr-list
and aux-list.  Thus it includes {{PI}}, {{COMMENT}} and {{ENTITY}}
nodes as well as the {{TEXT}} and {{ELEMENT}} nodes returned by
{{sxml:content}}.  Returns a list of nodes in document order, or the empty
list if {{obj}} is an empty element or an empty list.

This function is faster than {{sxml:content}}.
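
To contrast this with {{sxml:content}}, here is the element from the {{sxml:content}} example passed to {{sxml:content-raw}}.  A sketch, assuming {{(use sxpath-lolevel)}}; the comment node is kept and only the attribute list is dropped.

<examples>
<example>
<expr>
;; Assumes (use sxpath-lolevel).
(sxml:content-raw
  '(div (@ (class "content"))
        (*COMMENT* "main contents start here")
        "The document moved "
        (a (@ (href "/other.xml")) "here")))
</expr>
<result>((*COMMENT* "main contents start here") "The document moved " (a (@ (href "/other.xml")) "here"))</result>
</example>
</examples>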

<procedure>(sxml:attr-list-u obj)</procedure>

Returns the list of attributes for the given element or nodeset.  Analog
of {{((sxpath '(@ *)) obj)}}.  The empty list is returned if there is no
list of attributes.

<procedure>(sxml:aux-list obj)</procedure>
<procedure>(sxml:aux-list-u obj)</procedure>

Returns the list of auxiliary nodes for the given element or nodeset.
Analog of {{((sxpath '(@@ *)) obj)}}.  The empty list is returned if a
list of auxiliary nodes is absent.

<procedure>(sxml:aux-node obj aux-name)</procedure>

Return the first aux-node with the given {{aux-name}} in SXML element
{{obj}}, or {{#f}} if such a node is absent.

'''NOTE:''' it returns just the ''first'' node found even if multiple
nodes are present, so it's mostly intended for nodes with unique names.
Use {{sxml:aux-nodes}} if you want all of them.

<procedure>(sxml:aux-nodes obj aux-name)</procedure>

Return a list of aux-nodes with the given {{aux-name}} in SXML element
{{obj}}, or {{'()}} if such a node is absent.

<procedure>(sxml:attr obj attr-name)</procedure>

Returns the value of the attribute with name {{attr-name}} in the
given SXML element {{obj}}, or {{#f}} if no such attribute exists.

<procedure>(sxml:attr-from-list attr-list name)</procedure>

Returns the value of the attribute with name {{name}} in the
given list of attributes {{attr-list}}, or {{#f}} if no such attribute
exists.  The list of attributes can be obtained from an element using
the {{sxml:attr-list}} procedure.

<procedure>(sxml:num-attr obj attr-name)</procedure>

Returns the value of the numerical attribute with name {{attr-name}}
in the given SXML element {{obj}}, or {{#f}} if no such attribute
exists.  This value is converted from a string to a number.
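
The attribute accessors can be tried on small elements.  This is a minimal sketch, assuming {{(use sxpath-lolevel)}}; the elements and attribute names are made up for illustration.

<examples>
<example>
<expr>
;; Assumes (use sxpath-lolevel).
(sxml:attr '(a (@ (href "/other.xml")) "here") 'href)
</expr>
<result>
"/other.xml"
</result>
</example>
<example>
<expr>
;; There is no 'title' attribute on this element.
(sxml:attr '(a (@ (href "/other.xml")) "here") 'title)
</expr>
<result>
#f
</result>
</example>
<example>
<expr>
;; The attribute value is a string in the tree, but a number is returned.
(sxml:num-attr '(td (@ (colspan "2")) "wide") 'colspan)
</expr>
<result>
2
</result>
</example>
<example>
<expr>
;; Works on a bare attribute list instead of an element.
(sxml:attr-from-list '((href "/other.xml") (class "nav")) 'class)
</expr>
<result>
"nav"
</result>
</example>
</examples>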

<procedure>(sxml:attr-u obj attr-name)</procedure>

Accessor for an attribute {{attr-name}} of given SXML element {{obj}},
which may also be an attributes-list or a nodeset (usually content of
an SXML element).

<procedure>(sxml:ns-list obj)</procedure>

Returns the list of namespaces for given element.  Analog of
{{((sxpath '(@@ *NAMESPACES* *)) obj)}}.  The empty list is returned
if there are no namespaces.

<procedure>(sxml:ns-id->nodes obj namespace-id)</procedure>

Returns a list of namespace information lists that match the given
{{namespace-id}} in SXML element {{obj}}.  Analog of
{{((sxpath '(@@ *NAMESPACES* namespace-id)) obj)}}.
The empty list is returned if there is no namespace with the given
{{namespace-id}}.

<examples>
<example>
<expr>
(sxml:ns-id->nodes
  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")))) 'c)
</expr>
<result>((c "http://www.cars.com/xml"))</result>
</example>
</examples>

<procedure>(sxml:ns-id->uri obj namespace-id)</procedure>

Returns the URI for the (first) namespace matching the given
{{namespace-id}}, or {{#f}} if no namespace matches the given
{{namespace-id}}.

<examples>
<example>
<expr>
(sxml:ns-id->uri
  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")))) 'c)
</expr>
<result>"http://www.cars.com/xml"</result>
</example>
</examples>

<procedure>(sxml:ns-uri->nodes obj uri)</procedure>

Returns a list of namespace information lists that match the given
{{uri}} in SXML element {{obj}}.

<examples>
<example>
<expr>
(sxml:ns-uri->nodes
  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")
                                 (d "http://www.cars.com/xml"))))
  "http://www.cars.com/xml")
</expr>
<result>((c "http://www.cars.com/xml") (d "http://www.cars.com/xml"))</result>
</example>
</examples>

<procedure>(sxml:ns-uri->id obj uri)</procedure>

Returns the namespace id for the (first) namespace matching the given
{{uri}}, or {{#f}} if no namespace matches the given {{uri}}.

<examples>
<example>
<expr>
(sxml:ns-uri->id
  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")
                                 (d "http://www.cars.com/xml"))))
  "http://www.cars.com/xml")
</expr>
<result>c</result>
</example>
</examples>

<procedure>(sxml:ns-id ns-list)</procedure>

Given a namespace information list {{ns-list}}, returns the namespace ID.

<procedure>(sxml:ns-uri ns-list)</procedure>

Given a namespace information list {{ns-list}}, returns the namespace URI.

<procedure>(sxml:ns-prefix ns-list)</procedure>

Given a namespace information list {{ns-list}}, returns the namespace
prefix if it is present in the list.  If it's not present, returns the
namespace ID.
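
These three accessors operate on a single namespace information list, such as the ones returned by {{sxml:ns-id->nodes}}.  A short sketch, assuming {{(use sxpath-lolevel)}} and reusing the namespace list from the examples above, which carries no separate prefix:

<examples>
<example>
<expr>
;; Assumes (use sxpath-lolevel).
(sxml:ns-id '(c "http://www.cars.com/xml"))
</expr>
<result>c</result>
</example>
<example>
<expr>
(sxml:ns-uri '(c "http://www.cars.com/xml"))
</expr>
<result>"http://www.cars.com/xml"</result>
</example>
<example>
<expr>
;; No prefix is present in the list, so the namespace ID is returned.
(sxml:ns-prefix '(c "http://www.cars.com/xml"))
</expr>
<result>c</result>
</example>
</examples>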

==== Data modification procedures

Constructors and mutators for normalized SXML data.

'''Important:''' These functions are optimized for normalized SXML
data.  They are ''not'' applicable to arbitrary non-normalized SXML
data.

Most of the functions are provided in two variants:

# Side-effecting functions intended for linear update of the given elements.  Their names end with an exclamation mark.
# Pure functions without side-effects which return modified elements.


<procedure>(sxml:change-content! obj new-content)</procedure>
<procedure>(sxml:change-content obj new-content)</procedure>

Change the content of the given SXML element {{obj}} to {{new-content}}.
If {{new-content}} is an empty list then {{obj}} is transformed into
an empty element.  The resulting SXML element is normalized.

<procedure>(sxml:change-attrlist obj new-attrlist)</procedure>
<procedure>(sxml:change-attrlist! obj new-attrlist)</procedure>

Change the attribute list of the given SXML element {{obj}} to
{{new-attrlist}}.

<procedure>(sxml:change-name obj new-name)</procedure>
<procedure>(sxml:change-name! obj new-name)</procedure>

Change the name of the given SXML element {{obj}} to {{new-name}}.

<procedure>(sxml:add-attr obj attr)</procedure>
<procedure>(sxml:add-attr! obj attr)</procedure>

Returns the given SXML element {{obj}} with the attribute {{attr}}
added to the attribute list, or {{#f}} if the attribute already exists.

<procedure>(sxml:change-attr obj attr)</procedure>
<procedure>(sxml:change-attr! obj attr)</procedure>

Returns SXML element {{obj}} with the value of attribute {{attr}} changed,
or {{#f}} if there is no attribute with the given name.

{{attr}} is a list like it would occur as a member of an attribute
list: {{(attr-name attr-value)}}.

<procedure>(sxml:set-attr obj attr)</procedure>
<procedure>(sxml:set-attr! obj attr)</procedure>

Returns SXML element {{obj}} with the value of attribute {{attr}} changed.
If there is no such attribute, a new one is added.

{{attr}} is a list like it would occur as a member of an attribute
list: {{(attr-name attr-value)}}.
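
The three attribute procedures differ in how they treat an attribute that is already present or still missing.  The following sketch uses the pure variants and assumes {{(use sxpath-lolevel)}}; the element is made up.  The last example reads the attribute back with {{sxml:attr}} rather than showing the whole returned element.

<examples>
<example>
<expr>
;; Assumes (use sxpath-lolevel).  'href' already exists, so nothing is added.
(sxml:add-attr '(a (@ (href "/x")) "here") '(href "/y"))
</expr>
<result>
#f
</result>
</example>
<example>
<expr>
;; 'title' does not exist yet, so there is nothing to change.
(sxml:change-attr '(a (@ (href "/x")) "here") '(title "t"))
</expr>
<result>
#f
</result>
</example>
<example>
<expr>
;; sxml:set-attr replaces the existing value.
(sxml:attr
  (sxml:set-attr '(a (@ (href "/x")) "here") '(href "/y"))
  'href)
</expr>
<result>
"/y"
</result>
</example>
</examples>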

<procedure>(sxml:add-aux obj aux-node)</procedure>
<procedure>(sxml:add-aux! obj aux-node)</procedure>

Returns SXML element {{obj}} with an auxiliary node {{aux-node}} added.

<procedure>(sxml:squeeze obj)</procedure>
<procedure>(sxml:squeeze! obj)</procedure>

Returns a minimized and normalized SXML element {{obj}} with empty
lists of attributes and aux-lists eliminated, in {{obj}} and all its
descendants.

<procedure>(sxml:clean obj)</procedure>

Returns a minimized and normalized SXML element {{obj}} with empty
lists of attributes and '''all''' aux-lists eliminated, in {{obj}} and
all its descendants.


==== Sxpath-related procedures

<procedure>(select-first-kid test-pred?)</procedure>

Given a node, return the first child that satisfies the
{{test-pred?}}.  Given a nodeset, traverse the set until a node is
found whose first child matches the predicate.  Returns {{#f}} if
no such child can be found.
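
For a single node, {{select-first-kid}} can be combined with a node test such as {{ntype??}} (documented further down).  A minimal sketch, assuming {{(use sxpath-lolevel)}}; the {{div}} element is made up.

<examples>
<example>
<expr>
;; Assumes (use sxpath-lolevel).  The first 'span' child is returned.
((select-first-kid (ntype?? 'span))
 '(div (p "a") (span "b") (span "c")))
</expr>
<result>
(span "b")
</result>
</example>
<example>
<expr>
;; There is no 'em' child.
((select-first-kid (ntype?? 'em))
 '(div (p "a") (span "b")))
</expr>
<result>
#f
</result>
</example>
</examples>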

<procedure>(sxml:node-parent rootnode)</procedure>

Returns a function of one argument (an SXML element) which returns
its parent node using the {{*PARENT*}} pointer in the aux-list.
{{'*TOP-PTR*}} may be used as a pointer to the root node.  It returns an
empty list when applied to the root node.

<procedure>(sxml:add-parents obj [top-ptr])</procedure>

Returns the SXML element {{obj}} annotated with {{*PARENT*}} pointers
for {{obj}} and all its descendants.  If {{obj}} is not the root node
(a node with a name of {{*TOP*}}), you must pass in the parent pointer
for {{obj}} as {{top-ptr}}.

'''Warning:''' This procedure mutates its {{obj}} argument.

<procedure>(sxml:lookup id index)</procedure>

Look up an element using its ID.  {{index}} should be an alist of
{{(id . element)}}.
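
An index of this shape is what {{sxml:id-alist}} from the {{sxpath}} module produces.  The sketch below writes the alist out literally so that it only depends on {{(use sxpath-lolevel)}}; the IDs and elements are taken from the {{sxml:id-alist}} example.

<examples>
<example>
<expr>
;; Assumes (use sxpath-lolevel).  The index is a literal (id . element) alist.
(sxml:lookup "link"
  '(("hi" . (span (@ (id "hi")) "there"))
    ("link" . (a (@ (id "link")) "click here"))))
</expr>
<result>
(a (@ (id "link")) "click here")
</result>
</example>
</examples>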

==== Markup generation

===== XML

<procedure>(sxml:attr->xml attr)</procedure>

Returns a list containing tokens that when joined together form the
attribute's XML output.

'''Warning:''' This procedure assumes that the attribute's values have
already been escaped (i.e., {{sxml:string->xml}} has been called on the
strings inside it).

<examples>
<example>
<expr>(sxml:attr->xml '(href "http://example.com"))</expr>
<result>(" " "href" "='" "http://example.com" "'")</result>
</example>
</examples>

<procedure>(sxml:string->xml string)</procedure>

Escape the {{string}} so it can be used anywhere in XML output.  This
converts the {{<}}, {{>}}, {{'}}, {{"}} and {{&}} characters to their
respective entities.

<procedure>(sxml:sxml->xml tree)</procedure>

Convert the {{tree}} of SXML nodes to a nested list of XML fragments.
These fragments can be output by flattening the list and concatenating
the strings inside it.
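
One way to output the fragments is to walk the nested list and display every string in it.  The following is only a sketch, assuming {{(use sxpath-lolevel)}}: {{write-fragments}} is a hypothetical helper that is not part of the egg, and the {{p}} element is made up.

  ;; Assumes (use sxpath-lolevel).  write-fragments is a hypothetical helper,
  ;; not part of the egg: it displays every string found in the nested list.
  (define (write-fragments x)
    (cond ((string? x) (display x))
          ((pair? x) (write-fragments (car x)) (write-fragments (cdr x)))))
  ;; Writes the serialized element to the current output port.
  (write-fragments (sxml:sxml->xml '(p (@ (class "note")) "hi")))

The same helper should work for the output of {{sxml:sxml->html}}, which is described as having the same nested-list shape.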

===== HTML

<procedure>(sxml:attr->html attr)</procedure>

Returns a list containing tokens that when joined together form the
attribute's HTML output.  The difference with the XML variant is that
this encodes empty attribute values to attributes with no value (think
{{selected}} in option elements, or {{checked}} in checkboxes).

'''Warning:''' This procedure assumes that the attribute's values have
already been escaped (i.e., {{sxml:string->html}} has been called on the
strings inside it).

<procedure>(sxml:string->html string)</procedure>

Escape the {{string}} so it can be used anywhere in HTML output.  This
converts the {{<}}, {{>}}, {{"}} and {{&}} characters to their
respective entities.

<procedure>(sxml:non-terminated-html-tag? tag)</procedure>

Is the named {{tag}} one that is "self-closing" (i.e., does not need to
be terminated) in HTML 4.0?

<procedure>(sxml:sxml->html tree)</procedure>

Convert the {{tree}} of SXML nodes to a nested list of HTML fragments.
These fragments can be output by flattening the list and concatenating
the strings inside it.


=== Procedures from sxpathlib

==== Basic converters and applicators

A converter is a function

  type Converter = Node|Nodelist -> Nodelist

A converter can also play a role of a predicate: in that case, if a
converter, applied to a node or a nodelist, yields a non-empty
nodelist, the converter-predicate is deemed satisfied. Throughout this
file a nil nodelist is equivalent to {{#f}} in denoting a failure.

<procedure>(nodeset? obj)</procedure>

Returns {{#t}} if {{obj}} is a nodelist.

<procedure>(as-nodeset obj)</procedure>

If {{obj}} is a nodelist, returns it as is; otherwise wraps it in a
list.
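
The distinction between a single node and a nodelist matters for most converters.  A short sketch, assuming {{(use sxpath-lolevel)}}; the nodes are made up.

<examples>
<example>
<expr>
;; Assumes (use sxpath-lolevel).  A list of nodes is a nodelist.
(nodeset? '((div "hi") (span "hello")))
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
;; A single element starts with a symbol, so it is not a nodelist.
(nodeset? '(div "hi"))
</expr>
<result>
#f
</result>
</example>
<example>
<expr>
;; A single node is wrapped into a one-element nodelist.
(as-nodeset '(div "hi"))
</expr>
<result>
((div "hi"))
</result>
</example>
</examples>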

==== Node test

The following functions implement 'Node test's as defined in Sec. 2.3
of the XPath document.  A node test is one of the components of a
location step.  It is also a converter-predicate in SXPath.

<procedure>(sxml:element? obj)</procedure>

Predicate which returns {{#t}} if {{obj}} is an SXML element, otherwise {{#f}}.

<procedure>(ntype-names?? crit)</procedure>

Takes a list of acceptable node names as a criterion and returns a
function, which, when applied to a node, will return {{#t}} if the
node name is present in the criterion list and {{#f}} otherwise.

   ntype-names?? :: ListOfNames -> Node -> Boolean

<procedure>(ntype?? crit)</procedure>

Takes a type criterion and returns a function, which, when applied to
a node, will tell if the node satisfies the test.

  ntype?? :: Crit -> Node -> Boolean

The criterion {{crit}} is one of the following symbols:

; {{@}} : tests if the Node is an {{attributes-list}}
; {{*}} : tests if the Node is an {{Element}}
; {{*text*}} : tests if the Node is a text node
; {{*data*}} : tests if the Node is a data node  (text, number, boolean, etc., but not pair)
; {{*PI*}} : tests if the Node is a processing instruction node
; {{*COMMENT*}} : tests if the Node is a comment node
; {{*ENTITY*}} : tests if the Node is an entity node
; {{*any*}} : {{#t}} for any type of Node
; other symbol : tests if the Node has the right name given by the symbol

<examples>
<example>
<expr>
((ntype?? 'div) '(div (@ (class "greeting")) "hi"))
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
((ntype?? 'div) '(span (@ (class "greeting")) "hi"))
</expr>
<result>
#f
</result>
</example>
<example>
<expr>
((ntype?? '*) '(span (@ (class "greeting")) "hi"))
</expr>
<result>
#t
</result>
</example>
</examples>
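
The special criteria from the list above behave as follows on non-element nodes.  A small sketch, assuming {{(use sxpath-lolevel)}}; the nodes are made up.

<examples>
<example>
<expr>
;; Assumes (use sxpath-lolevel).  A string is a text node.
((ntype?? '*text*) "hi")
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
;; An attribute list is matched by the '@' criterion.
((ntype?? '@) '(@ (class "greeting")))
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
;; An element is a pair, so it is not a data node.
((ntype?? '*data*) '(span "hi"))
</expr>
<result>
#f
</result>
</example>
</examples>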
902   
903<procedure>(ntype-namespace-id?? ns-id)</procedure>
904
905This function takes a namespace-id, and returns a predicate
906{{Node -> Boolean}}, which is {{#t}} for nodes with the given
907namespace id. {{ns-id}} is a string.
908{{(ntype-namespace-id?? #f)}} will be {{#t}} for nodes with
909non-qualified names.
910
911<procedure>(sxml:complement pred)</procedure>
912
913This function takes a predicate and returns it complemented, that is
914if the given predicate yields {{#f}} or {{'()}} the complemented one
915yields the given node and vice versa.

<procedure>(node-eq? other)</procedure>

Returns a predicate procedure that, given a node, returns {{#t}} if
the node is the exact same as {{other}}.

<procedure>(node-equal? other)</procedure>

Returns a predicate procedure that, given a node, returns {{#t}} if
the node has the same contents as {{other}}.
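
The difference between identity and structural equality can be sketched as follows (assuming the {{sxpath-lolevel}} module is loaded; the nodes are built with {{list}} so that each one is a fresh object):

<examples>
<example>
<expr>
(let ((node (list 'div "hi")))
  ((node-eq? node) node))
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
((node-eq? (list 'div "hi")) (list 'div "hi"))
</expr>
<result>
#f
</result>
</example>
<example>
<expr>
((node-equal? (list 'div "hi")) (list 'div "hi"))
</expr>
<result>
#t
</result>
</example>
</examples>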

<procedure>(node-pos n)</procedure>

Returns a procedure that, given a nodelist, returns a new nodelist
containing only the {{n}}th element, counting from 1.  If {{n}} is
negative, it returns a nodelist with the {{n}}th element counting from
the right.  If no such node exists, returns the empty list.  {{n}} may
not equal zero.

<examples>
<example>
<expr>
((node-pos 1) '((div "hi") (span "hello") (em "really, hi!")))
</expr>
<result>
((div "hi"))
</result>
</example>
<example>
<expr>
((node-pos 6) '((div "hi") (span "hello") (em "really, hi!")))
</expr>
<result>
()
</result>
</example>
<example>
<expr>
((node-pos -1) '((div "hi") (span "hello") (em "is this thing on?")))
</expr>
<result>
((em "is this thing on?"))
</result>
</example>
</examples>

<procedure>(sxml:filter pred?)</procedure>

Returns a procedure that accepts a nodelist or a node (which will be
converted to a one-element nodelist) and returns only those nodes for
which the predicate {{pred?}} does not return {{#f}} or {{'()}}.

<examples>
<example>
<expr>
((sxml:filter (ntype?? 'div)) '((div "hi") (span "hello") (div "still here?")))
</expr>
<result>
((div "hi") (div "still here?"))
</result>
</example>
</examples>

<procedure>(take-until pred?)</procedure>
<procedure>(take-after pred?)</procedure>

Returns a procedure that accepts a node or a nodelist.

The {{take-until}} variant returns everything ''before'' the first
node for which the predicate {{pred?}} returns anything but {{#f}} or
{{'()}}.  In other words, it returns the longest prefix for which the
predicate returns {{#f}} or {{'()}}.

The {{take-after}} variant returns everything ''after'' the first node
for which the predicate {{pred?}} returns anything besides {{#f}} or
{{'()}}.

<examples>
<example>
<expr>
((take-until (ntype?? 'span)) '((div "hi") (span "hello") (span "there") (div "still here?")))
</expr>
<result>
((div "hi"))
</result>
</example>
<example>
<expr>
((take-after (ntype?? 'span)) '((div "hi") (span "hello") (span "there") (div "still here?")))
</expr>
<result>
((span "there") (div "still here?"))
</result>
</example>
</examples>

<procedure>(map-union proc list)</procedure>

Apply {{proc}} to each element of the nodelist {{list}} and return the
list of results.  If {{proc}} returns a nodelist, splice it into the
result (essentially returning a flattened nodelist).
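
A small sketch of the splicing behaviour, assuming the {{sxpath-lolevel}} module is loaded; {{cdr}} and {{car}} stand in for real converters here only to keep the example short:

<examples>
<example>
<expr>
(map-union cdr '((div "a" "b") (span "c")))
</expr>
<result>
("a" "b" "c")
</result>
</example>
<example>
<expr>
(map-union car '((div "a" "b") (span "c")))
</expr>
<result>
(div span)
</result>
</example>
</examples>

In the first call every result is a list and gets spliced in; in the second every result is a single symbol and is simply collected.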

<procedure>(node-reverse node-or-nodelist)</procedure>

Accepts a nodelist and reverses the nodes inside.  If a single node is
passed to this procedure, it returns a nodelist containing just that
node (it does not change the order of the node's children).
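
Both cases can be sketched like this (assuming the {{sxpath-lolevel}} module is loaded):

<examples>
<example>
<expr>
(node-reverse '((div "hi") (span "hello")))
</expr>
<result>
((span "hello") (div "hi"))
</result>
</example>
<example>
<expr>
(node-reverse '(div "hi" "there"))
</expr>
<result>
((div "hi" "there"))
</result>
</example>
</examples>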

==== Converter combinators

Combinators are higher-order functions that transmogrify a converter
or glue a sequence of converters into a single, non-trivial
converter. The goal is to arrive at converters that correspond to
XPath location paths.

From a different point of view, a combinator is a fixed, named
''pattern'' of applying converters. Given below is a complete set of
such patterns that together implement the XPath location path
specification. As it turns out, all these combinators can be built
from a small number of basic blocks: regular functional composition,
{{map-union}} and filter applicators, and the nodelist union.

<procedure>(select-kids pred?)</procedure>

Returns a procedure that accepts a node and returns a nodelist of the
node's children that satisfy {{pred?}} (i.e., {{pred?}} returns anything
but {{#f}} or {{'()}}).
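
As a brief sketch (assuming the {{sxpath-lolevel}} module is loaded), selecting only the {{li}} children of an element:

<examples>
<example>
<expr>
((select-kids (ntype?? 'li))
 '(ul (@ (class "whiskies"))
      (li "Ardbeg")
      (p "not a list item")
      (li "Springbank")))
</expr>
<result>
((li "Ardbeg") (li "Springbank"))
</result>
</example>
</examples>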

<procedure>(node-self pred?)</procedure>

Similar to {{select-kids}} but applies to the node itself rather than
to its children. The resulting Nodelist will contain either one
component (the node), or will be empty (if the node failed the
predicate).
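
A short sketch of the two possible outcomes, assuming the {{sxpath-lolevel}} module is loaded:

<examples>
<example>
<expr>
((node-self (ntype?? 'div)) '(div "hi"))
</expr>
<result>
((div "hi"))
</result>
</example>
<example>
<expr>
((node-self (ntype?? 'div)) '(span "hello"))
</expr>
<result>
()
</result>
</example>
</examples>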

<procedure>(node-join . selectors)</procedure>

Returns a procedure that accepts a nodelist or a node, and returns a
nodelist with all the selectors applied to every node in sequence.
The selectors must function as converter combinators, ie they must
accept a ''node'' and output a ''nodelist''.

<examples>
<example>
<expr>
((node-join
  (select-kids (ntype?? 'li))
  sxml:content)
 '((ul (@ (class "whiskies"))
       (li "Ardbeg")
       (li "Glenfarclas")
       (li "Springbank"))))
</expr>
<result>
("Ardbeg" "Glenfarclas" "Springbank")
</result>
</example>
</examples>

<procedure>(node-reduce . converters)</procedure>

A regular functional composition of converters.

From a different point of view,
  ((apply node-reduce converters) nodelist)
is equivalent to
  (fold (lambda (converter nodes) (converter nodes)) nodelist converters)
i.e., folding, or reducing, a list of converters with the nodelist
as a seed.
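
A minimal sketch of such a composition, assuming the {{sxpath-lolevel}} module is loaded: the first converter keeps the {{li}} nodes, and the second picks the second node of whatever the first one produced.

<examples>
<example>
<expr>
((node-reduce
  (sxml:filter (ntype?? 'li))
  (node-pos 2))
 '((li "Ardbeg") (p "interlude") (li "Springbank")))
</expr>
<result>
((li "Springbank"))
</result>
</example>
</examples>

Unlike {{node-join}}, each converter here receives the whole nodelist produced by the previous one rather than one node at a time.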

<procedure>(node-or . converters)</procedure>

This combinator applies all converters to a given node and produces
the union of their results.  This combinator corresponds to a union,
"{{|}}" operation for XPath location paths.
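
A sketch of a union of two child selections, assuming the {{sxpath-lolevel}} module is loaded. The input is passed as a one-element nodelist, and the results of the first converter come before those of the second, so the output is not necessarily in document order:

<examples>
<example>
<expr>
((node-or
  (select-kids (ntype?? 'li))
  (select-kids (ntype?? 'p)))
 '((ul (li "Ardbeg") (p "interlude") (li "Springbank"))))
</expr>
<result>
((li "Ardbeg") (li "Springbank") (p "interlude"))
</result>
</example>
</examples>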

<procedure>(node-closure test-pred?)</procedure>

Select all ''descendants'' of a node that satisfy a
converter-predicate.  This combinator is similar to {{select-kids}}
but applies to grandchildren as well.
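
As an illustration (assuming the {{sxpath-lolevel}} module is loaded), the following collects {{b}} elements at any depth. The tree is walked level by level, so the shallower match is listed first:

<examples>
<example>
<expr>
((node-closure (ntype?? 'b))
 '(div (p (b "deep")) (b "shallow")))
</expr>
<result>
((b "shallow") (b "deep"))
</result>
</example>
</examples>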

<procedure>(node-trace title)</procedure>

Returns a procedure that accepts a node or a nodelist, which it
pretty-prints to the current output port, preceded by {{title}}.  It
returns the node or the nodelist unchanged.  This is a useful
debugging aid, since it doesn't really do anything besides print its
argument and pass it on.

<procedure>(sxml:node? obj)</procedure>

Returns {{#t}} if the given {{obj}} is an SXML node, {{#f}} otherwise.
A node is anything except an attribute list or an auxiliary list.
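
A quick sketch of what does and does not count as a node, assuming the {{sxpath-lolevel}} module is loaded:

<examples>
<example>
<expr>
(sxml:node? '(div "hi"))
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
(sxml:node? "just text")
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
(sxml:node? '(@ (class "greeting")))
</expr>
<result>
#f
</result>
</example>
</examples>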

<procedure>(sxml:attr-list node)</procedure>

Returns the list of attributes for a given SXML node.  The empty list
is returned if the given node is not an element, or if it has no list
of attributes.

This differs from {{sxml:attr-list-u}} in that this procedure accepts
any SXML node while {{sxml:attr-list-u}} only accepts nodelists or
elements.  This means that {{sxml:attr-list-u}} will throw an error if
you pass it a text node (a string), while {{sxml:attr-list}} will not.
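
For example, in a sketch that assumes the {{sxpath-lolevel}} module is loaded:

<examples>
<example>
<expr>
(sxml:attr-list '(div (@ (id "nav") (class "menu")) "hi"))
</expr>
<result>
((id "nav") (class "menu"))
</result>
</example>
<example>
<expr>
(sxml:attr-list "just text")
</expr>
<result>
()
</result>
</example>
</examples>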

<procedure>(sxml:attribute test-pred?)</procedure>

Like {{sxml:filter}}, but considers the attributes instead of the
nodes.  Returns a nodelist of attributes that match {{test-pred?}}.

<examples>
<example>
<expr>
((sxml:attribute (ntype?? 'id))
 '((div (@ (id "navigation")) "navigation here")
   (div (@ (class "pullquote")) "random stuff")
   (div (@ (id "main-content")) "lorem ipsum ...")))
</expr>
<result>
((id "navigation") (id "main-content"))
</result>
</example>
</examples>

<procedure>(sxml:child test-pred?)</procedure>

This procedure is similar to {{select-kids}}, but it returns an empty
child-list for PI, Comment and Entity nodes.
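
The difference from {{select-kids}} can be sketched with a comment node (assuming the {{sxpath-lolevel}} module is loaded):

<examples>
<example>
<expr>
((sxml:child sxml:node?) '(*COMMENT* "just a remark"))
</expr>
<result>
()
</result>
</example>
<example>
<expr>
((select-kids sxml:node?) '(*COMMENT* "just a remark"))
</expr>
<result>
("just a remark")
</result>
</example>
</examples>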

<procedure>(sxml:parent test-pred?)</procedure>

Returns a procedure that accepts a root-node, and returns another
procedure.  This second procedure accepts a nodeset (or a node) and
returns the immediate parents of the nodes in the set, but only
those parents that match the predicate.

The root-node does not have to be the root node of the
whole SXML tree -- it may be a root node of a branch of interest.

This procedure can be used with any SXML node.
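
A sketch of the two-stage call, assuming the {{sxpath-lolevel}} module is loaded. The names {{root}} and {{child}} are illustrative; note that {{child}} is taken out of the tree itself, on the assumption that nodes are located by identity rather than by structural equality:

<examples>
<example>
<expr>
(let* ((root '(div (p "first") (span "second")))
       (child (cadr root)))   ; the (p "first") node inside root, not a copy
  (((sxml:parent (ntype?? 'div)) root) child))
</expr>
<result>
((div (p "first") (span "second")))
</result>
</example>
</examples>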

==== Useful shortcuts

<procedure>(node-parent rootnode)</procedure>

{{(node-parent rootnode)}} yields a converter that returns the parent of the
node it is applied to. If applied to a nodelist, it returns the list
of parents of nodes in the nodelist.

This is equivalent to {{((sxml:parent (ntype?? '*any*)) rootnode)}}.

<procedure>(sxml:child-nodes node)</procedure>

Returns all the child nodes of the given {{node}}.

This is equivalent to {{((sxml:child sxml:node?) node)}}.

<procedure>(sxml:child-elements node)</procedure>

Returns all the child ''elements'' of the given {{node}} (i.e.,
excludes any text nodes).

This is equivalent to {{((select-kids sxml:element?) node)}}.
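
The difference between {{sxml:child-nodes}} and {{sxml:child-elements}} shows up on an element with mixed content (a sketch, assuming the {{sxpath-lolevel}} module is loaded):

<examples>
<example>
<expr>
(sxml:child-nodes '(div (@ (id "intro")) "some text" (p "a paragraph")))
</expr>
<result>
("some text" (p "a paragraph"))
</result>
</example>
<example>
<expr>
(sxml:child-elements '(div (@ (id "intro")) "some text" (p "a paragraph")))
</expr>
<result>
((p "a paragraph"))
</result>
</example>
</examples>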

=== Procedures from sxpath-ext

==== SXML counterparts to W3C XPath Core Functions Library

<procedure>(sxml:string object)</procedure>

The counterpart to XPath 'string' function (section 4.2 XPath 1.0 Rec.).
Converts a given object to a string.

Notes:
# When converting a nodeset, document order is not preserved
# {{number->string}} returns the result in a form which is slightly different from XPath Rec. specification
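
Two simple conversions, as a sketch that assumes the {{sxpath-lolevel}} module is loaded:

<examples>
<example>
<expr>
(sxml:string #t)
</expr>
<result>
"true"
</result>
</example>
<example>
<expr>
(sxml:string 42)
</expr>
<result>
"42"
</result>
</example>
</examples>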

<procedure>(sxml:boolean object)</procedure>

The counterpart to XPath 'boolean' function (section 4.3 XPath Rec.).
Converts its argument to a boolean.
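
A sketch following the XPath conversion rules (zero, the empty string and the empty nodeset are false), assuming the {{sxpath-lolevel}} module is loaded. Note that any non-empty string, even {{"false"}}, converts to true:

<examples>
<example>
<expr>
(sxml:boolean 0)
</expr>
<result>
#f
</result>
</example>
<example>
<expr>
(sxml:boolean "false")
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
(sxml:boolean '())
</expr>
<result>
#f
</result>
</example>
</examples>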

<procedure>(sxml:number object)</procedure>

The counterpart to XPath 'number' function (section 4.4 XPath Rec.).
Converts its argument to a number.

Notes:
# The argument is not optional (yet?)
# string->number conversion is not IEEE 754 round-to-nearest
# NaN is represented as 0
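
A sketch of the conversions, including the NaN-as-zero behaviour from the notes above (assuming the {{sxpath-lolevel}} module is loaded):

<examples>
<example>
<expr>
(sxml:number "42")
</expr>
<result>
42
</result>
</example>
<example>
<expr>
(sxml:number #t)
</expr>
<result>
1
</result>
</example>
<example>
<expr>
(sxml:number "abc")
</expr>
<result>
0
</result>
</example>
</examples>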

<procedure>(sxml:string-value node)</procedure>

Returns a string value for a given node in accordance with
XPath Rec. 5.1 - 5.7
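
For an element, this amounts to concatenating the text of all its descendants, as in this sketch (assuming the {{sxpath-lolevel}} module is loaded):

<examples>
<example>
<expr>
(sxml:string-value '(p "Hello, " (b "world") "!"))
</expr>
<result>
"Hello, world!"
</result>
</example>
</examples>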

<procedure>(sxml:id id-index)</procedure>

Returns a procedure that accepts a nodeset and returns a nodeset
containing the elements in the id-index that match the string-values
of each entry of the nodeset.  XPath Rec. 4.1

The {{id-index}} is an alist with unique IDs as key, and elements as
values:

  id-index = ( (id-value . element) (id-value . element) ... )

==== Comparators for XPath objects

<procedure>(sxml:list-head list n)</procedure>

Returns the {{n}} first members of {{list}}.  Mostly equivalent to
SRFI-1's {{take}} procedure, except it returns the {{list}} if {{n}}
is larger than the length of said list, instead of throwing an error.
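
Both the ordinary case and the over-long case, as a sketch assuming the {{sxpath-lolevel}} module is loaded:

<examples>
<example>
<expr>
(sxml:list-head '(a b c d) 2)
</expr>
<result>
(a b)
</result>
</example>
<example>
<expr>
(sxml:list-head '(a b) 5)
</expr>
<result>
(a b)
</result>
</example>
</examples>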

<procedure>(sxml:merge-sort less-than? list)</procedure>

Returns the sorted list, the smallest member first.
  less-than? ::= (lambda (obj1 obj2) ...)
{{less-than?}} returns {{#t}} if {{obj1 < obj2}} with respect to the
given ordering.
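
For instance, with the standard {{<}} and {{string<?}} orderings (a sketch, assuming the {{sxpath-lolevel}} module is loaded):

<examples>
<example>
<expr>
(sxml:merge-sort < '(3 1 2))
</expr>
<result>
(1 2 3)
</result>
</example>
<example>
<expr>
(sxml:merge-sort string<? '("Springbank" "Ardbeg" "Glenfarclas"))
</expr>
<result>
("Ardbeg" "Glenfarclas" "Springbank")
</result>
</example>
</examples>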

<procedure>(sxml:equality-cmp bool=? number=? string=?)</procedure>

A helper for XPath equality operations: {{=}} , {{!=}}.  The
{{bool=?}}, {{number=?}} and {{string=?}} arguments are comparison
operations for booleans, numbers and strings respectively.

Returns a procedure that accepts two objects, looks at the first
object's type and applies the correct comparison predicate to it.
Type coercion takes place depending on the rules described in the
XPath 1.0 spec, section 3.4 ("Booleans").

<procedure>(sxml:equal? obj1 obj2)</procedure>
<procedure>(sxml:not-equal? obj1 obj2)</procedure>

Equality procedures with the default comparison operators {{eq?}},
{{=}} and {{string=?}}, or their inverse, respectively.
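
A sketch of the coercion at work, assuming the {{sxpath-lolevel}} module is loaded. In the first call the string is converted to a number before comparing; in the second, two strings are compared directly:

<examples>
<example>
<expr>
(sxml:equal? 1 "1")
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
(sxml:not-equal? "a" "b")
</expr>
<result>
#t
</result>
</example>
</examples>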

<procedure>(sxml:relational-cmp op)</procedure>

A helper for XPath relational operations: {{<}}, {{>}}, {{<=}}, {{>=}}
for two XPath objects.  {{op}} is one of these operators.

Returns a procedure that accepts two objects and returns the value of
the procedure applied to these objects, converted according to the
coercion rules described in the XPath 1.0 spec, section 3.4
("Booleans").
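
As a sketch (assuming the {{sxpath-lolevel}} module is loaded), comparing a number with a string that is coerced to a number first:

<examples>
<example>
<expr>
((sxml:relational-cmp <) 1 "2")
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
((sxml:relational-cmp >=) "10" 9)
</expr>
<result>
#t
</result>
</example>
</examples>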

==== XPath axes

<procedure>(sxml:ancestor test-pred?)</procedure>

Like {{sxml:parent}}, except it returns all the ancestors that match
{{test-pred?}}, not just the immediate parent.

<procedure>(sxml:ancestor-or-self test-pred?)</procedure>

Like {{sxml:ancestor}}, except also allows the node itself to match
the predicate.

<procedure>(sxml:descendant test-pred?)</procedure>

Like {{node-closure}}, except the resulting nodeset is in depth-first
order instead of breadth-first.
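
The ordering can be seen in a small sketch (assuming the {{sxpath-lolevel}} module is loaded), where the more deeply nested {{b}} element comes first because it appears first in the document:

<examples>
<example>
<expr>
((sxml:descendant (ntype?? 'b))
 '(div (p (b "deep")) (b "shallow")))
</expr>
<result>
((b "deep") (b "shallow"))
</result>
</example>
</examples>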

<procedure>(sxml:descendant-or-self test-pred?)</procedure>

Like {{sxml:descendant}}, except also allows the node itself to match
the predicate.

<procedure>(sxml:following test-pred?)</procedure>

Returns a procedure that accepts a root node and returns a new
procedure that accepts a node and returns all nodes following this
node in the document source matching the predicate.

<procedure>(sxml:following-sibling test-pred?)</procedure>

Like {{sxml:following}}, except only siblings (nodes at the same level
under the same parent) are returned.

<procedure>(sxml:preceding test-pred?)</procedure>

Returns a procedure that accepts a root node and returns a new
procedure that accepts a node and returns all nodes preceding this
node in the document source matching the predicate.

<procedure>(sxml:preceding-sibling test-pred?)</procedure>

Like {{sxml:preceding}}, except only siblings (nodes at the same level
under the same parent) are returned.

<procedure>(sxml:namespace test-pred?)</procedure>

Returns a procedure that accepts a nodeset and returns the namespace
lists of the nodes matching {{test-pred?}}.

== About this egg

=== Author

[[http://okmij.org/ftp/|Oleg Kiselyov]], [[http://www196.pair.com/lisovsky/|Kirill Lisovsky]], [[http://modis.ispras.ru/Lizorkin/index.html|Dmitry Lizorkin]].

=== Version history

; 0.1 : Split up the old sxml-tools egg into sxpath

=== License

The sxml-tools are in the public domain.