source: project/wiki/eggref/4/sxpath @ 13373

Last change on this file since 13373 was 13373, checked in by sjamaan, 12 years ago

Add docs for the txpath module

[[tags:eggs]]

This is version 0.1 of the '''sxpath''' extension library for Chicken Scheme.

[[toc:]]

== Description

The sxpath parts of the [[http://cvs.sourceforge.net/viewcvs.py/ssax/sxml-tools/|sxml-tools]] from the [[http://ssax.sf.net|SSAX project]] at Sourceforge.
Because txpath and sxpath are interwoven, this egg also includes txpath parts.

== Documentation

This egg provides the sxpath-related tools from the sxml-tools available
in the SSAX/SXML Sourceforge project.

It is split into three modules: [[#sxpath|sxpath]], [[#txpath|txpath]]
and [[#sxpath-lolevel]]. {{sxpath}} depends on {{txpath}} and both
modules depend on {{sxpath-lolevel}}.

Much documentation is available at
[[http://www196.pair.com/lisovsky/xml/index.html|Lisovsky's XML page]]
and the [[http://ssax.sf.net|SSAX homepage]].

The initial documentation on this wiki page came straight from the
comments in the extremely well-documented source code. It's
recommended you read the code if you want to learn more.

== sxpath

This is the preferred interface to use.  It allows you to query the
SXML document tree using an s-expression based language, in which you
can also use arbitrary procedures and even "classic" textual XPath
(see [[#txpath|below]] for docs on that).

A complete description of how to use this is outside the scope of this
egg documentation. See
[[http://www196.pair.com/lisovsky/query/sxpath/|the introduction to SXPath]]
for that.

<procedure>(sxpath path [ns-binding])</procedure>

Returns a procedure that accepts an SXML document tree and returns a
nodeset (list of nodes) that match the {{path}} expression.

The optional {{ns-binding}} argument is an alist of namespace
bindings.  It is used to map abbreviated namespace prefixes to full
URI strings.

It can be useful to compare the following examples to those for
[[#txpath|txpath]].

<examples>
<example>
<expr>
;; selects all the 'item' elements that have an 'olist' parent
;; (which is not root) and that are in the same document as the context node
((sxpath `(// olist item))
 '(doc (olist (item "1")) (item "2") (nested (olist (item "3")))))
</expr>
<result>
((item "1") (item "3"))
</result>
</example>
<example>
<expr>
;; selects the 'chapter' children of the context node that have one or
;; more 'title' children with string-value equal to 'Introduction'
((sxpath '((chapter ((equal? (title "Introduction"))))))
 '(text  (chapter (title "Introduction"))  (chapter "No title for this chapter")  (chapter (title "Conclusion"))))
</expr>
<result>
((chapter (title "Introduction")))
</result>
</example>
<example>
<expr>
;; (sxpath string-expr) is equivalent to (txpath string-expr)
((sxpath "chapter[title='Introduction']")
 '(text  (chapter (title "Introduction"))  (chapter "No title for this chapter")  (chapter (title "Conclusion"))))
</expr>
<result>
((chapter (title "Introduction")))
</result>
</example>
</examples>

TODO: find out how ns-binding works and give an example that uses this.

<procedure>(if-sxpath path)</procedure>

Like {{sxpath}}, only returns {{#f}} instead of the empty list if
nothing matches (so it does ''not'' always return a nodeset).

<procedure>(car-sxpath path)</procedure>

Like {{sxpath}}, only instead of a nodeset it returns the first node
found.  If no node was found, it returns '''an empty list'''.

<procedure>(if-car-sxpath path)</procedure>

Like {{car-sxpath}}, only returns {{#f}} instead of the empty list if
nothing matches.
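
The four variants differ only in what they return when nothing matches.  The following is a minimal sketch, assuming the egg is loaded with {{(use sxpath)}} as is usual on Chicken 4; the {{doc}} tree is made up for illustration.

```scheme
(use sxpath)

;; Hypothetical document, used only for illustration.
(define doc '(doc (olist (item "1") (item "2"))))

((sxpath '(olist item)) doc)        ; => ((item "1") (item "2"))
((car-sxpath '(olist item)) doc)    ; => (item "1")

;; There are no 'nope' children, so each variant reports "nothing found"
;; in its own way:
((sxpath '(olist nope)) doc)        ; => ()
((if-sxpath '(olist nope)) doc)     ; => #f
((car-sxpath '(olist nope)) doc)    ; => ()
((if-car-sxpath '(olist nope)) doc) ; => #f
```

The {{if-}} variants are convenient in {{cond}} or {{and}} forms, since the empty list counts as true in Scheme.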

<procedure>(sxml:id-alist node . lpaths)</procedure>

Builds an index as a list of {{(ID_value . element)}} pairs for given
{{node}}. {{lpaths}} are location paths for attributes of type ID (ie,
sxpath expressions that tell it how to find the ID attribute).

Note: location paths ''must'' be of the form {{(expr '@ attrib-name)}}.

See also {{sxml:lookup}} below, in {{sxpath-lolevel}}, which can use
this index.

<examples>
<example>
<expr>
;; TODO: find out why location paths must be of the form (expr '@ symbol)
;;       or if this description is incorrect
(sxml:id-alist
 '(div (span (@ (id "hi")) "there")
       (div (@ (id "hello")) "dude")
       (a (@ (id "link")) "click here"))
 '(span @ id) '(a @ id))
</expr>
<result>
(("hi" . (span (@ (id "hi")) "there"))
 ("link" . (a (@ (id "link")) "click here")))
</result>
</example>
</examples>

== txpath

This section documents the txpath interface. This interface is mostly
useful for programs that deal exclusively with "legacy" textual XPath
queries.

<procedure>(txpath string . ns-binding)</procedure>

Returns a procedure that accepts an SXML document tree and returns a
nodeset (list of nodes) that match the XPath expression in {{string}}.

The optional {{ns-binding}} argument is an alist of namespace
bindings.  It is used to map abbreviated namespace prefixes to full
URI strings.

{{(txpath x)}} is equivalent to {{(sxpath x)}} whenever {{x}} is a
string.

It's useful to compare the following examples to the above examples
for [[#sxpath|sxpath]].

<examples>
<example>
<expr>
;; selects all the 'item' elements that have an 'olist' parent
;; (which is not root) and that are in the same document as the context node
((txpath "//olist/item")
 '(doc (olist (item "1")) (item "2") (nested (olist (item "3")))))
</expr>
<result>
((item "1") (item "3"))
</result>
</example>
<example>
<expr>
;; selects the 'chapter' children of the context node that have one or
;; more 'title' children with string-value equal to 'Introduction'
((txpath "chapter[title='Introduction']")
 '(text  (chapter (title "Introduction"))  (chapter "No title for this chapter")  (chapter (title "Conclusion"))))
</expr>
<result>
((chapter (title "Introduction")))
</result>
</example>
</examples>

== sxpath-lolevel

This section documents the low-level sxpath interface. It includes
mostly-generic list and SXML operators.

It consists of the extensions defined in {{sxml-tools.scm}} plus
{{sxpathlib}} and {{sxpath-ext}}.  This is equivalent to the
"low-level sxpath interface" described at
[[http://www196.pair.com/lisovsky/query/sxpath/|the introduction to SXPath]].

These utilities are useful when you want to query SXML document trees,
but full sxpath would be overkill.  Most of these procedures are
faster than their sxpath equivalent, because they are very specific.
But this also means they are very low-level, so you should use them
only if you know what you're doing.


==== Predicates

<procedure>(sxml:empty-element? obj)</procedure>

Predicate which returns {{#t}} if given element {{obj}} is empty.
Empty elements have no nested elements, text nodes, PIs, Comments or
entities but may contain attributes or namespace-id.  It is the SXML
counterpart of the XML {{empty-element}}.

<procedure>(sxml:shallow-normalized? obj)</procedure>

Returns {{#t}} if the given {{obj}} is a shallow-normalized SXML
element.  The element itself has to be normalized but its nested
elements are not tested.

<procedure>(sxml:normalized? obj)</procedure>

Returns {{#t}} if the given {{obj}} is a normalized SXML element.  The element
itself and all its nested elements have to be normalized.

<procedure>(sxml:shallow-minimized? obj)</procedure>

Returns {{#t}} if the given {{obj}} is a shallow-minimized SXML
element.  The element itself has to be minimized but its nested
elements are not tested.

<procedure>(sxml:minimized? obj)</procedure>

Returns {{#t}} if the given {{obj}} is a minimized SXML element.  The
element itself and all its nested elements have to be minimized.

==== Accessors

These procedures obtain information about nodes, or their direct
children.  They don't traverse subtrees.

===== Normalization-independent accessors

These accessors can be used on arbitrary, non-normalized SXML trees.
Because of this, they are generally slower than the
normalization-dependent variants listed in the next section.

<procedure>(sxml:name node)</procedure>

Returns the name of a given SXML node. It is introduced for the sake of
encapsulation.

<procedure>(sxml:element-name obj)</procedure>

A checked version of {{sxml:name}}, which returns {{#f}} if the given
{{obj}} is not an SXML element. Otherwise returns its name.

<procedure>(sxml:node-name obj)</procedure>

Safe version of {{sxml:name}}, which returns {{#f}} if the given {{obj}}
is not an SXML node.  Otherwise returns its name.

The difference between this and {{sxml:element-name}} is that a node
can be one of {{@}}, {{@@}}, {{*PI*}}, {{*COMMENT*}} or {{*ENTITY*}}
while an element must be a real element (any symbol not in that set is
considered to be an element).

<procedure>(sxml:ncname node)</procedure>

Like {{sxml:name}}, except returns only the local part of the name
(called an "NCName" in the
[[http://www.w3.org/TR/xml-names/|XML namespaces spec]]).

The node's name is interpreted as a "Qualified Name", a
colon-separated name of which the last part is considered to be the
local part.  If the name contains no colons, the name itself is
returned.

'''Important:''' Please note that while an SXML name is a symbol, this
function returns a string.

<procedure>(sxml:name->ns-id sxml-name)</procedure>

Given a node name, return the namespace part of the name (called a
{{namespace-id}}).  If the name contains no colons, returns {{#f}}.  See
{{sxml:ncname}} for more info.

'''Important:''' Please note that while an SXML name is a symbol, this
function returns a string.
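
The symbol-versus-string distinction is easy to trip over, so here is a small sketch of the name accessors side by side.  It assumes the module is loaded with {{(use sxpath-lolevel)}}; the {{node}} element is invented for the example.

```scheme
(use sxpath-lolevel)

;; Hypothetical element with a qualified name.
(define node '(c:part (@ (id "p1")) "wheel"))

(sxml:name node)                     ; => c:part  (a symbol)
(sxml:ncname node)                   ; => "part"  (a string)
(sxml:name->ns-id (sxml:name node))  ; => "c"     (a string)
(sxml:name->ns-id 'part)             ; => #f, there is no namespace part
(sxml:element-name "just text")      ; => #f, a string is not an element
```

Note that {{sxml:ncname}} takes the node, while {{sxml:name->ns-id}} takes the name symbol.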

<procedure>(sxml:content obj)</procedure>

Retrieve the contents of an SXML element or nodeset.  Any non-element
nodes (attributes, processing instructions, etc) are discarded,
while the elements and text nodes are returned as a list of strings
and nested elements in document order.  This list is empty if {{obj}}
is an empty element or empty list.

The inner elements are unmodified so they still contain attributes,
but also comments or other non-element nodes.

<examples>
<example>
<expr>
(sxml:content
  '(div (@ (class "content"))
        (*COMMENT* "main contents start here")
         "The document moved "
         (a (@ (href "/other.xml")) "here")))
</expr>
<result>("The document moved " (a (@ (href "/other.xml")) "here"))</result>
</example>
</examples>

<procedure>(sxml:text node)</procedure>

Returns a string which combines all the character data from text node
children of the given SXML element or {{""}} if there are no text node
children.  Note that it does not include text from descendant nodes,
only direct children.

<examples>
<example>
<expr>
(sxml:text
  '(div (@ (class "content"))
        (*COMMENT* "main contents start here")
         "The document moved "
         (a (@ (href "/other.xml")) "here")))
</expr>
<result>"The document moved "</result>
</example>
</examples>

===== Normalization-dependent accessors

"Universal" accessors are less efficient but may be used for
non-normalized SXML.  These safe accessors are named with suffix '-u'
for "universal".

"Fast" accessors are optimized for normalized SXML data.  They are not
applicable to arbitrary non-normalized SXML data.  Their names have no
specific suffixes.

<procedure>(sxml:content-raw obj)</procedure>

Returns all the content of a normalized SXML element except attr-list
and aux-list.  Thus it includes {{PI}}, {{COMMENT}} and {{ENTITY}}
nodes as well as {{TEXT}} and {{ELEMENT}} nodes returned by
{{sxml:content}}.  Returns a list of nodes in document order or the empty
list if {{obj}} is an empty element or an empty list.

This function is faster than {{sxml:content}}.
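
To make the difference with {{sxml:content}} concrete, the sketch below applies both accessors to the element used in the examples above.  It assumes {{(use sxpath-lolevel)}} loads the module.

```scheme
(use sxpath-lolevel)

(define elem
  '(div (@ (class "content"))
        (*COMMENT* "main contents start here")
        "The document moved "
        (a (@ (href "/other.xml")) "here")))

;; content-raw only skips the attribute list, so the comment node survives:
(sxml:content-raw elem)
;; => ((*COMMENT* "main contents start here")
;;     "The document moved "
;;     (a (@ (href "/other.xml")) "here"))

;; content additionally drops everything that is not text or an element:
(sxml:content elem)
;; => ("The document moved " (a (@ (href "/other.xml")) "here"))
```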

<procedure>(sxml:attr-list-u obj)</procedure>

Returns the list of attributes for given element or nodeset.  Analog
of {{((sxpath '(@ *)) obj)}}.  Empty list is returned if there is no
list of attributes.

<procedure>(sxml:aux-list obj)</procedure>
<procedure>(sxml:aux-list-u obj)</procedure>

Returns the list of auxiliary nodes for given element or nodeset.
Analog of {{((sxpath '(@@ *)) obj)}}.  Empty list is returned if a
list of auxiliary nodes is absent.

<procedure>(sxml:aux-node obj aux-name)</procedure>

Return the first aux-node with the name {{aux-name}} in SXML element
{{obj}} or {{#f}} if such a node is absent.

'''NOTE:''' it returns just the ''first'' node found even if multiple
nodes are present, so it's mostly intended for nodes with unique names.
Use {{sxml:aux-nodes}} if you want all of them.

<procedure>(sxml:aux-nodes obj aux-name)</procedure>

Return a list of aux-nodes with the name {{aux-name}} in SXML element
{{obj}} or {{'()}} if such a node is absent.

<procedure>(sxml:attr obj attr-name)</procedure>

Returns the value of the attribute with name {{attr-name}} in the
given SXML element {{obj}}, or {{#f}} if no such attribute exists.

<procedure>(sxml:attr-from-list attr-list name)</procedure>

Returns the value of the attribute with name {{name}} in the
given list of attributes {{attr-list}}, or {{#f}} if no such attribute
exists.  The list of attributes can be obtained from an element using
the {{sxml:attr-list}} procedure.

<procedure>(sxml:num-attr obj attr-name)</procedure>

Returns the value of the numerical attribute with name {{attr-name}}
in the given SXML element {{obj}}, or {{#f}} if no such attribute
exists.  This value is converted from a string to a number.
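
A short sketch of the attribute accessors on a normalized element, assuming {{(use sxpath-lolevel)}}; the {{link}} element and its attribute values are invented for the example.

```scheme
(use sxpath-lolevel)

;; Hypothetical element with two attributes.
(define link '(a (@ (href "/other.xml") (tabindex "3")) "here"))

(sxml:attr link 'href)          ; => "/other.xml"
(sxml:attr link 'title)         ; => #f, no such attribute
(sxml:num-attr link 'tabindex)  ; => 3, a number rather than "3"

;; The same lookup against an attribute list obtained beforehand:
(sxml:attr-from-list (sxml:attr-list link) 'href)  ; => "/other.xml"
```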

<procedure>(sxml:attr-u obj attr-name)</procedure>

Accessor for an attribute {{attr-name}} of given SXML element {{obj}},
which may also be an attributes-list or a nodeset (usually content of
an SXML element).

<procedure>(sxml:ns-list obj)</procedure>

Returns the list of namespaces for given element.  Analog of
{{((sxpath '(@@ *NAMESPACES* *)) obj)}}.  The empty list is returned
if there are no namespaces.

<procedure>(sxml:ns-id->nodes obj namespace-id)</procedure>

Returns a list of namespace information lists that match the given
{{namespace-id}} in SXML element {{obj}}.  Analog of
{{((sxpath '(@@ *NAMESPACES* namespace-id)) obj)}}.
The empty list is returned if there is no namespace with the given
{{namespace-id}}.

<examples>
<example>
<expr>
(sxml:ns-id->nodes
  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")))) 'c)
</expr>
<result>((c "http://www.cars.com/xml"))</result>
</example>
</examples>

<procedure>(sxml:ns-id->uri obj namespace-id)</procedure>

Returns the URI for the (first) namespace matching the given
{{namespace-id}}, or {{#f}} if no namespace matches the given
{{namespace-id}}.

<examples>
<example>
<expr>
(sxml:ns-id->uri
  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")))) 'c)
</expr>
<result>"http://www.cars.com/xml"</result>
</example>
</examples>

<procedure>(sxml:ns-uri->nodes obj uri)</procedure>

Returns a list of namespace information lists that match the given
{{uri}} in SXML element {{obj}}.

<examples>
<example>
<expr>
(sxml:ns-uri->nodes
  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")
                                 (d "http://www.cars.com/xml"))))
  "http://www.cars.com/xml")
</expr>
<result>((c "http://www.cars.com/xml") (d "http://www.cars.com/xml"))</result>
</example>
</examples>

<procedure>(sxml:ns-uri->id obj uri)</procedure>

Returns the namespace id for the (first) namespace matching the given
{{uri}}, or {{#f}} if no namespace matches the given {{uri}}.

<examples>
<example>
<expr>
(sxml:ns-uri->id
  '(c:part (@) (@@ (*NAMESPACES* (c "http://www.cars.com/xml")
                                 (d "http://www.cars.com/xml"))))
  "http://www.cars.com/xml")
</expr>
<result>c</result>
</example>
</examples>

<procedure>(sxml:ns-id ns-list)</procedure>

Given a namespace information list {{ns-list}}, returns the namespace ID.

<procedure>(sxml:ns-uri ns-list)</procedure>

Given a namespace information list {{ns-list}}, returns the namespace URI.

<procedure>(sxml:ns-prefix ns-list)</procedure>

Given a namespace information list {{ns-list}}, returns the namespace
prefix if it is present in the list.  If it's not present, returns the
namespace ID.

==== Data modification procedures

Constructors and mutators for normalized SXML data.

'''Important:''' These functions are optimized for normalized SXML
data.  They are ''not'' applicable to arbitrary non-normalized SXML
data.

Most of the functions are provided in two variants:

# Side-effect intended functions for linear update of given elements.  Their names end with an exclamation mark.
# Pure functions without side-effects which return modified elements.


<procedure>(sxml:change-content! obj new-content)</procedure>
<procedure>(sxml:change-content obj new-content)</procedure>

Change the content of given SXML element {{obj}} to {{new-content}}.
If {{new-content}} is an empty list then the {{obj}} is transformed to
an empty element.  The resulting SXML element is normalized.

<procedure>(sxml:change-attrlist obj new-attrlist)</procedure>
<procedure>(sxml:change-attrlist! obj new-attrlist)</procedure>

Change the attribute list of the given SXML element {{obj}} to
{{new-attrlist}}.

<procedure>(sxml:change-name obj new-name)</procedure>
<procedure>(sxml:change-name! obj new-name)</procedure>

Change the name of the given SXML element {{obj}} to {{new-name}}.

<procedure>(sxml:add-attr obj attr)</procedure>
<procedure>(sxml:add-attr! obj attr)</procedure>

Returns the given SXML element {{obj}} with the attribute {{attr}}
added to the attribute list, or {{#f}} if the attribute already exists.

<procedure>(sxml:change-attr obj attr)</procedure>
<procedure>(sxml:change-attr! obj attr)</procedure>

Returns SXML element {{obj}} with changed value of attribute {{attr}}
or {{#f}} if there is no attribute with the given name.

{{attr}} is a list like it would occur as a member of an attribute
list: {{(attr-name attr-value)}}.

<procedure>(sxml:set-attr obj attr)</procedure>
<procedure>(sxml:set-attr! obj attr)</procedure>

Returns SXML element {{obj}} with changed value of attribute {{attr}}.
If there is no such attribute the new one is added.

{{attr}} is a list like it would occur as a member of an attribute
list: {{(attr-name attr-value)}}.
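
The three attribute modifiers differ in how they treat an attribute that is missing or already present.  The sketch below uses only the pure variants and reads results back with {{sxml:attr}}, so it makes no claim about attribute order in the returned element.  It assumes {{(use sxpath-lolevel)}}; the {{link}} element is invented.

```scheme
(use sxpath-lolevel)

;; Hypothetical normalized element with a single attribute.
(define link '(a (@ (href "/old.xml")) "here"))

;; add-attr refuses to overwrite; change-attr refuses to create:
(sxml:add-attr link '(href "/new.xml"))    ; => #f
(sxml:change-attr link '(title "Moved"))   ; => #f

;; set-attr does whichever of the two applies:
(sxml:attr (sxml:set-attr link '(href "/new.xml")) 'href)  ; => "/new.xml"
(sxml:attr (sxml:set-attr link '(title "Moved")) 'title)   ; => "Moved"

;; The pure variants leave the original element untouched:
(sxml:attr link 'href)                     ; => "/old.xml"
```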

<procedure>(sxml:add-aux obj aux-node)</procedure>
<procedure>(sxml:add-aux! obj aux-node)</procedure>

Returns SXML element {{obj}} with an auxiliary node {{aux-node}} added.

<procedure>(sxml:squeeze obj)</procedure>
<procedure>(sxml:squeeze! obj)</procedure>

Returns a minimized and normalized SXML element {{obj}} with empty
lists of attributes and aux-lists eliminated, in {{obj}} and all its
descendants.

<procedure>(sxml:clean obj)</procedure>

Returns a minimized and normalized SXML element {{obj}} with empty
lists of attributes and '''all''' aux-lists eliminated, in {{obj}} and
all its descendants.


==== Sxpath-related procedures

<procedure>(select-first-kid test-pred?)</procedure>

Returns a procedure that, given a node, returns the first child that
satisfies {{test-pred?}}.  Given a nodeset, it traverses the set until a
node is found whose first child matches the predicate.  It returns
{{#f}} if no such child can be found.

<procedure>(sxml:node-parent rootnode)</procedure>

Returns a function of one argument - an SXML element - which returns
its parent node using {{*PARENT*}} pointer in the aux-list.
{{'*TOP-PTR*}} may be used as a pointer to root node.  It returns an
empty list when applied to the root node.

<procedure>(sxml:add-parents obj [top-ptr])</procedure>

Returns the SXML element {{obj}} annotated with {{*PARENT*}} pointers
for {{obj}} and all its descendants.  If {{obj}} is not the root node
(a node with a name of {{*TOP*}}), you must pass in the parent pointer
for {{obj}} as {{top-ptr}}.

'''Warning:''' This procedure mutates its {{obj}} argument.

<procedure>(sxml:lookup id index)</procedure>

Lookup an element using its ID.  {{index}} should be an alist of
{{(id . element)}}.
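
A minimal sketch of a lookup, assuming {{(use sxpath-lolevel)}}.  The index is written out by hand here in the {{(id . element)}} shape described above; in practice it would typically come from {{sxml:id-alist}}.

```scheme
(use sxpath-lolevel)

;; Hand-written index, for illustration only.
(define index
  '(("hi"   . (span (@ (id "hi")) "there"))
    ("link" . (a (@ (id "link")) "click here"))))

(sxml:lookup "link" index)  ; => (a (@ (id "link")) "click here")
(sxml:lookup "nope" index)  ; => #f
```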

==== Markup generation

===== XML

<procedure>(sxml:attr->xml attr)</procedure>

Returns a list containing tokens that when joined together form the
attribute's XML output.

'''Warning:''' This procedure assumes that the attribute's values have
already been escaped (ie, {{sxml:string->xml}} has been called on the
strings inside it).

<examples>
<example>
<expr>(sxml:attr->xml '(href "http://example.com"))</expr>
<result>(" " "href" "='" "http://example.com" "'")</result>
</example>
</examples>

<procedure>(sxml:string->xml string)</procedure>

Escape the {{string}} so it can be used anywhere in XML output.  This
converts the {{<}}, {{>}}, {{'}}, {{"}} and {{&}} characters to their
respective entities.

<procedure>(sxml:sxml->xml tree)</procedure>

Convert the {{tree}} of SXML nodes to a nested list of XML fragments.
These fragments can be output by flattening the list and concatenating
the strings inside it.

===== HTML

<procedure>(sxml:attr->html attr)</procedure>

Returns a list containing tokens that when joined together form the
attribute's HTML output.  The difference with the XML variant is that
this encodes empty attribute values to attributes with no value (think
{{selected}} in option elements, or {{checked}} in checkboxes).

'''Warning:''' This procedure assumes that the attribute's values have
already been escaped (ie, {{sxml:string->html}} has been called on the
strings inside it).

<procedure>(sxml:string->html string)</procedure>

Escape the {{string}} so it can be used anywhere in HTML output.  This
converts the {{<}}, {{>}}, {{"}} and {{&}} characters to their
respective entities.

<procedure>(sxml:non-terminated-html-tag? tag)</procedure>

Is the named {{tag}} one that is "self-closing" (ie, does not need to
be terminated) in HTML 4.0?

<procedure>(sxml:sxml->html tree)</procedure>

Convert the {{tree}} of SXML nodes to a nested list of HTML fragments.
These fragments can be output by flattening the list and concatenating
the strings inside it.


=== Procedures from sxpathlib

==== Basic converters and applicators

A converter is a function

  type Converter = Node|Nodelist -> Nodelist

A converter can also play a role of a predicate: in that case, if a
converter, applied to a node or a nodelist, yields a non-empty
nodelist, the converter-predicate is deemed satisfied. Throughout this
file a nil nodelist is equivalent to {{#f}} in denoting a failure.

<procedure>(nodeset? obj)</procedure>

Returns {{#t}} if {{obj}} is a nodelist.

<procedure>(as-nodeset obj)</procedure>

If {{obj}} is a nodelist - returns it as is, otherwise wrap it in a
list.
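
The distinction between a node and a nodelist can be sketched as follows, assuming {{(use sxpath-lolevel)}}.  A single element is a list too, but one that starts with a symbol, so it does not count as a nodelist.

```scheme
(use sxpath-lolevel)

(nodeset? '((div "a") (span "b")))  ; => #t, a list of nodes
(nodeset? '(div "a"))               ; => #f, a single element
(nodeset? '())                      ; => #t, the empty nodelist

(as-nodeset '(div "a"))             ; => ((div "a"))
(as-nodeset "text")                 ; => ("text")
(as-nodeset '((div "a")))           ; => ((div "a")), returned as is
```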

==== Node test

The following functions implement 'Node test's as defined in Sec. 2.3
of the XPath document.  A node test is one of the components of a
location step.  It is also a converter-predicate in SXPath.

<procedure>(sxml:element? obj)</procedure>

Predicate which returns {{#t}} if {{obj}} is SXML element, otherwise {{#f}}.

<procedure>(ntype-names?? crit)</procedure>

Takes a list of acceptable node names as a criterion and returns a
function, which, when applied to a node, will return {{#t}} if the
node name is present in criterion list and {{#f}} otherwise.

   ntype-names?? :: ListOfNames -> Node -> Boolean

<procedure>(ntype?? crit)</procedure>

Takes a type criterion and returns a function, which, when applied to
a node, will tell if the node satisfies the test.

  ntype?? :: Crit -> Node -> Boolean

The criterion {{crit}} is one of the following symbols:

; {{@}} : tests if the Node is an {{attributes-list}}
; {{*}} : tests if the Node is an {{Element}}
; {{*text*}} : tests if the Node is a text node
; {{*data*}} : tests if the Node is a data node  (text, number, boolean, etc., but not pair)
; {{*PI*}} : tests if the Node is a processing instructions node
; {{*COMMENT*}} : tests if the Node is a comment node
; {{*ENTITY*}} : tests if the Node is an entity node
; {{*any*}} : {{#t}} for any type of Node
; other symbol : tests if the Node has the right name given by the symbol

<examples>
<example>
<expr>
((ntype?? 'div) '(div (@ (class "greeting")) "hi"))
</expr>
<result>
#t
</result>
</example>
<example>
<expr>
((ntype?? 'div) '(span (@ (class "greeting")) "hi"))
</expr>
<result>
#f
</result>
</example>
<example>
<expr>
((ntype?? '*) '(span (@ (class "greeting")) "hi"))
</expr>
<result>
#t
</result>
</example>
</examples>

<procedure>(ntype-namespace-id?? ns-id)</procedure>

This function takes a namespace-id, and returns a predicate
{{Node -> Boolean}}, which is {{#t}} for nodes with the given
namespace id. {{ns-id}} is a string.
{{(ntype-namespace-id?? #f)}} will be {{#t}} for nodes with
non-qualified names.

<procedure>(sxml:complement pred)</procedure>

This function takes a predicate and returns it complemented, that is
if the given predicate yields {{#f}} or {{'()}} the complemented one
yields the given node and vice versa.

<procedure>(node-eq? other)</procedure>

Returns a predicate procedure that, given a node, returns {{#t}} if
the node is the exact same as {{other}}.

<procedure>(node-equal? other)</procedure>

Returns a predicate procedure that, given a node, returns {{#t}} if
the node has the same contents as {{other}}.
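
The difference between identity and structural equality, and a typical use of {{sxml:complement}}, can be sketched like this.  The sketch assumes {{(use sxpath-lolevel)}}; the nodes {{a}} and {{b}} are invented, and {{sxml:filter}} is the procedure documented further down.

```scheme
(use sxpath-lolevel)

(define a '(div "hi"))
(define b (list 'div "hi"))  ; same contents as a, but a distinct object

((node-eq? a) a)     ; => #t
((node-eq? a) b)     ; => #f
((node-equal? a) b)  ; => #t

;; Keep every node that is NOT a div:
((sxml:filter (sxml:complement (ntype?? 'div)))
 '((div "hi") (span "hello") (div "still here?")))
;; => ((span "hello"))
```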

<procedure>(node-pos n)</procedure>

Returns a procedure that, given a nodelist, returns a new nodelist
containing only the {{n}}th element, counting from 1.  If {{n}} is
negative, it returns a nodelist with the {{n}}th element counting from
the right.  If no such node exists, returns the empty list.  {{n}} may
not equal zero.

<examples>
<example>
<expr>
((node-pos 1) '((div "hi") (span "hello") (em "really, hi!")))
</expr>
<result>
((div "hi"))
</result>
</example>
<example>
<expr>
((node-pos 6) '((div "hi") (span "hello") (em "really, hi!")))
</expr>
<result>
()
</result>
</example>
<example>
<expr>
((node-pos -1) '((div "hi") (span "hello") (em "is this thing on?")))
</expr>
<result>
((em "is this thing on?"))
</result>
</example>
</examples>

<procedure>(sxml:filter pred?)</procedure>

Returns a procedure that accepts a nodelist or a node (which will be
converted to a one-element nodelist) and returns only those nodes for
which the predicate {{pred?}} does not return {{#f}} or {{'()}}.

<examples>
<example>
<expr>
((sxml:filter (ntype?? 'div)) '((div "hi") (span "hello") (div "still here?")))
</expr>
<result>
((div "hi") (div "still here?"))
</result>
</example>
</examples>

<procedure>(take-until pred?)</procedure>
<procedure>(take-after pred?)</procedure>

Returns a procedure that accepts a node or a nodelist.

The {{take-until}} variant returns everything ''before'' the first
node for which the predicate {{pred?}} returns anything but {{#f}} or
{{'()}}.  In other words, it returns the longest prefix for which the
predicate returns {{#f}} or {{'()}}.

The {{take-after}} variant returns everything ''after'' the first node
for which the predicate {{pred?}} returns anything besides {{#f}} or
{{'()}}.

<examples>
<example>
<expr>
((take-until (ntype?? 'span)) '((div "hi") (span "hello") (span "there") (div "still here?")))
</expr>
<result>
((div "hi"))
</result>
</example>
<example>
<expr>
((take-after (ntype?? 'span)) '((div "hi") (span "hello") (span "there") (div "still here?")))
</expr>
<result>
((span "there") (div "still here?"))
</result>
</example>
</examples>

<procedure>(map-union proc list)</procedure>

Apply {{proc}} to each element of the nodelist {{list}} and return the
list of results.  If {{proc}} returns a nodelist, splice it into the
result (essentially returning a flattened nodelist).

<procedure>(node-reverse node-or-nodelist)</procedure>

Accepts a nodelist and reverses the nodes inside.  If a node is passed
to this procedure, it returns a nodelist containing just that node
(it does not change the order of the children).
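
The splicing behaviour of {{map-union}} and the node-versus-nodelist behaviour of {{node-reverse}} can be sketched as follows, assuming {{(use sxpath-lolevel)}}; the {{items}} nodelist is invented.

```scheme
(use sxpath-lolevel)

(define items '((li "Ardbeg") (li "Glenfarclas")))

;; sxml:content returns a nodelist per item; the lists are spliced together:
(map-union sxml:content items)  ; => ("Ardbeg" "Glenfarclas")

;; sxml:name returns a symbol rather than a list, so results are collected:
(map-union sxml:name items)     ; => (li li)

(node-reverse items)            ; => ((li "Glenfarclas") (li "Ardbeg"))
(node-reverse '(li "Ardbeg"))   ; => ((li "Ardbeg")), a single node is wrapped
```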
858
859==== Converter combinators
860
861Combinators are higher-order functions that transmogrify a converter
862or glue a sequence of converters into a single, non-trivial
863converter. The goal is to arrive at converters that correspond to
864XPath location paths.
865
866From a different point of view, a combinator is a fixed, named
867''pattern'' of applying converters. Given below is a complete set of
868such patterns that together implement XPath location path
869specification. As it turns out, all these combinators can be built
870from a small number of basic blocks; regular functional composition,
871{{map-union}} and filter applicators, and the nodelist union.
872
873<procedure>(select-kids pred?)</procedure>
874
875Returns a procedure that accepts a node and returns a nodelist of the
876node's children that satisfy {{pred?}} (ie, {{pred?}} returns anything
877but {{#f}} or {{'()}}).
878
879<procedure>(node-self pred?)</procedure>
880
881Similar to {{select-kids}} but applies to the node itself rather than
882to its children. The resulting Nodelist will contain either one
883component (the node), or will be empty (if the node failed the
884predicate).

<procedure>(node-join . selectors)</procedure>

Returns a procedure that accepts a nodelist or a node, and returns a
nodelist with all the selectors applied to every node in sequence.
The selectors must function as converter combinators, i.e., they must
accept a ''node'' and output a ''nodelist''.

<examples>
<example>
<expr>
((node-join
  (select-kids (ntype?? 'li))
  sxml:content)
 '((ul (@ (class "whiskies"))
       (li "Ardbeg")
       (li "Glenfarclas")
       (li "Springbank"))))
</expr>
<result>
("Ardbeg" "Glenfarclas" "Springbank")
</result>
</example>
</examples>

<procedure>(node-reduce . converters)</procedure>

A regular functional composition of converters.

From a different point of view,
  ((apply node-reduce converters) nodelist)
is equivalent to
  (fold apply nodelist converters)
i.e., folding, or reducing, a list of converters with the nodelist
as a seed.
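The fold reading can be checked with a short sketch. It assumes CHICKEN 4 with SRFI-1 available. The fold step is written as an explicit lambda because SRFI-1's {{fold}} passes the list element first and the accumulator second. The tree {{doc}} and the other names are hypothetical:

```scheme
(use sxpath-lolevel srfi-1)

(define doc '(body (ul (li "a") (li "b"))))   ; hypothetical sample tree

(define converters
  (list (select-kids (ntype?? 'ul))
        (select-kids (ntype?? 'li))))

;; Composition through node-reduce ...
(define reduced ((apply node-reduce converters) doc))

;; ... and the same thing spelled out as a left fold: each converter
;; is applied to the result accumulated so far.
(define folded
  (fold (lambda (converter acc) (converter acc)) doc converters))
;; both => ((li "a") (li "b"))
```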


<procedure>(node-or . converters)</procedure>

This combinator applies all converters to a given node and produces
the union of their results.  This combinator corresponds to a union,
"{{|}}" operation for XPath location paths.
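A brief sketch of the union, roughly what the XPath expression {{h1|p}} would select. It assumes CHICKEN 4, and the tree and names are illustrative only. In this sketch the results come out grouped per converter rather than in document order:

```scheme
(use sxpath-lolevel)

(define doc '(body (p "one") (h1 "title") (p "two")))   ; hypothetical tree

(define headings-or-paras
  (node-or (select-kids (ntype?? 'h1))
           (select-kids (ntype?? 'p))))

(define result (headings-or-paras doc))
;; result => ((h1 "title") (p "one") (p "two"))
```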

<procedure>(node-closure test-pred?)</procedure>

Select all ''descendants'' of a node that satisfy a
converter-predicate.  This combinator is similar to {{select-kids}}
but applies to grandchildren and deeper descendants as well.
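To make the contrast with {{select-kids}} concrete, here is a small sketch (CHICKEN 4 assumed; the sample tree and names are invented):

```scheme
(use sxpath-lolevel)

(define doc '(body (ul (li "a") (li "b")) (ol (li "c"))))   ; hypothetical

;; No li is a direct child of body ...
(define kids ((select-kids (ntype?? 'li)) doc))        ; => ()

;; ... but three of them are descendants.
(define descendants ((node-closure (ntype?? 'li)) doc))
;; descendants => ((li "a") (li "b") (li "c"))
```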

<procedure>(node-trace title)</procedure>

Returns a procedure that accepts a node or a nodelist, which it
pretty-prints to the current output port, preceded by {{title}}.  It
returns the node or the nodelist unchanged.  This is a useful
debugging aid, since it doesn't really do anything besides print its
argument and pass it on.

<procedure>(sxml:node? obj)</procedure>

Returns {{#t}} if the given {{obj}} is an SXML node, {{#f}} otherwise.
A node is anything except an attribute list or an auxiliary list.

<procedure>(sxml:attr-list node)</procedure>

Returns the list of attributes for a given SXML node.  The empty list
is returned if the given node is not an element, or if it has no list
of attributes.

This differs from {{sxml:attr-list-u}} in that this procedure accepts
any SXML node while {{sxml:attr-list-u}} only accepts nodelists or
elements.  This means that {{sxml:attr-list-u}} will throw an error if you
pass it a text node (a string), while {{sxml:attr-list}} will not.
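The three cases described above can be sketched as follows (CHICKEN 4 assumed; the sample nodes and names are made up):

```scheme
(use sxpath-lolevel)

;; An element with an attribute list.
(define with-attrs
  (sxml:attr-list '(div (@ (id "nav") (class "menu")) "text")))
;; with-attrs => ((id "nav") (class "menu"))

;; An element without attributes, and a text node: both give ().
(define no-attrs  (sxml:attr-list '(div "text")))
(define text-node (sxml:attr-list "text"))
```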

<procedure>(sxml:attribute test-pred?)</procedure>

Like {{sxml:filter}}, but considers the attributes instead of the
nodes.  Returns a nodelist of attributes that match {{test-pred?}}.

<examples>
<example>
<expr>
((sxml:attribute (ntype?? 'id))
 '((div (@ (id "navigation")) "navigation here")
   (div (@ (class "pullquote")) "random stuff")
   (div (@ (id "main-content")) "lorem ipsum ...")))
</expr>
<result>
((id "navigation") (id "main-content"))
</result>
</example>
</examples>

<procedure>(sxml:child test-pred?)</procedure>

This procedure is similar to {{select-kids}}, but it returns an empty
child-list for PI, Comment and Entity nodes.
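The difference shows up on a comment node. This is a sketch assuming CHICKEN 4; the nodes and names are hypothetical:

```scheme
(use sxpath-lolevel)

(define comment '(*COMMENT* "not a child"))

;; select-kids treats the comment text as a child ...
(define via-select-kids ((select-kids sxml:node?) comment))
;; via-select-kids => ("not a child")

;; ... while sxml:child reports no children for a comment.
(define via-child ((sxml:child sxml:node?) comment))   ; => ()

;; On an ordinary element the attribute list is skipped.
(define div-kids
  ((sxml:child sxml:node?) '(div (@ (id "x")) "text" (p "para"))))
;; div-kids => ("text" (p "para"))
```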

<procedure>(sxml:parent test-pred?)</procedure>

Returns a procedure that accepts a root-node, and returns another
procedure.  This second procedure accepts a nodeset (or a node) and
returns the immediate parents of the nodes in the set, but only
those parents that match the predicate.

The root-node does not have to be the root node of the
whole SXML tree -- it may be a root node of a branch of interest.

This procedure can be used with any SXML node.
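A sketch of the two-step application, assuming CHICKEN 4. SXML nodes carry no parent pointers, so the node passed in should be the very object found inside the tree (the lookup appears to compare by identity). Here it is taken from the hypothetical {{doc}} with {{cadr}}:

```scheme
(use sxpath-lolevel)

(define doc '(ul (li "a") (li "b")))   ; hypothetical sample tree
(define first-li (cadr doc))           ; the same object that sits in doc

;; Parents of first-li that are ul elements, searched from doc.
(define parents (((sxml:parent (ntype?? 'ul)) doc) first-li))
;; parents => ((ul (li "a") (li "b")))

;; The parent exists but fails the predicate, so nothing is returned.
(define none (((sxml:parent (ntype?? 'ol)) doc) first-li))   ; => ()
```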

==== Useful shortcuts

<procedure>(node-parent rootnode)</procedure>

{{(node-parent rootnode)}} yields a converter that returns a parent of a
node it is applied to. If applied to a nodelist, it returns the list
of parents of nodes in the nodelist.

This is equivalent to {{((sxml:parent (ntype?? '*any*)) rootnode)}}.

<procedure>(sxml:child-nodes node)</procedure>

Returns all the child nodes of the given {{node}}.

This is equivalent to {{((sxml:child sxml:node?) node)}}.

<procedure>(sxml:child-elements node)</procedure>

Returns all the child ''elements'' of the given {{node}} (i.e.,
excludes any text nodes).

This is equivalent to {{((select-kids sxml:element?) node)}}.
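The two child shortcuts side by side, as a sketch under the assumption of CHICKEN 4 (the node {{para}} and the result names are invented):

```scheme
(use sxpath-lolevel)

(define para '(p (@ (class "intro")) "Hello " (b "world")))

;; All child nodes: text and elements, but not the attribute list.
(define nodes (sxml:child-nodes para))
;; nodes => ("Hello " (b "world"))

;; Child elements only.
(define elements (sxml:child-elements para))
;; elements => ((b "world"))
```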

=== Procedures from sxpath-ext

==== SXML counterparts to W3C XPath Core Functions Library

<procedure>(sxml:string object)</procedure>

The counterpart to XPath 'string' function (section 4.2 XPath 1.0 Rec.).
Converts a given object to a string.

Notes:
# When converting a nodeset, document order is not preserved
# {{number->string}} returns the result in a form which is slightly different from XPath Rec. specification
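A few conversions, sketched under the assumption of CHICKEN 4 (the variable names are illustrative). For a nodeset, the string-value of the first node in the list is used:

```scheme
(use sxpath-lolevel)

(define from-string  (sxml:string "abc"))   ; => "abc"
(define from-boolean (sxml:string #t))      ; => "true"
(define from-number  (sxml:string 42))      ; => "42"
(define from-empty   (sxml:string '()))     ; => ""

;; Only the first node of the nodeset contributes.
(define from-nodeset
  (sxml:string '((p "Hello " (b "world")) (p "ignored"))))
;; from-nodeset => "Hello world"
```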

<procedure>(sxml:boolean object)</procedure>

The counterpart to XPath 'boolean' function (section 4.3 XPath Rec.).
Converts its argument to a boolean.
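The XPath truthiness rules can be sketched like this (CHICKEN 4 assumed; names are made up). Note that a non-empty string is true even when it reads "false":

```scheme
(use sxpath-lolevel)

(define zero          (sxml:boolean 0))            ; => #f
(define nonzero       (sxml:boolean 3.5))          ; => #t
(define empty-string  (sxml:boolean ""))           ; => #f
(define false-string  (sxml:boolean "false"))      ; => #t
(define empty-nodeset (sxml:boolean '()))          ; => #f
(define some-nodeset  (sxml:boolean '((p "x"))))   ; => #t
```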

<procedure>(sxml:number object)</procedure>

The counterpart to XPath 'number' function (section 4.4 XPath Rec.).
Converts its argument to a number.

Notes:
# The argument is not optional (yet?)
# string->number conversion is not IEEE 754 round-to-nearest
# NaN is represented as 0
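A short sketch of the conversions, including the NaN-as-0 behaviour from the notes above (CHICKEN 4 assumed; names are illustrative):

```scheme
(use sxpath-lolevel)

(define from-number  (sxml:number 7))       ; => 7
(define from-string  (sxml:number "42"))    ; => 42
(define from-true    (sxml:number #t))      ; => 1
(define from-garbage (sxml:number "abc"))   ; => 0, standing in for NaN
```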

<procedure>(sxml:string-value node)</procedure>

Returns a string value for a given node in accordance with
XPath Rec. 5.1 - 5.7.

<procedure>(sxml:id id-index)</procedure>

Returns a procedure that accepts a nodeset and returns a nodeset
containing the elements in the id-index that match the string-values
of each entry of the nodeset (see XPath Rec. 4.1).

The {{id-index}} is an alist with unique IDs as keys, and elements as
values:

  id-index = ( (id-value . element) (id-value . element) ... )

==== Comparators for XPath objects

<procedure>(sxml:list-head list n)</procedure>

Returns the {{n}} first members of {{list}}.  Mostly equivalent to
SRFI-1's {{take}} procedure, except it returns the {{list}} if {{n}}
is larger than the length of said list, instead of throwing an error.
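Both cases in a small sketch (CHICKEN 4 assumed; the names are invented):

```scheme
(use sxpath-lolevel)

(define first-two (sxml:list-head '(1 2 3 4) 2))   ; => (1 2)

;; Asking for more members than the list has is not an error.
(define too-many (sxml:list-head '(1 2) 5))        ; => (1 2)
```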

<procedure>(sxml:merge-sort less-than? list)</procedure>

Returns the sorted list, the smallest member first.
  less-than? ::= (lambda (obj1 obj2) ...)
{{less-than?}} returns {{#t}} if {{obj1 < obj2}} with respect to the
given ordering.
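For instance, with the standard {{<}} and {{string<?}} predicates as orderings (a sketch assuming CHICKEN 4; the names are illustrative):

```scheme
(use sxpath-lolevel)

(define sorted-numbers (sxml:merge-sort < '(3 1 2)))
;; sorted-numbers => (1 2 3)

(define sorted-strings
  (sxml:merge-sort string<? '("pear" "apple" "fig")))
;; sorted-strings => ("apple" "fig" "pear")
```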

<procedure>(sxml:equality-cmp bool=? number=? string=?)</procedure>

A helper for XPath equality operations: {{=}} , {{!=}}.  The
{{bool=?}}, {{number=?}} and {{string=?}} arguments are comparison
operations for booleans, numbers and strings respectively.

Returns a procedure that accepts two objects, looks at the first
object's type and applies the correct comparison predicate to it.
Type coercion takes place depending on the rules described in the
XPath 1.0 spec, section 3.4 ("Booleans").
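As an illustration of plugging in a custom predicate, the following sketch builds a comparator that ignores case for strings. It assumes CHICKEN 4. The name {{ci-equal?}} is hypothetical and not part of the egg:

```scheme
(use sxpath-lolevel)

;; Same as the default equality, except strings compare case-insensitively.
(define ci-equal? (sxml:equality-cmp eq? = string-ci=?))

(define same-name  (ci-equal? "Ardbeg" "ARDBEG"))   ; => #t
(define other-name (ci-equal? "Ardbeg" "Lagavulin")) ; => #f

;; A number on one side makes the number predicate apply after coercion.
(define coerced (ci-equal? 1 "1"))                  ; => #t
```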

<procedure>(sxml:equal? obj1 obj2)</procedure>
<procedure>(sxml:not-equal? obj1 obj2)</procedure>

Equality procedures with the default comparison operators {{eq?}},
{{=}} and {{string=?}}, or their inverse, respectively.
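A sketch of the coercion in action, assuming CHICKEN 4 (all names are invented). When a nodeset meets a string, the comparison succeeds if any node's string-value matches:

```scheme
(use sxpath-lolevel)

(define number-vs-string (sxml:equal? 1 "1"))            ; => #t
(define boolean-vs-string (sxml:equal? #t "non-empty"))  ; => #t
(define different-strings (sxml:not-equal? "a" "b"))     ; => #t

;; A nodeset compared with a string.
(define nodeset-vs-string
  (sxml:equal? '((li "a") (li "b")) "b"))                ; => #t
```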

<procedure>(sxml:relational-cmp op)</procedure>

A helper for XPath relational operations: {{<}}, {{>}}, {{<=}}, {{>=}}
for two XPath objects.  {{op}} is one of these operators.

Returns a procedure that accepts two objects and returns the value of
the procedure applied to these objects, converted according to the
coercion rules described in the XPath 1.0 spec, section 3.4
("Booleans").
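A minimal sketch, assuming CHICKEN 4; the name {{xpath<}} is hypothetical. Non-nodeset operands are converted to numbers before {{op}} is applied:

```scheme
(use sxpath-lolevel)

(define xpath< (sxml:relational-cmp <))

(define one-lt-two (xpath< 1 "2"))    ; => #t, "2" is converted to 2
(define ten-lt-nine (xpath< "10" 9))  ; => #f, compared as numbers

;; Booleans are converted too: #t becomes 1.
(define true-ge-one ((sxml:relational-cmp >=) #t 1))   ; => #t
```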

==== XPath axes

<procedure>(sxml:ancestor test-pred?)</procedure>

Like {{sxml:parent}}, except it returns all the ancestors that match
{{test-pred?}}, not just the immediate parent.
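A sketch of the ancestor axis, assuming CHICKEN 4 and the same root-node-then-node calling convention as {{sxml:parent}}. The tree and names are hypothetical, and {{para}} is taken from {{doc}} itself so that it is the same object:

```scheme
(use sxpath-lolevel)

(define doc '(html (body (p "text"))))   ; hypothetical sample tree
(define para (cadr (cadr doc)))          ; the p element inside doc

;; All element ancestors of para: body and html.
(define ancestors (((sxml:ancestor (ntype?? '*)) doc) para))

;; Restricting the predicate narrows the result to html only.
(define html-only (((sxml:ancestor (ntype?? 'html)) doc) para))
```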

<procedure>(sxml:ancestor-or-self test-pred?)</procedure>

Like {{sxml:ancestor}}, except also allows the node itself to match
the predicate.

<procedure>(sxml:descendant test-pred?)</procedure>

Like {{node-closure}}, except the resulting nodeset is in depth-first
order instead of breadth-first.
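The ordering difference can be seen on a tree where a deeper match comes earlier in the document than a shallower one. This is a sketch assuming CHICKEN 4, with an invented tree and names:

```scheme
(use sxpath-lolevel)

(define doc '(a (b (c "deep")) (c "shallow")))   ; hypothetical tree

;; Breadth-first: the shallow c is found before the deep one.
(define breadth-first ((node-closure (ntype?? 'c)) doc))
;; breadth-first => ((c "shallow") (c "deep"))

;; Depth-first: results follow document order.
(define depth-first ((sxml:descendant (ntype?? 'c)) doc))
;; depth-first => ((c "deep") (c "shallow"))
```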

<procedure>(sxml:descendant-or-self test-pred?)</procedure>

Like {{sxml:descendant}}, except also allows the node itself to match
the predicate.

<procedure>(sxml:following test-pred?)</procedure>

Returns a procedure that accepts a root node and returns a new
procedure that accepts a node and returns all nodes that follow this
node in the document source and match the predicate.

<procedure>(sxml:following-sibling test-pred?)</procedure>

Like {{sxml:following}}, except only siblings (nodes at the same level
under the same parent) are returned.
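A sketch of the sibling axis, assuming CHICKEN 4 and the root-node-then-node calling convention described for {{sxml:following}}. The tree and names are hypothetical:

```scheme
(use sxpath-lolevel)

(define doc '(ul (li "a") (li "b") (li "c")))   ; hypothetical sample tree
(define first-li (cadr doc))                    ; the same object as in doc

;; Siblings that come after the first li.
(define later (((sxml:following-sibling (ntype?? 'li)) doc) first-li))
;; later => ((li "b") (li "c"))
```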

<procedure>(sxml:preceding test-pred?)</procedure>

Returns a procedure that accepts a root node and returns a new
procedure that accepts a node and returns all nodes that precede this
node in the document source and match the predicate.

<procedure>(sxml:preceding-sibling test-pred?)</procedure>

Like {{sxml:preceding}}, except only siblings (nodes at the same level
under the same parent) are returned.

<procedure>(sxml:namespace test-pred?)</procedure>

Returns a procedure that accepts a nodeset and returns the namespace
lists of the nodes matching {{test-pred?}}.


== About this egg

=== Author

[[http://okmij.org/ftp/|Oleg Kiselyov]], [[http://www196.pair.com/lisovsky/|Kirill Lisovsky]], [[http://modis.ispras.ru/Lizorkin/index.html|Dmitry Lizorkin]].

=== Version history

; 0.1 : Split up the old sxml-tools egg into sxpath

=== License

The sxml-tools are in the public domain.