Changeset 11787 in project


Timestamp:
08/27/08 22:58:06 (13 years ago)
Author:
sjamaan
Message:

Add some docs from the sxml-transforms sources to the wiki page, so it doesn't scare newbies so much :)

File:
1 edited

  • wiki/sxml-transforms

    r3274 r11787  
    11[[tags:eggs]]
    22
    3 This is version 1.1 of the '''sxml-transforms''' extension library for Chicken Scheme.
     3This is version 1.2 of the '''sxml-transforms''' extension library for Chicken Scheme.
    44
    55[[toc:]]
     
    77== Description
    88
    9 
    10 
    119The [[http://cvs.sourceforge.net/viewcvs.py/ssax/SSAX|SXML transformations]] (to XML, SXML, and HTML) from the [[http://ssax.sf.net|SSAX project]]
    1210
    13 
    1411== Documentation
    1512
    1613
    17 This egg provides the SXML transforms available in the SSAX/SXML Sourceforge project.  It incorporates one main extension, and an auxiliary one:
    18 
    19  '''extension:''' (require-extension sxml-transforms)
    20 ; From SXML-tree-trans.scm: : {{SRV:send-reply pre-post-order post-order foldts replace-range}}
    21 ; From SXML-to-HTML.scm: : {{SXML->HTML entag enattr string->goodHTML}}
    22 ; From SXML-to-HTML-ext.scm: : {{universal-conversion-rules universal-protected-rules alist-conv-rules}}
    23 ; From util.scm: : {{make-char-quotator}}
    24 ; Chicken-specific modifications: : ; {{entag-xhtml entag-html}} :
    25 entag-xhtml closes XHTML tags properly in an HTML compatible way.  entag is now an alias for entag-xhtml, so this behaviour is the default.  entag-html is an alias for the original entag.
     14This egg provides the SXML transforms available in the SSAX/SXML Sourceforge project.  It incorporates one main module, and an auxiliary one:
     15
     16=== sxml-transforms
     17
     18<procedure>(SRV:send-reply . fragments)</procedure>
     19
     20Output the FRAGMENTS to the current output port.
     21
     22The fragments are a list of strings, characters, numbers, thunks, #f,
     23#t -- and other fragments.  The function traverses the tree
     24depth-first, writes out strings and characters, executes thunks, and
     25ignores #f and '().  The function returns #t if anything was written
     26at all; otherwise the result is #f.  If #t occurs among the fragments,
     27it is not written out but causes the result of SRV:send-reply to be #t.
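
As a small illustration of the fragment rules described above, the following sketch sends a nested fragment list to the current output port. It assumes the egg is loaded with {{(require-extension sxml-transforms)}}, as the earlier revision of this page showed; the variable names are made up for the example.

```scheme
;; Sketch only: assumes the sxml-transforms egg is installed and loadable.
(require-extension sxml-transforms)

;; Strings and characters are written, the thunk is called,
;; and #f and '() are skipped.
(define wrote-something
  (SRV:send-reply "Hello, "
                  (list "wor" #\l "d")        ; nested fragments are traversed
                  #f                          ; ignored
                  '()                         ; ignored
                  (lambda () (display "!")))) ; thunk, executed in order
(newline)

;; Nothing is written here, so the result should be #f.
(define wrote-nothing (SRV:send-reply #f '()))
```

The return value makes it possible to tell whether a fragment tree produced any output at all.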
     28
     29<procedure>(pre-post-order tree bindings)</procedure>
     30
     31Traversal of an SXML tree or a grove: a <Node> or a <Nodelist>
     32
     33A <Node> and a <Nodelist> are mutually-recursive datatypes that
     34underlie the SXML tree:
     35     <Node> ::= (name . <Nodelist>) | "text string"
     36An (ordered) set of nodes is just a list of the constituent nodes:
     37     <Nodelist> ::= (<Node> ...)
     38Nodelists, and Nodes other than text strings are both lists. A
     39<Nodelist> however is either an empty list, or a list whose head is
     40not a symbol (an atom in general). A symbol at the head of a node is
     41either an XML name (in which case it's a tag of an XML element), or
     42an administrative name such as '@'.
     43See SXPath.scm and SSAX.scm for more information on SXML.
     44
     45Pre-Post-order traversal of a tree and creation of a new tree:
     46    pre-post-order:: <tree> x <bindings> -> <new-tree>
     47where
     48    <bindings> ::= (<binding> ...)
     49    <binding> ::= (<trigger-symbol> *preorder* . <handler>) |
     50                  (<trigger-symbol> *macro* . <handler>) |
     51                  (<trigger-symbol> <new-bindings> . <handler>) |
     52                  (<trigger-symbol> . <handler>)
     53    <trigger-symbol> ::= XMLname | *text* | *default*
     54    <handler> :: <trigger-symbol> x [<tree>] -> <new-tree>
     55
     56The pre-post-order function visits the nodes and nodelists
     57pre-post-order (depth-first).  For each <Node> of the form (name
     58<Node> ...) it looks up an association with the given 'name' among
     59its <bindings>. If that lookup fails, pre-post-order tries to locate a
     60*default* binding. It's an error if the latter attempt fails as
     61well.  Having found a binding, the pre-post-order function first
     62checks to see if the binding is of the form
     63  (<trigger-symbol> *preorder* . <handler>)
     64If it is, the handler is 'applied' to the current node. Otherwise,
     65the pre-post-order function first calls itself recursively for each
     66child of the current node, with <new-bindings> prepended to the
     67<bindings> in effect. The result of these calls is passed to the
     68<handler> (along with the head of the current <Node>). To be more
     69precise, the handler is _applied_ to the head of the current node
     70and its processed children. The result of the handler, which should
     71also be a <tree>, replaces the current <Node>. If the current <Node>
     72is a text string or other atom, a special binding with a symbol
     73*text* is looked up.
     74
     75A binding can also be of the form
     76     (<trigger-symbol> *macro* . <handler>)
     77This is equivalent to *preorder* described above. However, the result
     78is processed again with the current stylesheet.
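
To make the binding forms above more concrete, here is a minimal sketch of an SXML-to-SXML rewrite. It assumes the egg is loaded with {{(require-extension sxml-transforms)}} and relies on {{alist-conv-rules}} to supply the {{*default*}} and {{*text*}} bindings; the handler and the name {{result}} are illustrative only.

```scheme
;; Sketch only: assumes the sxml-transforms egg is installed and loadable.
(require-extension sxml-transforms)

(define result
  (pre-post-order
   '(p "Hello " (em "world"))
   `(;; A plain (<trigger-symbol> . <handler>) binding: the handler gets
     ;; the tag followed by the already-processed children.
     (em . ,(lambda (tag . kids) `(strong ,@kids)))
     ;; Identity rules for everything else (*default* and *text*).
     ,@alist-conv-rules)))

(write result)
(newline)
```

With these bindings every {{em}} element is rebuilt as {{strong}}, while all other nodes and text are passed through unchanged.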
     79
     80<procedure>(post-order tree bindings)</procedure>
     81
     82Deprecated. This was a version of pre-post-order that did not accept
     83{{*macro*}} or {{*preorder*}} directives.
     84
     85<procedure>(foldts fdown fup fhere seed tree)</procedure>
     86
     87Tree fold operator.
     88
     89    tree = atom | (node-name tree ...)
     90
     91    foldts fdown fup fhere seed (Leaf str) = fhere seed str
     92    foldts fdown fup fhere seed (Nd kids) =
     93          fup seed $ foldl (foldts fdown fup fhere) (fdown seed) kids
     94
     95    procedure fhere: seed -> atom -> seed
     96    procedure fdown: seed -> node -> seed
     97    procedure fup: parent-seed -> last-kid-seed -> node -> seed
     98
     99foldts returns the final seed.
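
As a rough sketch of how the three procedures cooperate, the following collects every text string in a small tree. It assumes the egg is loaded with {{(require-extension sxml-transforms)}}; the names {{collect-strings}} and {{strings}} are invented for the example.

```scheme
;; Sketch only: assumes the sxml-transforms egg is installed and loadable.
(require-extension sxml-transforms)

(define (collect-strings tree)
  (reverse
   (foldts
    ;; fdown: seed x node -> seed; keep accumulating into the same list
    (lambda (seed node) seed)
    ;; fup: parent-seed x last-kid-seed x node -> seed; keep what the kids found
    (lambda (parent-seed last-kid-seed node) last-kid-seed)
    ;; fhere: seed x atom -> seed; record strings, skip other atoms
    (lambda (seed atom) (if (string? atom) (cons atom seed) seed))
    '()
    tree)))

(define strings (collect-strings '(p "a" (em "b") "c")))
(write strings)
(newline)
```

The seed is threaded through the children from left to right, which is why the accumulated list is reversed at the end.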
     100
     101<procedure>(replace-range beg-pred end-pred forest)</procedure>
     102
     103    procedure: replace-range:: BEG-PRED x END-PRED x FOREST -> FOREST
     104Traverse a forest depth-first and cut/replace ranges of nodes.
     105
     106The nodes that define a range don't have to have the same immediate
     107parent, don't have to be on the same level, and the end node of a
     108range doesn't even have to exist. A replace-range procedure removes
     109nodes from the beginning node of the range up to (but not including)
     110the end node of the range.  In addition, the beginning node of the
     111range can be replaced by a node or a list of nodes. The range of
     112nodes is cut while traversing the forest depth-first. If all
     113branches of a node are cut, the node is cut as well.  The procedure
     114can cut several non-overlapping ranges from a forest.
     115
     116    replace-range:: BEG-PRED x END-PRED x FOREST -> FOREST
     117where
     118    type FOREST = (NODE ...)
     119    type NODE = Atom | (Name . FOREST) | FOREST
     120
     121The range of nodes is specified by two predicates, beg-pred and end-pred.
     122    beg-pred:: NODE -> #f | FOREST
     123    end-pred:: NODE -> #f | FOREST
     124The beg-pred predicate decides on the beginning of the range. The node
     125for which the predicate yields non-#f marks the beginning of the range.
     126The non-#f value of the predicate replaces the node. The value can be a
     127list of nodes. The replace-range procedure then traverses the tree and skips
     128all the nodes, until the end-pred yields non-#f. The value of the end-pred
     129replaces the end-range node. The new end node and its siblings will be
     130re-scanned.
     131The predicates are evaluated pre-order. We do not descend into a node that
     132is marked as the beginning of the range.
     133
     134<procedure>(SXML->HTML tree)</procedure>
     135
     136This procedure is the most generic transformation of SXML
     137into the corresponding HTML document. The SXML tree is traversed
     138post-order (depth-first) and transformed into another tree, which,
     139written in a depth-first fashion, results in an HTML document.
     140
     141It's basically like pre-post-order with the universal-conversion-rules
     142hardcoded. It also knows about a rule {{html:begin}}, which translates
     143the HTML code to old-school uppercase HTML 3 code preceded by a
     144Content-Type header.
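
A minimal usage sketch, assuming the egg is loaded with {{(require-extension sxml-transforms)}}. The second form spells out the two-step equivalent suggested by the description above; it is an approximation for illustration, not necessarily how the egg implements {{SXML->HTML}}.

```scheme
;; Sketch only: assumes the sxml-transforms egg is installed and loadable.
(require-extension sxml-transforms)

(define doc '(p "1 < 2 & " (strong "so on")))

;; Writes the HTML for DOC to the current output port.
(SXML->HTML doc)
(newline)

;; Roughly the same thing in two explicit steps: convert the SXML tree
;; to a tree of fragments, then write the fragments out.
(SRV:send-reply (pre-post-order doc universal-conversion-rules))
(newline)
```

The explicit form is the one to reach for when your own rules need to be combined with the universal ones.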
     145
     146<procedure>(entag tag elems)</procedure>
     147
     148Create the HTML markup fragments for tags. TAG is the name of the tag (a symbol) and ELEMS is the tree of elements that form the contents of this tag (''not'' recursively processed).
     149This is used in the node handlers for the (pre-)post-order function, to prepare it for output by {{SRV:send-reply}}.
     150This is an alias for {{entag-xhtml}} (see below, in the section about Chicken-specific modifications).
     151
     152<procedure>(enattr attr-key value)</procedure>
     153
     154Create the HTML markup fragments for attributes. The ATTR-KEY is the name of the attribute (a symbol) and VALUE is the value it should have.
     155This is used in the node handlers for the (pre-)post-order function, to prepare it for output by {{SRV:send-reply}}.
     156
     157<procedure>(string->goodHTML html)</procedure>
     158
     159Given a string, check to make sure it does not contain characters
     160such as '<' or '&' that require encoding. Return either the original
     161string, or a list of string fragments with special characters
     162replaced by appropriate character entities.
     163
     164<constant>universal-conversion-rules</constant>
     165
     166Bindings for the (pre-)post-order function, which traverses the SXML tree
     167and converts it to a tree of fragments. It contains rules to call
     168{{string->goodHTML}}, {{enattr}} and {{entag}} on all text, attributes and
     169tags. In normal situations you always append these rules to your own rules,
     170or add a final pre-post-order processing step with just these bindings.
     171
     172<constant>universal-protected-rules</constant>
     173
     174A variation of universal-conversion-rules which keeps
     175{{'<'}}, {{'>'}}, {{'&'}} and similar characters intact (i.e., it
     176skips calling {{string->goodHTML}}).
     177The {{universal-protected-rules}} are useful when the tree of
     178fragments has to be traversed one more time.
     179
     180<constant>alist-conv-rules</constant>
     181
     182These rules define the identity transformation. You will usually need
     183to append these rules to all of the bindings you use with {{pre-post-order}},
     184unless you explicitly define your own conversion rules for {{*default*}}
     185and {{*text*}}.
     186
     187<procedure>(make-char-quotator quot-rules)</procedure>
     188
     189Given QUOT-RULES, an assoc list of (char . string) pairs, return
     190a quotation procedure. The returned quotation procedure takes a string
     191and returns either a string or a list of strings. The quotation procedure
     192checks to see if its argument string contains any instance of a character
     193that needs to be encoded (quoted). If the argument string is "clean",
     194it is returned unchanged. Otherwise, the quotation procedure will
     195return a list of string fragments. The input string will be broken
     196at the places where the special characters occur. Each special character
     197will be replaced by its corresponding encoding string.
     198
     199For example, to make a procedure that quotes special HTML characters, do:
     200<example>
     201(make-char-quotator
     202    '((#\< . "&lt;") (#\> . "&gt;") (#\& . "&amp;") (#\" . "&quot;")))
     203</example>
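
The example above only builds the quotation procedure. The following sketch shows how it might then be used, assuming the egg is loaded with {{(require-extension sxml-transforms)}}; the name {{quote-html}} is made up for the example.

```scheme
;; Sketch only: assumes the sxml-transforms egg is installed and loadable.
(require-extension sxml-transforms)

(define quote-html
  (make-char-quotator
   '((#\< . "&lt;") (#\> . "&gt;") (#\& . "&amp;") (#\" . "&quot;"))))

;; A "clean" string comes back unchanged.
(write (quote-html "plain text"))
(newline)

;; A string with special characters comes back as a list of fragments,
;; which SRV:send-reply can write out directly.
(SRV:send-reply (quote-html "a < b & c"))
(newline)
```

Returning fragments instead of a freshly concatenated string fits the fragment-tree output model used by {{SRV:send-reply}}.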
     204
     205==== Chicken-specific modifications
     206
     207<procedure>(entag-xhtml)</procedure>
     208
     209<procedure>(entag-html)</procedure>
     210
     211{{entag-xhtml}} closes XHTML tags properly in an HTML compatible way.  {{entag}} is now an alias for {{entag-xhtml}}, so this behaviour is the default.
     212{{entag-html}} is an alias for the original {{entag}}.
    26213
    27214Newlines before open tags in the rendered HTML output are omitted for inline elements, such as {{tt}} and {{strong}}.  This prevents the introduction of extraneous whitespace.
    28215
    29 
    30 Also, the universal conversion rules have been augmented a bit:
    31 
    32  '''rule:''' (& ENTITY-NAME ...)
     216Also, the {{universal-conversion-rules}} have been augmented a bit:
     217
     218The following rule has been added:
     219   (& ENTITY-NAME ...)
    33220
    34221Quotes character references given by strings {{ENTITY-NAME ...}}.
    35222
    36 Example: {{(& "ndash" "quot") => "&ndash;&quot;"}}
    37 
    38 
    39 
    40 
    41 
    42 
    43  '''extension:''' (require-extension sxml-to-sxml)
    44 
    45 Provides {{pre-post-order-composable}}, a variant of {{pre-post-order}} which always outputs strictly-conformant SXML.  This comes from sxml-to-sxml.scm.
    46 
     223Example:
     224<example>
     225<expr>(& "ndash" "quot")</expr>
     226<result>"&ndash;&quot;"</result>
     227</example>
     228
     229
     230=== sxml-to-sxml
     231
     232<procedure>(pre-post-order tree bindings)</procedure>
     233
     234This module's version of {{pre-post-order}} is a variant which always outputs strictly-conformant SXML. It unnests lists that do not have a tag as their {{car}} until they do.
     235This comes from {{sxml-to-sxml.scm}}. If you import it, be sure to rename or omit the one from the {{sxml-transforms}} module.
    47236
    48237
    49238== Examples
    50 
    51239
    52240[[http://okmij.org/ftp/Scheme/xml.html|Oleg's site]] is the main resource.  Be sure to read his examples and the ones in the SSAX repository (also included in the egg).  The following papers were of great help:
     
    56244Also, the [[eggdoc.html|eggdoc]] extension makes heavy use of sxml-transforms.
    57245
     246The initial documentation on this wiki page came straight from the comments in the extremely well-documented source code. It's recommended you read the code if you want to learn more.
     247
    58248== About this egg
    59249
     
    66256=== Version history
    67257
     258; 1.2 : Port to hygienic chicken
    68259; 1.1 : Improve inline element whitespace handling; add '&' rule.
    69260; 1.0 : Initial release