Changeset 14838 in project


Timestamp:
05/31/09 00:44:15 (10 years ago)
Author:
sjamaan
Message:

Add some docs on ssax. Will improve it later

File:
1 copied

  • wiki/eggref/4/ssax

    r14835 r14838  
    1919None
    2020
    21 === Download
    22 
    23 [[http://www.call-with-current-continuation.org/eggs/ssax.egg|ssax.egg]]
    24 
    2521=== Documentation
    2622
     
    2824comprehensive documentation.
    2925
    30 The following procedures are exported:
     26The following procedure is exported:
    3127
    32 <procedure>ssax:warn </procedure>
    33 <procedure>ssax:skip-pi </procedure>
    34 <procedure>attlist-fold</procedure>
    35 <procedure>ssax:prefix-xml </procedure>
    36 <procedure>ssax:complete-start-tag </procedure>
    37 <procedure>ssax:skip-s </procedure>
    38 <procedure>ssax:read-markup-token</procedure>
    39 <procedure>ssax:assert-token </procedure>
    40 <procedure>ssax:read-char-data </procedure>
    41 <procedure>ssax:skip-internal-dtd </procedure>
    42 <procedure>ssax:s-chars </procedure>
    43 <procedure>ssax:read-qname</procedure>
    44 <procedure>ssax:ncname-starting-char? </procedure>
    45 <procedure>ssax:read-external-id </procedure>
    46 <procedure>ssax:scan-misc</procedure>
    47 <procedure>assert-cur-char</procedure>
    48 <procedure>ssax:handle-parsed-entity </procedure>
    49 <procedure>ssax:complete-start-tag </procedure>
    50 <procedure>xml-token-head </procedure>
    51 <procedure>xml-token-kind</procedure>
    52 <procedure>ssax:uri-string->symbol </procedure>
    53 <procedure>string-whitespace? </procedure>
    54 <procedure>ssax:read-pi-body-as-string</procedure>
    55 <procedure>ssax:read-ncname</procedure>
    56 <procedure>ssax:read-cdata-body</procedure>
    57 <procedure>ssax:read-attributes</procedure>
    58 <procedure>name-compare</procedure>
    59 <procedure>ssax:resolve-name</procedure>
    60 <procedure>SSAX:XML->SXML</procedure>
    61 <procedure>parser-error </procedure>
     28<procedure>(ssax:xml->sxml PORT NAMESPACE-PREFIX-ASSIG)</procedure>
    6229
    63 {{ssax:warn}} and {{parse-error}} are implemented.
     30This procedure reads XML data from {{PORT}} and returns an SXML
     31representation. {{NAMESPACE-PREFIX-ASSIG}} is an alist that maps user
     32prefixes (symbols) to namespaces (URI strings).
    6433
    6534The following macros are available:
    6635
    67 <macro>let-values*</macro>
    68 <macro>SSAX:make-pi-parser</macro>
    69 <macro>SSAX:make-elem-parser</macro>
    70 <macro>SSAX:make-parser</macro>
     36<macro>(ssax:make-parser TAG1 PROC1 [TAG2 PROC2 ...])</macro>
     37
     38Create a custom XML parser; an instance of the XML parsing framework.
     39This will be a SAX, a DOM or a specialized parser depending on the
     40supplied user-handlers.
     41
     42The arguments to {{ssax:make-parser}} are type/procedure pairs,
     43interleaved in the argument list.  In other words, {{TAG1}}, {{TAG2}}
     44etc are '''unquoted'''(!) symbols that identify the type of procedure
     45that follows the tag; see below for the list of allowed tags.  The
     46output of this macro is a procedure that represents a parser which
     47accepts two arguments, {{PORT}} and {{SEED}}.  {{PORT}} is the port
     48from which to read the XML data and {{SEED}} is the initial value of
     49an accumulator that will be passed into the first procedure, where it
     50can be appended to and returned.  Then this value will be passed on to
     51the next procedure and so on to eventually obtain a result, in a
     52{{FOLD}}-like fashion.
     53
     54Given below are tags and signatures of the corresponding procedures.
     55Not all tags have to be specified.  If some are omitted, reasonable
     56defaults will apply. {{SEED}} always represents the current value of
     57the accumulator that will eventually be returned by the parser.
     58
     59Tag: {{DOCTYPE}}
     60Handler-procedure: {{PORT DOCNAME SYSTEMID INTERNAL-SUBSET? SEED}}
     61
     62If {{INTERNAL-SUBSET?}} is {{#t}}, the current position in the port
     63is right after we have read {{#\[}} that begins the internal DTD subset.
     64We must finish reading this subset before we return (or must call
     65{{ssax:skip-internal-dtd}} if we aren't interested in reading it).
     66
     67The port at exit must be at the first symbol after the whole
     68DOCTYPE declaration.
     69The handler-procedure must generate four values:
     70      ELEMS ENTITIES NAMESPACES SEED
     71See {{xml-decl::elems}} for {{ELEMS}}. It may be {{#f}} to switch
     72off the validation.
     73{{NAMESPACES}} will typically contain user prefixes for selected
     74URI symbols.
     75The default handler-procedure skips the internal subset, if any,
     76and returns {{(values #f '() '() SEED)}}.
     77
     78Tag: {{UNDECL-ROOT}}
     79Handler-procedure: {{ELEM-GI SEED}}
     80
     81{{ELEM-GI}} is an {{UNRES-NAME}} of the root element. This procedure
     82is called when an XML document under parsing contains ''no'' {{DOCTYPE}}
     83declaration.
     84The handler-procedure, like the DOCTYPE handler-procedure above,
     85must generate four values:
     86       ELEMS ENTITIES NAMESPACES SEED
     87The default handler-procedure returns {{(values #f '() '() seed)}}.
     88
     89Tag: {{NEW-LEVEL-SEED}}
     90Handler-procedure: see {{ssax:make-elem-parser}}, new-level-seed
     91
     92Tag: {{FINISH-ELEMENT}}
     93Handler-procedure: see {{ssax:make-elem-parser}}, finish-elem
     94
     95Tag: {{CHAR-DATA-HANDLER}}
     96Handler-procedure: see {{ssax:make-elem-parser}}, char-data-handler
     97
     98Tag: {{PI}}
     99Handler-procedure: see {{ssax:make-pi-parser}}
     100The default value is {{'()}}.
     101
     102<macro>(ssax:make-pi-parser PI-HANDLERS)</macro>
     103
     104Create a parser to parse and process one Processing Instruction (PI)
     105element.  {{PI-HANDLERS}} is an alist {{(PI-TAG . PI-HANDLER)}} where
     106{{PI-TAG}} is the name of the processing instruction and
     107{{PI-HANDLER}} is a procedure {{PORT PI-TAG SEED}}.
     108
     109The handler should read the rest of the PI from {{PORT}}, up to and
     110including the combination "{{?>}}" that terminates the PI. The handler
     111should return a new seed.
     112
     113One of the {{PI-TAG}}s may be the symbol {{*DEFAULT*}}. The
     114corresponding handler will handle PIs that no other handler will. If
     115the {{*DEFAULT*}} {{PI-TAG}} is not specified, {{ssax:make-pi-parser}}
     116will assume the default handler that skips the body of the PI.
     117
     118<macro>(ssax:make-elem-parser new-level-seed finish-elem char-data-handler pi-handlers)</macro>
     119
     120Create a parser to parse and process one element, including its
     121character content or children elements. The parser is typically
     122applied to the root element of a document.
     123
     124The generated parser is a procedure
     125{{START-TAG-HEAD PORT ELEMS ENTITIES NAMESPACES PRESERVE-WS? SEED}}
     126
     127{{new-level-seed}}
     128      procedure ELEM-GI ATTRIBUTES NAMESPACES EXPECTED-CONTENT SEED
     129where {{ELEM-GI}} is a {{RES-NAME}} of the element about to be
     130processed.  This procedure is to generate the seed to be passed to
     131handlers that process the content of the element.
     132
     133{{finish-elem}}
     134      procedure ELEM-GI ATTRIBUTES NAMESPACES PARENT-SEED SEED
     135This procedure is called when parsing of {{ELEM-GI}} is finished.  The
     136{{SEED}} is the result from the last content parser (or from
     137{{new-level-seed}} if the element has the empty content).
     138{{PARENT-SEED}} is the same seed as was passed to {{new-level-seed}}.
     139The procedure is to generate a seed that will be the result of the
     140element parser.
     141
     142{{char-data-handler}}
     143A string handler.
     144
     145{{pi-handlers}}
     146See {{ssax:make-pi-parser}}.
    71147
    72148=== Unicode compatibility
    73149
    74 {{SSAX:XML->SXML}} will convert numeric entities to UTF-8 byte sequences.  It does not depend on the [[utf8]] egg for this.
     150{{ssax:xml->sxml}} will convert numeric entities to UTF-8 byte sequences.  It does not depend on the [[utf8]] egg for this.
    75151
    76152Otherwise, UTF-8 operation is not well tested.
     
    78154=== Changelog
    79155
     156* 5.0.0 Port to Chicken 4; fresh import of the clean upstream CVS tree (which now has downcased names)
    80157* 4.9.8 Convert numeric entities > 255 to UTF-8 [Jim Ursetto]
    81158* 4.9.7 Using ##sys#read/peek-char instead of read/peek-char [Daishi Kato]
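
The {{ssax:xml->sxml}} entry added in this changeset can be illustrated with a minimal sketch that parses a small document from a string port. It assumes CHICKEN 4 (hence {{use}} rather than {{import}}) and the {{ports}} unit for {{call-with-input-string}}. The helper name {{xml-string->sxml}} and the sample document are made up for the example and are not part of the egg.

```scheme
;; Assumption: CHICKEN 4 with the ssax egg installed.
(use ports ssax)

;; Hypothetical helper: parse XML held in a string into SXML.
;; The second argument to ssax:xml->sxml is the namespace-prefix
;; alist; '() means no user prefixes are assigned.
(define (xml-string->sxml str)
  (call-with-input-string str
    (lambda (port)
      (ssax:xml->sxml port '()))))

(define doc
  (xml-string->sxml "<greeting lang=\"en\">hello</greeting>"))

;; doc => (*TOP* (greeting (@ (lang "en")) "hello"))
```

The result is wrapped in a {{*TOP*}} node, and attributes appear in an {{@}} list directly after the element name.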
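
The seed-passing behaviour described for {{ssax:make-parser}} can be sketched with a parser that only counts elements. This is an illustration under stated assumptions, not the egg's own example: it assumes CHICKEN 4, and it assumes the character data handler takes {{STRING1 STRING2 SEED}} as in upstream SSAX, a signature the text above does not spell out. The names {{count-elements}} and {{count-elements-in-string}} are hypothetical.

```scheme
;; Assumption: CHICKEN 4 with the ssax egg installed.
(use ports ssax)

;; Hypothetical example: a parser whose seed is a running element count.
;; Tags are unquoted symbols; omitted tags fall back to their defaults.
(define (count-elements port)
  ((ssax:make-parser
    NEW-LEVEL-SEED
    ;; Called when an element starts: bump the counter.
    (lambda (elem-gi attributes namespaces expected-content seed)
      (+ seed 1))
    FINISH-ELEMENT
    ;; Called when an element ends: pass the accumulated count upward.
    (lambda (elem-gi attributes namespaces parent-seed seed)
      seed)
    CHAR-DATA-HANDLER
    ;; Assumed signature (from upstream SSAX): ignore text, keep the seed.
    (lambda (string1 string2 seed)
      seed))
   port 0))   ; 0 is the initial seed

(define (count-elements-in-string str)
  (call-with-input-string str count-elements))

;; (count-elements-in-string "<a><b/><c>text</c></a>") => 3
```

Because each handler returns the seed it was given (plus one on element start), the count threads through the whole document in the fold-like fashion the text describes.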