Changeset 31191 in project


Timestamp:
08/03/14 09:57:45 (6 years ago)
Author:
Alex Shinn
Message:

Fixing read-quoted on incomplete and unescaped &.

Location:
release/4/html-parser
Files:
3 edited
4 copied

  • release/4/html-parser/tags/0.5.6/html-parser.scm

    r30833 r31191  
    @@ -1,16 +1,16 @@
    -;;;; html-parser.scm -- SSAX-like tree-folding html parser
    -;;
    -;; Copyright (c) 2003-2013 Alex Shinn.  All rights reserved.
    +;; html-parser.scm -- SSAX-like tree-folding html parser
    +;; Copyright (c) 2003-2014 Alex Shinn.  All rights reserved.
     ;; BSD-style license: http://synthcode.com/license.txt
     
    -;; This is intended as a permissive HTML parser for people who prefer
    -;; the scalable interface described in Oleg Kiselyov's SSAX parser, as
    -;; well as providing simple convenience utilities.  It correctly
    -;; handles all invalid HTML, inserting "virtual" starting and closing
    -;; tags as needed to maintain the proper tree structure needed for the
    -;; foldts down/up logic.  A major goal of this parser is bug-for-bug
    -;; compatibility with the way common web browsers parse HTML.
    -
    -;; Procedure: make-html-parser . keys
    +;;> A permissive HTML parser supporting scalable streaming with a
    +;;> folding interface.  This copies the interface of Oleg Kiselyov's
    +;;> SSAX parser, as well as providing simple convenience utilities.
    +;;> It correctly handles all invalid HTML, inserting "virtual"
    +;;> starting and closing tags as needed to maintain the proper tree
    +;;> structure needed for the foldts down/up logic.  A major goal of
    +;;> this parser is bug-for-bug compatibility with the way common web
    +;;> browsers parse HTML.
    +
    +;;> Procedure: make-html-parser . keys
     
     ;;   Returns a procedure of two arguments, and initial seed and an
    @@ -236,7 +236,7 @@
            ((eqv? #\& (peek-char in))
             (let ((x (read-entity in)))
    -          (lp (cons (if (eq? 'entity (car x))
    -                        (get-entity entities (cdr x))
    -                        (cdr x))
    +          (lp (cons (or (and (eq? 'entity (car x))
    +                             (get-entity entities (cdr x)))
    +                        (string-append "&" (cdr x)))
                         res))))
            (else
  • release/4/html-parser/tags/0.5.6/html-parser.setup

    r30833 r31191  
    @@ -5,4 +5,4 @@
       'html-parser
       '("html-parser.so" "html-parser.import.so")
    -  '((version 0.5.3)
    +  '((version 0.5.6)
         (documentation "html-parser.html")))
  • release/4/html-parser/tags/0.5.6/test.scm

    r30833 r31191  
    @@ -61,4 +61,7 @@
         (html->sxml "<b id=\"&amp;\">&amp;</b>"))
     
    +(test '(*TOP* (foo (@ (bar "&x"))))
    +    (html->sxml "<foo bar=\"&x\" />"))
    +
     (test '(*TOP* (foo (@ (bar))))
         (html->sxml "<foo bar></foo>"))
  • release/4/html-parser/trunk/html-parser.scm

    r30833 r31191  
    @@ -1,16 +1,16 @@
    -;;;; html-parser.scm -- SSAX-like tree-folding html parser
    -;;
    -;; Copyright (c) 2003-2013 Alex Shinn.  All rights reserved.
    +;; html-parser.scm -- SSAX-like tree-folding html parser
    +;; Copyright (c) 2003-2014 Alex Shinn.  All rights reserved.
     ;; BSD-style license: http://synthcode.com/license.txt
     
    -;; This is intended as a permissive HTML parser for people who prefer
    -;; the scalable interface described in Oleg Kiselyov's SSAX parser, as
    -;; well as providing simple convenience utilities.  It correctly
    -;; handles all invalid HTML, inserting "virtual" starting and closing
    -;; tags as needed to maintain the proper tree structure needed for the
    -;; foldts down/up logic.  A major goal of this parser is bug-for-bug
    -;; compatibility with the way common web browsers parse HTML.
    -
    -;; Procedure: make-html-parser . keys
    +;;> A permissive HTML parser supporting scalable streaming with a
    +;;> folding interface.  This copies the interface of Oleg Kiselyov's
    +;;> SSAX parser, as well as providing simple convenience utilities.
    +;;> It correctly handles all invalid HTML, inserting "virtual"
    +;;> starting and closing tags as needed to maintain the proper tree
    +;;> structure needed for the foldts down/up logic.  A major goal of
    +;;> this parser is bug-for-bug compatibility with the way common web
    +;;> browsers parse HTML.
    +
    +;;> Procedure: make-html-parser . keys
     
     ;;   Returns a procedure of two arguments, and initial seed and an
    @@ -236,7 +236,7 @@
            ((eqv? #\& (peek-char in))
             (let ((x (read-entity in)))
    -          (lp (cons (if (eq? 'entity (car x))
    -                        (get-entity entities (cdr x))
    -                        (cdr x))
    +          (lp (cons (or (and (eq? 'entity (car x))
    +                             (get-entity entities (cdr x)))
    +                        (string-append "&" (cdr x)))
                         res))))
            (else
  • release/4/html-parser/trunk/html-parser.setup

    r30833 r31191  
    @@ -5,4 +5,4 @@
       'html-parser
       '("html-parser.so" "html-parser.import.so")
    -  '((version 0.5.3)
    +  '((version 0.5.6)
         (documentation "html-parser.html")))
  • release/4/html-parser/trunk/test.scm

    r30833 r31191  
    @@ -61,4 +61,7 @@
         (html->sxml "<b id=\"&amp;\">&amp;</b>"))
     
    +(test '(*TOP* (foo (@ (bar "&x"))))
    +    (html->sxml "<foo bar=\"&x\" />"))
    +
     (test '(*TOP* (foo (@ (bar))))
         (html->sxml "<foo bar></foo>"))
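
The entity-handling change in html-parser.scm can be tried in isolation with the rough sketch below. It is not the egg's code: `entities`, `get-entity`, `resolve-old`, `resolve-new`, `known`, and `unknown` are hypothetical stand-ins written for illustration. The sketch assumes that `read-entity` returns a pair whose `car` is a tag such as `entity` and whose `cdr` is the name as a string, and that `get-entity` returns `#f` for a name it does not know. Neither assumption is confirmed by this changeset, although the new `&x` test is consistent with both.

```scheme
;; Hypothetical entity table and lookup (assumed contract: string or #f).
(define entities '(("amp" . "&") ("lt" . "<")))

(define (get-entity entities name)
  (cond ((assoc name entities) => cdr)
        (else #f)))

;; Logic as of r30833: an unknown entity name passes the #f from the
;; lookup straight through to the caller.
(define (resolve-old x)
  (if (eq? 'entity (car x))
      (get-entity entities (cdr x))
      (cdr x)))

;; Logic as of r31191: when the lookup fails, fall back to the literal
;; source text by putting the "&" back in front of the name.
(define (resolve-new x)
  (or (and (eq? 'entity (car x))
           (get-entity entities (cdr x)))
      (string-append "&" (cdr x))))

(define known   (cons 'entity "amp"))  ; stands for "&amp;"
(define unknown (cons 'entity "x"))    ; stands for a bare "&x"

(write (resolve-old known))   (newline)  ; "&"
(write (resolve-old unknown)) (newline)  ; #f
(write (resolve-new known))   (newline)  ; "&"
(write (resolve-new unknown)) (newline)  ; "&x"
```

Under these assumptions the old branch would have placed `#f` in the attribute value for `bar="&x"`, while the new branch yields the string `"&x"` that the added test expects.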