Opened 11 years ago

Closed 11 years ago

#1047 closed defect (fixed)

irregex extracts wrong submatches in some situations

Reported by: kristianlm Owned by: sjamaan
Priority: critical Milestone: 4.9.0
Component: core libraries Version: 4.8.x
Keywords: Cc:
Estimated difficulty:

Description

This works:

(irregex-fold
 (irregex
  '(seq (* nonl)
        (or "kbd" (=> eventnum1 (seq "event" (+ num)))) (* nonl)
        (or "kbd" (=> eventnum2 (seq "event" (+ num)))) (* nonl)
        ;; eol
        )
  'backtrack)
 (lambda (i m s)
   (cons (or (irregex-match-substring m 'eventnum1)
             (irregex-match-substring m 'eventnum2))
         s))
 '()
 "kbd event11\nkbd event10\nkbd event9")

But if you remove the 'backtrack option, it misses the last "1" digit.
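To see the difference side by side, the same fold can be run once with 'backtrack and once with the default options. This is only a sketch: collect-events and event-sre are hypothetical names introduced here, and the cond-expand at the top assumes either CHICKEN 4 (where irregex is a core unit) or CHICKEN 5 (where it lives in the (chicken irregex) module).

```scheme
;; Load irregex: a core unit in CHICKEN 4, the (chicken irregex) module in CHICKEN 5.
(cond-expand
  (chicken-5 (import (chicken irregex)))
  (else (use irregex)))

;; The SRE from the report (the commented-out eol is left out).
(define event-sre
  '(seq (* nonl)
        (or "kbd" (=> eventnum1 (seq "event" (+ num)))) (* nonl)
        (or "kbd" (=> eventnum2 (seq "event" (+ num)))) (* nonl)))

;; Hypothetical helper: compile event-sre with the given options and
;; collect the named submatch of every match, most recent match first.
(define (collect-events . options)
  (irregex-fold
   (apply irregex event-sre options)
   (lambda (i m s)
     (cons (or (irregex-match-substring m 'eventnum1)
               (irregex-match-substring m 'eventnum2))
           s))
   '()
   "kbd event11\nkbd event10\nkbd event9"))

(print "backtrack: " (collect-events 'backtrack))
(print "default:   " (collect-events))
```

If the two printed lists differ, the irregex in use is affected by the problem described here.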

Change History (3)

comment:1 Changed 11 years ago by sjamaan

First attempt at simplification:

(irregex-match-substring
 (irregex-match (irregex
                 '(seq (* nonl) (or "x" ($ "a")) "a"))
                "xa")
 1)
=> "a" ;; Should return #f: the "x" alternative matched, so the first submatch never took part in the match
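The simplified case can also be turned into a small check that runs the same pattern through the default matcher and the backtracking one. Again a sketch only: first-submatch and simple-sre are hypothetical names, and the cond-expand assumes CHICKEN 4 or CHICKEN 5.

```scheme
;; Load irregex: a core unit in CHICKEN 4, the (chicken irregex) module in CHICKEN 5.
(cond-expand
  (chicken-5 (import (chicken irregex)))
  (else (use irregex)))

;; The simplified pattern from this comment.
(define simple-sre '(seq (* nonl) (or "x" ($ "a")) "a"))

;; Hypothetical helper: compile simple-sre with the given options,
;; match it against "xa", and return submatch 1 (or #f if it is unset).
(define (first-submatch . options)
  (irregex-match-substring
   (irregex-match (apply irregex simple-sre options) "xa")
   1))

(print "default:   " (first-submatch))
(print "backtrack: " (first-submatch 'backtrack))
```

Going by the expectation stated above, an unaffected irregex should print #f for the default matcher; a printed "a" would indicate the bug.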

comment:2 Changed 11 years ago by sjamaan

It gets weirder: Upstream irregex behaves correctly on this!

comment:3 Changed 11 years ago by sjamaan

Resolution: fixed
Status: new → closed

OK, I just verified, and it looks like Irregex 0.8.3 contains the bug; later versions do not.

I had no idea about the state of CHICKEN 4.8.0 (that's why I didn't bother testing this with master). After digging through the git log and the mailing list archives, it turned out that CHICKEN 4.8.0 contains a copy of Irregex 0.8.3. The latest release is 0.9.2, which has some major performance enhancements.

The good part is, CHICKEN master contains this version, and I'm happy to report it just works there. So that means this ticket can be closed. Woohoo ;)

I guess we really need to gear up for a release Real Soon Now!
