Opened 4 months ago

Closed 4 weeks ago

#1441 closed defect (wontfix)

Macro keywords unexpectedly match against imported identifiers

Reported by: sjamaan Owned by: sjamaan
Priority: major Milestone: 4.14.0
Component: expander Version: 4.13.0
Keywords: hygiene, syntax-rules, capture Cc:
Estimated difficulty: hard

Description (last modified by sjamaan)

As reported by Joerg Wittenberger, Al Petrofsky's "unhygienic extraction" of identifiers via syntax-rules keywords list fails when the identifier was imported from another module.

A simplified example:

(module break (break)
  (import scheme)
  (define break "elsewhere"))


(module unhygienic-loop (loop)
  (import scheme)

(define-syntax find-identifier
  (syntax-rules ()
    ((_ ident (x . y) sk fk)
     (find-identifier ident x sk (find-identifier ident y sk fk)))
    ((_ ident #(x ...) sk fk)
     (find-identifier ident (x ...) sk fk))
    ((_ ident form sk fk)
     (let-syntax
         ((check
           (syntax-rules (ident)
             ((_ ident ident* (s-f . s-args) fk_) (s-f ident* . s-args))
             ((_ x y sk_ fk_) fk_))))
       (check form form sk fk)))))


(define-syntax loop
  (syntax-rules ()
    ((_ . exps)
     (let-syntax
         ((l (syntax-rules ()
               ((_ ident exps_)
                (call-with-current-continuation
                 (lambda (ident)
                   (let f ()
                     (begin 'prevent-empty-begin . exps_)
                     (f))))))))
       (find-identifier break exps (l exps) (l bogus exps)))))))


(module foo ()

(import scheme break unhygienic-loop)

;; Uncomment these and comment the above to see the differing behaviour:
;; (import scheme unhygienic-loop)

;; This definition is not required
;; (define break "local")

(display (loop (break 'foo))))

This appears to be incorrect, at least when compared to Chibi on an equivalent set of R7RS modules, it behaves identically in the case where the identifier is defined locally, undefined, or defined by another module.

In CHICKEN 4.12 and earlier, this code behaves the same regardless of where the identifier came from.

Change History (15)

comment:1 Changed 4 months ago by sjamaan

  • Description modified (diff)

comment:2 Changed 4 months ago by sjamaan

  • Description modified (diff)

comment:3 Changed 3 months ago by sjamaan

  • Milestone changed from someday to 4.14.0

comment:4 Changed 3 months ago by sjamaan

An interesting testcase where I believe the new version is behaving correctly but 4.12.0 behaving differently:

(module match-it (match-foo)
 (import scheme)
    
 (define-syntax match-foo
   (syntax-rules ()
     ((_)
      (let-syntax ((match-identifier
                    (syntax-rules (foo)
                      ((_ foo) #t)
                      ((_ ?whatever) #f))))
        (match-identifier foo))))))

(module myfoo (foo)
  (import scheme)
  (define foo 1))

(import match-it)
(print (match-foo))

(import myfoo)
(print (match-foo))

I would expect both to print #t like in 4.13.0

comment:5 Changed 3 months ago by sjamaan

Simplified case:

(module undef (undefined)
  (import scheme)
  (define undefined 1))

(module expando (do-it)
  (import scheme)
  (define-syntax do-it
    (syntax-rules ()
      ((_)
       (let-syntax ((final
                     (syntax-rules ()
                       ((_ ?x) ?x))))
         (final undefined))))))

(import undef expando)
;; Should fail, not print 1, because the let-syntax expansion should rename introduced 
;; identifiers from its SE (but input should be looked up here)
(print (do-it))

comment:6 Changed 3 months ago by sjamaan

OK, this simplified case is caused by ##sys#alias-global-hook, so perhaps we should ignore that for now (see #1131).

comment:7 Changed 3 months ago by sjamaan

Simplified case that triggers a bug in both versions :)

(module match-it (match-foo)
 (import scheme)
    
 (define-syntax match-foo
   (syntax-rules ()
     ((_ ?input)
      (let-syntax ((match-identifier
                    (syntax-rules (foo)
                      ((_ foo foo) #t)
                      ((_ foo ?whatever) 'mismatch)
                      ((_ ?whatever ?whatever) #f))))
        (match-identifier foo ?input))))))

(module myfoo (foo)
  (import scheme)
  (define foo 1))

(import match-it)
(print (match-foo foo))

(import myfoo)
(print (match-foo foo))

Should print #t in both cases

comment:8 Changed 3 months ago by sjamaan

Very interesting, in Scheme48 when I import a match-it module and try to expand (match-foo foo), it complains that foo is undefined. So it doesn't match there at all!

comment:9 Changed 3 months ago by sjamaan

Chibi prints #t in both cases, but Gauche complains after importing:

*** ERROR: pattern variable #<identifier match-it-chibi##<identifier match-it-chibi#?whatever.683ca40>.6840e20> appears more than once in the macro definition of match-identifier: (#<identifier match-it-chibi##<identifier match-it-chibi#_.683ca60>.6840e40> #<identifier match-it-chibi##<identifier match-it-chibi#?whatever.683ca40>.6840e20> #<identifier match-it-chibi##<identifier match-it-chibi#?whatever.683ca40>.6840e20>)
    While compiling: (#<identifier match-it-chibi##<identifier match-it-chibi#syntax-rules.683caa0>.6840e80 ...
    While compiling "(standard input)" at line 1: (print (match-foo foo))
Stack Trace:
_______________________________________
  0  (eval expr env)
        at "/usr/share/gauche-0.9/0.9.5/lib/gauche/interactive.scm":282

Files:

;; match-it-chibi.sld
(define-library (match-it-chibi)
  (export match-foo)
  (import (scheme base))
  (begin
    (define-syntax match-foo
      (syntax-rules ()
        ((_ ?input)
         (let-syntax ((match-identifier
                       (syntax-rules (foo)
                         ((_ foo foo) #t)
                         ((_ foo ?whatever) 'mismatch)
                         ((_ ?whatever ?whatever) #f))))
           (match-identifier foo ?input)))))))
;; myfoo-chibi.sld
(define-library (myfoo-chibi)
  (export foo)
  (import (scheme base))
  (begin (define foo 2)))

In Scheme48:

;; match-it-s48
(define-syntax match-foo
  (syntax-rules ()
    ((_ ?input)
     (let-syntax ((match-identifier
                   (syntax-rules (foo)
                     ((_ foo foo) #t)
                     ((_ foo ?whatever) 'mismatch)
                     ((_ ?whatever ?whatever) #f))))
       (match-identifier foo ?input)))))
;;; myfoo-s48.scm
(define foo 2)
;;; packages.scm
(define-structure match-it-s48 (export match-foo)
   (open scheme)
   (files match-it-s48))

(define-structure myfoo-s48 (export foo)
   (open scheme)
   (files myfoo-s48))

So all in all, so far I'm not so convinced anymore that this really is a bug, except for the fact that there is something of a "regression" when compared to how it used to work in CHICKEN.

comment:10 Changed 3 months ago by sjamaan

Racket also complains like Gauche does:

> (require "match-it-racket.scm")
> (match-foo foo)
; match-it-racket.scm:12:35: syntax-rules: variable used twice in pattern
;   at: ?whatever
;   in: (_ ?whatever ?whatever)

file:

#lang racket

(provide match-foo)

(define-syntax match-foo
  (syntax-rules ()
    ((_ ?input)
     (let-syntax ((match-identifier
                   (syntax-rules (foo)
                     ((_ foo foo) #t)
                     ((_ foo ?whatever) 'mismatch)
                     ((_ ?whatever ?whatever) #f))))
       (match-identifier foo ?input)))))

comment:11 Changed 3 months ago by sjamaan

Original bug report case ported to Racket:

;; unhygienic-loop-racket.scm
#lang racket

(provide loop)

(define-syntax find-identifier
  (syntax-rules ()
    ((_ ident (x . y) sk fk)
     (find-identifier ident x sk (find-identifier ident y sk fk)))
    ((_ ident #(x ...) sk fk)
     (find-identifier ident (x ...) sk fk))
    ((_ ident form sk fk)
     (let-syntax
         ((check
           (syntax-rules (ident)
             ((_ ident ident* (s-f . s-args) fk_) (s-f ident* . s-args))
             ((_ x y sk_ fk_) fk_))))
       (check form form sk fk)))))


(define-syntax loop
  (syntax-rules ()
    ((_ . exps)
     (let-syntax
         ((l (syntax-rules ()
               ((_ ident exps_)
                (call-with-current-continuation
                 (lambda (ident)
                   (let f ()
                     (begin 'prevent-empty-begin . exps_)
                     (f))))))))
       (find-identifier break exps (l exps) (l bogus exps))))))
;; myfoo-racket.scm
#lang racket

(provide foo)

(define foo 2)

Running it shows a difference, but not 100% the same as CIHCKEN 4.13.0 depending on the place where break was defined:

Welcome to Racket v6.7.
> (require "unhygienic-loop-racket.scm")
> (loop (break 'foo))
'foo
> (require "break-racket.scm")
> (loop (break 'foo))
; application: not a procedure;
;  expected a procedure that can be applied to arguments
;   given: "elsewhere"
; [,bt for context]
> (define break 2)
> (loop (break 'foo))
'foo

In Scheme48, it behaves differently:

;; unhygienic-loop-s48.scm
(define-syntax find-identifier
  (syntax-rules ()
    ((_ ident (x . y) sk fk)
     (find-identifier ident x sk (find-identifier ident y sk fk)))
    ((_ ident #(x ...) sk fk)
     (find-identifier ident (x ...) sk fk))
    ((_ ident form sk fk)
     (let-syntax
         ((check
           (syntax-rules (ident)
             ((_ ident ident* (s-f . s-args) fk_) (s-f ident* . s-args))
             ((_ x y sk_ fk_) fk_))))
       (check form form sk fk)))))


(define-syntax loop
  (syntax-rules ()
    ((_ . exps)
     (let-syntax
         ((l (syntax-rules ()
               ((_ ident exps_)
                (call-with-current-continuation
                 (lambda (ident)
                   (let f ()
                     (begin 'prevent-empty-begin . exps_)
                     (f))))))))
       (find-identifier break exps (l exps) (l bogus exps))))))
;;; break-s48
(define break "elsewhere")
;;; packages.scm
(define-structure unhygienic-loop-s48 (export loop)
   (open scheme)
   (files unhygienic-loop-s48))

(define-structure break-s48 (export break)
   (open scheme)
   (files break-s48))

When running it:

Welcome to Scheme 48 1.9 (made by binet on 2017-03-16)
See http://s48.org/ for more information.
Please report bugs to scheme-48-bugs@s48.org.
Get more information at http://www.s48.org/.
Type ,? (comma question-mark) for help.
> ,config ,load packages.scm
> ,open unhygienic-loop-s48
> (loop (break 'foo))

warning: invalid variable reference [get-location]
         loop
         #{package 284 user}

assertion-violation: undefined variable [global]
                     break
                     user
1> ,open break-s48
Newly accessible in user: (break)
1> (loop (break 'foo))

warning: invalid variable reference [get-location]
         loop
         #{package 284 user}

assertion-violation: attempt to call a non-procedure [call]
                     ("elsewhere" '(foo))
1> (define break 2)
; no values returned
1> (loop (break 'foo))

warning: invalid variable reference [get-location]
         loop
         #{package 284 user}

assertion-violation: attempt to call a non-procedure [call]
                     (2 '(foo))

Whereas in Chibi:

;;; unhygienic-loop-chibi.sld
(define-library (unhygienic-loop-chibi)
  (export loop)
  (import (scheme base))
  (begin
    (define-syntax find-identifier
      (syntax-rules ()
        ((_ ident (x . y) sk fk)
         (find-identifier ident x sk (find-identifier ident y sk fk)))
        ((_ ident #(x ...) sk fk)
         (find-identifier ident (x ...) sk fk))
        ((_ ident form sk fk)
         (let-syntax
             ((check
               (syntax-rules (ident)
                 ((_ ident ident* (s-f . s-args) fk_) (s-f ident* . s-args))
                 ((_ x y sk_ fk_) fk_))))
           (check form form sk fk)))))

    (define-syntax loop
      (syntax-rules ()
        ((_ . exps)
         (let-syntax
             ((l (syntax-rules ()
                   ((_ ident exps_)
                    (call-with-current-continuation
                     (lambda (ident)
                       (let f ()
                         (begin 'prevent-empty-begin . exps_)
                         (f))))))))
           (find-identifier break exps (l exps) (l bogus exps))))))))
;;; break-chibi.sld
(define-library (break-chibi)
  (export break)
  (import (scheme base))
  (begin (define break "elsewhere")))

Running it gives me:

> (import (unhygienic-loop-chibi))
> (loop (break 'foo))
foo
> (import (break-chibi))
> (loop (break 'foo))
foo
> (define break 1)
> (loop (break 'foo))
foo

In Gauche, I get the same results as in Chibi with the R7RS library definitions.

Perhaps others can check other r7rs implementations?

My preliminary conclusion is that this is not a bug, just a very big "undefined behaviour"-type hole in the specification.

comment:12 Changed 3 months ago by sjamaan

Correction to avoid confusion: The "identifier used twice" error in match-foo is caused by the ?whatever appearing twice.

The correct definition would be:

(module match-it (match-foo)
 (import scheme)
    
 (define-syntax match-foo
   (syntax-rules ()
     ((_ ?input)
      (let-syntax ((match-identifier
                    (syntax-rules (foo)
                      ((_ foo foo) #t)
                      ((_ foo ?whatever) 'mismatch)
                      ((_ ?whatever ?whatever2) #f))))
        (match-identifier foo ?input))))))

(module myfoo (foo)
  (import scheme)
  (define foo 1))

(import match-it)
(print (match-foo foo))

(import myfoo)
(print (match-foo foo))

This will indeed print #t twice in both Chibi and Gauche.

Nevertheless, the fixed definition in Scheme48 prints:

Welcome to Scheme 48 1.9 (made by binet on 2017-03-16)
See http://s48.org/ for more information.
Please report bugs to scheme-48-bugs@s48.org.
Get more information at http://www.s48.org/.
Type ,? (comma question-mark) for help.
> ,config ,load packages.scm
> ,open match-it-s48
> (match-foo foo)

warning: invalid variable reference [get-location]
         match-foo
         #{package 284 user}

assertion-violation: undefined variable [global]
                     foo
                     user
1> ,open myfoo-s48
Newly accessible in user: (foo)
1> (match-foo foo)

warning: invalid variable reference [get-location]
         match-foo
         #{package 284 user}

assertion-violation: undefined variable [global]
                     match-foo
                     match-it-s48

And Racket:

Welcome to Racket v6.7.
> (require "./match-it-racket.scm")
> (match-foo foo)
#t
> (require "./myfoo-racket.scm")
> (match-foo foo)
'mismatch
> (define foo 2)
> (match-foo foo)
#t

comment:13 Changed 3 months ago by johnwcowan

Here's a simplified version without a module wrapping the define-syntax that still fails:

#;1> (define-syntax match-foo
   (syntax-rules ()
     ((_ ?input)
      (let-syntax ((match-identifier
                    (syntax-rules (foo)
                      ((_ foo foo) #t)
                      ((_ foo ?whatever) 'mismatch)
                      ((_ ?whatever ?whatever2) #f))))
        (match-identifier foo ?input)))))
#;2> (module myfoo (foo)
  (import scheme)
  (define foo 1))
#;3> (print (match-foo foo))
#t
#;4> (import myfoo)
#;5> (print (match-foo foo))
#f
#;6>

comment:14 Changed 3 months ago by ashinn

FYI, as discussed on the Chibi behavior is non-standard permissive matching, which can be disabled at compile time with -DSEXP_USE_STRICT_TOPLEVEL_BINDINGS=1. This was intended to be more convenient, and matched the behavior of some other implementations at the time (I believe Chicken and Gauche), and because I hadn't seen a real-life example where it caused a bug.

This causes confusion, however, when comparisons like this come up, and since Chibi is the R7RS reference implementation I will reverse the default behavior.

comment:15 Changed 4 weeks ago by sjamaan

  • Resolution set to wontfix
  • Status changed from new to closed

This is too hairy to fix, and completely unspecified anyway.

If someone manages to fix it (and can explain why the fix works!), we'll accept a patch, but we won't invest time in fixing it.

Note: See TracTickets for help on using tickets.