#1441 closed defect (wontfix)
Macro keywords unexpectedly match against imported identifiers
Reported by: | sjamaan | Owned by: | sjamaan |
---|---|---|---|
Priority: | major | Milestone: | 5.1 |
Component: | expander | Version: | 4.13.0 |
Keywords: | hygiene, syntax-rules, capture | Cc: | |
Estimated difficulty: | hard |
Description (last modified by )
As reported by Joerg Wittenberger, Al Petrofsky's "unhygienic extraction" of identifiers via syntax-rules
keywords list fails when the identifier was imported from another module.
A simplified example:
(module break (break) (import scheme) (define break "elsewhere")) (module unhygienic-loop (loop) (import scheme) (define-syntax find-identifier (syntax-rules () ((_ ident (x . y) sk fk) (find-identifier ident x sk (find-identifier ident y sk fk))) ((_ ident #(x ...) sk fk) (find-identifier ident (x ...) sk fk)) ((_ ident form sk fk) (let-syntax ((check (syntax-rules (ident) ((_ ident ident* (s-f . s-args) fk_) (s-f ident* . s-args)) ((_ x y sk_ fk_) fk_)))) (check form form sk fk))))) (define-syntax loop (syntax-rules () ((_ . exps) (let-syntax ((l (syntax-rules () ((_ ident exps_) (call-with-current-continuation (lambda (ident) (let f () (begin 'prevent-empty-begin . exps_) (f)))))))) (find-identifier break exps (l exps) (l bogus exps))))))) (module foo () (import scheme break unhygienic-loop) ;; Uncomment these and comment the above to see the differing behaviour: ;; (import scheme unhygienic-loop) ;; This definition is not required ;; (define break "local") (display (loop (break 'foo))))
This appears to be incorrect, at least when compared to Chibi on an equivalent set of R7RS modules, it behaves identically in the case where the identifier is defined locally, undefined, or defined by another module.
In CHICKEN 4.12 and earlier, this code behaves the same regardless of where the identifier came from.
Change History (16)
comment:1 Changed 7 years ago by
Description: | modified (diff) |
---|
comment:2 Changed 7 years ago by
Description: | modified (diff) |
---|
comment:3 Changed 7 years ago by
Milestone: | someday → 4.14.0 |
---|
comment:4 Changed 7 years ago by
comment:5 Changed 7 years ago by
Simplified case:
(module undef (undefined) (import scheme) (define undefined 1)) (module expando (do-it) (import scheme) (define-syntax do-it (syntax-rules () ((_) (let-syntax ((final (syntax-rules () ((_ ?x) ?x)))) (final undefined)))))) (import undef expando) ;; Should fail, not print 1, because the let-syntax expansion should rename introduced ;; identifiers from its SE (but input should be looked up here) (print (do-it))
comment:6 Changed 7 years ago by
OK, this simplified case is caused by ##sys#alias-global-hook
, so perhaps we should ignore that for now (see #1131).
comment:7 Changed 7 years ago by
Simplified case that triggers a bug in both versions :)
(module match-it (match-foo) (import scheme) (define-syntax match-foo (syntax-rules () ((_ ?input) (let-syntax ((match-identifier (syntax-rules (foo) ((_ foo foo) #t) ((_ foo ?whatever) 'mismatch) ((_ ?whatever ?whatever) #f)))) (match-identifier foo ?input)))))) (module myfoo (foo) (import scheme) (define foo 1)) (import match-it) (print (match-foo foo)) (import myfoo) (print (match-foo foo))
Should print #t in both cases
comment:8 Changed 7 years ago by
Very interesting, in Scheme48 when I import a match-it module and try to expand (match-foo foo), it complains that foo is undefined. So it doesn't match there at all!
comment:9 Changed 7 years ago by
Chibi prints #t in both cases, but Gauche complains after importing:
*** ERROR: pattern variable #<identifier match-it-chibi##<identifier match-it-chibi#?whatever.683ca40>.6840e20> appears more than once in the macro definition of match-identifier: (#<identifier match-it-chibi##<identifier match-it-chibi#_.683ca60>.6840e40> #<identifier match-it-chibi##<identifier match-it-chibi#?whatever.683ca40>.6840e20> #<identifier match-it-chibi##<identifier match-it-chibi#?whatever.683ca40>.6840e20>) While compiling: (#<identifier match-it-chibi##<identifier match-it-chibi#syntax-rules.683caa0>.6840e80 ... While compiling "(standard input)" at line 1: (print (match-foo foo)) Stack Trace: _______________________________________ 0 (eval expr env) at "/usr/share/gauche-0.9/0.9.5/lib/gauche/interactive.scm":282
Files:
;; match-it-chibi.sld (define-library (match-it-chibi) (export match-foo) (import (scheme base)) (begin (define-syntax match-foo (syntax-rules () ((_ ?input) (let-syntax ((match-identifier (syntax-rules (foo) ((_ foo foo) #t) ((_ foo ?whatever) 'mismatch) ((_ ?whatever ?whatever) #f)))) (match-identifier foo ?input)))))))
;; myfoo-chibi.sld (define-library (myfoo-chibi) (export foo) (import (scheme base)) (begin (define foo 2)))
In Scheme48:
;; match-it-s48 (define-syntax match-foo (syntax-rules () ((_ ?input) (let-syntax ((match-identifier (syntax-rules (foo) ((_ foo foo) #t) ((_ foo ?whatever) 'mismatch) ((_ ?whatever ?whatever) #f)))) (match-identifier foo ?input)))))
;;; myfoo-s48.scm (define foo 2)
;;; packages.scm (define-structure match-it-s48 (export match-foo) (open scheme) (files match-it-s48)) (define-structure myfoo-s48 (export foo) (open scheme) (files myfoo-s48))
So all in all, so far I'm not so convinced anymore that this really is a bug, except for the fact that there is something of a "regression" when compared to how it used to work in CHICKEN.
comment:10 Changed 7 years ago by
Racket also complains like Gauche does:
> (require "match-it-racket.scm") > (match-foo foo) ; match-it-racket.scm:12:35: syntax-rules: variable used twice in pattern ; at: ?whatever ; in: (_ ?whatever ?whatever)
file:
#lang racket (provide match-foo) (define-syntax match-foo (syntax-rules () ((_ ?input) (let-syntax ((match-identifier (syntax-rules (foo) ((_ foo foo) #t) ((_ foo ?whatever) 'mismatch) ((_ ?whatever ?whatever) #f)))) (match-identifier foo ?input)))))
comment:11 Changed 7 years ago by
Original bug report case ported to Racket:
;; unhygienic-loop-racket.scm #lang racket (provide loop) (define-syntax find-identifier (syntax-rules () ((_ ident (x . y) sk fk) (find-identifier ident x sk (find-identifier ident y sk fk))) ((_ ident #(x ...) sk fk) (find-identifier ident (x ...) sk fk)) ((_ ident form sk fk) (let-syntax ((check (syntax-rules (ident) ((_ ident ident* (s-f . s-args) fk_) (s-f ident* . s-args)) ((_ x y sk_ fk_) fk_)))) (check form form sk fk))))) (define-syntax loop (syntax-rules () ((_ . exps) (let-syntax ((l (syntax-rules () ((_ ident exps_) (call-with-current-continuation (lambda (ident) (let f () (begin 'prevent-empty-begin . exps_) (f)))))))) (find-identifier break exps (l exps) (l bogus exps))))))
;; myfoo-racket.scm #lang racket (provide foo) (define foo 2)
Running it shows a difference, but not 100% the same as CIHCKEN 4.13.0 depending on the place where break
was defined:
Welcome to Racket v6.7. > (require "unhygienic-loop-racket.scm") > (loop (break 'foo)) 'foo > (require "break-racket.scm") > (loop (break 'foo)) ; application: not a procedure; ; expected a procedure that can be applied to arguments ; given: "elsewhere" ; [,bt for context] > (define break 2) > (loop (break 'foo)) 'foo
In Scheme48, it behaves differently:
;; unhygienic-loop-s48.scm (define-syntax find-identifier (syntax-rules () ((_ ident (x . y) sk fk) (find-identifier ident x sk (find-identifier ident y sk fk))) ((_ ident #(x ...) sk fk) (find-identifier ident (x ...) sk fk)) ((_ ident form sk fk) (let-syntax ((check (syntax-rules (ident) ((_ ident ident* (s-f . s-args) fk_) (s-f ident* . s-args)) ((_ x y sk_ fk_) fk_)))) (check form form sk fk))))) (define-syntax loop (syntax-rules () ((_ . exps) (let-syntax ((l (syntax-rules () ((_ ident exps_) (call-with-current-continuation (lambda (ident) (let f () (begin 'prevent-empty-begin . exps_) (f)))))))) (find-identifier break exps (l exps) (l bogus exps))))))
;;; break-s48 (define break "elsewhere")
;;; packages.scm (define-structure unhygienic-loop-s48 (export loop) (open scheme) (files unhygienic-loop-s48)) (define-structure break-s48 (export break) (open scheme) (files break-s48))
When running it:
Welcome to Scheme 48 1.9 (made by binet on 2017-03-16) See http://s48.org/ for more information. Please report bugs to scheme-48-bugs@s48.org. Get more information at http://www.s48.org/. Type ,? (comma question-mark) for help. > ,config ,load packages.scm > ,open unhygienic-loop-s48 > (loop (break 'foo)) warning: invalid variable reference [get-location] loop #{package 284 user} assertion-violation: undefined variable [global] break user 1> ,open break-s48 Newly accessible in user: (break) 1> (loop (break 'foo)) warning: invalid variable reference [get-location] loop #{package 284 user} assertion-violation: attempt to call a non-procedure [call] ("elsewhere" '(foo)) 1> (define break 2) ; no values returned 1> (loop (break 'foo)) warning: invalid variable reference [get-location] loop #{package 284 user} assertion-violation: attempt to call a non-procedure [call] (2 '(foo))
Whereas in Chibi:
;;; unhygienic-loop-chibi.sld (define-library (unhygienic-loop-chibi) (export loop) (import (scheme base)) (begin (define-syntax find-identifier (syntax-rules () ((_ ident (x . y) sk fk) (find-identifier ident x sk (find-identifier ident y sk fk))) ((_ ident #(x ...) sk fk) (find-identifier ident (x ...) sk fk)) ((_ ident form sk fk) (let-syntax ((check (syntax-rules (ident) ((_ ident ident* (s-f . s-args) fk_) (s-f ident* . s-args)) ((_ x y sk_ fk_) fk_)))) (check form form sk fk))))) (define-syntax loop (syntax-rules () ((_ . exps) (let-syntax ((l (syntax-rules () ((_ ident exps_) (call-with-current-continuation (lambda (ident) (let f () (begin 'prevent-empty-begin . exps_) (f)))))))) (find-identifier break exps (l exps) (l bogus exps))))))))
;;; break-chibi.sld (define-library (break-chibi) (export break) (import (scheme base)) (begin (define break "elsewhere")))
Running it gives me:
> (import (unhygienic-loop-chibi)) > (loop (break 'foo)) foo > (import (break-chibi)) > (loop (break 'foo)) foo > (define break 1) > (loop (break 'foo)) foo
In Gauche, I get the same results as in Chibi with the R7RS library definitions.
Perhaps others can check other r7rs implementations?
My preliminary conclusion is that this is not a bug, just a very big "undefined behaviour"-type hole in the specification.
comment:12 Changed 7 years ago by
Correction to avoid confusion: The "identifier used twice" error in match-foo is caused by the ?whatever
appearing twice.
The correct definition would be:
(module match-it (match-foo) (import scheme) (define-syntax match-foo (syntax-rules () ((_ ?input) (let-syntax ((match-identifier (syntax-rules (foo) ((_ foo foo) #t) ((_ foo ?whatever) 'mismatch) ((_ ?whatever ?whatever2) #f)))) (match-identifier foo ?input)))))) (module myfoo (foo) (import scheme) (define foo 1)) (import match-it) (print (match-foo foo)) (import myfoo) (print (match-foo foo))
This will indeed print #t twice in both Chibi and Gauche.
Nevertheless, the fixed definition in Scheme48 prints:
Welcome to Scheme 48 1.9 (made by binet on 2017-03-16) See http://s48.org/ for more information. Please report bugs to scheme-48-bugs@s48.org. Get more information at http://www.s48.org/. Type ,? (comma question-mark) for help. > ,config ,load packages.scm > ,open match-it-s48 > (match-foo foo) warning: invalid variable reference [get-location] match-foo #{package 284 user} assertion-violation: undefined variable [global] foo user 1> ,open myfoo-s48 Newly accessible in user: (foo) 1> (match-foo foo) warning: invalid variable reference [get-location] match-foo #{package 284 user} assertion-violation: undefined variable [global] match-foo match-it-s48
And Racket:
Welcome to Racket v6.7. > (require "./match-it-racket.scm") > (match-foo foo) #t > (require "./myfoo-racket.scm") > (match-foo foo) 'mismatch > (define foo 2) > (match-foo foo) #t
comment:13 Changed 7 years ago by
Here's a simplified version without a module wrapping the define-syntax
that still fails:
#;1> (define-syntax match-foo (syntax-rules () ((_ ?input) (let-syntax ((match-identifier (syntax-rules (foo) ((_ foo foo) #t) ((_ foo ?whatever) 'mismatch) ((_ ?whatever ?whatever2) #f)))) (match-identifier foo ?input))))) #;2> (module myfoo (foo) (import scheme) (define foo 1)) #;3> (print (match-foo foo)) #t #;4> (import myfoo) #;5> (print (match-foo foo)) #f #;6>
comment:14 Changed 7 years ago by
FYI, as discussed on the Chibi behavior is non-standard permissive matching, which can be disabled at compile time with -DSEXP_USE_STRICT_TOPLEVEL_BINDINGS=1
. This was intended to be more convenient, and matched the behavior of some other implementations at the time (I believe Chicken and Gauche), and because I hadn't seen a real-life example where it caused a bug.
This causes confusion, however, when comparisons like this come up, and since Chibi is the R7RS reference implementation I will reverse the default behavior.
comment:15 Changed 6 years ago by
Resolution: | → wontfix |
---|---|
Status: | new → closed |
This is too hairy to fix, and completely unspecified anyway.
If someone manages to fix it (and can explain why the fix works!), we'll accept a patch, but we won't invest time in fixing it.
An interesting testcase where I believe the new version is behaving correctly but 4.12.0 behaving differently:
I would expect both to print
#t
like in 4.13.0