Opened 14 years ago
Closed 14 years ago
#439 closed change request (fixed)
Quasiquote simplification and improved syntax checks
Reported by: | sjamaan | Owned by: | felix winkelmann |
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | expander | Version: | 4.6.x |
Keywords: | expander, quasiquote, syntax | Cc: | |
Estimated difficulty: |
Description
While studying the expander, I stumbled upon a problem in the quasiquote code. It does not check for proper usage of unquote/unquote-splicing.
Right now, it silently ignores extra arguments to unquote and unquote-splicing:
(let ((a 1)) (quasiquote (unquote a b))) => 1
Also, there's an R5RS incompatibility:
(define x (list 1 2)) (quasiquote (quasiquote (unquote-splicing (unquote x)))) => (quasiquote (unquote-splicing (unquote x)))
However, this should return (quasiquote (unquote-splicing (1 2)))
because
The nesting level increases by one inside each successive quasiquotation, and decreases by one inside each unquotation
-- R5RS 4.2.6, second paragraph
The first quasiquote opens, the second increases nesting to one, the unquote-splicing decreases it to zero and the unquote is at level 0 so should result in the x getting unquoted.
The attached patch fixes these problems.
I've also taken the opportunity to simplify the code a bit, reducing the total "quasiquote"-code by 8 lines and making it more readable/understandable. Because R5RS says the behavior of unquote and unquote-splicing is undefined for cases where they occur in positions other than the car of a 2-element list, I decided it's not worth the extra checking whether the cdr of a nested quasiquote call is actually a list or not. It complicated the code and make it a little inconsistent.
The patch also adds tests and fixes the broken (f ..)
test syntax; it originally always succeeded, because it threw an error when no errors were thrown, causing the exception handler to always be invoked! Luckily, none of the existing tests are failing after this fix.
Attachments (1)
Change History (8)
Changed 14 years ago by
Attachment: | quasiquote-fix.patch added |
---|
comment:1 Changed 14 years ago by
comment:2 follow-up: 3 Changed 14 years ago by
This is the CR :)
If you want you can already apply the second hunk in the patch (the part which fixes the f
macro in syntax-tests.scm).
comment:3 Changed 14 years ago by
Milestone: | 4.7.0 |
---|---|
Type: | defect → change request |
Replying to sjamaan:
This is the CR :)
If you want you can already apply the second hunk in the patch (the part which fixes the
f
macro in syntax-tests.scm).
Ok, I have done so.
comment:4 Changed 14 years ago by
I add here the possible options as given by Peter in his mail to chicken-hackers:
Option 1: As (quasiquote a b c) throws an error, so could (quasiquote (quasiquote a b c)) or any even deeper-nested quasiquotes, as well as unquotes. Having multiple arguments to quasiquote is not allowed on level 0, so why should it be allowed on higher levels? This is the pedantically correct option and might ease the debugging of code that contains quasiquote calls because this code would be unportable. This could break existing code. This is probably the cleanest option from a language purity standpoint. It would also make the code even a bit simpler than my patch does, I think. We could put a call to ##sys#check-syntax after it's determined that the car of a pair contains one of the three words quasiquote, unquote or unquote-splicing. Then we have an if which checks the nesting level and either calls WALK recursively or does the unquoting. The disadvantage of this option compared to option 2 is that it will be a little less practical when you want to use the literal symbols quasiquote, unquote or unquote-splicing in deeper-nested quotations. With option 2, (quasiquote (x y (quasiquote a b c))) will "just work". If this option would be implemented, this would need to be written as (quasiquote (x y (unquote-splicing (quote (quasiquote a b c))))) or as (quasiquote (x y (unquote (quote quasiquote)) a b c)) otoh, when written as `(x y ,'quasiquote a b c) that's not too bad. However, something like (quasiquote (x quasiquote y z)) would also (perhaps unexpectedly) give an error as that's identical to (quasiquote (x . (quasiquote y z))) This difference cannot be detected because quasiquote, unlike normal Scheme code, allows improper lists to occur anywhere. In normal Scheme (x . (y z)) is identical to (x y z) and the y is never evaluated as a macro. Option 2 (this is what my current patch does): Only throw errors in case of invalid list arguments at level 0 and increase the level with 1 whenever quasiquote is found at the car of a pair. It will allow you to write quasiquotes and unquotes at nesting levels above 1 without any strange trickery, and will throw a nice error message when you try to use them at level 0 with wrong arity. The example above, (quasiquote (x quasiquote y z)), will just work and not complain. However, (quasiquote (x quasiquote y z (unquote a))) will not unquote the a, because this is identical to (quasiquote (x . (quasiquote y z (unquote a)))), so (unquote a) occurs at nesting level 1 and simply decreases the level to 0 instead of evaluating the "a". That's the price we'd pay for having a loose and non-strictly pedantic system. Option 3: Only fix R5RS behaviour and throw an error when an invalid number of arguments are found at level 0. This will undo the part of my patch that tries to make the behaviour more consistent. I am unsure right now whether this is possible at all or if the inconsistencies are what actually cause the r5rs breakage. The main difference is that the code in quasiquote stays the same; it will distinguish between quasiquote at the car of a pair which has a pair as cdr and quasiquote at the car of a pair which has a non-pair as cdr. Example: (quasiquote (quasiquote . #((unquote a)))) will result in trying to unquote a, whereas in options 1 and 2 it will stay quoted because the second quasiquote increases the nesting level. I can't think of any other cases than vectors where this differs though (unless extra read syntax is added later). Option 4: Only add an error message when wrong number of arguments are found at level 0. This would keep the R5RS-incompatibilities. Option 5: Do not change anything. It is not a bug, r5rs does not specify what happens so silently dropping extra arguments to unquote is acceptable. Option 6: Do not change the behaviour of unquote with multiple args, but do fix the behaviour of nested quasiquotes to pass only the new r5rs-compliance tests. This is option 3, but will leave the behaviour that (quasiquote (quasiquote a b c)) => (quasiquote a) Option 7: Do it the R6RS way. We'd have to flesh out what exactly happens when you do (quasiquote (quasiquote a b c)) or similar, and what happens when you pass multiple arguments to unquote in a non-list context.
(I find it very hard to think of all the possible implications, but option
2 appears to me the easiest and least risky way)
comment:5 Changed 14 years ago by
comment:7 Changed 14 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
Sorry, I had kind of forgotten about it. Thanks for reminding me, I've implemented it and pushed as changeset 3d8167f.
Peter, is this independent of the CR?