Opened 13 years ago

Closed 13 years ago

#439 closed change request (fixed)

Quasiquote simplification and improved syntax checks

Reported by: sjamaan Owned by: felix winkelmann
Priority: minor Milestone:
Component: expander Version: 4.6.x
Keywords: expander, quasiquote, syntax Cc:
Estimated difficulty:

Description

While studying the expander, I stumbled upon a problem in the quasiquote code. It does not check for proper usage of unquote/unquote-splicing.

Right now, it silently ignores extra arguments to unquote and unquote-splicing:

(let ((a 1)) (quasiquote (unquote a b))) => 1

Also, there's an R5RS incompatibility:

(define x (list 1 2))
(quasiquote (quasiquote (unquote-splicing (unquote x))))
=> (quasiquote (unquote-splicing (unquote x)))

However, this should return (quasiquote (unquote-splicing (1 2))) because

The nesting level increases by one inside each successive quasiquotation, and decreases by one inside each unquotation

-- R5RS 4.2.6, second paragraph

The first quasiquote opens, the second increases nesting to one, the unquote-splicing decreases it to zero and the unquote is at level 0 so should result in the x getting unquoted.

The attached patch fixes these problems.

I've also taken the opportunity to simplify the code a bit, reducing the total "quasiquote"-code by 8 lines and making it more readable/understandable. Because R5RS says the behavior of unquote and unquote-splicing is undefined for cases where they occur in positions other than the car of a 2-element list, I decided it's not worth the extra checking whether the cdr of a nested quasiquote call is actually a list or not. It complicated the code and make it a little inconsistent.

The patch also adds tests and fixes the broken (f ..) test syntax; it originally always succeeded, because it threw an error when no errors were thrown, causing the exception handler to always be invoked! Luckily, none of the existing tests are failing after this fix.

Attachments (1)

quasiquote-fix.patch (6.5 KB) - added by sjamaan 13 years ago.

Download all attachments as: .zip

Change History (8)

Changed 13 years ago by sjamaan

Attachment: quasiquote-fix.patch added

comment:1 Changed 13 years ago by felix winkelmann

Peter, is this independent of the CR?

comment:2 Changed 13 years ago by sjamaan

This is the CR :)

If you want you can already apply the second hunk in the patch (the part which fixes the f macro in syntax-tests.scm).

comment:3 in reply to:  2 Changed 13 years ago by felix winkelmann

Milestone: 4.7.0
Type: defectchange request

Replying to sjamaan:

This is the CR :)

If you want you can already apply the second hunk in the patch (the part which fixes the f macro in syntax-tests.scm).

Ok, I have done so.

comment:4 Changed 13 years ago by felix winkelmann

I add here the possible options as given by Peter in his mail to chicken-hackers:

Option 1:
As (quasiquote a b c) throws an error, so could
(quasiquote (quasiquote a b c)) or any even deeper-nested quasiquotes,
as well as unquotes.  Having multiple arguments to quasiquote is
not allowed on level 0, so why should it be allowed on higher levels?
This is the pedantically correct option and might ease the debugging
of code that contains quasiquote calls because this code would be
unportable.  This could break existing code.

This is probably the cleanest option from a language purity standpoint.
It would also make the code even a bit simpler than my patch does, I
think.  We could put a call to ##sys#check-syntax after it's determined
that the car of a pair contains one of the three words quasiquote,
unquote or unquote-splicing.  Then we have an if which checks the
nesting level and either calls WALK recursively or does the unquoting.

The disadvantage of this option compared to option 2 is that it
will be a little less practical when you want to use the literal
symbols quasiquote, unquote or unquote-splicing in deeper-nested
quotations.
With option 2, (quasiquote (x y (quasiquote a b c)))
will "just work".  If this option would be implemented, this
would need to be written as
(quasiquote (x y (unquote-splicing (quote (quasiquote a b c)))))
or as
(quasiquote (x y (unquote (quote quasiquote)) a b c))
otoh, when written as `(x y ,'quasiquote a b c) that's not too
bad.

However, something like (quasiquote (x quasiquote y z)) would
also (perhaps unexpectedly) give an error as that's identical to
(quasiquote (x . (quasiquote y z)))
This difference cannot be detected because quasiquote, unlike normal
Scheme code, allows improper lists to occur anywhere.  In normal Scheme
(x . (y z)) is identical to (x y z) and the y is never evaluated
as a macro.

Option 2 (this is what my current patch does):
Only throw errors in case of invalid list arguments at level 0
and increase the level with 1 whenever quasiquote is found at the
car of a pair.  It will allow you to write quasiquotes and unquotes
at nesting levels above 1 without any strange trickery, and will
throw a nice error message when you try to use them at level 0 with
wrong arity.

The example above, (quasiquote (x quasiquote y z)), will just work
and not complain.  However,  (quasiquote (x quasiquote y z (unquote a)))
will not unquote the a, because this is identical to
(quasiquote (x . (quasiquote y z (unquote a)))), so (unquote a) occurs
at nesting level 1 and simply decreases the level to 0 instead of
evaluating the "a".  That's the price we'd pay for having a loose and
non-strictly pedantic system.

Option 3:
Only fix R5RS behaviour and throw an error when an invalid number
of arguments are found at level 0.  This will undo the part of my
patch that tries to make the behaviour more consistent.  I am unsure
right now whether this is possible at all or if the inconsistencies
are what actually cause the r5rs breakage.

The main difference is that the code in quasiquote stays the same; it
will distinguish between quasiquote at the car of a pair which has a
pair as cdr and quasiquote at the car of a pair which has a non-pair
as cdr.  Example: (quasiquote (quasiquote . #((unquote a))))
will result in trying to unquote a, whereas in options 1 and 2 it
will stay quoted because the second quasiquote increases the nesting
level.  I can't think of any other cases than vectors where this
differs though (unless extra read syntax is added later).

Option 4:
Only add an error message when wrong number of arguments are found
at level 0.  This would keep the R5RS-incompatibilities.

Option 5:
Do not change anything.  It is not a bug, r5rs does not specify
what happens so silently dropping extra arguments to unquote is
acceptable.

Option 6:
Do not change the behaviour of unquote with multiple args, but do
fix the behaviour of nested quasiquotes to pass only the new
r5rs-compliance tests.  This is option 3, but will leave the
behaviour that (quasiquote (quasiquote a b c)) => (quasiquote a)

Option 7:
Do it the R6RS way.  We'd have to flesh out what exactly happens when
you do (quasiquote (quasiquote a b c)) or similar, and what happens when
you pass multiple arguments to unquote in a non-list context.

(I find it very hard to think of all the possible implications, but option
2 appears to me the easiest and least risky way)

comment:6 Changed 13 years ago by felix winkelmann

I think we should implement this.

comment:7 Changed 13 years ago by sjamaan

Resolution: fixed
Status: newclosed

Sorry, I had kind of forgotten about it. Thanks for reminding me, I've implemented it and pushed as changeset 3d8167f.

Note: See TracTickets for help on using tickets.