Opened 13 years ago

Closed 11 years ago

#505 closed enhancement (fixed)

permit user-defined read-syntax to restart the reader

Reported by: Jim Ursetto
Owned by: Jim Ursetto
Priority: minor
Milestone:
Component: core libraries
Version: 4.6.x
Keywords: don't fear the reader
Cc:
Estimated difficulty:

Description

Currently, user-defined read-syntax must return a value to the reader; it cannot simply discard input. A proposed solution, derived from CL reader macros (see CLHS 2.2 no.4), treats a zero-value return, (values), as a request to reinvoke the reader in tail position, permanently discarding any input the handler consumed.

This change permits user implementation of line comments, block comments, s-expr comments, read-time feature testing, read-time evaluation, and more; all with the existing user API.

Though a zero-value return is admittedly ugly and inefficient -- one noted luminary has rightfully referred to it as a "festering CL pustule on the buttocks of Scheme" -- it has the advantages of being compatible with the current API and being virtually impossible to trigger accidentally.
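
As an illustration of the s-expr comment case, here is a minimal sketch assuming a CHICKEN with the zero-values behaviour applied. The #^ prefix is a hypothetical choice made for this example only, not an existing CHICKEN syntax.

```scheme
;; Written for CHICKEN 4, where set-sharp-read-syntax! needs no import.
;; On CHICKEN 5, first evaluate: (import (chicken read-syntax))

;; Hypothetical datum comment: "#^ <datum>" is consumed and dropped.
(set-sharp-read-syntax! #\^
  (lambda (port)
    (read port)   ; consume the next datum and throw it away
    (values)))    ; zero values: ask the reader to restart

;; Read from a string so the new syntax is exercised at run time.
(define result (read (open-input-string "(a #^ (b c) d)")))
;; result is the list (a d)
```

Reading from a string port avoids depending on when the handler gets installed relative to the reading of the source file itself.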

Attachments (2)

reader-restart-mv.diff.txt (1.9 KB) - added by Jim Ursetto 13 years ago.
multiple value implementation
reader-restart-rec.diff.txt (4.2 KB) - added by Jim Ursetto 13 years ago.
tail-call restart version


Change History (9)

comment:1 Changed 13 years ago by sjamaan

A good alternative might be to create a distinct new type (either a record type or an actual new primitive type, if there are still tag bits left and we want to sacrifice one for this) to distinguish "drop this value" from other types of returns. This is also backwards-compatible and doesn't require MV or other nasty tricks.
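
A rough sketch of this idea follows, using a record type as the sentinel. The names skip-marker, skip, and dispatch-handler are invented for illustration, and dispatch-handler is only a toy stand-in for the reader's internal dispatch, not CHICKEN's actual code.

```scheme
;; Hypothetical sentinel type: a zero-field record with its own predicate.
(define-record-type skip-marker
  (make-skip-marker)
  skip-marker?)

(define skip (make-skip-marker))

;; Toy stand-in for the reader's dispatch: call the handler and,
;; if it returns the sentinel, read the next datum instead.
(define (dispatch-handler handler port)
  (let ((result (handler port)))
    (if (skip-marker? result)
        (read port)
        result)))

;; A handler that discards one datum, and one that returns a value.
(define (comment-handler port) (read port) skip)
(define (value-handler port) (list 'wrapped (read port)))

(define r1 (dispatch-handler comment-handler (open-input-string "ignored kept")))
(define r2 (dispatch-handler value-handler (open-input-string "x")))
;; r1 is the symbol kept; r2 is the list (wrapped x)
```

Because the sentinel is an ordinary single value, no multiple-value machinery is involved; the cost is one extra exported object that handlers must know about.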

Changed 13 years ago by Jim Ursetto

Attachment: reader-restart-mv.diff.txt added

multiple value implementation

comment:2 Changed 13 years ago by Jim Ursetto

Here's another possibility, which I think is cleaner. It adds new starred variants of set-read-syntax!, set-parameterized-read-syntax! and set-sharp-read-syntax!, whose handlers take 2 arguments (p restart) instead of just (p). Invoking (restart) will restart the reader, as if you had returned zero values in the previous patch. E.g.

;; instead of
(set-read-syntax! #\! (lambda (p) (read-line p) (values)))

;; we can use
(set-read-syntax!* #\! (lambda (p restart) (read-line p) (restart)))

;; then
(list 'foo ! 'bar 'baz
 'quux)
; => (foo quux)

The starred names are ugly. I also thought of set-restartable-read-syntax!, set-restartable-sharp-read-syntax! and set-restartable-parameterizable-read-syntax!, but those get ridiculously long. (To be fair, I doubt parameterizable read syntax will need a restartable variant.)

The only thing this doesn't do is allow calls to ##sys#user-read-hook to restart, but I can live with that!

Changed 13 years ago by Jim Ursetto

Attachment: reader-restart-rec.diff.txt added

tail-call restart version

comment:3 Changed 13 years ago by felix winkelmann

Actually I think the 0-values variant is simpler and doesn't need yet another library procedure. I understand that it is painful to use a CL idiom, a dark spot on the shining architectural awesomeness of CHICKEN. So this will be a very hard thing to decide on.

comment:4 in reply to:  3 Changed 13 years ago by Jim Ursetto

Replying to felix:

> Actually I think the 0-values variant is simpler and doesn't need yet another library procedure. I understand that it is painful to use a CL idiom, a dark spot on the shining architectural awesomeness of CHICKEN.

More inefficient than ugly, IMO.
I think you might look at it like this: if you had from the start designed reader macros to be capable of restarting the reader, how would you have done it? Would you have passed a continuation argument, like I did in the second version? If so, you could consider the new library procedures to be correcting an API defect, and deprecate the old ones. Or, given that only six eggs use user-defined read-syntax, we could change the existing API and use a compile-time feature test in them to determine when the extra argument should be expected. (Or change everything to expect an #!optional restart argument and defer the test until runtime.)

If the values solution appeals to you more, then by all means use it. But I will need some way of testing the restart feature availability either way: returning (values) from a reader macro on existing Chicken will produce illegal code (atomic #<unspecified> value, as when returning (void)).
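
One possible way to test for the feature at run time is to probe the reader directly. This is only a sketch: reader-restart-supported? is a hypothetical helper, not an existing CHICKEN procedure, and it assumes that set-read-syntax! modifies the read table currently bound to current-read-table.

```scheme
;; Written for CHICKEN 4. On CHICKEN 5, first evaluate:
;; (import (chicken read-syntax))

;; Hypothetical probe: install a zero-values handler in a private copy
;; of the read table, then check whether the reader skips past it.
(define (reader-restart-supported?)
  (parameterize ((current-read-table (copy-read-table (current-read-table))))
    (set-read-syntax! #\! (lambda (port) (values)))
    ;; With restart support the reader drops "!" and returns ok;
    ;; without it, the handler's unspecified result comes back instead.
    (eq? 'ok (read (open-input-string "! ok")))))
```

The probe works on a copy of the read table, so the global meaning of "!" is left unchanged.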

comment:5 Changed 13 years ago by Jim Ursetto

Final comment, I hope.
The 0-values solution assumes the reader (readrec) will be reinvoked in tail position. However, many of the existing read-syntax definitions in the reader perform a non-tail call to (readrec), for example: quote, quasiquote, #$, #`, #+ and even #;. Are these recursive calls equivalent to (read p)? If so, users can just call that. I assume it is so, and that the direct call to readrec is just an optimization there. But if not, the functionality should be available to users.
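
For comparison, a user-level handler that needs the following datum can make the same kind of non-tail call with plain (read port). The sketch below assumes that (read port) is an adequate substitute for readrec here; the ^ prefix and the caret symbol are hypothetical choices for the example.

```scheme
;; Written for CHICKEN 4. On CHICKEN 5, first evaluate:
;; (import (chicken read-syntax))

;; Hypothetical quote-like syntax: "^x" reads as (caret x).
(set-read-syntax! #\^
  (lambda (port)
    ;; Non-tail call: the datum is read first, then wrapped.
    (list 'caret (read port))))

(define wrapped (read (open-input-string "(a ^(b c) d)")))
;; wrapped is the list (a (caret (b c)) d)
```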

comment:6 Changed 13 years ago by felix winkelmann

Milestone: 4.7.0

comment:7 Changed 11 years ago by Jim Ursetto

Resolution: fixed
Status: new → closed

Zero-values solution was applied in Chicken 4.6.6.
