Opened 11 years ago
Closed 6 years ago
#1077 closed defect (fixed)
Symbols containing newlines don't get quoted by write
Reported by: | sjamaan | Owned by: | sjamaan |
---|---|---|---|
Priority: | major | Milestone: | 5.1 |
Component: | core libraries | Version: | 4.8.x |
Keywords: | Cc: | ||
Estimated difficulty: | insane |
Description
(write '|\n|)
This should print |\n|
but just prints a newline, breaking read/write invariance.
Reported by zenspider on IRC.
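The invariance in question can be checked with a small round-trip helper. This is only a sketch: `write-read-roundtrip` is a hypothetical name, not something from the core libraries, and it relies only on the standard string-port procedures.

```scheme
;; Hypothetical helper: write X to a string port, then read it back.
(define (write-read-roundtrip x)
  (let ((out (open-output-string)))
    (write x out)
    (read (open-input-string (get-output-string out)))))

;; A symbol whose name is a single newline character, i.e. '|\n|.
(define newline-symbol (string->symbol (string #\newline)))

;; Read/write invariance holds for a datum when this yields #t.
(define (roundtrips? x)
  (equal? x (write-read-roundtrip x)))

(display (roundtrips? newline-symbol))
(newline)
```

On an affected version the check for `newline-symbol` should come out false, since the written text is just a newline and nothing symbol-like is read back.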
Attachments (1)
Change History (11)
comment:1 Changed 11 years ago by
comment:2 Changed 11 years ago by
Thanks for your thorough analysis, Evan. I'm not sure what you mean by "longer namespaces like
##compiler# might have to change", though. Could you elaborate?
I was thinking we should just prefix them with ##compiler#. Do you expect problems with that?
comment:3 Changed 11 years ago by
I only meant that if we dropped namespace-max-id-len to 3, for example, the ##compiler# namespace would be too long ("compiler" length 8) and identifiers starting with it would fail to be recognized as qualified symbols (when read by r-ext-symbol). I did try that out of curiosity and things exploded, though I didn't keep looking to see how badly.
Changed 11 years ago by
Attachment: | 0001-Do-not-use-a-private-namespace-for-the-csi-program.patch added |
---|
Remove private namespace for csi
comment:4 Changed 11 years ago by
This first patch is an easy one, but it makes it easier and more self-contained to remove the private namespace from the compiler itself. It removes the private namespace from the "csi" program - it is compiled separately and we can use the regular (declare (hide ...)) to hide any private variables that user code is not supposed to see.
comment:5 Changed 11 years ago by
Milestone: | 4.9.0 → 4.10.0 |
---|
Let's postpone to 4.10.0; it's not a blocker
comment:6 Changed 9 years ago by
Milestone: | 4.10.0 → 5.0 |
---|
This is closely related to #1131, which we'll fix somewhere in CHICKEN 5.
comment:7 Changed 8 years ago by
Estimated difficulty: | → insane |
---|
comment:8 Changed 8 years ago by
Milestone: | 5.0 → 5.1 |
---|
We're making headway with this by properly modularising the core system, but this won't get finished for 5.0 (maybe not even 5.1, but one can dream).
comment:9 Changed 7 years ago by
Looks like keywords also fall somewhere in here: any symbol that starts with \x00 gets written as a keyword.
comment:10 Changed 6 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
Made a new issue for just the keywords (#1576), so we can close this one; it's been fixed with 5.0.1.
I think this is a consequence of the way qualified symbols are encoded,
and affects all symbols whose first byte is less than 32 (please someone
correct me if any of the following is wrong).
Any symbol whose name has a leading byte under 32 is considered
qualified, with that byte specifying the length of the namespace part of
the ensuing string. Obviously, this is invalid for '|\n|, so when it's
handled as a qualified symbol in ##sys#print after satisfying
##sys#qualified-symbol? (library.scm:3357), ##sys#symbol->qualified-string
detects this invalid length, falls back to simply returning the symbol's
string value without any qualification, and we get a lone newline printed
out as the result.
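To make the encoding concrete, here is a rough model of it over plain strings. It is not the code from library.scm: the names `namespace-length` and `qualified-string` are made up for illustration, and the limit of 31 is an assumption based on "less than 32" above.

```scheme
;; Assumed limit; the ticket only says the leading byte must be under 32.
(define namespace-max-id-len 31)

;; Return the namespace length encoded in the first character of STR,
;; or #f when the name cannot be a valid qualified name.
(define (namespace-length str)
  (let ((len (string-length str)))
    (and (> len 0)
         (let ((b0 (char->integer (string-ref str 0))))
           (and (<= b0 namespace-max-id-len)
                (< b0 len)          ; namespace must fit inside the name
                b0)))))

;; Model of the printed form: "##<namespace>#<rest>" when the length is
;; valid, otherwise the raw name with no qualification or quoting.
(define (qualified-string str)
  (let ((n (namespace-length str)))
    (if n
        (string-append "##" (substring str 1 (+ n 1))
                       "#" (substring str (+ n 1) (string-length str)))
        str)))

;; Leading newline (byte 10) followed by eleven a's:
(display (qualified-string
          (string-append (string #\newline) (make-string 11 #\a))))
(newline)
;; prints ##aaaaaaaaaa#a
```

With a name consisting of the newline alone, `namespace-length` returns #f and the raw one-character string comes back, which matches the lone newline seen in the report.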
You can see what would happen were 10 a valid length by extending the
symbol, e.g. '|\naaaaaaaaaaa| => ##aaaaaaaaaa#a.

All that said, I'm not really sure what to do about this. We could make
##sys#qualified-symbol? check whether its argument has a valid
namespace length so its behavior at least matches that of split
(library.scm:1184) and the procedures defined over it, but that leaves
things dependent on the length of the symbol (e.g. the difference
between '|\n| and '|\naaaaaaaaaaa| above) so it isn't a great option. We
could drop namespace-max-id-len so that symbols can begin with the more
commonly-used values under 32 (\n, \t, etc.), but even if we dropped it to
something quite low we'd still have problems with e.g. '|\x03|, and longer
namespaces like ##compiler# might have to change, so that's also not
really an option either. We could... I don't know. Hopefully I'm missing a
really obvious fix.
Thoughts?
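For reference, the two options can be compared with a pair of toy predicates over name strings. Both names (`loosely-qualified?`, `strictly-qualified?`) are invented here, and the maximum namespace length is passed in as a parameter so that the effect of lowering it is visible. This is a sketch of the idea, not the library code.

```scheme
;; Loose check: only the leading byte is examined.
(define (loosely-qualified? str max-id-len)
  (and (> (string-length str) 0)
       (<= (char->integer (string-ref str 0)) max-id-len)))

;; Strict check: the encoded namespace length must also fit in the name.
(define (strictly-qualified? str max-id-len)
  (and (loosely-qualified? str max-id-len)
       (< (char->integer (string-ref str 0)) (string-length str))))

(define nl (string #\newline))                        ; name of '|\n|
(define nl-long (string-append nl (make-string 11 #\a)))
;; Byte 8 followed by "compiler" and a made-up identifier "foo".
(define compiler-name
  (string-append (string (integer->char 8)) "compiler" "foo"))

(display (list (loosely-qualified? nl 31)             ; #t
               (strictly-qualified? nl 31)            ; #f
               (strictly-qualified? nl-long 31)       ; #t, length-dependent
               (strictly-qualified? compiler-name 31) ; #t
               (strictly-qualified? compiler-name 3)  ; #f, namespace too long
               (loosely-qualified? (string (integer->char 3)) 3))) ; #t
(newline)
```

The third and last results show the two drawbacks mentioned above: the strict check depends on the length of the name, and a lower limit still catches names starting with a small byte such as \x03 while no longer recognizing an 8-character namespace.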