Opened 10 years ago
Closed 8 years ago
#1173 closed defect (fixed)
Symbols held in GC-roots are sometimes collected in "symbol-gc" mode
Reported by: | felix winkelmann | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 5.0 |
Component: | core libraries | Version: | 4.9.x |
Keywords: | Cc: | ||
Estimated difficulty: | hard |
Description
(Initially reported by sjamaan)
If a symbol is stored in a GC-root, then it may be collected when symbol-GC is enabled (with the -:w
runtime option). Here a simple test case:
(foreign-declare "static void *root; static void foo(C_word x) { root = CHICKEN_new_gc_root(); CHICKEN_gc_root_set(root, x); } static C_word bar() { return CHICKEN_gc_root_ref(root); }") (define foo (foreign-lambda void "foo" scheme-object)) (define bar (foreign-lambda scheme-object "bar")) (gc #t) (foo (string->symbol "FOO")) (gc #t) (print (eq? (string->symbol "FOO") (bar)))
Without -:w
, this will print #t
, with -:w
it will print #f
(sometimes, not always. of course)
Attachments (4)
Change History (19)
comment:1 Changed 9 years ago by
Summary: | Symbols help in GC-roots are sometimes collected in "symbol-gc" mode → Symbols held in GC-roots are sometimes collected in "symbol-gc" mode |
---|
comment:2 Changed 9 years ago by
Milestone: | someday → 5.0 |
---|
comment:3 Changed 9 years ago by
I think this has nothing to do with GC roots in particular. The symbol GC is throwing away symbols too eagerly:
$ csi -e '(gc #t) (use posix)' # all good $ csi -:w -e '(gc #f) (use posix)' # still good $ csi -:w -e '(gc #t) (use posix)' # what the hell Error: (require) cannot load extension: posix Call history: <syntax> (gc #t) <eval> (gc #t) <syntax> (use posix) <syntax> (##core#require-extension (posix) #t) <syntax> (##core#begin (##core#begin (##sys#require (quote posix)) (import posix)) (##core#undefined)) <syntax> (##core#begin (##sys#require (quote posix)) (import posix)) <syntax> (##sys#require (quote posix)) <syntax> (quote posix) <syntax> (##core#quote posix) <syntax> (import posix) <syntax> (##core#undefined) <syntax> (##core#undefined) <eval> (##sys#require (quote posix)) <--
comment:4 Changed 8 years ago by
After some brief experimentation I think that the root cause is the fact that the symbol-gc greedily wipes all the symbols from the table, including the static ones that hold the names for the exported symbols.
Back to this specific case, ##sys#load-extension
is failing because an eq? comparison doesn't hold anymore (I think it was one of the memq, sadly I didn't write any note) and end up printing the error above.
As a possible fix I could devise two possible scenarios:
- Make the static symbols have a ∞ reference count (so that now the counter has 4 possible states: 0, 1, 1+, ∞)
- Avoid static symbols when evicting the symbols, you can recognize that a symbol's static if the slot-1 string resides outside the chicken heap.
I did implement the latter since it was easier and fixed this particular crash, I didn't test it much though.
comment:5 Changed 8 years ago by
It's true that eq?
would fail, but that will only happen if the symbol is re-interned while there's still some reference to it elsewhere. This can be fixed by skipping symbols that have a global binding or a plist attached to them, I think.
comment:6 Changed 8 years ago by
Special-casing static symbols might be helpful to avoid allocating memory for symbol strings that are already malloc
ed, though. But it wouldn't be an actual fix for the current situation.
comment:7 Changed 8 years ago by
BTW, statically allocated objects can be detected via C_permanentp
.
comment:8 Changed 8 years ago by
Grr, this program no longer fails under CHICKEN 5. We didn't actually fix the bug AFAIK, so this is a coincidence.
comment:9 Changed 8 years ago by
The reason the original version doesn't work seems to be because the weak table's displacement code might run out of space, and simply not insert the symbol in the table.
The fixed version dispenses with the weak table, and instead changes buckets to weakly refer to their symbol (by using C_SPECIALBLOCK_BIT
). After a GC, it scans the symbol table, and updates the weak references if their targets are forwarded. If the result isn't in the new heap or permanent, it can be discarded.
This implementation is rather slow, because I had to move the scanning of the symbol table to be done even for minor GCs, because symbols can be purely stack-allocated and unreferenced by anything except the symbol table, yet still need to be preserved: this happens when the symbol gets a global definition or a property list.
We can speed this up by changing the code for set!
and putprop
to "touch" the symbol table, so these symbols are put on the mutation stack, for example. Alternatively, we could make a separate stack of "symbols to persist", but that's a bit ugly.
If we add this "touch symbol" code, it probably also needs to remove the C_SPECIALBLOCK_BIT
on the bucket containing the symbol. That way, the symbol is preserved on the next major GC even if there's nothing referencing it. Then, the original symbol table scanning code in the GC can be restored again, which means the performance should be the same as it originally was.
comment:10 Changed 8 years ago by
A nice additional advantage of this new approach is that it no longer needs the bucket type to be distinct from regular pairs. I think we can get rid of this type tag; probably only in CHICKEN 5, because it's, strictly speaking, a backwards incompatibility.
comment:11 Changed 8 years ago by
This version applies the ideas I mentioned, and it results in exactly the same performance as with regular CHICKEN when not collecting symbols.
Unfortunately, collecting weak symbols is very slow. The knucleotide
benchmark is five times slower than with the old CHICKEN, when passing -:w
. Of course, this can be explained due to the fact that the old CHICKEN only reclaimed a few hundred symbols in major GCs on that benchmark while CHICKEN with patch reclaims several hundred thousand symbols.
comment:12 Changed 8 years ago by
hm, if I change rescan
in C_reclaim
like this, it's very fast, because it skips symbol GC like the original code did:
if(n > 0 && (h & C_BYTEBLOCK_BIT) == 0) { if(h & C_SPECIALBLOCK_BIT) { /* Minor GC needs to be fast; always mark weakly held buckets */ if (gc_mode != GC_MINOR || h != C_WEAK_BUCKET_TAG) { --n; ++p; } } while(n--) mark(p++); }
I'm not sure how much I like this: the loop is supposed to be as fast as possible, and adding another (highly specific) check in there seems a bit ugly.
WEAK_BUCKET_TAG
is defined as (C_BUCKET_TYPE | C_SPECIALBLOCK_BIT | (C_SIZEOF_BUCKET - 1))
of course. We could still define it as (C_PAIR_TYPE | C_SPECIALBLOCK_BIT | (C_SIZEOF_PAIR - 1))
to get rid of the bucket type.
It would be nice if we can make this behave more generically. Then we can make locatives work exactly like this, as well, and theoretically we could add other weak types too.
Changed 8 years ago by
Attachment: | fast-working-symbol-gc.patch added |
---|
Working symbol GC that's as fast as the original
comment:13 Changed 8 years ago by
OK, my original analysis about the hash table was (of course) completely wrong.
It looks like the problem with the original implementation is that we may encounter both the symbol, its bucket container and another item that refers to the symbol during minor GC. These are all copied to the fromspace. Nothing happens to the weak table because it's a minor GC. We scan over the bucket, and update the forwarding pointer to the symbol, so it points to the symbol in the fromspace.
Then, a major GC kicks in because fromspace is full.
We first copy the bucket to tospace. This enters it into the weak table with counter 0, using as item pointer the symbol's location in fromspace. Then, the symbol itself is copied and the counter is set to 1. Then, we encounter the other item that refers to the symbol. However, it still refers to the stack location of the symbol, which holds a forwarding pointer. The forwarding pointer gets updated, but the symbol's counter doesn't get incremented.
I think a less invasive patch might be possible if we check the weak table for an entry for the forwarded location while chasing forwarding pointers, to handle this case.
Changed 8 years ago by
Attachment: | simpler-fix.patch added |
---|
A noninvasive patch that works with the current weak table code
comment:15 Changed 8 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
The bug itself is now fixed by 2150ad5 / b6c9d81.
Will polish the other patch later and send it, but it's an improvement, not a bugfix.
I'd like to fix this for CHICKEN 5 (symbol GC should be the default, IMO)