Opened 16 months ago

Last modified 8 months ago

#1293 new defect

eq?-hash tables calculcate different value for objects after they're mutated and lose locatives after GC

Reported by: sjamaan Owned by:
Priority: critical Milestone: someday
Component: core libraries Version: 4.10.x
Keywords: srfi-69, hash tables, locatives Cc:
Estimated difficulty: insane

Description (last modified by sjamaan)

If you mutate a pair, it's still eq? to itself, but the hash-by-identity / eq?-hash doesn't calculate the same hash anymore, which means the hash table cannot find the key, unless the randomization factor happens to be such that it finds the same bucket, of course, but that's highly unlikely for simple cases like the one below:

#;1> (use srfi-69)
; loading /home/sjamaan/chickens/4.11.0/lib/chicken/8/srfi-69.import.so ...
; loading library srfi-69 ...
#;2> (define x (cons 1 2))
#;3> (hash-by-identity x)
380849329
#;4> (set-car! x 3)
#;5> (hash-by-identity x)
380718257
#;6> (= #3 #5)
#f
#;7> 

Originally pointed out by John Croisant:

;; Even though the filler slots are not used, if you remove any of
;; them, the hash of the instance will change after GC, causing the
;; hash table lookup to fail.

(use srfi-69 lolevel)

(define table (make-hash-table test: eq? hash: eq?-hash))

(define-record-type box
  (make-box contents)
  box?
  ;; Remove or comment out any of the next three lines:
  (filler1  box-filler1)
  (filler2  box-filler2)
  (filler3  box-filler3)
  (contents box-contents))

(define my-box (make-box (make-locative "foo")))

(hash-table-set! table my-box #t)

(printf "before gc, hash table contains ~S? ~S~N"
        my-box
        (hash-table-exists? table my-box))

(gc)

(printf "after gc,  hash table contains ~S? ~S~N"
        my-box
        (hash-table-exists? table my-box))

(printf "hash table as alist: ~S~N"
        (hash-table->alist table))

Attachments (1)

srfi-69-testcases.patch (2.1 KB) - added by sjamaan 13 months ago.
A patch for the srfi-69 test suite to detect these problems

Download all attachments as: .zip

Change History (11)

comment:1 Changed 16 months ago by jacius

Context from IRC:

<jacius> I am having the weirdest bug. If I erase an unused slot from the source code of a record type, it greatly changes the behavior of my app.
<jacius> No code is referencing the slot, and I can rename the slot with no effect. But if I delete it from the code things break.
<jacius> Also if I change the order of the slots, things break.
<jcowan> DO a make clean, or local equivalent, and recompile from scratch. Someone is probably hanging on to the old definition.
<jacius> Tried that, no luck. I even tried updating from 4.10 to 4.11.0rc2 to see if that might fix it.

(later)

<jacius> Okay, I think I have narrowed down my weird bug. I think it's related to hashing behavior for records and locatives.
<jacius> I think that if a record holds a locative in any of its first three slots, the record's hash will change after GC.
<jacius> But not if the locative is in the 4th or later slots.
<jacius> I think this also applies recursively for records that hold records that hold locatives.
<jacius> So, I think my app was breaking because I was using a record as a key in a hash table, but the record's hash was changing after GC. Whee.
<evhan> jacius: Wow, that's a pretty nice bug.
<evhan> "Good work." :)
<jacius> That's my theory, anyway. Making a minimal test case to see if I am right.
<jacius> I am surprised that the result of eq?-hash depends on the record's contents
<evhan> I think eq?-hash should just use the object's address.
<jacius> That's what I would think too
<jacius> Oh
<jacius> So maybe the record itself is moving during GC, so its address changes?
<jacius> No, that can't be it
<jacius> Because it only changes if one of the first three slots holds a locative

Last edited 16 months ago by jacius (previous) (diff)

comment:2 Changed 13 months ago by sjamaan

It looks like the real problem is that eq? hash tables don't work as you'd expect: If you mutate an object, it's still eq? to its previous incarnation (if there were such a thing), but the hash function calculates a different hash:

#;1> (use srfi-69)
; loading /home/sjamaan/chickens/4.11.0/lib/chicken/8/srfi-69.import.so ...
; loading library srfi-69 ...
#;2> (define x (cons 1 2))
#;3> (hash-by-identity x)
380849329
#;4> (set-car! x 3)
#;5> (hash-by-identity x)
380718257
#;6> (= #3 #5)
#f
#;7> 

This is completely broken, and it seems that the reference implementation has the exact same bug in hash-by-identity.

Racket and Guile seem to do this correctly, even mutation of a pair doesn't result in a new hash value.

I have no idea how to fix this, as of yet...

Changed 13 months ago by sjamaan

A patch for the srfi-69 test suite to detect these problems

comment:3 Changed 13 months ago by sjamaan

  • Priority changed from major to critical
  • Summary changed from Locatives get moved around causing hash tables to lose them to eq?-hash tables calculcate different value for objects after they're mutated and lose locatives after GC

comment:4 Changed 13 months ago by sjamaan

  • Description modified (diff)

comment:5 Changed 13 months ago by sjamaan

Fixing locatives for equal? hash tables is relatively easy: just special case it and use locative->object to retrieve the object and recur on it, instead of taking the raw pointer in the "special" slot.

comment:6 Changed 13 months ago by sjamaan

Actually, disregard that last comment: if an object is eq?, it must also be equal?, so the problem with mutation of the keys exists just as much for equal?-hash as for eq?-hash.

comment:7 Changed 13 months ago by sjamaan

One (very) ugly way to "fix" this is to change hash-by-identity: It can simply hash the object's pointer, and record it in a side table along with the current pointer. This means that every object that ever was hashed has an entry in that table.

Whenever the object is moved, update the current pointer. Then, if the object is looked up again we can obtain the originally assigned hash value. On major GC, walk the table and remove all the keys that no longer exist.

It's ugly and stupid and takes up a lot of unnecessary memory. I really don't like it, but we could do this in a generic/reusable way and also use it in core's symbol table, for example.

comment:8 Changed 13 months ago by sjamaan

Never mind, that won't work because then you get O(n) behaviour over all items ever hashed (across all tables!), which is exactly what you don't want when choosing a hash table!

comment:9 Changed 13 months ago by sjamaan

  • Estimated difficulty set to insane

comment:10 Changed 8 months ago by sjamaan

  • Milestone changed from 4.12.0 to someday
Note: See TracTickets for help on using tickets.