Opened 8 years ago

Last modified 7 years ago

#1293 new defect

eq?-hash tables calculcate different value for objects after they're mutated and lose locatives after GC

Reported by: sjamaan Owned by:
Priority: critical Milestone: someday
Component: core libraries Version: 4.10.x
Keywords: srfi-69, hash tables, locatives Cc:
Estimated difficulty: hard

Description (last modified by sjamaan)

If you mutate a pair, it's still eq? to itself, but the hash-by-identity / eq?-hash doesn't calculate the same hash anymore, which means the hash table cannot find the key, unless the randomization factor happens to be such that it finds the same bucket, of course, but that's highly unlikely for simple cases like the one below:

#;1> (use srfi-69)
; loading /home/sjamaan/chickens/4.11.0/lib/chicken/8/srfi-69.import.so ...
; loading library srfi-69 ...
#;2> (define x (cons 1 2))
#;3> (hash-by-identity x)
380849329
#;4> (set-car! x 3)
#;5> (hash-by-identity x)
380718257
#;6> (= #3 #5)
#f
#;7> 

Originally pointed out by John Croisant:

;; Even though the filler slots are not used, if you remove any of
;; them, the hash of the instance will change after GC, causing the
;; hash table lookup to fail.

(use srfi-69 lolevel)

(define table (make-hash-table test: eq? hash: eq?-hash))

(define-record-type box
  (make-box contents)
  box?
  ;; Remove or comment out any of the next three lines:
  (filler1  box-filler1)
  (filler2  box-filler2)
  (filler3  box-filler3)
  (contents box-contents))

(define my-box (make-box (make-locative "foo")))

(hash-table-set! table my-box #t)

(printf "before gc, hash table contains ~S? ~S~N"
        my-box
        (hash-table-exists? table my-box))

(gc)

(printf "after gc,  hash table contains ~S? ~S~N"
        my-box
        (hash-table-exists? table my-box))

(printf "hash table as alist: ~S~N"
        (hash-table->alist table))

Attachments (1)

srfi-69-testcases.patch (2.1 KB) - added by sjamaan 8 years ago.
A patch for the srfi-69 test suite to detect these problems

Download all attachments as: .zip

Change History (12)

comment:1 Changed 8 years ago by John Croisant

Context from IRC:

<jacius> I am having the weirdest bug. If I erase an unused slot from the source code of a record type, it greatly changes the behavior of my app.
<jacius> No code is referencing the slot, and I can rename the slot with no effect. But if I delete it from the code things break.
<jacius> Also if I change the order of the slots, things break.

(later)

<jacius> Okay, I think I have narrowed down my weird bug. I think it's related to hashing behavior for records and locatives.
<jacius> I think that if a record holds a locative in any of its first three slots, the record's hash will change after GC.
<jacius> But not if the locative is in the 4th or later slots.
<jacius> I think this also applies recursively for records that hold records that hold locatives.
<jacius> So, I think my app was breaking because I was using a record as a key in a hash table, but the record's hash was changing after GC. Whee.
<evhan> jacius: Wow, that's a pretty nice bug.
<evhan> "Good work." :)
<jacius> That's my theory, anyway. Making a minimal test case to see if I am right.
<jacius> I am surprised that the result of eq?-hash depends on the record's contents
<evhan> I think eq?-hash should just use the object's address.
<jacius> That's what I would think too
<jacius> Oh
<jacius> So maybe the record itself is moving during GC, so its address changes?
<jacius> No, that can't be it
<jacius> Because it only changes if one of the first three slots holds a locative

Version 0, edited 8 years ago by John Croisant (next)

comment:2 Changed 8 years ago by sjamaan

It looks like the real problem is that eq? hash tables don't work as you'd expect: If you mutate an object, it's still eq? to its previous incarnation (if there were such a thing), but the hash function calculates a different hash:

#;1> (use srfi-69)
; loading /home/sjamaan/chickens/4.11.0/lib/chicken/8/srfi-69.import.so ...
; loading library srfi-69 ...
#;2> (define x (cons 1 2))
#;3> (hash-by-identity x)
380849329
#;4> (set-car! x 3)
#;5> (hash-by-identity x)
380718257
#;6> (= #3 #5)
#f
#;7> 

This is completely broken, and it seems that the reference implementation has the exact same bug in hash-by-identity.

Racket and Guile seem to do this correctly, even mutation of a pair doesn't result in a new hash value.

I have no idea how to fix this, as of yet...

Changed 8 years ago by sjamaan

Attachment: srfi-69-testcases.patch added

A patch for the srfi-69 test suite to detect these problems

comment:3 Changed 8 years ago by sjamaan

Priority: majorcritical
Summary: Locatives get moved around causing hash tables to lose themeq?-hash tables calculcate different value for objects after they're mutated and lose locatives after GC

comment:4 Changed 8 years ago by sjamaan

Description: modified (diff)

comment:5 Changed 8 years ago by sjamaan

Fixing locatives for equal? hash tables is relatively easy: just special case it and use locative->object to retrieve the object and recur on it, instead of taking the raw pointer in the "special" slot.

comment:6 Changed 8 years ago by sjamaan

Actually, disregard that last comment: if an object is eq?, it must also be equal?, so the problem with mutation of the keys exists just as much for equal?-hash as for eq?-hash.

comment:7 Changed 8 years ago by sjamaan

One (very) ugly way to "fix" this is to change hash-by-identity: It can simply hash the object's pointer, and record it in a side table along with the current pointer. This means that every object that ever was hashed has an entry in that table.

Whenever the object is moved, update the current pointer. Then, if the object is looked up again we can obtain the originally assigned hash value. On major GC, walk the table and remove all the keys that no longer exist.

It's ugly and stupid and takes up a lot of unnecessary memory. I really don't like it, but we could do this in a generic/reusable way and also use it in core's symbol table, for example.

comment:8 Changed 8 years ago by sjamaan

Never mind, that won't work because then you get O(n) behaviour over all items ever hashed (across all tables!), which is exactly what you don't want when choosing a hash table!

comment:9 Changed 8 years ago by sjamaan

Estimated difficulty: insane

comment:10 Changed 8 years ago by sjamaan

Milestone: 4.12.0someday

comment:11 Changed 7 years ago by johnwcowan

Estimated difficulty: insanehard

The Right Thing here is to use a pointer as the eq-hash, and then modify the major GC (which has a conspectus of all of memory) so that it rehashes all the eq-hash-tables at once. This is hard, but not insanely hard.

Alternatively, simply declare that anyone who mutates a key deserves to lose, even in eq-hash-tables.

Note: See TracTickets for help on using tickets.