Opened 4 months ago

Last modified 3 months ago

#1356 new defect

trace buffer unnecessarily holds on to thread objects

Reported by: kristianlm Owned by:
Priority: minor Milestone: 5.0
Component: unknown Version: 4.12.0
Keywords: Cc:
Estimated difficulty: easy

Description

version:

;; 2015 klm@kth /s/p/a/a/s/ais-sse ➤ csi -version
;; 
;; CHICKEN
;; (c) 2008-2017, The CHICKEN Team
;; (c) 2000-2007, Felix L. Winkelmann
;; Version 4.12.0 (rev 6ea24b6)
;; linux-unix-gnu-x86-64 [ 64bit manyargs dload ptables ]
;; compiled 2017-02-19 on yves.more-magic.net (Linux)

running the program below should output this:

;; 2017 klm@kth /s/p/a/a/s/ais-sse ➤ cat weak-test.scm  | csi -p '(read)'
;; (use srfi-18 lolevel)
;; 2017 klm@kth /s/p/a/a/s/ais-sse ➤ csi -s weak-test.scm
;; locking mutex
;; unlocking mutex ...
;; weak locative: #(123)
;; weak locative: #(123)
;; weak locative: #(123)
;; weak locative: #(123)
;; ^C
;; *** user interrupt ***
;;

Where the locative to REFERENCE is kept alive forever. However, it looses track of this REFERENCE:

;; 
;; 2016 klm@kth /s/p/a/a/s/ais-sse ➤ csi -s weak-test.scm
;; locking mutex
;; unlocking mutex ...
;; weak locative: #(123)
;; weak locative: #f
;; weak locative: #f
;; weak locative: #f
;; ^C
;; *** user interrupt ***

There are several ways you can make this program work:

  • compile it and run it
  • remove the srfi-13 import (it's not used)
  • assign the first thread to a top variable

Here's the program in question that doesn't work:

(use srfi-13 ;; <-- killer import (any import will breaks things, try srfi-4 too for example)
     srfi-18 lolevel)

(define wl #f)

(thread-start!
 (lambda ()
   (define mutex (make-mutex))
   (define cv (make-condition-variable))
   (print "locking mutex")
   (mutex-lock! mutex)
   (let ((REFERENCE (vector 123)))
     (set! wl (make-weak-locative REFERENCE))
     (print "unlocking mutex ...")
     (mutex-unlock! mutex cv)
     (print "unlocked!")
     (print "DONE (REFERENCE may now be out of scope) " REFERENCE))))

(thread-start!
 (lambda ()
   (let loop ()
     (print "weak locative: " (locative->object wl))
     (thread-sleep! 1)
     (gc #t)
     (loop))))

(thread-sleep! 10)

Change History (7)

comment:1 Changed 4 months ago by sjamaan

From a quick first analysis, it looks like the entire thread is being garbage collected; ##sys#all-threads does not contain a reference to it, and the main thread doesn't have a reference either.

Perhaps we should create a list of threads blocked on mutexes somewhere?

comment:2 Changed 4 months ago by kristianlm

Like sjamaan suggested on #chicken, a finalizer may help to debug things:

(set-finalizer!
 (thread-start!
  (lambda ()
    (define mutex (make-mutex))
    (define cv (make-condition-variable))
    (print "locking mutex")
    (mutex-lock! mutex)
    (let ((REFERENCE (vector 1 2 3)))
      (set! wl (make-weak-locative REFERENCE))
      (print "unlocking mutex ...")
      (mutex-unlock! mutex cv)
      (print "unlocked!")
      (print "DONE (REFERENCE may now be out of scope) " REFERENCE))))
 (lambda (x)
   (print "THREAD COLLECTED!")))

On my machine, the finalizer gets called before REFERENCE is lost.

comment:3 Changed 3 months ago by kristianlm

After some thinking, I realize this bug-report should be the other way around. Just like sjamaan mentioned on #chicken, he wasn't convinced about the correct behaviour. My original claim, running the program below should output this:, is wrong.

The correct behaviour is that the thread and its REFERENCE gets garbage collected and the weak-reference returning #f. Because nobody is holding onto the condition-variable, the thread can never resume, and it can and should disappear from memory! This allows me to do cool things like this!

So, that leaves us with why the thread dot not always get garbage-collected. I can't trigger the following finalizers consistently:

(use srfi-18 miscmacros)

(repeat 1
 (set-finalizer!
  (thread-start! (lambda () (mutex-unlock! (make-mutex) (make-condition-variable))))
  (lambda (x) (print "blocked thread gced!")))

 (set-finalizer!
  (thread-start! (lambda () #f))
  (lambda (x) (print "exited thread gced!")))

 (set-finalizer!
  (vector 1 2 3)
  (lambda (x) (print "vector gced"))))

(thread-sleep! 1) ;; ensure other thread blocks on cv
(gc #t)           ;; major gc and run all finalizers
(thread-sleep! 1) ;; ensure finalizers get run

That gives me (on CHICKEN 4.12 at least):

$ csc /tmp/finalizer-test.scm  ; and /tmp/finalizer-test
vector gced
exited thread gced!

No blocked thread gced! :-( However, if I change to (repeat 1000 ...) , the blocked thread finalizers also run, so that's great! But why won't it run with only 1 blocked thread started? I think this behaviour should be consistent.

Last edited 3 months ago by kristianlm (previous) (diff)

comment:4 Changed 3 months ago by sjamaan

Interestingly, in a DEBUGBUILD CHICKEN, I see the blocked thread exit as well.

comment:5 Changed 3 months ago by sjamaan

Found the reason: The call chain holds a reference to the thread in which a procedure call occurs. C_trace will grab the currently running thread from ##sys#current-thread (as current_thread_symbol in runtime.c).

So the mere fact that this first thread calls make-condition and mutex-lock! causes it to stick around.

This can be verified in several ways:

  • If you add a procedure call to one of the other threads (instead of just returning a bare literal), you'll notice that that thread won't be garbage collected either. In the interpreter, a simple change of #f to (void) will do. When you compile the program you need something more elaborate due to optimisations, or use -O0 to avoid that.
  • If you run the application with -:a3 (it won't go any lower), the trace buffer won't have this reference anymore and you'll see that the finalizer is run.
  • If you compile with -d0, the C_trace() calls are omitted and the threads will all be cleaned up as you expect.
  • If the program does a lot more to fill up the trace buffer, as you have noticed yourself, the thread will eventually drop out of the trace buffer and the finalizer can trigger.

So all in all, I think this isn't a bug: the system works perfectly, it's just that there's a reference still holding on to the thread which you weren't expecting.

Having said that, perhaps we can store just the thread's name rather than the complete thread object, that would allow us to collect it earlier.

comment:6 Changed 3 months ago by sjamaan

  • Estimated difficulty set to easy
  • Milestone changed from 4.13.0 to 5.0
  • Priority changed from major to minor

Changing the trace buffer representation is not something we should do in CHICKEN 4, I think.

Last edited 3 months ago by sjamaan (previous) (diff)

comment:7 Changed 3 months ago by sjamaan

  • Summary changed from weak-locatives references return #f when it shouldn't to trace buffer unnecessarily holds on to thread objects
Note: See TracTickets for help on using tickets.