Opened 8 years ago

Closed 7 years ago

#1356 closed defect (fixed)

trace buffer unnecessarily holds on to thread objects

Reported by: kristianlm Owned by: felix winkelmann
Priority: minor Milestone: 5.0
Component: unknown Version: 4.12.0
Keywords: Cc:
Estimated difficulty: medium

Description

version:

;; 2015 klm@kth /s/p/a/a/s/ais-sse ➤ csi -version
;; 
;; CHICKEN
;; (c) 2008-2017, The CHICKEN Team
;; (c) 2000-2007, Felix L. Winkelmann
;; Version 4.12.0 (rev 6ea24b6)
;; linux-unix-gnu-x86-64 [ 64bit manyargs dload ptables ]
;; compiled 2017-02-19 on yves.more-magic.net (Linux)

running the program below should output this:

;; 2017 klm@kth /s/p/a/a/s/ais-sse ➤ cat weak-test.scm  | csi -p '(read)'
;; (use srfi-18 lolevel)
;; 2017 klm@kth /s/p/a/a/s/ais-sse ➤ csi -s weak-test.scm
;; locking mutex
;; unlocking mutex ...
;; weak locative: #(123)
;; weak locative: #(123)
;; weak locative: #(123)
;; weak locative: #(123)
;; ^C
;; *** user interrupt ***
;;

Where the locative to REFERENCE is kept alive forever. However, it looses track of this REFERENCE:

;; 
;; 2016 klm@kth /s/p/a/a/s/ais-sse ➤ csi -s weak-test.scm
;; locking mutex
;; unlocking mutex ...
;; weak locative: #(123)
;; weak locative: #f
;; weak locative: #f
;; weak locative: #f
;; ^C
;; *** user interrupt ***

There are several ways you can make this program work:

  • compile it and run it
  • remove the srfi-13 import (it's not used)
  • assign the first thread to a top variable

Here's the program in question that doesn't work:

(use srfi-13 ;; <-- killer import (any import will breaks things, try srfi-4 too for example)
     srfi-18 lolevel)

(define wl #f)

(thread-start!
 (lambda ()
   (define mutex (make-mutex))
   (define cv (make-condition-variable))
   (print "locking mutex")
   (mutex-lock! mutex)
   (let ((REFERENCE (vector 123)))
     (set! wl (make-weak-locative REFERENCE))
     (print "unlocking mutex ...")
     (mutex-unlock! mutex cv)
     (print "unlocked!")
     (print "DONE (REFERENCE may now be out of scope) " REFERENCE))))

(thread-start!
 (lambda ()
   (let loop ()
     (print "weak locative: " (locative->object wl))
     (thread-sleep! 1)
     (gc #t)
     (loop))))

(thread-sleep! 10)

Attachments (1)

0001-Store-thread-names-in-the-trasce-buffer-not-threads.patch (3.5 KB) - added by felix winkelmann 7 years ago.

Download all attachments as: .zip

Change History (14)

comment:1 Changed 8 years ago by sjamaan

From a quick first analysis, it looks like the entire thread is being garbage collected; ##sys#all-threads does not contain a reference to it, and the main thread doesn't have a reference either.

Perhaps we should create a list of threads blocked on mutexes somewhere?

comment:2 Changed 8 years ago by kristianlm

Like sjamaan suggested on #chicken, a finalizer may help to debug things:

(set-finalizer!
 (thread-start!
  (lambda ()
    (define mutex (make-mutex))
    (define cv (make-condition-variable))
    (print "locking mutex")
    (mutex-lock! mutex)
    (let ((REFERENCE (vector 1 2 3)))
      (set! wl (make-weak-locative REFERENCE))
      (print "unlocking mutex ...")
      (mutex-unlock! mutex cv)
      (print "unlocked!")
      (print "DONE (REFERENCE may now be out of scope) " REFERENCE))))
 (lambda (x)
   (print "THREAD COLLECTED!")))

On my machine, the finalizer gets called before REFERENCE is lost.

comment:3 Changed 8 years ago by kristianlm

After some thinking, I realize this bug-report should be the other way around. Just like sjamaan mentioned on #chicken, he wasn't convinced about the correct behaviour. My original claim, running the program below should output this:, is wrong.

The correct behaviour is that the thread and its REFERENCE gets garbage collected and the weak-reference returning #f. Because nobody is holding onto the condition-variable, the thread can never resume, and it can and should disappear from memory! This allows me to do cool things like this!

So, that leaves us with why the thread dot not always get garbage-collected. I can't trigger the following finalizers consistently:

(use srfi-18 miscmacros)

(repeat 1
 (set-finalizer!
  (thread-start! (lambda () (mutex-unlock! (make-mutex) (make-condition-variable))))
  (lambda (x) (print "blocked thread gced!")))

 (set-finalizer!
  (thread-start! (lambda () #f))
  (lambda (x) (print "exited thread gced!")))

 (set-finalizer!
  (vector 1 2 3)
  (lambda (x) (print "vector gced"))))

(thread-sleep! 1) ;; ensure other thread blocks on cv
(gc #t)           ;; major gc and run all finalizers
(thread-sleep! 1) ;; ensure finalizers get run

That gives me (on CHICKEN 4.12 at least):

$ csc /tmp/finalizer-test.scm  ; and /tmp/finalizer-test
vector gced
exited thread gced!

No blocked thread gced! :-( However, if I change to (repeat 1000 ...) , the blocked thread finalizers also run, so that's great! But why won't it run with only 1 blocked thread started? I think this behaviour should be consistent.

Last edited 8 years ago by kristianlm (previous) (diff)

comment:4 Changed 8 years ago by sjamaan

Interestingly, in a DEBUGBUILD CHICKEN, I see the blocked thread exit as well.

comment:5 Changed 8 years ago by sjamaan

Found the reason: The call chain holds a reference to the thread in which a procedure call occurs. C_trace will grab the currently running thread from ##sys#current-thread (as current_thread_symbol in runtime.c).

So the mere fact that this first thread calls make-condition and mutex-lock! causes it to stick around.

This can be verified in several ways:

  • If you add a procedure call to one of the other threads (instead of just returning a bare literal), you'll notice that that thread won't be garbage collected either. In the interpreter, a simple change of #f to (void) will do. When you compile the program you need something more elaborate due to optimisations, or use -O0 to avoid that.
  • If you run the application with -:a3 (it won't go any lower), the trace buffer won't have this reference anymore and you'll see that the finalizer is run.
  • If you compile with -d0, the C_trace() calls are omitted and the threads will all be cleaned up as you expect.
  • If the program does a lot more to fill up the trace buffer, as you have noticed yourself, the thread will eventually drop out of the trace buffer and the finalizer can trigger.

So all in all, I think this isn't a bug: the system works perfectly, it's just that there's a reference still holding on to the thread which you weren't expecting.

Having said that, perhaps we can store just the thread's name rather than the complete thread object, that would allow us to collect it earlier.

comment:6 Changed 8 years ago by sjamaan

Estimated difficulty: easy
Milestone: 4.13.05.0
Priority: majorminor

Changing the trace buffer representation is not something we should do in CHICKEN 4, I think.

Last edited 8 years ago by sjamaan (previous) (diff)

comment:7 Changed 8 years ago by sjamaan

Summary: weak-locatives references return #f when it shouldn'ttrace buffer unnecessarily holds on to thread objects

comment:8 Changed 7 years ago by felix winkelmann

Estimated difficulty: easymedium

Basically it is not hard to replace the thread slot in the trace buffer entry with the name (a patch is attached), but the problem is that when obtaining the trace for a particular thread, we want to extract only the entries for this thread. chicken.base#get-call-chain uses the thread slot to distinguish the different currently running threads that have entries in the trace buffer. We could compare by name (this is what the patch does), but if two threads have ther same name then their backtraces will be merged.

That is, we need a way to obtain the identity of a thread without holding the thread itself. An additonal slot with a fixnum id could be enough, for example.

comment:9 Changed 7 years ago by sjamaan

I wasn't aware we were using the thread to construct the call chain, that really changes things (I thought it was only shown, but obviously that was just a brainfart of mine).

Using just the thread name like this seems really wrong (unless we can enforce the name being unique somehow), because the danger is that you'd merge call chains from completely unrelated threads, which could quickly get really confusing.

Here's an idea: Perhaps we could treat the "thread" property in the trace buffer as a weak reference: Simply avoid marking the thread in the GC, and perform a forwarding pointer chase for the live items in the table at the end of a GC. Any dead items should then be marked with #f or something. The get-call-chain already has some code to deal with the possibility that the "thread" object is not a thread (which I don't quite understand: why would you include any trace entries for non-threads in there?), we could extend or tweak that a little.

I know hacking the GC even more is not ideal and the code for dealing with forwarding pointers is rather bulky, but it would provide a conceptually elegant solution to the problem: if a thread has been garbage collected, it means no call chain will ever be requested for it anyway (because there are no references to it anymore). We could even compress the call chain by removing those entries instead of nuking the thread slot (not sure if that's a good idea).

In any case, the precise layout of the call chain is a (mostly?) internal interface (it's undocumented for sure) so I'm not even sure we need to fix this for 5.0. I think we should probably postpone it. Do you agree?

comment:10 Changed 7 years ago by felix winkelmann

Ugh. No, let's not make the GC responsible for that as well, it already has enough special cases as it is.

I think we agree that no user code uses the thread, and we only use it for identity checking. Why not simply add a fixnum id to each thread and use that for identifcation, fed by a global continuously increasing counter?

comment:11 in reply to:  10 Changed 7 years ago by sjamaan

Replying to felix:

Ugh. No, let's not make the GC responsible for that as well, it already has enough special cases as it is.

To get rid of the special case we could pre-allocate a weak locative per trace buffer entry and store the thread in there. Locatives are already handled correctly.

I think we agree that no user code uses the thread, and we only use it for identity checking. Why not simply add a fixnum id to each thread and use that for identifcation, fed by a global continuously increasing counter?

That would require adding an additional slot to the thread, right? If it's always a fixnum that makes me a bit hesitant regarding wraparound (let's say you're quickly creating lots and lots of threads but one is keeping around for longer than all the rest: eventually you'll wrap around and clobber them), or we'd need to use bignums and create ever-increasing numbers.

It's probably not a huge issue, but I'd prefer something we know is stable, like creating an additional gensym per thread, instead. Those are properly garbage collected and guaranteed to be unique, even if the fixnum counter wraps around and we start generating names we've already seen.

comment:12 Changed 7 years ago by felix winkelmann

Owner: set to felix winkelmann
Status: newaccepted

Well, 2 billion threads is a lot... Anyway, gensym sounds good.

comment:13 Changed 7 years ago by sjamaan

Resolution: fixed
Status: acceptedclosed

Fixed with 00d8dcd2425a9eaa31f768fc5eac08555952e89c

Note: See TracTickets for help on using tickets.