guile-devel

Re: hacking on 1.7 threads


From: Julian Graham
Subject: Re: hacking on 1.7 threads
Date: Sat, 6 Nov 2004 23:30:42 -0500

Hi everyone,
  I'm attaching a patch (against HEAD, created in guile/ via 'cvs diff
-Nau') that represents the current state of my work on thread
cancellation (except that I removed the cancellation-disabling stuff
I'd added temporarily to gc.c; I wasn't super confident that it had
any effect).
  I've also attached a little code that demonstrates the functionality
I've added as well as the difficulty I've been having.  Because this
patch doesn't completely work, I haven't included any changes to the
ChangeLog.  If it's not clear from my previous messages to this list
or from my comments in the code exactly how I've implemented any of
this or what it provides, just drop me a line.


Cheers,
Julian


On Sat, 30 Oct 2004 16:45:14 -0400, Julian Graham <address@hidden> wrote:
> Alright, having fought the corruption that seems to occur during
> the cancellation handler for a solid week and a half,
> I'm getting pretty demoralized.  Here's where I am at this point:
> 
> - Realized that the GC must be aware of the list of thread cleanup
> handler expressions and protect them as part of scm_thread_mark
> - The scm_thread data structure removes *itself* from the all_threads
> list once it's finished, so I don't think premature deallocation is a
> problem
> - Realized that the GC might be interrupted by a cancellation signal
> in the middle of a collection, since I'm pretty sure it calls
> functions that are cancellation points for deferred-cancellation POSIX
> threads.  I assume that a half-finished collection could have
> disastrous effects for data consistency, so I've taken the stopgap
> measure of disabling cancellation while scm_igc() is running.
> - It occurs to me that after the cancellation signal is received and a
> bunch of pthreads stuff is unwound to call the pthread cancellation
> handler, the Scheme evaluation environment for that thread may be in
> some unknown state...
> 
> ...which might explain why I've been getting SIGABRTs and SIGSEGVs
> when I call scm_i_eval in my pthread cancellation handler.  Here's a
> characteristic stack trace for a SIGABRT:
> 
> #42 0x40017c2c in ?? ()
> #43 0x40b68228 in ?? ()
> #44 0x40b681f0 in ?? ()
> #45 0x40007def in _dl_lookup_symbol () from /lib/ld-linux.so.2
> #46 0x4008e26c in scm_cons (x=0x806e270, y=0x204) at pairs.c:59
> #47 0x40058c57 in scm_i_eval (exp=0x806e270, env=0x4031dc40) at eval.c:5859
> #48 0x400b4f27 in handler_cancellation (thread=0x80932a8) at threads.c:302
> #49 0x4018303b in __pthread_unwind () from /lib/tls/libpthread.so.0
> #50 0x4017e4a8 in sigcancel_handler () from /lib/tls/libpthread.so.0
> 
> ...with many, many more ?? stack frames and then a SIGABRT in some
> internal libc function.  I can't seem to reproduce the SIGSEGV at the
> moment.  I've tried preserving the current evaluation environment in
> addition to the expression at the time of the 'push' from Scheme code,
> and then evaluating the expression in that saved environment when the
> pthread cancellation handler runs, but that doesn't seem to do much
> good (though it does raise the question: In what environment should
> the cancellation handler expressions be evaluated?  The environment at the
> time they were pushed onto the list?  Or the environment at the time
> the thread received the cancellation signal?  And what should the
> correct error-handling behavior be during evaluation of cleanup
> handler expressions?).
>   So having tried all this and more with no success, I'm kind of at my
> wits' end;  if anyone would like to volunteer to take this code over
> from me (it's like 50-60 lines of new code in threads.c,
> threads-plugin.c, pthreads-threads.c, and a teensy little bit in
> gc.c), I'd be more than happy to comment it up and post the files or a
> patch to HEAD.  Or you can rewrite the whole thing from scratch, since
> my design may be just plain stupid.
> 
> Cheers,
> 
> 
> Julian
> 
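The stopgap described in the quoted message above (disabling cancellation while scm_igc() is running) can be sketched with plain POSIX calls.  This is only a minimal illustration under stated assumptions, not the code from the attached patch: fake_collect, worker, and cancel_waits_for_critical_section are hypothetical names, and the usleep() call merely stands in for whatever cancellation points a real collection might hit.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>

static atomic_int in_critical_section;  /* set once cancellation is disabled */
static atomic_int collection_done;      /* set when the critical section ends */

/* Hypothetical stand-in for a collection.  usleep() is a cancellation
   point, so without the guard in worker() the thread could die here. */
static void fake_collect(void)
{
    usleep(50 * 1000);
    atomic_store(&collection_done, 1);
}

static void *worker(void *arg)
{
    int old_state, ignored;
    (void) arg;

    /* Any cancellation request made from here on stays pending. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
    atomic_store(&in_critical_section, 1);
    fake_collect();
    pthread_setcancelstate(old_state, &ignored);

    /* Cancellation is enabled again, so a pending request is acted on. */
    for (;;) {
        pthread_testcancel();
        usleep(1000);
    }
    return NULL;  /* not reached */
}

/* Returns 1 if the thread was cancelled only after the critical
   section had run to completion. */
int cancel_waits_for_critical_section(void)
{
    pthread_t tid;
    void *result = NULL;
    int rc;

    atomic_store(&in_critical_section, 0);
    atomic_store(&collection_done, 0);

    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    (void) rc;

    /* Wait until the worker is inside the guarded region, then cancel. */
    while (!atomic_load(&in_critical_section))
        usleep(1000);
    pthread_cancel(tid);
    pthread_join(tid, &result);

    return result == PTHREAD_CANCELED && atomic_load(&collection_done) == 1;
}
```

The request issued by pthread_cancel() is not lost while cancellation is disabled; it is held until the state is restored and the thread reaches its next cancellation point.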
> On Sun, 24 Oct 2004 11:29:06 +0200, Mikael Djurfeldt
> <address@hidden> wrote:
> > Note, though, that this is the easy part.  I do expect that there also
> > could arise nasty complications having to do with the order in which
> > things are done at cancellation.  It's for example important that the
> > scm_thread data structure isn't deallocated before the handlers are
> > invoked.  It's also important that the GC is still aware of the thread
> > at that point in time.  It's important that the thread *is* properly
> > deallocated *after* the handlers have run---that kind of stuff.  But
> > maybe there's no problem at all.
>
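The ordering concern quoted above (handlers must run while the thread's data structure is still alive, and deallocation must come afterwards) can be illustrated with the POSIX cleanup-handler calls.  This is a hedged sketch, not Guile's implementation: struct thread_record is a hypothetical stand-in for scm_thread, and record_handler, worker, and handlers_run_before_free are made-up names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-thread record (scm_thread in Guile). */
struct thread_record {
    atomic_int started;
    int handlers_run;
    int order[2];
};

struct handler_arg {
    struct thread_record *rec;
    int id;
};

/* Cleanup handler: logs its id into the (still live) thread record. */
static void record_handler(void *p)
{
    struct handler_arg *h = p;
    h->rec->order[h->rec->handlers_run++] = h->id;
}

static void *worker(void *arg)
{
    struct thread_record *rec = arg;
    struct handler_arg first = { rec, 1 };
    struct handler_arg second = { rec, 2 };

    pthread_cleanup_push(record_handler, &first);
    pthread_cleanup_push(record_handler, &second);
    atomic_store(&rec->started, 1);
    for (;;) {                 /* wait here until cancelled */
        pthread_testcancel();
        usleep(1000);
    }
    pthread_cleanup_pop(0);    /* not reached; push and pop must pair up */
    pthread_cleanup_pop(0);
    return NULL;
}

/* Returns 1 if both handlers ran before the record was freed;
   order_out receives the handler ids in the order they ran. */
int handlers_run_before_free(int order_out[2])
{
    struct thread_record *rec = calloc(1, sizeof *rec);
    pthread_t tid;
    void *result = NULL;
    int ok;

    if (rec == NULL)
        return 0;
    if (pthread_create(&tid, NULL, worker, rec) != 0) {
        free(rec);
        return 0;
    }
    while (!atomic_load(&rec->started))
        usleep(1000);
    pthread_cancel(tid);
    pthread_join(tid, &result);   /* handlers have finished by now */

    ok = result == PTHREAD_CANCELED && rec->handlers_run == 2;
    order_out[0] = rec->order[0];
    order_out[1] = rec->order[1];
    free(rec);                    /* deallocate only after the handlers ran */
    return ok;
}
```

Here the record is freed by the joining thread, so the handlers always see live data.  Handlers run in the reverse of the order in which they were pushed.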

Attachment: thread-cancellation-HEAD.patch
Description: Text Data

Attachment: thread-cancellation-test.scm
Description: Text Data

