guile-devel
From: Andy Wingo
Subject: wip-threads-and-fork
Date: Wed, 08 Feb 2012 23:10:47 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.3 (gnu/linux)

Hi,

[Copying Bruno for an iconv question; see the end]

I was testing out the threaded web server and it was working well.  Then
I tried it out with tekuti, a blog engine that uses git as the backend.
It uses (ice-9 popen) to talk to the git binaries.  It was deadlocking
and segfaulting and all kinds of things!

The reason is that when you fork(), only the thread that calls fork()
survives into the child.  If another thread was in a critical section --
i.e., held a mutex -- it just stops.  The mutex remains taken, and
nothing will ever unlock it.  (Or is it in an undefined state?
Standards lawyers are welcome to weigh in here.)  Of course whatever data
structures the threads are working on are in whatever inconsistent state
they were in as well.
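
To make the failure mode concrete, here is a minimal sketch in plain
pthreads (nothing Guile-specific).  The names demo_lock, holder and
child_sees_locked_mutex are made up for illustration.  A helper thread
holds a mutex while the main thread forks, and the child then probes the
mutex with pthread_mutex_trylock.  On glibc I would expect the child to
see EBUSY; what POSIX actually guarantees here is less clear, so treat
the result as an observation, not a promise.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for any lock protecting shared state.  */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t holder_ready, holder_release;

/* Helper thread: sits inside the critical section until released.  */
static void *holder(void *arg)
{
  (void) arg;
  pthread_mutex_lock(&demo_lock);
  sem_post(&holder_ready);
  sem_wait(&holder_release);
  pthread_mutex_unlock(&demo_lock);
  return NULL;
}

/* Returns 1 if the child found the mutex still locked, 0 if it did
   not, and -1 on setup errors.  */
int child_sees_locked_mutex(void)
{
  pthread_t t;
  int status = 0;

  sem_init(&holder_ready, 0, 0);
  sem_init(&holder_release, 0, 0);
  if (pthread_create(&t, NULL, holder, NULL) != 0)
    return -1;
  sem_wait(&holder_ready);          /* the helper now owns demo_lock */

  pid_t pid = fork();               /* only this thread exists in the child */
  if (pid == 0)
    /* Child: the owner thread is gone, so nobody will ever unlock.
       A blocking pthread_mutex_lock here would hang forever.  */
    _exit(pthread_mutex_trylock(&demo_lock) == EBUSY ? 1 : 0);

  sem_post(&holder_release);        /* parent: let the helper finish */
  pthread_join(t, NULL);
  if (pid < 0 || waitpid(pid, &status, 0) < 0)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses trylock only so that the demo terminates; real code in the
child would simply block.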

This is a problem for Guile, even in the (ice-9 popen) case in which we
try to do the minimal amount of Schemely things before calling exec().
In particular there is the symbol table, which gets new things interned
due to make-prompt-tag (clearly prompt tags should not be symbols), and
there is the GC allocation lock, and there is the ports table, and the
mutexes on the individual ports (in master).

The solution is, besides just avoiding fork() and threads, to take locks
on all interesting data structures when you fork().  Fortunately there
are not too many, and most locks are not nested, so it seems to be a
doable thing.  In wip-threads-and-fork, I added a scm_c_atfork interface
to define functions to call before and after a fork.  It's like
pthread_atfork, though without the separate functions for parent and
child (is that needed?), and with the ability to have user data.  Also
the allocation lock is taken last.
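
For readers who have not used this pattern, here is a rough sketch of
the idea using plain pthread_atfork, since the exact signature of
scm_c_atfork is not spelled out here.  The names table_lock, worker and
child_can_lock_after_atfork are hypothetical.  The prepare handler takes
the lock before the fork, so no other thread can be inside the critical
section at that moment, and both parent and child release it afterwards.

```c
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical lock standing in for e.g. a symbol or ports table lock.  */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_once_t atfork_once = PTHREAD_ONCE_INIT;

static void lock_table(void)   { pthread_mutex_lock(&table_lock); }
static void unlock_table(void) { pthread_mutex_unlock(&table_lock); }

static void install_handlers(void)
{
  /* prepare: take the lock; parent and child: drop it again.  */
  pthread_atfork(lock_table, unlock_table, unlock_table);
}

/* Worker: holds the lock for a short while, as a busy thread would.  */
static void *worker(void *arg)
{
  struct timespec pause = { 0, 50 * 1000 * 1000 };   /* 50 ms */
  pthread_mutex_lock(&table_lock);
  sem_post((sem_t *) arg);
  nanosleep(&pause, NULL);
  pthread_mutex_unlock(&table_lock);
  return NULL;
}

/* Returns 1 if the child could take the lock, 0 if not, -1 on error.  */
int child_can_lock_after_atfork(void)
{
  pthread_t t;
  sem_t ready;
  int status = 0;

  pthread_once(&atfork_once, install_handlers);
  sem_init(&ready, 0, 0);
  if (pthread_create(&t, NULL, worker, &ready) != 0)
    return -1;
  sem_wait(&ready);                 /* the worker owns table_lock */

  pid_t pid = fork();               /* waits in lock_table for the worker */
  if (pid == 0)
    _exit(pthread_mutex_trylock(&table_lock) == 0 ? 1 : 0);

  pthread_join(t, NULL);
  sem_destroy(&ready);
  if (pid < 0 || waitpid(pid, &status, 0) < 0)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With several locks, the order in which the prepare handlers run matters
for the same reason any lock ordering does, which is why it helps to be
able to say which lock goes last.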

I also added some CLOEXEC-related hacks to that branch.  We'll need a
new gnulib for accept4.
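
As background, not a description of what the branch does: the portable
way to mark a descriptor close-on-exec is a pair of fcntl calls, as in
the sketch below.  The helper names set_cloexec and cloexec_demo are
made up for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Set FD_CLOEXEC on fd, preserving other descriptor flags.
   Returns 0 on success, -1 on error.  */
int set_cloexec(int fd)
{
  int flags = fcntl(fd, F_GETFD);
  if (flags < 0)
    return -1;
  return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

/* Returns 1 if the flag is set on a fresh pipe end after set_cloexec,
   0 if it is not, -1 if the pipe could not be created.  */
int cloexec_demo(void)
{
  int fds[2];
  if (pipe(fds) != 0)
    return -1;
  int ok = set_cloexec(fds[0]) == 0
           && (fcntl(fds[0], F_GETFD) & FD_CLOEXEC) != 0;
  close(fds[0]);
  close(fds[1]);
  return ok;
}
```

In a threaded program there is a window between creating the descriptor
and the fcntl call in which another thread can fork and exec, leaking
the descriptor.  Calls such as accept4 with SOCK_CLOEXEC close that
window by setting the flag atomically at creation time.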

Finally, there was some mess with iconv().  iconv_open() is thread-safe,
but on glibc it loads gconv modules while holding a lock.  It could be that
another thread was in iconv_open (and holding the gconv lock) when the
fork happens, preventing the child from doing scm_to_locale_string to
produce the argv for the execlp.  To fix this, I added locking around
all iconv usage, including indirect usage via libunistring.  This is
pretty nasty.  I guess the Right Thing would be a pthread_atfork()
within gconv; Bruno, is that correct?
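
The locking amounts to something like the following sketch.  It is not
the code from the branch: iconv_lock, locked_convert and convert_demo
are hypothetical names, and a single global mutex is assumed.  Every
iconv_open/iconv/iconv_close sequence runs with the mutex held, so a
fork that also takes this mutex first can never catch another thread
inside gconv.

```c
#include <iconv.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical global lock serializing all iconv usage.  */
static pthread_mutex_t iconv_lock = PTHREAD_MUTEX_INITIALIZER;

/* Convert in_len bytes at `in` from encoding `from` to encoding `to`.
   Returns the number of bytes written to `out`, or -1 on failure.  */
long locked_convert(const char *to, const char *from,
                    char *in, size_t in_len,
                    char *out, size_t out_len)
{
  long written = -1;

  pthread_mutex_lock(&iconv_lock);
  iconv_t cd = iconv_open(to, from);
  if (cd != (iconv_t) -1)
    {
      char *inp = in, *outp = out;
      size_t in_left = in_len, out_left = out_len;

      if (iconv(cd, &inp, &in_left, &outp, &out_left) != (size_t) -1
          && in_left == 0)
        written = (long) (out_len - out_left);
      iconv_close(cd);
    }
  pthread_mutex_unlock(&iconv_lock);
  return written;
}

/* Small usage example: convert the four bytes of "exec".  */
long convert_demo(void)
{
  char in[] = "exec";
  char out[16];
  return locked_convert("UTF-8", "ASCII", in, sizeof in - 1,
                        out, sizeof out);
}
```

The cost is that all conversions are serialized, including the ones a
library performs on your behalf, which is part of why this feels nasty.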

I'm hesitant to do much threading-related work on stable-2.0, as master
has a much better story there.  Dunno.

Anyway, thoughts welcome.  With that patch series and
wip-threaded-web-server, I am able to happily hammer a test tekuti
instance without problems.  I'll try it out in production shortly.
Fingers crossed!

Cheers,

Andy
-- 
http://wingolog.org/


