bug#31925: 'guix substitutes' sometimes hangs on glibc 2.27
From: Andy Wingo
Subject: bug#31925: 'guix substitutes' sometimes hangs on glibc 2.27
Date: Thu, 05 Jul 2018 10:00:52 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)
Hi!
On Thu 05 Jul 2018 05:33, Mark H Weaver <address@hidden> writes:
>> One problem I’ve noticed is that the child process that
>> ‘call-with-decompressed-port’ spawns would be stuck trying to get the
>> allocation lock:
>>
>> So it seems quite clear that the thing has the alloc lock taken. I
>> suppose this can happen if one of the libgc threads runs right when we
>> call fork and takes the alloc lock, right?
>
> Does libgc spawn threads that run concurrently with user threads? If
> so, that would be news to me. My understanding was that incremental
> marking occurs within GC allocation calls, and marking threads are only
> spawned after all user threads have been stopped, but I could be wrong.
I think Mark is correct.
> The first idea that comes to my mind is that perhaps the finalization
> thread is holding the GC allocation lock when 'fork' is called.
I think we all agree that you're only supposed to "fork" when no other
threads are running.
As far as the finalizer thread goes, "primitive-fork" calls
"scm_i_finalizer_pre_fork", which should join the finalizer thread
before the fork. There could be a bug, obviously, but the intention is
for Guile to shut down its internal threads. Here's the body of
primitive-fork, fwiw:
  {
    int pid;
    scm_i_finalizer_pre_fork ();
    if (scm_ilength (scm_all_threads ()) != 1)
      /* Other threads may be holding on to resources that Guile needs --
         it is not safe to permit one thread to fork while others are
         running.

         In addition, POSIX clearly specifies that if a multi-threaded
         program forks, the child must only call functions that are
         async-signal-safe. We can't guarantee that in general. The best
         we can do is to allow forking only very early, before any call to
         sigaction spawns the signal-handling thread. */
      scm_display
        (scm_from_latin1_string
         ("warning: call to primitive-fork while multiple threads are running;\n"
          " further behavior unspecified. See \"Processes\" in the\n"
          " manual, for more information.\n"),
         scm_current_warning_port ());
    pid = fork ();
    if (pid == -1)
      SCM_SYSERROR;
    return scm_from_int (pid);
  }
> Another possibility: both the finalization thread and the signal
> delivery thread call 'scm_without_guile', which calls 'GC_do_blocking',
> which also temporarily grabs the GC allocation lock before calling the
> specified function. See 'GC_do_blocking_inner' in pthread_support.c in
> libgc. You spawn the signal delivery thread by calling 'sigaction' and
> you make work for it to do every second when the SIGALRM is delivered.
The signal thread is a possibility, though in that case you'd get a
warning, since the signal-handling thread appears in scm_all_threads. Do
you see a warning? If you do, that is a problem :)
>> If that is correct, the fix would be to call fork within
>> ‘GC_call_with_alloc_lock’.
>>
>> How does that sound?
>
> Sure, sounds good to me.
I don't think this is necessary. I think the problem is that other
threads are running. If we solve that, then we solve this issue; if we
don't solve that, we don't know what else those threads are doing, so we
don't know what mutexes and other state they might hold.
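To illustrate the general hazard, here is a minimal sketch in plain
pthreads. It is not libgc or Guile code: the mutex is only a stand-in
for the allocation lock, and the names holder and
child_sees_locked_mutex are hypothetical. A second thread takes the
mutex, the main thread forks, and the child checks whether it can
acquire the lock.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a lock such as the GC allocation lock (assumption).  */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int ready_pipe[2];    /* holder -> main: "I hold the lock" */
static int release_pipe[2];  /* main -> holder: "you may unlock"  */

/* Second thread: take the lock and hold it until told to release.  */
static void *holder(void *arg)
{
  char byte = 'x';
  ssize_t n;
  (void) arg;

  pthread_mutex_lock(&lock);
  n = write(ready_pipe[1], &byte, 1);
  n = read(release_pipe[0], &byte, 1);
  (void) n;
  pthread_mutex_unlock(&lock);
  return NULL;
}

/* Returns 1 if the forked child found the mutex already locked,
   0 if the child could take it, and -1 on error.  */
int child_sees_locked_mutex(void)
{
  pthread_t thread;
  char byte = 0;
  int status;
  pid_t pid;

  if (pipe(ready_pipe) != 0 || pipe(release_pipe) != 0)
    return -1;
  if (pthread_create(&thread, NULL, holder, NULL) != 0)
    return -1;
  if (read(ready_pipe[0], &byte, 1) != 1)  /* wait until lock is held */
    return -1;

  pid = fork();
  if (pid == -1)
    return -1;
  if (pid == 0)
    /* Child: only the forking thread exists here, so nobody will ever
       unlock the mutex.  Strictly speaking POSIX only allows
       async-signal-safe calls at this point; trylock is used purely
       for demonstration.  */
    _exit(pthread_mutex_trylock(&lock) == 0 ? 0 : 1);

  /* Parent: let the holder thread finish, then collect the child.  */
  if (write(release_pipe[1], &byte, 1) != 1)
    return -1;
  pthread_join(thread, NULL);
  if (waitpid(pid, &status, 0) != pid)
    return -1;

  close(ready_pipe[0]);
  close(ready_pipe[1]);
  close(release_pipe[0]);
  close(release_pipe[1]);
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Built with -pthread on a glibc system, the function would be expected to
report that the child finds the lock held. A blocking lock call in the
child would hang forever, which is the kind of hang discussed in this
thread.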
Andy