guile-devel

Terrific Dead Lock


From: Ludovic Courtès
Subject: Terrific Dead Lock
Date: Thu, 13 Mar 2008 23:29:56 +0100
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1 (gnu/linux)

Hello,

I'm experiencing a deadlock while running the test suite (in a NixOS
build), and I don't remember ever seeing it before.  Sorry for the long
copy/paste, but it helped me understand the problem as I was writing
this message.

Here we go:

(gdb) info threads 
* 3 Thread 0x40b70b90 (LWP 6675)  0xffffe410 in ?? ()
  2 Thread 0x416d3b90 (LWP 6853)  0xffffe410 in ?? ()
  1 Thread 0x402da8d0 (LWP 5049)  0xffffe410 in ?? ()

(gdb) thread 1
[Switching to thread 1 (Thread 0x402da8d0 (LWP 5049))]#0  0xffffe410 in ?? ()
(gdb) bt
#0  0xffffe410 in ?? ()
#1  0xbfbc3e58 in ?? ()
#2  0x00000002 in ?? ()
#3  0x00000080 in ?? ()
#4  0x401912b9 in __lll_lock_wait () from 
/nix/store/zahfcxzylmadvaj865j5xmm1dsvs03r7-glibc-2.7/lib/libpthread.so.0
#5  0x4018c9d6 in _L_lock_95 () from 
/nix/store/zahfcxzylmadvaj865j5xmm1dsvs03r7-glibc-2.7/lib/libpthread.so.0
#6  0x4018c3ba in pthread_mutex_lock () from 
/nix/store/zahfcxzylmadvaj865j5xmm1dsvs03r7-glibc-2.7/lib/libpthread.so.0
#7  0x400bb6fb in scm_i_thread_put_to_sleep () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#8  0x40069159 in scm_i_gc () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#9  0x4006afbe in increase_mtrigger () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#10 0x4009d8be in scm_make_srcprops () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#11 0x400977d9 in scm_read_sexp () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#12 0x4009672f in scm_read_expression () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#13 0x40097622 in scm_read_sexp () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#14 0x4009672f in scm_read_expression () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#15 0x4009769e in scm_read_sexp () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#16 0x4009672f in scm_read_expression () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#17 0x4009769e in scm_read_sexp () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#18 0x4009672f in scm_read_expression () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#19 0x4007d8da in scm_primitive_load () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#20 0x40062ed3 in ceval () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#21 0x4004dc2b in scm_start_stack () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#22 0x4004e3a1 in scm_m_start_stack () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#23 0x4005cb71 in scm_apply () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#24 0x40061a15 in ceval () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#25 0x400617bd in scm_call_0 () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#26 0x400664ad in apply_thunk () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#27 0x4006668e in scm_c_with_fluid () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#28 0x400666e5 in scm_with_fluid () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#29 0x40062093 in ceval () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#30 0x400617bd in scm_call_0 () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#31 0x40051e98 in scm_dynamic_wind () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#32 0x40062093 in ceval () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#33 0x400617bd in scm_call_0 () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#34 0x400664ad in apply_thunk () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#35 0x4006668e in scm_c_with_fluid () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#36 0x400666e5 in scm_with_fluid () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#37 0x40062093 in ceval () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#38 0x40064bb6 in call_closure_1 () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#39 0x4005d48e in scm_for_each () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#40 0x40062eba in ceval () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#41 0x40063156 in ceval () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#42 0x40063a79 in ceval () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#43 0x400648da in scm_primitive_eval_x () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#44 0x40064935 in scm_eval_x () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#45 0x4009a021 in scm_shell () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#46 0x4007a546 in invoke_main_func () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#47 0x4004c492 in c_body () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#48 0x400bdbd9 in scm_c_catch () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#49 0x4004ca02 in scm_i_with_continuation_barrier () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#50 0x4004cae3 in scm_c_with_continuation_barrier () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#51 0x400bcd79 in scm_i_with_guile_and_parent () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#52 0x400bce6e in scm_with_guile () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#53 0x4007a4df in scm_boot_guile () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#54 0x08048a06 in main ()

(gdb) thread 2
[Switching to thread 2 (Thread 0x416d3b90 (LWP 6853))]#0  0xffffe410 in ?? ()
(gdb) bt
#0  0xffffe410 in ?? ()
#1  0x416d31a8 in ?? ()
#2  0x00000002 in ?? ()
#3  0x00000080 in ?? ()
#4  0x401912b9 in __lll_lock_wait () from 
/nix/store/zahfcxzylmadvaj865j5xmm1dsvs03r7-glibc-2.7/lib/libpthread.so.0
#5  0x4018c9e4 in _L_lock_236 () from 
/nix/store/zahfcxzylmadvaj865j5xmm1dsvs03r7-glibc-2.7/lib/libpthread.so.0
#6  0x4018c43b in pthread_mutex_lock () from 
/nix/store/zahfcxzylmadvaj865j5xmm1dsvs03r7-glibc-2.7/lib/libpthread.so.0
#7  0x400bdbed in scm_c_catch () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#8  0x4004ca02 in scm_i_with_continuation_barrier () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#9  0x4004cae3 in scm_c_with_continuation_barrier () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#10 0x400bcd79 in scm_i_with_guile_and_parent () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#11 0x400bce6e in scm_with_guile () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#12 0x400bcec3 in on_thread_exit () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#13 0x40189dc0 in __nptl_deallocate_tsd () from 
/nix/store/zahfcxzylmadvaj865j5xmm1dsvs03r7-glibc-2.7/lib/libpthread.so.0
#14 0x4018a189 in start_thread () from 
/nix/store/zahfcxzylmadvaj865j5xmm1dsvs03r7-glibc-2.7/lib/libpthread.so.0
#15 0x40264dae in clone () from 
/nix/store/zahfcxzylmadvaj865j5xmm1dsvs03r7-glibc-2.7/lib/libc.so.6

(gdb) thread 3
[Switching to thread 3 (Thread 0x40b70b90 (LWP 6675))]#0  0xffffe410 in ?? ()
(gdb) bt
#0  0xffffe410 in ?? ()
#1  0x40b6ff78 in ?? ()
#2  0x00000001 in ?? ()
#3  0x40b7005b in ?? ()
#4  0x401916cb in read () from 
/nix/store/zahfcxzylmadvaj865j5xmm1dsvs03r7-glibc-2.7/lib/libpthread.so.0
#5  0x400988f3 in do_read_without_guile () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#6  0x400bb7cc in scm_without_guile () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#7  0x40098855 in signal_delivery_thread () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#8  0x400bdbd9 in scm_c_catch () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#9  0x400bdde9 in scm_internal_catch () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#10 0x400bca4d in really_spawn () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#11 0x4004c492 in c_body () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#12 0x400bdbd9 in scm_c_catch () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#13 0x4004ca02 in scm_i_with_continuation_barrier () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#14 0x4004cae3 in scm_c_with_continuation_barrier () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#15 0x400bcd79 in scm_i_with_guile_and_parent () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#16 0x400bcddf in spawn_thread () from 
/tmp/nix-5221-14/guile-1.8.4/libguile/.libs/libguile.so.17
#17 0x4018a17b in start_thread () from 
/nix/store/zahfcxzylmadvaj865j5xmm1dsvs03r7-glibc-2.7/lib/libpthread.so.0
#18 0x40264dae in clone () from 
/nix/store/zahfcxzylmadvaj865j5xmm1dsvs03r7-glibc-2.7/lib/libc.so.6

All this apparently happens while reading `unif.test' (which comes right
after `time.test'):

$ sudo tail -n 3 /tmp/nix-5221-14/guile-1.8.4/check-guile.log 
PASS: time.test: strptime: in another thread after error
PASS: time.test: strptime: GNU %s format: gmtoff on GMT
PASS: time.test: strptime: GNU %s format: gmtoff on EST+5


To summarize:

  * Thread 2 is exiting.  It holds THREAD_ADMIN_MUTEX (it acquired it at
    the beginning of `do_thread_exit ()') and is waiting on
    SCM_I_CRITICAL_SECTION_MUTEX in `scm_c_catch ()'.

  * Thread 1 is reading and, in the process, GC'ing.  It's trying to
    acquire THREAD_ADMIN_MUTEX in `scm_i_thread_put_to_sleep ()'.  It
    holds SCM_I_CRITICAL_SECTION_MUTEX from `scm_make_srcprops ()'.
    
One might wonder: why the heck does `scm_make_srcprops ()' enter a
critical section?  Could it just use a private mutex to protect accesses
to `srcprops_freelist'?

Han-Wen's reimplementation of it in HEAD (2007-01-19) doesn't use a
critical section, nor a mutex, but is thread-safe AFAIUI.

Two possibilities to fix it:

  1. Copy `srcprop.[ch]' and `eval.c' bits from HEAD to 1.8.  After all,
     it's probably solid enough (I use almost only HEAD).  See [0] for
     an overview of the initial patch.  It doesn't break the public API
     nor the ABI, but it (re)moves stuff from `srcprop.h'.

  2. Remove the critical section from 1.8 and synchronize accesses to
     `srcprops_freelist' with a private mutex, assuming that's a correct
     fix.
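
For illustration, the second option could look roughly like the sketch
below.  It is not the actual `srcprop.c' code: `struct node',
`freelist', `freelist_mutex', `node_alloc ()' and `node_release ()' are
hypothetical names, and plain `malloc ()' stands in for Guile's
allocator.  The point it tries to show is that the private mutex only
covers the list manipulation, so nothing that could trigger a GC runs
while it is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical node type; the real srcprops layout differs.  */
struct node
{
  struct node *next;
  int line;
  int column;
};

static struct node *freelist = NULL;
static pthread_mutex_t freelist_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Pop a node from the free list, or allocate a fresh one.  */
static struct node *
node_alloc (void)
{
  struct node *n;

  pthread_mutex_lock (&freelist_mutex);
  n = freelist;
  if (n != NULL)
    freelist = n->next;
  pthread_mutex_unlock (&freelist_mutex);

  /* Allocate only after dropping the mutex, so that no other lock is
     ever requested while FREELIST_MUTEX is held.  */
  if (n == NULL)
    n = malloc (sizeof *n);
  if (n != NULL)
    n->next = NULL;
  return n;
}

/* Push a node back onto the free list.  */
static void
node_release (struct node *n)
{
  pthread_mutex_lock (&freelist_mutex);
  n->next = freelist;
  freelist = n;
  pthread_mutex_unlock (&freelist_mutex);
}
```

Because the mutex is never held across a call that takes another lock,
it cannot take part in a cycle like the one above.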

I'd be in favor of the first approach.

Comments?

Thanks,
Ludovic.

[0] http://thread.gmane.org/gmane.lisp.guile.devel/6439
