bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#36609: 27.0.50; Possible race-condition in threading implementation


From: Eli Zaretskii
Subject: bug#36609: 27.0.50; Possible race-condition in threading implementation
Date: Fri, 12 Jul 2019 10:47:37 +0300

> From: Andreas Politz <politza@hochschule-trier.de>
> Date: Thu, 11 Jul 2019 22:51:10 +0200
> 
> I think there is a race-condition in the implementation of threads.

Not sure what you mean by "race condition".  Since only one Lisp
thread can run at any given time, where could this race happen, and
between what threads?

> I tried to find a minimal test-case, without success.  Thus, I've
> attached a lengthy source-file.  Loading that file should trigger
> this bug and may freeze your session.

FWIW, it doesn't freeze my Emacs here, neither on GNU/Linux nor on
MS-Windows (with today's master).  Are you getting freezes with 100%
reproducibility?  If so, please show all the details of your build, as
collected by report-emacs-bug.

> Indications:
> 
> 1. The main-thread has the name of one of created threads (XEmacs in
>    this case), instead of "emacs".

I don't think this is related to the problem.  I think we have a bug
in the implementation of the 'name' attribute of threads: we call
prctl(PR_SET_NAME), which AFAIU changes the name of the _calling_
thread, and the calling thread at that point is the main thread, see
sys_thread_create.  If I evaluate this:

  (make-thread 'ignore "XEmacs")

and then attach a debugger, I see that the name of the main thread has
changed to "XEmacs".

> 2. Emacs stops processing all keyboard/mouse input while looping in
>    wait_reading_process_output.  Sending commands via emacsclient still
>    works.
> 
> GDB output:
> 
> (gdb) info threads
>   Id   Target Id                                        Frame 
> * 1    Thread 0x7ffff17f5d40 (LWP 26264) "XEmacs"       0x000055555576eac0 in 
> XPNTR (a=XIL(0x7ffff1312533)) at alloc.c:535
>   2    Thread 0x7ffff0ac4700 (LWP 26265) "gmain"        0x00007ffff50d1667 in 
> poll () from /usr/lib/libc.so.6
>   3    Thread 0x7fffebd1a700 (LWP 26266) "gdbus"        0x00007ffff50d1667 in 
> poll () from /usr/lib/libc.so.6
>   4    Thread 0x7fffeb519700 (LWP 26267) "dconf worker" 0x00007ffff50d1667 in 
> poll () from /usr/lib/libc.so.6
> 
> (gdb) bt full

This "bt full" is not enough, because it shows the backtrace of one
thread only, the main thread.  Please show the backtrace of the other
3 threads by typing "thread apply all bt full" instead.

> #5  0x0000555555802a78 in wait_reading_process_output (time_limit=0, nsecs=0, 
> read_kbd=-1, do_display=true, wait_for_cell=XIL(0), wait_proc=0x0, 
> just_wait_proc=0) at process.c:5423
>         process_skipped = false
>         channel = 6
>         nfds = 1
>         Available = {
>           fds_bits = {32, 0 <repeats 15 times>}
>         }

Is the keyboard descriptor's bit in Available set or not?  What is the
contents of the fd_callback_info array at this point?

> ;; -*- lexical-binding: t -*-
> 
> (require 'threads)
> (require 'eieio)
> (require 'cl-lib)
> (require 'ring)
> 
> (defun debug (fmt &rest args)
>   (princ (apply #'format fmt args) #'external-debugging-output)
>   (terpri #'external-debugging-output))

Please describe what this program tries to accomplish, and how.  It's
not easy to reverse-engineer that from 200 lines of non-trivial code.
It's possible that the reason(s) for the freeze will be apparent from
describing your implementation ideas.

Thanks.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]