emacs-bug-tracker
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#40981: closed (shepherd 0.8.0 race condition can lead to stopping it


From: GNU bug Tracking System
Subject: bug#40981: closed (shepherd 0.8.0 race condition can lead to stopping itself)
Date: Mon, 11 May 2020 21:10:02 +0000

Your message dated Mon, 11 May 2020 23:09:05 +0200
with message-id <address@hidden>
and subject line Re: bug#40981: Graphical installer tests sometimes hang.
has caused the debbugs.gnu.org bug report #40981,
regarding shepherd 0.8.0 race condition can lead to stopping itself
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden.)


-- 
40981: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=40981
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message --- Subject: Graphical installer tests sometimes hang. Date: Thu, 30 Apr 2020 13:51:49 +0200 User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
Hello,

Graphical installer tests sometimes hang, before starting marionette
tests. See for instance:
https://ci.guix.gnu.org/log/d31s9sycixhvfak5lpzdg0mzvz5aa9av-gui-installed-os-encrypted.

Restarting tests just after a hang (on the same installed image),
sometimes work. I removed the "quiet" kernel argument to see what's
going on.

--8<---------------cut here---------------start------------->8---
[    0.862608] Freeing unused kernel image memory: 1964K
[    0.863177] Run /init as init process
GC Warning: pthread_getattr_np or pthread_attr_getstack failed for main thread
GC Warning: Couldn't read /proc/stat
Welcome, this is GNU's early boot Guile.
Use '--repl' for an initrd REPL.

loading kernel modules...
[    0.915640] usbcore: registered new interface driver usb-storage
[    0.917800] usbcore: registered new interface driver uas
[    0.919569] hidraw: raw HID events driver (C) Jiri Kosina
[    0.920519] usbcore: registered new interface driver usbhid
[    0.921177] usbhid: USB HID core driver
[    0.933506] isci: Intel(R) C600 SAS Controller Driver - version 1.2.0
[    0.951303] PCI Interrupt Link [LNKD] enabled at IRQ 11
[    0.970144] PCI Interrupt Link [LNKA] enabled at IRQ 10
[    0.974033] virtio_blk virtio1: [vda] 4505600 512-byte logical blocks (2.31 
GB/2.15 GiB)
[    0.976186]  vda: vda1 vda2

;; hanging here.
--8<---------------cut here---------------end--------------->8---

It seems that the boot freezes, soon after the initrd is started, and
before loading the boot script.

Mathieu



--- End Message ---
--- Begin Message --- Subject: Re: bug#40981: Graphical installer tests sometimes hang. Date: Mon, 11 May 2020 23:09:05 +0200 User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
Hello,

Mathieu Othacehe <address@hidden> skribis:

>>> The work-around here is to save the installed SIGTERM handler and reset
>>> it. Then, after forking, the parent can restore the SIGTERM handler. The
>>> child will use the default SIGTERM handler that terminates the process.
>>
>> OK, makes sense.  (Another occasion to see how something like
>> ‘posix_spawn’ would be more appropriate than fork + exec…)
>
> Didn't know about that function, but it seems way easier to manipulate
> and less error prone indeed!

Make sure to read “A fork() in the Road” on that topic:

  https://lwn.net/Articles/785430/

>>> +# Try to trigger eventual race conditions, when killing a process between 
>>> fork
>>> +# and execv calls.
>>> +for i in {1..50}
>>> +do
>>> +    $herd restart test3
>>> +done
>>
>> Would it reproduce the problem well enough?
>
> On a slow machine one time out of two, and on a faster machine,
> never. The 'reproducer' I used, was to add a 'sleep' between fork and
> exec, it works way better!
>
> Tell me if you think its better to drop it.

It’s better than nothing, let’s keep it.

>>From 79d3603bf15b8f815136178be8c8a236734a7596 Mon Sep 17 00:00:00 2001
> From: Mathieu Othacehe <address@hidden>
> Date: Thu, 7 May 2020 18:39:41 +0200
> Subject: [PATCH] service: Fix 'make-kill-destructor' when PGID is zero.
>
> When a process is forked, and before its GID is changed in "exec-command",
> it will share the parent GID, which is 0 for Shepherd. In that case, use
> the PID instead of the PGID.
>
> Also make sure to reset the SIGTERM handler before forking a process. Failing
> to do so, will result in stopping Shepherd if a SIGTERM is received between
> fork and execv calls. Restore the SIGTERM handler once the process has been
> forked.
>
> * modules/shepherd/service.scm (fork+exec-command): Save the current SIGTERM
> handler and reset it before forking. Then, restore it on the parent after
> forking.
> (make-kill-destructor): Handle the case when PGID is zero, between the process
> fork and exec.

I added a “Fixes” line and pushed it.

Thanks a lot!

I can roll a 0.8.1 release soonish (I’d like to add signalfd support
while at it, we’ll see.)

Ludo’.


--- End Message ---

reply via email to

[Prev in Thread] Current Thread [Next in Thread]