--- Begin Message ---
Subject: Graphical installer tests sometimes hang.
Date: Thu, 30 Apr 2020 13:51:49 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
Hello,
Graphical installer tests sometimes hang, before starting marionette
tests. See for instance:
https://ci.guix.gnu.org/log/d31s9sycixhvfak5lpzdg0mzvz5aa9av-gui-installed-os-encrypted.
Restarting the tests just after a hang (on the same installed image)
sometimes works. I removed the "quiet" kernel argument to see what's
going on.
--8<---------------cut here---------------start------------->8---
[ 0.862608] Freeing unused kernel image memory: 1964K
[ 0.863177] Run /init as init process
GC Warning: pthread_getattr_np or pthread_attr_getstack failed for main thread
GC Warning: Couldn't read /proc/stat
Welcome, this is GNU's early boot Guile.
Use '--repl' for an initrd REPL.
loading kernel modules...
[ 0.915640] usbcore: registered new interface driver usb-storage
[ 0.917800] usbcore: registered new interface driver uas
[ 0.919569] hidraw: raw HID events driver (C) Jiri Kosina
[ 0.920519] usbcore: registered new interface driver usbhid
[ 0.921177] usbhid: USB HID core driver
[ 0.933506] isci: Intel(R) C600 SAS Controller Driver - version 1.2.0
[ 0.951303] PCI Interrupt Link [LNKD] enabled at IRQ 11
[ 0.970144] PCI Interrupt Link [LNKA] enabled at IRQ 10
[ 0.974033] virtio_blk virtio1: [vda] 4505600 512-byte logical blocks (2.31 GB/2.15 GiB)
[ 0.976186] vda: vda1 vda2
;; hanging here.
--8<---------------cut here---------------end--------------->8---
It seems that the boot freezes, soon after the initrd is started, and
before loading the boot script.
Mathieu
--- End Message ---
--- Begin Message ---
Subject: Re: bug#40981: Graphical installer tests sometimes hang.
Date: Mon, 11 May 2020 23:09:05 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
Hello,
Mathieu Othacehe <address@hidden> writes:
>>> The work-around here is to save the installed SIGTERM handler and reset
>>> it. Then, after forking, the parent can restore the SIGTERM handler. The
>>> child will use the default SIGTERM handler that terminates the process.
>>
>> OK, makes sense. (Another occasion to see how something like
>> ‘posix_spawn’ would be more appropriate than fork + exec…)
>
> Didn't know about that function, but it seems way easier to manipulate
> and less error prone indeed!
Make sure to read “A fork() in the Road” on that topic:
https://lwn.net/Articles/785430/
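
To illustrate the point about 'posix_spawn', here is a minimal sketch in
Python rather than Guile. It is not Shepherd code: the function name and
the sleeping child are made up for the example, and it assumes a Unix
system with Python 3.8 or later, where os.posix_spawn is available. The
child is created in one step, with its SIGTERM disposition reset and its
own process group requested up front, so the parent's code never runs in
the child.

```python
import os
import signal
import sys


def spawn_and_terminate():
    """Spawn a sleeping child with posix_spawn, send it SIGTERM, return (pid, status)."""
    # Ignore SIGTERM in the parent. An ignored disposition is inherited
    # across exec unless the child is explicitly told to reset it.
    previous = signal.signal(signal.SIGTERM, signal.SIG_IGN)
    try:
        # Hypothetical child command, used only for this illustration.
        argv = [sys.executable, "-c", "import time; time.sleep(30)"]
        pid = os.posix_spawn(
            argv[0], argv, dict(os.environ),
            setsigdef=[signal.SIGTERM],  # child starts with the default SIGTERM action
            setpgroup=0,                 # child becomes leader of its own process group
        )
        os.kill(pid, signal.SIGTERM)
        _, status = os.waitpid(pid, 0)
    finally:
        # Put the parent's original disposition back.
        signal.signal(signal.SIGTERM, previous)
    return pid, status
```

With these attributes the child should be terminated by SIGTERM even
though the parent ignores that signal.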
>>> +# Try to trigger eventual race conditions, when killing a process between fork
>>> +# and execv calls.
>>> +for i in {1..50}
>>> +do
>>> + $herd restart test3
>>> +done
>>
>> Would it reproduce the problem well enough?
>
> On a slow machine, one time out of two; on a faster machine,
> never. The 'reproducer' I used was to add a 'sleep' between fork and
> exec; it works way better!
>
> Tell me if you think it's better to drop it.
It’s better than nothing, let’s keep it.
> From 79d3603bf15b8f815136178be8c8a236734a7596 Mon Sep 17 00:00:00 2001
> From: Mathieu Othacehe <address@hidden>
> Date: Thu, 7 May 2020 18:39:41 +0200
> Subject: [PATCH] service: Fix 'make-kill-destructor' when PGID is zero.
>
> When a process is forked, and before its GID is changed in "exec-command",
> it will share the parent GID, which is 0 for Shepherd. In that case, use
> the PID instead of the PGID.
>
> Also make sure to reset the SIGTERM handler before forking a process. Failing
> to do so, will result in stopping Shepherd if a SIGTERM is received between
> fork and execv calls. Restore the SIGTERM handler once the process has been
> forked.
>
> * modules/shepherd/service.scm (fork+exec-command): Save the current SIGTERM
> handler and reset it before forking. Then, restore it on the parent after
> forking.
> (make-kill-destructor): Handle the case when PGID is zero, between the process
> fork and exec.
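
The two changes described in the commit message can be sketched like
this, in Python rather than Guile. This is not the code from
modules/shepherd/service.scm; the names fork_exec and signal_target are
hypothetical, and the setpgid call in the child is an assumption about
what "exec-command" does.

```python
import os
import signal


def signal_target(pid, pgid):
    """Pick the first argument for os.kill()."""
    # A negative argument addresses a whole process group, but -0 is 0,
    # which kill(2) treats as "the caller's own process group".
    # Fall back to the plain PID in that case.
    return pid if pgid == 0 else -pgid


def fork_exec(argv):
    """Fork and exec argv with the default SIGTERM action across the fork."""
    # Save the installed handler and reset SIGTERM before forking, so the
    # child never runs the parent's handler between fork and exec.
    saved = signal.signal(signal.SIGTERM, signal.SIG_DFL)
    try:
        pid = os.fork()
        if pid == 0:
            try:
                os.setpgid(0, 0)         # assumed: child gets its own group
                os.execv(argv[0], argv)
            finally:
                os._exit(127)            # only reached if setup or exec failed
    finally:
        # Only the parent gets here: restore its handler.
        signal.signal(signal.SIGTERM, saved)
    return pid
```

In this sketch the parent itself runs with the default SIGTERM action
for the short time between the reset and the restore.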
I added a “Fixes” line and pushed it.
Thanks a lot!
I can roll a 0.8.1 release soonish (I’d like to add signalfd support
while at it, we’ll see.)
Ludo’.
--- End Message ---