bug-guix
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#40981: Graphical installer tests sometimes hang.


From: Mathieu Othacehe
Subject: bug#40981: Graphical installer tests sometimes hang.
Date: Thu, 07 May 2020 18:48:54 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

Hello,

The previous patch was not working. The reason is that, when a process
is forked and before execv is called, it shares the parent signal
handler.

So when sending a SIGTERM to the child process, we are stopping
root-service, if the signal is received before the child has forked.

The work-around here is to save the installed SIGTERM handler and reset
it. Then, after forking, the parent can restore the SIGTERM handler. The
child will use the default SIGTERM handler that terminates the process.

WDYT?

Thanks,

Mathieu
>From aa6f67068f1fdd622673ec0223f05fd8f8a96baa Mon Sep 17 00:00:00 2001
From: Mathieu Othacehe <address@hidden>
Date: Thu, 7 May 2020 18:39:41 +0200
Subject: [PATCH] service: Fix 'make-kill-destructor' when PGID is zero.

When a process is forked, and before its GID is changed in "exec-command",
it will share the parent GID, which is 0 for Shepherd. In that case, use
the PID instead of the PGID.

Also make sure to reset the SIGTERM handler before forking a process. Failing
to do so, will result in stopping Shepherd if a SIGTERM is received between
fork and execv calls. Restore the SIGTERM handler once the process has been
forked.

* modules/shepherd/service.scm (fork+exec-command): Save the current SIGTERM
handler and reset it before forking. Then, restore it on the parent after
forking.
(make-kill-destructor): Handle the case when PGID is zero, between the process
fork and exec.
---
 modules/shepherd/service.scm | 36 +++++++++++++++++++++++++++++-------
 tests/forking-service.sh     | 18 ++++++++++++++++++
 2 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 8604d2f..c68d7e0 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -5,6 +5,7 @@
 ;; Copyright (C) 2016 Alex Kost <address@hidden>
 ;; Copyright (C) 2018 Carlo Zancanaro <address@hidden>
 ;; Copyright (C) 2019 Ricardo Wurmus <address@hidden>
+;; Copyright (C) 2020 Mathieu Othacehe <address@hidden>
 ;;
 ;; This file is part of the GNU Shepherd.
 ;;
@@ -884,7 +885,18 @@ its PID."
   (unless %sigchld-handler-installed?
     (sigaction SIGCHLD handle-SIGCHLD SA_NOCLDSTOP)
     (set! %sigchld-handler-installed? #t))
-  (let ((pid (primitive-fork)))
+
+  ;; When forking a process, the signal handlers are inherited, until it
+  ;; forks. If SIGTERM is received by the forked process, before it calls
+  ;; execv, the installed SIGTERM handler, stopping Shepherd will be called.
+  ;; To avoid this, save the SIGTERM handler, disable it, and restore it once,
+  ;; the process has been forked. This way, the forked process will use the
+  ;; default SIGTERM handler stopping the process.
+  (let ((term-handler (match (sigaction SIGTERM)
+                        ((proc . _)
+                         proc)))
+        (pid (and (sigaction SIGTERM SIG_DFL)
+                  (primitive-fork))))
     (if (zero? pid)
         (exec-command command
                       #:user user
@@ -893,7 +905,10 @@ its PID."
                       #:directory directory
                       #:file-creation-mask file-creation-mask
                       #:environment-variables environment-variables)
-        pid)))
+        (begin
+          ;; Restore the initial SIGTERM handler.
+          (sigaction SIGTERM term-handler)
+          pid))))
 
 (define* (make-forkexec-constructor command
                                     #:key
@@ -957,11 +972,18 @@ start."
   "Return a procedure that sends SIGNAL to the process group of the PID given
 as argument, where SIGNAL defaults to `SIGTERM'."
   (lambda (pid . args)
-    ;; Kill the whole process group PID belongs to.  Don't assume that PID
-    ;; is a process group ID: that's not the case when using #:pid-file,
-    ;; where the process group ID is the PID of the process that
-    ;; "daemonized".
-    (kill (- (getpgid pid)) signal)
+    ;; Kill the whole process group PID belongs to.  Don't assume that PID is
+    ;; a process group ID: that's not the case when using #:pid-file, where
+    ;; the process group ID is the PID of the process that "daemonized".  If
+    ;; this procedure is called, between the process fork and exec, the PGID
+    ;; will still be zero (the Shepherd PGID). In that case, use the PID.
+    (let ((current-pgid (getpgid 0))
+          (pgid (getpgid pid)))
+      (if (eq? pgid current-pgid)
+          (begin
+            (kill pid signal))
+          (begin
+            (kill (- pgid) signal))))
     #f))
 
 ;; Produce a constructor that executes a command.
diff --git a/tests/forking-service.sh b/tests/forking-service.sh
index 7126c3d..47b09a2 100644
--- a/tests/forking-service.sh
+++ b/tests/forking-service.sh
@@ -71,6 +71,17 @@ cat > "$conf"<<EOF
                                       #:pid-file "$PWD/$service2_pid")
    #:stop  (make-kill-destructor)
    #:respawn? #t))
+
+(define %command3
+  '("$SHELL" "-c" "sleep 600"))
+
+(register-services
+ (make <service>
+   ;; A service that forks into a different process.
+   #:provides '(test3)
+   #:start (make-forkexec-constructor %command3)
+   #:stop  (make-kill-destructor)
+   #:respawn? #t))
 EOF
 cat $conf
 
@@ -113,3 +124,10 @@ sleep 1;
 $herd status test2 | grep started
 test "`cat $PWD/$service2_started`" = "started
 started"
+
+# Try to trigger eventual race conditions, when killing a process between fork
+# and execv calls.
+for i in {1..50}
+do
+    $herd restart test3
+done
-- 
2.26.0


reply via email to

[Prev in Thread] Current Thread [Next in Thread]