bug-guix

bug#40981: Graphical installer tests sometimes hang.


From: Ludovic Courtès
Subject: bug#40981: Graphical installer tests sometimes hang.
Date: Sun, 10 May 2020 12:32:00 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

Hi!

Mathieu Othacehe <address@hidden> skribis:

> The previous patch was not working. The reason is that, when a process
> is forked and before execv is called, it shares the parent signal
> handler.
>
> So when sending a SIGTERM to the child process, we are stopping
> root-service, if the signal is received before the child has forked.

Wow, good catch!

> The work-around here is to save the installed SIGTERM handler and reset
> it. Then, after forking, the parent can restore the SIGTERM handler. The
> child will use the default SIGTERM handler that terminates the process.

OK, makes sense.  (Another occasion to see how something like
‘posix_spawn’ would be more appropriate than fork + exec…)

> From aa6f67068f1fdd622673ec0223f05fd8f8a96baa Mon Sep 17 00:00:00 2001
> From: Mathieu Othacehe <address@hidden>
> Date: Thu, 7 May 2020 18:39:41 +0200
> Subject: [PATCH] service: Fix 'make-kill-destructor' when PGID is zero.
>
> When a process is forked, and before its GID is changed in "exec-command",
> it will share the parent GID, which is 0 for Shepherd. In that case, use
> the PID instead of the PGID.
>
> Also make sure to reset the SIGTERM handler before forking a process. Failing
> to do so, will result in stopping Shepherd if a SIGTERM is received between
> fork and execv calls. Restore the SIGTERM handler once the process has been
> forked.
>
> * modules/shepherd/service.scm (fork+exec-command): Save the current SIGTERM
> handler and reset it before forking. Then, restore it on the parent after
> forking.
> (make-kill-destructor): Handle the case when PGID is zero, between the process
> fork and exec.

[...]

> +    ;; Kill the whole process group PID belongs to.  Don't assume that PID is
> +    ;; a process group ID: that's not the case when using #:pid-file, where
> +    ;; the process group ID is the PID of the process that "daemonized".  If
> +    ;; this procedure is called, between the process fork and exec, the PGID
> +    ;; will still be zero (the Shepherd PGID). In that case, use the PID.
> +    (let ((current-pgid (getpgid 0))
> +          (pgid (getpgid pid)))
> +      (if (eq? pgid current-pgid)
> +          (begin
> +            (kill pid signal))
> +          (begin
> +            (kill (- pgid) signal))))

Shouldn’t it be:

  (let ((pgid (getpgid pid)))
    (if (= (getpgid 0) pgid)
        (kill pid signal)  ;don't kill ourself
        (kill (- pgid) signal)))

?

Note: Use the most specific comparison procedure, ‘=’ in this case,
because we know we’re dealing with numbers (it enables proper type
checking, better compiler optimizations, etc.).  More importantly, ‘eq?’
performs pointer comparison, so it shouldn’t be used with numbers (in
practice it works with “fixnums” but not with bignums).
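
For illustration only, here is the same target selection as a minimal shell
sketch.  ‘kill_target’ and its three arguments are hypothetical names that
are not part of the patch; the numeric test ‘-eq’ stands in for ‘=’.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Shepherd: print the argument that
# should be handed to kill(1).
#   $1 = PID of the service process
#   $2 = process group ID of that process
#   $3 = our own process group ID
kill_target () {
    pid=$1
    pgid=$2
    self_pgid=$3
    if [ "$pgid" -eq "$self_pgid" ]; then
        # The child still shares our process group (between fork and
        # exec): signal only the child so we don't kill ourselves.
        printf '%s\n' "$pid"
    else
        # The child is in a different group: signal the whole group,
        # which kill(1) expects as a negative number.
        printf '%s\n' "-$pgid"
    fi
}
```

With made-up numbers, ‘kill_target 42 7 7’ prints ‘42’ and
‘kill_target 42 42 7’ prints ‘-42’.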

> +# Try to trigger eventual race conditions, when killing a process between fork
> +# and execv calls.
> +for i in {1..50}
> +do
> +    $herd restart test3
> +done

Would it reproduce the problem well enough?

Use `seq 1 50` to avoid relying on a Bash-specific construct (which I
think it is, right?).
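
A minimal sketch of what I mean, assuming the test runs under a plain
POSIX shell.  ‘restart_service’ is a hypothetical stand-in for
‘$herd restart test3’; here it only counts calls so the loop can be
tried outside the test suite.  Note that ‘seq’ itself is not specified
by POSIX either, so a ‘while’ loop is shown as an alternative that needs
no external command.

```shell
#!/bin/sh
# Hypothetical stand-in for "$herd restart test3": it only counts calls.
count=0
restart_service () {
    count=$((count + 1))
}

# Variant 1: seq(1) instead of the Bash-specific {1..50} brace expansion.
for i in $(seq 1 50)
do
    restart_service
done

# Variant 2: plain POSIX shell arithmetic, no external command needed.
i=1
while [ "$i" -le 50 ]
do
    restart_service
    i=$((i + 1))
done

# Both loops ran 50 times each.
echo "$count"
```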

Could you send an updated patch?

Thanks for the bug hunting and for the patch!

Ludo’.




