bug#63982: Shepherd can crash when a user service fails to start
From: Ludovic Courtès
Subject: bug#63982: Shepherd can crash when a user service fails to start
Date: Wed, 12 Jul 2023 19:46:56 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Hi!
Ludovic Courtès <ludo@gnu.org> writes:
> Turns out that this happens when calling the ‘daemonize’ action on
> ‘root’. I have a reproducer now and am investigating…
Good news: this is fixed in Shepherd commit
f4272d2f0f393d2aa3e9d76b36ab6aa5f2fc72c2!
The root cause is inconsistent semantics when mixing epoll, signalfd,
and fork, specifically this part from signalfd(2):
   epoll(7) semantics
       If a process adds (via epoll_ctl(2)) a signalfd file descriptor to an
       epoll(7) instance, then epoll_wait(2) returns events only for signals
       sent to that process.  In particular, if the process then uses fork(2)
       to create a child process, then the child will be able to read(2)
       signals that are sent to it using the signalfd file descriptor, but
       epoll_wait(2) will not indicate that the signalfd file descriptor is
       ready.  In this scenario, a possible workaround is that after the
       fork(2), the child process can close the signalfd file descriptor that
       it inherited from the parent process and then create another signalfd
       file descriptor and add it to the epoll instance.  […]
The C program below illustrates this behavior:
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/epoll.h>

int
main (void)
{
  int ep, sfd;
  sigset_t signals;

  /* Block the signals so that they can only be received via the
     signalfd.  */
  sigemptyset (&signals);
  sigaddset (&signals, SIGINT);
  sigaddset (&signals, SIGHUP);
  sigprocmask (SIG_BLOCK, &signals, NULL);

  /* Register the signalfd in an epoll instance, in the parent.  */
  sfd = signalfd (-1, &signals, SFD_CLOEXEC);
  ep = epoll_create1 (EPOLL_CLOEXEC);
  struct epoll_event events =
    { .events = EPOLLIN | EPOLLONESHOT, .data = NULL };
  epoll_ctl (ep, EPOLL_CTL_ADD, sfd, &events);
  epoll_wait (ep, &events, 1, 123);

  if (fork () == 0)
    {
      /* Quoth signalfd(2):

         If a process adds (via epoll_ctl(2)) a signalfd file descriptor to an
         epoll(7) instance, then epoll_wait(2) returns events only for signals
         sent to that process.  In particular, if the process then uses fork(2)
         to create a child process, then the child will be able to read(2)
         signals that are sent to it using the signalfd file descriptor, but
         epoll_wait(2) will not indicate that the signalfd file descriptor is
         ready.  */
      printf ("try this: kill -INT %i\n", getpid ());
      while (1)
        {
          struct signalfd_siginfo info;

          /* In the child, this keeps timing out even after a signal
             has been sent to it.  */
          if (epoll_wait (ep, &events, 1, 777) > 0)
            {
              read (sfd, &info, sizeof info);
              printf ("got signal %i!\n", info.ssi_signo);
              /* Re-arm the one-shot registration.  */
              epoll_ctl (ep, EPOLL_CTL_MOD, sfd, &events);
            }
        }
    }

  return 0;
}
Of course it took me a while to find out about this; I first looked at
things individually and didn’t expect the mixture to behave
inconsistently.
Maxim, let me know if it works for you!
Thanks,
Ludo’.