emacs-devel

Re: Emacs Hangs on Filesystem Operations on Stale NFS


From: Alexander Shukaev
Subject: Re: Emacs Hangs on Filesystem Operations on Stale NFS
Date: Mon, 11 Jun 2018 13:55:51 +0200
User-agent: Roundcube Webmail/1.1.2

On 2018-06-11 12:27, Alexander Shukaev wrote:
Hi Everyone,


I initiated a discussion back in 2015 [1] about the fragility of Emacs in
terms of filesystem operations on stale NFS.  No solution actually
came out of this discussion.  I still find this issue very disruptive.
Yet another example would be `recentf-cleanup', which in my case is
triggered on Emacs startup: when the file comes from stale NFS, the
corresponding `file-readable-p' down the stack will hang indefinitely,
and there is no way to unfreeze it apart from issuing 'kill -9'
to that Emacs instance.  Don't you people find it unacceptable for
daily usage?  Well, I do.  Such hangs always disrupt daily work and
require quite some time to track down, as they are not
Lisp-debuggable with e.g. <C-g> in a straightforward way (these are
dead hangs from C code, where even attaching GDB does not work).

Well, enough rant.  I think I have a proposal for how to fix the issue,
even given the blocking nature of Emacs.  How about introducing a
variable `file-access-timeout' defaulting to `nil', which would
reflect a configurable timeout for all access operations (such as
`file-readable-p')?  This would be achieved via `SIGALRM' in the C
code, which would protect every such operation.  For example,

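#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Does nothing on purpose: delivering the signal is enough to make
   the blocked system call return with EINTR.  */
static void alarm_handler(int sig)
{
    (void)sig;
}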

int emacs_stat(const char* path, struct stat* s, unsigned int seconds)
{
    struct sigaction newact;
    struct sigaction oldact;

    memset(&newact, 0, sizeof(newact));
    memset(&oldact, 0, sizeof(oldact));

    sigemptyset(&newact.sa_mask);

    newact.sa_flags   = 0;
    newact.sa_handler = alarm_handler;
    sigaction(SIGALRM, &newact, &oldact);

    alarm(seconds);

    errno                 = 0;
    const int rc          = stat(path, s);
    const int saved_errno = errno;

    alarm(0);
    sigaction(SIGALRM, &oldact, NULL);

    errno = saved_errno;
    return rc;
}

where `seconds' should be initialized with the value of
`file-access-timeout'.  The cool advantage I see here is that
one can then also selectively `let'-bind different values for
`file-access-timeout', thus having total control over the use cases in
which one wants to protect oneself from indefinite hangs.

Kind regards,
Alexander

[1] https://lists.gnu.org/archive/html/help-gnu-emacs/2015-11/msg00251.html

A couple more ideas:
- I think it is reasonable to actually signal a dedicated error in case of the timeout so that API consumers can handle it according to their needs.
- It might also be worth factoring this alarm mechanism out into a separate macro, e.g. similar to `condition-case', where one could wrap a piece of Lisp code in that macro by supplying a timeout and expect it to call the timeout handler code in case of a timeout:

(with-system-timeout 3
    (do-something)
  (message "%s" "Timed out after 3 seconds..."))

This would also give Lisp developers full control over system-related interactions.

Regards,
Alexander


