Re: Dataloss copying file using copy-file on RHEL 8.


From: David Koppelman
Subject: Re: Dataloss copying file using copy-file on RHEL 8.
Date: Thu, 13 Feb 2020 11:08:00 -0600
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Thank you for the reproducer! I was able to reproduce the file
corruption with a modified version of the C file in which the
destination file times were set. Otherwise there is no corruption. I'm
attaching the modified file. I'm going to file a bug with Red Hat using
Paul's modified reproducer, if that's okay.

The NFS-mounted filesystem is served by another Red Hat system. I'm going
to file the Red Hat bug before gathering additional information.

Thanks for your help!

David


Attachment: cfrbug3.c
Description: Reproducer for copy file problem.



Paul Eggert <address@hidden> writes:

> On 2/12/20 2:37 PM, David Koppelman wrote:
>> Except I don't get the
>> efficient kernel-space-to-kernel-space transfer that copy_file_range
>> uses.)
>
> It's more than just kernel-space-to-kernel-space copying. When copying
> a file within an NFS server, you don't need to ship its contents over
> the network; the server can do the copy. Also, many modern filesystems
> can copy files by fiddling with pointers rather than data and thus can
> copy much faster than read+write would do, even on local filesystems.
> So avoiding copy_file_range entirely would mean a big performance loss
> on big files.
>
>> I do not experience the problem on the version of Emacs packaged with
>> rhel 8, "GNU Emacs 26.1 (build 1, x86_64-redhat-linux-gnu, GTK+
>> Version 3.22.30) of 2018-09-10".
>
> Emacs 26.1 doesn't use copy_file_range, which explains why it doesn't
> encounter your problem. Emacs 27 is planned to use it, though, so we
> should see how to best fix the problem.
>
> As you say, it's a serious bug in your filesystem. It strikes me that
> it is likely to affect programs other than Emacs, so it should be high
> priority to fix regardless of what we do in Emacs.
>
> Some questions: What is the NFS fileserver (NetApp, etc.)? What's the
> blocksize on the remote file system? Does copy_file_range work
> correctly when the size is a multiple of 32*1024? If so, perhaps we
> could tweak Emacs to use copy_file_range for most of the file, and use
> read+write only for the trailing <32 KiB.
>
>> When I have time I'll try to reproduce the problem with a quick C++
>> routine using copy_file_range.
>
> To save you some time, attached is a quick C routine that attempts to
> reproduce the problem. Does it reproduce the problem for you? If so,
> you can use it in your bug report to Red Hat.
>
> Also, can you strace the failing Emacs? Something like this:
>
> strace -o trace.log emacs -Q -batch -eval '(copy-file "a" "b" t t)'
>
> and then look at the relevant part of trace.log.
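
The partial-fallback idea in the quoted message (copy_file_range for the
part of the file that is a multiple of 32*1024 bytes, read+write for the
trailing <32 KiB) might look roughly like the sketch below. It is only an
illustration under stated assumptions: the helper name
copy_with_tail_fallback and the CHUNK constant are made up for this
example, it is not Emacs code, and whether stopping at a 32 KiB boundary
actually avoids the corruption is exactly the open question asked above.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: 32 KiB, the figure from the question in the quoted message. */
enum { CHUNK = 32 * 1024 };

/* Hypothetical helper, not Emacs code.  Copy SIZE bytes from IN_FD to
   OUT_FD, both positioned at the start of the data to copy.
   Return 0 on success, -1 on error.  */
int
copy_with_tail_fallback (int in_fd, int out_fd, off_t size)
{
  off_t bulk = size - size % CHUNK;   /* largest multiple of CHUNK <= size */
  off_t done = 0;

  /* Bulk part: let the kernel (or the NFS server) do the copy.  With
     NULL offsets, copy_file_range advances both file positions.  */
  while (done < bulk)
    {
      ssize_t n = copy_file_range (in_fd, NULL, out_fd, NULL, bulk - done, 0);
      if (n < 0 && errno == EINTR)
        continue;
      if (n <= 0)
        break;   /* error or early EOF: let read+write handle the rest */
      done += n;
    }

  /* Tail, plus anything the bulk loop could not copy: plain read+write.  */
  char buf[CHUNK];
  while (done < size)
    {
      size_t want = size - done < CHUNK ? size - done : CHUNK;
      ssize_t r = read (in_fd, buf, want);
      if (r < 0 && errno == EINTR)
        continue;
      if (r <= 0)
        return -1;   /* read error, or the source is shorter than SIZE */
      for (ssize_t w = 0; w < r; )
        {
          ssize_t k = write (out_fd, buf + w, r - w);
          if (k < 0 && errno == EINTR)
            continue;
          if (k < 0)
            return -1;
          w += k;
        }
      done += r;
    }
  return 0;
}
```

A caller would open both files, fstat the source to get its size, and pass
that size in. Because the read+write loop also picks up wherever the bulk
loop stopped, the same routine still works on systems where
copy_file_range fails outright.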
