bug-coreutils

bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE

From:	Paul Eggert
Subject:	bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date:	Thu, 28 Oct 2021 12:11:41 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.1.2

On 10/28/21 06:54, Pádraig Brady wrote:

Further debugging from Nix folks suggest ZFS was in consideration always,
as invalid artifacts were written to a central cache from ZFS backed hosts.
So we should at least change the comment in the patch to only mention ZFS.


Yes, that sounds reasonable.

This ZFS bug sounds pretty serious, though. Apparently it affects star and other programs too. I'm not sure we should attempt to work around it in coreutils, if the workarounds penalize everybody not using ZFS.

Is it cheap to check whether a file is actually in a ZFS filesystem? (Don't know how this'd work with loopback mounts, NFS, etc.) If so, it might be better to simply fdatasync (or even fsync) every input file that's on ZFS, until we know the ZFS bugs are fixed.
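To make the question above concrete, here is a minimal sketch of such a check on Linux, where fstatfs reports a filesystem type in f_type. This is not the coreutils patch. The helper names are hypothetical, and the magic number is an assumption taken from the value OpenZFS on Linux is understood to report; it is not defined in the standard kernel headers. A ZFS dataset reached over NFS would presumably report the NFS type instead, so this check would not cover that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/vfs.h>    /* fstatfs, struct statfs (Linux-specific) */
#include <unistd.h>     /* fdatasync */

/* Assumption: f_type value reported by OpenZFS on Linux.  */
#define ZFS_SUPER_MAGIC_ASSUMED 0x2fc12fc1UL

/* Hypothetical helper: return true if FD appears to live on ZFS.
   If fstatfs fails, report "not ZFS" rather than guessing.  */
static bool
fd_is_on_zfs (int fd)
{
  struct statfs sfs;
  if (fstatfs (fd, &sfs) != 0)
    return false;
  return (unsigned long) sfs.f_type == ZFS_SUPER_MAGIC_ASSUMED;
}

/* Hypothetical helper: fdatasync FD only when it appears to be on ZFS,
   so that other filesystems pay just one fstatfs call.
   Return 0 if no sync was needed or the sync succeeded, -1 otherwise.  */
static int
sync_input_if_zfs (int fd)
{
  assert (0 <= fd);
  if (!fd_is_on_zfs (fd))
    return 0;
  return fdatasync (fd);
}
```

The cost for non-ZFS users in this sketch is one extra fstatfs per input file; whether that counts as cheap enough is the open question.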

In theory we could fdatasync/fsync every input file on every platform. It'd be a shame to do that, though; that would slow down everybody merely to work around this ZFS bug.

Also it seems like fsync() does avoid the ZFS issue as mentioned in:
https://github.com/openzfs/zfs/issues/11900

Yes. I'm hoping that fdatasync suffices as it's lighter-weight. But if fsync is needed we can use fsync.

BTW I'm slightly worried about retrying SEEK_DATA as
FreeBSD 9.1 has a bug with large sparse files at least
where it takes ages for SEEK_DATA to return:
   36.13290615 lseek(3,0x0,SEEK_DATA)         = -32768 (0xffff8000)
If ENXIO is not set in that case, then there is no issue.

Wait - lseek returns a number less than -1?! We could easily check for that FreeBSD bug, perhaps as an independent patch; this shouldn't require any extra syscalls.
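A check of that kind could look roughly like the following sketch. It inspects only the value lseek already returned, so it adds no system calls. The function names are hypothetical, and reporting the bogus value as EIO is an arbitrary choice made for illustration, not something decided in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: lseek should return either -1 (with errno set)
   or a nonnegative offset.  Map any other value, such as the -32768
   seen in the FreeBSD trace, to a failure.  */
static off_t
normalize_lseek_result (off_t result)
{
  if (result < -1)
    {
      errno = EIO;  /* arbitrary errno for this sketch */
      return -1;
    }
  return result;
}

/* Hypothetical wrapper: call lseek and sanitize its result.  */
static off_t
checked_lseek (int fd, off_t offset, int whence)
{
  off_t pos = normalize_lseek_result (lseek (fd, offset, whence));
  assert (-1 <= pos);   /* postcondition of the normalization */
  return pos;
}
```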

Also please see <https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256205>. It appears that ZFS has significant bugs in this area on FreeBSD, bugs that haven't been fixed yet. That bug report does suggest that an fsync (and I hope fdatasync) works around the bugs.

Also I'm not sure restricting sync to ENXIO is general enough,
as an strace from a problematic cp, from the github issue above is:
   lseek(3, 0, SEEK_DATA)            = 0
   lseek(3, 0, SEEK_HOLE)            = 131072
   lseek(3, 0, SEEK_SET)             = 0
   read(3, "\177ELF\2\1"..., 131072) = 131072
   write(4, "\177ELF\2\1"..., 131072) = 131072
   lseek(3, 131072, SEEK_DATA)       = -1 ENXIO
   ftruncate(4, 3318813)             = 0

How about if we also do an fdatasync+retry after that 2nd lseek yields ENXIO? Would that suffice to work around the ZFS bug? Would it be too much of a performance penalty for non-ZFS users?
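For illustration, the fdatasync+retry idea might be sketched as below. This is not the actual coreutils change. The helper name and its `synced` out-parameter are hypothetical, and whether fdatasync (rather than fsync) is enough to work around the ZFS bug is exactly what remains unknown here. The seek mode is passed in as `whence`, so a caller would supply SEEK_DATA (which on glibc requires defining _GNU_SOURCE before including <unistd.h>).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: seek in FD; if the seek fails with ENXIO,
   fdatasync FD and retry the same seek once.  Set *SYNCED to true
   only if a sync and retry actually happened.  Any other failure is
   returned unchanged, so the extra cost is paid only on ENXIO.  */
static off_t
seek_with_sync_retry (int fd, off_t offset, int whence, bool *synced)
{
  assert (synced);
  *synced = false;

  off_t pos = lseek (fd, offset, whence);
  if (pos < 0 && errno == ENXIO)
    {
      if (fdatasync (fd) == 0)
        {
          *synced = true;
          pos = lseek (fd, offset, whence);
        }
      else
        errno = ENXIO;  /* sync failed: report the original seek error */
    }
  return pos;
}
```

In the trace above, the call of interest would be the second one, `seek_with_sync_retry (fd, 131072, SEEK_DATA, &synced)`. On a file that genuinely has no data past that offset, the retry also yields ENXIO, so in this sketch such files pay for one fdatasync and one extra lseek.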
