bug-coreutils

bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE

From:	Paul Eggert
Subject:	bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date:	Thu, 28 Oct 2021 00:56:11 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.1.2

On 10/27/21 03:00, Janne Heß wrote:

Building another package (peertube) on x86_64-linux on ext4 also fails with strange errors in the test suite, something about "Error: The service is no longer running". This does not happen when the mentioned coreutils commit is undone by replacing #ifdef with #if 0 [3].

So the problem is not limited to ZFS? Which means that even if we implemented Pádraig's suggestion and disabled SEEK_HOLE on zfs, we'd still run into problems? That's really puzzling. Particularly since it's not clear what program is generating the diagnostic "The service is no longer running", or how it's related to GNU cp.

Anyway, the ZFS issue sounds like a serious bug in lseek+SEEK_DATA that really needs to be fixed. This is not just a coreutils issue, as other programs use SEEK_DATA.

I assume the ZFS bug (if the bug is related to ZFS, anyway) is a race condition of some sort; at least, that's what the trace in <https://github.com/openzfs/zfs/issues/11900> suggests.

In particular, I was struck that the depthcharge.config file that 'cp' was reading from was created by some other process, this way:

[pid 3014182] openat(AT_FDCWD, "/build/guybrush/tmp/portage/sys-boot/depthcharge-0.0.1-r3237/image/firmware/guybrush/depthcharge/depthcharge.config", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 4
[pid 3014182] fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
[pid 3014182] ioctl(4, TCGETS, 0x7ffd919d61c0) = -1 ENOTTY (Inappropriate ioctl for device)
[pid 3014182] lseek(3, 0, SEEK_CUR)     = 0
[pid 3014182] lseek(3, 0, SEEK_DATA)    = 0
[pid 3014182] lseek(3, 0, SEEK_HOLE)    = 9608
[pid 3014182] copy_file_range(3, [0], 4, [0], 9608, 0) = 9608
[pid 3014182] lseek(3, 0, SEEK_CUR)     = 9608
[pid 3014182] lseek(3, 9608, SEEK_DATA) = -1 ENXIO (No such device or address)
[pid 3014182] lseek(3, 0, SEEK_END)     = 9608
[pid 3014182] ftruncate(4, 9608)        = 0
[pid 3014182] close(4)                  = 0
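
For readers less familiar with the pattern, the lseek calls in a trace like this one can be mimicked with a short Python sketch. This is only an illustration, not coreutils code: the function name data_extents is made up here, and the sketch assumes a platform where os.SEEK_DATA and os.SEEK_HOLE are available (for example Linux). The ENXIO result at the end of the trace is the normal "no more data" signal, which the loop treats as its exit condition.

```python
import errno
import os


def data_extents(fd):
    """Return [(start, end), ...] for the data regions that lseek reports for fd.

    Hypothetical helper for illustration; it walks the file the way the
    trace does: SEEK_DATA to find data, SEEK_HOLE to find where it ends.
    """
    extents = []
    offset = 0
    while True:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                break  # no data at or after offset: end of scan
            raise
        # There is always a (possibly implicit) hole at end of file.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        extents.append((start, end))
        offset = end
    return extents
```

On a file whose 9608 bytes are all data, as in the trace, one would expect this to report a single extent from 0 to 9608.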

So, one hypothesis is that ZFS's implementation of copy_file_range does not set up data structures appropriately for cp's later use of lseek+SEEK_DATA when reading depthcharge.config. That is, from cp's point of view, the ftruncate(4, 9608) has been executed but the copy_file_range(3, [0], 4, [0], 9608, 0) has not been executed yet (it's cached somewhere, no doubt).
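
If that hypothesis holds, a reader would see regions that lseek reports as holes even though reading them returns nonzero bytes. A rough diagnostic for that symptom could look like the Python sketch below. The names misreported_holes and _region_is_zero are invented for this illustration, the sketch assumes os.SEEK_DATA/os.SEEK_HOLE are available, and it assumes that plain reads return the real contents even when lseek is wrong, which I have not verified for ZFS.

```python
import errno
import os


def _region_is_zero(fd, start, end, block):
    """Return True if every byte in [start, end) reads back as zero."""
    pos = start
    while pos < end:
        chunk = os.pread(fd, min(block, end - pos), pos)
        if not chunk:
            break  # hit end of file early
        if chunk.count(0) != len(chunk):
            return False
        pos += len(chunk)
    return True


def misreported_holes(fd, block=1 << 16):
    """Return [(start, end)] regions reported as holes that contain nonzero bytes."""
    size = os.fstat(fd).st_size
    bad = []
    offset = 0
    while offset < size:
        try:
            data_start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno != errno.ENXIO:
                raise
            data_start = size  # the rest of the file is reported as a hole
        # [offset, data_start) is what lseek claims is a hole.
        if data_start > offset and not _region_is_zero(fd, offset, data_start, block):
            bad.append((offset, data_start))
        if data_start >= size:
            break
        offset = os.lseek(fd, data_start, os.SEEK_HOLE)
    return bad
```

On a correctly behaving file system this should return an empty list; a non-empty result right after another process wrote the file would be consistent with the race described above.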

If my guess is right, then an fdatasync or fsync on cp's input might work around common instances of this ZFS bug. Could you try the attached coreutils patch, and see whether it works around the bug? Or perhaps replace 'fdatasync' with 'fsync' in the attached patch? Thanks.
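
The idea can be sketched in Python as follows. This is not the attached patch, only an illustration of "sync the input first, then trust SEEK_DATA/SEEK_HOLE". The function name copy_data_after_sync and its use_fsync parameter are made up for this sketch, and it copies with plain pread/pwrite rather than copy_file_range to stay portable.

```python
import errno
import os


def copy_data_after_sync(src_path, dst_path, use_fsync=False):
    """Copy src to dst extent by extent, syncing the input before probing it.

    Hypothetical sketch: flush the source with fdatasync (or fsync) so that
    lseek+SEEK_DATA sees any data still pending from an earlier writer.
    """
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
        try:
            (os.fsync if use_fsync else os.fdatasync)(src)
            offset = 0
            while True:
                try:
                    start = os.lseek(src, offset, os.SEEK_DATA)
                except OSError as e:
                    if e.errno == errno.ENXIO:
                        break  # no more data
                    raise
                end = os.lseek(src, start, os.SEEK_HOLE)
                pos = start
                while pos < end:
                    chunk = os.pread(src, min(end - pos, 1 << 16), pos)
                    if not chunk:
                        break
                    view = memoryview(chunk)
                    written = 0
                    while written < len(chunk):  # pwrite may be partial
                        written += os.pwrite(dst, view[written:], pos + written)
                    pos += len(chunk)
                offset = end
            # Extend the output to the full size so a trailing hole is kept.
            os.ftruncate(dst, os.fstat(src).st_size)
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

Passing use_fsync=True corresponds to the variant of swapping fdatasync for fsync. Either call adds a flush per input file, so this is a workaround to test the hypothesis rather than something one would want permanently.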

0001-cp-attempt-to-work-around-ZFS-bug.patch
Description: Text Data

