bug-coreutils

bug#51345:


From: Sworddragon
Subject: bug#51345:
Date: Sun, 24 Oct 2021 07:35:59 +0000

> You could try running the following immediately after,
> to see if it also returns quickly:
>
>    blockdev --flushbufs /dev/sdb


The issue does not always reproduce, and the USB thumb drive in question has
already been prepared and now stores important data, so testing this is not
an easy task. The USB thumb drive is also a pretty old device (roughly 10
years or even older) with only 1 GB of storage space. Since dd with
conv=fsync returned after half of its usual writing time, I think it is
unlikely that the controller of the USB thumb drive has its own dedicated
512 MiB buffer attached to it.


> Well we're relying on the kernel here to not return from fync()
> until appropriate.

But the question is whether there is a minor, non-obvious bug somewhere in
the controlling logic of dd that might still cause such behavior. I checked
the manpages for the sync() and fsync() calls and they are actually quite
interesting. fsync() is described as flushing the caches so that the data
can be retrieved even after a crash or reboot. The interesting part is that
the manpage then says the call blocks until the device reports completion.
But what happens if the device reports completion while the kernel still
sees cached writes in its memory-mapped area (since storage devices are like
their own small computers and could lie or have faulty firmware)? If
fsync() returned early here, it would not contradict the documentation in
the manpage. sync() is simpler here, as it is described as writing all
pending modifications to file (meta)data to the underlying filesystems. If
it returned after the device reports completion while the kernel still sees
cached writes in its context, that would strictly be a bug. The interesting
part is that the notes section of sync() gives sync(), syncfs() and fsync()
the same guarantees.
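
One way to observe whether the kernel still holds cached writes after
fsync() returns might be to look at the Dirty and Writeback counters in
/proc/meminfo. The following is only a minimal sketch, assuming Linux; the
function names are hypothetical, and the counters are system-wide rather
than per device, so they can only hint at what is going on.

```python
def parse_writeback_counters(meminfo_text):
    """Return the Dirty and Writeback values (in kB) from /proc/meminfo text."""
    counters = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name in ("Dirty", "Writeback"):
            # Lines look like "Dirty:               123 kB".
            counters[name] = int(rest.split()[0])
    return counters


def read_writeback_counters(path="/proc/meminfo"):
    """Read the counters from the running kernel (Linux only)."""
    with open(path) as f:
        return parse_writeback_counters(f.read())
```

Calling read_writeback_counters() right after dd returns, on an otherwise
idle system, would show whether a large amount of data is still marked as
dirty or under writeback.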

With this information I see 3 possibilities here:

1. This is a bug in the controlling logic of dd that might not be obvious
at all.
2. This is a bug in fsync() or somewhere further down in the Linux Kernel.
3. Returning early is the intended behavior of fsync() and does not
strictly conflict with the manpage.

If the last is the case it might be worth proposing a change to the Linux
Kernel to additionally ensure that all cached writes are being sent out
from the Kernel's context before a return from fsync() is possible. It
would also mean that currently users can't rely on fsync() (e.g. via dd
with conv=fsync) to ensure the data has been flushed - instead they would
need to take additional action like executing a sync in the terminal
afterwards.
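
The "additional action" could be sketched roughly as follows: write the
data, call fsync() on the file descriptor, and then call sync() as well,
timing both. This is a hedged illustration, not what dd does internally;
write_and_flush is a hypothetical name, and since sync() is system-wide,
a long sync() time can also come from unrelated pending writes.

```python
import os
import time


def write_and_flush(path, data):
    """Write data to path, then time fsync() and a following sync()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while len(view) > 0:
            # os.write() may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        start = time.monotonic()
        os.fsync(fd)  # should block until the device reports completion
        fsync_seconds = time.monotonic() - start
    finally:
        os.close(fd)
    start = time.monotonic()
    os.sync()  # the extra step: flush everything still pending
    sync_seconds = time.monotonic() - start
    return fsync_seconds, sync_seconds
```

If fsync() returned after only half of the expected time and the following
sync() then took roughly the other half, that would point to writes still
being cached when fsync() returned.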

