bug#74203: Further investigation and workaround
From: Collin J. Doering
Subject: bug#74203: Further investigation and workaround
Date: Wed, 13 Nov 2024 21:50:47 -0500
Hi again,
I wanted to follow up on my previous report and patch. I still think it's useful
to consider disabling the coreutils test I previously suggested; however, I
found a way to work around the issue and wanted to note it here, along with
some details of my investigation.
To work around the coreutils test `tests/cp/reflink-auto.sh` failing on guix
commit `66611696975409a52478b95a862a464daeaefe2a`, I temporarily mounted a
tmpfs to replace /tmp (which was on btrfs).
--8<---------------cut here---------------start------------->8---
mv /tmp /tmp.old
mkdir /tmp
mount -t tmpfs tmpfs /tmp
chmod 1777 /tmp
mv /tmp.old/{.*,*} /tmp/
--8<---------------cut here---------------end--------------->8---
Now, what made me do this? Well, let me explain!
In `tests/cp/reflink-auto.sh`
(https://github.com/coreutils/coreutils/blob/v9.1/tests/cp/reflink-auto.sh),
the failing part of the test:
--8<---------------cut here---------------start------------->8---
# we shouldn't be able to reflink() files on separate partitions
. "$abs_srcdir/tests/other-fs-tmpdir"
a_other="$other_partition_tmpdir/a"
<..>
returns_ 1 cp --reflink "$a_other" b || fail=1
--8<---------------cut here---------------end--------------->8---
'$other_partition_tmpdir' is defined in 'tests/other-fs-tmpdir'
(https://github.com/coreutils/coreutils/blob/v9.1/tests/other-fs-tmpdir) by
looking through a list of candidate directories. Each candidate is compared to
the current working directory to check that the two have different device IDs
(as given by 'stat -c %d <path>') and that the current user can create
directories there. Once it finds a suitable candidate, it sets
'$other_partition_tmpdir' to the temporary directory it created. The candidate
directories that are considered are as follows:
--8<---------------cut here---------------start------------->8---
test "${CANDIDATE_TMP_DIRS+set}" = set \
|| CANDIDATE_TMP_DIRS="$TMPDIR /tmp /dev/shm /var/tmp /usr/tmp $HOME"
--8<---------------cut here---------------end--------------->8---
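To make the selection logic described above concrete, here is a minimal
sketch of it. This is not the coreutils script itself; the function name
`find_other_fs_dir` and the `other-fs-XXXXXX` template are made up for
illustration, and it assumes GNU stat and mktemp.

```shell
#!/bin/sh
# Hypothetical helper (not the coreutils implementation): print a freshly
# created temporary directory under the first candidate whose device ID
# differs from that of the current working directory.
find_other_fs_dir() {
  cwd_dev=$(stat -c %d .) || return 1
  for dir in "$@"; do
    [ -d "$dir" ] || continue
    dir_dev=$(stat -c %d "$dir" 2>/dev/null) || continue
    # Same device ID: treated as the same partition, so skip it.
    [ "$dir_dev" = "$cwd_dev" ] && continue
    # Require that we can actually create a directory there.
    tmp=$(mktemp -d "$dir/other-fs-XXXXXX" 2>/dev/null) || continue
    printf '%s\n' "$tmp"
    return 0
  done
  return 1
}
```

Calling it as `find_other_fs_dir "$TMPDIR" /tmp /dev/shm /var/tmp` from the
build directory should show which candidate such a check would pick. If every
candidate reports the same device ID as the current directory, it returns
non-zero and prints nothing.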
Looking at a remaining failed build of coreutils (left over by building with
`--keep-failed`), I see that in 'top/environment-variables', 'TMPDIR' is set to
'/tmp/guix-build-guix-1.4.0-26.5ab3c4c.drv-0'. This directory is the same place
the build is taking place, so I would expect it to 'be on the same partition'.
So, next would be /tmp, where the same premise applies; next is /dev/shm. From
my tests simulating the coreutils guix shell build environment, this would meet
the conditions and be selected. However, if this were the case, I wouldn't
expect the coreutils reflink test to fail.
My suspicion is that using 'stat -c %d <path>' to check whether two files, a
and b, are on the same partition doesn't play well with btrfs subvolumes in
some instances with guix-daemon sandboxed builds. When trying to test this in a
simulated coreutils guix shell build environment, I found that paths outside of
the environment on different subvolumes, which do show different device IDs (as
per 'stat -c %d <path>') outside of the guix shell container, show the same IDs
within it. I suspect this is related to why the coreutils test fails with /tmp
on btrfs but passes when I use a tmpfs for /tmp. It's worth noting that on the
system impacted, /gnu/store is a btrfs subvolume.
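For anyone wanting to reproduce the comparison, something like the following
could be run both outside and inside the guix shell container, and the two
outputs compared. It is only a sketch: `show_dev` is a name made up here, the
path list is just an example, and it assumes GNU stat.

```shell
#!/bin/sh
# Hypothetical helper: for each existing path, print its device ID, its
# filesystem type, and the path itself, separated by tabs.
show_dev() {
  for p in "$@"; do
    [ -e "$p" ] || continue   # silently skip paths that do not exist
    printf '%s\t%s\t%s\n' \
      "$(stat -c %d "$p")" "$(stat -f -c %T "$p")" "$p"
  done
}

# Example paths only; adjust to the directories of interest.
show_dev . /tmp /dev/shm /gnu/store
```

If two paths that show different device IDs outside the container show the
same ID inside it, a device-ID comparison like the one the test relies on
would treat them as the same partition.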
I am not yet satisfied with my partial explanation, and am very curious whether
anyone spots something I'm missing (e.g. has a better understanding of the
guix build environment and why the coreutils reflink test could be failing like
this).
Thanks for your time and attention.
--
Collin J. Doering
http://rekahsoft.ca
http://blog.rekahsoft.ca
http://git.rekahsoft.ca