Re: Strange hole creation behavior
From: Eric Sandeen
Subject: Re: Strange hole creation behavior
Date: Fri, 11 Apr 2014 18:05:49 -0500
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.4.0
On 4/11/14, 5:58 PM, Pádraig Brady wrote:
> On 04/11/2014 09:43 PM, Brian Foster wrote:
>> On Fri, Apr 11, 2014 at 06:13:59PM +0100, Pádraig Brady wrote:
>>> So this coreutils test is failing on XFS:
>>> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=tests/dd/sparse.sh;h=06efc7017
>>> Specifically the last hole check on line 66.
>>>
>>> In summary what's happening is that a write(1MiB), lseek(1MiB), write(1MiB)
>>> creates only a 64KiB hole. Is that expected?
>>>
>>
>> This is expected behavior due to speculative preallocation. An FAQ with
>> regard to this behavior is pending, but see here for reference:
>>
>> http://oss.sgi.com/archives/xfs/2014-04/msg00083.html
>>
>> In that particular write(1MB), lseek(+1MB), write(1MB) workload, each
>> write is preallocating some extra space beyond the current EOF. The seek
>> then moves past that space, but the space doesn't go away. The
>> subsequent writes will extend EOF. The previously preallocated space now
>> resides in the middle of the file and can't be trimmed away when the
>> file is closed.
>>
>>> Now a 1MiB hole is supported using truncate:
>>> dd if=/dev/urandom of=file.in bs=1M count=1 iflag=fullblock
>>> truncate -s+1M file.in
>>> dd if=/dev/urandom of=file.in bs=1M count=1 iflag=fullblock conv=notrunc oflag=append
>>> $ du -k file.in
>>> 2048 file.in
>>>
>>
>> This works simply because it is broken into multiple commands. When the
>> first dd exits, the excess space is trimmed off (the file descriptor is
>> closed). The subsequent truncate extends the file size without any
>> extra space getting caught between the old and new EOF.
>>
>> You can confirm this by using the 'allocsize=4k' mount option to the XFS
>> mount. If you wanted something more generic for the purpose of testing
>> the coreutils functionality, you could also set the size of file.out in
>> advance. E.g., with preallocation in effect:
>>
>> # dd if=file.in of=file.out bs=1M conv=sparse
>> # xfs_bmap -v file.out
>> file.out:
>> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
>> 0: [0..3967]: 9773944..9777911 1 (9080..13047) 3968
>> 1: [3968..4095]: hole 128
>> 2: [4096..6143]: 9778040..9780087 1 (13176..15223) 2048
>>
>> ... and then prevent preallocation by ensuring writes do not extend the
>> file:
>>
>> # rm -f file.out
>> # truncate --size=3M file.out
>> # dd if=file.in of=file.out bs=1M conv=sparse,notrunc
>> # xfs_bmap -v file.out
>> file.out:
>> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
>> 0: [0..2047]: 9773944..9775991 1 (9080..11127) 2048
>> 1: [2048..4095]: hole 2048
>> 2: [4096..6143]: 9778040..9780087 1 (13176..15223) 2048
>>
>> Hope that helps.
>
> Excellent info thanks.
> With that I can adjust the test so it passes (patch attached).
>
> So for reference this means that cp can no longer recreate holes
> <= 1MiB from source to dest (with the default XFS allocation size):
Well, the allocation size changes based on the file size; there's a
heuristic involved. So I fear that if you hard-code it into your
tests, you risk failing again in the future.
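One way to keep such a test from depending on a particular preallocation size is to assert only what every filesystem has to honor. The sketch below, assuming Linux and Python 3 (the helper names are made up for illustration), replays the write(1MiB), lseek(+1MiB), write(1MiB) sequence on a single open descriptor and estimates how much of the gap stayed unallocated, without requiring any particular figure.

```python
import os
import tempfile

MIB = 1024 * 1024

def write_all(fd, data):
    """Call os.write() until every byte of data has been written."""
    view = memoryview(data)
    while view:
        view = view[os.write(fd, view):]

def write_seek_write(path, chunk=MIB):
    """write(chunk), lseek(+chunk), write(chunk) on one open descriptor."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        write_all(fd, b"x" * chunk)
        os.lseek(fd, chunk, os.SEEK_CUR)  # skip ahead, leaving a gap
        write_all(fd, b"y" * chunk)
        os.fsync(fd)  # flush so pending allocation is settled before stat
    finally:
        os.close(fd)

def unallocated_bytes(path):
    """Apparent size minus allocated size, clamped at zero.

    st_blocks counts 512-byte units and may include preallocated space,
    so this is an estimate of hole size, not an exact extent map.
    """
    st = os.stat(path)
    return max(0, st.st_size - st.st_blocks * 512)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.bin")
        write_seek_write(path)
        print("apparent size:", os.path.getsize(path))
        print("estimated hole bytes:", unallocated_bytes(path))
```

A test built this way would check the apparent size and that the gap reads back as zeros, and treat the hole estimate as informational only.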
> We could I suppose use FALLOC_FL_PUNCH_HOLE where available
> to cater for this case. I'll see whether this is worth adding.
That might make sense.
But filesystems get to pick their layout; even ext4 may opportunistically
fill in holes, etc., so I think you need to be pretty careful with these
sorts of tests.
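For the FALLOC_FL_PUNCH_HOLE idea, a rough sketch of explicitly deallocating a range follows. It assumes a 64-bit Linux process and calls fallocate(2) from libc through ctypes, since os.posix_fallocate takes no mode flags. The helper name punch_hole is hypothetical, the flag values are the ones from <linux/falloc.h>, and a False return just means the filesystem declined the request.

```python
import ctypes
import os

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02

# Assumption: a 64-bit Linux process, where off_t is 64 bits wide.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                            ctypes.c_int64, ctypes.c_int64]
_libc.fallocate.restype = ctypes.c_int

def punch_hole(fd, offset, length):
    """Ask the filesystem to deallocate [offset, offset + length).

    Returns True on success, False if the call failed (for example
    EOPNOTSUPP on a filesystem without hole punching).
    """
    mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
    return _libc.fallocate(fd, mode, offset, length) == 0
```

Even when the call succeeds, the file size is unchanged and the punched range reads back as zeros; how the filesystem lays out the surrounding extents is still up to it.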
-Eric