coreutils

Re: making a file sparse - in-place?


From: Pádraig Brady
Subject: Re: making a file sparse - in-place?
Date: Fri, 24 Jan 2014 03:30:20 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 01/24/2014 03:12 AM, Rodrigo Campos wrote:
> On Fri, Jan 24, 2014 at 02:59:41AM +0000, Pádraig Brady wrote:
>> On 01/24/2014 02:41 AM, Rodrigo Campos wrote:
>>> On Fri, Jan 24, 2014 at 01:07:21AM +0000, Pádraig Brady wrote:
>>>> On 01/24/2014 12:47 AM, Bernhard Voelker wrote:
>>>>> Inspired by a recent post on util-linux ML [1], talking about turning
>>>>> a file into a sparse file in-place (i.e. not using a 2-step approach
>>>>> like `cp --sparse=always file file2 && mv file2 file`), I thought, hey,
>>>>> don't we have this in coreutils already?
>>>>
>>>>> b)
>>>>> Then, I tried
>>>>>   $ dd if=file of=file conv=sparse,notrunc
>>>>> to avoid truncating the output file. That didn't corrupt the data,
>>>>> but the file still was not sparse afterward.
>>>>> What's the reason for conv=sparse not to work in this situation?
>>>>> BTW: generally, writing to the same file seems to work, e.g.:
>>>>>   dd if=file of=file conv=ucase,notrunc
>>>>
>>>> To deallocate the zeros we'd have to use fallocate(FALLOC_FL_PUNCH_HOLE).
>>>> Also for efficiency reasons it would be nice to detect holes efficiently.
>>>> We can do this with the current fiemap code, but really we should try
>>>> and use the new SEEK_HOLE functionality available in the kernel.
>>>
>>> I looked into this, but I think it won't. I even tried (maybe I did it wrong?)
>>> when implementing the tool to make a file sparse in-place, but it didn't report
>>> the '\0's already allocated.
>>
>> Right, you need to manually detect those, which dd does with is_nul():
>> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/system.h;h=39750e82f#l499
>>
>> Then detected runs of zeros could be sparsified with FALLOC_FL_PUNCH_HOLE ?
> 
> You can use FALLOC_FL_PUNCH_HOLE where you know there are zeros, yes.
> 
> But FALLOC_FL_PUNCH_HOLE is not portable. Is it not a problem to use it only
> on Linux in dd?
> 
>> I've not tried this myself but would be optimistic it works on some file systems.
>> If this wasn't supported then we'd stop immediately when if=of,
>> or otherwise revert to seeking.
>>
>> Note also the caveats documented for the conv=sparse option:
>>
>>    Be careful when using
>>    this option in conjunction with `conv=notrunc' or
>>    `oflag=append'.  With `conv=notrunc', existing data in the
>>    output file corresponding to NUL blocks from the input, will
>>    be untouched.  With `oflag=append' the seeks performed will
> 
> Well, I think this promise of them being untouched might block implementing
> in-place "sparsify" of files in dd with both flags active.
> 
> 
> But, a question about policy: is it okay to implement Linux-only extensions here?

If the current system doesn't support in-place sparsifying,
then we could document that limitation along the
same lines as the conv=notrunc case above.
If one wanted more portable guarantees about sparsifying a file,
then it would be best to use a temporary file anyway.
If there are other methods to punch a hole in a file
on other systems, they can be added as an option to coreutils
without changing the interface.
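
To make the idea discussed in this thread concrete, here is a minimal sketch
of the approach: scan the file for runs of whole all-zero blocks, then
deallocate each run with fallocate(FALLOC_FL_PUNCH_HOLE). It is not dd's
implementation and has not been tried across file systems. The names
zero_runs, punch_hole and sparsify_in_place are made up for illustration,
the 4096-byte block size is an assumption (st_blksize from fstat would be a
better guide), and the ctypes call assumes Linux with a 64-bit off_t.

```python
import ctypes
import ctypes.util
import os

# Values from <linux/falloc.h>; Linux-specific.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02


def zero_runs(data, block_size, base=0):
    """Yield (offset, length) for each run of whole all-zero blocks in data.

    A trailing partial block is never reported, so it is never punched.
    """
    zero_block = bytes(block_size)
    start = None
    for pos in range(0, len(data) - block_size + 1, block_size):
        if data[pos:pos + block_size] == zero_block:
            if start is None:
                start = pos
        elif start is not None:
            yield base + start, pos - start
            start = None
    if start is not None:
        end = len(data) - len(data) % block_size
        yield base + start, end - start


def punch_hole(fd, offset, length):
    """Deallocate [offset, offset+length) without changing the file size."""
    # Assumption: fallocate() is reachable through the C library and
    # off_t is 64 bits wide (true on 64-bit Linux).
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_longlong, ctypes.c_longlong]
    libc.fallocate.restype = ctypes.c_int
    mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
    if libc.fallocate(fd, mode, offset, length) != 0:
        err = ctypes.get_errno()
        # For example EOPNOTSUPP when the file system cannot punch holes.
        raise OSError(err, os.strerror(err))


def sparsify_in_place(path, block_size=4096, chunk_blocks=256):
    """Punch holes over the zero blocks of an existing file."""
    with open(path, "r+b") as f:
        offset = 0
        while True:
            data = f.read(block_size * chunk_blocks)
            if not data:
                break
            for start, length in zero_runs(data, block_size, base=offset):
                punch_hole(f.fileno(), start, length)
            offset += len(data)
```

Calling sparsify_in_place("file") and then comparing `du file` with
`du --apparent-size file` should show whether any blocks were released.
On a file system without hole punching the first fallocate call raises
OSError, which matches the "stop immediately" behaviour suggested above.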

thanks,
Pádraig.


