Re: making a file sparse - in-place?


From: Rodrigo Campos
Subject: Re: making a file sparse - in-place?
Date: Fri, 24 Jan 2014 02:41:38 +0000
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Jan 24, 2014 at 01:07:21AM +0000, Pádraig Brady wrote:
> On 01/24/2014 12:47 AM, Bernhard Voelker wrote:
> > Inspired by a recent post on util-linux ML [1], talking about turning
> > a file into a sparse file in-place, i.e. not using a 2-step approach
> > like `cp --sparse file file2 && mv file2 file`), I thought, hey, don't
> > we have this in coreutils already?
> 
> > b)
> > Then, I tried
> >   $ dd if=file of=file conv=sparse,notrunc
> > to avoid truncating the output file. That didn't corrupt the data,
> > but the file still was not sparse afterward.
> > What's the reason for conv=sparse not to work in this situation?
> > BTW: generally, writing to the same file seems to work, e.g.:
> >   dd if=file of=file conv=ucase,notrunc
> 
> To deallocate the zeros we'd have to use fallocate(FALLOC_FL_PUNCH_HOLE).
> Also for efficiency reasons it would be nice to detect holes efficiently.
> We can do this with the current fiemap code, but really we should try
> and use the new SEEK_HOLE functionality available in the kernel.

I looked into this, but I don't think SEEK_HOLE will help here. I even tried it
(maybe I did it wrong?) when implementing the tool to make a file sparse
in-place, but it didn't report the '\0's that were already allocated. The
manpage says:

        These operations allow applications to map holes in a sparsely
        allocated file.  This can be useful for applications such as file backup
        tools, which can save space when creating backups and preserve holes, if
        they have a mechanism for discovering holes.

        For the purposes of these operations, a hole is a sequence of zeros that
        (normally) has not been allocated in the underlying file storage.
        However, a filesystem is not obliged to report holes, so these
        operations are not a guaranteed mechanism for mapping the storage space
        actually allocated to a file.  (Furthermore, a sequence of zeros that
        actually has been written to the underlying storage may not be reported
        as a hole.)  In the simplest implementation, a filesystem can support
        the operations by making SEEK_HOLE always return the offset of the end
        of the file, and making SEEK_DATA always return offset (i.e., even if
        the location referred to by offset is a hole, it can be considered to
        consist of data that is a sequence of zeros).


So these operations seem to be aimed at making it easy for applications to
handle files that are already sparse. Let me highlight one specific part:

        Furthermore, a sequence of zeros that actually has been written to the
        underlying storage may not be reported as a hole.


And that is exactly the case we are interested in. I tried it on ext4 with a
3.11 kernel, and a sequence of zeros that has been written to the underlying
storage is not detected as a hole.
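
One way to check this on a given filesystem is sketched below. It is only an
illustration, not the tool mentioned above: it is written in Python rather than
C for brevity, and the helper name data_extents is made up for this example. It
walks a file with lseek(SEEK_DATA) and lseek(SEEK_HOLE) and lists the regions
the filesystem reports as data.

```python
import errno
import os
import sys


def data_extents(path):
    """Return [(start, end), ...] for the regions reported as data.

    Hypothetical helper for illustration. It only reflects what the
    filesystem chooses to report via SEEK_DATA/SEEK_HOLE, which, per
    the manpage, need not match the actual allocation.
    """
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Next offset >= `offset` that the filesystem calls data.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    # Nothing but a hole between `offset` and end of file.
                    break
                raise
            # Next hole after `start`; at worst the implicit hole at EOF.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents


if __name__ == "__main__":
    for path in sys.argv[1:]:
        print(path, data_extents(path))
```

If the filesystem behaves as described above, a file made only of zeros that
were really written (e.g. with dd from /dev/zero) should come back as a single
data extent covering the whole file, with no holes reported. So to find such
zeros, a tool still has to read the data and compare it itself.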

Thanks a lot,
Rodrigo
