From: Nathan Stratton Treadway
Subject: Re: [Bug-tar] tar -S with --use-compress-prog=pbzip2 does not keep sparseness
Date: Thu, 5 Sep 2019 14:42:07 -0400
User-agent: Mutt/1.5.20 (2009-06-14)

On Wed, Sep 04, 2019 at 13:48:33 -0300, Chris Mitchell wrote:
> While doing some tests on managing KVM's .qcow2 files, I discovered the
> following behaviour in GNU tar:
> 
> `tar --use-compress-prog=pbzip2 -cSf archive.tar.pbz2
> path/to/file.qcow2` produces a compressed tar archive as expected, but
> then when that archive is extracted using `tar
> --use-compress-prog=pbzip2 -xSf archive.tar.pbz2`, the extracted .qcow2
> file's size has ballooned from a 'real' size of just the contents to a
> 'real' size matching the entire 'apparent' size (as reported by `ls`
> and `du`).
> 
> I'm no filesystem expert, and I'm new at dealing with sparse files, but
> I believe the above behaviour indicates that during either archive
> creation or extraction, zeros have been written where there were holes,
> resulting in a non-sparse file.

I don't have any specific answer to your question, but here are some
general comments on the topic:

* The tar info page (e.g. 
    https://www.gnu.org/software/tar/manual/html_chapter/tar_8.html#SEC137
  section "8.1.2 Archiving Sparse Files") explains that the "-S" option
  is only meaningful on archive creation (or update), not on extraction. 
  If -S is in effect when the archive is created, tar will detect
  sparse files being added to the archive and mark that status within
  the archive.  (Any sparse member found within the archive during
  extraction will be created as a sparse file, with or without the -S
  option on the "tar -x" command.)  

* In general, tar creates the archive "file" first, and then pipes the
  contents of the archive to the compression program -- so that
  sparse-file-detection step should happen before the compression program
  is involved in any way.  
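
As a rough illustration of the two points above, a round trip along these
lines should show whether a member recorded as sparse comes back sparse
without -S on extraction. This is only a sketch: the function name, the
temporary directory and the demo.img file are made up for the example, and
it assumes GNU tar, GNU coreutils (truncate, du --apparent-size) and a
filesystem that supports holes.

```shell
# Hypothetical helper: archive a sparse file with -S, extract it without -S,
# then print the allocated and apparent sizes (in KiB) of the extracted copy.
sparse_roundtrip() {
    work=$(mktemp -d) || return 1
    mkdir "$work/out"
    # A few bytes of real data followed by a hole out to 8 MiB.
    printf 'data' > "$work/demo.img"
    truncate -s 8M "$work/demo.img"
    # -S matters here, at creation time: tar records the member as sparse.
    tar -cSf "$work/demo.tar" -C "$work" demo.img
    # No -S on extraction.
    tar -xf "$work/demo.tar" -C "$work/out"
    # First line: KiB actually allocated.  Second line: apparent size in KiB.
    du -k "$work/out/demo.img" | cut -f1
    du -k --apparent-size "$work/out/demo.img" | cut -f1
    rm -rf "$work"
}
```

If the first number printed is much smaller than the second, the extracted
copy is still sparse. Adding --use-compress-prog=pbzip2 to both tar commands
would turn this into a test of the case reported above.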
  

 
> I also tested both tar's own bzip2 support with `tar -cSjf`, and piping
> from tar without compression to pbzip2, both of which preserved the
> sparseness of the file when extracted. (The 'apparent' size matching the
> storage quota I gave when creating the volume in KVM, but 'real' size
> matching just the size of the contents.)

It might be helpful to post the exact commands you used to test each of
these scenarios.

Sounds like 
  $ tar --use-compress-prog=pbzip2 -cSf pbz2archive.tar.pbz2 path/to/file.qcow2
and 
  $ tar -cSjf bz2archive.tar.bz2 path/to/file.qcow2 
should produce (essentially) identical .tar.*bz2 files.  Do they?  (Or,
if you decompress each one, are the two resulting *.tar files
essentially identical?)
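
One quick way to compare the two without keeping the decompressed copies
around is to look at the size of the tar stream inside each archive: if the
holes were recorded as holes, the stream should be close to the 'real' size
of the file rather than its 'apparent' size. A sketch, where the function
name is hypothetical and the default decompressor is an assumption (pbzip2
output should be readable by plain bzip2, but pass another command if not):

```shell
# Hypothetical helper: print the size in bytes of the uncompressed tar
# stream inside an archive.
# Usage: tar_stream_size ARCHIVE [DECOMPRESS-COMMAND...]
tar_stream_size() {
    archive=$1
    shift
    # Assumed default: bzip2 -dc, reading the archive on stdin.
    [ "$#" -gt 0 ] || set -- bzip2 -dc
    # Count the decompressed bytes; tr strips any padding wc may add.
    "$@" < "$archive" | wc -c | tr -d ' '
}

# Example calls (file names taken from the commands above):
#   tar_stream_size pbz2archive.tar.pbz2
#   tar_stream_size bz2archive.tar.bz2
```

Two very different numbers would suggest that the sparse-file detection
behaved differently when the archives were created.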

                                                        Nathan

p.s. I have found that "ls -sl --block-size=1" is a handy way to see in
one command whether a file is sparse or not:
  $ ls -sl --block-size=1 temp.sparse
  4096 -rw-r--r-- 1 root root 16896 Mar 30  2014 temp.sparse

The first column gives the number of bytes allocated to the file (since
block-size is set to 1), while the 6th column gives the "file size"
field -- so if the first column is smaller than the 6th, you know you
have a sparse file.
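
The same comparison can be scripted. A minimal sketch, assuming GNU stat
(for the -c format option); the function name is made up for illustration:

```shell
# Hypothetical helper: exit status 0 if fewer bytes are allocated to the
# file than its size field claims, 1 if not, 2 if stat fails.
is_sparse() {
    # GNU stat: %b = blocks allocated, %B = bytes per block, %s = file size
    info=$(stat -c '%b %B %s' -- "$1") || return 2
    set -- $info
    [ "$(( $1 * $2 ))" -lt "$3" ]
}

# Example call:
#   is_sparse temp.sparse && echo "temp.sparse is sparse"
```

For the temp.sparse listing above this compares 4096 allocated bytes with a
size of 16896, so the file counts as sparse.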





----------------------------------------------------------------------------
Nathan Stratton Treadway  -  address@hidden  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


