bug-coreutils

bug#38621: gdu showing different sizes


From: TJ Luoma
Subject: bug#38621: gdu showing different sizes
Date: Mon, 16 Dec 2019 14:43:37 -0500

AHA! Ok, now I understand a little better. I have seen the difference
between "size" and "size on disk" and did not realize that applied
here.

I'm still not 100% clear on _why_ two "identical" files would have
different results for "size on disk" (it _seems_ like those should be
identical) but I suspect that the answer is probably of a technical
nature that would be "over my head" so to speak, and truthfully, all I
really need to know is "sometimes that happens" rather than
understanding the technical details of why.

I appreciate you taking the time to educate me further about this.

Cheers

Tj



On Mon, Dec 16, 2019 at 2:47 AM Bernhard Voelker
<address@hidden> wrote:
>
> On 2019-12-16 07:25, TJ Luoma wrote:
> > I sort of followed most of the technical part of that but I still don’t
> > understand why it’s not a bug to show different information about two
> > identical files.
> >
> > Which may indicate that I didn’t understand the technical part very well.
> >
> > As an end user, it’s hard to understand how that inconsistency isn’t both
> > undesirable and a bug.
> >
> > I could maybe see if they were two files with the same byte-count but
> > different composition that made the calculations off by 1, but this is an
> > identical file and it’s showing up with two different sizes, in a tool
> > meant to report sizes.
> >
> > That just seems “obviously” wrong even if it’s somehow technically
> > explainable.
>
> Thanks for following up on this for further clarifications.
>
> I think the problem is the word "size":
> while 'ls' and 'du --apparent-size' show the length of the content of
> a file, 'du' (without '--apparent-size') reports the space the file
> needs on disk.
>
>   $ du --help | sed 3q
>   Usage: du [OPTION]... [FILE]...
>     or:  du [OPTION]... --files0-from=F
>   Summarize disk usage of the set of FILEs, recursively for directories.
> ____________^^^^^^^^^^
>
> One reason for those sizes to differ is "holes".  As an extreme case,
> one can create a 4 Terabyte file (just NULs) on a filesystem which is
> much smaller than that:
>
>   # Filesystem size.
>   $ df -h --out=size,target .
>    Size Mounted on
>    591G /mnt
>
>   # Create a NUL-only file of size 4 Terabyte.
>   $ truncate -s4T f2
>
>   # 'ls' shows the 4T of file size.
>   $ ls -logh f2
>   -rw-r--r-- 1 4.0T Dec 16 08:36 f2
>
>   # 'du' shows that the file does not even require any disk usage.
>   $ du -h f2
>   0     f2
>
>   # ... but with '--apparent-size' reports the real (content) size.
>   $ du -h --apparent-size f2
>   4.0T  f2
>
>   # Any program will see the 4T content transparently.
>   $ wc -c < f2
>   4398046511104
>
> In your case, the file was a mixture of regular data and holes,
> and 'cp' (without --sparse=always) tried to automatically determine
> if the target file should have holes or not (see 'man cp').
> Therefore, your 2 files had a different disk usage, but the net length
> of the content is identical, of course.
>
> Have a nice day,
> Berny
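
The effect described in the quoted message can be reproduced with a short sketch. It assumes GNU coreutils and a file system that supports sparse files; the file names (orig, sparse_copy, dense_copy) and the sizes are hypothetical, chosen only for illustration. It builds a file that mixes real data with a hole, copies it once with holes and once without, and then compares content, apparent size, and disk usage.

```shell
#!/bin/sh
# Sketch only: assumes GNU coreutils (head -c, truncate, cp --sparse, du).
# File names and sizes are illustrative, not taken from the original report.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# 1 MiB of real data, then extend to 16 MiB: the tail becomes a hole
# on file systems that support sparse files.
head -c 1048576 /dev/urandom > orig
truncate -s 16M orig

# One copy that turns runs of NUL bytes into holes, one that writes them out.
cp --sparse=always orig sparse_copy
cp --sparse=never  orig dense_copy

# cmp prints nothing and exits 0 when the contents are byte-for-byte identical.
cmp orig sparse_copy
cmp orig dense_copy

# Apparent size: the length of the content, the same for all three files.
du --apparent-size --block-size=1 orig sparse_copy dense_copy

# Disk usage: the space actually allocated. The figures depend on the
# file system, so the copies may or may not differ here.
du --block-size=1 orig sparse_copy dense_copy
```

On a file system with sparse-file support, dense_copy would be expected to use noticeably more disk space than sparse_copy, even though cmp reports the contents as identical. This is the same situation as two "identical" files showing different numbers in plain 'du'.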




