bug-coreutils

bug#38621: gdu showing different sizes


From: TJ Luoma
Subject: bug#38621: gdu showing different sizes
Date: Mon, 16 Dec 2019 01:25:40 -0500

I sort of followed most of the technical part of that but I still don’t
understand why it’s not a bug to show different information about two
identical files.

Which may indicate that I didn’t understand the technical part very well.

As an end user, it’s hard to understand how that inconsistency isn’t both
undesirable and a bug.

I could maybe see it if they were two files with the same byte-count but
different composition that made the calculations off by 1, but this is an
identical file and it’s showing up with two different sizes, in a tool
meant to report sizes.

That just seems “obviously” wrong even if it’s somehow technically
explainable.

TjL

On Sun, Dec 15, 2019 at 4:19 PM Bernhard Voelker <address@hidden> wrote:

> tag 38621 notabug
> close 38621
> stop
>
> On 2019-12-15 06:15, TJ Luoma wrote:
> > I ended up with two versions of the same file
> > 'StreamDeck-4.4.2.12189.pkg' and 'Stream_Deck_4.4.2.12189.pkg' and
> > wanted to check to see if they were the same file.
> >
> > I checked the size with `gdu` like so:
> >
> > % /usr/local/bin/gdu --si -s *pkg
> > 101M     StreamDeck-4.4.2.12189.pkg
> > 102M     Stream_Deck_4.4.2.12189.pkg
> >
> > Which led me to think they were different files / sizes. But when I
> > used `ls -l` I was surprised to see this:
> >
> > % command ls -l *pkg
> > -rw-r--r--  1 tjluoma  staff  88885047 Dec 15 00:00 StreamDeck-4.4.2.12189.pkg
> > -rw-r--r--@ 1 tjluoma  staff  88885047 Dec 15 00:02 Stream_Deck_4.4.2.12189.pkg
> >
> > So they _are_ the same size. Are they the same file? I used `md5` to
> > check:
> >
> > % command md5 -r *pkg
> > 98ac563a36386ca3aa87f62893302b4f StreamDeck-4.4.2.12189.pkg
> > 98ac563a36386ca3aa87f62893302b4f Stream_Deck_4.4.2.12189.pkg
> >
> > OK, so these are exactly the same file. So… why did `gdu` tell me they
> > are different sizes?
> >
> > %  gdu --version
> > du (GNU coreutils) 8.31
> > Copyright (C) 2019 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law.
> >
> > Written by Torbjorn Granlund, David MacKenzie, Paul Eggert,
> > and Jim Meyering.
> >
> > I'm using Mac OS X 10.14.6 (18G2022) with `coreutils` installed via `brew`.
> >
> > Any help would be appreciated.
>
> This is a "sparse" file, i.e., a file containing long runs of zeroes
> that can be stored more efficiently on disk, because no blocks need to
> be allocated for them.  Any application reading the data still gets the
> correct number of zeroes, while some disk space is saved.
>
> E.g. the following creates a 300M file in which the first 100M and the
> last 100M contain random data, and the 100M in between is a "hole":
>
>   # Write the 1st 100M (as usual).
>   $ dd bs=1M count=100 if=/dev/urandom of=f
>   100+0 records in
>   100+0 records out
>   104857600 bytes (105 MB, 100 MiB) copied, 0.466356 s, 225 MB/s
>
>   # Write another 100M, but starting at a position of 200M,
>   # thus leaving Zeroes in between.
>   $ dd bs=1M seek=200 count=100 if=/dev/urandom of=f
>   100+0 records in
>   100+0 records out
>   104857600 bytes (105 MB, 100 MiB) copied, 0.462072 s, 227 MB/s
>
>   $ ls -logh f
>   -rw-r--r-- 1 300M Dec 15 18:17 f
>
>   $ du -h f  # shows the space occupied on disk.
>   200M  f
>
>   $ du --apparent-size -h f  # shows the size applications would read.
>   300M  f
>
> See the documentation of 'cp' and 'du':
> https://www.gnu.org/software/coreutils/cp  (the --sparse option)
> https://www.gnu.org/software/coreutils/du  (the --apparent-size option)
>
> As this is not a bug in du(1), I'm marking it as such and closing the
> ticket in our bug tracker.  The discussion can continue, of course.
>
> Have a nice day,
> Berny
>
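
For readers who want to compare the two numbers from the quoted du(1)
example without du itself, here is a minimal Python sketch. It assumes a
POSIX system where stat reports st_blocks in 512-byte units; the helper
names `sizes` and `make_file_with_hole` are made up for illustration and
are not part of coreutils. Whether the skipped region is really left
unallocated depends on the filesystem, so the allocated figure may or may
not come out smaller than the apparent one.

```python
import os
import tempfile


def sizes(path):
    """Return (apparent_size, allocated_bytes) for path.

    apparent_size is what `ls -l` and `du --apparent-size` report;
    allocated_bytes approximates what plain `du` reports.
    """
    st = os.stat(path)
    # Assumption: st_blocks counts 512-byte units (true on POSIX systems).
    return st.st_size, st.st_blocks * 512


def make_file_with_hole(path, data_len=1 << 20, hole_len=8 << 20):
    """Write data_len random bytes, skip hole_len bytes, write data_len more."""
    with open(path, "wb") as f:
        f.write(os.urandom(data_len))
        # Seeking past the end and then writing leaves a gap that reads
        # back as zeroes; a filesystem with sparse-file support need not
        # allocate blocks for it.
        f.seek(hole_len, os.SEEK_CUR)
        f.write(os.urandom(data_len))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "f")
        make_file_with_hole(p)
        apparent, allocated = sizes(p)
        print(f"apparent size: {apparent} bytes")
        print(f"allocated:     {allocated} bytes")
```

The apparent size is always the sum of the three regions. The allocated
figure is the one that can differ between two byte-identical files,
since it reflects how each copy happens to be laid out on disk.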

