bug#10281: change in behavior of du with multiple arguments (commit
From: Jim Meyering
Subject: bug#10281: change in behavior of du with multiple arguments (commit
Date: Sat, 17 Dec 2011 10:20:09 +0100
Alan Curry wrote:
...
> By comparison to a proper tool which doesn't do any unnecessary traversals of
> extra directories, your use of du is slow and brittle (if the user forgets
> an alternate directory containing a link, the result is wrong) and has only
> the slight advantage of already being implemented.
>
> Here's a working outline of the single-traversal method. I wouldn't suggest
> that du should contain equivalent code. A single-purpose perl script, even
> without pretty output formatting, feels clean enough to me. Since I've gone
> to the trouble (not much) of writing it, I'll keep it as ~/bin/predict_rm_rf
> for future use.
>
> #!/usr/bin/perl -W
> use strict;
> use File::Find;
>
> @ARGV or die "Usage: $0 directory [directory ...]\n";
>
> my $total = 0;
> my %pending = ();
>
> File::Find::find({wanted => sub {
>   my ($dev,$ino,$nlink,$blocks) = (lstat($_))[0,1,3,12];
>   if(-d _ || $nlink==1) {
>     $total += $blocks;
>     return;
>   }
>   if($nlink == ++$pending{"$dev.$ino"}) {
>     delete $pending{"$dev.$ino"};
>     $total += $blocks;
>   }
> }}, @ARGV);
>
> print "$total blocks would be freed by rm -rf @ARGV\n";
That seems useful.
However, the number it prints is too large whenever it processes
a file or directory more than $nlink times, e.g., when invoked as
  predict_rm_rf F F
it prints double the correct number.
To account for that, the script must record every dev/ino pair
it processes, say via:
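  File::Find::find({wanted => sub {
    my ($dev,$ino,$nlink,$blocks) = (lstat($_))[0,1,3,12];
    # A negative entry marks a dev/ino pair that was already counted.
    defined $pending{"$dev.$ino"} && $pending{"$dev.$ino"} < 0
      and return;
    # Count directories and singly-linked files right away; count a
    # multiply-linked file only once its last link has been seen.
    if(-d _ || $nlink==1 || $nlink == ++$pending{"$dev.$ino"}) {
      $total += $blocks;
      $pending{"$dev.$ino"} = -1;
      return;
    }
  }}, @ARGV);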
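
The revised callback can be exercised on a throwaway tree. What follows is only a minimal sketch, assuming a POSIX-like file system with hard-link support; the names predict_blocks and make_demo_tree are hypothetical helpers introduced for illustration and are not part of the original script. It wraps the callback in a function that also returns how many inodes were counted, then compares naming a directory once with naming it twice.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: returns (blocks, inodes) that "rm -rf @roots"
# would free, using the negative-marker bookkeeping from the callback above.
sub predict_blocks {
    my @roots = @_;
    my ($total, $count) = (0, 0);
    my %pending;
    find({ wanted => sub {
        my ($dev, $ino, $nlink, $blocks) = (lstat($_))[0, 1, 3, 12];
        return unless defined $dev;    # lstat failed; skip this entry
        my $key = "$dev.$ino";
        # A negative entry means this dev/ino pair was already counted.
        return if defined $pending{$key} && $pending{$key} < 0;
        if (-d _ || $nlink == 1 || $nlink == ++$pending{$key}) {
            $total += $blocks;
            $count++;
            $pending{$key} = -1;
        }
    }}, @roots);
    return ($total, $count);
}

# Hypothetical helper: a temporary directory holding one file with two
# names (a and b are hard links to the same inode).
sub make_demo_tree {
    my $dir = tempdir(CLEANUP => 1);
    open(my $fh, '>', "$dir/a") or die "open: $!";
    print {$fh} 'x' x 65536;
    close($fh) or die "close: $!";
    link("$dir/a", "$dir/b") or die "link: $!";
    return $dir;
}

my $dir = make_demo_tree();
my ($blocks_once,  $inodes_once)  = predict_blocks($dir);
my ($blocks_twice, $inodes_twice) = predict_blocks($dir, $dir);
print "once:  $blocks_once blocks, $inodes_once inodes\n";
print "twice: $blocks_twice blocks, $inodes_twice inodes\n";
```

Both calls should count the same inodes: the directory itself and the one file behind a and b. One caveat that appears to remain: a file with a link outside the tree would still be counted if the same root is named $nlink times, because its pending count keeps increasing on each pass.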
Note that for a large tree, the perl code will be far less efficient
than C code like du because:
  - the perl script must call lstat for every single entry (du can
    use dirent.d_ino on some file systems). When I checked about a year
    ago, Perl still had no good way to get something like dirent.d_ino.
  - du uses a compact representation for a device/inode pair, so
    it may use a lot less memory.