RFC: How du counts size of hardlinked files
From: Johannes Niess
Subject: RFC: How du counts size of hardlinked files
Date: Thu, 12 Jan 2006 23:59:47 +0100
User-agent: KMail/1.9
Hi list,
du (with default options) seems to count a file with multiple hard links only under the first directory it traverses; the -l option changes that and counts every link at full size.
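The default behaviour described above can be sketched as a seen-inode test. This is a minimal Python sketch, not du's actual implementation; the function name `du_default` is made up for illustration:

```python
import os

def du_default(root):
    """Total apparent size of a tree, counting each multiply-linked
    inode only once -- on its first visit -- as du does by default.
    (A hypothetical sketch, not du's real code.)"""
    seen = set()          # (device, inode) pairs already counted
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if st.st_nlink > 1:
                key = (st.st_dev, st.st_ino)
                if key in seen:
                    continue  # a hard link to an inode we already counted
                seen.add(key)
            total += st.st_size
    return total
```

Dropping the `seen` test entirely would give the -l behaviour, where every link is counted at full size.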
But there are other valid viewpoints.
Somehow the byte count of a multiply-hardlinked file partially belongs to all of its links, even those outside the traversed directories. In this mode a file with 10 bytes and 3 hard links would be counted as 3 files of 3 bytes (and only one hard link) each. The integer rounding error is acceptable in this 'approximate' mode. Programmatically this should be very similar to the -l mode. Use case: the hard links have different physical owners, and you want fair accounting between them. (Of course the inode has only one common logical owner for all of its directory entries.)
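The 'approximate' mode above amounts to charging each directory entry size/nlink bytes. A minimal Python sketch of that idea, with a hypothetical name `du_fair_share`:

```python
import os

def du_fair_share(root):
    """'Approximate' mode sketch: each directory entry of a
    multiply-linked inode is charged st_size // st_nlink bytes,
    so the cost is split across all links, even links outside
    the traversed tree. Integer division loses a few bytes,
    which this mode accepts."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            total += st.st_size // st.st_nlink
    return total
```

As the post notes, this is structurally close to -l mode: walk every entry, just scale each contribution by the link count.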
Not counting hard links that are both multiple AND out-of-tree is also useful: it tells us how much space we would really gain by deleting that tree. 'rm-size' could be a name for this mode. Programmatically it is similar to the default mode: in Perl I'd use hash keys for the seen-inode test in default mode, while in 'rm-size' mode I'd increment the hash value for each visited inode and finally compare the number of visited directory entries with the inode's link count.
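The visit-counting scheme just described can be sketched in Python (a translation of the Perl hash idea, not an existing du feature; `du_rm_size` is a made-up name):

```python
import os
from collections import defaultdict

def du_rm_size(root):
    """'rm-size' sketch: bytes actually freed by removing the tree.
    An inode's size counts only if every one of its links lies
    inside the tree; otherwise deleting the tree frees nothing for
    it. Mirrors the post's idea: count visits per inode, then
    compare the visit count against st_nlink at the end."""
    visits = defaultdict(int)   # (dev, ino) -> links seen in the tree
    info = {}                   # (dev, ino) -> (size, nlink)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            visits[key] += 1
            info[key] = (st.st_size, st.st_nlink)
    return sum(size for key, (size, nlink) in info.items()
               if visits[key] == nlink)
```

An inode with a link outside the tree fails the `visits[key] == nlink` test and contributes nothing, which is exactly the "space really gained by rm -r" semantics.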
du seems to be the natural home for this functionality. Or is it feature
bloat?
Background: backups via 'cp -l' need (almost) no extra space for files unchanged across several cycles. But these shadow forests of hard links are difficult to account for, especially when combined with finding and linking identical files across several physical owners.
Johannes Niess
P.S.: I'm not volunteering to implement this; I did not even feel enough need to write the Perl scripts myself.