Re: [rdiff-backup-users] Q. on max-file-size behavior
From: Maarten Bezemer
Subject: Re: [rdiff-backup-users] Q. on max-file-size behavior
Date: Sun, 14 Mar 2010 15:31:13 +0100 (CET)
On Sat, 13 Mar 2010, Whit Blauvelt wrote:

> On Sat, Mar 13, 2010 at 11:58:42PM +0100, Jernej Simončič wrote:
>> I'd say this is expected behaviour - the destination saw the file on
>> previous run, but didn't see it on current run (because the source
>> likely doesn't inform it about files it skips), so it treats the file
>> as deleted on source.
>
> Probably so. A corner case then. Even though it would be easy for the source
> to inform it about files skipped and avoid this, it's probably not worth the
> coding effort.
I don't think this is even a corner case. If you want to exclude large
files, then a file that is larger than the limit you specify (something
you explicitly and deliberately do!) should not be in the backup. Also, it
should not _remain_ in the 'current' backup tree, because it would no
longer match the original in the source tree.
Since rdiff-backup keeps history of the backups, there is no other way
than to treat it as 'deleted from the source'. That's the only way to keep
the history intact AND have a proper 'current' backup tree.
> Another question comes up though. If gzip'ing a huge file can cause a
> reasonably fast machine to tie up considerable resources for > 30 minutes
> because its logic tells it it's time to gzip a 16g file, it would be good
> if there's a way to ask it not to do that.
Why would it?
If you want to remove a file from the backup (including the history), feel
free to add wishlist-items for patches or external tools to accomplish
that. Aside from that, you could also run rdiff-backup with nice and/or
ionice so it wouldn't "tie up" resources.
(BTW, spending 30 minutes on a 16GB file, I don't think that would be so
strange. Even md5sum-ing a 4.7GB iso image can take a few minutes on a
busy system with lots of disk i/o.)
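
As a rough illustration of the nice/ionice suggestion, here is a minimal Python sketch that only builds the wrapped command line and prints it. The source and destination paths are hypothetical, the niceness of 19 and the idle I/O class (3) are just one reasonable choice, and ionice is assumed to be the Linux util-linux tool.

```python
import shlex

def low_priority(command, niceness=19, io_class=3):
    """Prefix a command with nice and ionice so it yields CPU and disk I/O."""
    return ["nice", "-n", str(niceness), "ionice", "-c", str(io_class), *command]

# Hypothetical source and destination paths, for illustration only.
backup = ["rdiff-backup", "/home/user", "/mnt/backup/home"]
wrapped = low_priority(backup)

# Print the command line instead of running it.
print(shlex.join(wrapped))
```

The resulting list could be handed to subprocess.run() from a cron wrapper, or the printed line pasted into a shell.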
> I see that compression can be
> turned off for all files, but not how to turn compression off just for the
> largest files. Is there some trick that would accomplish that? Basically,
> compression on smaller files is always good; compression on the very largest
> files almost always bad; and somewhere in between - depending on system
> resources - it gets iffy. It would be useful to have a flag to set a
> file-size threshold where only files below that would compress.
These are quite strong claims without any proof or supporting theory.
Compressing a 7KB file might indeed make it considerably smaller, suppose
it would be 4.1K when zipped. But on file systems with 4KB blocks, that
would not even save 1 block. And filesystems supporting multiple 16GB
files tend to have larger block sizes...
Larger files on the other hand can often be compressed with much larger
space-savings. As always, it all depends on the type of data in the files,
so YMMV.
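
The block arithmetic for the 7KB example can be checked with a few lines of Python. This is a simplified model that assumes every file occupies a whole number of 4KB blocks and ignores filesystem features such as tail packing or inline data.

```python
import math

def blocks_used(size_bytes, block_size=4096):
    """Number of whole filesystem blocks needed to store size_bytes."""
    return math.ceil(size_bytes / block_size)

original = 7 * 1024            # the 7KB file
compressed = int(4.1 * 1024)   # about 4.1K after compression

saved = blocks_used(original) - blocks_used(compressed)
print(blocks_used(original), blocks_used(compressed), saved)  # prints: 2 2 0
```

Both versions need two blocks, so the compression saves nothing on disk in this case.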
Contrary to what you suggest, I could think of two wishlist-items that
would make more sense. And I'm not even posting them as wishlist-items as
I don't think they would be worth implementing.
1) limit the (cpu) time spent on compressing a file, and leave the file
uncompressed when it takes too long. Heck, maybe even make it a
user-configurable duration.
2) if compressing is taking longer than X seconds/minutes, check if
compression is doing any good (check compression ratio for the part of
the file that has already been processed) and leave the file
uncompressed when the ratio suggests it wouldn't be worth continuing
the compression process.
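
To make the two ideas concrete, here is a minimal Python sketch using the standard zlib module. It is not how rdiff-backup works and not a proposed patch. The function name, the thresholds and the "return None to mean store uncompressed" convention are all assumptions made up for illustration.

```python
import time
import zlib

def compress_with_budget(chunks, max_seconds=60.0, check_after=1.0, min_saving=0.05):
    """Compress an iterable of byte chunks, giving up if it is not worthwhile.

    Returns the compressed bytes, or None when the caller should store the
    data uncompressed. All thresholds are illustrative assumptions.
    """
    compressor = zlib.compressobj()
    out = []
    bytes_in = 0
    bytes_out = 0
    start = time.process_time()  # CPU time rather than wall-clock time
    for chunk in chunks:
        # Sync-flush after each chunk so bytes_out reflects everything read
        # so far; this costs a little compression efficiency.
        data = compressor.compress(chunk) + compressor.flush(zlib.Z_SYNC_FLUSH)
        out.append(data)
        bytes_in += len(chunk)
        bytes_out += len(data)
        elapsed = time.process_time() - start
        # Idea 1: hard limit on CPU time spent compressing this file.
        if elapsed > max_seconds:
            return None
        # Idea 2: after a while, stop if the ratio so far is too poor.
        if elapsed > check_after and bytes_out > bytes_in * (1 - min_saving):
            return None
    out.append(compressor.flush())
    return b"".join(out)
```

Note that the ratio check only looks at the part of the file already processed, so a file whose savings come mostly at the end would be given up on too early.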
Both of these would not help me with the disk image files I have here.
Those tend to have large space-savings at the end of the file. But then
again, I wouldn't use rdiff-backup on them anyway.
Just my 2 cents.
Maarten