[rdiff-backup-users] Re: How are moved/renamed files treated?
From: Jens Benecke
Subject: [rdiff-backup-users] Re: How are moved/renamed files treated?
Date: Fri, 08 Aug 2003 16:25:58 +0200
User-agent: KNode/0.7.6
Ben Escoto wrote:
>>>>>> "JB" == Jens Benecke <address@hidden>
>>>>>> wrote the following on Wed, 06 Aug 2003 15:39:23 +0200
>
> JB> I am doing this because in the home directories and also in
> JB> /var/log, there are a lot of files that get renamed daily (like
> JB> file.dat from yesterday becomes file.1.dat, file.1.dat becomes
> JB> file.2.dat, etc) but the contents stay the same. Thus, I (expect
> JB> to) benefit from rsync's ability to detect identical parts in
> JB> files, even if not at the same place.
>
> JB> How does rdiff-backup treat such files?
>
> Sorry, rdiff-backup is too dumb to know that a file has moved. To
> rdiff-backup, moving file A to B is equivalent to creating a new file
> B and deleting A, so increments will contain duplicate information.
> There have been some proposals about tracking renames through inode
> numbers or similar file names, but they are pretty complicated and I
> don't intend to implement them.
I would suggest tracking changed files via MD5 sums. Inode numbers only help
for moved files (not copied ones), and only where the file system supports
inodes.
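
To make the MD5 suggestion concrete, here is a minimal sketch in Python. It is not how rdiff-backup works; the function names (md5_of_file, index_tree, find_renames) are hypothetical, and it only catches files whose complete contents are unchanged between two snapshots.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_tree(root):
    """Map each path (relative to root) to the MD5 of the file's contents."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            index[os.path.relpath(full, root)] = md5_of_file(full)
    return index


def find_renames(old_index, new_index):
    """Return (old_path, new_path) pairs where a path in the new snapshot
    holds content that the old snapshot stored under a different name."""
    by_digest = {}
    for path, digest in old_index.items():
        by_digest.setdefault(digest, []).append(path)
    renames = []
    for path, digest in sorted(new_index.items()):
        if old_index.get(path) == digest:
            continue  # same name, same content: unchanged
        candidates = by_digest.get(digest)
        if candidates:
            # Several old files may share the content; pick one deterministically.
            renames.append((sorted(candidates)[0], path))
    return renames
```

Calling find_renames(index_tree(old_dir), index_tree(new_dir)) on two snapshot directories would list the rotated files, so a backup tool could store a rename record instead of a second copy. A file that was renamed and also modified has a different digest and is not matched by this approach.
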
> That being said, log files compress well, so you may want to try it
> anyway in case the duplication is tolerable.
It is not... that's why I asked. The files are 100M-1000M in size and
compress to maybe 10M-50M, but the change within them is only maybe 10k-100k.
And there are hundreds of those files.
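
The figures above can be turned into a rough per-increment estimate. This is only a back-of-the-envelope sketch: the file count of 200 is a hypothetical stand-in for "hundreds", and "M" and "k" are assumed to be decimal megabytes and kilobytes.

```python
MB = 1000 * 1000  # assumption: "M" in the message means decimal megabytes
KB = 1000         # assumption: "k" means decimal kilobytes


def cost_range(n_files, low_bytes, high_bytes):
    """Return the (low, high) total bytes one increment adds for n_files files."""
    return n_files * low_bytes, n_files * high_bytes


N_FILES = 200  # hypothetical stand-in for "hundreds of those files"

# Whole-file duplication: each renamed file is stored again, compressed.
duplicated = cost_range(N_FILES, 10 * MB, 50 * MB)
# Delta storage: only the changed 10k-100k per file is stored.
delta = cost_range(N_FILES, 10 * KB, 100 * KB)

print("duplicated per increment: %d-%d MB" % (duplicated[0] // MB, duplicated[1] // MB))
print("delta per increment:      %d-%d MB" % (delta[0] // MB, delta[1] // MB))
# Smallest and largest overhead factor of duplication relative to deltas.
print("overhead factor: %d-%d" % (duplicated[0] // delta[1], duplicated[1] // delta[0]))
```

Under these assumptions, duplicating the compressed files costs gigabytes per increment, while storing only the changes would cost a few megabytes, a difference of two to three orders of magnitude.
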
I am currently tar'ing the files together and running rsync on the single
.tar file. I keep three to four generations. Additionally, I do separate daily
incremental backups; the archives discussed here are only the full monthly
ones.
Can I use rdiff-backup on this _one_ tar file to avoid storing lots of
duplicate data on the local backup server? Currently it looks like this
(excerpt):
total 26835416
-rw------- 1 jens 4060907520 2003-04-09 03:49 RESCUE_2003-04-10.tar
-rw------- 1 jens 4300564480 2003-05-12 03:56 RESCUE_2003-05-12.tar
-rw------- 1 jens 5541826560 2003-06-25 03:58 RESCUE_2003-06-25.tar
-rw------- 2 jens 6777825280 2003-08-04 04:11 RESCUE_2003-08-04.tar
-rw------- 2 jens 6777825280 2003-08-04 04:11 RESCUE.tar
I'd say 95% of the data in the .tar files is identical.
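
One rough way to check an estimate like this is to compare the archives block by block. The sketch below is an assumption-laden illustration, not a feature of rsync or rdiff-backup: it relies on the tar format storing member data in 512-byte blocks, so unchanged file data stays aligned to block boundaries even when members are added or renamed. The function names are hypothetical.

```python
import hashlib


def block_digests(path, block_size=512):
    """Yield the MD5 digest of each block_size-byte block of the file."""
    with open(path, "rb") as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                break
            yield hashlib.md5(block).digest()


def shared_block_fraction(old_path, new_path, block_size=512):
    """Return the fraction of blocks in new_path that also occur,
    at any block position, in old_path."""
    old_blocks = set(block_digests(old_path, block_size))
    total = shared = 0
    for digest in block_digests(new_path, block_size):
        total += 1
        if digest in old_blocks:
            shared += 1
    return shared / total if total else 0.0
```

For example, shared_block_fraction("RESCUE_2003-06-25.tar", "RESCUE_2003-08-04.tar") would give a number to compare against the 95% guess. For archives of several gigabytes the set of digests needs a lot of memory, so this is practical only on a machine with plenty of RAM or on a sample of the data.
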
Can I run rdiff-backup ssh://myserver/home/.RESCUE.tar
/home/backups/RESCUE.tar and have the difference between the (newer) remote
.RESCUE.tar and the local RESCUE.tar saved to a local RESCUE.rdiff (or
something similar), so that I can recreate the newest RESCUE.tar locally when
I need it?
Thank you!
--
Jens Benecke (address@hidden)