
Re: [rdiff-backup-users] Restructuring an archive


From: Dominic
Subject: Re: [rdiff-backup-users] Restructuring an archive
Date: Thu, 13 Nov 2008 09:10:03 +0000
User-agent: Thunderbird 2.0.0.17 (Windows/20080914)

Hi Alan

I'm new to rdiff-backup, and I'm trying to understand the pros and cons, including any cons that might not become apparent until two years down the road.

What is the problem with your pre-existing backup arrangement of current files plus a vast number of small incremental reverse-diff files? Are you seeking to save disk space? To make it faster to recover the file versions that existed, say, two years ago? In general I would have thought it very rare that people want to recover versions of files from a long time ago, so it is brilliant that rdiff-backup offers this facility, and it doesn't matter if it is a bit slow doing so. Also, in such cases one will usually be recovering a specific file rather than a whole treeful.

I'm also curious about the additional disk-space overhead created by frequent rdiff-backup runs. If one backs up daily, how much more disk space is used than if one backs up weekly? In theory no more, since seven daily incremental diffs contain the same information as one weekly incremental diff.

My last question, if anyone can help, is a bit off-topic, but it comes in here because, like my last question, it relates to the timing and frequency of backups: how does rdiff-backup deal with locked (in-use) files when backing up, and what strategies can be suggested for handling these? I would be backing up mostly from Windows computers and from Samba shares, certainly including database and email files which might be in use at the time of the backup. In fact, these would be both the biggest and the most critical files to back up.

Regards

Dominic



Alan Douglas wrote:
To answer my own questions... for this simple case, this method of directly
applying rdiffs while keeping some interim versions is working extremely
well. It looks like it will build the new archive in less than a day,
compared with an estimated three months using rdiff-backup to restore each
increment.

The gotchas I discovered were:

- you use "rdiff patch" to apply the rdiffs, not "patch" (duh)
- if a file has no rdiff increment for a given backup, then you copy the version from the newer backup, not the older
- be sure to set the timestamp and permissions after patching, and use "cp -p" when copying.
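For what it's worth, the gotchas above can be folded into a small per-file helper. This is only a minimal sketch, assuming a newest-to-oldest walk; the file names and layout are illustrative, and the mtime trick assumes the increment file carries the old version's timestamp (worth verifying against rdiff-backup's own restores):

```shell
# One reverse step for one file. "newer" is the already-restored newer
# version, "delta" the reverse diff for this increment (already gunzipped),
# "out" the older version being rebuilt.
restore_step() {
    newer=$1 delta=$2 out=$3
    if [ -f "$delta" ]; then
        rdiff patch "$newer" "$delta" "$out"   # librsync's rdiff, not patch(1)
        touch -r "$delta" "$out"               # assumes increments carry the old mtime
        chmod --reference="$newer" "$out"      # assumes permissions are unchanged
    else
        cp -p "$newer" "$out"                  # no increment: copy the NEWER version, keep metadata
    fi
}
```

The rdiff branch just follows the gotcha list; the metadata handling is an assumption, so spot-check a few restored files before trusting it for a whole archive.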

Buoyed by this success, I'm going to see if I can extend the method to more 
general cases.  I might also look at hacking archfs to use this approach.  

  
Date: Mon, 10 Nov 2008 11:56:32 -0700
From: Alan Douglas <address@hidden>
Subject: [rdiff-backup-users] Restructuring an archive
To: address@hidden

I've been running rdiff-backup for two years, and now desperately need to
restructure things (this is with version 1.1.15 on Ubuntu).  I've tried
SplitRdiffBackup but it was taking far too long and really wasn't doing
what I wanted.  I then wrote a script that would restore each increment in
turn for the necessary paths, then build a new archive using
--current-time, but again it was taking far too long.  I've looked at
archfs, but computationally speaking, it would be doing the same thing as my
script.

It looks like I'm going to have to roll my own solution -- one that applies
the rdiffs in a more intelligent fashion.

At least for this first phase, I am working with files from a single
directory, which keeps it simple.

My plan is to have a seeding script that will restore back to the oldest
increment by applying the rdiffs directly using patch, but every ten
increments would save an intermediate version of the files.  A second
script would then restore to each increment (adding it to the new archive),
by applying rdiffs to the closest intermediate version.
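As a rough sketch of that seeding pass, with apply_increment as a hypothetical stand-in (a stub here, so only the checkpointing logic is visible):

```shell
# Walk n reverse increments back from current, snapshotting every 10th
# version. In a real run, apply_increment would rdiff-patch each file,
# and the checkpoint line would be something like: cp -a work/ "checkpoint.$i/"
apply_increment() { echo "patched to increment $1"; }   # stub for illustration

seed() {
    n=$1
    i=1
    while [ "$i" -le "$n" ]; do
        apply_increment "$i"
        if [ $((i % 10)) -eq 0 ]; then
            echo "checkpoint at $i"
        fi
        i=$((i + 1))
    done
}
```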

This method should run about 65 times faster than the brute force approach,
while eating about 200GB for all the intermediate copies.  If I had a
couple of TBs of spare disk space, it could be done a lot faster and
simpler, but I don't.
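The second script's bookkeeping then reduces to picking a starting point: to restore the version k reverse steps from current, begin at the nearest checkpoint at or before k and apply at most nine more reverse diffs. A sketch, assuming the checkpoint spacing of ten described above:

```shell
# Nearest checkpoint not past the target; checkpoint 0 is the current mirror.
# Restoring target k then costs at most (k mod 10) patches instead of k.
start_from() {
    echo $(( $1 / 10 * 10 ))
}
# Rough arithmetic behind the speedup (assuming roughly daily increments
# over two years, ~700 of them): brute force averages ~350 patches per
# restore; from a checkpoint it is at most 9 (average ~4.5, plus ~1
# amortized for the seeding pass), which is on the order of the 65x figure.
```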

Has anyone done anything like this before?  Are there any problems or
gotchas with applying the rdiffs directly, rather than restoring using
rdiff-backup? Are there any alternatives that I have missed?  I tried
searching the list archives but it was hard to find good search terms.

Thanks,
Alan


_______________________________________________
rdiff-backup-users mailing list at address@hidden
http://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki


  
