Re: [rdiff-backup-users] Some thoughts and questions for a "large" rdiff
From: Ben Escoto
Subject: Re: [rdiff-backup-users] Some thoughts and questions for a "large" rdiff-backup setup.
Date: Sat, 16 Aug 2003 01:41:47 -0700
>>>>> "EF" == Erik Forsberg <address@hidden>
>>>>> wrote the following on Thu, 14 Aug 2003 19:49:49 +0200
EF> Hi! I'm thinking about using rdiff-backup to implement a backup
EF> solution for the Academic Computer Society I'm a member
EF> of. We're talking 500+ members, each with their own home
EF> directory, a mailserver with mailspool, web server with webdisk
EF> and a bunch of other disks that need backup. All members have
EF> Unix shell access on a variety of Hardware/Operating system
EF> combinations (Solaris, Linux, AIX, HP-UX, Tru64, UNICOS, you
EF> name it :) ).
...
EF> My workaround here could be to run rdiff-backup once for each
EF> user, backing to a directory where only the user can
EF> read. Comments on this? I have to admit I'm not *that* happy
EF> about having to run 500+ rdiff-backups each night, each as its
EF> own user. Trying to parallelize it could be a nightmare. Better
EF> ideas?
I have no experience running a system like this, but since no one else
has responded I'll add my 2c. It seems to be a good idea to run a
separate session for each user. That way older increments can be
removed on a user-by-user basis. Also, if there is a problem (for
instance, rdiff-backup apparently had trouble handling sockets on one
of your systems), it will only affect one user.
If rdiff-backup takes 2 seconds of overhead to start a new session, that
would still only be about 17 minutes (1000 seconds) for 500 sessions.
It probably wouldn't be a good idea to run all 500 at the same time.
Running two or three at once may be faster than running one at a time,
though; it probably depends on the system.
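A minimal sketch of running one session per user with two or three at
once, in Python. The /home and /backup paths, the function names, and
the plain "rdiff-backup SOURCE DEST" invocation are assumptions for
illustration; it does not switch to each user's account, which a real
setup would need to handle separately.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(user, source_root="/home", dest_root="/backup"):
    """Return the argv for one user's session (paths are hypothetical)."""
    return ["rdiff-backup", f"{source_root}/{user}", f"{dest_root}/{user}"]


def run_session(user, runner=subprocess.call):
    """Run one session and return (user, exit code)."""
    return user, runner(build_command(user))


def backup_all(users, max_workers=3, runner=subprocess.call):
    """Run one session per user, at most max_workers at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Threads are enough here: each worker just waits on a subprocess.
        return dict(pool.map(lambda u: run_session(u, runner), users))
```

backup_all(["alice", "bob"]) would return a dict mapping each user to
the exit code of that user's session; max_workers is the knob to tune
per system.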
Also you may not want to let the users write to the rdiff-backup
directory, since they might mess it up.
EF> On a different side, wouldn't it be nice if rdiff-backup could
EF> auto-clean its destination directory when it's getting
EF> full. That is, it would be nice if you could specify to
EF> rdiff-backup that it should use up to a specific amount of
EF> diskspace in the destination directory, and if a new backup
EF> doesn't fit in, it should try cleaning one day of backups.
EF> I guess this isn't that easy to implement, especially since you
EF> don't know how much space a new backup will occupy. Ideas on
EF> solving this problem?
Yes, if anyone has any ideas about this problem let me know. Right
now the only really convenient way to limit the space used is to run
rdiff-backup with, for instance, --remove-older-than 30D (for 30 days)
every once in a while. There is no option to remove just the number of
increments that would let you complete the current backup (this would
actually be very hard to add).
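As a rough sketch, a periodic cleanup job could build the
--remove-older-than call like this. The function names and the
/backup/alice path are hypothetical; the optional --force flag is
included on the assumption that your rdiff-backup version asks for it
when more than one increment would be removed (check the man page).

```python
import subprocess


def prune_command(dest_dir, age="30D", force=False):
    """Return the argv that removes increments older than `age`."""
    cmd = ["rdiff-backup", "--remove-older-than", age]
    if force:
        cmd.append("--force")
    cmd.append(dest_dir)
    return cmd


def prune(dest_dir, age="30D", force=False, runner=subprocess.call):
    """Run the cleanup and return rdiff-backup's exit code."""
    return runner(prune_command(dest_dir, age, force))
```

For example, prune("/backup/alice") would run
rdiff-backup --remove-older-than 30D /backup/alice.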
Another thing to watch out for is running out of space. rdiff-backup
is supposed to fail gracefully (as in, if it runs out of space it acts
as if the current session never existed), but subsequent sessions will
also fail. I'm not sure what the best way to handle this is, but it
is probably a good idea to check rdiff-backup's exit code and trigger
some warning if it fails to complete (non-zero code).
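One possible way to do the exit-code check from a wrapper script,
sketched in Python. The function name and the warn callback are
assumptions; in practice warn might send mail or write to syslog
instead of printing to stderr.

```python
import subprocess
import sys


def run_and_check(cmd, runner=subprocess.call, warn=None):
    """Run cmd; on a non-zero exit code call warn(message) and return False."""
    if warn is None:
        # Default warning channel: standard error.
        warn = lambda msg: print(msg, file=sys.stderr)
    code = runner(cmd)
    if code != 0:
        warn(f"rdiff-backup failed (exit code {code}): {' '.join(cmd)}")
        return False
    return True
```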
EF> Another feature that would be really nice on this system would
EF> be to have .nsr-lookalike-files. For those of you not familiar
EF> with Networker, .nsr files are a way to specify how to handle a
EF> specific file, a specific directory or a specific directory and
EF> all its subdirectories. By putting a file named .nsr in a
EF> directory, you can for example say that the subdirectory 'trash'
EF> never should be copied to the backup, or that a specific logfile
EF> should be ignored. Are there any plans for such functionality in
EF> rdiff-backup? How hard would it be to implement? (We might be
EF> able to help, we love Python :) ).
You could write a script to check for $HOME/.rdiff-backup-excludes.
If it exists, rdiff-backup is run with --exclude-filelist
$HOME/.rdiff-backup-excludes. This is actually pretty flexible since
includes can also be in an exclude file (see man page).
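A small sketch of such a wrapper in Python. The function name and the
destination argument are hypothetical; only the exclude file name and
the --exclude-filelist option come from the suggestion above.

```python
import os


def build_command(home, dest, exclude_name=".rdiff-backup-excludes"):
    """Return the rdiff-backup argv for one home directory.

    --exclude-filelist is added only if the user has created the file.
    """
    cmd = ["rdiff-backup"]
    exclude_file = os.path.join(home, exclude_name)
    if os.path.isfile(exclude_file):
        cmd += ["--exclude-filelist", exclude_file]
    return cmd + [home, dest]
```

Users without the file simply get a full backup of their home
directory.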
EF> Another idea I had for this setup was to set a quota on the
EF> amount of backup space a specific user could use, and then let
EF> the user him/herself choose how many days he/she wants the
EF> backup to cover by adapting what directories should be copied
EF> (using a mechanism such as the one above, the
EF> .nsr-file-lookalike one).
Yes, probably a good idea, make the user decide/hangself...
--
Ben Escoto