[rdiff-backup-users] rdiff-backup memory usage problems
From: David
Subject: [rdiff-backup-users] rdiff-backup memory usage problems
Date: Thu, 20 Aug 2009 11:43:01 +0200
Hi there.
For massive filesystems (e.g. millions of files), rdiff-backup seems
to use a large amount of RAM, and the amount used keeps growing while
the scan proceeds.
Here is my setup:
rdiff-backup version 1.2.8-1 (the version that ships with Debian Lenny).
I set it up like this:
# Create 2 duplicate directories of an existing huge one:
rsync --progress --numeric-ids --delete \
  --link-dest=/some/huge/filesystem --exclude=/rdiff-backup-data -azH \
  /some/huge/filesystem/ /dest/dir/source
rsync --progress --numeric-ids --delete \
  --link-dest=/some/huge/filesystem --exclude=/rdiff-backup-data -azH \
  /some/huge/filesystem/ /dest/dir/dest
The above creates a hardlink snapshot copy of an existing huge
filesystem in two different directories. This is mainly for testing
purposes, but it is also how my backup scripts work internally, to
conserve hard drive space on the backup server.
# Run rdiff-backup:
rdiff-backup -v9 --preserve-numerical-ids --no-compare-inode --force \
  /dest/dir/source/ /dest/dir/dest/
The output is as expected:
[....]
Thu Aug 20 11:13:55 2009 Backup: must_escape_dos_devices = 0
Thu Aug 20 11:13:55 2009 Starting mirror new to files
Thu Aug 20 11:13:55 2009 Processing changed file .
I would have liked to see more detail with -v9, though, such as which
files are being compared.
While this is running, top reports that rdiff-backup is using a
steadily increasing percentage of memory. Eventually rdiff-backup
causes a lot of swapping, which slows everything down enormously and
causes other problems on the server. rdiff-backup also never finishes
(it was still running 4 days later), so the other backups never run.
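To put numbers on the growth that top shows, the resident memory of the
running process could be sampled at intervals. The following is only a
minimal sketch, assuming a Linux host where /proc/<pid>/status exposes a
VmRSS line; the function names (parse_vmrss_kb, read_rss_kb, log_rss)
are hypothetical helpers, not part of rdiff-backup.

```python
import time
from typing import Optional


def parse_vmrss_kb(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # Line looks like: "VmRSS:      1234 kB"
            return int(line.split()[1])
    return None  # e.g. kernel threads have no VmRSS line


def read_rss_kb(pid: int) -> Optional[int]:
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vmrss_kb(fh.read())


def log_rss(pid: int, interval_s: float = 60.0, samples: int = 10) -> None:
    """Print a timestamped RSS reading every interval_s seconds."""
    for _ in range(samples):
        print(time.strftime("%Y-%m-%d %H:%M:%S"), read_rss_kb(pid), "kB")
        time.sleep(interval_s)
```

Calling log_rss with the PID of the rdiff-backup process would give a
series of readings that can be compared against how far the scan has
progressed.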
This makes rdiff-backup unsuitable for backing up our servers with
larger filesystems :-(. I'm experimenting with other backup tools, but
I'd ideally like to use rdiff-backup, if this particular memory leak
were fixed. I'm even tempted to make my own version of rdiff-backup,
just to work around this issue :-( (since rdiff-backup's Python logic
looks really complicated).
Is it not possible for rdiff-backup to use an algorithm closer to
rsync, like an incremental file list, instead of loading a huge number
of per-file details into memory?
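As an illustration of what "incremental" could mean here, the sketch
below walks a tree lazily with a generator, so that only one
directory's entries (plus a stack of pending directory paths) are held
in memory at a time. It is an assumption-laden sketch, not how
rdiff-backup or rsync actually work internally; iter_tree is a
hypothetical name.

```python
import os
from typing import Iterator, Tuple


def iter_tree(root: str) -> Iterator[Tuple[str, os.stat_result]]:
    """Lazily yield (relative_path, lstat_result) for each entry under root.

    Entries of one directory are yielded in name order, then its
    subdirectories are visited. Nothing is kept for already-yielded files.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as it:
            entries = sorted(it, key=lambda e: e.name)
        subdirs = []
        for entry in entries:
            st = entry.stat(follow_symlinks=False)
            yield os.path.relpath(entry.path, root), st
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
        # Reverse so that subdirectories are popped in name order.
        stack.extend(reversed(subdirs))
```

A consumer would compare or process each yielded entry and then drop
it, for example `for relpath, st in iter_tree("/dest/dir/source"): ...`,
so memory use depends on the largest directory rather than on the
total number of files.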
I suspect that I may be causing this problem myself (with my
hardlink-based copies), but theoretically rdiff-backup should be able
to handle this in a memory-efficient way. And I need to use that kind
of logic to preserve hard drive space on the backup server.
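One way to check the hardlink suspicion is to count how many regular
files in a tree have more than one link. With the rsync --link-dest
setup above, nearly every unchanged file would be expected to fall into
that group. Whether rdiff-backup's memory use actually scales with that
count is an assumption, not something confirmed here; count_hardlinked
is a hypothetical helper for illustration only.

```python
import os
import stat
from typing import Tuple


def count_hardlinked(root: str) -> Tuple[int, int]:
    """Return (regular_files, regular_files_with_link_count_above_one)."""
    total = 0
    multi = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(st.st_mode):  # skip symlinks, devices, etc.
                total += 1
                if st.st_nlink > 1:
                    multi += 1
    return total, multi
```

Running this on /dest/dir/source and comparing the two numbers would
show what fraction of the files a hardlink-preserving tool has to keep
track of.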
I see that Debian's version of rdiff-backup is a bit behind the
development version on the rdiff-backup site. But, looking at the
changelog, there doesn't seem to be anything related to this in there.
Any suggestions? Do other people have this problem? Should I file a
bug for this?
Thanks,
David.