From: | Dominic Raferd |
Subject: | Re: [rdiff-backup-users] What are the bottlenecks in --verify? (Or how to speed up verification?) |
Date: | Thu, 03 Oct 2013 14:52:25 +0100 |
User-agent: | Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.0 |
There have been discussions about verification speeds and issues here before. I think you are right that the CPU core is a bottleneck as rdiff-backup only uses one core when running a verification.

Verifications do use temporary space and can use a lot of it even though the temporary files never seem to be visible in the filesystem. I would advise an explicit --tempdir setting, and a different spindle with lots of space (and speed) would be ideal.

You might find it helpful to see or use my timedicer-verify.sh script: http://www.timedicer.co.uk/programs/help/timedicer-verify.sh.php

This is a wrapper for rdiff-backup --verify-at-time, can remember previous successful verifications, and runs multiple concurrent verifications (thus using multiple cores efficiently). It allows you to specify a temporary location (passed to rdiff-backup as --tempdir). It can make and use a temporary LVM snapshot as source - this allows you to continue a verification session while updating the underlying repository, but can only work if your repository/ies are on a logical volume (/root or /home as currently written). And as currently written it assumes that the repository/ies are all located at /home/*/[here] or /home/*/*/[here].

Dominic
--
TimeDicer: Free File Recovery from Whenever

On 26/09/2013 04:12, Thomas Harold wrote:
> What are some options for speeding up the verification of past increments? My guess is that the CPU might be a bottleneck for the SHA1 hash calculation, so that's something I would check first.
>
> But how does the verify process work? Does it reconstruct the file in memory, or does it use a temporary directory?
>
> ...
>
> It seems like if I don't have TMP, TMPDIR, or TEMP defined as environment variables, it is operating system dependent on where Python creates the temporary file. Or unless I pass the --tempdir option to rdiff-backup.
>
> http://docs.python.org/2/library/tempfile.html
>
> And if the --verify or --verify-at-time options create lots of temporary files, and write/read lots of data to the temporary directory, then I should probably move that directory to a separate set of spindles.
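
For illustration only, here is a minimal Python sketch of the two ideas discussed in this thread: choosing the temporary directory explicitly rather than leaving it to Python's defaults, and running one rdiff-backup verification per repository concurrently so that more than one core is kept busy. It is not taken from timedicer-verify.sh. The function names, the repository paths, and the time string in the usage note are hypothetical, and the command line assumes the classic option syntax mentioned above (--tempdir and --verify-at-time).

```python
import os
import subprocess
import tempfile
from concurrent.futures import ThreadPoolExecutor


def effective_tempdir(explicit=None):
    """Return the directory that temporary files should go to.

    An explicit path wins. Otherwise fall back to Python's own choice,
    which consults TMPDIR, TEMP and TMP before platform defaults.
    """
    return explicit if explicit else tempfile.gettempdir()


def verify_command(repo, when, tempdir):
    """Build the command line for verifying one repository.

    Assumption: the classic rdiff-backup option syntax
    (--tempdir DIR --verify-at-time TIME REPO).
    """
    return ["rdiff-backup", "--tempdir", tempdir, "--verify-at-time", when, repo]


def verify_all(repos, when, tempdir=None, workers=None, runner=subprocess.call):
    """Verify several repositories at once, one child process each.

    Threads are enough here: each thread only waits for its child
    process, and the child processes can run on separate cores.
    Returns a dict mapping each repository to its exit code.
    """
    tmp = effective_tempdir(tempdir)
    workers = workers or os.cpu_count() or 1
    commands = [verify_command(repo, when, tmp) for repo in repos]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(runner, commands))
    return dict(zip(repos, codes))
```

A call such as verify_all(["/home/alice/backup", "/home/bob/backup"], "1D", tempdir="/mnt/scratch/tmp") would start one verification per repository with the temporary files on the chosen volume; a non-zero exit code in the result marks a verification that did not succeed. The runner parameter exists only so the command construction can be exercised without rdiff-backup installed.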