rdiff-backup-users
Re: The Verification Treadmill


From: Robert Nichols
Subject: Re: The Verification Treadmill
Date: Sun, 18 Feb 2024 20:57:43 -0600
User-agent: Mozilla Thunderbird

On 2/18/24 01:10, Dominic Raferd wrote:
On 17/02/2024 03:14, Robert Nichols wrote:
On 2/16/24 08:44, Dominic Raferd wrote:
Until then, I am interested in your parallel processing approach. Presumably 
you start 8 parallel rdiff-backup verify sessions for datetime points -1 to -8 
(and then, when they are all complete, -9 to -16, -17 to -24...)? And you run 8 
in parallel because your CPU has 8 cores?

I have 16 cores, actually, but by experiment I found that 8 parallel threads 
seem to be the sweet spot. I don't know how much of that is unique to my 
system and the nature of my backups. I did have to add a pre-scan of the 
file_statistics metadata files to look for increment sizes of 1GB or greater, 
and limit the number of parallel checks to 1 if any are found. All it takes is 
one huge ISO file in the increments to gobble up cache and make the parallel 
checks really slow. I haven't spent much time trying to tune that adjustment, 
and all the experimenting was done back when I had just 32GB of RAM.
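
A minimal sketch of that kind of pre-scan, not the actual script. It assumes 
the decompressed file_statistics lines end with four space-separated fields 
(Changed, SourceSize, MirrorSize, IncrementSize) and use "NA" for missing 
values. The function names, the 8-worker default, and reading "1GB" as 
1024**3 bytes are all illustrative assumptions.

```python
from typing import Iterable

# Assumption: "1GB" is taken as 1 GiB; the post does not say which was meant.
GIB = 1024 ** 3


def max_increment_size(lines: Iterable[str]) -> int:
    """Return the largest IncrementSize seen in file_statistics lines.

    Assumed line layout: <filename> Changed SourceSize MirrorSize IncrementSize
    Comment lines start with '#'; non-numeric sizes such as 'NA' are skipped.
    """
    largest = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        # Split from the right so spaces inside the filename do no harm.
        fields = line.rsplit(" ", 4)
        if len(fields) != 5:
            continue
        increment = fields[4]
        if increment.isdigit():
            largest = max(largest, int(increment))
    return largest


def choose_workers(lines: Iterable[str], normal: int = 8,
                   threshold: int = GIB) -> int:
    """Fall back to a single check if any increment reaches the threshold."""
    return 1 if max_increment_size(lines) >= threshold else normal
```

In practice the lines would come from the gzip-compressed file_statistics 
files in the rdiff-backup-data directory, for example via gzip.open(path, "rt").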

I let the parallel threads run independently, without waiting for anything in 
the others. Effectively, I run the threads with the level sequences:
     {-1..-99..8}
     {-2..-99..8}
     {-3..-99..8}
     ...
     {-8..-99..8}
and then just wait for everything to complete.

The code is really nothing like that, but that is the effect. You might expect 
the threads to get badly out of sync, but because of the effects of I/O 
caching, whichever threads finish a step first find themselves slowed down by 
I/O waits more than do the threads that advance to a new step later. The 
threads tend to stay quite beautifully in sync. Again, that's on my system with 
my backups. YMMV.
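
The interleaved sequences described above can be sketched as follows. This is 
only an illustration of the effect, not the real code: levels are written as 
positive "backups ago" counts rather than the negative numbers used above, 
and verify is a hypothetical caller-supplied function that checks one level 
and returns True on success.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def level_sequences(oldest: int, workers: int) -> List[List[int]]:
    """Worker k (1-based) gets levels k, k+workers, k+2*workers, ... <= oldest."""
    return [list(range(k, oldest + 1, workers)) for k in range(1, workers + 1)]


def run_interleaved(oldest: int, workers: int,
                    verify: Callable[[int], bool]) -> List[int]:
    """Run every sequence in its own thread and return the levels that failed.

    Threads never wait for each other; the only synchronisation point is the
    final wait for all of them to finish.
    """
    def work(sequence: List[int]) -> List[int]:
        return [level for level in sequence if not verify(level)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(work, level_sequences(oldest, workers))
        failed = [level for part in results for level in part]
    return sorted(failed)
```

With oldest=99 and workers=8, the first worker handles levels 1, 9, 17, ... 
and the eighth handles 8, 16, 24, ..., matching the sequences listed above. A 
real verify function would most likely launch an rdiff-backup verification as 
a subprocess, so threads are enough to keep several of them running at once.
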
Very interesting. A while ago I set my timedicer-verify script to run 
verifications in parallel but it seemed to make everything slower not faster, 
admittedly when running on a much less powerful (and virtual) machine than 
yours, so I stripped out all that code. But I should look at it again (I guess 
I must have backups!)...

It's largely a matter of how much memory you have for the kernel's 
buffer/cache, but for a VM the situation is murkier. I doubt that having both 
the host and the VM doing I/O caching would be a productive use of memory. I 
really don't know how that would behave.

--
Bob Nichols     "NOSPAM" is really part of my email address.
                Do NOT delete it.