Re: [rdiff-backup-users] Delay between EndTime and process end

rdiff-backup-users

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [rdiff-backup-users] Delay between EndTime and process end - followu

From:	Chris Wilson
Subject:	Re: [rdiff-backup-users] Delay between EndTime and process end - followup
Date:	Mon, 12 Aug 2013 14:29:18 +0100 (BST)
User-agent:	Alpine 2.02 (DEB 1266 2009-07-14)

Hi Laurent,

On Mon, 12 Aug 2013, Laurent De Buyst wrote:

Two weeks ago I sent a mail to this list asking about some strangedelays I was seeing between the EndTime as logged byt rdiff-backup andthe actual end of the running process.

Since then, I've been digging deeper and I'd like to ask once more ifanyone can tell me what's going on and how I might be able to 'fix'this.

See, I ran my backups using -v9 for a few days and this has shown ingreater detail when the delay happens.


2013/08/12 06:00:56   Mon Aug 12 05:56:03 2013  Copying inc attrs from ... to 
...
2013/08/12 06:00:56   Mon Aug 12 05:56:03 2013  Setting time of ... to 
1376107515
2013/08/12 06:00:56   Mon Aug 12 05:57:36 2013  Renaming ... to 
.../backup/home/domino/data/webadmin.ntf
...
2013/08/12 06:00:56   Mon Aug 12 06:00:56 2013  Setting time of .../backup/tmp 
to 1376245594
2013/08/12 06:00:56   Mon Aug 12 06:00:56 2013  Copying attributes from () to 
.../backup
2013/08/12 07:17:03   Mon Aug 12 06:00:56 2013  Setting time of .../backup to 
1371199946
2013/08/12 07:17:03   Mon Aug 12 06:01:07 2013  Touching 
.../backup/rdiff-backup-data/extended_attributes.2013-08-12T00:53:28+02:00.snapshot
2013/08/12 07:17:03   Mon Aug 12 06:02:41 2013  Touching 
.../backup/rdiff-backup-data/access_control_lists.2013-08-12T00:53:28+02:00.snapshot
2013/08/12 07:17:03   Mon Aug 12 06:03:26 2013  Writing mirror_metadata diff
2013/08/12 07:17:03   Mon Aug 12 06:04:00 2013  Deleting 
.../backup/rdiff-backup-data/mirror_metadata.2013-08-11T00:55:43+02:00.snapshot.gz
2013/08/12 07:17:03   --------------[ Session statistics ]--------------

[...]

2013/08/12 07:17:03   --------------------------------------------------
2013/08/12 07:17:03   Mon Aug 12 07:17:02 2013  Deleting 
.../backup/rdiff-backup-data/current_mirror.2013-08-11T00:55:43+02:00.data

Now, the first timestamp comes from the wrapper I'm using, while the second 
ones are from rdiff-backup.
The first thing I've noticed even before is that output fromrdiff-backup seems to come in bursts. You can see that it's doing thingsat 05:56 that only get logged at 06:00. This seems to be normaloperational mode, I assume there's a cache involved somewhere.

Depending on how your wrapper runs rdiff-backup, if you are feeding theresults into a pipe (for example, rdiff-backup | wrapper-script) then theywill be buffered by the OS. It's possible to turn that buffering off ifyou need to. But I would say, just don't trust the timestamps added byyour wrapper script. The timestamps from rdiff-backup contain the timewhen the message was generated.


Bearing that in mind, the delay actually seems to come here:

2013/08/12 07:17:03 Mon Aug 12 06:04:00 2013 Deleting

.../backup/rdiff-backup-data/mirror_metadata.2013-08-11T00:55:43+02:00.snapshot.gz

2013/08/12 07:17:03   --------------[ Session statistics ]--------------

[...]

2013/08/12 07:17:03   --------------------------------------------------
2013/08/12 07:17:03 Mon Aug 12 07:17:02 2013 Deleting

.../backup/rdiff-backup-data/current_mirror.2013-08-11T00:55:43+02:00.data

Since I doubt that the delete operations take a long time (unless you'rerunning on a weird distributed filesystem, or a very heavily loadedfileserver, in which case all bets are off), I'd say the delay is incalculating the session statistics.

I don't know quite what it does, but scanning the code should help you, orstrace()ing the process while it's in this delaying mode. It might bestat()ing 48,000 files. It shouldn't take hours to do that, but if it'sover a network, with high latency, it might do.

You could run rdiff-backup manually without any pipes, so that you can seethe output as it's generated and know when "deleting snapshot.gz" happens.After that, it's calculating session statistics. Then attach strace to theprocess and see what it's doing and which operations take a long time.

And as long as I don't know exactly what it is, I can't really get towork on this because I don't know if I need more ram, more CPU power,more CPU speed, faster disks, ...

You could look at which resource is most heavily used when the server isrunning the backup jobs. top can help with that. Is the CPU at 100%? Is itmuch less than 100%, but you have a high iowait time? Do you have a lot ofswap usage? Does the swap usage grow while the backups are running? Doesvmstat show more blocks in/out or swap in/out while the backups arerunning?


Cheers, Chris.
--
_____ __     _
\  __/ / ,__(_)_  | Chris Wilson <address@hidden> Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Ruby/Perl/SQL Developer |
\__/_/_/_//_/___/ | We are GNU : free your mind & your software |

[Prev in Thread]

Current Thread

[Next in Thread]

[rdiff-backup-users] Delay between EndTime and process end - followup, Laurent De Buyst, 2013/08/12
- Re: [rdiff-backup-users] Delay between EndTime and process end - followup, Chris Wilson <=
- Re: [rdiff-backup-users] Delay between EndTime and process end - followup, Robert Nichols, 2013/08/12

Prev by Date: [rdiff-backup-users] Delay between EndTime and process end - followup
Next by Date: Re: [rdiff-backup-users] Delay between EndTime and process end - followup
Previous by thread: [rdiff-backup-users] Delay between EndTime and process end - followup
Next by thread: Re: [rdiff-backup-users] Delay between EndTime and process end - followup
Index(es):
- Date
- Thread