From: Billy Crook
Subject: Re: [rdiff-backup-users] Re: Computed SHA1 digest doesn't match recorded digest
Date: Fri, 28 Aug 2009 21:38:16 -0500

On Fri, Aug 28, 2009 at 10:17, Daniel Miller <address@hidden> wrote:
>> I can't speak for others, but I can tell you that I personally care.
>
> Thank you, Josh.
>
>> My general policy is that if someone is willing to send me a repository
>> that demonstrates the problem, I'm willing to take a look at it. I guess I'm
>> a little slow to respond, because it often happens that it is bad hardware,
>> etc that's causing the problem. However, if you're certain that that's not
>> the case here, and are willing to send me a repository that duplicates the
>> problem, I can look into it.

Josh helped me quite a bit before. But the backup job itself took 12
hours under normal conditions. I had a delay of a week or so buying and
waiting to receive additional hard drives just to clone the repo so I
could dig further into the problem, and I'm near continuously
backlogged in my personal projects, so it didn't end up going very
far. My repositories are in the TB range, and a good bit of their
contents cannot be disclosed. I had to hand-scrub several
thousand lines of debug output on multiple occasions.

My problems were not due to bad RAM or a failing hard drive. My repos
are accessed over iSCSI, and a couple of times the network became
unavailable, probably during a backup. I do, however, think a backup
tool should be able to recover from that sort of problem. From
what I observed, it tried to, but terminated with a traceback during
its attempt. I do not think it should exit abnormally, even if
it's unable to recover.

It is not unreasonable for a backup utility to expect filesystem
errors. If disks were perfect, there would be much less reason for
backup utilities in the first place.
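
To illustrate what I mean by not exiting abnormally, here is a minimal
sketch in Python. It is not rdiff-backup's actual code; the function
names sha1_of_file and verify are made up for this example. It computes
a file's SHA1 digest and, if the read fails (for instance because the
storage went away), reports that as an ordinary result instead of
letting a traceback escape.

```python
import hashlib
import sys


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA1 digest of a file, or None if it cannot be read.

    Hypothetical helper for illustration; not part of rdiff-backup.
    """
    digest = hashlib.sha1()
    try:
        with open(path, "rb") as f:
            # Read in chunks so large files do not need to fit in memory.
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
    except OSError as exc:
        # Covers missing files and I/O errors from the underlying device.
        print(f"warning: could not read {path}: {exc}", file=sys.stderr)
        return None
    return digest.hexdigest()


def verify(path, recorded_digest):
    """Compare a file against a recorded digest without raising on I/O errors."""
    computed = sha1_of_file(path)
    if computed is None:
        return "unreadable"
    return "ok" if computed == recorded_digest else "mismatch"
```

A caller could log "unreadable" and "mismatch" results, carry on with
the remaining files, and exit with a non-zero status at the end, rather
than dying partway through.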

I was able to confirm that the problem recurs on the same backup repo
when it is copied to multiple filesystems on independent,
verified-stable-and-functioning-properly machines.