duplicity-talk

From: Peter Schuller
Subject: Re: [Duplicity-talk] Feature request/discussion: Store identical files only once
Date: Wed, 25 Jun 2008 08:12:11 +0200
User-agent: KMail/1.9.7

> The point of using the path along with the checksum is to minimise the
> risk of not backing up a file due to a (highly unlikely) checksum
> collision. If a file with the right checksum but the wrong path is
> available, it could still be used if extra caution is used to verify
> that it's the same file. Or perhaps the system could just use the
> filename instead of the whole path.

In the past I considered implementing an exclusively hash-based backup, where
the backup store would just be a refcounted set of checksum->content
mappings and each backup set would be a tree of metadata with checksums.

The idea was to decrease complexity by using only whole-file checksums
instead of doing something clever like rsync, while still giving very good
performance in many cases. Every backup snapshot would be a "full" snapshot
depending on nothing but the correctly maintained checksum->content mapping,
and any specific backup set could be removed at any time. For example,
implementing a gradual backoff in frequency is simple, even with just a
single backup target, because there are no inter-dependencies.
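
To make the design above concrete, here is a minimal in-memory sketch. It is
only an illustration under stated assumptions, not anything duplicity
implements: the class name HashStore, its method names, and the choice of
SHA-256 are all hypothetical, and a real store would live on a backup target
rather than in Python dicts.

```python
import hashlib


class HashStore:
    """Hypothetical refcounted checksum->content store with full snapshots."""

    def __init__(self):
        self.blobs = {}      # checksum -> file content (stored once)
        self.refs = {}       # checksum -> number of snapshot entries using it
        self.snapshots = {}  # snapshot name -> {path: checksum} metadata tree

    def backup(self, name, files):
        """Record a full snapshot; `files` maps path -> bytes."""
        if name in self.snapshots:
            raise ValueError("snapshot already exists: " + name)
        tree = {}
        for path, content in files.items():
            # Whole-file checksum only; SHA-256 is an assumption of this sketch.
            digest = hashlib.sha256(content).hexdigest()
            if digest not in self.blobs:
                self.blobs[digest] = content  # new content is stored once
            self.refs[digest] = self.refs.get(digest, 0) + 1
            tree[path] = digest
        self.snapshots[name] = tree

    def remove(self, name):
        """Drop one snapshot; content is freed when nothing references it."""
        for digest in self.snapshots.pop(name).values():
            self.refs[digest] -= 1
            if self.refs[digest] == 0:
                del self.refs[digest]
                del self.blobs[digest]

    def restore(self, name):
        """Rebuild path -> bytes for a snapshot from the mapping alone."""
        return {p: self.blobs[d] for p, d in self.snapshots[name].items()}
```

For example, backing up {"a.txt": b"hello", "b.txt": b"hello"} stores the
content once with a reference count of two, and removing an older snapshot
never affects restoring a newer one, since each snapshot depends only on the
checksum->content mapping.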

I generally like the idea. The problem I see is security. I'd be fine using
this on e.g. my own home directory, but more general use can be quite
sensitive. One may accept that accidental hash collisions are sufficiently
unlikely that the problem can be ignored; but in the presence of malicious
intent you are also relying on the security of the hash algorithm,
especially given the various ways that information of the sort "user X
has a file Y with checksum Z" might be leaked to third parties.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <address@hidden>'
Key retrieval: Send an E-Mail to address@hidden
E-Mail: address@hidden Web: http://www.scode.org


