duplicity-talk

Re: [Duplicity-talk] Backuping up a whole filesystem?


From: Colin Ryan
Subject: Re: [Duplicity-talk] Backuping up a whole filesystem?
Date: Thu, 13 Aug 2009 21:31:15 -0400
User-agent: Thunderbird 1.5.0.14 (Macintosh/20071210)

One might look at BackupPC to get some hints. Although it is Perl-based rather than Python, it is the only OSS project I've run across that does anything remotely like de-duplication.

That is, if this is the kind of direction you're looking for. The only real problem I've seen with it is that it relies on hard links in the storage filesystem, which makes some of the more exotic virtual filesystems unsuitable as storage targets (which is a shame ;-) )
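To make the hard-link point concrete, here is a minimal sketch of a hard-link-based pool. It is an illustration under assumptions, not BackupPC's actual layout: the function name `add_to_pool`, the flat pool directory, and the use of SHA-256 are all hypothetical choices.

```python
import hashlib
import os


def add_to_pool(pool_dir, backup_path, data):
    """Store data once in pool_dir and hard-link it into the backup tree.

    Hypothetical helper for illustration; not BackupPC's real code.
    """
    os.makedirs(pool_dir, exist_ok=True)
    digest = hashlib.sha256(data).hexdigest()
    pooled = os.path.join(pool_dir, digest)

    # Write the content only the first time this digest is seen.
    if not os.path.exists(pooled):
        with open(pooled, "wb") as handle:
            handle.write(data)

    parent = os.path.dirname(backup_path)
    if parent:
        os.makedirs(parent, exist_ok=True)

    # This is the step that needs hard-link support in the storage
    # filesystem; it raises OSError where hard links are unavailable.
    os.link(pooled, backup_path)
    return pooled
```

Every backup that contains the same content adds one more link to the same pooled file, so the data occupies disk space once. A storage target that cannot create hard links fails at the `os.link` call.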

C

Paul Harris wrote:
2009/8/9 Gabriel Ambuehl <address@hidden>

On 9.8.09 David Stanaway wrote:
EG: I have a logfile which gets rotated to logfile.1 - that is the same
as logfile in the previous backup, so I don't need to send it again.
EG: I have some family pics that got emailed to me in my Family Maildir,
and I fwd the email to someone else.  The MIME-encoded attachment data is
the same. I haven't tested this, but I would think that if you had a solid
archive file (tar or fs dump), this kind of duplication of data would drop
out.
I would assume that these would get compressed away, but only if you had a
really giant compression dictionary?


<wild half-baked idea>

fingerprint all the files, and then when it comes to storage of the file,
you only store the same fingerprinted file once.

so if you have 5 copies of a file, or the file moves around, then it's only
backed up once.

as for log files, those could be handled more nicely if (eg) the
fingerprints were done in chunks.  that way the first half of a log file
would only be backed up once.

</wild half-baked idea>
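A rough sketch of the idea above, assuming fixed-size chunks and SHA-256 as the fingerprint. The `ChunkStore` class, its method names, and the tiny `CHUNK_SIZE` are made up for illustration; none of this is duplicity code.

```python
import hashlib

# Tiny on purpose so the example is easy to follow (hypothetical value).
CHUNK_SIZE = 4


def fingerprint(data: bytes) -> str:
    """Fingerprint a chunk; SHA-256 is an assumed choice."""
    return hashlib.sha256(data).hexdigest()


class ChunkStore:
    """In-memory stand-in for a backup target that stores each chunk once."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes
        self.files = {}   # path -> ordered list of chunk fingerprints

    def backup(self, path, data):
        """Record the file and return how many chunks were newly stored."""
        recipe = []
        new_chunks = 0
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            fp = fingerprint(chunk)
            if fp not in self.chunks:
                self.chunks[fp] = chunk
                new_chunks += 1
            recipe.append(fp)
        self.files[path] = recipe
        return new_chunks

    def restore(self, path):
        """Rebuild a file from its list of fingerprints."""
        return b"".join(self.chunks[fp] for fp in self.files[path])
```

With this, a rotated logfile.1 costs no new chunks, and a log file that has only grown costs just the chunks for the appended part. Fixed-size chunks only line up when data is appended; an insertion near the start of a file shifts every later chunk boundary, so nothing after it would match.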

<problems>
the first one would be correctly checking for hash collisions, so that two
different chunks of data that coincidentally share the same fingerprint
both get backed up in full, instead of one of them being silently dropped
</problems>
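One way to guard against that, sketched under assumptions: when a fingerprint is already known, compare the actual bytes before treating the chunk as a duplicate, and keep colliding chunks side by side. `VerifyingStore` and its key format are hypothetical names for illustration only.

```python
import hashlib


def sha256_fingerprint(data: bytes) -> str:
    """Default fingerprint; SHA-256 is an assumed choice."""
    return hashlib.sha256(data).hexdigest()


class VerifyingStore:
    """Chunk store that never trusts a fingerprint match on its own."""

    def __init__(self, fingerprint=sha256_fingerprint):
        self.fingerprint = fingerprint
        self.buckets = {}  # fingerprint -> list of distinct chunks sharing it

    def put(self, chunk):
        """Store a chunk and return a key that identifies it exactly."""
        fp = self.fingerprint(chunk)
        bucket = self.buckets.setdefault(fp, [])
        for index, stored in enumerate(bucket):
            if stored == chunk:
                # Same fingerprint and same bytes: a true duplicate.
                return (fp, index)
        # New fingerprint, or same fingerprint with different bytes
        # (a collision): keep this chunk as its own entry.
        bucket.append(chunk)
        return (fp, len(bucket) - 1)

    def get(self, key):
        fp, index = key
        return self.buckets[fp][index]
```

The byte comparison means the stored chunk has to be readable at backup time, which is cheap for a local store but costly if the chunks live on a remote target.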

------------------------------------------------------------------------

_______________________________________________
Duplicity-talk mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/duplicity-talk




