I wasn't really prepared to make this announcement so soon, but now seems like
a good time to let the community know. I have been working on a new
implementation of rdiff-backup for about a month, ever since I dug into the
current codebase and was disappointed by its quality. While what I have
right now is functional and works on simple cases, it does not cover the broad
range of features currently offered by rdiff-backup. I could use some help in
bringing it up to par if others are interested in the path I have taken. While
I have used the current codebase for direction and inspiration, I have started
with a clean slate for several reasons:
- The current repository layout has a critical design flaw that causes
performance degradation as a repository grows. Most difference information is
stored in a single file tree (rdiff-backup-data/increments), which has a
structure very similar to the mirror's. The problem is that as files are
added, deleted, or changed, the directories in the increments tree keep
growing, so listing their contents takes longer and longer. This performance
problem is negligible in small-to-medium-sized
backup sets, but becomes apparent in very large backup sets as the number of
increments grows. I have redesigned the repository layout in my new
implementation to eliminate this performance issue. Note that I do not know for
sure whether my new layout will completely eliminate this problem, since I have
not yet tested it with a very large backup set over a long period of time.
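
For anyone who wants to check whether their own repository shows the directory
growth described above, here is a minimal Python sketch that reports the
directories with the most entries under a given tree. It is only an
illustration: `largest_directories` is a hypothetical helper name, not part of
rdiff-backup, and it uses nothing beyond the standard library.

```python
import heapq
import os


def largest_directories(root, top_n=10):
    """Return up to top_n (entry_count, path) pairs, largest first.

    Hypothetical helper for illustration; not part of rdiff-backup.
    """
    counts = []
    # os.walk visits every directory below root and lists its direct children.
    for dirpath, dirnames, filenames in os.walk(root):
        counts.append((len(dirnames) + len(filenames), dirpath))
    # Keep only the top_n directories with the most direct entries.
    return heapq.nlargest(top_n, counts)
```

Pointing it at the increments tree of an existing repository, for example
`largest_directories("rdiff-backup-data/increments")`, should show which
directories have accumulated the most entries. It only counts entries, so it
says nothing definite about how long listing them takes on a given filesystem.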