Re: [Gluster-devel] trusted.glusterfs.version xattr


From: Derek Price
Subject: Re: [Gluster-devel] trusted.glusterfs.version xattr
Date: Thu, 08 May 2008 13:21:20 -0400
User-agent: Thunderbird 2.0.0.14 (Windows/20080421)

Martin Fick wrote:
--- Martin Fick <address@hidden> wrote:
  Original creation process and versioning:

  /
 v1
  /dir1/
 v2   v1
  /dir1/dir2/
 v2   v2   v1
  /dir1/dir2/file
 v2   v2   v2  v1

Mirror goes off-line with version #s of dir2 and
file as: v2/v1.

-> file deleted

  /dir1/dir2/
 v2   v2   v3

-> dir2 deleted

  /dir1/
 v2   v3

-> dir2 recreated

  /dir1/dir2/
 v2   v4   v1

-> file recreated

  /dir1/dir2/file
 v2   v4   v2   v1
...
However, if we were looking at the versions all the way to the root, when the mirror went off-line we would have had /v2/v2/v2/v1, and now we have /v2/v4/v2/v1. There is a chance that we are talking about different files now. Of course, the problem I see now is that the files could in fact have been the same even though the version number is different with this scheme! Since the only version # that is different is that of dir1 (v4), this could have been caused by simply adding two new files to that directory!

Hmm, I think that my logic may have been flawed here and that the scheme would actually work (as long as you go to the root). The mismatch above would only exist if in fact the file had been recreated! If the file had not been recreated, its stored version # would still be /v2/v2/v2/v1, even though recalculating it now would yield /v2/v4/v2/v1. But we are not recalculating it; we are trying to see if the files on two subnodes were created at the same time, in which case the version history should have been the same, right?
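
A minimal sketch of the comparison described above, assuming (hypothetically) that each subnode stores, per file, the tuple of version numbers from the root down to the file as recorded when the file was created. The name `same_incarnation` and the tuple layout are illustrative only and are not part of AFR.

```python
from typing import Tuple

# Hypothetical representation: versions from the root down to the file,
# recorded at creation time and never recalculated afterwards.
VersionPath = Tuple[int, ...]


def same_incarnation(stored_a: VersionPath, stored_b: VersionPath) -> bool:
    """Return True when two subnodes recorded an identical version history."""
    return stored_a == stored_b


offline_mirror = (2, 2, 2, 1)  # /v2/v2/v2/v1, stored before the mirror went off-line
live_recreated = (2, 4, 2, 1)  # /v2/v4/v2/v1, stored when the file was recreated
live_untouched = (2, 2, 2, 1)  # file never recreated, so the stored path is unchanged

recreated_matches = same_incarnation(offline_mirror, live_recreated)
untouched_matches = same_incarnation(offline_mirror, live_untouched)
```

Only the recreated file is reported as a different incarnation; a file that was left alone still compares equal because its stored path is not recalculated.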

This assumption only holds if the parent directories all the way to root are healed before a file is created/modified, though. I am not sure that AFR currently does this. Does it?

If the parent directories (all the way up) are not healed, then a version mismatch could be created when a file is modified and its version is updated. In this case, despite the version mismatches, the files are in fact the same. It does not seem like it would be too difficult to force the parent directories to heal before writing to the file, unless a directory heal causes all changed file data (or just new files+data?) in those directories to heal; that could be a long delay. Thoughts? I must admit, I am having a hard time following all these constraints. :)
...
If this works, there would be no useless resyncing caused by wrongly thinking that files have changed, as I previously surmised.

If you increment directory version numbers on all directory listing changes, I still see a major problem:

1. Adding, renaming, or removing a file or directory in ANY directory now cascades the version number change up to the root directory, effectively incrementing the version number of ALL files and marking them as dirty/needing update to all other servers. I hope you agree this is Very Bad (tm). You could solve it with checksums, but as someone pointed out, that could get expensive, even with a checksum cache, when the entire tree needs to be checked every time.
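
To make the cascade concrete, here is a minimal sketch, assuming (hypothetically) that a file's effective version is the tuple of version numbers from the root down to the file and that every listing change bumps the directory and all of its ancestors. The names `chain_paths`, `derived_version`, and `bump_with_cascade` are illustrative only, not GlusterFS APIs.

```python
from typing import Dict, List, Tuple


def chain_paths(path: str) -> List[str]:
    """'/a/b' -> ['/', '/a', '/a/b']; '/' -> ['/']."""
    paths = ["/"]
    current = ""
    for part in path.strip("/").split("/"):
        if part:
            current += "/" + part
            paths.append(current)
    return paths


def derived_version(versions: Dict[str, int], path: str) -> Tuple[int, ...]:
    """Versions of every element from the root down to `path`."""
    return tuple(versions[p] for p in chain_paths(path))


def bump_with_cascade(versions: Dict[str, int], directory: str) -> Dict[str, int]:
    """Return a copy with `directory` and all of its ancestors incremented."""
    updated = dict(versions)
    for p in chain_paths(directory):
        updated[p] += 1
    return updated


before = {"/": 2, "/dir1": 2, "/dir1/dir2": 2, "/dir1/dir2/file": 1, "/other": 1}
after = bump_with_cascade(before, "/other")  # e.g. a file was added in /other

files = ["/dir1/dir2/file"]
dirty = [f for f in files if derived_version(before, f) != derived_version(after, f)]
```

Here `/dir1/dir2/file` ends up in `dirty` even though nothing under `/dir1` changed, because the bump in `/other` propagated to the root.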

I believe the need for this cascade and healing is illustrated by the following example: given a synchronized /a/b/c/file, run against server 1:

        $ cd /
        $ mv a z
        $ mkdir -p a/b/c
        $ echo whatever >a/b/c/file

Then, against server 2:

        $ cat /a/b/c/file

Server 2 would have to know to heal directory listings all the way up to the root directory listing to give the correct answer here.

I think the single, global version number I mentioned in the "Client side AFR race conditions" thread provides an interesting solution here. Consider the following commands and their corresponding file system states, starting with an empty root. In this model, changing the content/version number of any child element is considered to change the directory listing of the parent, and renames update the version number of all children of the renamed element:

/                       v1

        $ mkdir /a
/                       v2
/a                      v2

        $ mkdir /b
/                       v3
/a                      v2
/b                      v3

        $ echo whatever > /a/1
/                       v4
/a                      v4
/a/1                    v4
/b                      v3

        $ echo whatever > /a/2
/                       v5
/a                      v5
/a/1                    v4
/a/2                    v5
/b                      v3

        $ mv /a /z
/                       v6
/b                      v3
/z                      v6
/z/1                    v6
/z/2                    v6

        $ rm /z/2
/                       v7
/b                      v3
/z                      v7
/z/1                    v6

This glosses over the locking issues we were discussing in the other thread, but in this model, a client can quickly determine whether its copy of any directory listing or file is up to date based solely on that file or directory's own version number (locally and on the server), and giving a parent directory a new version number does not invalidate the data of all its children.
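
The walkthrough above can be sketched as a small simulation. This is only an illustration under stated assumptions: a single in-memory counter, no locking, and a hypothetical `GlobalVersionTree` class whose method names are not part of GlusterFS.

```python
import posixpath


class GlobalVersionTree:
    """Toy model: one global counter stamps every changed element."""

    def __init__(self):
        self.counter = 1
        self.versions = {"/": 1}

    def _stamp_upward(self, path, version):
        # Stamp `path` and each ancestor up to the root with `version`.
        while True:
            self.versions[path] = version
            if path == "/":
                break
            path = posixpath.dirname(path)

    def _subtree(self, path):
        # `path` itself plus every entry below it.
        return [p for p in self.versions if p == path or p.startswith(path + "/")]

    def create(self, path):
        # Covers both mkdir and writing a file.
        self.counter += 1
        self._stamp_upward(path, self.counter)

    def rename(self, old, new):
        # A rename restamps the renamed element and all of its children.
        self.counter += 1
        for p in self._subtree(old):
            del self.versions[p]
            self.versions[new + p[len(old):]] = self.counter
        self._stamp_upward(new, self.counter)

    def remove(self, path):
        # Removal changes only the listings of the parent and its ancestors.
        self.counter += 1
        for p in self._subtree(path):
            del self.versions[p]
        self._stamp_upward(posixpath.dirname(path), self.counter)


def is_up_to_date(local_version, server_version):
    # Per-element check: no need to walk the path to the root.
    return local_version == server_version


tree = GlobalVersionTree()
tree.create("/a")
tree.create("/b")
tree.create("/a/1")
tree.create("/a/2")
tree.rename("/a", "/z")
tree.remove("/z/2")
```

After these calls `tree.versions` holds `/` at 7, `/b` at 3, `/z` at 7, and `/z/1` at 6. A client holding `/b` at v3 can tell it is current from that number alone, even though the root has moved on to v7.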

Regards,

Derek
--
Derek R. Price
Solutions Architect
Ximbiot, LLC <http://ximbiot.com>
Get CVS and Subversion Support from Ximbiot!

v: +1 248.835.1260
f: +1 248.246.1176
