
Re: [Gluster-devel] Improvements in Quota Translator


From: Varun Shastry
Subject: Re: [Gluster-devel] Improvements in Quota Translator
Date: Wed, 10 Apr 2013 18:14:16 +0530
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130329 Thunderbird/17.0.5

Hi Jeff,

On Wednesday 10 April 2013 01:33 AM, Jeff Darcy wrote:
On 04/09/2013 09:39 AM, Varun Shastry wrote:
Hi Everyone,

As gluster quota has been facing some issues in its functionality, it
needs to be made fool-proof, robust and reliable. So, below are some of
the major problems we are facing and the modifications to overcome them.

Current implementation
* Client-side implementation of quota
  - Not secure
  - Increased traffic in updating the ctx
  - Relies on xattr updation through lookup calls
* Problem with NFS mounts - lack of lookups (handled through 'file handles')

So, the following new design is proposed:

* Two levels of quota, soft and hard, similar to XFS's quota, are
introduced. A message is logged on reaching the soft quota, and no more
writes are allowed after the hard limit is reached.
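
Roughly, the write-time check could look like the minimal C sketch below;
the struct and field names are assumptions made for illustration only, not
the actual quota translator code.

    #include <stdio.h>
    #include <stdint.h>
    #include <errno.h>

    /* Hypothetical per-directory quota context (names invented here). */
    struct quota_ctx {
        uint64_t size;       /* aggregated usage under the directory */
        uint64_t soft_limit; /* crossing this only logs a message    */
        uint64_t hard_limit; /* crossing this fails the write        */
    };

    /* Returns 0 to allow the write, -EDQUOT to deny it. */
    static int quota_check_limit(struct quota_ctx *ctx, uint64_t write_size)
    {
        if (ctx->size + write_size > ctx->hard_limit)
            return -EDQUOT;                              /* hard limit reached */

        if (ctx->size + write_size > ctx->soft_limit)
            fprintf(stderr, "quota: soft limit crossed\n"); /* warn only */

        return 0;
    }

    int main(void)
    {
        struct quota_ctx ctx = { .size = 900, .soft_limit = 800,
                                 .hard_limit = 1024 };

        /* 100 bytes still fits under the hard limit: allowed, but warns */
        printf("ret = %d\n", quota_check_limit(&ctx, 100));
        /* 200 bytes would cross the hard limit: denied with -EDQUOT */
        printf("ret = %d\n", quota_check_limit(&ctx, 200));
        return 0;
    }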
Not only should it be similar to XFS's quota, but we should actually be
able to have XFS do the enforcement if the user so chooses.  Ditto for
other local filesystems with similar-enough quota functionality.  In
those cases we'd be there only to help manage the local FS.
Since XFS doesn't allow hard links across directory tree quota boundaries (we get EXDEV), it would prevent gluster from creating the ".glusterfs" directory entries. So Gluster quota does both the accounting and the enforcement of quota.
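
The failure mode is along these lines: with an XFS project (directory
tree) quota applied to a data directory on the brick, link(2) into the
brick's .glusterfs gfid store crosses a project boundary and fails with
EXDEV. The paths in this small C sketch are placeholders, not real brick
layouts.

    #include <stdio.h>
    #include <errno.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        /* data file under a project-quota directory, and the hard-link
         * target under the brick's .glusterfs gfid store */
        const char *datafile  = "/brick/projdir/file";
        const char *gfid_link = "/brick/.glusterfs/aa/bb/gfid-link";

        if (link(datafile, gfid_link) == -1 && errno == EXDEV)
            fprintf(stderr, "link failed: %s\n", strerror(errno));

        return 0;
    }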

* Quota is moved to the server side. The server-side implementation
removes the dependency on the client for specific calls and prevents
quota from being bypassed by mounting with a modified volfile.
Absolutely agree that this is required.

To get the cluster view, a trusted quota client process will be spawned
on a set of 'n' randomly chosen bricks, containing only the cluster
xlators, to aggregate the size across all the bricks of the volume. By
querying with getxattr on the directories at a fixed time interval (say
t secs), it updates the context of the quota xlator in the server graph
by sending a setxattr with a key in the dict. The interval t depends on
the lists, in descending order for 1. below the soft limit and 2. above
the soft limit; and it is tunable.
Can you elaborate a bit on how this part is supposed to work?  What
we've talked about before (since CloudFS days) is that there would be a
"quota rebalancing daemon" that would observe when we're about to run
out of quota on one brick, and "borrow" quota from another brick, and so
on ad infinitum.  That sounds roughly like what you're suggesting,
except that there will be multiple such daemons active at once.  How do
they relate to one another?  Are they dividing the work among
themselves, using something like the same methods already in DHT and
proposed for parallel geo-replication?  What algorithms do they use to
decide when to intervene, and in what way?  A too-simple algorithm might
be prone to thrashing quota around as usage fluctuates, so we'll
probably need to build in some sort of damping function.

No, it's not the same approach as you explained above. We don't assign a fixed size to each brick and change it when one of them reaches its limit.

What we're thinking:

Moving quota from the client to the server loses the cluster view, so it
just needs to know the cluster-wide disk resource consumption for the
directories on which limits are set. The gluster client process on the
server side (the trusted client) periodically queries the xattrs (quota
sizes on all the bricks) and aggregates them. By sending the aggregated
sizes (cluster-wide consumption) through setxattr, the quota xlator in
the server graph gets the cluster-wide quota consumption, and thereby
the server-side quota xlator enforces the quota.
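
A minimal C sketch of one aggregation pass, assuming a per-brick size
xattr and an invented key for the published total; the real trusted
client would query the bricks through the cluster xlators rather than
through local getxattr/setxattr calls as shown here.

    #include <stdint.h>
    #include <sys/xattr.h>

    /* Key names are assumptions for this sketch. */
    #define QUOTA_SIZE_KEY  "trusted.glusterfs.quota.size"
    #define QUOTA_AGGR_KEY  "trusted.glusterfs.quota.size-aggregated"

    /* Sum one directory's usage across all bricks of the volume. */
    static int64_t aggregate_dir_size(const char *brick_dirs[], int nbricks)
    {
        int64_t total = 0;

        for (int i = 0; i < nbricks; i++) {
            int64_t size = 0;
            ssize_t ret = getxattr(brick_dirs[i], QUOTA_SIZE_KEY,
                                   &size, sizeof(size));
            if (ret == (ssize_t) sizeof(size))
                total += size;   /* brick-local contribution */
        }
        return total;
    }

    /* Publish the cluster-wide total back via setxattr so the server-side
     * quota xlator can update its inode ctx and enforce the limits. */
    static int publish_dir_size(const char *dir, int64_t total)
    {
        return setxattr(dir, QUOTA_AGGR_KEY, &total, sizeof(total), 0);
    }

    int main(void)
    {
        /* placeholder paths for one directory that has a limit set */
        const char *bricks[] = { "/brick1/dir", "/brick2/dir" };
        int64_t total = aggregate_dir_size(bricks, 2);

        return publish_dir_size("/mnt/trusted-client/dir", total) ? 1 : 0;
    }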

- Varun Shastry


