Re: [Gluster-devel] Performance Translators' Stability and Usefulness

From:	Shehjar Tikoo
Subject:	Re: [Gluster-devel] Performance Translators' Stability and Usefulness
Date:	Sat, 04 Jul 2009 12:17:03 +0530
User-agent:	Mozilla-Thunderbird 2.0.0.19 (X11/20090103)

Gordan Bobic wrote:

Just reading through the wiki on this and a few things are unclear, so I'm hoping someone can clarify.
1) readahead
- Is there any point in using this on systems where the interconnect <= 1Gb/s? The wiki implies there is no point in this, but doesn't quite state it explicitly.

I am pretty sure it helps. Whether to use read-ahead depends more on the
workload than on the interconnect; e.g. it will certainly be useful for
sequential reading.
Of course, there can be cases where excessive read-ahead chokes a
100 Mib/s link, but then read-ahead can be configured to use less of
the network by reducing the page-count option.
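
To make the effect of page-count concrete, here is a rough Python sketch
(not GlusterFS code) of how much data one read-ahead window puts on the
wire and how long it keeps the link busy. The page size is an assumed
value chosen for illustration, and the function names are hypothetical.

```python
# Rough sketch: how much data one read-ahead window puts on the wire.
# PAGE_SIZE is an assumed value for illustration, not a documented default.
PAGE_SIZE = 128 * 1024  # bytes per read-ahead page (assumption)


def window_bytes(page_count, page_size=PAGE_SIZE):
    """Bytes prefetched per file when read-ahead keeps page_count pages in flight."""
    return page_count * page_size


def link_busy_seconds(page_count, link_bits_per_s, page_size=PAGE_SIZE):
    """Time the link spends carrying one full window, ignoring protocol overhead."""
    return window_bytes(page_count, page_size) * 8 / link_bits_per_s


if __name__ == "__main__":
    # Compare a few page-count settings on a 100 Mb/s link.
    for count in (16, 4, 2):
        busy_ms = link_busy_seconds(count, 100e6) * 1000
        print(f"page-count {count}: {window_bytes(count)} bytes, {busy_ms:.1f} ms")
```

Under these assumptions, halving page-count halves the time each
prefetch burst occupies the link, at the cost of less data being
available locally when the application asks for it.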

- Is there any point in using this on a server that is also its own client when used with replicate/afr? I'm guessing there isn't since the local fs will be doing its own read-ahead but I'd like some confirmation on that.

No. Generally, read-ahead is most beneficial on the client side, since
it avoids a trip over the network when an application needs data that
has already been read ahead. And yes, on the server side, the on-disk
file system's own read-ahead already does this best.

In your setup above, if the system has more than a few CPUs/cores,
it might be possible to get slightly better performance by using
io-threads on the client. That makes it possible to offload the
read-ahead to an io-thread without blocking the main glusterfs thread.
The benefit of read-ahead + io-threads would then show up when the data
is actually needed and can be served without a kernel entry/exit for a
file system call.


2) io-threads

Is this (usefully) applicable on the client side?


It is. Using io-threads on the client side helps offload the processing
of individual file operations onto a separate thread, freeing up
the main thread to perform other tasks. This is especially applicable
when using io-threads under the write-behind and/or read-ahead
translators, where the write-behind and read-ahead requests, i.e.
essentially background or asynchronous requests, can be offloaded to the
threads, freeing up the main glusterfs thread to handle sync requests,
i.e. requests that could make the application block on a syscall.

Also, using io-threads on client side could help in performing network
IO in a separate thread, again freeing up the main thread for other
in-band tasks.

Then again, if the workload is not concurrent in terms of number of
processes or number of files/dirs, then io-threads might not help much.
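
The offloading idea can be sketched in Python as an analogy only; this
is not how glusterfs is implemented, and background_request and run are
hypothetical names. Background requests go to a small pool of worker
threads while the submitting thread stays free for other work.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def background_request(name, seconds=0.05):
    """Stand-in for an asynchronous request such as a read-ahead or write-behind."""
    time.sleep(seconds)  # simulated network or disk wait
    return f"{name} done"


def run(requests, thread_count=4):
    """Hand background requests to worker threads; the caller is not blocked."""
    handled_in_main = []
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        futures = [pool.submit(background_request, r) for r in requests]
        # submit() returns immediately, so the main thread can do other work here.
        handled_in_main.append("sync request served by main thread")
        # Collect the background results in submission order.
        results = [f.result() for f in futures]
    return handled_in_main, results


if __name__ == "__main__":
    main_work, results = run(["read-ahead-1", "write-behind-1"])
    print(main_work, results)
```

If only one request is ever outstanding, the pool has nothing to run in
parallel, which is the same reason io-threads does little for a
non-concurrent workload.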

3) io-cache
The wiki page has the same paragraph pasted for both io-threads and io-cache. Are they the same thing, or is this a documentation bug?

No, they're not the same. The documentation is still in flux. Hope
this version will help:
http://www.gluster.org/docs/index.php/Translators_options

What does io-cache do?

io-cache is a translator that caches data from files so that future
references do not lead to network requests. It is generally used along
with read-ahead so that the data that gets read ahead or any data that
gets read, for that matter, will be available from the local client
cache. We're also working on incorporating support for write buffering
in io-cache so that write operations can also benefit from local
buffering until a point in time suitable for actual transmission to the
server.
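
As a minimal sketch of the read-caching idea, assuming a simple LRU
policy and a hypothetical fetch callable standing in for the network
(neither is taken from the io-cache source), repeated reads of the same
page are answered locally:

```python
from collections import OrderedDict


class PageCache:
    """Tiny LRU read cache: repeat reads of a page skip the simulated network."""

    def __init__(self, fetch, max_pages=64):
        self.fetch = fetch  # hypothetical callable(path, page_no) -> bytes
        self.max_pages = max_pages
        self.pages = OrderedDict()
        self.network_requests = 0

    def read(self, path, page_no):
        key = (path, page_no)
        if key in self.pages:
            self.pages.move_to_end(key)  # mark as most recently used
            return self.pages[key]
        self.network_requests += 1  # cache miss: go to the server
        data = self.fetch(path, page_no)
        self.pages[key] = data
        if len(self.pages) > self.max_pages:
            self.pages.popitem(last=False)  # evict the least recently used page
        return data


if __name__ == "__main__":
    cache = PageCache(lambda path, n: f"{path}:{n}".encode(), max_pages=2)
    cache.read("/etc/fstab", 0)
    cache.read("/etc/fstab", 0)  # served from the cache
    print(cache.network_requests)
```

The sketch ignores invalidation, which a real cache on a shared volume
would need so that changes made by other clients are noticed.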

Finally - which translators are deemed stable (no known issues - memory leaks/bloat, crashes, corruption, etc.)?


We can definitely vouch for a higher degree of stability of the
releases. Otherwise, I don't think there is any performance translator
we can call completely stable/mature, because of the roadmap we have for
constantly upgrading algorithms, functionality, etc.

Any particular suggestions on which performance translator combination would be good to apply for a shared root AFR over a WAN? I already have read-subvolume set to the local mirror, but any improvement is welcome when latencies soar to 100ms and b/w gets hammered down to 1-2.5 Mb/s.



WANs are generally characterised as having a large bandwidth-delay
product. That basically means that, for good throughput, we should be
pipelining as much data as possible over the link, so that the long
latency overhead can be mitigated or amortised by sending a larger
amount of data for the same fixed overhead.
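
A small Python sketch of the arithmetic, using the figures from the
question (2.5 Mb/s and 100 ms). Treating the 100 ms as a round-trip
time is an assumption, and the function names are hypothetical.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_s):
    """Bandwidth-delay product: bytes that must be in flight to keep the link full."""
    return bandwidth_bits_per_s * rtt_s / 8


def one_at_a_time_throughput(request_bytes, bandwidth_bits_per_s, rtt_s):
    """Bits/s achieved when each request waits for its reply before the next is sent."""
    transfer_s = request_bytes * 8 / bandwidth_bits_per_s
    return request_bytes * 8 / (rtt_s + transfer_s)


if __name__ == "__main__":
    bandwidth = 2.5e6  # bits per second, from the question
    rtt = 0.1          # seconds; assumes the quoted 100 ms is a round trip
    print(f"BDP: {bdp_bytes(bandwidth, rtt):.0f} bytes")
    for size in (4096, 128 * 1024):
        rate = one_at_a_time_throughput(size, bandwidth, rtt)
        print(f"{size} byte requests: {rate / 1e6:.2f} Mb/s")
```

Under these assumptions, issuing 4 KiB requests one at a time reaches
only about 0.29 Mb/s of the 2.5 Mb/s link, while 128 KiB requests reach
about 2.0 Mb/s, which is why larger or pipelined transfers matter on
such a link.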

That said, what particular workload is it that gives you a throughput of
1-2.5 Mb/s?

When you say "latencies soar to 100ms", does that mean these are just
unusual spikes, or is that the normal latency observed?

It'd help to see your volfiles and how the performance translators are
arranged.

Another thing - when a node works standalone in AFR, performance is pretty good, but as soon as a peer node joins, even though the original node is the primary, performance degrades on the primary node quite significantly, even though the interconnect is direct gigabit, which shouldn't be adding any particular latency (< 0.1ms) or overheads, especially on the primary node. Is there any particular reason for this degradation? It's OK in normal usage, but some operations (e.g. building a big bootstrapping initrd (50MB compressed, including all the kernel drivers)) take nearly 10x longer when the peers join than when the node is standalone. I expected some degradation, but only on the order of added network latency, and this is way, way more. I tried with and without direct-io=off, and that didn't make a great amount of difference. Which performance translators are likely to help with this use case?


I think Vikas will be able to answer that better.

-Shehjar

Gordan

