Re: [Gluster-devel] question on time-out parameters
From: Pranith Kumar Karampuri
Subject: Re: [Gluster-devel] question on time-out parameters
Date: Wed, 1 Aug 2012 05:13:51 -0400 (EDT)
Jules,
When a frame hits its time-out, 'rpc/rpc-lib/src/rpc-clnt.c:138:call_bail
(void *data)' is triggered.
When the client observes a network disconnection (ping-timer expiry etc.), it
triggers 'rpc/rpc-lib/src/rpc-clnt.c:341:saved_frames_unwind (struct
saved_frames *saved_frames)'. When a node goes down, the ping timer expires and
the frames are then unwound within at most ~42 seconds. So in the VM scenario
it won't hang for 30 minutes.
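As a rough illustration of the two paths above, here is a minimal sketch in C.
It is a hypothetical model, not the actual rpc-clnt.c code: the function name
seconds_until_call_fails and its parameters are made up for this example, and
it assumes both timers start at the moment the call is sent.

```c
#include <assert.h>

/*
 * Hypothetical model (NOT GlusterFS code): how long a pending call waits
 * before it is failed back to the caller.
 *
 * peer_reachable: non-zero if the server still answers pings.
 * ping_timeout:   seconds before an unanswered ping is treated as a
 *                 disconnect, after which all saved frames are unwound.
 * frame_timeout:  seconds before a single unanswered frame is bailed out.
 */
static int
seconds_until_call_fails(int peer_reachable, int ping_timeout,
                         int frame_timeout)
{
    assert(ping_timeout > 0 && frame_timeout > 0);

    if (peer_reachable) {
        /* Connection is alive, so only the per-frame timer can fire. */
        return frame_timeout;
    }

    /* Peer is down: whichever timer expires first ends the wait. */
    return (ping_timeout < frame_timeout) ? ping_timeout : frame_timeout;
}
```

With the defaults discussed in this thread, seconds_until_call_fails(0, 42,
1800) gives 42 for a powered-off node, while seconds_until_call_fails(1, 42,
1800) gives 1800 for a reachable server that never replies to the frame.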
To answer your actual question, why such a big frame timeout: AFR takes
entry-locks while performing self-heals, which block other entry fops like
create, delete etc. The timeout is set large enough for those blocked entry
operations to succeed.
AFR used to take a lock on the entire file to perform data-self-heal on a
regular file; we managed to remove that. We are working on doing the same for
entry-self-heal. Once that happens we will be in a good position to change
these to lower values.
Pranith.
----- Original Message -----
From: "Jules Wang" <address@hidden>
To: "devel" <address@hidden>
Sent: Wednesday, August 1, 2012 1:55:47 PM
Subject: [Gluster-devel] question on time-out parameters
Hi all,
When I was tracking the bug https://bugzilla.redhat.com/show_bug.cgi?id=794699
I noticed that the default value of "ping-timeout" was 42 and the default value
of "frame-timeout" was 1800 (30 minutes) (in
xlators/protocol/client/src/client.c).
When a node is down (e.g. powered off), the volume will be out of service for a
long time. If a VM is running on the volume, it will probably crash.
So I wonder why we set such large values for these parameters?
Best Regards.
Jules Wang