monit-general

Re: Check program problem


From: Carina Haupt
Subject: Re: Check program problem
Date: Mon, 19 Nov 2012 16:59:23 +0100

Perhaps a timeout after the start could also help. I have this in the definition of my start command:

'/tmp/script.sh start' timeout 30 second(s)

Ciao Carina

On 19.11.2012 16:43, Jan-Henrik Haukeland wrote:
I'm not sure I understand the problem, but that does not prevent me from having 
a suggestion :) I'm wondering if the every statement could help in this 
situation? As in:

check program with path '/tmp/script.sh'
   every 2 cycles
   if status != 0 then exec '/tmp/some_service.sh restart'

Any luck with that?


On Nov 19, 2012, at 12:12 PM, Dmitry Zamaruev <address@hidden> wrote:

Hi,

I'm using 'check program' to monitor a thread leak in one of our applications. 
Everything works nicely, except that the application is always restarted twice. 
I dug through the source code and found that this seems to be related to how 
'check program' is handled.
Here is my configuration example:

check program with path '/tmp/script.sh'
   if status != 0 then exec '/tmp/some_service.sh restart'

Here is the workflow I'm seeing:

- Poll period #1:
   - start /tmp/script.sh

- Poll period #2:
   - collect exit code from /tmp/script.sh
   - raise event with status = 1
   - start /tmp/script.sh  <<== problem here: the script is run against the 
service before the restart, so it will return status=1
   - process event - exec '/tmp/some_service.sh restart'

- Poll period #3:
   - collect exit code from /tmp/script.sh
   - raise event with status = 1
   - start /tmp/script.sh  <<== here the script is run against the fresh service, 
after the restart in poll period #2
   - process event - exec '/tmp/some_service.sh restart'

- Poll period #4:
   - collect exit code from /tmp/script.sh
   - exit status == 0, so all ok now

If I try a different condition, for example 'status == 1 for 2 cycles', this event 
chain just gets longer: after two failures monit restarts the application, but 
because the next poll cycle also reports a failure (three failed cycles in a row), 
monit still matches 'status == 1 for 2 cycles' and restarts it again.
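
To make the ordering concrete, here is a small toy model of the poll loop as I understand it from the workflow above. It is only a sketch, not monit's actual code: the names simulate, cycles and event_before_start are made up for illustration, and the event_before_start ordering is a hypothetical alternative, not something monit is known to offer.

```python
def simulate(polls, cycles=1, event_before_start=False):
    """Return the poll periods in which a restart fires (toy model only).

    cycles: consecutive non-zero statuses required before the restart fires.
    event_before_start: hypothetical ordering in which the restart is
    processed before the script is started again (not the observed behaviour).
    """
    healthy = False      # the service starts out in the failing state
    pending = None       # exit status the in-flight script will report
    failures = 0         # consecutive non-zero statuses collected so far
    restart_polls = []

    for poll in range(1, polls + 1):
        status = pending  # collect the exit code from the previous poll
        failures = failures + 1 if status not in (None, 0) else 0
        fire = failures >= cycles

        if fire and event_before_start:
            healthy = True            # restart before the next script run
        pending = 0 if healthy else 1  # start script; it samples the service now
        if fire and not event_before_start:
            healthy = True            # restart only after the script has started
        if fire:
            restart_polls.append(poll)

    return restart_polls


print(simulate(4))                           # plain 'status != 0' condition
print(simulate(6, cycles=2))                 # 'for 2 cycles' variant
print(simulate(4, event_before_start=True))  # hypothetical reordering
```

In this model the stale exit status collected one poll after the restart is what triggers the second restart. Raising cycles only shifts both restarts to later polls, while processing the event before starting the script again would leave a single restart.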

Is there any way to work around the double restart (a restart takes up to 15-20 
seconds) using monit configuration, either by ignoring the exit status at some 
step or by writing some special condition?

wbr,
Dmitry.




--

Carina Haupt

Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI)
Schloss Birlinghoven
D-53754 Sankt Augustin

Tel.: +49 - 2241 - 14 - 3480
E-mail: address@hidden
Internet: http://www.scai.fraunhofer.de

and

Bonn-Aachen International Center for Information Technology (B-IT)
Dahlmannstrasse 2
D-53113 Bonn

E-mail: address@hidden
Internet: http://www.b-it-center.de


