Re: Check program problem

monit-general


From:	Jan-Henrik Haukeland
Subject:	Re: Check program problem
Date:	Mon, 19 Nov 2012 16:43:22 +0100

I'm not sure I understand the problem, but that does not prevent me from having 
a suggestion :) I'm wondering if the 'every' statement could help in this 
situation, as in:

check program with path '/tmp/script.sh'
  every 2 cycles
  if status != 0 then exec '/tmp/some_service.sh restart'

Any luck with that?


On Nov 19, 2012, at 12:12 PM, Dmitry Zamaruev <address@hidden> wrote:

> Hi,
> 
> I'm using 'check program' to monitor a thread leak in one of our applications. 
> Everything works well, except that the application is always restarted twice. 
> I dug through the source code and found that this appears to be related to 
> how 'check program' is handled.
> Here is an example of my configuration:
> 
> check program with path '/tmp/script.sh'
>   if status != 0 then exec '/tmp/some_service.sh restart'
> 
> Here is the workflow I'm seeing:
> 
> - Poll period #1:
>   - start /tmp/script.sh
> 
> - Poll period #2:
>   - collect exit code from /tmp/script.sh
>   - raise event with status = 1
>   - start /tmp/script.sh  <<== problem here: the script is run against the service before the restart, so it will return status=1
>   - process event - exec '/tmp/some_service.sh restart'
> 
> - Poll period #3:
>   - collect exit code from /tmp/script.sh
>   - raise event with status = 1
>   - start /tmp/script.sh  <<== here the script is run against the fresh service, after the restart in poll period #2
>   - process event - exec '/tmp/some_service.sh restart'
> 
> - Poll period #4:
>   - collect exit code from /tmp/script.sh
>   - exit status == 0, so all ok now
> 
> If I try a different condition, for example 'status == 1 for 2 cycles', the 
> event chain just gets longer: after two failures monit restarts the 
> application, but because the next poll cycle is also a failure (three failed 
> cycles in a row), monit still matches 'status == 1 for 2 cycles' and restarts 
> it again.
> 
> Is there any way to work around the double restart (a restart takes up to 
> 15-20 seconds) using monit configuration, either by ignoring the exit status 
> at some step or by writing some special condition?
> 
> wbr,
> Dmitry.
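
The poll sequence described in the quoted message can be modeled with a small simulation. This is only a sketch of the workflow as Dmitry reports it, not monit's actual implementation: the function name simulate, the required_failures parameter, and the assumption that the service stays healthy after one restart are all hypothetical.

```python
def simulate(cycles, required_failures=1):
    """Model the reported poll sequence (hypothetical, not monit's real code).

    Returns the poll periods in which the restart command was executed.
    """
    service_healthy = False    # assumption: the service starts out leaking threads
    pending_status = None      # exit status of the script started in the previous period
    consecutive_failures = 0
    restarts = []

    for period in range(1, cycles + 1):
        fire = False

        # Step 1: collect the exit code of the script started last period.
        if pending_status is not None:
            if pending_status != 0:
                consecutive_failures += 1
                fire = consecutive_failures >= required_failures
            else:
                consecutive_failures = 0

        # Step 2: start the script again. It sees the service *before*
        # any restart triggered in this same period.
        pending_status = 0 if service_healthy else 1

        # Step 3: process the event by running the restart command.
        if fire:
            service_healthy = True   # assumption: one restart fixes the leak
            restarts.append(period)

    return restarts


if __name__ == "__main__":
    print(simulate(4))                        # 'if status != 0'
    print(simulate(5, required_failures=2))   # 'status == 1 for 2 cycles'
```

Under these assumptions the model restarts the service in two consecutive poll periods in both cases, because the script started in step 2 always reports the state from before the restart in step 3.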

Current Thread

Check program problem, Dmitry Zamaruev, 2012/11/19
- Re: Check program problem, Jan-Henrik Haukeland <=
  - Re: Check program problem, Carina Haupt, 2012/11/19
  - Re: Check program problem, Dmitry Zamaruev, 2012/11/19
    - Re: Check program problem, Eric PAILLEAU, 2012/11/19
    - Re: Check program problem, Dmitry Zamaruev, 2012/11/19
