monit-general

Re: Alerting granularity


From: Martin Pala
Subject: Re: Alerting granularity
Date: Sun, 15 Aug 2010 21:12:35 +0200

You can set an action for unsuccessful recovery attempts using the following 
statement (example):

      if 3 restarts within 5 cycles then alert
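
As a rough sketch of how this could fit into a complete check (the service
name, pidfile, init script paths, and mail address below are hypothetical
placeholders, not taken from the original setup): a failed port test
triggers a restart, and the restart-count test raises the alert only after
repeated restarts. Restricting the alert recipient to the timeout event is
an assumption about how to suppress the mail for the first failure; verify
the event names against the manual for your Monit version.

      # hypothetical service name and paths
      check process myservice with pidfile /var/run/myservice.pid
        start program = "/etc/init.d/myservice start"
        stop program  = "/etc/init.d/myservice stop"
        # assumption: mail only for the restart-limit (timeout) event
        alert oncall@example.com only on { timeout }
        if failed port 81 then restart
        if 3 restarts within 5 cycles then alert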

The next Monit release will also allow customizing the action taken when a 
service does not exist. Changelog excerpt:

--8<--
* Allow to override default action when service doesn't exist (for example
  process is not running, file doesn't exist). Default action is service
  restart, it can be customized with following statement now:
      if [does] not exist [[<x> times within] <y> cycles] then <action1>
     [else if succeeded   [[<x> times within] <y> cycles] then <action2>]
  For example instead of default restart action:
      check file with path /cifs/mydata
        if does not exist for 5 cycles then exec "/usr/bin/mount_cifs.sh"
--8<--
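
Applied to a process check, the same statement might look like the sketch
below. The service name and pidfile are hypothetical, and the statement
only works in a release that includes the change above. Here the default
restart is replaced by an alert that fires only once the process has been
missing for three consecutive cycles:

      # hypothetical service name and pidfile
      check process myservice with pidfile /var/run/myservice.pid
        if does not exist for 3 cycles then alert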




On Aug 12, 2010, at 7:15 PM, Randy Ramsdell wrote:

> Martin Pala wrote:
>> You can specify how many errors are needed to trigger an alert.
>> 
>> For example:
>> 
>>       if failed port 80 for 3 times within 5 cycles then alert
>> 
>> See monit manual for more details:
>> http://www.mmonit.com/monit/documentation/monit.html#service_tests
>> 
>> Regards,
>> Martin
>> 
>> 
>> On Aug 12, 2010, at 5:02 PM, Randy Ramsdell wrote:
>> 
>>  
> 
> When I set the alert address@hidden, the initial failed host check will 
> alert. Then the setting you provided will also alert. Note that I have read 
> everything in the manual about alerting and have tried them, but nothing 
> seems to do what we need.
> 
>>> Hi,
>>> 
>>> I have been trying to configure alerting for our 24/7 on-call person. One 
>>> issue I cannot seem to stop is the initial alert. Monit alerts as soon as 
>>> the service fails, which I do not want. I only want Monit to alert if the 
>>> service fails for a specific time. We use an exec on failure but need to 
>>> know when the exec is unable to restart the service for a specified time 
>>> period.
>>> 
>>> I have tried almost every form of alerting but can't find a way to not 
>>> alert on the initial failure.
>>> 
>>> Is it possible to alert the way we need?
>>> 
>>> 
>>> Example: This host runs the same service on different ports, which we 
>>> rotate behind a load balancer.
>>> 
>>> 
>>> 
>>> check host blah blah
>>> 
>>> alert address@hidden
>>> 
>>> if failed host localhost port 81 send "POST / HTTP/1.15 blah blah" expect 
>>> "ZP4" then exec "$RUNSCRIPT"
>>> if failed host localhost port 82 send "POST / HTTP/1.15 blah blah" expect 
>>> "ZP4" then exec "$RUNSCRIPT"
>>> if failed host localhost port 83 send "POST / HTTP/1.15 blah blah" expect 
>>> "ZP4" then exec "$RUNSCRIPT"
>>> if failed host localhost port 84 send "POST / HTTP/1.15 blah blah" expect 
>>> "ZP4" then exec "$RUNSCRIPT"
>>> 
>>> 
>>> --
>>> To unsubscribe:
>>> http://lists.nongnu.org/mailman/listinfo/monit-general



