monit-general

Re: Host connection test and alert/exec


From: Jan-Henrik Haukeland
Subject: Re: Host connection test and alert/exec
Date: Fri, 17 Oct 2003 18:26:46 +0200
User-agent: Gnus/5.1002 (Gnus v5.10.2) XEmacs/21.4 (Reasonable Discussion, linux)

Andreas Rust <address@hidden> writes:

> Another thing I have been thinking about is that, whenever the line
> is cut to one of the remotely monitored machines and I am checking
> some 3-5 services on each machine, I am still flooded with 3-5
> alerts depending on the network problem. Possibly even more.
> Can't see an elegant way to surround this yet.

One solution, although far from perfect, is to use the new ping test:

check host xyzzy with address xyzzy.foo.bar
      if failed icmp type echo then unmonitor
      if failed port 80 protocol http  then alert
      if failed port 443 type TCPSSL then alert 
      alert address@hidden

The idea here is to use the ping test on the host and if this test
fails then the host is not monitored anymore and you will not get mail
for failed port connection tests. 
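
To make the intent concrete, here is a toy model of that behaviour in
Python. It is not monit's code: the function name run_host_checks and
its arguments are invented for illustration, and it only models the
port alerts, assuming the ping test is evaluated first.

```python
def run_host_checks(ping_ok, port_checks):
    """Toy model of the control file entry above (not monit's code).

    ping_ok:     True if the icmp echo test succeeded.
    port_checks: dict mapping a label to True (connected) or False (failed).
    Returns (still_monitored, port_alerts).
    """
    if not ping_ok:
        # "if failed icmp type echo then unmonitor":
        # the host leaves monitoring mode, so no port test runs or alerts.
        return False, []
    # Host answers ping: every failed port test produces its own alert.
    port_alerts = [label for label, ok in port_checks.items() if not ok]
    return True, port_alerts


if __name__ == "__main__":
    # Line is cut: one unmonitor event instead of one mail per port.
    print(run_host_checks(False, {"http": False, "https": False}))
    # Host is up but one service is down: a single port alert.
    print(run_host_checks(True, {"http": True, "https": False}))
```

With the line cut, the sketch reports no port alerts at all, which is
the flood reduction you were asking about.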

But there are a couple of problems with this: first, you must reset
the service to monitoring mode again manually, either by clicking the
"Enable monitoring" button in the web interface or calling monit from
the console: "monit monitor xyzzy". 

Second and worse, it won't work because the monit code currently runs
the port-connection test before the icmp ping test, regardless of how
the if-tests are written in the control file. I have fixed this last
problem, so the icmp ping test will now always run first (if
defined). You must download the updated validate.c code and recompile
monit. Download the file here:

http://savannah.gnu.org/cgi-bin/viewcvs/*checkout*/monit/monit/validate.c?rev=1.95

> Last but not least I am wondering how exactly monit handles stuff
> internally.  Imagine some 5 host tests with 15 seconds timeout each
> and a cycleperiod of 60 seconds. How is this being dealt with?  Are
> all tests running at the same time or would I have to raise seconds
> between cycles ?

The cycle sleep and the tests run sequentially, and the tests take the
time they need, so this is not a problem. Maybe this illustration
explains it better:

|<test time>|<sleep>|<     test time 75sec    >|<sleep>|<testtime>|<sleep>|..
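
The same timeline can be worked through with a small simulation. This
is only a sketch of the sequential behaviour described here, not
monit's implementation: the name simulate_cycles and the sample
durations (10, 75 and 10 seconds of testing, 60 seconds of sleep) are
assumptions chosen for the example.

```python
def simulate_cycles(test_times, sleep_seconds):
    """Return the (start, end) time of each test phase, in seconds.

    Test phases and sleep phases alternate strictly in sequence, so a
    slow test phase delays the next sleep instead of overlapping it.
    """
    clock = 0
    phases = []
    for duration in test_times:
        start = clock
        clock += duration        # the tests run to completion first
        phases.append((start, clock))
        clock += sleep_seconds   # only then does the sleep period start
    return phases


if __name__ == "__main__":
    for start, end in simulate_cycles([10, 75, 10], 60):
        print(f"tests ran from t={start}s to t={end}s")
```

In this model the 75 second test phase simply pushes the following
sleep and test phases later; nothing is skipped or run in parallel.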

-- 
Jan-Henrik Haukeland



