monit-general

Re: Debugging remote connection failures


From: Jan-Henrik Haukeland
Subject: Re: Debugging remote connection failures
Date: Sat, 20 Sep 2014 00:32:49 +0200

Hi

Network tests do fail from time to time. The cause can be anything from an 
overworked server to traffic spikes on the network. Usually there is no real 
problem, just that Monit was not able to connect within its timeout, which is 
5 seconds by default. This happens in real life too, but browsers, for 
instance, retry and also open several connections at once, so it is not very 
noticeable.
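
As a rough sketch of how that timeout could be raised for a slow service (the 
15 second value is only an assumption for illustration, not a tuned 
recommendation):

check host example.com with address www.example.com
        # allow up to 15 seconds for the connection instead of the default
        if failed port 80 protocol http with timeout 15 seconds then alert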

These alerts, while real, border on false positives, because with continuous 
testing there will sooner or later be a network or server hiccup that happens 
just when Monit tries to connect. What you usually want is to ignore these 
incidents and instead get an alert only if the server really is down for a 
"significant" period.

This is why the "for x cycles" statement is so useful and highly recommended, 
especially for network testing. I see that you are already using "for x within 
Y", but I would just simplify this to something like,

check host example.com with address www.example.com
        if failed port 80 protocol http for 3 cycles then alert
        if failed port 587 protocol smtp for 3 cycles then alert

How many cycles to use is a tuning question and also depends on how often 
Monit runs. Use at least 2, possibly more if Monit runs several times per 
minute.
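
To make the arithmetic concrete, here is a sketch assuming the 60 second cycle 
mentioned in the original question. With "for 3 cycles" the alert is raised 
only after three consecutive failed checks, so the server has to be 
unreachable for at least about two minutes before you hear about it:

set daemon 60    # one test cycle every 60 seconds (assumed interval)
check host example.com with address www.example.com
        if failed port 80 protocol http for 3 cycles then alert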

Running Monit with -Iv is mostly for debugging and not recommended in 
production, as the output is very verbose and usually not very interesting. 
Simply running Monit in the background without any parameters is recommended. 
If an error occurs, Monit will write it to its log file, so you won't miss out 
on the important stuff.
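
For reference, a minimal sketch of the control file line that sets the log 
destination. The path is an assumption, so use whatever location suits your 
system:

set logfile /var/log/monit.log    # assumed path; failed tests are logged here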


On 18 Sep 2014, at 21:41, David Kozinn, K2DBK <address@hidden> wrote:

> New Monit user here, I'm really just kind of kicking the tires.
> 
> I've got several things that I'm monitoring on a small server that I have, 
> but I've also got it set up to monitor services on another box. The relevant 
> portion of monitrc looks like this:
> 
> check host example.com with address www.example.com
>         if failed port 80 protocol http 3 times within 5 cycles then alert
>         if failed port 587 protocol smtp then alert
> 
> The vast majority of the time this works just fine. However, periodically 
> I'll get a failure on one (or very occasionally on both) of these tests, 
> which clears up on the next test cycle (60 seconds later). A few times I've 
> been connected to the machine running monit and as soon as I get the failure, 
> I'll try to manually telnet to the other machine on the appropriate port and 
> it's always worked. I'm trying to figure out why it's failing.
> 
> The problem is that this doesn't happen terribly frequently, so I'm thinking 
> that just running with -Iv might not be practical, since I'd get tons of 
> output. (And to be honest, I'm not quite sure if I'd even see anything there.)
> 
> Can anyone suggest the best way to figure out why these tests are actually 
> failing? Maybe run with verbose mode then tail & filter the output? (Filter 
> for what?)
> 
> Thanks.
> 
> David
> --
> To unsubscribe:
> https://lists.nongnu.org/mailman/listinfo/monit-general



