
Re: monit 4.6 problem


From: Pavel Urban
Subject: Re: monit 4.6 problem
Date: Wed, 04 Jan 2006 09:57:54 +0100
User-agent: Mozilla Thunderbird 1.0.7-1.1.fc3 (X11/20050929)

Pavel Urban wrote:
Hello,

monit is acting strangely on our Solaris 8 server.

At 1:51, the test of the LDAP server failed. This was probably log rotation, database maintenance or something similar, which is quite normal. Two cycles later the test passed again, but monit keeps sending out alerts and finally unmonitors the service, even though the logs say everything is OK. To be precise, the log reports OK over and over again, which is not normal either.

Is this a known problem?


Further exploration revealed that it is probably a different problem.

I wanted this: when an important service fails, send me an email alert. Repeat the alert for some time, then send me an SMS and stop monitoring. Don't restart or do anything else.
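
To make the intent concrete, the desired behaviour can be sketched as a small state machine in Python. This is only an illustration of the wish described above, not a model of how monit works internally. The class name `Escalation`, its fields and the thresholds (`reminder_every=2`, `give_up_after=5`) are hypothetical and chosen just for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Escalation:
    """Hypothetical model of the wanted alerting policy (not monit's code)."""
    reminder_every: int = 2   # assumed: repeat the e-mail every N failed cycles
    give_up_after: int = 5    # assumed: failed cycles before SMS + unmonitor
    failed_cycles: int = 0
    monitored: bool = True
    actions: list = field(default_factory=list)

    def cycle(self, test_ok: bool) -> None:
        """Process the result of one poll cycle."""
        if not self.monitored:
            return  # once unmonitored, nothing is sent any more
        if test_ok:
            if self.failed_cycles:
                # exactly one recovery mail, then silence
                self.actions.append("email: passed")
            self.failed_cycles = 0
            return
        self.failed_cycles += 1
        first = self.failed_cycles == 1
        reminder = self.failed_cycles % self.reminder_every == 0
        if first or reminder:
            self.actions.append("email: failed")
        if self.failed_cycles >= self.give_up_after:
            self.actions.append("sms: unmonitor")
            self.monitored = False


# A short outage followed by recovery: no SMS, no unmonitor, no repeats.
e = Escalation()
for ok in [False, False, True, True, True]:
    e.cycle(ok)
print(e.actions, e.monitored)
```

The point of the sketch is that the failure counter is reset as soon as the test passes, so a short outage should produce one recovery mail and nothing further.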

I've tried to do it this way:

set alert address@hidden with reminder on 10 cycles
set alert address@hidden on { timeout }

check process ldap-master with pidfile global/ldapims/d1/ldapmaster/slapd-master-1/logs/pid
  if failed host 192.168.100.107 port 389 protocol ldap3 then alert
  if failed host 192.168.100.107 port 389 protocol ldap3 for 5 cycles then unmonitor
  mode passive

This configuration behaves as follows: when it detects a problem, it sends out a warning.

Jan 4 01:51:19 ims1 monit[5358]: [ID 702911 user.error] LDAP: error receiving data -- Resource temporarily unavailable
Jan 4 01:51:19 ims1 monit[5358]: [ID 702911 user.error] 'ldap-master' failed protocol test [LDAP3] at INET[192.168.100.107:389].

After a (quite short) time, monit detects the recovery and sends out a 'pass' alert.

Jan 4 01:55:36 ims1 monit[5358]: [ID 702911 user.error] 'ldap-master' connection passed

After about 18 minutes, however, it sends out some kind of 'unmonitor' alert, but without actually unmonitoring or logging anything (note the last line):

*********************
Connection failed Service ldap-master

        Date:   Wed, 04 Jan 2006 02:09:39 +0100
        Action: unmonitor
        Host:   ims1

Your faithful employee,
monit

'ldap-master' connection passed
*********************

From then on, monit keeps sending this mail every 20 minutes.

???

--
***********************************************************************
Pavel Urban (address@hidden)
IOL system disaster
Internet OnLine, www.iol.cz (owned by Czech Telecom, www.ct.cz)
***********************************************************************
   Vegetables should not operate electronic equipment.
          Computer Stupidities, http://rinkworks.com/stupid/
***********************************************************************



