[monit] monit 5.0 beta4 bug - sends same message every cycle
From: Aleksander Kamenik
Subject: [monit] monit 5.0 beta4 bug - sends same message every cycle
Date: Wed, 19 Nov 2008 13:31:34 +0200
User-agent: Thunderbird 2.0.0.16 (X11/20080723)
Hi,
This is the second time this bug has occurred; the first time was on
13th Nov, also with beta4.
monit detects a high load at 05:12 (expected):
"Monit alert devel.kisise at Wed, 19 Nov 2008 05:12:11 +0200 on devel
loadavg(1min) of 3.1 matches resource limit [loadavg(1min)>3.0]"
The load stayed that high for only about a minute, but instead of the
resource succeeded message I got the same message again on the next
cycle (55s), and on the next one, and so on.
I got almost 400 messages, all exactly the same, before I arrived at
work at noon and shut down monit.
Running "monit unmonitor all" did not stop the messages from being sent.
"monit summary" showed that no services were monitored, but the messages
still kept coming.
Shutting down monit stopped the messages, but as soon as I started monit
up again, even with the services unmonitored, it started spamming me
with the same message again. I tried to monitor and unmonitor again, but
this did not help.
So this buggy state survives restarts.
I shut down monit again. Here is my little investigation; note the
bunch of *_devel.kisise files:
devel:/var/monit # pwd
/var/monit
devel:/var/monit # ll
total 136
-rw------- 1 root root 154 Nov 18 05:13 1226977982_devel.kisise
-rw------- 1 root root 152 Nov 18 05:39 1226979565_devel.kisise
-rw------- 1 root root 196 Nov 19 05:12 1227064336_devel.kisise
-rw------- 1 root root 154 Nov 19 05:12 1227064376_devel.kisise
-rw------- 1 root root 152 Nov 19 05:38 1227065904_devel.kisise
-rw------- 1 root root 156 Nov 19 10:31 1227083466_apache2_bin
-rw------- 1 root root 157 Nov 19 10:31 1227083466_apache2_init
-rw------- 1 root root 154 Nov 19 10:31 1227083466_bootfs
-rw------- 1 root root 152 Nov 19 10:31 1227083466_cron
-rw------- 1 root root 154 Nov 19 10:31 1227083466_devel.kisise
-rw------- 1 root root 160 Nov 19 10:31 1227083466_mysqld_bin
-rw------- 1 root root 161 Nov 19 10:31 1227083466_mysqld_init
-rw------- 1 root root 164 Nov 19 10:31 1227083466_mysqldsafe_bin
-rw------- 1 root root 157 Nov 19 10:31 1227083466_ntpd_bin
-rw------- 1 root root 158 Nov 19 10:31 1227083466_ntpd_init
-rw------- 1 root root 157 Nov 19 10:31 1227083466_postfix_bin
-rw------- 1 root root 158 Nov 19 10:31 1227083466_postfix_init
-rw------- 1 root root 154 Nov 19 10:31 1227083466_rootfs
-rw------- 1 root root 157 Nov 19 10:31 1227083466_samba_init
-rw------- 1 root root 154 Nov 19 10:31 1227083466_sshd_bin
-rw------- 1 root root 155 Nov 19 10:31 1227083466_sshd_init
-rw------- 1 root root 152 Nov 19 10:31 1227083469_apache2
-rw------- 1 root root 155 Nov 19 10:31 1227083469_mysql
-rw------- 1 root root 153 Nov 19 10:31 1227083469_ntpd
-rw------- 1 root root 153 Nov 19 10:31 1227083469_postfix
-rw------- 1 root root 161 Nov 19 10:31 1227083469_samba_smbd_bin
-rw------- 1 root root 150 Nov 19 10:31 1227083469_smb
-rw------- 1 root root 150 Nov 19 10:31 1227083469_sshd
-rw------- 1 root root 146 Nov 19 10:37 1227083853_devel.kisise
-rw------- 1 root root 146 Nov 19 11:22 1227086528_devel.kisise
-rw------- 1 root root 146 Nov 19 13:16 1227093413_devel.kisise
-rw------- 1 root root 152 Nov 19 13:17 1227093437_devel.kisise
-rw------- 1 root root 154 Nov 19 13:19 1227093593_devel.kisise
-rw------- 1 root root 146 Nov 19 13:21 1227093677_devel.kisise
devel:/var/monit # grep 3.1 *
Binary file 1227064336_devel.kisise matches
devel:/var/monit # strings 1227064336_devel.kisise
devel.kisise
loadavg(1min) of 3.1 matches resource limit [loadavg(1min)>3.0]
devel:/var/monit #
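
For anyone who wants to check their own directory for the same pattern, here is a minimal Python sketch. It assumes the files in /var/monit are named <epoch seconds>_<service name>, which is what the listing above suggests but which I have not confirmed against the monit source. The function names are made up for illustration and are not part of monit.

```python
from collections import defaultdict
from datetime import datetime, timezone
from pathlib import Path


def parse_queue_name(name):
    """Split '<epoch>_<service>' into (UTC datetime, service).

    Returns None if the name does not follow the assumed pattern.
    """
    prefix, sep, service = name.partition("_")
    if not sep or not prefix.isdigit() or not service:
        return None
    return datetime.fromtimestamp(int(prefix), tz=timezone.utc), service


def group_by_service(names):
    """Map each service name to a sorted list of its file timestamps."""
    grouped = defaultdict(list)
    for name in names:
        parsed = parse_queue_name(name)
        if parsed is not None:
            when, service = parsed
            grouped[service].append(when)
    for times in grouped.values():
        times.sort()
    return dict(grouped)


if __name__ == "__main__":
    # Directory taken from the listing above; adjust for your setup.
    queue_dir = Path("/var/monit")
    names = []
    if queue_dir.is_dir():
        names = [p.name for p in queue_dir.iterdir() if p.is_file()]
    for service, times in sorted(group_by_service(names).items()):
        print(f"{service}: {len(times)} file(s), oldest {times[0].isoformat()}")
```

Under that naming assumption, 1227064336_devel.kisise decodes to 2008-11-19 03:12:16 UTC, i.e. 05:12:16 +0200, which matches the time of the first alert.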
The only fortunate thing about this is that devel.kisise is the only
box which sends only emails, but no SMS. :)
This error obviously does not occur every night, but tonight was the
second time. Last time a proper restart of monit killed the bug; this
time it did not.
The box is running SLES10SP2 x86. This is monit 5.0 beta4, I'd say this
bug was introduced in one of the last betas.
If you need any more info, ask.
Regards,
--
Aleksander Kamenik
System Administrator
Krediidiinfo AS
an Experian Company
Phone: +372 665 9649
Email: address@hidden
http://www.krediidiinfo.ee/
http://www.experiangroup.com/