monit-general

failure with 'unmonitor all'


From: Ben Hartshorne
Subject: failure with 'unmonitor all'
Date: Fri, 7 Oct 2005 09:14:05 -0700
User-agent: Mutt/1.5.9i

Hi all,

I recently experienced a problem that I have been unable to reproduce.
I'm curious if any of you have seen something similar.

My monitoring host's upstream provider became very flaky, and I was
inundated with false positives. All my tests are URL tests of the
form:

check host Google with address www.google.com
        start program = "/bin/true"
        stop program = "/bin/true"
        if 2 restarts within 3 cycles then timeout
        if failed url http://www.google.com/
                then restart

The tests were in various states of up, down, and somewhere in the
middle.

I was aware of the source of the problem, so decided to suspend
monitoring until my upstream provider became stable again.

I ran 'monit unmonitor all' and then checked the monit web server. It
listed 6 out of 9 services in state 'unmonitored' and the other three in
other states. Unfortunately, I didn't write down their exact states,
but at least one was 'connection failed', and I continued to get pages
as those three services went up and down.
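
The same check can also be made from the command line instead of the
web interface. What follows is only a minimal sketch. It assumes that
'monit status' prints one header line per service with the name in
single quotes, followed by an indented "monitoring status" line; that
layout may differ between monit versions. The helper name
still_monitored is made up for illustration.

```shell
#!/bin/sh
# Sketch: list services that are still monitored after 'monit unmonitor all'.
# Assumed input layout (may vary by monit version):
#   Remote Host 'Google'
#     status                  connection failed
#     monitoring status       monitored

# Hypothetical helper: reads status text on stdin and prints the name of
# every service whose "monitoring status" is not "not monitored".
still_monitored() {
    awk -F"'" '
        # Unindented line containing a quoted name: remember the name.
        NF >= 3 && /^[^ \t]/ { name = $2 }
        # Indented monitoring-status line that does not say "not monitored".
        /^[ \t]+monitoring status/ && tolower($0) !~ /not monitored/ { print name }
    '
}

# Usage (needs a running monit daemon):
#   monit unmonitor all
#   monit status | still_monitored
```

If the helper prints nothing, every service reported itself as not
monitored; any name it prints is a service that would still generate
alerts.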

I tried to recreate this on a test host (by running 'unmonitor all'
with services in various states of up and down) but was unable to do so.

Has anyone else seen this before?

Can any of you developers identify a possible situation in which this
might occur?

Thanks,

-ben


-- 
Ben Hartshorne
email: address@hidden
http://ben.hartshorne.net


