monit-general

Re: [monit] Resetting Checksums


From: Martin Pala
Subject: Re: [monit] Resetting Checksums
Date: Sun, 31 Aug 2008 00:28:36 +0200


On Aug 28, 2008, at 4:43 AM, Art Age Software wrote:

Hi,

A couple monit questions:

1. Let's say I have monit monitoring the checksum of a file. I then
make a change to the file which invalidates the checksum. What is the
recommended way to tell monit to regenerate the checksum, so that it
does not alert and unmonitor the file, with the least impact?
So far, the only thing that has worked for me has been to kill and
restart monit itself.

This is simple - just use the "if changed checksum" statement:

--8<--
check file myfile with path /tmp/aaa
    if changed checksum then alert
--8<--

The "if changed checksum" test resets the stored checksum whenever it detects a change, so from the next cycle onward monit checks the file against the new value.
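
To illustrate the reset behavior, here is a minimal shell sketch of the "report once, then adopt the new value" logic. It is not monit's code: the function name check_changed and the state file are made up for the illustration, and cksum stands in for the MD5/SHA1 checksums monit uses.

```shell
# Hypothetical illustration of "if changed checksum" semantics (not monit code).
# $1 = file to watch, $2 = state file holding the checksum from the last cycle.
check_changed() {
    file=$1
    state=$2
    new=$(cksum < "$file")
    old=$(cat "$state" 2>/dev/null || true)
    if [ -n "$old" ] && [ "$new" != "$old" ]; then
        echo "changed"   # this is where monit would run the "then alert" action
    else
        echo "ok"        # first cycle (no baseline yet) or unchanged file
    fi
    # Reset the baseline so the next cycle compares against the new value.
    printf '%s\n' "$new" > "$state"
}
```

Calling it once per cycle reports "changed" only in the cycle right after the edit, then goes back to "ok" without any restart.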



2. When I restart monit, any "mode manual" services that were
monitored become unmonitored after restart. Is there any way to
restart monit and have it resume monitoring all the services it had
been monitoring prior to restarting, including "mode manual" services?


The manual mode was designed for clusters: if a node is stopped, its services are started on another node (by heartbeat, for example). When the original node boots again, monit should not start the same services there, since they would then be running twice (for example, both nodes trying to get the cluster's active/passive shared filesystem).

Monit does, however, store the service state for the unlikely event that monit crashes. If monit is started after such a crash, it recovers the state from before the crash (including the monitoring state of manual mode services). As a workaround, if you are sure that you want to restart monit and keep the service state, you can kill monit with SIGKILL (pkill -9 monit). Monit is then terminated uncleanly and, on the next start, uses its state self-healing to recover the original state.
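
The workaround could be wrapped roughly as follows. This is only a sketch: the helper names run and restart_monit_keep_state and the DRY_RUN switch are invented for the example, and it assumes that running "monit" with no arguments starts the daemon on your system (use your init script instead if that is how monit is started).

```shell
# Run a command, or only print it when DRY_RUN is set (for a rehearsal).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Restart monit while keeping the monitoring state of "mode manual" services.
restart_monit_keep_state() {
    # SIGKILL skips the clean shutdown, so the stored state is kept as crash state.
    run pkill -9 monit
    # Assumption: a bare "monit" starts the daemon, which then recovers that state.
    run monit
}
```

Setting DRY_RUN=1 before calling restart_monit_keep_state prints the two commands without touching the running monit. If monit is respawned by init, the second step may be unnecessary.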

We could also change the manual mode behavior to be persistent across restarts, which may make sense (i.e. if a service was monitored before monit stopped, monitoring is enabled again after monit starts). The cluster framework would then have to unmonitor the manual mode services itself when it is going to stop monit (or the whole node) due to a service failover.

Martin








