monit-general

[PATCH] Running an external script with EXEC on a timeout


From: Patrick Burns
Subject: [PATCH] Running an external script with EXEC on a timeout
Date: Sat, 26 Feb 2005 16:35:22 +1100

I need this feature badly enough that I'm trying to add it myself. I
intend to use monit to look after the nodes in my Heartbeat clusters.
If something on a node fails (e.g. Apache goes down), monit can try to
restart it. However, if a number of restarts are unsuccessful, it would
be good to have the node gracefully leave the cluster and initiate a
fail-over.

I've got this in /etc/monitrc:

---
set daemon 10
set alert address@hidden
check process foo with pidfile /tmp/foo
        if 3 restarts within 5 cycles then exec /tmp/bar
---

/tmp/bar just contains:

---
#!/bin/bash
echo Hello World
---

Output looks like this (edited for brevity):

---
mail:~# monit -I -v -c /etc/monitrc
Runtime constants:
(removed)

The service list contains the following entries:

Process Name          = foo
 Group                = (not defined)
 Pid file             = /tmp/foo
 Monitoring mode      = active
 Timeout              = If 3 restart within 5 cycles then exec else if
 recovered then alert

-------------------------------------------------------------------------------
Starting monit daemon
'foo' process is not running
Does not exist notification is sent to address@hidden
monit: Start or stop method not defined -- process foo
'foo' process is not running
monit: Start or stop method not defined -- process foo
'foo' process is not running
monit: Start or stop method not defined -- process foo
'foo' service timed out and will not be checked anymore
Timeout notification is sent to address@hidden
Monitoring disabled -- service foo
Hello World
^C
monit daemon with pid [3155] killed
You have new mail in /var/mail/patrickb
---

You can see the exec worked, as it printed "Hello World" to the console
after the service timed out.

If I can exec an arbitrary command after a timeout, there's no reason
why I can't put in "/etc/init.d/heartbeat stop" to cause the node to
give up its resources if a service looks terminally broken. (Assuming
the error hasn't propagated to the other node in the cluster as well...)
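
For illustration, here is a rough sketch of what /tmp/bar might look like
for the Heartbeat case. It is untested against a real cluster. The
HEARTBEAT_INIT variable, the leave_cluster function name and the "bar:"
message prefix are my own inventions for this example; only the
"/etc/init.d/heartbeat stop" command comes from the text above. The script
refuses to do anything if the init script is missing, so it is harmless
on a box without Heartbeat installed.

```shell
#!/bin/bash
# Hypothetical replacement for /tmp/bar: make this node leave the cluster.
# HEARTBEAT_INIT is an assumed override hook (handy for dry runs); it
# defaults to the init script path mentioned in the text above.
HEARTBEAT_INIT="${HEARTBEAT_INIT:-/etc/init.d/heartbeat}"

leave_cluster() {
    # Bail out early if the init script is missing or not executable.
    if [ ! -x "$HEARTBEAT_INIT" ]; then
        echo "bar: $HEARTBEAT_INIT not found or not executable" >&2
        return 1
    fi
    echo "bar: stopping heartbeat so the peer node takes over"
    # The return status of the init script becomes the function's status.
    "$HEARTBEAT_INIT" stop
}

# Report a failure on stderr rather than aborting silently.
leave_cluster || echo "bar: could not stop heartbeat" >&2
```

With the monitrc shown earlier, the "then exec /tmp/bar" line would run
this once the restart limit is hit. Pointing HEARTBEAT_INIT at a dummy
script first is a cheap way to check the wiring without triggering a real
fail-over.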

Patch attached...

-- 
  Patrick Burns
  address@hidden

Attachment: timeout.patch
Description: Binary data

