On Wed, Mar 16, 2011 at 05:51:42AM -0700, Adam Beguelin wrote:
Monit supports commands like:
start program = "/etc/init.d/crazyd start"
But it doesn't support:
restart program = "/etc/init.d/crazyd restart"
I have a daemon that expects a restart command so it can gracefully shut down the running process, then start a new one. Restarting via start/stop doesn't do the right thing since it starts a new daemon while the old one is still running.
Any suggestions on how to get around this?
Excuse me if I don't understand the scenario, but are you sure that monit is the right tool for this task?
Is this "restart procedure" that the daemon needs associated with a
failure condition of the daemon that monit is supposed to manage?
(Looks like a faulty daemon)
Anyway, you are free to define the stop/start programs in monit. Maybe you can
write two scripts for the tasks:
- /home/adam/adam-stop: performs a '/etc/init.d/crazyd restart'
- /home/adam/adam-start: does nothing
Or maybe you can use a rule like
... then exec '/home/adam/custom-restart.sh'
Thanks for the response.
Your suggestion could work for restart, but then I won't be able to actually start or stop daemons using monit.
I'm already using monit to monitor daemons of this type. I find that the daemon 'escapes' from monit after monit tries to restart it. I think this is happening because of the way the daemon's stop works. I'm using the ruby gem daemons to wrap my daemon code. When the daemon gets a stop, it sets a flag and waits for the underlying code to exit gracefully (this is how the daemons package works). When monit executes stop and then start, the old daemon can still be running when the start is issued. Monit is confused after that.
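One way around the stop/start overlap described above might be to make the stop program itself block until the old process has really exited, so that a following start never runs alongside it. This is only a minimal sketch under assumptions that are not in the thread: the pidfile path /var/run/crazyd.pid and the 30-second timeout are hypothetical, and wait_for_exit and stop_and_wait are illustrative names, not part of monit or the daemons gem.

```shell
#!/bin/sh
# Sketch of a "stop" wrapper that does not return until the old daemon is gone.
# Assumptions (not from the thread): the pidfile location and the timeout value.

# wait_for_exit PID TIMEOUT_SECONDS
# Returns 0 once PID no longer exists, 1 if it is still alive after the timeout.
wait_for_exit() {
    pid=$1
    remaining=$2
    # kill -0 sends no signal; it only checks whether the process can be signalled.
    # Run this as root or as the daemon's owner, otherwise the check fails early.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# stop_and_wait PIDFILE TIMEOUT_SECONDS
# Ask the init script to stop the daemon, then block until it has exited.
stop_and_wait() {
    pidfile=$1
    timeout=$2
    [ -r "$pidfile" ] || return 0   # no pidfile: nothing to wait for
    pid=$(cat "$pidfile")           # read the pid before stop removes the file
    /etc/init.d/crazyd stop
    wait_for_exit "$pid" "$timeout"
}

# Example call from a stop script (pidfile path is hypothetical):
# stop_and_wait /var/run/crazyd.pid 30
```

Called from a script such as /home/adam/adam-stop, this would let the monit control file keep separate start and stop entries while still waiting for the graceful shutdown to finish.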
The daemons package supports a restart that shuts down the running daemon and, once it has finished, starts up a new one. I was hoping that monit would support that.
The initial restart is getting triggered because the daemon's memory has grown large and I'm asking monit to restart it when it goes over a set limit. This seems to happen once a week or so. I fix it by going back to the machine, killing the running daemon, and then using monit to start it so the daemon is actually running under monit again.