[monit] "failed to stop" messages


From: Dylan Stamat
Subject: [monit] "failed to stop" messages
Date: Sat, 20 Feb 2010 09:57:48 -0800

Hello!

I'm using Monit to monitor some processes, and can't seem to get my simple configuration working correctly.
When a resource threshold is exceeded, I get a constant stream of "failed to stop" messages.

Here is the output in my logs:
---------------------------------------------------------------------------------------------------------------------------------------------------------
monit[4823]: 'thin8007' total mem amount of 205988kB matches resource limit [total mem amount>163840kB]
monit[4823]: 'thin8007' trying to restart
monit[4823]: 'thin8007' stop: /usr/bin/kill
monit[4823]: 'thin8007' failed to stop
---------------------------------------------------------------------------------------------------------------------------------------------------------

Here is my configuration:
---------------------------------------------------------------------------------------------------------------------------------------------------------
set daemon  20
set logfile syslog facility log_daemon
  check process thin8007 with pidfile /shared/pids/thin.8007.pid
  start program = "/usr/bin/thin start -C /etc/thin/application.yml --only 8000"
  stop program  = "/usr/bin/kill -9 `cat /shared/pids/thin.8007.pid` && rm -f /shared/pids/thin.8007.pid"
  if totalmem > 160.0 MB for 1 cycles then restart
  if cpu > 90% for 1 cycles then restart
  group thin
---------------------------------------------------------------------------------------------------------------------------------------------------------

As you can see, the "stop" directive is a bit of a brute force method.  Prior to using that, I was using the "stop" command
of the application (thin) I'm trying to monitor.  I ran into a problem where the application wouldn't clean up after itself and
left stale pid files around.  So, I decided to SIGKILL the process and remove the pid file manually.

If I run the stop command manually, the process is killed and the pid file is gone.  However, when it is run through Monit, I
get the "failed to stop" message.  Monit runs as root on this system, but it still seems like it could be a permissions issue.
Is there any way to get more verbose output about why it "failed to stop"?  Is there anything that Monit could glean from
the output of the system calls it makes?  I'd be happy to write a patch if that is a possibility!
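
For reference, the manual stop sequence described above could be sketched as a small script that reports each step.  This is only a sketch: the function name stop_by_pidfile is made up for illustration, and it rests on the assumption that Monit executes the stop program directly rather than through a shell, in which case the backticks and && in the configuration would never be interpreted.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative, not part of Monit or thin).
# Kills the process recorded in a pid file, removes the file, and reports
# what happened so a failure is visible instead of silent.
stop_by_pidfile() {
    pidfile="$1"

    if [ ! -f "$pidfile" ]; then
        echo "no pid file at $pidfile" >&2
        return 1
    fi

    pid=$(cat "$pidfile")

    # SIGKILL the process; note a failure but still clean up the pid file.
    if ! kill -9 "$pid" 2>/dev/null; then
        echo "kill -9 $pid failed (process already gone?)" >&2
    fi

    rm -f "$pidfile"
    echo "stopped pid $pid, removed $pidfile"
}

# When run as a script, take the pid file path as the first argument.
if [ "$#" -gt 0 ]; then
    stop_by_pidfile "$1"
fi
```

Saved to a file, this could be invoked with /shared/pids/thin.8007.pid as its only argument, either by hand or as the stop program, which keeps all of the shell syntax out of the Monit configuration itself.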

Any suggestions would be welcome!
Thanks!
==
Dylan

