From: Russell Simpkins
Subject: Re: Monit stopped noticing a pid died
Date: Tue, 23 Dec 2014 11:46:04 -0500

You need to stop monit, then start it again ("monit -v" is sufficient). The attached output is just the configuration dump plus a wake-up of the already running monit daemon. You will find the output in the related syslog file. Once the problem occurs again (monit thinks the process is running while it is stopped), please check/send the log.

Regards,
Martin

On 23 Dec 2014, at 15:51, Russell Simpkins <address@hidden> wrote:

Adding host allow 'localhost'
Skipping redundant host 'localhost'
Skipping redundant host 'localhost'
Adding credentials for user 'admin'
Adding PAM group 'monit'
Adding PAM group 'users'
Runtime constants:
Control file = /etc/monitrc
Log file = syslog
Pid file = /var/run/monit.pid
Id file = /root/.monit.id
State file = /root/.monit.state
Debug = True
Log = True
Use syslog = True
Is Daemon = True
Use process engine = True
Poll time = 10 seconds with start delay 240 seconds
Expect buffer = 256 bytes
Mail from = (not defined)
Mail subject = (not defined)
Mail message = (not defined)
Start monit httpd = True
httpd bind address = localhost
httpd portnumber = 2812
httpd signature = True
Use ssl encryption = False
httpd auth. style = Basic Authentication and Host/Net allow list
The service list contains the following entries:
Process Name = recentnews-feed
Pid file = /var/run/recentnews-feed.pid
Monitoring mode = active
Start program = '/sbin/service recentnews-feed start' timeout 30 second(s)
Stop program = '/sbin/service recentnews-feed stop' timeout 30 second(s)
Existence = if does not exist then restart
Pid = if changed then alert
PPid = if changed then alert
Timeout = If restarted 50 times within 50 cycle(s) then unmonitor
Process Name = nscd
Pid file = /var/run/nscd/nscd.pid
Monitoring mode = active
Start program = '/etc/init.d/nscd start' timeout 30 second(s)
Stop program = '/etc/init.d/nscd stop' timeout 30 second(s)
Existence = if does not exist then restart
Pid = if changed then alert
PPid = if changed then alert
Timeout = If restarted 5 times within 5 cycle(s) then unmonitor
Process Name = nrpe
Pid file = /var/run/nrpe.pid
Monitoring mode = active
Start program = '/etc/init.d/nrpe start' timeout 30 second(s)
Stop program = '/etc/init.d/nrpe stop' timeout 30 second(s)
Existence = if does not exist then restart
Pid = if changed then alert
PPid = if changed then alert
Timeout = If restarted 5 times within 5 cycle(s) then unmonitor
Process Name = httpd
Pid file = /var/run/httpd/httpd.pid
Monitoring mode = active
Start program = '/etc/init.d/httpd start' timeout 30 second(s)
Stop program = '/usr/bin/killall -9 httpd' timeout 30 second(s)
Existence = if does not exist then restart
Pid = if changed then alert
PPid = if changed then alert
Timeout = If restarted 5 times within 5 cycle(s) then unmonitor
Process Name = emissary-master
Pid file = /var/run/emissary.pid
Monitoring mode = active
Start program = '/etc/init.d/emissary start' timeout 30 second(s)
Stop program = '/usr/bin/pkill -9 -f (emissary-master|emop_node)' timeout 30 second(s)
Existence = if does not exist then restart
Pid = if changed then alert
PPid = if changed then alert
Timeout = If restarted 5 times within 5 cycle(s) then unmonitor
File Name = emissary-pidfile
Path = /var/run/emissary.pid
Monitoring mode = active
Start program = '/etc/init.d/emissary start' timeout 30 second(s)
Stop program = '/usr/bin/pkill -9 -f (emissary-master|emop_node)' timeout 30 second(s)
Existence = if does not exist then restart
Timestamp = if greater than 3600 second(s) then restart
Process Name = du-glass-broker-feed
Pid file = /var/run/DU_GlassBroker.php.pid
Monitoring mode = active
Start program = '/sbin/service du-glass-broker-feed start' timeout 30 second(s)
Stop program = '/sbin/service du-glass-broker-feed stop' timeout 30 second(s)
Existence = if does not exist then restart
Pid = if changed then alert
PPid = if changed then alert
Timeout = If restarted 5 times within 5 cycle(s) then unmonitor
System Name = du-proc-00001.du-proc.data-universe-production
Monitoring mode = active
-------------------------------------------------------------------------------
Monit daemon with PID 12702 awakened

--

On Tue, Dec 23, 2014 at 9:42 AM, Martin Pala <address@hidden> wrote:

Then it should notice the process died quickly.

Please run Monit in debug mode and send output:

monit -vI

Regards,
Martin

On 23 Dec 2014, at 15:33, Russell Simpkins <address@hidden> wrote:

Hi,

I have the daemon set to 10 seconds

set daemon 10
  with start delay 240

--

On Tue, Dec 23, 2014 at 8:57 AM, Martin Pala <address@hidden> wrote:

Hi,

what's the poll cycle setting? ("set daemon <seconds>" statement)

Monit performs the checks and then sleeps for the given number of seconds. If your poll cycle is long, Monit will not notice the process died until the next cycle.

Regards,
Martin

On 23 Dec 2014, at 14:45, Russell Simpkins <address@hidden> wrote:

--

Hi,

I have a check on a process via a pid file that monit reported as up when the pid was dead, and I was wondering if there were any good tips for figuring out why. We're running monit 5.9. When I run a status, I can see my process listed as running and monitored:

Process 'recentnews-feed'
  status                            Running
  monitoring status                 Monitored
  pid                               9680
  parent pid                        1
  uid                               5005
  effective uid                     5005
  gid                               5006
  uptime                            1d 3h 20m
  children                          0
  memory kilobytes                  1805064
  memory kilobytes total            1805064
  memory percent                    25.7%
  memory percent total              25.7%
  cpu percent                       0.1%
  cpu percent total                 0.1%
  data collected                    Fri, 19 Dec 2014 04:14:04

When I check to see if the pid is actually running, it's not there:

$ ps -ef | grep -i 9680
root 24520 24029 0 06:39 pts/0 00:00:00 grep -i 9680

My monit config:

check process recentnews-feed with pidfile /var/run/recentnews-feed.pid
  start program = "/sbin/service recentnews-feed start"
  stop program = "/sbin/service recentnews-feed stop"
  if 50 restarts within 50 cycles then timeout

Again, just curious if this is a known issue in 5.9 or how to figure out why monit thought the pid was up when it was not.

Thanks,
Russ
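
The manual check in the message above (`ps -ef | grep -i 9680`) only ever matched the grep command itself, which is how Russ could tell the process was gone. A minimal sketch of the same check driven from the pidfile is shown below. The function name `pidfile_alive` is hypothetical and not part of monit, and this is not how monit itself tests for existence. It assumes the pidfile holds the PID as its first token, and it should be run as root, because `kill -0` also fails for a live process owned by another user.

```shell
#!/bin/sh
# Hypothetical helper (not part of monit): report whether the PID stored
# in a pidfile still refers to a process that can be signalled.
pidfile_alive() {
    pidfile=$1

    # A missing or unreadable pidfile means there is nothing to check.
    [ -r "$pidfile" ] || { echo "missing"; return 2; }

    # Take the first whitespace-delimited token of the first line as the PID.
    # read returns non-zero when the file lacks a trailing newline, so ignore it.
    read -r pid _ < "$pidfile" || true

    # Reject empty or non-numeric content.
    case $pid in
        ''|*[!0-9]*) echo "invalid"; return 2 ;;
    esac

    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        echo "alive"
    else
        echo "dead"
        return 1
    fi
}
```

For the service in this thread the call would be `pidfile_alive /var/run/recentnews-feed.pid`. A result of "dead" while `monit status` still reports "Running" would reproduce the mismatch described above. Note that a PID can be reused by an unrelated process, so "alive" only shows that some process currently holds that PID.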
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general