monit-general
Re: Monit stopped noticing a pid died


From: Martin Pala
Subject: Re: Monit stopped noticing a pid died
Date: Tue, 23 Dec 2014 17:34:01 +0100

You need to stop Monit and then start it again ("monit -v" is sufficient). The attached output is only the configuration dump followed by the message that the already-running Monit daemon was woken up. The debug output goes to the related syslog file. Once the problem occurs again (Monit thinks the process is running while it is stopped), please check the log and send it.
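
To pull only Monit's lines out of a busy syslog file, a small filter along these lines may help. This is a sketch under assumptions: the function name `monit_log_lines` is made up, the default path `/var/log/messages` is a guess (Debian-style systems typically use `/var/log/syslog`), and the pattern assumes the usual syslog tag form `monit[PID]:`.

```shell
#!/bin/sh
# Hypothetical helper: print the lines that Monit wrote to a syslog file.
# Usage: monit_log_lines [LOGFILE]
monit_log_lines() {
    # Assumed default location; pass the real path if yours differs.
    logfile=${1:-/var/log/messages}
    # Match the syslog tag "monit:" or "monit[1234]:" preceded by whitespace.
    grep -E '(^|[[:space:]])monit(\[[0-9]+\])?:' "$logfile"
}
```

For example, `monit_log_lines /var/log/messages | tail -n 50` shows the most recent entries.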

Regards,
Martin


On 23 Dec 2014, at 15:51, Russell Simpkins <address@hidden> wrote:

Adding host allow 'localhost'
Skipping redundant host 'localhost'
Skipping redundant host 'localhost'
Adding credentials for user 'admin'
Adding PAM group 'monit'
Adding PAM group 'users'
Runtime constants:
 Control file       = /etc/monitrc
 Log file           = syslog
 Pid file           = /var/run/monit.pid
 Id file            = /root/.monit.id
 State file         = /root/.monit.state
 Debug              = True
 Log                = True
 Use syslog         = True
 Is Daemon          = True
 Use process engine = True
 Poll time          = 10 seconds with start delay 240 seconds
 Expect buffer      = 256 bytes
 Mail from          = (not defined)
 Mail subject       = (not defined)
 Mail message       = (not defined)
 Start monit httpd  = True
 httpd bind address = localhost
 httpd portnumber   = 2812
 httpd signature    = True
 Use ssl encryption = False
 httpd auth. style  = Basic Authentication and Host/Net allow list

The service list contains the following entries:

Process Name          = recentnews-feed
 Pid file             = /var/run/recentnews-feed.pid
 Monitoring mode      = active
 Start program        = '/sbin/service recentnews-feed start' timeout 30 second(s)
 Stop program         = '/sbin/service recentnews-feed stop' timeout 30 second(s)
 Existence            = if does not exist then restart
 Pid                  = if changed then alert
 PPid                 = if changed then alert
 Timeout              = If restarted 50 times within 50 cycle(s) then unmonitor

Process Name          = nscd
 Pid file             = /var/run/nscd/nscd.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/nscd start' timeout 30 second(s)
 Stop program         = '/etc/init.d/nscd stop' timeout 30 second(s)
 Existence            = if does not exist then restart
 Pid                  = if changed then alert
 PPid                 = if changed then alert
 Timeout              = If restarted 5 times within 5 cycle(s) then unmonitor

Process Name          = nrpe
 Pid file             = /var/run/nrpe.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/nrpe start' timeout 30 second(s)
 Stop program         = '/etc/init.d/nrpe stop' timeout 30 second(s)
 Existence            = if does not exist then restart
 Pid                  = if changed then alert
 PPid                 = if changed then alert
 Timeout              = If restarted 5 times within 5 cycle(s) then unmonitor

Process Name          = httpd
 Pid file             = /var/run/httpd/httpd.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/httpd start' timeout 30 second(s)
 Stop program         = '/usr/bin/killall -9 httpd' timeout 30 second(s)
 Existence            = if does not exist then restart
 Pid                  = if changed then alert
 PPid                 = if changed then alert
 Timeout              = If restarted 5 times within 5 cycle(s) then unmonitor

Process Name          = emissary-master
 Pid file             = /var/run/emissary.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/emissary start' timeout 30 second(s)
 Stop program         = '/usr/bin/pkill -9 -f (emissary-master|emop_node)' timeout 30 second(s)
 Existence            = if does not exist then restart
 Pid                  = if changed then alert
 PPid                 = if changed then alert
 Timeout              = If restarted 5 times within 5 cycle(s) then unmonitor

File Name             = emissary-pidfile
 Path                 = /var/run/emissary.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/emissary start' timeout 30 second(s)
 Stop program         = '/usr/bin/pkill -9 -f (emissary-master|emop_node)' timeout 30 second(s)
 Existence            = if does not exist then restart
 Timestamp            = if greater than 3600 second(s) then restart

Process Name          = du-glass-broker-feed
 Pid file             = /var/run/DU_GlassBroker.php.pid
 Monitoring mode      = active
 Start program        = '/sbin/service du-glass-broker-feed start' timeout 30 second(s)
 Stop program         = '/sbin/service du-glass-broker-feed stop' timeout 30 second(s)
 Existence            = if does not exist then restart
 Pid                  = if changed then alert
 PPid                 = if changed then alert
 Timeout              = If restarted 5 times within 5 cycle(s) then unmonitor

System Name           = du-proc-00001.du-proc.data-universe-production
 Monitoring mode      = active

-------------------------------------------------------------------------------
Monit daemon with PID 12702 awakened

On Tue, Dec 23, 2014 at 9:42 AM, Martin Pala <address@hidden> wrote:
Then it should quickly notice that the process died.

Please run Monit in debug mode and send output:

monit -vI

Regards,
Martin


On 23 Dec 2014, at 15:33, Russell Simpkins <address@hidden> wrote:

Hi,

I have the daemon set to 10 seconds:

set daemon 10
    with start delay 240



On Tue, Dec 23, 2014 at 8:57 AM, Martin Pala <address@hidden> wrote:
Hi,

What are the poll cycle settings (the "set daemon <seconds>" statement)?

Monit performs the checks and then sleeps for the given number of seconds. If your poll cycle is long, Monit will not notice that the process died until the next cycle.
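
The check-then-sleep behaviour can be pictured with a loop like the one below. It is only an illustration of the idea, not Monit's actual implementation; the function name `poll_cycles` and its arguments are made up for the example.

```shell
#!/bin/sh
# Illustrative only: a check-then-sleep loop, not Monit's real code.
# Usage: poll_cycles INTERVAL CYCLES CHECK_COMMAND [ARGS...]
poll_cycles() {
    interval=$1
    cycles=$2
    shift 2
    i=0
    while [ "$i" -lt "$cycles" ]; do
        i=$((i + 1))
        # Run the check once per cycle and report its result.
        if "$@"; then
            echo "cycle $i: check passed"
        else
            echo "cycle $i: check failed"
        fi
        # Anything that happens during this sleep is only seen in the next cycle.
        sleep "$interval"
    done
}
```

With an interval of 10 seconds, a process that dies right after a check would go unnoticed for roughly 10 seconds, which is why a long interval delays detection.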

Regards,
Martin


On 23 Dec 2014, at 14:45, Russell Simpkins <address@hidden> wrote:

Hi,

I have a process check based on a pid file. Monit reported the process as up when the pid was dead, and I was wondering if there are any good tips for figuring out why. We're running Monit 5.9. When I run a status, I can see my process listed as running and monitored:

  Process 'recentnews-feed'
    status                            Running
    monitoring status                 Monitored
    pid                               9680
    parent pid                        1
    uid                               5005
    effective uid                     5005
    gid                               5006
    uptime                            1d 3h 20m 
    children                          0
    memory kilobytes                  1805064
    memory kilobytes total            1805064
    memory percent                    25.7%
    memory percent total              25.7%
    cpu percent                       0.1%
    cpu percent total                 0.1%
    data collected                    Fri, 19 Dec 2014 04:14:04

When I check to see if the pid is actually running, it's not there:

  $ ps -ef | grep -i 9680
  root     24520 24029  0 06:39 pts/0    00:00:00 grep -i 9680
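
A way to double-check a case like this is to read the pid straight from the pidfile and probe it with `kill -0`, which avoids grep matching itself in the ps output. The sketch below is only an illustration: the function name `pidfile_alive` is made up, and it should run as root or as the process owner, since `kill -0` also fails when the caller is not allowed to signal the process.

```shell
#!/bin/sh
# Hypothetical helper: is the PID stored in a pidfile still alive?
# Returns 0 if alive, 2 if the pidfile is missing or invalid,
# and a non-zero status from kill otherwise.
pidfile_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 2            # pidfile missing or unreadable
    pid=$(tr -d '[:space:]' < "$pidfile")    # strip newline and stray whitespace
    case $pid in
        ''|*[!0-9]*) return 2 ;;             # empty or not a plain number
    esac
    kill -0 "$pid" 2>/dev/null               # signal 0 only tests for existence
}
```

For example: `pidfile_alive /var/run/recentnews-feed.pid && echo alive || echo "dead or stale pidfile"`.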


My monit config:

check process recentnews-feed with pidfile /var/run/recentnews-feed.pid
  start program = "/sbin/service recentnews-feed start"
  stop program = "/sbin/service recentnews-feed stop"
  if 50 restarts within 50 cycles then timeout

Again, just curious if this is a known issue in 5.9 or how to figure out why monit thought the pid was up when it was not.

Thanks,

Russ

--
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general

