monit-general

Re: [monit] Monit Span Multiple instances


From: Nick Upson
Subject: Re: [monit] Monit Span Multiple instances
Date: Tue, 5 Feb 2008 13:36:59 +0000

This looks and sounds like the same problem I'm having: a restart of a
process leaves 2 copies running. When both are talking to the same
serial port, this is not good.

Here is my verbose monit (4.10) output:

[MST Feb  4 08:27:38] debug    : 'bs1' cpu usage check passed [current cpu usage=0.0%]
[MST Feb  4 08:27:38] debug    : 'bs1' total mem amount check passed [current total mem amount=776kB]
[MST Feb  4 08:28:00] debug    : 'bs1' zombie check passed [status_flag=0000]
[MST Feb  4 08:28:00] debug    : 'bs1' PID has not changed since last cycle
[MST Feb  4 08:28:00] debug    : 'bs1' PPID has not changed since last cycle
[MST Feb  4 08:28:00] debug    : 'bs1' cpu usage check passed [current cpu usage=0.0%]
[MST Feb  4 08:28:00] debug    : 'bs1' total mem amount check passed [current total mem amount=776kB]
Mon Feb  4 08:28:07 MST 2008 restart bs1
[MST Feb  4 08:28:07] info     : restart service 'bs1' on user request
[MST Feb  4 08:28:07] info     : 'bs1' trying to restart
[MST Feb  4 08:28:07] debug    : Monitoring disabled -- service bs1
[MST Feb  4 08:28:07] info     : 'bs1' stop: /opt/unb/bin/bs.sh
[MST Feb  4 08:28:08] debug    : 'bs1' Error testing process id [10793] -- No such process
[MST Feb  4 08:28:08] debug    : 'bs1' Error testing process id [10793] -- No such process
[MST Feb  4 08:28:08] debug    : 'bs1' Error testing process id [10793] -- No such process
[MST Feb  4 08:28:08] info     : 'bs1' start: /opt/unb/bin/bs.sh
[MST Feb  4 08:28:08] debug    : 'bs1' Error testing process id [10793] -- No such process
[MST Feb  4 08:28:08] debug    : Monitoring enabled -- service bs1
[MST Feb  4 08:28:08] debug    : 'bs1' check skipped -- service already handled in a dependency chain
[MST Feb  4 08:28:08] debug    : 'bs1' Error testing process id [10793] -- No such process
[MST Feb  4 08:28:09] debug    : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb  4 08:28:10] debug    : monit: pidfile '/var/run/bs1.pid' does not exist
This continues until:
[MST Feb  4 08:30:06] debug    : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb  4 08:30:07] debug    : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb  4 08:30:08] debug    : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb  4 08:30:08] error    : 'bs1' process is not running
[MST Feb  4 08:30:08] info     : 'bs1' trying to restart
[MST Feb  4 08:30:08] debug    : Monitoring disabled -- service bs1
[MST Feb  4 08:30:08] debug    : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb  4 08:30:08] debug    : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb  4 08:30:08] info     : 'bs1' start: /opt/unb/bin/bs.sh
[MST Feb  4 08:30:08] debug    : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb  4 08:30:08] debug    : Monitoring enabled -- service bs1
[MST Feb  4 08:30:08] debug    : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb  4 08:30:08] error    : 'bs1' failed to start
[MST Feb  4 08:30:09] info     : 'bs1' started
[MST Feb  4 08:32:08] info     : 'bs1' process is running with pid 1370
[MST Feb  4 08:32:08] debug    : 'bs1' zombie check passed [status_flag=0000]
[MST Feb  4 08:32:08] debug    : 'bs1' cpu usage check passed [current cpu usage=0.0%]
[MST Feb  4 08:32:08] debug    : 'bs1' total mem amount check passed [current total mem amount=556kB]
[MST Feb  4 08:34:08] debug    : 'bs1' zombie check passed [status_flag=0000]
[MST Feb  4 08:34:08] debug    : 'bs1' PID has not changed since last cycle
[MST Feb  4 08:34:08] debug    : 'bs1' PPID has not changed since last cycle
[MST Feb  4 08:34:08] debug    : 'bs1' cpu usage check passed [current cpu usage=0.0%]
[MST Feb  4 08:34:08] debug    : 'bs1' total mem amount check passed [current total mem amount=556kB]

But there are now 2 copies of the bs1 process running.

This is on FC5. The start/stop script uses the standard FC5 start/stop
routines, which include removing the pid file.
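
In the log above, monit issues a second start about two minutes after the
first because /var/run/bs1.pid never appears. One possible mitigation,
given here only as a sketch, is to make the start path safe to call twice:
refuse to launch when the pid file points at a live process, and write the
pid file immediately after launching. The function names (is_running,
start_once) and the lock directory are hypothetical; nothing here is taken
from bs.sh or from the FC5 init functions, and whether it addresses the
monit behaviour discussed in this thread is an assumption.

```shell
#!/bin/sh
# Sketch only: is_running and start_once are illustrative names,
# not part of monit, bs.sh, or the FC5 init routines.

# Return 0 if the pid file ($1) exists and names a live process.
is_running() {
    [ -f "$1" ] || return 1
    pid=$(cat "$1" 2>/dev/null)
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# start_once PIDFILE COMMAND [ARGS...]
# Launch COMMAND in the background unless PIDFILE already points at a
# running process. mkdir is used as an atomic lock so that two
# overlapping start calls cannot both launch the command.
start_once() {
    pidfile=$1
    shift
    lockdir="$pidfile.lock"

    if ! mkdir "$lockdir" 2>/dev/null; then
        # Another start is in progress. A stale lock left behind by a
        # crashed script would need manual cleanup in this sketch.
        echo "start already in progress" >&2
        return 0
    fi

    if is_running "$pidfile"; then
        rmdir "$lockdir"
        echo "already running"
        return 0
    fi

    "$@" &
    echo $! > "$pidfile"   # record the pid right away so later checks see it
    rmdir "$lockdir"
}

# Example call (the daemon path is a placeholder):
#   start_once /var/run/bs1.pid /path/to/daemon
```

With a guard like this, a second start request while the first copy is
alive becomes a no-op instead of a second process on the serial port. It
does not explain why monit issues the extra start in the first place.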

On 04/02/2008, Martin Pala <address@hidden> wrote:
> Hi,
>
> there was a similar problem which was fixed in monit 4.9:
>
> --8<--
> * Fix the extra restart action which was called by monit
>   in addition to the user-requested start action of a stopped
>   process. This didn't occur in the case that the 'every'
>   statement was used on the service definition as well. Thanks
>   to Aaron Scamehorn for help.
> --8<--
>
> It seems, however, that some applications still have this or a similar
> problem (reported by several users).
>
> I'll look into it ...
>
>
> Martin
>
>
>
> Navaneethakrishnan Goapl wrote:
> >
> > Hi,
> >
> > Monit Version : 4.9
> > OS Version : CentOS release 4.4
> >
> > I am facing the following issue quite often. Monit works fine for
> > some time, but at some point, if I restart the process, Monit
> > spawns multiple instances of that process. I see that this was a
> > problem with earlier releases of Monit. Does this issue still persist
> > in the latest version? Could someone reply to this?
> >
> > root      8580    1  0 05:31 ?        00:00:00 /bin/sh /opt/CSCOacsvw/resources/monit/monit_script.sh jobmanager start
> > root      8591    1  0 05:31 ?        00:00:00 /bin/sh /opt/CSCOacsvw/resources/monit/monit_script.sh jobmanager start
> >
> >
> > Monitrc
> > -------
> >
> > check process Jobmanager with pidfile "/opt/CSCOacsvw/resources/monit/jobmanager.pid"
> >         start program = "/opt/CSCOacsvw/resources/monit/monit_script.sh jobmanager start"
> >         stop program = "/opt/CSCOacsvw/resources/monit/monit_script.sh jobmanager stop"
> >
> > monit -vc ./monitrc start all
> >
> > Runtime constants:
> > Control file      = ./monitrc
> > Log file          = /opt/CSCOacsvw/log/monit_errors.log
> > Pid file          = /var/run/monit.pid
> > Debug              = True
> > Log                = True
> > Use syslog        = False
> > Is Daemon          = True
> > Use process engine = True
> > Poll time          = 60 seconds
> > Mail server(s)    = localhost
> > Mail from          = (not defined)
> > Mail subject      = (not defined)
> > Mail message      = (not defined)
> > Start monit httpd  = True
> > httpd bind address = Any/All
> > httpd portnumber  = 2812
> > httpd signature    = True
> > Use ssl encryption = False
> > httpd auth. style  = Host/Net allow list
> >
> >
> > Process Name          = Jobmanager
> > Pid file            = /opt/CSCOacsvw/resources/monit/jobmanager.pid
> > Monitoring mode      = active
> > Start program        = '/opt/CSCOacsvw/resources/monit/monit_script.sh jobmanager start' timeout 1 cycle(s)
> > Stop program        = '/opt/CSCOacsvw/resources/monit/monit_script.sh jobmanager stop' timeout 1 cycle(s)
> > Pid                  = if changed 1 times within 1 cycle(s) then alert
> > Ppid                = if changed 1 times within 1 cycle(s) then alert
> >
> >
> > Regards,
> > navanee
> >