monit-general

Re: Can monit block/wait on start/stop exec at all? (Chris McKenzie


From: Chris McKenzie
Subject: Re: Can monit block/wait on start/stop exec at all? (Chris McKenzie
Date: Thu, 17 May 2007 17:40:34 -0400

Thanks for the explanation.

Some testing I did on Red Hat Enterprise Linux 4 Update 4 (at least I believe I'm at Update 4) suggests otherwise.

Here's a clip of the stop in my init.d: (ignore the crudeness of the beefy test to ensure the proc is dead)

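   rm -f /var/lock/subsys/app
   if [ -f "/var/run/app.pid" ]; then
     PID=`cat /var/run/app.pid`
     # Ask the process to terminate (default signal, SIGTERM)
     kill $PID 2>/dev/null 1>&2 && success || failure
     RETVAL=$?
     CNT=0
     LMT=10
     # Poll every 2 seconds, at most LMT times, until the process is gone
     while [ $CNT -lt $LMT ]; do
       sleep 2
       if kill -0 $PID 2>/dev/null; then
         # kill -0 succeeded, so the process is still alive: keep waiting
         let CNT=CNT+1
       else
         # process is gone: leave the loop without reaching LMT
         break
       fi
     done
     # Still alive after LMT polls: force it with SIGKILL
     if [ $CNT -eq $LMT ]; then
       kill -9 $PID 2>/dev/null 1>&2 && success || failure
       RETVAL=$?
     fi
     rm -f /var/run/app.pid
   fi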

However, when I call monit stop for the process, it executes and returns immediately. In a separate console I checked ps to see whether the app was still alive; it took about 2 seconds to die after monit stop had returned. In fact, the 2-second sleep by itself should keep monit from returning immediately.

I don't remember this being a problem on RHEL3 at all. The kernel used is 2.6.9-42.ELsmp.

The logic used to test the pid file/pid after a stop seems fine to me.

Thanks!

- Chris

From: Jan-Henrik Haukeland <address@hidden>
Reply-To: This is the general mailing list for monit <address@hidden>
To: This is the general mailing list for monit <address@hidden>
Subject: Re: Can monit block/wait on start/stop exec at all? (Chris McKenzie
Date: Thu, 17 May 2007 23:05:07 +0200

On 17 May 2007, at 19:38, Chris McKenzie wrote:

I want to know if I can get monit to wait for the program exec

On stop and restart, monit will in fact wait for the program to stop (see control.c and the function do_stop). On restart, monit waits until the program is stopped before it starts the program again. The way monit does this is to first call the process's stop program and then go into a loop, checking whether either the pid in the pid file is gone or the pid file itself is gone. If this is the case, it goes on to call the start program again; otherwise an alert error is raised.
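
As an illustration only, the check described above might be sketched in shell roughly as follows. This is not monit's actual code (which is C, in control.c); the function names is_stopped and wait_for_stop, the retry count, and the one-second poll interval are all assumptions made for the sketch.

```shell
#!/bin/bash
# Hypothetical sketch of the described stop check; not monit's actual code.

# Succeeds (returns 0) if the pid file is gone or the pid in it no longer exists.
is_stopped() {
  local pidfile=$1 pid
  [ -f "$pidfile" ] || return 0        # pid file itself is gone
  pid=$(cat "$pidfile" 2>/dev/null)
  [ -n "$pid" ] || return 0            # empty pid file: treated as stopped (assumption)
  ! kill -0 "$pid" 2>/dev/null         # succeeds when the pid cannot be signalled
}

# Polls is_stopped up to TRIES times, one second apart (both values are assumptions).
wait_for_stop() {
  local pidfile=$1 tries=${2:-10} i=0
  while [ "$i" -lt "$tries" ]; do
    is_stopped "$pidfile" && return 0
    sleep 1
    i=$((i + 1))
  done
  return 1                             # still running: the caller would raise the alert
}
```

Note that kill -0 also fails when the caller lacks permission to signal the process, so a sketch like this is only meaningful when run as root or as the owner of the process.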

There will be a problem if the program to be stopped removes its pid file before it has actually stopped. Normally, one of the last things a daemon program should do is remove its pid file. If it does this earlier in the shutdown process, there is going to be a problem, since monit will then assume that the process is gone (because the pid file is gone) and continue on to call the start program.

Looking at the monit code now, I can see that this can be improved by caching the pid before calling stop and, instead of testing for the existence of both the pid file and the process id, testing only for the process id. I'll see if I can hack a solution and I'll let you know when it can be tested.
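
A minimal shell sketch of that idea, assuming a hypothetical helper named stop_and_wait and a one-second poll interval (neither is taken from monit's source), could look like this. The pid is read once before the stop program runs, so it no longer matters whether the stop program deletes the pid file early.

```shell
#!/bin/bash
# Hypothetical sketch of the proposed improvement; not monit's actual code.

# Usage: stop_and_wait PIDFILE TRIES STOP_COMMAND [ARGS...]
stop_and_wait() {
  local pidfile=$1 tries=$2 pid i=0
  shift 2
  # Cache the pid before running the stop program, which may remove the pid file early.
  pid=$(cat "$pidfile" 2>/dev/null)
  [ -n "$pid" ] || return 0            # no pid recorded: nothing to wait for
  "$@"                                 # run the stop program
  # Test only the cached pid; the pid file is not consulted again.
  while [ "$i" -lt "$tries" ]; do
    kill -0 "$pid" 2>/dev/null || return 0
    sleep 1
    i=$((i + 1))
  done
  return 1                             # still running: the caller would raise an alert
}
```

It could be called as, for example, stop_and_wait /var/run/app.pid 10 /etc/init.d/app stop, where the init script path is only a placeholder.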

Best regards
--
Jan-Henrik Haukeland
http://tildeslash.com/



