monit-general

Re: Monit definitely not waiting for stop scripts to complete


From: Jason L. Buberel
Subject: Re: Monit definitely not waiting for stop scripts to complete
Date: Wed, 05 Sep 2007 09:47:13 -0700
User-agent: Thunderbird 2.0.0.0 (X11/20070326)

Jan-Henrik, Stanislaw,

Thanks for your thoughtful consideration/response.

Question #1) Maybe start time is not the crucial point here, but rather "correctness", that is, monit should wait for A to come up before starting B and so on?

My Answer) My preference is to trust the start/stop script with the responsibility of determining when it has completed its processing and the service is fully started or stopped. Monit's approach, detecting the presence of the PID and the process, takes that responsibility away from the script. For most services, the assumption that a live PID means a running service is a safe one. But there are many cases of more complex services for which process creation and PID file creation happen well before the service is fully started. For that reason, I would vote in favor of making all start/stop actions synchronous.

If you are concerned about start/stop scripts taking too long to complete, I would suggest the use of a timeout value:

check process tomcat with pidfile /var/run/tomcat.pid
        stop program with timeout 30 seconds = "/etc/init.d/tomcat stop"
        start program with timeout 30 seconds = "/etc/init.d/tomcat start"

If the start/stop script does not exit within the timeout period, the execution attempt should be considered to have failed.
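
To make the intended semantics concrete, here is a minimal Python sketch of a synchronous action with a timeout. It is only an illustration of the behaviour proposed above, not how Monit is implemented; the function name run_action and its parameters are made up for this example.

```python
import subprocess
import sys


def run_action(command, timeout_seconds):
    """Run a start/stop command and wait for it to exit.

    Returns True only if the command exits with status 0 before the
    timeout expires. A timeout counts as a failed attempt.
    """
    try:
        result = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # The script did not exit in time, so treat the attempt as failed.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Stand-in for "/etc/init.d/tomcat start": a command that exits cleanly.
    ok = run_action([sys.executable, "-c", "pass"], timeout_seconds=30)
    print("started" if ok else "failed")
```

The caller blocks until the script itself reports completion, so the script, rather than a PID check, decides when the service counts as started or stopped.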

Question #2) Apropos correctness. This brings up another question: if you have a dependency chain, say, C->B->A, today if monit should fail to start A, it still goes on and tries to start B and C. If A failed to start, should monit instead abort the chain and not try to start B and C regardless? I'm not sure.

My Answer) The very definition of a dependency chain is that one service relies on the other. So if the service at the root of the chain fails to start properly, the next service in the chain should not be started.
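
The abort-on-failure rule can be sketched in a few lines of Python. This is a hypothetical illustration, not Monit code: start_chain and the start callable are names invented here, and the list is assumed to be ordered from the root of the chain outward (A, then B, then C).

```python
def start_chain(services, start):
    """Start services in dependency order, stopping at the first failure.

    `services` lists names from the root dependency outward.
    `start` is a callable that returns True if the named service started.
    Returns (started, not_started): the services that came up, and the
    failed service together with everything that depends on it.
    """
    started = []
    for index, name in enumerate(services):
        if not start(name):
            # Abort the chain: dependents of a failed service are never tried.
            return started, list(services[index:])
        started.append(name)
    return started, []
```

With the chain C->B->A, a failure to start A leaves B and C untouched, and a failure to start B leaves A running but C untouched.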

Regards,
jason




