monit-general

Re: start and stop process by monit


From: Jan-Henrik Haukeland
Subject: Re: start and stop process by monit
Date: Sat, 04 Oct 2003 15:35:02 +0200
User-agent: Gnus/5.1002 (Gnus v5.10.2) XEmacs/21.4 (Reasonable Discussion, linux)

"Kenneth Yip" <address@hidden> writes:

> I have some questions about how monit start and stop 
> a monitored process - Does monit relies on the
> return code to determine whether a process
> startup / shutdown failed?

No, monit does _not_ rely on the return code. Monit calls execve(2)
to execute a program, and this function does not return if the exec
succeeded. Because execve may succeed while the program itself fails,
monit additionally verifies that the process was started: it checks
that a pid file was created and that the pid file contains the pid of
a running process.
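
The pid file check described above can be sketched roughly as follows.
This is only an illustration in Python, not monit's actual code (monit
is written in C); the function names are hypothetical, and the liveness
test with signal 0 is an assumption about one common way to do it on
POSIX systems.

```python
import os


def pid_from_file(pidfile):
    """Return the pid stored in pidfile, or None if it is missing or malformed."""
    try:
        with open(pidfile) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_running(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def start_succeeded(pidfile):
    """Hypothetical helper: the start counts as successful only if the
    pid file exists and names a running process."""
    return is_running(pid_from_file(pidfile))
```

Note that the exit status of the start script plays no part in this
check, which matches the answer above.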

> For example, if the return code of the startup
> script is '1', but it actually did startup the process
> to be monitored successfully, will monit consider the 
> startup as failed?

No. Monit looks at the pid file and the running process, not at the
return code.

> Also, how monit detects a shutdown of a process
> failed? Check whether the pid file still exists or
> some other method?

The same as above but reversed: monit considers the stop failed if
execve did not succeed or if the process is still running (after poll
seconds).

> I raised the questions because of an incident last night
> where monit reported failure to shutdown a monitored
> process. From the message log, monit only attempted to
> shutdown the process once and after the shutdown failed,
> it did nothing further. Monit have timeout on restart, but does it 
> has timeout on stopping or starting services?

This happened because, upon process stop, monit first sets the process
in unmonitoring mode (because if it was stopped, monit should not start
it again in a later poll cycle). Monit then goes on with the shutdown
procedure outlined above; if this fails, monit sends an alert "Failed
to stop ...". (You will *only* get this alert if you have registered
alert notification in the service entry in .monitrc.) Since the
process was set in unmonitoring mode, monit will not do anything more
with the process.

If you really want to make sure that a process was stopped, you can add
code to the stop program, or to the script monit calls to stop the
process, that kills the process with SIGKILL if graceful termination
did not succeed. (Of course, this could instead be implemented in
monit, i.e. if monit failed to stop a process it would send SIGKILL to
the process as its final action, but I think this decision should be
made in the stop program monit calls and not by monit.)
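
A stop program that escalates to SIGKILL could look something like the
sketch below. It is a minimal illustration, not part of monit: the
function names, the status strings, and the default grace period are
all assumptions, and it relies on POSIX signals.

```python
import os
import signal
import time


def process_exists(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True
    return True


def wait_until_gone(pid, timeout, poll_interval):
    """Poll until the process disappears or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not process_exists(pid):
            return True
        time.sleep(poll_interval)
    return not process_exists(pid)


def stop_with_escalation(pid, grace_seconds=10.0, poll_interval=0.2):
    """Try SIGTERM first; fall back to SIGKILL if the process survives.

    Returns one of the (hypothetical) status strings "not-running",
    "terminated", "killed" or "still-running".
    """
    if not process_exists(pid):
        return "not-running"
    try:
        os.kill(pid, signal.SIGTERM)  # ask for graceful termination
    except ProcessLookupError:
        return "not-running"
    if wait_until_gone(pid, grace_seconds, poll_interval):
        return "terminated"
    try:
        os.kill(pid, signal.SIGKILL)  # cannot be caught or ignored
    except ProcessLookupError:
        return "terminated"
    if wait_until_gone(pid, grace_seconds, poll_interval):
        return "killed"
    return "still-running"
```

In a real stop script the pid would be read from the service's pid
file. One caveat: a process that is a child of the caller stays visible
as a zombie until it is reaped, so this existence check suits daemons
that are not children of the stop script.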

-- 
Jan-Henrik Haukeland



