monit-general

RE: Monit Issue during reboot


From: Shankar SM
Subject: RE: Monit Issue during reboot
Date: Tue, 22 Nov 2016 18:00:20 +0000

Hi Geoff,
        We tried that. However, killing monit does not kill all of its child
processes (a child process here is an action that monit performs on a service
check failure, e.g. running a script), and such a child might already have been
started. Because of the way monit forks these child processes, their parent
becomes init, and monit does not kill them when it quits.
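
The orphaning described above can be reproduced without monit. The following is
a minimal sketch, not monit's actual code: the "launcher" process is a
hypothetical stand-in for monit, and "sleep 20" is a stand-in for the action
script. It only shows that killing the launcher leaves the already-started
child running.

```shell
#!/bin/sh
# Sketch only: "launcher" is a stand-in for monit, and "sleep 20" is a
# stand-in for the exec'd action script. Neither is monit's real code.

pidfile=$(mktemp)

# The launcher starts a long-running child, records the child's PID,
# then idles like a daemon between cycles (exec keeps the same PID).
sh -c 'sleep 20 & echo $! > "$1"; exec sleep 60' launcher "$pidfile" &
launcher_pid=$!

# Wait until the child's PID has been written.
while [ ! -s "$pidfile" ]; do sleep 1; done
child_pid=$(cat "$pidfile")

# Kill only the launcher, as a shutdown script would kill monit.
kill "$launcher_pid"
wait "$launcher_pid" 2>/dev/null

# kill -0 sends no signal; it only tests whether the PID still exists.
if kill -0 "$child_pid" 2>/dev/null; then
    child_survived=yes
    # The new parent is usually PID 1 (init), or a subreaper if one is set.
    echo "child $child_pid is still running, parent is now:" \
         "$(ps -o ppid= -p "$child_pid" 2>/dev/null)"
else
    child_survived=no
fi

# Clean up the demo.
kill "$child_pid" 2>/dev/null
rm -f "$pidfile"
```

Since the signal goes to the launcher's PID only, the child never sees it,
which matches the behaviour observed with monit and its action scripts.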

A simple test I did used the following setup.

/etc/monitrc  (Note: monit cycle interval is 30 seconds)
############################################
Check file non_existing_file.txt with path /tmp/non_existing_file.txt
   If does not exist then exec my_test_script.sh
############################################


my_test_script.sh (Script that will be used to restart an application that died)
############################################
#!/bin/sh

echo "Test Script: Starting"
sleep 20
echo "Test Script: Done"
############################################

Now if monit is started with the monitrc config above, my_test_script.sh runs
for 20 seconds and can be seen with the `ps` command. If we kill monit while
the test script is still running, the script does not get killed.
Similarly, during a shutdown sequence monit could have spawned such a child
process just before monit itself is killed.
I could not think of an easy way to wait for this kind of child process to
complete before we run the other kill scripts.
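
One possible way to approach the waiting problem, sketched under assumptions:
the action script is wrapped so that it leaves a marker file for as long as it
runs, and the shutdown sequence polls for markers before calling the kill
scripts. MARKER_DIR, run_marked and wait_for_actions are hypothetical names
introduced for illustration; monit itself knows nothing about them. The sketch
does not close the small window between monit forking the action and the
marker being created, and a marker left behind by an action that was killed
with SIGKILL is only handled by the timeout.

```shell
#!/bin/sh
# Hypothetical helpers; monit has no built-in notion of these markers.

# run_marked CMD [ARGS...]: run CMD while a marker file exists in MARKER_DIR.
run_marked() {
    mkdir -p "$MARKER_DIR" || return 1
    marker=$(mktemp "$MARKER_DIR/action.XXXXXX") || return 1
    "$@"
    status=$?
    rm -f "$marker"
    return "$status"
}

# wait_for_actions TIMEOUT: poll once per second until MARKER_DIR is empty.
# Returns 0 when no markers remain, 1 if TIMEOUT seconds pass first.
wait_for_actions() {
    remaining=$1
    while [ -n "$(ls -A "$MARKER_DIR" 2>/dev/null)" ]; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Demo with a throwaway directory; "sleep 2" stands in for the action script.
MARKER_DIR=$(mktemp -d)
run_marked sleep 2 &
sleep 1                      # give the action time to create its marker
wait_for_actions 10
wait_status=$?
[ "$wait_status" -eq 0 ] && echo "no actions in flight, kill scripts may run"
```

In this sketch the exec line in monitrc would point at a wrapper that calls
run_marked, and the shutdown sequence would call wait_for_actions after the
unmonitor command and before the kill scripts.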

I understand this is a rare corner case: for the problem to happen, the
service check has to fail (which ideally should never happen) and monit has to
start the child process just before it is killed/unmonitored.
I just wanted to check whether there is any recommendation to avoid this.

Thanks
Shankar


From: monit-general [mailto:address@hidden On Behalf Of Geoff Goas
Sent: Monday, November 21, 2016 4:23 PM
To: This is the general mailing list for monit <address@hidden>
Subject: Re: Monit Issue during reboot

What if you set up an entry in inittab to kill monit when the system switches
to runlevel 0 or 6? This assumes, of course, that runlevel 0 is halt and 6 is
reboot.

On Mon, Nov 21, 2016 at 3:13 PM, Shankar SM <address@hidden> wrote:
Hi Geoff,

        The services are actually started by init scripts, so during shutdown
the kill scripts are called.
Since monit is started from inittab, there is a chance that the monit daemon
interval expires during the shutdown process and monit restarts applications
that have already shut down gracefully.
So basically, unmonitoring is meant to ensure a clean shutdown of all services.

Thanks
Shankar


From: monit-general [mailto:address@hidden On Behalf Of Geoff Goas
Sent: Monday, November 21, 2016 2:19 PM
To: This is the general mailing list for monit <address@hidden>
Subject: Re: Monit Issue during reboot

What is the reason for having monit unmonitor all services prior to shutdown?

On Mon, Nov 21, 2016 at 8:51 AM, Shankar SM <address@hidden> wrote:
Hi All,
                We are seeing an issue using monit during a reboot or shutdown
sequence. First, some background on the monit configuration.

1. Monit version is 5.5
2. Monit is started from inittab as null::respawn:/usr/bin/monit -Ic /etc/monitrc
3. Monit is configured to check services every 30 secs.
4. Monit is monitoring certain custom applications and restarts them if they 
are not running by calling the init script of that particular custom 
application.
5. The applications have their own init scripts in the /etc/init.d/ folder that
are run on bootup. Monit starts after these init scripts are run from inittab.

Now to the problem at hand.
During a shutdown/reboot of the system, a command is first sent to monit to
stop monitoring all services. After this, all the kill scripts in /etc/init.d/
are called. For the most part this works, but there is a corner case where
monit starts an application after the application has already shut down
gracefully, which is wrong. I have tried to capture the problem in the sequence
diagram linked below.

https://i.stack.imgur.com/XwYk9.png

I looked at the monit source code, and it looks like when the check for a
service fails, monit runs the configured command by forking and executing it.
So this forked process can still be in progress when monit receives an
unmonitor command. It also looks like monit does not stop the processes it
started when it receives this command; it returns immediately.

Is there a way to wait until all the forked processes started by monit are
complete?

Any other recommendation to avoid this problem?

Thanks
Shankar


--
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general




--
Geoff Goas
Systems Engineer
