Zombie processes and exit code retrieval

From: Struan Bartlett
Subject: Zombie processes and exit code retrieval
Date: Mon, 22 Jun 2015 19:05:39 +0100


I'd like to query the rationale for a behaviour I've been experiencing in Monit. I'm testing with the following config:

# Test config start
set daemon 10

check program MyProgram with path "/bin/dash -c 'echo OK!; exit 1'"
   every "06 * * * *"
   if status != 0 then alert
# Test config end

As expected, Monit runs the dash test program at 6 minutes past the hour. The dash script finishes immediately. However, Monit doesn't collect, report or alert on the exit code in a timely manner: the dash process remains a zombie until the next time Monit is scheduled to run the test script. That is an hour later, which is a long time to wait to be alerted to the script failing.

If the 'every' schedule were "06 0 * * *", then it would seem one should expect to wait 24 hours before being alerted to the script failing!

I realise the Monit manual explains:

"The asynchronous nature of the program check [...] comes with a side-effect: when the program has finished executing and is waiting for Monit to collect the result, it becomes a so-called "zombie" process [...] the zombie process is removed from the system as soon as Monit collects the exit status. This means that every "check program" will be associated with either a running process or a temporary zombie. This unwanted zombie side-effect will be removed in a later release of Monit."

That may be so, however why doesn't Monit reap the child and collect the exit code at the *next poll cycle after the child exits* (i.e. within 10 seconds of the test script finishing given the 'set daemon 10' line in the test config above) rather than when the program is next scheduled to be run? Maybe I'm missing something, but the current behaviour seems to undermine the entire purpose of providing alerts on program failure (when used in conjunction with cron-style scheduling). That is the behaviour I'd like to query the rationale for.
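
To make the behaviour I'm asking about concrete, here is a minimal sketch in Python. It is not Monit's code (Monit is written in C) and the function name poll_until_reaped is purely my own illustration. It starts a child, then on each simulated poll cycle makes a non-blocking waitpid() call, so the zombie is reaped and the exit code is available at the first cycle after the child exits. I use /bin/sh in place of /bin/dash on the assumption that it is present on any POSIX system.

```python
import os
import time


def poll_until_reaped(argv, poll_interval=0.1, max_cycles=50):
    """Start argv, then check once per poll cycle whether it has exited.

    Returns (cycle, exit_code) for the cycle at which the child was
    reaped, or (None, None) if it was still running after max_cycles.
    Hypothetical helper for illustration only; not Monit's implementation.
    """
    # Start the child without waiting for it (POSIX only).
    pid = os.spawnvp(os.P_NOWAIT, argv[0], argv)

    for cycle in range(1, max_cycles + 1):
        # Stand-in for the daemon's poll interval ('set daemon 10').
        time.sleep(poll_interval)

        # WNOHANG makes waitpid() return immediately: (0, 0) if the
        # child is still running, (pid, status) if it has exited.
        # A successful call also removes the zombie from the system.
        reaped_pid, raw_status = os.waitpid(pid, os.WNOHANG)
        if reaped_pid == pid:
            if os.WIFEXITED(raw_status):
                return cycle, os.WEXITSTATUS(raw_status)
            # Child was killed by a signal: report it as a negative number.
            return cycle, -os.WTERMSIG(raw_status)

    return None, None


if __name__ == "__main__":
    cycle, code = poll_until_reaped(["/bin/sh", "-c", "echo OK!; exit 1"])
    print("reaped at poll cycle", cycle, "with exit status", code)
```

With a check like this in every poll cycle, the delay before an alert would be bounded by the daemon interval rather than by the 'every' schedule.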

Thanks in advance.

Kind regards



Struan Bartlett
NewsNow Publishing Limited

