monit-general

Re: [monit] Problem with monit's "not monitoring" status


From: Alex Soto
Subject: Re: [monit] Problem with monit's "not monitoring" status
Date: Wed, 24 Feb 2010 09:07:21 -0800

I've seen problems where long-running monit processes (up for months) can no longer start services, even though running the start command manually works fine.

Interestingly, I've only experienced it with a single Ruby daemon (ar_mailer), so it may be related, since you're using backgroundrb.

A simple restart of the monit daemon fixed my problem.


On Feb 24, 2010, at 7:20 AM, David Bristow <address@hidden> wrote:

Here is a copy of the configuration for backgroundrb:

check process backgroundrb with pidfile /home/rails/ideeli/qa/current/tmp/pids/backgroundrb_8888.pid
 group backgroundrb
 start program = "/usr/local/bin/backgroundrb_wrapper start qa /home/rails/ideeli/qa/current/tmp/pids/backgroundrb_8888.pid" with timeout 40 seconds
 stop program = "/usr/local/bin/backgroundrb_wrapper stop qa /home/rails/ideeli/qa/current/tmp/pids/backgroundrb_8888.pid" with timeout 60 seconds
 if memory > 240 Mb then restart

There is nothing else of interest in the logs around this time, at least
nothing related to backgroundrb.

On Mon, Feb 22, 2010 at 4:56 PM, Martin Pala <address@hidden> wrote:
Hi David,

The service is unmonitored on stop, and starting the service enables monitoring again, so an unmonitored service after a start is not expected.

It seems to me that your 'backgroundrb' service has no "start program = ..." in your monit config file. If "start program" were defined, monit should log a message similar to "'backgroundrb' stop: /usr/local/bin/backgroundrb_wrapper", but with "start" instead of "stop". That message is missing from the log, so either it was logged after 11:45:32 (which is likely if start is defined), or the start program is not defined and the service was never started. The check may also have timed out (I don't know your configuration, so I cannot say), or somebody may have stopped it again.
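
The log check described above can be sketched as a small script. This is only an illustration, not part of monit: the function names are hypothetical, and the format of the "start:" line is assumed to mirror the "stop:" line quoted in this thread.

```python
import re

# Matches exec lines such as:
#   [EST Feb 19 11:45:10] info     : 'backgroundrb' stop: /usr/local/bin/backgroundrb_wrapper
# The "start:" variant is an assumption based on the "stop:" line above.
LINE_RE = re.compile(
    r"^\[(?P<stamp>[^\]]+)\]\s+(?P<level>\w+)\s*:\s+"
    r"'(?P<service>[^']+)'\s+(?P<action>start|stop):"
)


def exec_actions(log_text, service):
    """Return (timestamp, action) for each start/stop exec line of `service`."""
    # Rejoin lines wrapped by a mail client: a line that does not begin
    # with '[' is treated as a continuation of the previous entry.
    joined = []
    for raw in log_text.splitlines():
        if raw.startswith("[") or not joined:
            joined.append(raw.rstrip())
        else:
            joined[-1] += " " + raw.strip()

    actions = []
    for line in joined:
        match = LINE_RE.match(line)
        if match and match.group("service") == service:
            actions.append((match.group("stamp"), match.group("action")))
    return actions


def start_logged_after_last_stop(log_text, service):
    """True if a 'start:' exec line appears after the last 'stop:' exec line."""
    actions = [action for _, action in exec_actions(log_text, service)]
    if "stop" not in actions:
        return "start" in actions
    last_stop = len(actions) - 1 - actions[::-1].index("stop")
    return "start" in actions[last_stop + 1:]


# Excerpt of the log posted in this thread.
SAMPLE = """\
[EST Feb 19 11:44:48] debug : stop service 'backgroundrb' on user request
[EST Feb 19 11:45:10] info     : 'backgroundrb' stop:
/usr/local/bin/backgroundrb_wrapper
[EST Feb 19 11:45:19] debug : start service 'backgroundrb' on user request
[EST Feb 19 11:45:31] info     : 'backgroundrb' start action done
"""

if __name__ == "__main__":
    print(exec_actions(SAMPLE, "backgroundrb"))
    print(start_logged_after_last_stop(SAMPLE, "backgroundrb"))
```

Run against the posted excerpt, the script finds the stop exec line but no matching start exec line, which is the gap described above.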

Could you please provide the full monit configuration for the 'backgroundrb' service and the rest of the debug log between 11:44:48 and 12:08:33?

Are you able to reproduce the issue at will? I tried to replicate the problem, but it works fine for me.

Best regards,
Martin



On Feb 22, 2010, at 3:02 PM, David Bristow wrote:

We are having trouble with certain services managed by monit that do
not restart as they should after being shut down and then started up
again.

For example, we use backgroundrb. Someone shut it down for updating,
and started it up afterwards.  Here is a sample section of the
monit.log  that shows what was happening at the time:

[EST Feb 19 11:44:48] debug : stop service 'backgroundrb' on user request
[EST Feb 19 11:44:48] info     : monit daemon at 19023 awakened
[EST Feb 19 11:45:10] error    : 'syslog-ng' failed to start
[EST Feb 19 11:45:10] info     : 'backgroundrb' stop: /usr/local/bin/backgroundrb_wrapper
[EST Feb 19 11:45:19] debug : start service 'backgroundrb' on user request
[EST Feb 19 11:45:19] info     : monit daemon at 19023 awakened
[EST Feb 19 11:45:31] info     : 'backgroundrb' start action done
[EST Feb 19 11:45:32] info     : Awakened by User defined signal 1

And at 12:09 PM, this is the "monit status" for backgroundrb:

Process 'backgroundrb'
 status                            not monitored
 monitoring status                 not monitored
 data collected                    Fri Feb 19 12:08:33 2010

Why does this happen?  We are using monit 5.0.3.

--
David Bristow <address@hidden>


--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general


