Hi All,
We have been using monit for the last 6 months and it has been very
useful for all of us. Thanks to all who made this a huge success.
I would really appreciate your help with the following problem:
I am using monit V4.4 to monitor a list of services.
On a running system, if a service cannot be restarted for some reason,
I tell monit in the /etc/monitrc file to retry a number of times before
a 'timeout'. After the number of retries specified in /etc/monitrc,
monit reports that it will give up monitoring the service.
For example, I have a service called foobar, which failed to start
multiple times (5 times in this case), and exceeded the limit I
specified in the /etc/monitrc file:
    check process foobar
        with pidfile "/var/run/foobar.pid"
        start program = "/etc/rc.d/init.d/foobar start"
        stop program = "/etc/rc.d/init.d/foobar stop"
        if 5 restarts within 6 cycles then timeout
We are calling "monit monitor all" immediately after starting monit to
start monitoring all the applications (including timed out ones) once a
reboot occurs.
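Roughly, the boot-time sequence looks like the sketch below. The function
name and the MONIT variable are made up for illustration; only starting
monit and then running "monit monitor all" reflect what we actually do,
and passing the control file with -c is an assumption about our setup.

```shell
#!/bin/sh
# Sketch of the boot-time sequence described above.
# MONIT and start_and_monitor_all are illustrative names, not part of monit.
MONIT="${MONIT:-monit}"

start_and_monitor_all() {
    # Start the monit daemon using the control file /etc/monitrc.
    $MONIT -c /etc/monitrc || return 1
    # Ask monit to enable monitoring for every service,
    # including the ones that previously timed out.
    $MONIT -c /etc/monitrc monitor all
}
```

Our boot script would then simply call start_and_monitor_all once.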
But in case of an abnormal/abrupt reboot, i.e. when you power off the
system directly or press the reset button (improper reboot), monit does
not monitor the timed-out applications.
Only if we forcibly delete the state file that monit maintains in the
user's home directory, i.e. $HOME/.monit.state, am I able to re-monitor
the timed-out applications. Is there any other way to re-monitor these
services? What could be the problem? I failed to find anything useful on
Google :(
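For clarity, the workaround with the state file amounts to something like
the sketch below, run before monit is started. STATE_FILE and
clear_monit_state are names I made up for illustration; the path is the
state file in the home directory mentioned above.

```shell
#!/bin/sh
# Sketch of the workaround: remove monit's saved state before starting it.
# STATE_FILE and clear_monit_state are illustrative names, not part of monit.
STATE_FILE="${STATE_FILE:-$HOME/.monit.state}"

clear_monit_state() {
    # -f: do not fail if the file is already absent.
    rm -f "$STATE_FILE"
}
```

As far as I understand, this throws away everything monit saved in that
file, not only the timed-out entries, which is why I am hoping there is a
cleaner way.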
Thanks in advance!
Thanks and Regards,
Rajesh G