If a "program" that Monit controls has more than one PID, and all of
them are started from a single start script, but ALL of them must be
stopped BEFORE the start command is issued on a restart, how is that
done with Monit? Not every piece of software has just a single PID
associated with it.
Yes, Monit reads only one PID from the pidfile.
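Since only one PID is read, waiting for the others has to happen outside Monit. Below is a minimal sketch of such a helper, not anything Monit provides. It assumes the pidfile holds whitespace-separated PIDs, as in the jboss_eradbre.pid shown in this thread, and that it runs on a Unix-like system. The names parse_pids, read_pids, is_alive and wait_for_all_exit are hypothetical, and the 30-second default simply mirrors the timeout in the posted configuration.

```python
import os
import time


def parse_pids(text):
    """Return every PID found in pidfile text (whitespace-separated integers)."""
    return [int(token) for token in text.split()]


def read_pids(pidfile_path):
    """Read a pidfile that may list more than one PID."""
    with open(pidfile_path) as f:
        return parse_pids(f.read())


def is_alive(pid):
    """Signal 0 checks that a process exists without actually signalling it."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True


def wait_for_all_exit(pids, timeout=30.0, interval=0.5, alive=is_alive):
    """Poll until no PID is alive; return the PIDs still running at timeout."""
    deadline = time.monotonic() + timeout
    remaining = [p for p in pids if alive(p)]
    while remaining and time.monotonic() < deadline:
        time.sleep(interval)
        remaining = [p for p in remaining if alive(p)]
    return remaining
```

One option is to call wait_for_all_exit(read_pids(path)) at the top of the init script's start action, so that a start issued right after a stop blocks until all of the old PIDs are gone. Whether that wait fits inside the start timeout configured in monitrc would need checking for the application in question.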
On Mar 5, 2009, at 7:30 PM, Perdue, Emmett wrote:
Process Name    = jboss_eradbre
Group           = server
Pid file        = /opt/local/software/jboss/jboss-4.0.5.GA/logs/jboss_eradbre.pid
Monitoring mode = active
Start program   = '/etc/init.d/jboss_eradbre start' timeout 30 second(s)
Stop program    = '/etc/init.d/jboss_eradbre stop' timeout 30 second(s)
Pid             = if changed 1 times within 1 cycle(s) then alert
Ppid            = if changed 1 times within 1 cycle(s) then alert
Timeout         = If 3 restart within 5 cycles then unmonitor else if succeeded then alert
There are multiple PIDs in the jboss_eradbre.pid file, 3 in this case.
See below:
$ cat /opt/local/software/jboss/jboss-4.0.5.GA/logs/jboss_eradbre.pid
9800 9981 10004
Could this be the problem? Could Monit be stopping the 1st PID and then
issuing the start command without waiting for the 2nd and 3rd PIDs to
stop?
If I run /etc/init.d/jboss_eradbre itself, the problem does not happen;
it only happens when Monit handles the process.
If the service is a process, Monit executes the stop command and waits
for the process whose PID matches the pidfile content to stop. As soon as
that process stops, the start script is executed. If the process stops
quickly, the start script can run very soon afterwards (within the same
second).
If the check is for a different service type (file, directory, host,
etc.), the stop script is executed and immediately followed by the start
script, since Monit currently has no way to tell whether the stop script
finished OK or not.
What is the configuration of the jboss_eradbre service?
You can run monit with the -v option to see details.
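To make the timing concrete, here is a toy model of the sequence described above for a process service. It is only an illustration of the explanation in this message, not Monit's actual code. The function name restart_order and the exit times are made up; time is counted in abstract ticks after the stop command is issued.

```python
def restart_order(pidfile_pids, exit_tick, max_ticks=100):
    """Toy model of a restart that waits only for the first PID in the pidfile.

    pidfile_pids: PIDs in the order they appear in the pidfile.
    exit_tick:    hypothetical tick at which each PID exits after stop.
    Returns (tick at which start is issued, PIDs still running at that tick).
    """
    tracked = pidfile_pids[0]  # the single PID the watcher knows about
    for tick in range(max_ticks):
        if exit_tick[tracked] <= tick:
            # The tracked PID is gone, so start is issued right away.
            still_running = [p for p in pidfile_pids if exit_tick[p] > tick]
            return tick, still_running
    # The tracked PID never exited within the modelled window.
    return None, list(pidfile_pids)


# Hypothetical exit times: the first PID exits quickly, the others linger.
print(restart_order([9800, 9981, 10004], {9800: 1, 9981: 5, 10004: 8}))
# prints (1, [9981, 10004])
```

With those made-up exit times, start is issued at tick 1 while two of the three PIDs are still running, which is the kind of overlap that running the init script by hand would not produce.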
On Mar 5, 2009, at 5:44 PM, Perdue, Emmett wrote:
I am seeing some strange behavior from Monit when
a restart command is issued. When I issue a "monit restart app_name"
command, Monit is sending the stop and start commands in the monitrc file
back to back within 1/10 of a second. It is not sending the stop command and
waiting for it to finish before sending the start command.
If I run the scripts outside of Monit, all is fine. What should I look
for? Below is a snippet of the Monit log from when the problem happens:
[EST Mar 5 10:12:32] debug : restart service 'jboss_eradbre' on user request
[EST Mar 5 10:12:32] info : monit daemon at 25448 awakened
[EST Mar 5 10:12:32] info : Awakened by User defined signal 1
[EST Mar 5 10:12:32] info : 'jboss_eradbre' trying to restart
[EST Mar 5 10:12:32] info : 'jboss_eradbre' stop: /etc/init.d/jboss_eradbre
[EST Mar 5 10:12:33] info : 'jboss_eradbre' start: /etc/init.d/jboss_eradbre
Thank You,
Emmett D. Perdue
CSX Corp.
Sr. Systems Admin - RHCE
Middleware Software Provisioning
Phone: (904) 633-5187 RNX: 633-5187
E-Mail: address@hidden
"Individuals Play the Game, But Teams Win Championships!"
This email transmission and any accompanying attachments may
contain CSX privileged and confidential information intended only for the
use of the intended addressee. Any dissemination, distribution, copying or
action taken in reliance on the contents of this email by anyone other than
the intended recipient is strictly prohibited. If you have received this
email in error please immediately delete it and notify sender at the above
CSX email address. Sender and CSX accept no liability for any damage caused
directly or indirectly by receipt of this email. -- To
unsubscribe: http://lists.nongnu.org/mailman/listinfo/monit-general