monit-general

RE: [monit] Monit restart command problem


From: Perdue, Emmett
Subject: RE: [monit] Monit restart command problem
Date: Thu, 5 Mar 2009 13:38:43 -0500

If a "program" that Monit controls has more than 1 PID and all of those are started from a single start script, but ALL must be stopped BEFORE the start command is issued on a restart... how is that done with Monit? Not every piece of software has just a single PID associated with it.
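One possible workaround, sketched here rather than taken from Monit itself, is a small shell helper that blocks until every PID listed in the pidfile is gone. The function name `wait_for_all_pids` and its default timeout are hypothetical, and it assumes the caller is allowed to signal those PIDs (otherwise `kill -0` reports them as gone).

```shell
#!/bin/sh
# Hypothetical helper, not part of Monit: wait until every PID listed in a
# pidfile (one PID per line) has exited.
# Returns 0 when none is alive, 1 if any is still alive after the timeout.
wait_for_all_pids() {
    pidfile=$1
    timeout=${2:-30}                   # seconds; assumed default
    [ -r "$pidfile" ] || return 0      # no readable pidfile: nothing to wait for
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        alive=0
        # The "|| [ -n ... ]" part also handles a last line without a newline.
        while read -r pid || [ -n "$pid" ]; do
            case $pid in ''|*[!0-9]*) continue ;; esac   # skip blank or non-numeric lines
            # kill -0 sends no signal; it only checks that the PID can be signalled.
            kill -0 "$pid" 2>/dev/null && alive=1
        done < "$pidfile"
        [ "$alive" -eq 0 ] && return 0
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

The init script could call this at the top of its start action, for example `wait_for_all_pids /path/to/app.pid 30 || exit 1` (path hypothetical), so that start does not proceed while old processes linger. Whether the pidfile still exists at that point depends on the init script, which is not shown in this thread.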


From: address@hidden [mailto:address@hidden] On Behalf Of Martin Pala
Sent: Thursday, March 05, 2009 1:34 PM
To: This is the general mailing list for monit
Subject: Re: [monit] Monit restart command problem

Yes, Monit reads only one PID from the pidfile.


On Mar 5, 2009, at 7:30 PM, Perdue, Emmett wrote:

Process Name          = jboss_eradbre
 Group                = server
 Pid file             = /opt/local/software/jboss/jboss-4.0.5.GA/logs/jboss_eradbre.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/jboss_eradbre start' timeout 30 second(s)
 Stop program         = '/etc/init.d/jboss_eradbre stop' timeout 30 second(s)
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert
 Timeout              = If 3 restart within 5 cycles then unmonitor else if succeeded then alert
 
There are multiple PIDs in the jboss_eradbre.pid file, three in this case. See below:
 
$ cat /opt/local/software/jboss/jboss-4.0.5.GA/logs/jboss_eradbre.pid
9800
9981
10004
Could this be the problem? Could Monit be stopping the 1st PID and then issuing the start command without waiting for the 2nd and 3rd PIDs to stop?
 
If I run /etc/init.d/jboss_eradbre myself, the problem does not happen. It only happens when Monit handles the process.

From: address@hidden [mailto:address@hidden] On Behalf Of Martin Pala
Sent: Thursday, March 05, 2009 1:20 PM
To: This is the general mailing list for monit
Subject: Re: [monit] Monit restart command problem

If the service is a process, Monit executes the stop command and waits for the process whose PID matches the pidfile content to stop. As soon as that process stops, the start script is executed. If the process stops quickly, the start script can run almost immediately (within the same second).

If the check is for a different service type (file, directory, host, etc.), the stop script is executed and the start script follows immediately, since Monit currently has no way to tell whether the stop script finished successfully.
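The process-check case described above can be modelled roughly as follows. This is only an illustrative sketch, not Monit's code: the function name and arguments are hypothetical, it assumes the single PID Monit consults is the first line of the pidfile, and what happens when the wait times out is not covered in this thread, so the sketch simply returns an error without starting.

```shell
#!/bin/sh
# Illustrative model of a restart for a process check (not Monit's code):
# run stop, wait for ONE pid from the pidfile to vanish, then run start.
restart_like_process_check() {
    stop_cmd=$1
    start_cmd=$2
    pidfile=$3
    max_wait=${4:-30}                  # seconds; assumed default
    # Assumption: only the first line of the pidfile is consulted.
    pid=$(head -n 1 "$pidfile" 2>/dev/null) || pid=
    case $pid in ''|*[!0-9]*) pid= ;; esac   # ignore blank or non-numeric content
    sh -c "$stop_cmd"
    waited=0
    # Any further PIDs in the pidfile are never checked here.
    while [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; do
        [ "$waited" -ge "$max_wait" ] && return 1   # assumed: give up, do not start
        sleep 1
        waited=$((waited + 1))
    done
    sh -c "$start_cmd"
}
```

Under this model, start runs as soon as the first PID is gone, even if other PIDs written to the same pidfile are still shutting down, which would match the back-to-back stop and start seen in the log.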

What is the configuration of the jboss_eradbre service?

You can run Monit with the -v option to see details.


On Mar 5, 2009, at 5:44 PM, Perdue, Emmett wrote:

I am seeing some strange behavior from Monit when a restart command is issued. When I issue a "monit restart app_name" command, Monit is sending the stop and start commands in the monitrc file back to back within 1/10 of a second. It is not sending the stop command and waiting for it to finish before sending the start command.

If I run the scripts outside of Monit, all is fine. What should I look for? Below is a snippet of the Monit log from when the problem happens:

[EST Mar  5 10:12:32] debug    : restart service 'jboss_eradbre' on user request
[EST Mar  5 10:12:32] info     : monit daemon at 25448 awakened
[EST Mar  5 10:12:32] info     : Awakened by User defined signal 1
[EST Mar  5 10:12:32] info     : 'jboss_eradbre' trying to restart
[EST Mar  5 10:12:32] info     : 'jboss_eradbre' stop: /etc/init.d/jboss_eradbre
[EST Mar  5 10:12:33] info     : 'jboss_eradbre' start: /etc/init.d/jboss_eradbre


Thank You,

Emmett D. Perdue
CSX Corp.
Sr. Systems Admin - RHCE
Middleware Software Provisioning
Phone: (904) 633-5187 RNX: 633-5187
E-Mail: address@hidden

"Individuals Play the Game, But Teams Win Championships!"




This email transmission and any accompanying attachments may contain CSX privileged and confidential information intended only for the use of the intended addressee. Any dissemination, distribution, copying or action taken in reliance on the contents of this email by anyone other than the intended recipient is strictly prohibited. If you have received this email in error please immediately delete it and notify sender at the above CSX email address. Sender and CSX accept no liability for any damage caused directly or indirectly by receipt of this email.

--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general





