monit-general
RE: [monit] Monit restart command problem


From: Perdue, Emmett
Subject: RE: [monit] Monit restart command problem
Date: Thu, 5 Mar 2009 14:37:09 -0500

OK, Thanks for your efforts!! 

-----Original Message-----
From: address@hidden
[mailto:address@hidden On
Behalf Of Martin Pala
Sent: Thursday, March 05, 2009 2:27 PM
To: This is the general mailing list for monit
Subject: Re: [monit] Monit restart command problem

The beta is over; the next release is the final one, due within a week at
most. We can add the feature in a later release, after 5.0.



Perdue, Emmett wrote:
> I understand the parent-child relationship, but it does not hold true
> in all cases. JBoss is a good example: there is a single start
> executable/script that spawns several processes, not all of which have
> a parent-child relationship.
>
> I think I can work around the issue temporarily, but a permanent
> solution would be for Monit to process every PID in a .pid file. That
> would be the cleanest way to handle it. Is that something that could be
> worked into Monit during the beta phase?
> 
>
------------------------------------------------------------------------
> *From:* address@hidden 
> [mailto:address@hidden *On 
> Behalf Of *Martin Pala
> *Sent:* Thursday, March 05, 2009 1:52 PM
> *To:* This is the general mailing list for monit
> *Subject:* Re: [monit] Monit restart command problem
> 
> If these processes have a common parent process (like Apache, which
> spawns child processes), monit watches the parent process.
>
> If your script starts three independent processes whose parent is init
> (pid 1), then you will need a workaround. For example, modify the
> start script to check that all processes have stopped before starting;
> if any are still running, sleep 1 and check again.
>
> We can also modify monit to check all pids from the pidfile.
>
> Another option is to split the configuration and startup script into
> three independent processes (which they really are).
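
The start-script workaround suggested above could look roughly like the following. This is only a sketch, not something taken from monit or from the JBoss init script: the function names `all_pids_gone` and `wait_for_stop`, the one-PID-per-line pidfile layout, and the 30-second default timeout are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for the "start" branch of an init script: block until
# every PID listed in a pidfile (assumed one per line) has exited.

# all_pids_gone PIDFILE: succeeds (0) when no listed PID is still running.
all_pids_gone() {
    pidfile=$1
    [ -f "$pidfile" ] || return 0            # no pidfile: nothing to wait for
    # The "|| [ -n ... ]" part also handles a last line with no newline.
    while read -r pid || [ -n "$pid" ]; do
        case $pid in
            ''|*[!0-9]*) continue ;;         # skip blank or non-numeric lines
        esac
        # kill -0 delivers no signal; it only tests whether the PID can be
        # signalled. It also fails for processes owned by another user, so
        # run this as root or as the user that owns the processes.
        if kill -0 "$pid" 2>/dev/null; then
            return 1
        fi
    done < "$pidfile"
    return 0
}

# wait_for_stop PIDFILE [TIMEOUT_SECONDS]: poll once per second and return
# non-zero if something is still running when the timeout expires.
wait_for_stop() {
    pidfile=$1
    remaining=${2:-30}
    until all_pids_gone "$pidfile"; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

In the start branch one would call something like `wait_for_stop "$PIDFILE" 30 || exit 1` before launching anything, so that a restart issued by monit cannot start new processes while old ones from the previous run are still shutting down.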
> 
> On Mar 5, 2009, at 7:38 PM, Perdue, Emmett wrote:
> 
>> If a "program" that Monit controls has more than one PID, all of them
>> are started from a single start script, and ALL must be stopped BEFORE
>> the start command is issued on a restart, how is that done with Monit?
>> Not every piece of software has just a single PID associated with it.
>>
>>
------------------------------------------------------------------------
>> *From:* address@hidden 
>> [mailto:address@hidden *On 
>> Behalf Of *Martin Pala
>> *Sent:* Thursday, March 05, 2009 1:34 PM
>> *To:* This is the general mailing list for monit
>> *Subject:* Re: [monit] Monit restart command problem
>>
>> Yes, monit reads only one pid from pidfile
>>
>>
>> On Mar 5, 2009, at 7:30 PM, Perdue, Emmett wrote:
>>
>>> Process Name          = jboss_eradbre
>>>  Group                = server
>>>  Pid file             = /opt/local/software/jboss/jboss-4.0.5.GA/logs/jboss_eradbre.pid
>>>  Monitoring mode      = active
>>>  Start program        = '/etc/init.d/jboss_eradbre start' timeout 30 second(s)
>>>  Stop program         = '/etc/init.d/jboss_eradbre stop' timeout 30 second(s)
>>>  Pid                  = if changed 1 times within 1 cycle(s) then alert
>>>  Ppid                 = if changed 1 times within 1 cycle(s) then alert
>>>  Timeout              = If 3 restart within 5 cycles then unmonitor else if succeeded then alert
>>>  
>>> There are multiple PIDs in the jboss_eradbre.pid file, three in this
>>> case. See below:
>>>
>>> $ cat /opt/local/software/jboss/jboss-4.0.5.GA/logs/jboss_eradbre.pid
>>> 9800
>>> 9981
>>> 10004
>>>
>>> Could this be the problem? Could Monit be stopping the first PID and
>>> then issuing the start command without waiting for the second and
>>> third PIDs to stop?
>>>
>>> If I run /etc/init.d/jboss_eradbre myself, the problem does not
>>> happen; it only happens when Monit handles the process.
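
One way to test the hypothesis above by hand is to run the stop script and then list which PIDs from the pidfile are still alive. The helper below is an illustrative sketch only: `live_pids` is a made-up name, and it assumes the pidfile holds one PID per line, as in the `cat` output shown.

```shell
#!/bin/sh
# live_pids PIDFILE: print every PID from the file that is still running.
# Hypothetical diagnostic helper; not part of monit or of the init script.
live_pids() {
    while read -r pid || [ -n "$pid" ]; do
        case $pid in
            ''|*[!0-9]*) continue ;;   # ignore blank or non-numeric lines
        esac
        # kill -0 sends no signal; it only checks that the PID can be signalled.
        if kill -0 "$pid" 2>/dev/null; then
            echo "$pid"
        fi
    done < "$1"
}
```

Running `/etc/init.d/jboss_eradbre stop` and then `live_pids` on the pidfile a moment later would show whether the second and third processes outlive the first, which is the window in which a quick restart could collide with them.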
>>>
------------------------------------------------------------------------
>>> *From:* address@hidden 
>>> <mailto:address@hidden> 
>>> [mailto:address@hidden *On 
>>> Behalf Of *Martin Pala
>>> *Sent:* Thursday, March 05, 2009 1:20 PM
>>> *To:* This is the general mailing list for monit
>>> *Subject:* Re: [monit] Monit restart command problem
>>>
>>> If the service is a process, monit executes the stop command and waits
>>> for the process whose pid matches the pidfile content to stop. As soon
>>> as that process stops, the start script is executed. If the process
>>> stops quickly, the start script can run very soon afterwards (within
>>> the same second).
>>>
>>> If the check is for a different service type (file, directory, host,
>>> etc.), the stop script is followed immediately by the start script,
>>> since monit currently has no way to tell whether the stop script
>>> finished OK or not.
>>>
>>> What is the configuration of the jboss_eradbre service?
>>>
>>> You can run monit with the -v option to see details.
>>>
>>>
>>> On Mar 5, 2009, at 5:44 PM, Perdue, Emmett wrote:
>>>
>>>> I am seeing some strange behavior from Monit when a restart command
>>>> is issued. When I issue a "monit restart app_name" command, Monit
>>>> sends the stop and start commands from the monitrc file back to back,
>>>> within 1/10 of a second. It does not send the stop command and wait
>>>> for it to finish before sending the start command.
>>>>
>>>> If I run the scripts outside of Monit, all is fine. What should I
>>>> look for? Below is a snippet of the Monit log from when the problem
>>>> happens...
>>>>
>>>> [EST Mar  5 10:12:32] debug    : restart service 'jboss_eradbre' on user request
>>>> [EST Mar  5 10:12:32] info     : monit daemon at 25448 awakened
>>>> [EST Mar  5 10:12:32] info     : Awakened by User defined signal 1
>>>> [EST Mar  5 10:12:32] info     : 'jboss_eradbre' trying to restart
>>>> [EST Mar  5 10:12:32] info     : 'jboss_eradbre' stop: /etc/init.d/jboss_eradbre
>>>> [EST Mar  5 10:12:33] info     : 'jboss_eradbre' start: /etc/init.d/jboss_eradbre
>>>>
>>>>
>>>> Thank You,
>>>>
>>>> Emmett D. Perdue
>>>> *CSX Corp.*
>>>> *Sr. Systems Admin - RHCE*
>>>> *Middleware Software Provisioning*
>>>> Phone: (904) 633-5187 RNX: 633-5187
>>>> E-Mail: address@hidden <mailto:address@hidden>
>>>>
>>>> */"Individuals Play the Game, But Teams Win Championships!"/*
>>>>
>>>>
>>>>
------------------------------------------------------------------------
>>>>
>>>> *This email transmission and any accompanying attachments may 
>>>> contain CSX privileged and confidential information intended only 
>>>> for the use of the intended addressee. Any dissemination, 
>>>> distribution, copying or action taken in reliance on the contents of
>>>> this email by anyone other than the intended recipient is strictly
>>>> prohibited. If you have received this email in error please 
>>>> immediately delete it and notify sender at the above CSX email 
>>>> address. Sender and CSX accept no liability for any damage caused 
>>>> directly or indirectly by receipt of this email. *
>>>>
>>>> --
>>>> To unsubscribe:
>>>> http://lists.nongnu.org/mailman/listinfo/monit-general

