RE: [monit] Monit restart command problem
From: Perdue, Emmett
Subject: RE: [monit] Monit restart command problem
Date: Thu, 5 Mar 2009 14:37:09 -0500
OK, Thanks for your efforts!!
-----Original Message-----
From: address@hidden
[mailto:address@hidden On
Behalf Of Martin Pala
Sent: Thursday, March 05, 2009 2:27 PM
To: This is the general mailing list for monit
Subject: Re: [monit] Monit restart command problem
The beta is over; the next release is final and is due within 1 week at
most. We can add the feature in a later release, after 5.0.
Perdue, Emmett wrote:
> I understand the Parent - Child relationship. But that does not hold
> true in all cases. JBoss is a good case. There is a single start
> executable / script. That will spawn several processes, not all of
> which have a Parent - Child relationship.
>
> I think I can work around the issue temporarily. But I think a
> permanent solution is for Monit to process every PID in a .pid file.
> That would be the cleanest way to handle it. Is that something that
> could be worked into Monit during the beta phase?
>
>
------------------------------------------------------------------------
> *From:* address@hidden
> [mailto:address@hidden *On
> Behalf Of *Martin Pala
> *Sent:* Thursday, March 05, 2009 1:52 PM
> *To:* This is the general mailing list for monit
> *Subject:* Re: [monit] Monit restart command problem
>
> If these processes have a common parent process (like apache, which
> spawns child processes), monit watches the parent process.
>
> If your script starts three independent processes whose parent is
> init (pid 1), then you will need some workaround. For example, modify
> the start script to check that all processes are stopped before
> starting; if they are still running, sleep 1 and check again.
>
> We can most likely also modify monit to check all pids from the
> pidfile.
>
> Alternatively, split the configuration and startup script into three
> independent processes (which they really are).
>
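The start-script workaround suggested above (check that every process is stopped, sleep 1, check again) might look roughly like the sketch below. It is only an illustration: the function name `wait_for_pids_gone` and the 30-second default are assumptions, not part of monit or of the jboss_eradbre init script, and it assumes the pidfile holds one PID per line as shown later in this thread.

```shell
#!/bin/sh
# Hypothetical helper for the "start" branch of an init script.
# Assumption: the pidfile lists one PID per line.

# Return 0 once no PID listed in pidfile $1 is alive,
# or 1 if $2 seconds (default 30) pass while some are still running.
wait_for_pids_gone() {
    pidfile=$1
    max_wait=${2:-30}
    waited=0
    # A missing pidfile is treated as "nothing is running".
    [ -f "$pidfile" ] || return 0
    while [ "$waited" -lt "$max_wait" ]; do
        alive=0
        while read -r pid; do
            case $pid in
                ''|*[!0-9]*) continue ;;   # skip blank or non-numeric lines
            esac
            # kill -0 sends no signal; it only tests whether the PID can be
            # signalled, so run this as a user allowed to signal the processes.
            if kill -0 "$pid" 2>/dev/null; then
                alive=1
            fi
        done < "$pidfile"
        [ "$alive" -eq 0 ] && return 0
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

In the start branch this could be called as `wait_for_pids_gone "$PIDFILE" 30 || exit 1` before launching anything, where `$PIDFILE` is whatever variable the script already uses for its pidfile.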
> On Mar 5, 2009, at 7:38 PM, Perdue, Emmett wrote:
>
>> If a "program" that Monit controls has more than 1 PID and all of
>> those are started from a single start script, but ALL must be stopped
>> BEFORE the start command is issued on a restart... how is that done
>> with Monit? Not every piece of software has just a single PID
>> associated with it.
>>
>>
------------------------------------------------------------------------
>> *From:* address@hidden
>> [mailto:address@hidden *On
>> Behalf Of *Martin Pala
>> *Sent:* Thursday, March 05, 2009 1:34 PM
>> *To:* This is the general mailing list for monit
>> *Subject:* Re: [monit] Monit restart command problem
>>
>> Yes, monit reads only one pid from the pidfile.
>>
>>
>> On Mar 5, 2009, at 7:30 PM, Perdue, Emmett wrote:
>>
>>> Process Name = jboss_eradbre
>>> Group = server
>>> Pid file = /opt/local/software/jboss/jboss-4.0.5.GA/logs/jboss_eradbre.pid
>>> Monitoring mode = active
>>> Start program = '/etc/init.d/jboss_eradbre start' timeout 30 second(s)
>>> Stop program = '/etc/init.d/jboss_eradbre stop' timeout 30 second(s)
>>> Pid = if changed 1 times within 1 cycle(s) then alert
>>> Ppid = if changed 1 times within 1 cycle(s) then alert
>>> Timeout = If 3 restart within 5 cycles then unmonitor else if succeeded then alert
>>>
>>> There are multiple PIDs in the jboss_eradbre.pid file, 3 in this
>>> case. See below:
>>>
>>> $ cat /opt/local/software/jboss/jboss-4.0.5.GA/logs/jboss_eradbre.pid
>>> 9800
>>> 9981
>>> 10004
>>>
>>> Could this be the problem? Could Monit be stopping the 1st PID and
>>> then issuing the start command without waiting on the 2nd and 3rd
>>> PIDs to stop?
>>>
>>> If I run /etc/init.d/jboss_eradbre itself, the problem does not
>>> happen; it only happens when Monit handles the process.
>>>
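One way to test the suspicion above is to run the stop script by hand and then see which PIDs from the pidfile are still alive. The snippet below is a minimal sketch for that; `list_live_pids` is a hypothetical name, and it assumes one PID per line in the file.

```shell
#!/bin/sh
# Hypothetical diagnostic: print each PID from pidfile $1 that is still running.
list_live_pids() {
    [ -f "$1" ] || return 0
    while read -r pid; do
        case $pid in
            ''|*[!0-9]*) continue ;;   # ignore blank or non-numeric lines
        esac
        # kill -0 only checks that the process exists and can be signalled
        if kill -0 "$pid" 2>/dev/null; then
            echo "$pid"
        fi
    done < "$1"
}
```

For example, copy the pidfile aside, run `/etc/init.d/jboss_eradbre stop`, and call `list_live_pids` on the copy a few times in a row. If the later PIDs are still listed after the first one has gone, the start command is probably being issued too early.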
------------------------------------------------------------------------
>>> *From:* address@hidden
>>> <mailto:address@hidden>
>>> [mailto:address@hidden *On
>>> Behalf Of *Martin Pala
>>> *Sent:* Thursday, March 05, 2009 1:20 PM
>>> *To:* This is the general mailing list for monit
>>> *Subject:* Re: [monit] Monit restart command problem
>>>
>>> If the service is a process, monit execs the stop command and waits
>>> for the process whose pid matches the pidfile content to stop. As
>>> soon as that process stops, the start script is executed. If the
>>> process stops quickly, the start script can be executed very quickly
>>> (within the same second).
>>>
>>> If the check is for a different service type (like file, directory,
>>> host, etc.), then the stop script is executed and the start script
>>> follows immediately, since monit currently has no way to identify
>>> whether the stop script finished OK or not.
>>>
>>> What is the configuration of the jboss_eradbre service?
>>>
>>> You can run monit with the -v option to see details.
>>>
>>>
>>> On Mar 5, 2009, at 5:44 PM, Perdue, Emmett wrote:
>>>
>>>> I am seeing some strange behavior from Monit when a restart command
>>>> is issued. When I issue a "monit restart app_name" command, Monit is
>>>> sending the stop and start commands in the monitrc file back to back
>>>> within 1/10 of a second. It is not sending the stop command and
>>>> waiting for it to finish before sending the start command.
>>>>
>>>> If I run the scripts outside of Monit, all is fine. What should I
>>>> look for? Below is a snip of the Monit log from when the problem
>>>> happens...
>>>>
>>>> [EST Mar 5 10:12:32] debug : restart service 'jboss_eradbre' on user request
>>>> [EST Mar 5 10:12:32] info : monit daemon at 25448 awakened
>>>> [EST Mar 5 10:12:32] info : Awakened by User defined signal 1
>>>> [EST Mar 5 10:12:32] info : 'jboss_eradbre' trying to restart
>>>> [EST Mar 5 10:12:32] info : 'jboss_eradbre' stop: /etc/init.d/jboss_eradbre
>>>> [EST Mar 5 10:12:33] info : 'jboss_eradbre' start: /etc/init.d/jboss_eradbre
>>>>
>>>>
>>>> Thank You,
>>>>
>>>> Emmett D. Perdue
>>>> *CSX Corp.*
>>>> *Sr. Systems Admin - RHCE*
>>>> *Middleware Software Provisioning*
>>>> Phone: (904) 633-5187 RNX: 633-5187
>>>> E-Mail: address@hidden <mailto:address@hidden>
>>>> */"Individuals Play the Game, But Teams Win Championships!"/*
>>>>
>>>>
>>>>
------------------------------------------------------------------------
>>>>
>>>> *This email transmission and any accompanying attachments may
>>>> contain CSX privileged and confidential information intended only
>>>> for the use of the intended addressee. Any dissemination,
>>>> distribution, copying or action taken in reliance on the contents of
>>>> this email by anyone other than the intended recipient is strictly
>>>> prohibited. If you have received this email in error please
>>>> immediately delete it and notify sender at the above CSX email
>>>> address. Sender and CSX accept no liability for any damage caused
>>>> directly or indirectly by receipt of this email. *
>>>>
>>>> --
>>>> To unsubscribe:
>>>> http://lists.nongnu.org/mailman/listinfo/monit-general
--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general