monit-general

RE: Problem stopping services


From: Benjamin Krajmalnik
Subject: RE: Problem stopping services
Date: Sat, 26 Mar 2011 16:12:29 -0600

Martin, definitely a typo in the email.

I do not see where I can add any debugging to the script itself.

It appears to be working fine when called from the shell.

I tweaked the rules and now it does not attempt to kill the processes so I am ok for the time being.

I will see if I can figure out what is going on. Maybe I will break the script into multiple parts, sending the output of each part to monit.out to see whether anything shows up, and then put it back together.

 

 

From: address@hidden [mailto:address@hidden] On Behalf Of Martin Pala
Sent: Saturday, March 26, 2011 2:14 PM
To: This is the general mailing list for monit
Subject: Re: Problem stopping services

 

Then it could be good to add some debug logging to the stopdaemon.sh script.

 

It's probably just a typo in the mail, but I can see "/usr/binawk" (missing slash) in your stopdaemon.sh script instead of "/usr/bin/awk". Can you please verify it?
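
A minimal sketch of the kind of debug logging suggested above, for a script such as stopdaemon.sh. The names LOGFILE, log_msg and log_env, and the default log path, are hypothetical and do not come from the original script. The idea is only to record what the script sees when a supervisor starts it, since the environment may differ from an interactive shell.

```shell
#!/bin/sh
# Hypothetical debug helpers; LOGFILE, log_msg and log_env are
# illustrative names, not part of the original stopdaemon.sh.
LOGFILE="${LOGFILE:-/tmp/stopdaemon_debug.log}"

log_msg() {
    # Append one timestamped line, tagged with this shell's PID.
    printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$$" "$*" >> "$LOGFILE"
}

log_env() {
    # Record the arguments and the environment the script actually
    # runs in, so it can be compared with an interactive shell.
    log_msg "args: $*"
    log_msg "PATH=$PATH"
    log_msg "uid=$(id -u)"
}
```

Calling log_env "$@" at the top of the script and log_msg after each stage of the pipeline would show how far the script gets before it stops producing output.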

 

 

 

On Mar 24, 2011, at 9:36 PM, Benjamin Krajmalnik wrote:



Hello Martin,

 

I made the changes per your request.

The monit_stop.out file was created but it is empty.

Stop command failed to stop the service.

 

The section from monitrc follows:

 

  check process staledaemon with pidfile /var/run/staledaemon.php.pid

    start program = "/root/startdaemon.sh staledaemon" with timeout 60 seconds

    stop program  = "/usr/local/bin/bash -c '/root/stopdaemon.sh staledaemon >> /tmp/monit_stop.out 2>&1' "

    if cpu > 60% for 2 cycles then alert

    if cpu > 80% for 5 cycles then alert

    if cpu > 80% for 5 cycles then restart

    if loadavg(5min) greater than 16 for 8 cycles then alert

    if loadavg(5min) greater than 16 for 10 cycles then restart

    if children < 2 for 2 cycles then restart

 

Previously I had been using webmin to monitor the processes, and everything worked fine with it.  I decided to move to monit because of the higher granularity and functionality.  I am running the tests every 30 seconds.  Also, I have one test which monitors a process by name and which seems to think the process is going down; I am not certain whether it is or not and am still checking.  In this case the process is pgagent (the PostgreSQL agent).  I had never seen it go down before, so I will do more checks, but I had seen a thread concerning some issues with monitoring a process by name.

 

 

 

From: address@hidden [mailto:address@hidden] On Behalf Of Martin Pala
Sent: Tuesday, March 22, 2011 2:07 AM
To: This is the general mailing list for monit
Subject: Re: Problem stopping services

 

Hello,

 

please can you add the start/stop statement which you use for the stopdaemon.sh in monit configuration file?

 

You can log the script output this way:

 

stop program = "/bin/bash -c '/root/stopdaemon.sh staledaemon >>/tmp/monit_stop.out 2>&1'"
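
As an alternative to the inline redirection above, the same capture can be done in a small wrapper. This is only a sketch: run_logged is a hypothetical helper name, and it additionally records the exit status of the command, which the redirection alone does not show.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with stdout and stderr appended
# to a log file, and record the command's exit status there as well.
run_logged() {
    logfile="$1"
    shift
    (
        # Redirect everything this subshell writes into the log file.
        exec >>"$logfile" 2>&1
        echo "running: $*"
        status=0
        "$@" || status=$?
        echo "exit status: $status"
        # Propagate the command's status to the caller.
        exit "$status"
    )
}
```

For example, run_logged /tmp/monit_stop.out /root/stopdaemon.sh staledaemon would leave both the script output and its exit status in the log, which helps when the file would otherwise be empty.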

 

 

Regards,

Martin

 

 

On Mar 22, 2011, at 12:36 AM, Benjamin Krajmalnik wrote:




I am having a strange issue stopping a group of services (daemons written in PHP).

OS is FreeBSD 8.1, monit version is 5.2.4.

 

I have tried various approaches.

I tried running it through a stopdaemon.sh which looks as follows:

 

#!/bin/sh

#stop a php daemon

mydaemon="$1.php"

killstring="/bin/ps -aux | /usr/bin/grep 'php $mydaemon' | /usr/bin/grep -v stopdaemon | /usr/bin/grep -v grep | /usr/binawk ' {print \$2}' | /usr/bin/xargs /bin/kill -s KILL && sleep 10"

eval $killstring
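
A sketch of how the PID lookup in the script above could be written without eval and without the chain of grep filters. The function names match_pids and list_daemon_pids are hypothetical. One possible explanation for the difference between the shell and monit, which is an assumption and is not confirmed anywhere in this thread, is that ps truncates the command column when no terminal is attached; the -ww flags below ask ps not to truncate.

```shell
#!/bin/sh
# Hypothetical helper: read "PID COMMAND" lines on stdin and print the
# PIDs whose command line ends with "php <daemon>.php".
match_pids() {
    daemon="$1.php"
    awk -v d="$daemon" '
        {
            pid = $1
            # Drop the PID column; what remains is the command line.
            $1 = ""
            cmd = substr($0, 2)
            want = "php " d
            n = length(cmd)
            m = length(want)
            if (n >= m && substr(cmd, n - m + 1) == want) print pid
        }'
}

# Hypothetical helper: list the PIDs of a PHP daemon by name.
# Assumption: -ww keeps ps from truncating the command column when
# the script is started without a terminal.
list_daemon_pids() {
    /bin/ps -axww -o pid= -o command= | match_pids "$1"
}
```

The resulting list could then be passed to kill, for example with list_daemon_pids staledaemon | xargs kill -s KILL. Matching on the end of the command line also excludes the stopdaemon.sh process itself, so the two grep -v filters are no longer needed.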

 

Running this from the command line works fine.  Running it from within monit fails – not sure why.

Checking the processes which run as the stop script is called I see the following, which indicates it was called properly and with root access:

 

root      61480  0.6  0.0  8264  1784  ??  S     5:28PM   0:00.01 /bin/sh /root/stopdaemon.sh staledaemon

root      60597  0.0  0.1 99664 21116  ??  Ss    5:27PM   0:00.03 /usr/local/bin/php staledaemon.php

root      60598  0.0  0.1 99664 21200  ??  I     5:27PM   0:00.01 /usr/local/bin/php staledaemon.php

root      61620  0.0  0.1 99664 21204  ??  S     5:28PM   0:00.00 /usr/local/bin/php staledaemon.php

 

Launching the script from the command line works fine.  Any ideas will be deeply appreciated, since it is critical that I be able to stop the processes.

I monitor the number of processes which are running, and if they fall below a certain level I need to restart the service, since each process in the service has its own functionality.

 

Any assistance will be deeply appreciated.

 

--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general

 

