monit-general

RE: Process being restarted by monit validate even if it is running


From: Mehul Ved
Subject: RE: Process being restarted by monit validate even if it is running
Date: Fri, 7 Mar 2014 11:21:28 +0000

# ls -ld /var/run
lrwxrwxrwx 1 root root 4 Jul  5  2013 /var/run -> /run

# ls -ld /var/run/node/
drwxr-xr-x 2 root root 120 Mar  7 08:52 /var/run/node/

# ls -l /var/run/node/v2.pid 
-rw-r--r-- 1 root root 4 Mar  7 08:52 /var/run/node/v2.pid

# lsof -i :2812
COMMAND   PID USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
monit   14382 root    6u  IPv4 70768563      0t0  TCP localhost:2812 (LISTEN)

I am running monit as root, and monit validate is also run from a root login.

I had added "as uid root" specifically because I was having problems reading 
the pidfile without it. It is probably redundant now.
Another thing I should have made clear: this is happening only for 
processes for which I have written a start/stop script. My script gets the pid 
of the process and echoes it into the given file. To the best of my knowledge, 
that's the correct thing to do with a pidfile. And considering that monit is 
getting the pid correctly, I believe that part is fine. It even manages to stop 
the program correctly, which would have failed if the program wasn't running 
with the correct pid.
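
For reference, the pidfile handling described above might look roughly like 
the following. This is only a sketch, not the actual /usr/local/bin/nodeinit 
script (which is not shown in this thread); the function name 
start_with_pidfile and the temporary file naming are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not taken from nodeinit):
#   start_with_pidfile PIDFILE COMMAND [ARGS...]
# Starts COMMAND in the background and records its pid in PIDFILE.
start_with_pidfile() {
    pidfile=$1
    shift
    # Launch the command in the background; $! is the pid of that child.
    # Caveat: if COMMAND is a wrapper that forks again and exits, $! is the
    # wrapper's pid, not the pid of the long-running process.
    "$@" >/dev/null 2>&1 &
    pid=$!
    # Write to a temporary file and rename it, so that a reader never sees
    # a half-written pidfile.
    tmp="${pidfile}.$$"
    printf '%s\n' "$pid" > "$tmp" && mv "$tmp" "$pidfile"
}
```

A start action would then call something like 
start_with_pidfile /var/run/node/v2.pid followed by the command line of the 
server.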

I cannot figure out why monit doesn't see that process as running. 
Are there any debugging steps I can follow?

________________________________________
From: address@hidden <address@hidden> on behalf of Martin Pala <address@hidden>
Sent: Friday, March 07, 2014 4:26 PM
To: This is the general mailing list for monit
Subject: Re: Process being restarted by monit validate even if it is running

Hi,

Is Monit running as root or as a different user?

If it is running as root, then the "as uid root" in the start/stop programs is 
not necessary:

        start program = "/usr/local/bin/nodeinit v2 start" as uid root

If it is running as a different user (which may be the reason for adding "as 
uid root", though that most probably won't work, as the user won't have 
permission to switch to root unless the binary is setuid or sudo is used), then 
it is possible that Monit cannot read the pidfile. Please check the permissions 
of the whole path to the pidfile and of the pidfile itself:

        ls -ld /var/run
        ls -ld /var/run/node
        ls -l /var/run/node/v2.pid
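
The three checks above can be generalised to walk every component of the path 
up to the root. A small sketch; path_chain is a hypothetical helper name, not 
part of Monit:

```shell
#!/bin/sh
# Hypothetical helper: print "ls -ld" for a path and each of its parent
# directories, ending at "/" (or "." for a relative path).
path_chain() {
    p=$1
    while :; do
        # -d lists the entry itself rather than its contents. ls does not
        # follow symlinks here, so a link such as /var/run -> /run shows up
        # as a link.
        ls -ld "$p"
        if [ "$p" = "/" ] || [ "$p" = "." ]; then
            break
        fi
        p=$(dirname "$p")
    done
}
```

For example, path_chain /var/run/node/v2.pid lists the pidfile, then 
/var/run/node, /var/run, /var and /. Every directory in the chain needs the 
execute (search) bit for the user Monit runs as, and the pidfile itself needs 
to be readable.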

You can run Monit in debug mode to get more details about the test progress:

        monit -vI
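
Alongside debug mode, the existence test can be approximated by hand from the 
same shell that runs Monit. The sketch below assumes that sending signal 0 
with kill is a fair stand-in for the check Monit performs; it is not Monit's 
own code, and pidfile_status is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report whether the pid stored in a pidfile refers to
# a live process. Prints one of: running, not running, invalid, unreadable.
pidfile_status() {
    pidfile=$1
    if [ ! -r "$pidfile" ]; then
        echo "unreadable"
        return 2
    fi
    # Strip newlines and other whitespace around the number.
    pid=$(tr -d '[:space:]' < "$pidfile")
    case $pid in
        ''|0|*[!0-9]*)
            echo "invalid"
            return 2
            ;;
    esac
    # Signal 0 delivers nothing; it only checks that the pid exists and that
    # we may signal it. As a non-root user this can also fail with a
    # permission error for a process owned by someone else.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
        return 0
    fi
    echo "not running"
    return 1
}
```

Running pidfile_status /var/run/node/v2.pid immediately before and after 
monit validate shows whether the pid in the file is still alive at the moment 
Monit tests it.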


Regards,
Martin


On 07 Mar 2014, at 10:03, Mehul Ved <address@hidden> wrote:

> Hi,
>   I have a process which I am monitoring with the following rules:
>
> check process v2api with pidfile /var/run/node/v2.pid
>   start program = "/usr/local/bin/nodeinit v2 start"
>     as uid root
>   stop program = "/usr/local/bin/nodeinit v2 stop"
>     as uid root
>   if failed host 127.0.0.1 port 10400 protocol http
>     request /api/v2/ping
>     with timeout 10 seconds
>     then restart
>   if 5 restarts within 10 cycles then alert
>
> Before running monit validate, I checked the pidfile of the process
>
> # grep [0-9]* /var/run/node/
> /var/run/node/v2.pid:31566
>
> # cat /var/run/node/v2.pid
> 31566
>
> I also verified with ps on the process id
>
> # ps aux | grep 31566
> root     31566  0.3  1.7 605684 29820 ?        Sl   05:24   0:01 /usr/local/bin/node /usr/local/share/nodeapis/server.js
>
>
> Now when I run
>
> # monit validate --verbose
> 'v2api' Error testing process id [31566] -- No such process
> 'v2api' process is not running
> Does not exist notification is sent to address@hidden
> 'v2api' trying to restart
> 'v2api' stop: /usr/local/bin/nodeinit
> 'v2api' Error testing process id [31566] -- No such process
> /usr/local/bin/nodeinit: line 80: kill: (31566) - No such process
> Killed v2 process with pid 31566
> monit: pidfile '/var/run/node/v2.pid' does not exist
> monit: pidfile '/var/run/node/v2.pid' does not exist
> 'v2api' start: /usr/local/bin/nodeinit
> monit: pidfile '/var/run/node/v2.pid' does not exist
> v2 has started with PID: 1150
>
> It complains that the process is not running.
>
> I am using the development version of monit that Martin linked to a couple of 
> days back, with websocket support.
>
> # monit --version
> This is Monit version 5.8
> Copyright (C) 2001-2014 Tildeslash Ltd. All Rights Reserved.
>
> --
> To unsubscribe:
> https://lists.nongnu.org/mailman/listinfo/monit-general

