monit-general

Re: [monit] Aborting monit on failure


From: Eric Pailleau
Subject: Re: [monit] Aborting monit on failure
Date: Fri, 07 Aug 2009 14:51:39 +0200
User-agent: Thunderbird 2.0.0.22 (X11/20090625)

Stephan-Frank Henry wrote:
Eric Pailleau wrote:
(Sorry this discussion is not 'monit related')
<snip>

Sounds interesting. Do you know of any online resources (besides the DRBD hp)?
Or do I need to hire you as a consultant? :D

I'm cheap :>)

(back on topic)


File Name             = slony_log
 Path                 = /var/log/slony1/slony1.log
 Monitoring mode      = active
 Regex                = if match "FATAL" 1 times within 1 cycle(s) then exec 
'/home/frank/monit/fail_action.sh' timeout 1 cycle(s)

Remote Host Name      = db_server_01
 Monitoring mode      = active
 Depends on Service   = slony_log
 Port                 = if failed db_server_01:5432 [PGSQL via TCP] with 
timeout 5 seconds 1 times within 1 cycle(s) then exec 
'/home/frank/monit/fail_action.sh' timeout 1 cycle(s) else if passed 1 times 
within 1 cycle(s) then alert

One remark about PostgreSQL monitoring:
It is safer to monitor the PostgreSQL UNIX socket instead of the TCP/IP socket,
because an Ethernet card fails (or the IP stack gets saturated by an attack) more often than 
a disk does.
You can also monitor the TCP/IP socket, but react only when several (at least 2) cycles 
have failed.
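
Something along these lines, for example. This is only a sketch: the pidfile
and socket paths are assumptions (typical Debian locations), and the
unixsocket test only makes sense if monit runs on the same machine as
PostgreSQL:

    # pidfile and socket paths are assumed, adjust to your installation
    check process postgresql with pidfile /var/run/postgresql/8.3-main.pid
      # local UNIX socket: react on the first failure
      if failed unixsocket /var/run/postgresql/.s.PGSQL.5432 protocol pgsql
         then alert
      # TCP/IP socket: react only after 2 failed cycles
      if failed host localhost port 5432 protocol pgsql
         2 times within 2 cycles then alert

I used 'alert' as the action here; replace it with your exec line if you
prefer.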

1. why is it trying to restart the log file ... ?
Because the file does not exist, and that is an error which triggers a start command 
... (if I read the log correctly)

2. Is there an issue with rights? I am running the script that sets everything 
up and starts monit with root rights. Is that enough?
Be careful: monit launches scripts with a minimal environment, so make sure 
the right environment variables are set: use 'env ...'
or source an environment script at the beginning of your script.
(Obviously the script must be chmod +x :>þ )
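
A rough sketch of the 'env' variant, in case it helps. The PATH value is only
an assumption on my side; put in whatever your script really needs:

    check file slony_log with path /var/log/slony1/slony1.log
      # PATH value below is hypothetical, set the variables your script relies on
      if match "FATAL" then exec "/usr/bin/env PATH=/usr/local/bin:/usr/bin:/bin /home/frank/monit/fail_action.sh"

That way the script no longer depends on whatever environment monit happens to
pass along.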

Also be aware that PostgreSQL refuses to start as root (I don't know the content 
of your script, but ...)
Maybe you could launch it as the postgres user?
i.e.: if match "FATAL" 1 times within 1 cycle(s) then exec 'su - postgres -c 
/home/frank/monit/fail_action.sh ' timeout 1 cycle(s)


3. Is there any way to define something like 'if failed exec 'script.sh' then 
unmonitor.
I'm afraid not... but Martin can certainly be more precise.

btw: I just noticed I had 4.8 installed but an upgrade to 4.10 (via 
etch-backports) did not seem to fix the issues.
Would compiling 5.0.3 help?
monit 5.0 and higher have a more interesting config syntax.

Hope this helps.



