monit-general

Re: New user with several major monit problems


From: Martin Pala
Subject: Re: New user with several major monit problems
Date: Sat, 10 Sep 2005 17:03:15 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050802 Debian/1.7.10-1

Jonathan Wheeler wrote:
Martin Pala wrote:


Jonathan Wheeler wrote:


Most annoyingly, on my cluster 'monit -g node1 stop all' (taken
directly from your documentation) kills the *entire* server (see
problem 1)


One more thing - the node shutdown you describe sounds to me like a
watchdog-driven shutdown - do you use heartbeat's watchdog capability
or some other external check which is able to panic the node?


No I don't, nothing fancy at all yet :)

Any thoughts on how I might troubleshoot this further? Syslog itself is
killed, so I don't have any information in the logs at all. The local
console is also kicked out, so even sitting in front of the server
doesn't help.

I think it is either a watchdog or some STONITH method (powering off or power-cycling the machine). You can try, for example, 'lsof | grep watchdog' to see whether the watchdog device is open.
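
If lsof is not installed on the node, roughly the same check can be done by walking /proc directly. The following is only a minimal sketch, assuming a Linux system with /proc mounted, that the watchdog device lives under /dev/watchdog (it may also be /dev/watchdog0 and so on), and that it is run as root so other processes' file descriptors are readable. The function name find_watchdog_holders and its parameters are made up for illustration.

```python
import os


def find_watchdog_holders(proc_root="/proc", device_prefix="/dev/watchdog"):
    """Return a sorted list of (pid, target) pairs for every process that
    has a file descriptor pointing at a path starting with device_prefix.

    Both parameters are illustrative assumptions; adjust them if the
    watchdog device on your system has a different path.
    """
    holders = []
    for entry in os.listdir(proc_root):
        # Only numeric directories under /proc correspond to processes.
        if not entry.isdigit():
            continue
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            # The process exited or we lack permission; skip it.
            continue
        for fd in fds:
            try:
                # Each fd entry is a symlink to the file it refers to.
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target.startswith(device_prefix):
                holders.append((int(entry), target))
    return sorted(holders)


if __name__ == "__main__":
    for pid, target in find_watchdog_holders():
        print(f"PID {pid} has {target} open")
```

If this prints nothing, no process visible to you is holding the watchdog device open, which would point more towards a STONITH mechanism or something in the stop scripts themselves.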

If you can supply your heartbeat and monit configuration and your scripts, as Hauk described, it will be much easier to find the problem.

Martin



