monit-general

Re: monit deadlocks


From: Martin Pala
Subject: Re: monit deadlocks
Date: Thu, 20 Oct 2005 12:46:06 +0200
User-agent: Mozilla Thunderbird 1.0.7 (Windows/20050923)

Please can you:

1.) run monit in debug mode (using -v option)?

2.) attach the monit log output which preceded the deadlock? (in debug mode there will be a lot of information about what monit is actually doing)

3.) describe the steps you took before the deadlock occurred?

A longer strace of the stuck process could also be very useful for finding the cause of the problem.
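
As a rough sketch of what could be collected from the stuck daemon before it is killed, the helper below prints the process state and one line per thread from /proc. The function name `snapshot` is made up for illustration, it assumes a Linux /proc filesystem, and the wchan field may be empty or 0 on some kernels.

```shell
#!/bin/sh
# snapshot PID: hypothetical helper (not part of monit) that prints the
# process state and, for each thread, the kernel wait channel if readable.
snapshot() {
    pid=$1
    if [ ! -d "/proc/$pid" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # Name, scheduler state (e.g. S = sleeping) and thread count
    grep -E '^(Name|State|Threads):' "/proc/$pid/status"
    # One line per thread; wchan may be unavailable, so errors are ignored
    for t in /proc/"$pid"/task/*; do
        printf 'thread %s wchan=%s\n' "${t##*/}" "$(cat "$t/wchan" 2>/dev/null)"
    done
}
```

For example, `snapshot 21284 > monit-snapshot.txt` could be run next to `strace -f -p 21284 -o monit-strace.txt`, and both files attached to the report. The file names are only placeholders.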

To your notes:

- monit uses mutexes for synchronization in many places, so it is not possible to find the cause without more information.

- when you run 'monit status' or 'monit summary', you just start a standalone monit process which tries to connect to the monit daemon and read its state. When the daemon is not responding, both 'status' and 'summary' should return 'unable to connect to the daemon'. If the first 'status' succeeded and the following 'summary' failed, it is possible that the deadlock is related to the first 'status'.
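
Because the client commands depend on the daemon answering, a time-limited probe is one way to tell a hung daemon apart from one that refuses the connection outright. The sketch below is only an illustration: `probe` is a hypothetical helper, and it relies on the coreutils `timeout` command, which exits with status 124 when the time limit is reached.

```shell
#!/bin/sh
# probe SECONDS CMD [ARGS...]: run CMD with a time limit and print a verdict.
#   ok     - CMD exited with status 0
#   hung   - CMD was still running when the limit expired (timeout exits 124)
#   failed - CMD exited with any other non-zero status
probe() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    case $rc in
        0)   echo ok ;;
        124) echo hung ;;
        *)   echo failed ;;
    esac
}
```

Something like `probe 10 monit summary` could then be run from cron, with the time limit chosen to suit the host, and a "hung" or "failed" verdict used as the trigger to save the debug log and strace output before restarting the daemon.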

Thanks,
Martin

Eli Yukelzon wrote:
Good day.
I am using monit 4.6.
It would be rather hard to provide a large trace for this problem,
because it appears under rare and unpredictable conditions.
To elaborate more about the problem:
The main process is stuck in 'futex' call.
The 'monit status' command works, but it reports incorrect
information, i.e. if some service is down, it will not notice it.
The 'monit summary' command crashes with 'unable to connect to daemon' message.
I've attached the monitrc that I am using.

Any help would be really appreciated.

On 10/19/05, Jan-Henrik Haukeland <address@hidden> wrote:

Which version of monit are you using? Earlier 4.x versions (or was it
3.x?) had some problems with thread locking in some special cases. If
you are not using the latest 4.6 release, try upgrading to that
version and see if that solves the problem. If not, could you please
provide us with a longer trace (the stack address is not much help),
elaborate a bit more on the problem (log output), and provide
us with your monitrc file.

Regards


On 19. okt. 2005, at 20.00, Eli Yukelzon wrote:


Good day.

I've been using monit for administrating my server for quite a while,
and I've been very pleased with its performance.
Lately though I've come across a recurring event which will probably
cause me to switch away from using monit...
The monit daemon gets stuck. It enters some deadlock, according to
strace:

# strace -f -F -p 21284
Process 21284 attached - interrupt to quit
futex(0xb7e42800, FUTEX_WAIT, 2, NULL <unfinished ...>
Process 21284 detached
# monit summary
monit: cannot read status from the monit daemon

After killall -9 monit
it restarts from inittab just fine.
The problem is that because of this deadlock, it stops monitoring the
services!

Any ideas?





