monit-general

problem with cpu user on linux, 4.8.2


From: Aleksander
Subject: problem with cpu user on linux, 4.8.2
Date: Fri, 01 Dec 2006 13:32:00 +0200
User-agent: Thunderbird 1.5.0.7 (X11/20060913)

Hi,

I think I encountered a bug. I upgraded from 4.8.1 to 4.8.2 on November 17th and didn't have any problems until today.

CPU usage alerts are defined like this:

CPU wait limit If greater than 20.0% 3 times within 3 cycle(s) then alert else if passed 1 times within 1 cycle(s) then alert

CPU system limit If greater than 30.0% 3 times within 3 cycle(s) then alert else if passed 1 times within 1 cycle(s) then alert

CPU user limit If greater than 92.0% 3 times within 3 cycle(s) then alert else if passed 1 times within 1 cycle(s) then alert

"alert" sends mails.
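
For reference, here is a minimal sketch of how I read the "3 times within 3 cycle(s)" rule: the alert should fire only when the value exceeded the limit in 3 of the last 3 polling cycles. This is my assumption about the semantics, not monit's actual code, and the class name `CycleRule` is made up for illustration.

```python
from collections import deque


class CycleRule:
    """Hypothetical model of an 'X times within Y cycles' rule."""

    def __init__(self, limit, times, cycles):
        self.limit = limit
        self.times = times
        # Only the outcome of the last `cycles` polls is remembered.
        self.window = deque(maxlen=cycles)

    def update(self, value):
        """Record one poll; return True if the rule fires on this cycle."""
        self.window.append(value > self.limit)
        return sum(self.window) >= self.times


# One sample per cycle: a single dip to 20% should delay the alert.
rule = CycleRule(limit=92.0, times=3, cycles=3)
results = [rule.update(v) for v in (93.0, 20.0, 93.0, 94.0, 95.0)]
```

Under this reading, brief spikes above 92% that I might have missed in top should not be enough to trigger the alert; it needs three consecutive readings above the limit.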

monit is running from init on SLES9 on an Informix database server. The box is an ML350G3 server with two Intel(R) Xeon(TM) CPU 2.80GHz processors (4 cores).

The monit interval is set to 55 seconds.

Last night monit complained that the cpu user limit had been reached, then once again this morning, then at noon, and after that it started happening every five minutes. The "limit passed" alert usually followed within one or two cycles.

At noon I was closely watching top. User cpu usage did not even reach 90%, except maybe once or twice for a very brief moment over a longer period, and definitely not 3 times at a ~55 second interval. Monit was telling me that cpu user was at 93% when it really was, and had been for several minutes, only ~20%.
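
To cross-check the numbers independently of both top and monit, user cpu can be computed from two snapshots of the aggregate `cpu` line in `/proc/stat`. The following is only a sketch of that calculation, not how monit does it; the function names are hypothetical, and it assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into integer counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def user_percent(before, after):
    """User cpu share (in percent) between two counter snapshots.

    Only the first eight counters are summed, so that guest time, which
    newer kernels already include in 'user', is not counted twice.
    Nice time is kept separate from user time, as top does.
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[0] / total


def read_cpu(path="/proc/stat"):
    """Read the current aggregate counters (Linux only)."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


def sample_user_percent(interval=55):
    """Measure user cpu over one interval, e.g. one monit cycle."""
    before = read_cpu()
    time.sleep(interval)
    return user_percent(before, read_cpu())
```

Calling `sample_user_percent(55)` on the affected box while monit reports 93% would show whether the kernel counters agree with top or with monit.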

I also noticed that when monit told me the limit had passed, it showed the current cpu user value as 92.0%, exactly my limit. This just doesn't seem real; something is wrong. I tried reloading monit, but that didn't help. I have to confess I didn't try restarting it (issuing a kill and letting init restart it); I forgot to try that.

I noticed in the changelog for 4.8.2:

        * Fixed cpu usage statistics on Linux.

I think this change caused the problem for me. I never had this kind of issue with 4.8.1, so I commented out the fsflags alerts and downgraded to 4.8.1. I haven't had any problems for the last half hour.

I don't know why this happened now, after half a month of usage. I haven't (yet) had this problem on an almost identical machine, another HP ML350.

Any ideas?

Thanks for reading the long story,
        Alex



