monit-general

Re: Monitoring the wrong thing?


From: Marc Pinnell
Subject: Re: Monitoring the wrong thing?
Date: Fri, 06 Aug 2010 09:32:27 -0700

So one cycle = the defined polling interval (2 minutes in my case), correct?

Marc

On 8/6/10 2:44 AM, Martin Pala wrote:
The configuration is OK, but the limits should be modified. It seems
that your system has CPU usage spikes which last for 2+ cycles, and
these trigger the alert. The limits should be set so that you get an
alert only if the state is abnormal/pathological. What is normal or
abnormal is specific to each system - you can watch the load and then
set the limits accordingly ... for example, raise the cpu (user) usage
limit to 90% for 10 cycles.

Regards,
Martin
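
A minimal sketch of how the suggestion above might look in the control
file. It assumes the 2-minute interval Marc describes is configured with
"set daemon 120"; only the cpu (user) line is changed, using the 90% and
10-cycle figures from Martin's example. The other limits are copied from
Marc's original config, and the right values depend on what load is
normal for the machine.

```
# Poll every 120 seconds; one "cycle" is one poll (assumed setting).
set daemon 120

check system 1027mail
    if loadavg (1min) > 4 then alert
    if loadavg (5min) > 2 then alert
    if memory usage > 75% then alert
    # Alert only when user CPU stays above 90% for 10 consecutive
    # cycles, i.e. roughly 20 minutes at a 120-second interval.
    if cpu usage (user) > 90% for 10 cycles then alert
    if cpu usage (system) > 30% for 2 cycles then alert
```

With these values a short spike during a burst of web requests no longer
matches; the condition has to hold on every poll across the whole window.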


On 08/04/2010 09:30 PM, Marc Pinnell wrote:
Finally got Monit going this morning on my webserver (daemon mode, 2 min
interval). Since then I have been getting a couple of warnings an hour
about high CPU loads. Am I monitoring the wrong thing? (I don't totally
understand the UNIX terminology about loads.) Here is my config:

check system 1027mail
    if loadavg (1min) > 4 then alert
    if loadavg (5min) > 2 then alert
    if memory usage > 75% then alert
    if cpu usage (user) > 70% for 2 cycles then alert
    if cpu usage (system) > 30% for 2 cycles then alert

and a warning I just received:


Begin forwarded message:

Resource limit matched Service 1027mail

Date: Wed, 04 Aug 2010 15:24:13 -0400
Action: alert
Host: 1027mail
Description: cpu user usage of 80.8% matches resource limit [cpu user
usage>70.0%]


and then two minutes later:

Resource limit succeeded Service 1027mail

Date: Wed, 04 Aug 2010 15:26:19 -0400
Action: alert
Host: 1027mail
Description: '1027mail' cpu user usage check succeeded [current cpu
user usage=0.0%]


This is the way they all go so far. Seems like if it happens to check
at the very moment a web request comes in (which obviously happens on
a regular basis!), it trips the warning.

Suggestions?

Marc

