From: Jamie Burchell
Subject: RE: loadavg figures - percentages or not?
Date: Tue, 9 Apr 2019 15:29:20 +0100
Hi
Thanks for your time and help. I’ve made some changes to my configuration (which we are able to provision with Ansible) and am experimenting with the results:
if loadavg (1min) > {{ ansible_processor_count * 2 }} for 4 cycles then alert
if loadavg (5min) > {{ ansible_processor_count * 1.5 }} for 4 cycles then alert
if loadavg (15min) > {{ ansible_processor_count }} for 4 cycles then alert
Kind regards,
From: monit-general [mailto:monit-general-bounces+jamie=address@hidden] On Behalf Of address@hidden
Hi,
The loadavg is not a percentage. As the manual states, it is an absolute value: the number of processes in the run queue. The practical limit depends on the number of CPUs and the typical load pattern; a rule of thumb we use is 2 processes per CPU core. If the machine has, for example, 48 cores, a loadavg of 96 is usually acceptable. Spikes are also common, and you may want to suppress false alerts: the example shows a setup where high loadavg values are needed for several consecutive cycles before the alert is triggered.
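To illustrate the rule of thumb and the consecutive-cycle idea, here is a minimal Python sketch. It is not how monit implements the test; the function name `make_load_alert` and its parameters are made up for this example, and the readings are invented sample values.

```python
from collections import deque


def make_load_alert(cores, per_core_limit=2.0, cycles=4):
    """Build a checker that fires only after `cycles` consecutive high readings.

    Hypothetical helper for illustration; the 2.0 default mirrors the
    "2 processes per CPU core" rule of thumb from the message.
    """
    threshold = cores * per_core_limit   # e.g. 48 cores * 2 = 96
    recent = deque(maxlen=cycles)        # sliding window of the last N results

    def check(loadavg):
        recent.append(loadavg > threshold)
        # Alert only when the window is full and every reading was high.
        return len(recent) == cycles and all(recent)

    return check


check = make_load_alert(cores=48)
readings = [120, 90, 100, 101, 99, 97, 50]   # invented sample values
alerts = [check(r) for r in readings]
print(alerts)
```

With a 48-core threshold of 96, the single dip to 90 resets the run, so only the sixth reading (the fourth consecutive value above 96) raises an alert.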
To make the configuration easier, I think we can introduce some kind of "per CPU core" load average test, so that the configuration works the same regardless of the number of CPU cores, something like:
if loadavg(1m) per core > 1.9 then alert
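The arithmetic behind such a per-core test could look like the following Python sketch. The syntax above is only a proposal in this message, so this shows the normalisation idea rather than any monit behaviour; the helper names are hypothetical, and `current_per_core` assumes a Unix-like system where `os.getloadavg()` is available.

```python
import os


def loadavg_per_core(loadavg, cores):
    """Normalise an absolute load average by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return loadavg / cores


def exceeds_per_core(loadavg, cores, limit=1.9):
    """True when the per-core load is above `limit` (1.9 as in the proposal)."""
    return loadavg_per_core(loadavg, cores) > limit


def current_per_core():
    """Read the live 1-minute load average and normalise it (Unix only)."""
    one_min, _five_min, _fifteen_min = os.getloadavg()
    return loadavg_per_core(one_min, os.cpu_count() or 1)


print(exceeds_per_core(96, 48))   # 96 / 48 = 2.0, above 1.9
print(exceeds_per_core(7.0, 4))   # 7 / 4 = 1.75, below 1.9
```

Because the limit is expressed per core, the same value of 1.9 applies unchanged to a 4-core and a 48-core machine.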
Best regards, Martin