monit-general

Re: Memory usage for process


From: Andrea Pagani
Subject: Re: Memory usage for process
Date: Thu, 14 Oct 2004 18:07:32 +0200
User-agent: Mozilla/5.0 (Windows; U; WinNT4.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)

My configure.log reports the following:

...
configure:9519: checking for resource support
configure:9540: result: enabled
...

and running the configure phase again printed 'checking for resource support... enabled'.

Actually, looking at check_process(Service_T s) in validate.c, it seems that it never enters check_process_resources(), although it does pass through check_process_state(s). The relevant piece of code is as follows:

 if(Run.doprocess) {
   if(update_process_data(s, ptree, ptreesize, pid)) {
     check_process_state(s);
     for(pr= s->resourcelist; pr; pr= pr->next) {
       check_process_resources(s, pr);
     }
   } else {
     log("'%s' failed to get service data\n", s->name);
   }
 }

As far as I understand, this happens when 's->resourcelist' is NULL, but can you tell me under which conditions it is NULL?
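To make the observation concrete, here is a minimal sketch of the same traversal pattern. The `Resource` and `Service` structs and `count_resource_checks()` are hypothetical stand-ins, not monit's real definitions; only the shape of the `for` loop is taken from the snippet quoted above. It shows that a NULL list head means the loop body runs zero times, which would match check_process_resources() never being reached. Presumably the list stays empty when no resource tests are attached to the service, but that is an assumption, not something confirmed in this thread.

```c
#include <stddef.h>

/* Hypothetical stand-ins for monit's types; only the `next` link matters. */
typedef struct resource {
    int id;
    struct resource *next;
} Resource;

typedef struct service {
    const char *name;
    Resource *resourcelist;   /* NULL when the list is empty */
} Service;

/* Counts how many times the loop body would run, using the same
 * for(pr= s->resourcelist; pr; pr= pr->next) traversal as the quoted code. */
static int count_resource_checks(const Service *s) {
    int calls = 0;
    for (const Resource *pr = s->resourcelist; pr; pr = pr->next) {
        calls++;              /* stands in for check_process_resources(s, pr) */
    }
    return calls;
}
```

With a NULL `resourcelist` the function returns 0; with two linked nodes it returns 2. So the loop itself behaves correctly, and the question becomes why the list is empty for the service in question.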

Best regards.
Andrea

Martin Pala wrote:

I have tried to replicate the problem, but it works on my computer without problems. I tried it on Debian with kernels 2.6.8.1 and 2.4.27, single CPU, 512MB RAM.

In your logs I can see only one strange thing: there is no CPU usage entry for the monitored process 'heartbeat'. In my case, monit reports this statistic, for example:

[CEST Oct 13 23:01:46] 'slapd' zombie check passed [status_flag=0000]
[CEST Oct 13 23:01:46] 'slapd' cpu usage check passed [current cpu usage=0.0%]
[CEST Oct 13 23:01:46] 'slapd' succeeded connecting to INET[127.0.0.1:389]
[CEST Oct 13 23:01:46] 'slapd' succeeded testing protocol [LDAP3] at INET[127.0.0.1:389]


I.e. it may be that your monit was compiled without resource monitoring support. Check that the output of your configure phase contains the following line near the end:

         resource monitoring: enabled

... and that you are not using the --without-resource option.

Cheers,
Martin

Andrea Pagani wrote:

I ran monit in debug mode but couldn't see any error, and I still got the same strange values for the processes' memory size (system memory usage, which was one of the problems solved in the official 4.4 version, seems OK).

I even tried following Andreas Rust's advice and installed 4.4-beta5, but I observed the same behaviour.

Best regards.
Andrea

PS: Below is a piece of the logfile; even after activating the other services I saw no errors.

-----------------

Oct  8 18:31:02 ic2sn004 monit: monit: Debug: Adding credentials for user 'admin'.
Oct  8 18:31:02 ic2sn004 monit[25083]: Starting monit daemon with http interface at [*:2812]
Oct  8 18:31:02 ic2sn004 monit[25083]: Starting monit HTTP server at [*:2812]
Oct  8 18:31:02 ic2sn004 monit[25083]: monit HTTP server started
Oct  8 18:31:02 ic2sn004 monit[25083]: 'system' load average [0.00][0.00][0.00]
Oct  8 18:31:02 ic2sn004 monit[25083]: 'system' memory usage 2.6% [158144 kB]
Oct  8 18:31:02 ic2sn004 monit[25083]: 'system' cpu usage 0.0%us 0.0%sy 0.0%wa
Oct  8 18:31:02 ic2sn004 monit[25083]: 'heartbeat' process is not running
Oct  8 18:31:02 ic2sn004 monit[25083]: 'heartbeat' trying to restart
Oct  8 18:31:02 ic2sn004 monit[25083]: Monitoring disabled -- service heartbeat
Oct  8 18:31:02 ic2sn004 monit[25083]: 'heartbeat' start: /etc/init.d/heartbeat
Oct  8 18:31:02 ic2sn004 monit[25083]: Monitoring enabled -- service heartbeat
Oct  8 18:31:02 ic2sn004 heartbeat[25135]: info: **************************
Oct  8 18:31:02 ic2sn004 heartbeat[25135]: info: Configuration validated. Starting heartbeat 1.3.0
Oct  8 18:31:02 ic2sn004 heartbeat[25136]: info: heartbeat: version 1.3.0
Oct  8 18:31:02 ic2sn004 heartbeat[25136]: info: Heartbeat generation: 22
Oct  8 18:31:02 ic2sn004 heartbeat[25136]: info: UDP Broadcast heartbeat started on port 694 (694) interface eth1
Oct  8 18:31:02 ic2sn004 heartbeat[25136]: info: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Oct  8 18:31:02 ic2sn004 heartbeat[25136]: info: pid 25136 locked in memory.
Oct  8 18:31:02 ic2sn004 heartbeat[25136]: info: Local status now set to: 'up'
Oct  8 18:31:03 ic2sn004 heartbeat[25142]: info: pid 25142 locked in memory.
Oct  8 18:31:03 ic2sn004 heartbeat[25139]: info: pid 25139 locked in memory.
Oct  8 18:31:03 ic2sn004 heartbeat[25141]: info: pid 25141 locked in memory.
Oct  8 18:31:03 ic2sn004 heartbeat[25140]: info: pid 25140 locked in memory.
Oct  8 18:31:03 ic2sn004 heartbeat[25143]: info: pid 25143 locked in memory.
Oct  8 18:31:03 ic2sn004 heartbeat[25136]: info: Link ic2sn004:eth0 up.
Oct  8 18:31:03 ic2sn004 heartbeat[25136]: info: Link ic2sn004:eth1 up.
Oct  8 18:31:12 ic2sn004 monit[25083]: 'system' load average [0.00][0.00][0.00]
Oct  8 18:31:12 ic2sn004 monit[25083]: 'system' memory usage 2.6% [158980 kB]
Oct  8 18:31:12 ic2sn004 monit[25083]: 'system' cpu usage 0.0%us 0.9%sy 15.9%wa
Oct  8 18:31:12 ic2sn004 monit[25083]: 'heartbeat' process is running with pid 25136
Oct  8 18:31:12 ic2sn004 monit[25083]: 'heartbeat' zombie check passed [status_flag=0000]
Oct  8 18:31:22 ic2sn004 monit[25083]: 'system' load average [0.00][0.00][0.00]
Oct  8 18:31:22 ic2sn004 monit[25083]: 'system' memory usage 2.6% [158980 kB]
Oct  8 18:31:22 ic2sn004 monit[25083]: 'system' cpu usage 0.0%us 0.0%sy 1.9%wa
Oct  8 18:31:22 ic2sn004 monit[25083]: 'heartbeat' zombie check passed [status_flag=0000]


address@hidden wrote:

Date: Thu, 07 Oct 2004 20:22:22 +0200
From: Martin Pala <address@hidden>
Subject: Re: Memory usage for process
To: This is the general mailing list for monit
    <address@hidden>
Message-ID: <address@hidden>
Content-Type: text/plain; charset=us-ascii; format=flowed

Are there some error messages in monit log? (please run monit in debug mode)

Thanks,
Martin

Andrea Pagani wrote:
Hello,

I'm running monit version 4.4 on RedHat Linux AS 3.0 (kernel
2.4.21-15.ELsmp) on an HP ProLiant DL380, and I've noticed
that at runtime many monit services of type process
show "Memory" values that make no sense, as shown below:

Process 'heartbeat'
status                            running
monitoring status                 monitored
pid                               16385
parent pid                        1
uptime                            1d 19h 23m
childrens                         5
memory kilobytes                  -1112
memory kilobytes total            -7612
memory percent                    72401.9%
memory percent total              72401.8%
cpu percent                       0.0%
cpu percent total                 0.0%
data collected                    Wed Oct  6 14:16:36 2004

Process 'postgresql'
status                            running
monitoring status                 monitored
pid                               27386
parent pid                        1
uptime                            2h 11m
childrens                         4
memory kilobytes                  -1460
memory kilobytes total            -5376
memory percent                    72401.9%
memory percent total              72401.9%
cpu percent                       0.0%
cpu percent total                 0.0%
port response time                0.000s to localhost:5000 [DEFAULT]
data collected                    Wed Oct  6 14:16:36 2004

Process 'as'
status                            running
monitoring status                 monitored
pid                               27370
parent pid                        1
uptime                            2h 11m
childrens                         1
memory kilobytes                  -512
memory kilobytes total            372
memory percent                    72401.9%
memory percent total              0.0%
cpu percent                       0.0%
cpu percent total                 0.0%
data collected                    Wed Oct  6 14:16:36 2004

Process 'as_jvm'
status                            running
monitoring status                 monitored
pid                               27392
parent pid                        27370
uptime                            1h 44m
childrens                         0
memory kilobytes                  884
memory kilobytes total            884
memory percent                    0.0%
memory percent total              0.0%
cpu percent                       0.0%
cpu percent total                 0.0%
data collected                    Wed Oct  6 14:16:36 2004

What's the matter with them? Is there something wrong with my
configuration, or is the memory usage being computed incorrectly?

Best regards.
Andrea







--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general





--
Andrea Pagani

Italtel S.p.A. - Technology & Product Development Unit
BUPR-TPD-ASPD Castelletto C18 - 20019 Settimo Milanese (MI) - ITALY
phone: +39.2.4388.3469, fax: +39.2.4388.8431, e-mail: address@hidden








