Re: System monitoring
From: Pjotr Prins
Subject: Re: System monitoring
Date: Sun, 29 Dec 2019 16:07:41 -0600
User-agent: NeoMutt/20170113 (1.7.2)
On Sun, Dec 29, 2019 at 09:05:40PM +0100, Nicolò Balzarotti wrote:
> I think zabbix should work, but I've never used it. On the surface, it
> seems to have a steep learning curve, but this is just my impression.
The problem with these systems is that they target (complex)
deployments that have people watching these systems.
What I need is much simpler - I don't want to watch systems, but I
need a cursory idea of the health of, say, 20-40 machines out there. I
also want something that can notify me if things go really wrong, for
example when backups fail. These are not massive requirements - just
something flexible! I used to have scripts for that, which would
mail/text me. But that was all a bit ad hoc, and I got tired of
maintaining them and of getting repeated notifications ;)
What would be really cool is to be able to use logic programming. It
would allow questions like:
  What services showed interruptions in the last month on low-RAM
  machines that also ran guix < 1.0 and a specific version of nginx?
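To make that question concrete, here is a rough sketch of it as a query over stored facts. It uses plain Python set logic rather than miniKanren, and every name, field, threshold and sample value in it (`Machine`, `Interruption`, `max_ram_mb`, the hosts "alpha"/"beta"/"gamma") is made up for illustration:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Machine:              # hypothetical record of one machine's state
    name: str
    ram_mb: int
    guix_version: tuple     # e.g. (0, 16, 0)
    nginx_version: str


@dataclass(frozen=True)
class Interruption:         # hypothetical record of one service outage
    machine: str
    service: str
    at: datetime


def interrupted_services(machines, interruptions, now, *,
                         max_ram_mb, nginx_version, days=30):
    """Services interrupted in the last `days` days on low-RAM machines
    that run guix < 1.0 and the given nginx version."""
    since = now - timedelta(days=days)
    matching = {
        m.name for m in machines
        if m.ram_mb <= max_ram_mb
        and m.guix_version < (1, 0)          # tuple comparison
        and m.nginx_version == nginx_version
    }
    return sorted({
        (i.machine, i.service) for i in interruptions
        if i.machine in matching and since <= i.at <= now
    })


# Made-up sample data, only to exercise the query.
NOW = datetime(2019, 12, 29)
MACHINES = [
    Machine("alpha", 512, (0, 16, 0), "1.17.6"),
    Machine("beta", 8192, (0, 16, 0), "1.17.6"),   # not low on RAM
    Machine("gamma", 512, (1, 0, 1), "1.17.6"),    # guix >= 1.0
]
INTERRUPTIONS = [
    Interruption("alpha", "backup", datetime(2019, 12, 20)),
    Interruption("alpha", "nginx", datetime(2019, 10, 1)),  # too old
    Interruption("beta", "backup", datetime(2019, 12, 21)),
    Interruption("gamma", "sshd", datetime(2019, 12, 22)),
]
RESULT = interrupted_services(MACHINES, INTERRUPTIONS, NOW,
                              max_ram_mb=1024, nginx_version="1.17.6")
print(RESULT)
```

A relational engine would let the same facts be queried in any direction without writing a new function each time, which is the appeal of the logic programming approach.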
This would mean storing state of machines in a database that gets
updated by messages. It means a good message broker. It means that
every time you write a monitoring service, you'll also have to write a
receiver that turns its messages into a data structure that something
like miniKanren can solve. The key is to make *creating* such small
reporter/receiver tools really easy.
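A minimal sketch of such a reporter/receiver pair, assuming JSON messages and an SQLite table of (machine, attribute, value) facts. The message broker is left out (a plain list stands in for it), and the message layout and all names (`report`, `receiver`, `handle`, the "backup" and "memory" kinds) are hypothetical:

```python
import json
import sqlite3

RECEIVERS = {}  # message kind -> function turning a payload into facts


def receiver(kind):
    """Register a function that converts one kind of message into facts."""
    def register(fn):
        RECEIVERS[kind] = fn
        return fn
    return register


def report(kind, machine, **payload):
    """Reporter side: serialise one observation as a JSON message."""
    return json.dumps({"kind": kind, "machine": machine, "payload": payload})


@receiver("backup")
def backup_facts(machine, payload):
    yield (machine, "backup-ok", str(payload["ok"]))


@receiver("memory")
def memory_facts(machine, payload):
    yield (machine, "ram-mb", str(payload["total_mb"]))


def open_store():
    """Create an in-memory fact store; one row per (machine, attribute)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE facts (machine TEXT, attribute TEXT, value TEXT, "
        "PRIMARY KEY (machine, attribute))"
    )
    return db


def handle(db, message):
    """Receiver side: decode a message and update the stored state."""
    msg = json.loads(message)
    to_facts = RECEIVERS[msg["kind"]]  # raises KeyError for unknown kinds
    with db:  # commit on success, roll back on error
        db.executemany(
            "INSERT OR REPLACE INTO facts VALUES (?, ?, ?)",
            to_facts(msg["machine"], msg["payload"]),
        )


db = open_store()
for message in [
    report("memory", "alpha", total_mb=512),
    report("backup", "alpha", ok=True),
    report("backup", "alpha", ok=False),  # newer state replaces the old
]:
    handle(db, message)

FACTS = db.execute(
    "SELECT machine, attribute, value FROM facts ORDER BY attribute"
).fetchall()
print(FACTS)
```

With this shape, adding a new monitor only takes one `report(...)` call on the machine and one small `@receiver` function on the collecting side.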
Visualisations are less important - though I am sure some people enjoy
creating those.
I.e., what I have in mind is a different type of systems monitor: a
minimalistic system that is hackable, works out of the box for
Guix systems, and is really easy to extend.
I think if we can prototype something in the coming months it would
make a great GSoC project to build out functionality.
Pj.