genDevConfig migrating to Shinken/Nagios

genDevConfig is being converted to output its configuration in the Shinken/Nagios format.

Nagios does not have a true discovery framework that generates configuration trees. It is more of a piecemeal affair, where each acquisition check is responsible for knowing what it gets and how. Only basic information is passed from Nagios to the check. As you can imagine, for SNMP polling this is a very inefficient method: call an interpreter, load a bundle of modules, execute a tiny script, exit and repeat. Every check pays the process startup cost, whether it is a compiled C binary or a script. Ouch.

Some innovation has happened in the check_mk plugin, which does a good job of centralizing all SNMP polling and data normalization before sending the data back as a passive check response. Another way of scaling SNMP checks is to use collectd for the polling and data normalization and just have Nagios check for updated values. Meh. Fortunately, the collectd guys are supposed to be working on converting this to send data back as passive check responses.
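For reference, a passive check response is just a line written to the Nagios external command file. Here is a minimal sketch of formatting one, assuming the standard PROCESS_SERVICE_CHECK_RESULT external command. The function name, host name and service name are made up for illustration, and this is not how check_mk or collectd actually implement it.

```python
import time


def passive_service_result(host, service, return_code, output, timestamp=None):
    """Format one external command line for a passive service check result.

    return_code follows the usual plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return "[{0}] PROCESS_SERVICE_CHECK_RESULT;{1};{2};{3};{4}\n".format(
        timestamp, host, service, return_code, output
    )


# Hypothetical host and service, fixed timestamp so the result is reproducible.
line = passive_service_result(
    "switch1", "if-in-1", 0, "OK - 1234 octets", timestamp=1300000000
)
```

A poller that has already collected and normalized many values can emit one such line per service, instead of having Nagios fork one check per value.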

Nagios is a real monster to hack. As luck would have it, a pair of professional Nagios admins, book authors and developers took the beast by the horns. They created Shinken, a Nagios-compatible rewrite in Python. It offers a well-thought-out design with scalability, modularity and worlds of flexibility.

Shinken offers distributed poller daemons that can load modules. These modules make it easy to extend its feature set and scope. The aim is to create a Python-based SNMP poller module that can execute SNMP checks in a logical and efficient manner using the native Python SNMP library (pySNMP), which supports SNMP bulkwalk and bulkget. The library has its drawbacks, but it does not require installing Net-SNMP and it is actively developed.
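To illustrate what "logical and efficient" could mean here, the sketch below groups individual service checks by host, so that each device is queried with a few bulk requests instead of one process per check. This is not the actual poller module: the function name, the tuple layout and the batch size are assumptions, and no SNMP request is made.

```python
from collections import defaultdict


def plan_bulk_requests(checks, max_oids_per_request=20):
    """Group (host, service, oid) checks into per-host batches of unique OIDs."""
    oids_by_host = defaultdict(list)
    for host, _service, oid in checks:
        # Several services may share one OID; request it only once per host.
        if oid not in oids_by_host[host]:
            oids_by_host[host].append(oid)

    plan = {}
    for host, oids in oids_by_host.items():
        # Split each host's OID list into batches of bounded size.
        plan[host] = [
            oids[i:i + max_oids_per_request]
            for i in range(0, len(oids), max_oids_per_request)
        ]
    return plan


# Hypothetical hosts and services; the OIDs are ifInOctets.1, ifOutOctets.1
# and sysUpTime.0 from the standard MIB-II tree.
checks = [
    ("switch1", "if-in-1", "1.3.6.1.2.1.2.2.1.10.1"),
    ("switch1", "if-out-1", "1.3.6.1.2.1.2.2.1.16.1"),
    ("switch1", "if-util-1", "1.3.6.1.2.1.2.2.1.10.1"),  # reuses ifInOctets.1
    ("router1", "uptime", "1.3.6.1.2.1.1.3.0"),
]
plan = plan_bulk_requests(checks, max_oids_per_request=2)
```

Four service checks collapse into two requests here, one per device. The real gain comes from devices with hundreds of interfaces.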

The distributed nature of the scheduling and polling in Shinken, the bulkwalk efficiency and the simpler interface offset the slower performance of pySNMP. Its error handling and its corner cases for tables do suffer from some inconsistencies, so these need to be accounted for. Performance is estimated at around 5000 checks per second with high CPU usage using the Python module, while Net-SNMP manages over 10000 checks per second with low CPU usage.
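To put those estimates in perspective, here is a back-of-the-envelope calculation of how many pollers a given load would need. It assumes throughput scales linearly across pollers, which is a simplification. The check count and polling interval are made-up example values; only the two per-second rates come from the estimates above.

```python
import math


def pollers_needed(total_checks, interval_s, checks_per_second):
    """Pollers required to run total_checks once every interval_s seconds."""
    required_rate = total_checks / interval_s  # checks per second overall
    return math.ceil(required_rate / checks_per_second)


# Hypothetical load: 3 million checks on a 5-minute (300 s) interval.
with_pysnmp = pollers_needed(3000000, 300, 5000)    # estimated pySNMP rate
with_netsnmp = pollers_needed(3000000, 300, 10000)  # estimated Net-SNMP rate
```

With these numbers the pySNMP route needs two pollers where Net-SNMP needs one. That is the kind of gap the distributed design is meant to absorb.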

So there you have it. There may eventually be an alternative for Cricket users who are used to the automation offered by genDevConfig and the simplicity of the Cricket config-tree, and who are looking for data gathering and monitoring in a single core.

More details will be provided when the beta version is released. I expect the configuration output to go through a few iterations to find the right mix between what is created by genDevConfig and what is included in the templates. As most of you will note, Cricket’s and Nagios’ configuration inheritance models are not exactly the same.