Monday, May 11, 2009

Icinga musings

So I just heard about the Icinga fork of Nagios. Looking at the motivation, I can't say that I disagree with the fork. Obviously there is a standard set of "things" that most people add to Nagios. Some of those "things" feel like they should be standard; others are just fluff.

Looking over the Icinga project team and the project goals, we can get a feel for the things that they want to add on:

  • PNP
  • NagVis
  • Grapher
  • NagTrap
  • more NDO

I can get behind most of those. I think they're all wonderful addons. However, I have some reservations about one item that isn't on that list. This is from the Icinga page:

"The most significant modification and difference to Nagios is a completely new web interface based on PHP. This allows a wider circle of developers to contribute to the web interface and easier adjustments to be made by users to their individual environments."

I honestly just cannot get behind that in principle. I fully understand the concerns and problems people have. The Nagios interface is "ugly" in a sense, but change for change's sake is just silly.

Why PHP? Why not Perl (see Groundwork) or Ruby or Python? It's an arbitrary decision. I like my Nagios installations slim and doing what they do best: monitoring and alerting. You don't even HAVE to have a web interface. I don't want to have to bog down my monitoring server with YAP (yet another package). For all its warts, I like the way the Nagios web interface works. There's nothing wrong with CGI scripts. They work. The Nagios CGIs work.

Was it a bitch to deal with them? Of course, but moving to PHP isn't going to immediately make it better unless there's a framework or an API or standards to work against. I have full faith in the Icinga team to make an outstanding interface, but I'm wondering what sort of process is going to be in place to make sure the interface is "stable". One thing that can be argued in favor of the current setup is that it's not at the whims of a constantly changing language like PHP.*

I guess my feeling is that the Icinga folks want to make something MORE of Nagios. Make it more than what it is at the core: network monitoring. There's a valid argument to be made that an "enterprise" monitoring system should have an SNMP trap handler, but I personally don't think snmptt is the way to go. If it's that important, it should be something NOT written in a scripting language. If handling traps is of the utmost importance, it should be able to handle whatever volume of traps per second you throw at it. I can't find any performance numbers for snmptt, so I can't tell you whether it does.

I think the biggest problem I've had with Nagios is that it isn't modular enough. It lacks something we've all come to appreciate these days: the concept of plugins. Admittedly, it's one guy. If he doesn't see a need for it, then we probably won't ever see it. Nagios really needs a standard way for people to plug in to it. Right now we have bolt-on solutions that never REALLY feel integrated. Maybe that's what Icinga wants to do. I can appreciate, however, the leanness that Nagios has had for this long. Maybe times have changed and monitoring doesn't just encompass monitoring anymore. I don't know, but in my mind, monitoring is still a distinct entity from trending. They go hand in hand, but Nagios has never billed itself as an all-in-one monitoring and trending solution. It monitors, and it alerts. Occasionally it "event handles", but long-term storage and analysis of the data is out of scope.

Anyway, much of this has been a ramble based on first blush. I'm sure I'll have more to say. I'll follow the project closely and see what it does. I fully expect a lot of people to switch over just for the "completeness" and "aesthetic" factor. Groundwork has clients after all. The demand is there. However, I'm just not sure if I'll make the switch myself.

Maybe the whole thing will prompt Ethan to respond in a positive way and make my wish list come true ;)

- API into the monitoring system
- Native support for RRD storage of perfdata information

Those are my two biggest. I would LOVE to have an API into the live core of the engine to make changes to resources. One thing that I loved about Groundwork (I think it was Groundwork) was that it had a command-line API for adding and removing hosts. I'm really hoping that in the end, we end up with Nagios as a framework with its own basic functionality but that better allows the design of solutions built on top of it. Want to build your own interface? Pull a list of hosts from the API. Pull a list of last known states for each host. Display it.
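
To make that last idea concrete, here is a rough Python sketch of what "pull the hosts, pull the states, display it" could look like. Everything in it is hypothetical: Nagios has no such API as far as I know, so `MonitoringCore`, `add_host`, `remove_host`, `record_check`, `list_hosts`, `last_state` and `render_status` are all made-up names, and an in-memory dictionary stands in for the live core of the engine.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical sketch: none of these names exist in Nagios or Icinga.
# The state names below are illustrative, not taken from any real API.
VALID_STATES = {"PENDING", "UP", "DOWN", "UNREACHABLE"}


@dataclass
class MonitoringCore:
    """In-memory stand-in for a monitoring engine that exposes an API."""
    _states: Dict[str, str] = field(default_factory=dict)

    def add_host(self, name: str) -> None:
        # A new host has no check result yet, so it starts as PENDING.
        self._states.setdefault(name, "PENDING")

    def remove_host(self, name: str) -> None:
        # Removing an unknown host is a no-op.
        self._states.pop(name, None)

    def record_check(self, name: str, state: str) -> None:
        # Store the result of the most recent check for a known host.
        if name not in self._states:
            raise KeyError(f"unknown host: {name}")
        if state not in VALID_STATES:
            raise ValueError(f"unknown state: {state}")
        self._states[name] = state

    def list_hosts(self) -> List[str]:
        # Host names in alphabetical order.
        return sorted(self._states)

    def last_state(self, name: str) -> str:
        return self._states[name]


def render_status(core: MonitoringCore) -> str:
    """A tiny 'interface': one line per host with its last known state."""
    return "\n".join(
        f"{host:<12}{core.last_state(host)}" for host in core.list_hosts()
    )


if __name__ == "__main__":
    core = MonitoringCore()
    core.add_host("web01")
    core.add_host("db01")
    core.record_check("db01", "DOWN")
    print(render_status(core))
```

The point is that `render_status` only ever talks to the API. Swap it for a PHP page, a CGI script or a command-line tool and the core doesn't have to care.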

* By constantly changing, I mean compared to traditional languages like C. PHP also has (and many developers will admit this) inconsistencies and other "gotchas" left over from years of backwards compatibility.