Lusis: vogeler

Wednesday, October 6, 2010

.plan

TODO

Work through Vogeler issue queue (http://github.com/lusis/vogeler/issues)
@padrinorb test suite enhancement - functional/integration (http://github.com/padrino/padrino-framework)
Integrate libvirt into @fog (http://github.com/geemus/fog)
Finish integrating Riak support into @padrinorb (http://github.com/seancribbs/ripple)
Create a distro that uses LFS for the explicit purpose of hosting dynamic language applications (http://lusislog.blogspot.com/2010/09/distributions-and-dynamic-languages.html)
Do all of the above while maintaining my marriage and raising two kids ;)

Sunday, September 12, 2010

Follow up to #vogeler post

Patrick Debois was kind enough to comment on my previous post and asked some very good questions. I thought they would fit better in a new post instead of a comment box so here it is:

I read your post and I must say I'm puzzled on what you are actually achieving. Is this a CMDB in the traditional way? Or is it an autodiscover type of CMDB, that goes out to the different systems for information? In the project page you mention à la mcollective. Does this mean you are providing GUI for the collected information? Anyway, I'm sure you are working on something great. But for now, the end goal is not so clear to me. Enlighten me!

Good question ;) I think it sits in an odd space at the moment because it tries to be flexible and, by design, could do all of those things. Mentioning MCollective may have clouded the issue, but it was more of a nod to similar architectural decisions: using a queue server to execute commands on multiple nodes.

My original goal (outside of learning Python) was to address two key things. I mentioned these on the Github FAQ for Vogeler but it doesn't hurt to repost them here for this discussion:

What need is Vogeler trying to fill?

Well, I would consider it a “framework” for establishing a configuration management database. One problem with something like a CMDB is that, in trying to meet every individual need, it tends to overcomplicate. I really wanted to avoid forcing you into my model and then having to provide ways for you to customize the application.

I went the other way. Vogeler, at the core, provides two things: a place to dump “information” about “things” and a method for getting that information in a scalable manner. By using a document database like CouchDB, you don’t have to worry about managing a schema. I don’t need to know what information is actually valuable to you. You know best what information you want to store. By using a message queue with some reasonable security precautions, you don’t have to deal with another listening daemon. You don’t have to worry about affecting the performance of your system because you’re opening 20 SSH connections to get information or running some statically linked off-the-shelf binary that leaks memory and eventually zombies (Why hello, SCOM agent!).

In the end, you define what information you need, how to get it and how to interpret it. I just provide the framework to enable that.

So to address the question:

If we're being semantic, yes it's probably more of a configuration database than a configuration MANAGEMENT database. Autodiscovery, though not in the traditional sense, is indeed a feature. Install the client, stand up the server side parts and issue a facter command via the runner. You instantly have all the information that facter understands about your systems in CouchDB viewable via Futon. I could probably easily write something that scanned the network and installed the client but I have a general aversion to anything that sweeps networks that way. More than likely, you would install Vogeler when you kicked a new server and managed the "plugins" via puppet.

I hope that makes sense. Vogeler is the framework that allows you to get whatever information about your systems you need, store it, keep that information up to date and interpret it however you want. That's one reason I'm not providing a web interface for reporting right now. I simply don't know what information is valuable to you as an operations team. Tools like puppet, cfengine, chef and the like are great and I have no desire to replace them, but you COULD use this to build that replacement. That's also why I use facter as an example plugin with the code. I don't want to rewrite facter. It just provides a good starting tool for getting some base data from all your systems.

Let's try a use case:

I need to know which systems have X rpm package installed.

You could write an SSH script, hit each box and parse the results or you could have Vogeler tell you. Let's assume that the last run of "package inventory" was a week ago:

vogeler-runner -c rpms -n all

The architecture is already pretty clear. Runner pushes a message on the broadcast queue, all clients see it ('-n all' means all nodes online) and they in turn push the results into another queue. Server pops the messages and dumps them into the CouchDB document for each node. You could then load up Futon or a custom interface you wrote and load the CouchDB design doc that does the map reduce for that information. You have your answer.
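As a rough illustration of that last step, here is a plain Python stand-in for the kind of map/reduce a design doc would do. It is only a sketch: the node documents, the "rpms" key and the function names are assumptions made up for this example, not Vogeler's actual document layout, and it runs against an in-memory list rather than CouchDB.

```python
# Hypothetical node documents, shaped roughly like what the server might
# store after a "package inventory" run. The "rpms" key and this layout
# are assumptions for illustration only.
NODE_DOCS = [
    {"_id": "web01", "rpms": ["httpd-2.2.15", "openssl-1.0.0"]},
    {"_id": "db01", "rpms": ["postgresql-8.4.4", "openssl-1.0.0"]},
    {"_id": "app01", "rpms": ["jbossas-5.1.0"]},
]


def map_packages(doc):
    """Map step: emit one (package, node) pair per installed package."""
    for package in doc.get("rpms", []):
        yield package, doc["_id"]


def nodes_with_package(docs, prefix):
    """Reduce step: collect nodes that have a package starting with prefix."""
    matches = set()
    for doc in docs:
        for package, node in map_packages(doc):
            if package.startswith(prefix):
                matches.add(node)
    return sorted(matches)


print(nodes_with_package(NODE_DOCS, "openssl"))  # ['db01', 'web01']
```

In a real setup the map function would live in a CouchDB design document and be written in JavaScript; the Python version just shows the shape of the question being asked.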

Now let's try a more complicated example:

I need to know what JMX port all my JBoss instances are listening on in my network.

Well, I don't provide a "plugin" for you to get that information, a key for you to store it under in CouchDB or a design doc to parse it by default. But I don't need to. We take the Nagios approach. You define what command returns that information. A shell script, a Python script, a Ruby script, whatever works for you. All you need to tell me is what key you want to store it under and something about the basic structure of the data itself. Maybe your script emits JSON. Maybe it emits YAML. Maybe it's a single string. Maybe you run multiple JBoss instances per machine, each listening on a different JMX port (as opposed to aliasing IPs and using the standard port). I'll take that and create a new key with that data in the Couch document for that system. You can peruse it with a custom web interface or, again, just use Futon.
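To make the "command, key and format" idea concrete, here is a minimal sketch in modern Python 3. Everything in it is hypothetical: run_plugin, the "jmx_ports" key and the format names are invented for illustration and are not Vogeler's plugin API. YAML is left out only because it would need a third-party parser.

```python
import json
import subprocess
import sys


def run_plugin(command, key, result_format):
    """Run a command and return {key: parsed_output}.

    Hypothetical helper, not part of Vogeler. result_format says how
    to interpret the command's standard output.
    """
    output = subprocess.run(
        command, capture_output=True, text=True, check=True
    ).stdout.strip()
    if result_format == "json":
        value = json.loads(output)
    elif result_format == "list":
        value = output.splitlines()
    elif result_format == "string":
        value = output
    else:
        raise ValueError("unknown format: %s" % result_format)
    return {key: value}


# Stand-in for a real script that reports JMX ports as JSON.
fake_script = [sys.executable, "-c", "print('[1090, 1190]')"]

# Merge the result into a (hypothetical) document for this node.
node_doc = {"_id": "app01"}
node_doc.update(run_plugin(fake_script, "jmx_ports", "json"))
print(node_doc)  # {'_id': 'app01', 'jmx_ports': [1090, 1190]}
```

The point of the design is that the framework only needs the key and the format; what the command does to find the answer stays entirely up to you.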

Does that help?

Notes on #vogeler and #devops

UPDATE: There's some additional information about Vogeler in the follow-up post to this one, "Follow up to #vogeler post".

Background

So I've been tweeting quite a bit about my current project, Vogeler. Essentially it's a basic configuration management database built on RabbitMQ and CouchDB. I had to learn Python for work, and we may or may not be using those two technologies, so Vogeler was born.

There's quite a bit of information on Github about it but essentially the basic goals are these:

Provide a place to store configuration about systems
Provide a way to update that configuration easily and scalably
Provide a way for users to EASILY extend it with the information they need

I'm not doing a default web interface or much else right now. There are three basic components: a server process, a client process and a script runner. The first two don't act as traditional daemons but instead monitor a queue server for messages and act on them.

In the case of the client, it waits for a command alias and acts on that alias. The results are stuck on another queue for the server. The server sits and monitors that queue. When it sees a message, it takes it and inserts it in the database with some formatting based on the message type. That's it. The server doesn't initiate any connections directly to the clients and neither do the clients talk directly to the server. All messages that the clients see are initiated by the runner script only.

That's it in a nutshell.
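The flow described above can be sketched with nothing but the standard library. This is an in-memory toy, not Vogeler's code: queue.Queue stands in for RabbitMQ, a dict stands in for CouchDB, and the "greeting" alias and all function names are made up for the example.

```python
import queue

# In-memory stand-ins for the broker queues and the database.
broadcast = queue.Queue()  # runner -> clients
results = queue.Queue()    # clients -> server
database = {}              # node name -> document

# Hypothetical command aliases a client knows how to run.
ALIASES = {"greeting": lambda node: "hello from " + node}


def runner(alias):
    """The only component that initiates messages to clients."""
    broadcast.put(alias)


def client(node, alias):
    """Act on an alias and push the result onto the results queue."""
    value = ALIASES[alias](node)
    results.put({"node": node, "key": alias, "value": value})


def server():
    """Drain the results queue and store each message in the node's document."""
    while not results.empty():
        msg = results.get()
        database.setdefault(msg["node"], {})[msg["key"]] = msg["value"]


runner("greeting")
alias = broadcast.get()
# A real broadcast exchange would deliver a copy to every client;
# here one message is simply handed to each simulated client in turn.
for node in ("web01", "db01"):
    client(node, alias)
server()
print(database)
```

Note that the client and server functions never call each other. Each one only touches a queue, which is the property the design is after.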

0.7 release

I just released 0.7 of the library to PyPI (no small feat with a teething two-year-old and a five-month-old) and with it, what I consider the core functionality it needs to be useful for people who really are interested in testing it. Almost everything is configurable now. Server, Client and Runner can each specify where the components they need live on the network. CouchDB and RabbitMQ are running in different locations from the server process? No problem. Using authentication in CouchDB? You can configure that too. Want to use different RabbitMQ credentials? Got it covered.

Another big milestone was getting it working with Python 2.6. No distro out there that I know of is using 2.7, which is what I was using to develop Vogeler. I chose 2.7 because that was the version we standardized on, and since I was learning a new language and 2.7 was a bridge to 3, it made sense. But when I started looking at trying the client on other machines at home, I realized I didn't want to compile Python and set up the whole virtualenv thing on each of them. So I got it working with 2.6, which is what Ubuntu is using. For CentOS and RedHat testing, I just used ActivePython 2.7 in /opt/.
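For a feel of what "everything is configurable" could look like in a file, here is a small sketch using Python 3's standard configparser. The section and option names are assumptions invented for this example and do not reflect Vogeler's real options; the only facts relied on are the stock defaults (RabbitMQ's guest user and CouchDB's port 5984).

```python
import configparser

# Hypothetical config file contents. Section and option names are
# illustrative only.
SAMPLE = """
[amqp]
host = rabbit.example.com
username = vogeler
password = secret

[couchdb]
host = couch.example.com
port = 5984
"""


def load_settings(text):
    """Parse settings, falling back to local defaults when missing."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "amqp_host": parser.get("amqp", "host", fallback="localhost"),
        "amqp_user": parser.get("amqp", "username", fallback="guest"),
        "couch_host": parser.get("couchdb", "host", fallback="localhost"),
        "couch_port": parser.getint("couchdb", "port", fallback=5984),
    }


settings = load_settings(SAMPLE)
print(settings["amqp_host"], settings["couch_host"], settings["couch_port"])
```

With fallbacks like these, an empty file gives you an everything-on-localhost setup, and each component only overrides what actually lives elsewhere.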

Milestones

As I said, 0.7 was a big milestone release for me because of the above things. Now I've got to do some of the stuff I would have done earlier if I hadn't been learning a new language:

Unit Tests - These are pretty big for me. Much of my work on Padrino has been as the Test nazi. Your test fails, I'm all up in your grill.
Refactor - Once the unit tests are done, I can safely begin to refactor the codebase. I need to move everything out of a single .py with all the classes. This also paves the way for allowing swappable messaging and persistence layers. This is where unit tests shine, IMHO. Additionally, I'll finish up configuration file setup at this point.
Logging and Exception handling - I need to set up real loggers and stop using print messages. This is actually pretty easy. Exception handling may come as a result of the refactor but I consider it a distinct milestone.
Plugin stabilization - I'm still trying to figure out the best way to handle default plugins and what basic document layout I want.
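For the logging and exception handling milestone, the change is roughly the one sketched below, using Python's standard logging module. The logger name, the handle_message function and the message shape are assumptions for illustration, not existing Vogeler code.

```python
import logging

# Logger name is an assumption; any dotted name works.
log = logging.getLogger("vogeler.client")


def handle_message(message):
    """Log instead of print, and report bad input rather than crash."""
    try:
        alias = message["alias"]
    except KeyError:
        log.error("message has no alias: %r", message)
        return None
    log.info("running alias %s", alias)
    return alias


# Configure output once, at program start, not inside library code.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)
handle_message({"alias": "facter"})
handle_message({})
```

Compared with print, this gives timestamps and severity levels for free and lets whoever runs the process decide where the output goes.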

Once those are done, I should be ready for a 1.0 release. However, before I cut that release, I have one last test...

The EC2 blowout

This is the part I'm most excited about. When I feel like I'm ready to cut 1.0, I plan on spinning up a few hundred EC2 vogeler-client instances of various flavors (RHEL, CentOS, Debian, Ubuntu, Suse...you name it). I'll also stand up distinct RabbitMQ, CouchDB and vogeler-server instances.

Then I fire off the scripts. Multiple vogeler-runner invocations concurrently from different hosts and distros. I need to work out the final matrix but I'll probably use Hudson to build it.

While you might think that this is purely for load testing, it's not. Load testing is a part of it but another part is seeing how well Vogeler works as a configuration management database - the intended usage. What better way than to build out a large server farm and see where the real gaps are in the default setup? Additionally, this will allow me to really standardize on some things in the default based on the results.

At THAT point, I cut 1.0 and see what happens.

How you can help

What I really need help with now is feedback. I've seen about 100 or so total downloads on PyPI across releases but no feedback on Github yet. That's probably mostly due to such minimal functionality before now and the initial hurdle of getting it installed. I've tried to keep the Github docs up to date. I think if I convert the Github markdown to rst and load it on PyPI, that will help.

I also need advice from real Python developers. I know I'm doing some crazy stupid shit. It's all a part of learning. Know a way to optimize something I'm doing? Please tell me. Is something not working properly? Tell me. I've tried to test in multiple virtualenvs on multiple distros between 2.6 and 2.7 but I just don't know if I've truly isolated each manual test.

Check the wiki on github and try to install it yourself. Please!

I'm really excited about how things are coming along and about the project itself. If you have ANY feedback or comments, whatsoever, please pass it on even if it's negative. Feel free to tell me that it's pointless but at least tell me why you think so. While this started out as a way to learn Python, I really think it could be useful to some people and that's kept me going more than anything despite the limited time I've had to work on it (I can't work on it as part of my professional duties for many reasons). I've been trying to balance my duties as a father of two, husband, Padrino team member along with this and I think my commitment (4AM...seriously?) is showing.

Thanks!