Tuesday, October 12, 2010

Latest Vogeler update - MongoDB, protobufs, Riak and war!

I wanted to take a minute to post an update about Vogeler to those who are following the project. Let's get the easy stuff out of the way - it's not abandoned. Far from it.

There have been several reasons why I haven't made any commits lately, the least of which is that both kids have been sick recently and I haven't been able to get a good solid block of time to work on it.

Technical Hurdles

Another reason is that I almost went down a rabbit hole with regard to swappable persistence. In the process of refactoring the persistence backend, I realized it should be fairly easy, using the model I put into place, to go ahead and implement MongoDB and Riak support. I started with MongoDB, and that's where I hit a wall: MongoDB does not allow dots in key names. When I ran into that issue, I realized that I had made some dangerous assumptions based on the fact that I started with couchdbkit as the interface to CouchDB:

I was using an ORM when I should have used a lower level driver. You see, couchdbkit does some nice stuff like translating native Python datatypes to the appropriate CouchDB datatypes. If I define a row as having DictProperty(), couchdbkit converts that into the commensurate CouchDB JSON datatype. The same goes for ListProperty(). This is really evident in Futon and makes using Futon as your interface to Vogeler very appealing. However, this is VERY CouchDB-specific.
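A minimal sketch of why dotted keys are a non-issue on the CouchDB side, assuming a hypothetical execution result keyed by a hostname and a file name (illustrative data, not Vogeler's actual schema). JSON places no restriction on dots in keys, so the structure round-trips unchanged:

```python
import json

# Hypothetical execution result. The keys are dotted names such as a
# hostname and a file name; this is NOT Vogeler's real document layout.
result = {
    "web01.example.com": {"uptime": "12 days"},
    "resolv.conf": ["nameserver 10.0.0.1"],
}

# CouchDB documents are plain JSON, and JSON allows "." inside keys,
# so encoding and decoding gives back the same structure.
encoded = json.dumps(result)
decoded = json.loads(encoded)

# These are the top-level keys that a store treating "." as a path
# separator would refuse.
dotted_keys = sorted(key for key in decoded if "." in key)
```

Only the standard library is used here, so nothing in this sketch depends on couchdbkit or pymongo.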

The pymongo driver, however, didn't like my strategy of dumping execution results that way. You can see the "gist" of what I'm talking about here:


I brought the issue up on the MongoDB mailing list here. I opened an issue for myself to braindump my thoughts. One of my biggest goals (data transparency) was starting to fall apart for me. I decided to shelve MongoDB for a moment and look at Riak. I wanted to make sure that I at least thought about how a generic model would work across multiple document stores. That's when I ran into the biggest cockblock:


I'm almost firmly convinced that protobuf is a piece of trash. Google has some smart people but protobuf is something that quite obviously came out of the mind of someone who was sent off to "solve the RPC problem". There are quite a few issues I have with protobuf:

  • Despite being a "universal" format, it works well in exactly TWO languages - C++ and Java. Everything else is an afterthought. Don't get me started on Python support. The one guy at Google who supports protobuf on Python can't make it work on anything but Python 2.5 because that's all Google uses. He's unwilling to cut a new PyPI package just to fix all the 2.5 assumptions because he doesn't want to bump the version number. You can't even install it on anything higher than 2.5 without hacking setup.py.
  • You have to precompile your protos before use. I understand what Google is trying to accomplish but seriously? So I have to build the protobuf compiler to compile protos to ship with my code. There's a reason why people like FFI, folks.

There are alternatives like Apache Avro that have promise but they also have their own issues. However, Basho has committed to using protobuf which does make sense. Write your own serialization framework or use an existing one? Easy answer when Google wrote one for you.

So I started to noodle out what route I wanted to take when something else came out of left field.

Sgt. First Class Lance Vogeler

I have a search set up in Tweetdeck on my Droid for Vogeler. It was nice to stay on top of people mentioning the project. The name for the project came out of me pretty much immersing myself in the latest S.M. Stirling Emberverse books. One of the characters was named Ingolf Vogeler. I really enjoyed the books and liked the name so I picked it. I'm also considering using Ritva for another project.

So one day my phone starts going nuts with Vogeler alerts. I was already getting the occasional history tweet about the real Ingolf Vogeler but it turns out a soldier from Georgia, of all places, was KIA in Afghanistan. He did 8 tours in Afghanistan and 4 tours in Iraq. Politics aside (I'm personally entirely against these campaigns), I didn't want to "pollute" the Twitter stream. Regardless of what I think of the current military climate in my country, I have the utmost respect for most of the members of our armed forces.

However, what struck me most is that SFC Vogeler left behind a wife. A wife carrying his unborn child. That pretty much did me in. As a father myself, I was pretty torn up thinking about this happening to my wife. Yes, it was a known risk but that doesn't make it any less sad. I decided that, in addition to making a donation of my own to his family and holding off on Vogeler until it wasn't alerting on Twitter so much, I would think hard about what my software means.

Steve Jobs asked a guy who emailed him this question: "What have you created lately?" Someone else recently said that entrepreneurs are busy creating the next social media app that means fuckall when they could be effecting change with the software they write. That got me thinking. Could I help this family somehow with my project that happened to share a name with them? The best I could come up with is this:

If you use Vogeler, are interested in it or just feel like a random act of kindness, please make a small donation to the Vogeler family. My wife and I agreed that should I ever make ANY money off of the project in any identifiable form that I would donate what I could to the family. Vogeler is just a small project. I have no grand aspirations of getting integrated into some mainstream project. I'm just trying to scratch an itch - a niche itch at that. One reason I'm so gung-ho about DevOps is that, as a family man, I don't WANT to be dealing with stupid shit taking time away from my family. I've done it and I'm done with it. If my phone goes off, it's not going to be from some stupid mistake that I made editing a config file or lack of metrics causing an "oh shit we're out of space" moment. I'm past that in my career and I'm past having to work places where that's the norm rather than the exception. My family is first and foremost and anything I can do to keep it that way, I'm going to do it.

So I'm taking the Vim approach to Vogeler. I'm not going to ask anyone to go against his conscience. If you feel like a small donation to this family implies consent to the stupidity of my government then I fully understand. But if you think that open source software and the broader open source community can make a difference in more than just writing software, throw a small donation their way.

I'm going to be starting back up on Vogeler now. I've decided that, for now, I'm going to attempt to keep things as generic as possible but continue to code against CouchDB. I'll keep revisiting MongoDB and Riak support but the primary target is CouchDB. If any Basho folks are reading this, if you can remove the hard dep in setup.py on protobuf, that would be awesome. You can't even install from PyPI with it in there anyway. If the MongoDB folks take a gander at this, can you do something about the dots in key names? Thanks!


kristina said...

Encouraging donations to the Vogeler family is a very sweet idea. I hope that lots of your users take you up on that! Regardless of their feelings about the war, I think most people support the troops.

On a more technical note, it is very important that MongoDB not allow dots in key names. MongoDB uses dots to reach into embedded objects, so it would be ambiguous to allow it in keys: if you have {"x.y" : 3, "x" : {"y" : 3}}, which field does x.y refer to? It would completely break backwards compatibility and change the semantics of the query language.

A lot of people have found that a global replace of "." with some other character works fine; you might want to try that.
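A minimal sketch of that kind of replace, assuming string keys and "|" as the substitute character. Both the helper names and the choice of "|" are hypothetical, not part of pymongo or Vogeler. The helper refuses keys that already contain the substitute, so the mapping stays reversible:

```python
def replace_in_keys(value, old, new):
    """Return a copy of value with `old` replaced by `new` in every dict key."""
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            # If the key already contains the target character, the
            # replacement could not be undone unambiguously, so stop here.
            if new in key:
                raise ValueError("key %r already contains %r" % (key, new))
            out[key.replace(old, new)] = replace_in_keys(item, old, new)
        return out
    if isinstance(value, list):
        return [replace_in_keys(item, old, new) for item in value]
    # Scalars (including strings used as values) are left untouched.
    return value


def escape_keys(doc, sub="|"):
    """Swap "." for `sub` in all keys before handing the document to the store."""
    return replace_in_keys(doc, ".", sub)


def unescape_keys(doc, sub="|"):
    """Reverse escape_keys after reading the document back."""
    return replace_in_keys(doc, sub, ".")
```

For example, `escape_keys({"resolv.conf": ["a.b"]})` rewrites the key to `"resolv|conf"` and leaves the value `"a.b"` alone. The trade-off is that the stored documents no longer show the original key names, which matters if data transparency is a goal.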

Lior Gradstein said...

couchquery (which, contrary to its title, doesn't only do queries) is your friend for low level CouchDB operations.