Tuesday, November 24, 2009

Saving the newspaper industry

I debated whether or not to post this but I figured what the hell. If somehow it gets read by my employer and causes issues, I'll cope.

I've recently become employed as a contractor with a large newspaper in the Atlanta area. I was and am really excited by the opportunity. The technologies are cool and it's a market I never thought I'd be in.

This particular paper has been hit pretty hard by the declining revenue in the industry because they are so large. They're currently in the process of moving out of the cornerstone building they've been in downtown since (I think) 1972. It's a gorgeous building and I would have loved to have been there during its heyday. These days the building is like 30% occupied or something absurd. One of the paths I take to get lunch takes me through the back past all the old printing and sorting equipment. It's very ghostly and I plan on getting some pictures before we move.

Anyway, I was talking to a coworker about a project they're working on. He needed some feedback about solutions from a system engineering side. During the conversation, we got off on a tangent about the company and recent history. It was legitimate discussion because the information helped me understand the purpose of the application and what it needed to do, how it was being accessed and such.

One thing he mentioned to me was that at the last publisher's meeting, they pointed out that digital was only 10% of the revenue and that they would be focusing more on how to increase print revenue.

I was flabbergasted. I mentioned this to my wife to gauge her thoughts and she came to the same conclusion I did.

Why would you scale back from the growing market to try and preserve the dying one?

Here's the deal. I'm pretty typical (I think) of the modern reader (and I LOVE to read).

Sidebar: I remember when this paper was actually TWO papers - one the conservative publication and the other the liberal one. Up until about 9 years ago, I had a subscription to the paper including Sunday. I got great joy out of sitting down and reading the newspaper. I still do.

Anyway, as I said I'd like to think I'm pretty representative of the transition in society. I'm 35 years old. I'm old enough to remember pre-internet and young enough to be hooked into the new stuff. Some of that might have something to do with IT but that's becoming less and less of a gap. Technology is now so pervasive and accessible that what used to be the domain of the geek is now the domain of the mom. Everyone, for the most part, has access to the Internet from any number of devices not the least of which is the cell phone.

So when I hear a statement like the one above, I have to wonder how anyone could talk about trying to "life-support" the print side of things, especially given the AMAZING post mortem from John Temple, former editor/president/publisher of the Rocky Mountain News. The market for print is slowly dying - literally. I'm not trying to be grim or hyperbolic. According to this, the largest percentage of newspaper subscribers is in the 65+ range. My age range only accounts for 17%.

What I'm trying to say is that once that subscription base starts passing on, things don't look too good. By the time I'm at that range, there simply won't be any subscription base left. From my perspective (and I'm going to be somewhat crass here), why would I buy a newspaper subscription when I can read the same news from 10 different sources on my cell phone in the morning while I'm taking a dump? I wouldn't. These days I find myself not even bothering to pick up a copy of the paper at the doctor's office or mechanic. I'm either tweeting, facebooking, posting on forums or reading RSS feeds of the aforementioned 10 different sources. I'm pretty damn efficient at it too. The generation below me is even BETTER at it. They'll be doing things when I'm 64 (cue music) I won't understand and I'll be bitching about the good old days of the tubes and the propensity of children to walk across my grass.

Newspapers cannot compete on news. At least as far as my "outside" eyes can see. News is a commodity in a sense. Word travels fast these days and it's only going to get faster. From the perspective of a print paper, the news is already old as soon as it happens in this modern age. 500 people have already Twittered, iReported, blogged, Facebooked, Youtubed and done interviews with newspapers overseas before it's been sent to the presses. Using any random bit of aggregation services and software, all of that information has been compiled and presented to me in the way that I wish to consume it*. If it hasn't been, I can easily do so with a single Google search when I want to know more. I can easily participate with a hashtag, url shortening service and a cell phone if I want. I become a part of it. I'm a member of the community around that bit of information.

So what can be done? Where should newspapers go? I thought about this a lot on the way home from work tonight. I had a plan all devised in my head. It was all so clear. Too clear in fact. Obviously someone else has to have seen this as clear as I did. Maybe I got some of the ideas from John Temple's speech. I wasn't sure.

When I finally got downstairs in front of my computer, I did a quick search (on Google of course). I pulled up the copy of the Temple speech. I also searched for one phrase - "How to save the newspapers". The fourth result down** was this article on truliablog.com. There were all of my grand ideas laid bare and done so almost a year ago. This almost entirely sums up what I had in mind and plan on repeating here in condensed form.

So what was my idea that obviously isn't just my idea?

Community. The Eastern Standard Tribe kind.

Back "in the day", some of us geeks were identified by the web communities of which we were a part. Were you a Slashdotter or did you read Kuro5hin? Maybe you spent your time digging and burying. If tech wasn't your thing, were you a Deviant? These days it's much of the same but the communities are even more diverse. People like, I think, having a sense of belonging and it's much more if not entirely acceptable to belong to a community where you've probably never met half the people. You're known by an avatar and a nickname. I have people I consider friends who call me by my character name in World of Warcraft and yet we all know each other's real names and personal information just as much as some of my friends here in town. I went to parties years ago with people I only knew from IRC and was called by my nickname the entire night. I addressed others the same way.

Newspapers can establish that kind of community in the modern age. When there were two separate newspapers in town, you might have an idea about a person based on which one they read. These days, I find myself building relationships and communities with people around the world based on the most niche of ideas - a mailing list for an open source project, a coding language or a forum dedicated to an author I might like. If the author is reasonably current, I might even be able to follow them on Twitter or become a fan on Facebook. It's no less of a community than the one I actually live in which has an equally vivid online presence.

So my plan was for the newspapers to be breeding grounds for those communities. Create the tools. Forums. Aggregation. Whatever is appropriate for making it easy for people to build those communities. They will still be providing the same service they were before which, at the core, is delivery of information. eBay is nice but Craigslist is better when I want to go local. Building local communities online can even help spur the local economy which can always use help regardless of where "local" is.

Take advantage of the personal capital you have built up in your existing personalities. If you have a reporter that people trust, that person should be right in the thick of the community. Clark Howard is someone who has an amazing community of like-minded people from everywhere that he's broadcast but I would wager he's loved nowhere else in the world more than he is right here in Atlanta where we've been listening to him locally for years. Modern news is interactive. It's no longer a matter of writing an editorial and walking away from it. You'll have to be accountable for what you say because if you aren't, you'll lose trust from the community and, as in any relationship, you may never be able to earn that trust back.

Be open. This somewhat ties to the previous statement but I'm thinking more along the lines of "open to ideas". I'm amazed by how much CNN has integrated iReport into the main "product". After the President's speech to Congress months back, CNN had two people on who represented each side of the issue. The amazing part is that those two people were from the community CNN had provided and weren't talking heads. They were real people that best represented the viewpoints of their community at CNN. Mind you I don't think the person representing one of the sides was particularly articulate but he wasn't a pundit or a news guy so I can't be too critical. Make the community part of the product. When someone feels a sense of ownership, it becomes more personal.

Don't be afraid of Google. Don't buy into the paywall tripe being spread by some businessmen. Don't cut off your nose to spite your face and all that. While your focus may be the traditional concept of local, there are expats all over the world who would belong in that "local" community. Those people might even be your greatest source of revenue. I can see myself living in Michigan years down the road and wishing I could get a Frosted Orange. You get the idea. I still need to know about the community and if Google doesn't know about it, I won't know about it.

Commit to the community. Don't be halfway about it. If you're going to do it, do it. Nothing is more frustrating than getting involved with something (an online game or whatever) only to see it die on the vine because of lack of support from the people who put it out there in the first place. While people will gladly contribute more than you can handle, they'll need some nudging to maintain focus. If you have a sub-community built around Foodies in Atlanta, maybe you should have some prominent Foodies actually be involved in some officially recognized capacity. Moderators are a valuable thing. A forum is only as good as its signal-to-noise ratio. If appointing community managers is out of the question, at least provide the community with the means to self-police. Don't discount Whuffie in self-moderating environments.

Accept organic. While the communities will need some nudging to help get started, realize that they WILL take on a life of their own. They may grow beyond what you might originally envision. They'll merge. They'll split. The Atlanta Baklava community might realize that while anyone who DOESN'T like baklava is obviously deficient in some capacity, maybe they're better suited in merging with the Atlanta Pastry community as a whole.

So where does the news come out of all this? Where's the news in this modern newspaper? The communities are built FROM the news. You'll still have your traditional reporting "model" but the communities will spring up out of that news. I started following a few new people on Twitter after the #atlflood event and every single picture I saw of the flood came from either Flickr, Twitpic or *ding-ding* ajc.com. There were people watching the #atlflood hashtag who had the same experience I did. I would hope that the newspaper was ready to capitalize on that.

I don't claim to have all the answers. I've only been on the job for two weeks now. The thing is that I REALLY like the job. I enjoy the people I'm working with and I want the company to succeed. Maybe that's what's needed. Less old-school baggage and a fresh perspective from the outside. It's also more selfish than that. I would love to be at the ground level of helping to engineer something as amazing as bringing the newspapers into the modern age.

* Think Manfred Macx in Accelerando
** The first result was from Time magazine. The second was from another newspaper. The third was an article on how to save newspaper clippings.

Friday, November 20, 2009

Exchange Web Services via Ruby

I realized as I went to post an update on this that I don't think I ever posted an actual blog entry about it.

While I was on contract at HSWI, the email was hosted on Exchange. The development team was all on Mac workstations but the front-side was all on Windows. Since I was running Linux full-time, Exchange access wasn't a big deal. I ran Mutt and used LDAP/IMAPS/SSMTP. Digging around the Mail.app on the OS X side however, I realized that the client side was using Exchange Web Services for much of the functionality.

Wanting to do some poking around with Ruby and SOAP, I figured it would be a fun exercise to talk to the Exchange server with Ruby. I also had an itch to scratch thinking I could use it as an address book source for Mutt. I got it working in a day or so after dealing with some broken functionality in either the WSDL from the Exchange side or how SOAP4R tried to translate it.

You can find the code here.

Anyway, I'm no longer with HSWI but over at the AJC, we're also using Exchange in a much greater capacity. So I whipped out the "old" code and was depressed to find out it didn't work.

Every attempt gave me a 401 error. It made no sense since I could access OWA, ActiveSync with my DROID (awww yeah) and even access the EWS wsdls on the server.

As I started to poke around online, I started to realize that in some of the more complex configurations, the Exchange server is hidden behind an ISA server or something. I don't do the Microsoft world much anymore and I have no real access to the environment.

What I noticed though, was that the internal IP is different than the external one. That gave me an idea to make sure and test the code externally from home while NOT being on the VPN.

It worked!

As I poked around with THAT information, I remembered an interesting thing that happened the one time I logged on to a Windows machine at the office. If I fired up I.E. and went to the OWA url, I was logged in directly.

What I'm guessing is that the Exchange server has a different set of criteria for authenticating internally than externally. I'm not up to date on current Exchange and AD implementations so I have no idea what the configuration is that's preventing me from using EWS but still allowing me to use OWA from Firefox under Linux and OS X.

If any MS guys out there have any idea what's causing this, I'd be interested to know so I can either work around it or document it appropriately.

I have a *FEELING* that it's somehow related to NTLM but I don't know how to force my EWS call to bypass it just yet.
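
One cheap way to test the NTLM hunch is to hit the EWS endpoint with no credentials at all, once from inside and once from outside, and compare which authentication schemes the server advertises in its 401 response. The following is only a sketch of that idea using Ruby's standard Net::HTTP, not the SOAP4R code mentioned above. The host name in the example is made up, and the assumption is that the service lives at the usual /EWS/Exchange.asmx path.

```ruby
require 'net/http'
require 'uri'

# Pull the scheme names (e.g. "NTLM", "Negotiate", "Basic") out of the
# WWW-Authenticate header values a server sends along with a 401.
def auth_schemes(header_values)
  Array(header_values).map { |value| value.to_s.split(/\s+/, 2).first }.compact.uniq
end

# Send an unauthenticated GET to an EWS URL and report the status code
# plus the authentication schemes the server offered.
def probe_ews(url)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  response = http.request(Net::HTTP::Get.new(uri.request_uri))
  { code: response.code, schemes: auth_schemes(response.get_fields('www-authenticate')) }
end

# Example (hypothetical host, run once on the LAN and once from outside):
#   p probe_ews('https://mail.example.com/EWS/Exchange.asmx')
```

If the internal answer only lists NTLM or Negotiate while the external one also lists Basic, that would at least be consistent with the guess about different authentication rules on each side.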

Tuesday, September 15, 2009

I am seriously having too much fun with conky

Colors and conditionals, oh my!
Notice how the Network and Music sections change. The possibilities are endless. I was having fun with scrolling marquees earlier.

Secure IMAP conky script

So I've been using Conky again. I decided that I wanted to add a mailcheck to the display.

Of course conky doesn't support imaps (yet) and I didn't want to dick around with stunnel so I hacked up this quick ruby script to do it for me. I spent most of the time trying to figure out why it wouldn't display properly in Conky only to FINALLY stumble onto
text_buffer_size 2048
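
For anyone who wants the general shape without the link, a mail check like this can be done with Net::IMAP from the Ruby standard library. This is a rough sketch and not the script from the post. The host name and the IMAP_USER / IMAP_PASS environment variables are placeholders, and it assumes IMAPS on the standard port 993.

```ruby
# Format the one-line result that conky will display.
def mail_line(unseen)
  unseen.zero? ? 'Mail: none' : "Mail: #{unseen} new"
end

# Ask an IMAPS server how many unseen messages are in a mailbox.
def unseen_count(host, user, password, mailbox = 'INBOX')
  require 'net/imap' # loaded here so the formatting helper works without it
  imap = Net::IMAP.new(host, port: 993, ssl: true)
  imap.login(user, password)
  count = imap.status(mailbox, ['UNSEEN'])['UNSEEN']
  imap.logout
  count
ensure
  # Close the socket even if login or status raised.
  imap.disconnect if imap && !imap.disconnected?
end

# Example (placeholder host and credentials):
#   puts mail_line(unseen_count('imap.example.com', ENV['IMAP_USER'], ENV['IMAP_PASS']))
```

In the conky config this would be called with something like ${execi 300 /path/to/mailcheck.rb}, where the path and the 300 second interval are whatever suits you.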

Here's the link:

Saturday, September 12, 2009

New desktop layout

So I was messing with Crunchbang Linux on my Acer Aspire One. I really liked the darker colors and the Conky config. I was also enjoying how simple and quick everything was. What really got me was that I found quite a few new apps that I didn't know existed (for instance Gwibber).

I decided to try and do something similar with a little bit more flair on the desktop side. These screenshots are the result. The first is with terminator up. The second is just the desktop.

Thursday, September 10, 2009

The party that cried wolf

Disclaimer: I'm not a Republican. I'm not a Democrat. I'm registered as a Libertarian but the following could easily be applied to any party in power including my own.

There's been quite a bit of fallout with regards to the hoohah over the President's speech to school children this week. Much of it will be overshadowed by the President's speech to Congress last night but I think the school address issue is symptomatic of a bigger issue - the Republican party has become the boy who cried wolf.

From cries of socialism to "death panels", the Republicans have given voice to the conspiracy theorists of the uber-right. The people who see threat at every turn. The kind of people that end up on the "news of the weird" segments. Some of this characterization is unfair. I'm just as distrustful of the government as the next person. I think it's our job as citizens to question the motives and actions of our government. I think it's our right to dissent but these types of statements do no favor to opposition to the current environment.

This is the one that bugs me the most. Ever since Barack Obama became president, there have been cries of the United States becoming a socialist government. First off, there is simply no way short of a rewrite of the Constitution that we could ever become a truly Socialist form of government. Can Congress pass laws that are socialist in nature? Sure but we also have a separation of powers. Any one of those laws could and can be challenged in the court system to be overturned. "Socialism" has become a bogeyman word. Just like the boy who cried wolf, eventually the word loses its power and no one listens. I firmly believe that there are members of our Congress, and even our own President, who would love to move control of much of our economy into the hands of the government but that doesn't mean that we're becoming a socialist country or that the Constitution is immediately invalid. What SHOULD be focused on is the overreaching interpretation of the Commerce Clause that Congress likes to abuse and, as stated during the Sotomayor hearings, depends on to accomplish its goals - both sides.

Healthcare debate

Up until the school speech fiasco, this was the biggest "cry wolf" to date. It started before there were any bills even up for vote but most of the kerfuffle was related to HR3200. I'm not going to recount everything on FactCheck. The work has already been done. The point to make is that instead of focusing on what the bill actually says, the wolf-criers are extrapolating specific provisions with conditional language into, in some cases, wild flights of fancy and worst case scenarios. See the following examples:

Government deciding treatments
The actual language of the bill simply sets up a private-public advisory committee that makes recommendations on minimum benefits. Somehow this got extrapolated into the committee actively deciding what treatment each person would get. Nothing in the bill actually gives this panel that power. All they would do is make recommendations as to what minimum coverage should be allowed. First off, this is no different than current insurance companies. Secondly, while one could argue that any recommendation that this panel makes would most likely end up as the baseline coverage, that's still not the same thing as the government deciding your treatments in any given situation.

Health care rationing
Nothing in the bill actually says anything about rationing. Again, the wolf-criers are taking things to extremes. They're mixing up what is happening in other countries with language in the bill stating out of pocket expenses will be capped for individuals and families. While it's fair to consider how healthcare is performing in other countries and what changes have been made in those plans, they're not indicative of anything specific in HR3200 or the concept of government-run healthcare in general.

Those are just a few examples. The current cry-wolf rhetoric on healthcare is a mixture of misinterpretations, worst-case scenarios and comparisons to other countries. By giving voice to these types of statements, the actual issue is being buried under easily debunked conspiracy theories. Instead, the discussion should be on the cost of implementing the bill (both short and long term) and if it's even a valid role of the federal government.

School speech
So we come to the school speech. Anyone with half a brain knew that this would end poorly. Neal Boortz summed it up pretty well but I wanted to mention it. It became the ultimate example of crying wolf. From the minute it was announced, conspiracy theorists were on top of it. There were actually TWO parts to the speech but they ended up getting lumped together into one big "OMFG". The first part was a set of recommended lesson plans sent out by the Education Department to schools in preparation. Most of it was innocuous in nature. There were a few parts that caught my attention but the biggest one was the line (paraphrasing) "Discuss with students why it's important that we listen to elected leaders like the President, congress, mayors.....". Depending on the interpretation of "important that we listen", one could assume that the lesson plans are stating that we should just do what they say. It's the difference between me telling my son "you need to listen to me right now" (i.e. do what I say) vs. "listen closely to what the President is saying" (i.e. pay attention because it's important to understand what politicians are scheming). I can agree with the second one but not the first.

Then there's the speech itself. Did anyone HONESTLY think that he would use the speech as an "indoctrination" speech? Seriously? Our President is arrogant but not stupid. Even IF he had originally planned to do that, instead of giving him enough rope to hang himself the wolf-criers gave him plenty of time to rewrite it and make them look stupid. Which he did.

And while we're on the subject of "indoctrination". If you send your children to a public (I know Boortz likes "government") school, you are conceding that they will be taught by the government. How hard is that to understand? If you don't like it, homeschool or send them to a private school. Yes, it's totally unfair that you still have to pay taxes to support those schools when you don't have children there but that's a whole other issue.

Use of the word radical
While I'm on the issue of words, I'm also pretty fed up with constant usage of the word "radical" to describe the policies of the current administration. It's bass-ackwards.

2. Departing markedly from the usual or customary; extreme: radical opinions on education.
3. Favoring or effecting fundamental or revolutionary changes in current practices, conditions, or institutions: radical political views.
If you look at the policies of this administration in light of the rest of the world, technically our EXISTING system is radical. The administration's policies are more the norm for the rest of the world.

That's really all I have to say on the issue. True political discourse has been given over to the extremes on both sides. It's sad for those of us who want to address the real issues.

Wednesday, September 2, 2009

check_rdp request

So I got a reply to my tweet about a free nagios plugin. I was excited at first until I started delving into the whole thing.

The request came from @cixelsyd:

@lusis random #nagios plugin suggestion: check_rdp verifies port, handshake, auth
Okay, I thought. Sounds interesting. Let's give it a shot.

So I did the first thing which was to see if I could find an existing module for any of the scripting languages I know. It was a long shot and it came up empty.

Not a biggie. Let's see what we can find out about the RDP protocol. Maybe I can knock something together....

About 2 hours later I was done reading various Technet entries. RDP is pretty convoluted and has only gotten more so as Microsoft iterates over the various versions.

So I decide to find some code I could attempt to read through. Of course rdesktop was my destination.

After spending the last hour or so navigating the rdesktop source, I'm not quite sure if it's possible to even do a headless RDP client. Mind you my C is very limited.

My first attempt was to simply short-circuit the rdesktop client after handling the authentication. Each attempt kept leading me to various X-related code. That's a whole other beast that I'm just not remotely competent enough to learn. I compiled it with debugging and used a Vista machine at home as the test server. Each and every time, it wanted to do some sort of screen rendering.

I'm going to spend some more time on it tonight including reading the source for a few other client implementations. Unfortunately, rdesktop is the only one I know of that supports RDP5. At least one that I have access to the source code for.

My thinking is that, if I can't totally remove the need for client rendering capabilities, I can somehow fool rdesktop into using a null framebuffer of some kind or faking the capabilities of the client side display. All I need to do is authenticate but I can't really tell if the client window has to be available even if all the credentials are present in the PDU since there's a Basic Settings Exchange before the Security state is even reached.

Another option that I can't test at work but can at home, is to see how the rdp2vnc code works. It might be possible to bring that session up only to tear it down.

The really annoying part is that this would still not be a very efficient plugin unless I proceeded to somehow implement JUST the process up to the Security Exchange.
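
Short of that, the first two thirds of the request (port and handshake) can be covered without any rendering at all by speaking only the very first step of the protocol: send a bare X.224 Connection Request over TCP and see whether a Connection Confirm comes back. The sketch below does just that and nothing more. It does not attempt authentication, it assumes the default port 3389, and the byte layout is the minimal request with no negotiation data, so treat it as a starting point and not a finished check_rdp.

```ruby
require 'socket'
require 'timeout'

# TPKT header (version 3, total length 11) followed by a bare X.224
# Connection Request TPDU: length 6, code 0xE0, zeroed references, class 0.
X224_CONNECTION_REQUEST = [0x03, 0x00, 0x00, 0x0b,
                           0x06, 0xe0, 0x00, 0x00, 0x00, 0x00, 0x00].pack('C*')

# True when the reply is a TPKT packet carrying an X.224 Connection Confirm (0xD0).
def connection_confirm?(reply)
  bytes = reply.to_s.bytes
  bytes.length >= 7 && bytes[0] == 0x03 && (bytes[5] & 0xf0) == 0xd0
end

# Returns [exit_code, message] using the Nagios convention 0 = OK, 2 = CRITICAL.
def check_rdp(host, port = 3389, timeout = 5)
  reply = Timeout.timeout(timeout) do
    TCPSocket.open(host, port) do |sock|
      sock.write(X224_CONNECTION_REQUEST)
      sock.readpartial(64)
    end
  end
  if connection_confirm?(reply)
    [0, "RDP OK - #{host}:#{port} answered the X.224 handshake"]
  else
    [2, "RDP CRITICAL - #{host}:#{port} sent an unexpected reply"]
  end
rescue Timeout::Error, SystemCallError, SocketError, EOFError => e
  [2, "RDP CRITICAL - #{host}:#{port} #{e.class}"]
end

# Example (hypothetical host):
#   code, message = check_rdp('rdp.example.com')
#   puts message
#   exit code
```

This stops well before the Basic Settings Exchange, so it says nothing about whether a given set of credentials would work.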


Friday, August 28, 2009

I has a sad.

Yes, the title is stupid lolcat speak but I think it sums up how I felt this afternoon on the way home.

For those who don't live in Atlanta there's a local radio station that for the past 9 years has been donating airtime and personalities (as well as large amounts of cash from the radio personalities themselves) to a local non-profit. That non-profit is the Aflac Cancer and Blood Disorders Service at Children's Healthcare of Atlanta. CHOA is a worthy non-profit in its own right but the Aflac Cancer Center is a special kind of place. The goal is to get to a 100% cure rate. They're at 80% or so right now.

So every year for two days, 750AM WSB Radio in Atlanta has this Care-a-thon to help raise money. I think last year they raised over 1 million dollars and I haven't heard the final numbers for this year.

Michelle and I never gave to the Care-a-thon until we got pregnant with Gus. At that point, we figured it was "good karma" and considered it jokingly like an MSA should, god forbid, we ever need it. Luckily we haven't but we have taken Gus to the emergency room at one of the CHOA facilities before. These people love children. They know kids. They know how to take care of kids. They are simply wonderful.

Anyway, during this Care-a-thon they often interview the various staff members at the Center. A doctor here. A nurse there. The guy who runs the place. This year they interviewed a woman whose title I can't remember but her job was essentially to partner with the children as they went through chemo and other procedures at the hospital. She helped them understand what each procedure was going to do, what their cancer was...pretty much everything during the time the child was at the Center.

Today, however, she read a letter she got from a parent of a child who sadly passed away in July. That's when I realized that not all of her work was the good kind. I hadn't really thought about the fact that there's this 10% that don't survive. She talked about how she had to tell this girl that the chemo wasn't working and that she was going to die.

I sort of froze for a moment. I couldn't stay that way for too long because I was sitting in traffic on the way home. Then my mind started to wander. This has happened entirely too much for both Michelle and me since Gus was born. I understand it's fairly normal for new parents. Most of these wanderings are absurd, one-in-a-million kinds of situations.

I'm going to tell you how my mind wandered because in the sadness of my mind's eye, I came to realize exactly how much I loved my son.

In my mind, I am with my son as he is now - 13 months old. We've gotten news that there's nothing else that can be done for him and he's going to die.

That got me choked up. I mean literally I choked and coughed back the tears.

I have no idea what a child of my son's age would have to go through for cancer treatment but I imagined him as he is now. Playful. Happy. I imagined those last minutes I would have with him.

Would he be playing and get tired and simply lay down? Would he be in pain? What could they do for him?

For me, those last minutes went as I remember the best times with him right now. Sleeping on my shoulder in the glider in his room. I love those times. I'd like to think he likes them too. It's our thing. Our bonding.

In my mind Gus was sleeping with his head on my shoulder as he does every night, I can hear him with his little snore. I'm rocking in the glider and he's sleeping. Then the snoring just stops. No more breathing and it's over. I keep rocking him and I'm sad but at least we had the time we did together.

Now mind you at this point I'm trying desperately to NOT run off the road from the tears in my eyes. I'm about to vomit from trying to hold this all in. I'm cursing the traffic. All I want to do is get home to my wife and son and hold them both.

When I finally get home, I step around the corner to the kitchen and Gus sees me. He has a huge grin and comes running to me as only a 13 month old who's been walking solidly for all of a month and a half can. It's more of a waddle.

He throws his arms up in the air begging me to pick him up. I do.

He lays his head on my shoulder and grips my neck in the way that he does when he wants to love on me.

I burst into tears. Michelle bursts into tears. Gus senses the mood and continues to hug me. He doesn't cry. He occasionally pulls back to look at me and make sure I'm okay. We stay like this - all three of us - for 5 minutes. I tell my story to Michelle a bit later and we have another good cry.

We talked about it tonight after Gus went down. It was cathartic. It felt good to get out all of these unspoken fears about our son. About his future. About what we would do in the gravest of situations. I think it broke through a mental wall that we both had with these "terrors" about something happening to him.

We both love our son so much. In the moment that I was hugging him so tightly, I couldn't feel any more love than I did for him and my wife right then.

That's love, folks.

Thursday, August 27, 2009

Free nagios plugins! Act now!

Okay that's a silly title but that's what I'm doing.

I need to keep my skills up. I'm taking "commissions" for any Nagios plugins people might want written. I'll try and keep the dependencies to a minimum. Language will be anything from Perl to Ruby or possibly a Bash script.

What I need from you:
- What service you want monitored.
- What information from the service you want monitored.
- If it's a commercial application, I'll need a trial version that works on a platform I have access to.

So if you want a plugin for monitoring MyCommercialERP package that only runs on AIX, I'm probably not going to be able to do it. I have an old RS6000 here and that's about it. I would gladly let you give me a Power system though if you wanted ;)

However if you use MyOpensourceERP that runs on Linux then there's a good chance I can come up with something.

Nagios Downtime Scheduler in Ruby

It's been a while since I worked with anything in Ruby. I was getting into the groove when the whole DDS/MO layoffs happened. Since I got right back on my feet, I really didn't have any time to dedicate to it.

So now 6+ months later, I had an opportunity to get back in the groove. It didn't go as smoothly as I would have liked. I had to look up stuff that I knew by heart previously. That's just like any skill though. It atrophies with disuse.

In this case, none of the recurring downtime schedulers for Nagios were really cutting it. One of them had you add the downtime directly to the script. Others didn't work with Nagios 3. After a night of false alarms that escalated to my boss' boss, I decided to fix the damn problem.

So here's the first iteration of my Ruby downtime scheduling script for Nagios 3. It's very rough. There's fuckall for input validation, for instance. It doesn't support anything more than daily recurrence. I'm going to be cleaning it up but I'm pretty happy with how it's shaping up. It has a setup mode and you can interactively add downtimes (as long as you don't fuck up the input - heh)
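For context, a downtime scheduler for Nagios 3 ultimately comes down to writing SCHEDULE_HOST_DOWNTIME (or SCHEDULE_SVC_DOWNTIME) lines to the Nagios external command file. The shell sketch below is not the Ruby script; it only shows the shape of one such line. The function name `downtime_command`, the host `dbs02`, and the command file path in the usage comment are assumptions for illustration.

```shell
# Build one SCHEDULE_HOST_DOWNTIME external command line.
# Arguments: now start end host author comment (times are epoch seconds).
# downtime_command is a hypothetical helper, not part of Nagios.
downtime_command() {
    now=$1; start=$2; end=$3; host=$4; author=$5; comment=$6
    duration=$((end - start))
    # fixed=1: the downtime starts and ends exactly at the given times.
    # trigger_id=0: not triggered by another downtime entry.
    printf '[%s] SCHEDULE_HOST_DOWNTIME;%s;%s;%s;1;0;%s;%s;%s\n' \
        "$now" "$host" "$start" "$end" "$duration" "$author" "$comment"
}

# Usage (the command file location depends on your install):
#   downtime_command "$(date +%s)" 1251424800 1251428400 dbs02 jvincent "nightly backup" \
#       >> /usr/local/nagios/var/rw/nagios.cmd
```

Passing the current time in as an argument keeps the function easy to test. A daily recurrence would simply call it once per day with that day's start and end.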


It's not yet approved on exchange.nagios.org. I'm looking for any feedback or feature requests. I want to keep it at its core a downtime scheduler and nothing more but I'd like to revisit my ruby Nagios config parser for the validation.

Like I said, it's been a while since I've done anything in Ruby so I know there are some major 'wtf' moments in there. Be gentle.

Thursday, June 25, 2009

Nagios and Websphere

I saw something interesting pop up on twitter:


that made me think of my websphere monitoring work:


Admittedly, mine isn't actually a Nagios plugin but I could pretty easily convert it to one. The logic is all there. Maybe I'll download a trial version of WAS6 (mine was written against V5) and see if I can make it an actual plugin.

More Adobe AIR stuff

So I recently wiped my Dell XPS M1710 laptop. I had purchased another 2GB of memory (bringing it to 4GB) and a 500GB 2.5" sata drive to replace the shitty 100GB that came with it.

I decided to go 64-bit Jaunty to really take full advantage of the memory. I haven't run a 64-bit desktop Linux since I ran Gentoo. I remember a whole host of problems then with chroot and such but figured things were better now.

Indeed they are except for a few closed source programs. One of those happens to be Adobe AIR.

Adobe provides some nice tips on their website for 64-bit linux installs but most of those were deprecated for Jaunty. TweetDeck would fire up but not let me click anything, similar to the problem I had before with gnome-keyring vs. kwallet.

A little research later led me to this post:


Essentially, getlibs makes all the problems I had before a non-issue. I was able to grab the 32-bit version of gnome-keyring and have it be a part of the package system (so no stray files) and start TweetDeck.

I'm still getting some cocked-up error about libcanberra but that appears to be a known issue:


Friday, June 12, 2009

Updates to mhs

So I did some updates to MHS (the MySQL health daemon I wrote for work). I was reimplementing our bash scripting framework in Perl to provide the same wrapper functionality we have in our shell scripts.

It's not as transparent but it logs script stop and start. This one is pretty useless outside of our shop because it's now implementing our own Perl modules:


I also stopped using Proc::Daemon because of some logging issues I was seeing. No matter where in the startup sequence I called Proc::Daemon::Init(), logging would stop. I even changed Log::Log4perl to use init_and_watch but nothing worked. I ended up having to not only change to App::Daemon just to get the tty detach (we start mhs from an init script) but also stop using Log::Dispatch::FileRotate because it doesn't honor setting ownership of log files.

As soon as I have time to sanitize it, I'll post a new copy on dev.lusis.org.

Monday, June 8, 2009

Tinis for Tatas

I wanted to remind everyone who might possibly read this and lives in Atlanta that Crepe Revolution is hosting Tinis for Tatas tomorrow night:


This is a pretty big deal as one of our dinner club friends is a survivor, as is one of my guild mates in World of Warcraft. If you're in the Atlanta area, please try and come.

Thursday, June 4, 2009

MyISAM vs. InnoDB via Twitter

So I saw the following tweet come across TweetDeck (using the search functionality as a custom feed no less):

ok #mysql / #database geeks on twitter (twittabase tweeks?) .. which is better: myisam or innodb, and where and why.
from @narendranag

140 characters is nowhere near enough space to answer that question. I'm going to put my thoughts here from the perspective of someone who's had to deal with large databases (500+ GB) in both MySQL and PostgreSQL doing both OLTP and OLAP work. Here's probably the best summation:

MyISAM - Fast

InnoDB - Reliable

That's not to say that MyISAM can't be reliable or that InnoDB can't be fast but that's the best way to look at it. But you can't choose a table type based on those two criteria alone. Even if you don't need full ACID compliance, there are reasons you might still want to use InnoDB over MyISAM. Hell, MEMORY is faster than MyISAM but you don't want to use it for all your tables. Oftentimes, the right answer is somewhere in the middle.

I tend to default to using InnoDB across the board. It's always defined as my default table type in my.cnf. I do this because often times the developers are coding around the assumption that the database supports things like foreign keys, transactions and the like. Admittedly, this is often hidden behind the ORM they use such as Hibernate but it's still important.

But what about speed? Why is InnoDB "slower" than MyISAM? It's basically because it's doing "more" than MyISAM. It's managing key constraints, logging transactions and all the other things that make it ACID compliant. Much of that "slowness", however, can be mitigated by getting away from the default InnoDB configurations and profiling the system over time to size buffer pools and the like:

  • Don't use a single tablespace file for InnoDB (innodb_file_per_table). The global tablespace will still be used for some things but keeping all of your table data in it is much slower than using a tablespace per table. This also gets you the benefit of being able to recover disk space after dropping an InnoDB table. If InnoDB is using the global tablespace file, the ONLY way to recover that space from dropping an InnoDB table or whole schema of InnoDB tables is to fully back up ALL schemas, remove the ibdata/log files, restart the database and then reload from backup. Not pretty.
  • Use the smallest possible primary key where you can. At one company, we were using GUIDs for the primary key. InnoDB stores the primary key value in every secondary index entry, so a wide key costs you again in each index, and comparing GUIDs took more work than if we had used ints. Not only were our primary key indexes unnecessarily large, every subsequent index was as well. This wasn't as big a deal on columns with smaller datatypes but indexes on columns of datatypes like varchar and lob were pretty ugly.
  • Consider using a larger log file size. This has a trade-off in recovery time though. Your call there.
  • Get fancy with disks. If you have multiple volumes of different RAID types, you can not only tell InnoDB to put the global tablespace and log files on a different path but you can also "move" the database to a different volume as well. This involves creating the database, shutting MySQL down, moving the database directory from the MySQL data directory to a new location and then symlinking it. Until MySQL or InnoDB gets me the ability to define where I want a given tablespace, this is the next best thing.
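
The directory move in the last bullet can be sketched in shell roughly like this. It assumes mysqld is already stopped, and the function name `relocate_schema` and every path in the usage comment are hypothetical. Ownership and permissions on the new volume are not handled, so treat it as an outline rather than a tested procedure for your install.

```shell
# Move one schema directory out of the MySQL data directory and leave a
# symlink behind. Run only while mysqld is shut down.
# relocate_schema is a hypothetical helper name.
relocate_schema() {
    datadir=$1; schema=$2; newroot=$3
    if [ -L "$datadir/$schema" ]; then
        echo "$schema is already a symlink" >&2
        return 1
    fi
    if [ ! -d "$datadir/$schema" ]; then
        echo "no such schema directory: $schema" >&2
        return 1
    fi
    # Move the directory first, then point the old location at the new one.
    mv "$datadir/$schema" "$newroot/$schema" || return 1
    ln -s "$newroot/$schema" "$datadir/$schema"
}

# Usage (paths are examples only):
#   relocate_schema /var/lib/mysql reports /mnt/raid10
```

Refusing to run when the schema is already a symlink keeps a second invocation from moving the link itself.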

One area where InnoDB is faster than MyISAM natively is in concurrent CRUD operations. That's because InnoDB uses row-level locking. I'm not as clear on the specifics of the locking as I am with say DB2 (DB2 will actually lock the row before and after the one you're modifying) but it's better on multiple concurrent operations than table level locking.

So when would you want to use MyISAM then? One area we really found that using MyISAM made sense was on non-relational tables within a schema that normally had InnoDB tables. In one case, we had a table that had two columns - an id and a blob. That table was MyISAM. Conceivably anywhere you are denormalizing the data quite a bit, it can make sense to use a MyISAM table especially if it's a frequently rebuilt table (like a summary table updated from a cron job). Of course we've also used MEMORY tables for that same purpose. Just be careful how you intermix these in the code (the aforementioned Java transactions for instance).

So here's my recommendation:
OLTP tables - InnoDB with a few caveats
OLAP tables - MyISAM with a few caveats

Wednesday, June 3, 2009

Adobe AIR apps and Linux - Tweetdeck particularly

So I ran into an interesting issue yesterday. I decided to give TweetDeck a shot. I wanted to get hashtagged search results as a feed. TweetDeck can do it. Twhirl can't.

So I make sure I have the latest TweetDeck and fire it up.

Er...what's this black window I see? I can't click on anything. The main canvas is grey and the only thing I get is tooltips on mouseover of the buttons. Nothing works.

So I do some research and find out that this appears to be a known problem. No one has been able to pin down exactly what's going on. Some people were mentioning that starting kwallet solved the problems while others said that it was gnome-keyring. My brain started churning.

I started to think back to when I installed AIR on my desktop and when I installed TweetDeck. It was right after I left MediaOcean. I was setting up my desktop at home to be more like the setup I had at work so I could stay in the same mindset while looking for a job. That's when it clicked.

I installed the AIR runtime while I was running KDE. One thing AIR boasts under Linux is integration with either kwallet or gnome-keyring. I wonder if maybe it "locks" that choice in place during install. Well I run through a few quick tests which involve installing and uninstalling AIR and removing some dotfiles and directories where settings are stored. Nothing seems to work.

Here's what finally worked:
  • Drop to a shell.
  • Remove the package that provides kwalletd

jvincent@jvxps:~$ dpkg-query -S /usr/bin/kwalletd
kdebase-runtime-bin-kde4: /usr/bin/kwalletd
jvincent@jvxps:~$ sudo apt-get remove kdebase-runtime-bin-kde4
[sudo] password for jvincent:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
libclucene0ldbl apache2-utils libxine1-x librdf0 kdebase-runtime-data-common php5 libxine1-misc-plugins kdelibs5 libxcb-xv0 kde-icons-oxygen libxine1-bin libexiv2-5 librasqal1 apache2-mpm-prefork libsoprano4 redland-utils
apache2.2-common kdelibs5-data cvs libxcb-shape0 phonon-backend-gstreamer libstreamanalyzer0 libphonon4 libsvnqt5 kdelibs-bin libstreams0 exiv2 php5-mysql libapache2-mod-php5 libxcb-shm0 soprano-daemon kdebase-runtime-data
php5-common libxine1-console libxine1
Use 'apt-get autoremove' to remove them.
The following packages will be REMOVED:
cervisia cvsservice kdebase-runtime kdebase-runtime-bin-kde4 kdesvn kdesvn-kio-plugins khelpcenter4 kompare
0 upgraded, 0 newly installed, 8 to remove and 30 not upgraded.
After this operation, 17.8MB disk space will be freed.
Do you want to continue [Y/n]?

  • One thing to watch carefully is the list of packages that will be removed. I did this on my desktop and it wanted to remove Amarok. I figured I could reinstall Amarok if I needed to after the test. Make note of any other packages you want to reinstall.
  • Once that was done, I uninstalled AIR and all the apps I had installed (in this case, TweetDeck and Twhirl).
  • Back up and remove the .appdata directory from your home directory. I'm not sure if this step is absolutely required but I did it anyway.

After that, I reinstalled the AIR runtime and TweetDeck. TweetDeck was working!

So what's the dealie yo? Well it appears there are two issues:

  • Whichever keychain you are using when you install AIR (gnome-keyring for gnome and kwallet for kde) becomes the default keyring for the life of the AIR installation. I realized later that when I switched back to Gnome, Twhirl was always asking for my password even when I told it to save it. Now I know why.
  • It appears that Kwallet is the DEFAULT keychain used if you have it installed. That's why I had to fully uninstall KDE just to install AIR. I run under Gnome again. I don't use any KDE apps other than Amarok. Kwallet may not always be running.

I have yet to reinstall Amarok so I don't know what will happen once kwallet is available but it appears to me that Adobe needs to fix this behaviour. Maybe give the user an option to choose which vault will be used at install time or possibly someone can create an application that can switch the vault from kde to gnome or vice versa.

New Nagios Exchange online

Got a tweet from @nagioscommunity yesterday:
New blog post: Launch of the new Nagios Exchange ! http://bit.ly/1aS44U
Went ahead and moved the stuff that was hosted at monitoringexchange (née nagiosexchange) over to the new spot.

Tuesday, June 2, 2009

Fun with Foundry - Load balancing MySQL

I've worked with quite a few load balancers over the years from Coyote Point and Cisco to Citrix. One I've never worked with is Foundry.

So I have a task I'm working on: balance our read-only MySQL slaves behind the Foundry ServerIron 4G. Fun stuff and it gets me in front of the units which I've been itching to do.

As with any load balancer, simple balancing is pretty easy. Define multiple real servers that listen on a given port. Define a virtual server that consists of those real servers. Profit.

However this is what I'll call "dumb" load balancing. There's no actual intelligence to it. Most of those checks are simple Layer 4 checks. Is the real server listening on the port I defined? Yep.

Most load balancers provide something a bit higher up the OSI model for testing if a "real" server is alive and able to service connections. The most popular example is an HTTP check. The load balancer requests a page from the web server. This is much more intelligent because we know if our application stack is working all the way. You could define a custom controller to handle a given url and do something as simple as serve up static content with a string that the load balancer matches against or get complex and have it do some database queries to go whole hog.

But these are predefined health checks in the load balancers. What about something like a MySQL slave database? We have a small issue with replication lag. We've coded around most cases where it might be an issue but there are still cases where we need to deal with it transparently. Additionally, we want to enforce some business logic in it. We've gone beyond what most load balancers can do. Short of installing MySQL Proxy or some other external solution, we're pretty limited.

So here's what I ended up doing but first a few "facts":

  1. Foundry has support for custom health checks. These are somewhat limited but you can do things like the aforementioned HTTP GET/POST checks or you can do something akin to Expect scripting.
  2. We have two read-only MySQL slaves and one Master server.
  3. The code has two "connection pools" (it's PHP so not really but I'm using that term anyway). The first is a DSN for writes going to the master as well as immediate reads based on that write. The second is for general SELECT statements and other read operations.
  4. We do have a memcached layer in there but it's irrelevant to this discussion.
  5. One of the slaves goes offline each night for backups.
  6. Sending all traffic to the Master is NOT an option except in EXTREME situations.
  7. We will have replication lag spikes on the order of 20 seconds every so often due to batch operations running against the database (building summary tables and the like). These queries are as optimized as they're going to get but 5.0 statement based replication is the reason things lag.
  8. Upgrading to 5.1 and row-based replication is NOT an option at present.

So there you have the parameters I have to work with. The first step was finding out how flexible the Foundry was at custom health checks. The way Foundry works is you bind ports to real servers. Optionally, a port definition can take the name of a health check as the determining factor. Look at this example:

server real server01
port dns
port dns keepalive
port dns addr_query "server01.mydomain.com"

That sets up a real server providing DNS as a service. DNS is a known service so it has its own set of rules. We tell it that we want to do a lookup on server01.mydomain.com to determine if DNS is working. Here's an example for a web server:

server real searchserver01
port 8080
port 8080 keepalive
port 8080 url "HEAD /solr/solrselect/admin/ping"

We're connecting to a tomcat instance and pulling up a static page inside the container to determine if it's working properly.

Now take a look at this example:

server real dbs02
port 3306
port 3306 healthck dbs02mhs
port 3306 keepalive

This is saying that for port 3306 on the server, I want to use a health check called dbs02mhs. This is a custom health check that I've defined for this purpose. So what's in dbs02mhs?

healthck dbs02mhs tcp
port 10001
port 10001 content-check mhs

We're connecting to port 10001 on the server's IP and performing a content check called mhs. Additionally, we're saying that this is a layer 7 check only. Here's the contents of the mhs content check:

http match-list mhs
up simple 0
down simple 1

Ignore the http part for a minute. It's a bit misleading. We're not actually checking via an http call. What this match list says is if I get a 0 as my response, the server is up. If I get a 1, the server is down. By binding it to health check above and subsequently to the real server, we're saying this:

"Connect to port 10001 on the server's IP. If you get a 0 back, everything is good. If you get a 1 back, things are not good. In fact, to determine if port 3306 is available on this IP, I want you to do the check this way."

Interesting, no? Probably not. So what's listening on port 10001 on the database server? MHS.

MHS is a little Perl daemon I wrote that encapsulates the more fine-grained logic we need in determining if the slave database should be handling queries. I'm going to post the code for mhs. I warn you now, this is a hack. It needs to be cleaned up and have some code style enforced. I'm actually working on a set of internal Perl modules to move much of this stuff out. We already have a bash version of what I'm calling our "scripting framework".

---> MHS

As I said, this was more of a P.o.C. Now that I know it works, I can clean it up. Basically, MHS does three things currently:

  • Check if the database server is alive. We do this by doing a "SELECT 0" from the database.
  • Check if there is a backup in progress. If so, this slave will be lagged and so we don't want to use it for reads.
  • Check if replication is lagging more than 60 seconds. If it is, let's not use it for now.
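
The three checks collapse into a single 0 or 1, which is what the match list on the Foundry looks for. Here is a minimal shell sketch of just that decision. It is not the Perl daemon itself: the function name `mhs_verdict` and its arguments are made up for illustration, and treating a NULL lag as "down" is an assumption on top of the rules above.

```shell
# Print the value the load balancer matches on: 0 = use this slave, 1 = skip it.
# mhs_verdict is a hypothetical name; the inputs are assumed to be gathered elsewhere.
#   alive          1 if "SELECT 0" succeeded, anything else if it did not
#   backup_locked  value of the locked column for the backup row in job_locks
#   lag            replication lag in seconds (NULL when replication is stopped)
#   max_lag        optional lag ceiling in seconds, 60 by default
mhs_verdict() {
    alive=$1; backup_locked=$2; lag=$3; max_lag=${4:-60}
    [ "$alive" = "1" ] || { echo 1; return; }
    [ "$backup_locked" = "0" ] || { echo 1; return; }
    case $lag in
        ''|*[!0-9]*) echo 1; return ;;   # NULL or garbage: assume unusable
    esac
    if [ "$lag" -gt "$max_lag" ]; then
        echo 1
    else
        echo 0
    fi
}
```

Checking liveness first means a dead database never reaches the lag comparison, which mirrors the order of the list above.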

As I said, this is pretty specific to us. Every MySQL instance we have has a database called sysops. This is a non-replicated local database. There's also a sysops database on a central backend MySQL instance that we use for storing scripting execution details and job logging but on every other database server, there's currently a single table - job_locks. Our shell scripting framework (and soon the Perl framework) has a method/function for locking the database so to speak. In this case, our backup job, in addition to writing information about execution time and whatnot to our master sysops database, also writes a lock to the server that is being backed up. The job_locks table currently has one row:

+--------+--------+
| locked | name   |
+--------+--------+
|      0 | backup |
+--------+--------+
1 row in set (0.00 sec)

The reason for storing this table directly on the server is that our sysops database instance is not redundant and doesn't need to be. It's for storing one-off databases. If it goes offline, we can't have our checks failing. By putting a table on the actual server being backed up, we can self-contain the check. The daemon runs on the same box as mysql and the tables it checks are on that server.
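
As a rough illustration of the lock itself, setting and clearing it is a one-row update. The helper below only prints the SQL. `job_lock_sql` is a hypothetical name, not a function from the framework, and it assumes the lock name is a trusted constant since nothing is escaped.

```shell
# Print the SQL that sets or clears a row in sysops.job_locks.
# job_lock_sql is a hypothetical helper; it does not talk to MySQL itself.
job_lock_sql() {
    name=$1; locked=$2
    case $locked in
        0|1) ;;
        *) echo "locked must be 0 or 1" >&2; return 1 ;;
    esac
    # No escaping is done on the name, so only pass fixed strings.
    printf "UPDATE sysops.job_locks SET locked = %s WHERE name = '%s';\n" "$locked" "$name"
}

# Usage (host name is an example only):
#   job_lock_sql backup 1 | mysql -h dbs02    # before the backup starts
#   job_lock_sql backup 0 | mysql -h dbs02    # after it finishes
```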

One thing I'm running into with the Foundry is needing to set up what they call a boolean check. You can use operators such as AND and OR as well as group multiple checks together.

My ruleset needs to basically say this:

If 3306 Layer 4 is true and mhs Layer 7 is true, server is good.
If 3306 Layer 4 is true and mhs Layer 4 is false, server is good.

The reasoning is that we don't want to fail out a slave if the Perl daemon crashes. We want to make the default assumption that the slave is good and current unless explicitly told otherwise, either because it isn't listening on port 3306 or because the mhs service says it's bad.
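
Spelled out as plain logic, the rule set looks like the sketch below. This is shell, not Foundry syntax, and `slave_in_rotation` is a made-up name; the three inputs are assumed to come from the Layer 4 and Layer 7 probes described above.

```shell
# Return 0 (keep the slave in rotation) or 1 (take it out).
# slave_in_rotation is a hypothetical name used only for this sketch.
#   port_3306_up   1 if the Layer 4 check on 3306 succeeded
#   mhs_reachable  1 if the Layer 4 connect to the mhs port succeeded
#   mhs_answer     what mhs returned (0 = good, 1 = bad); ignored if unreachable
slave_in_rotation() {
    port_3306_up=$1; mhs_reachable=$2; mhs_answer=$3
    [ "$port_3306_up" = "1" ] || return 1    # MySQL itself is not listening
    [ "$mhs_reachable" = "1" ] || return 0   # daemon is down: fail open
    [ "$mhs_answer" = "0" ]                  # otherwise trust what mhs says
}
```

The middle line is the whole point: a crashed daemon leaves the slave in rotation instead of pulling it.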

I don't THINK I can do that but if there are any Foundry experts who read this post, please let me know.

Thursday, May 14, 2009

Annoyances with OpenNMS

So my company has an existing install of OpenNMS and Cacti. I don't like to barrel through the gate and make changes. I'm trying to work within the system here. So I come from a Nagios world. If I need something monitored, I write a script for it. Nagios handles it. My status can be up, down, warning, critical or unknown.

In Nagios, I do the following to add a new service to a host:
1) Define a check command.
2) Add a service to the host using that check command
3) Profit

About the check command:
Most likely, one is already defined but in the interest of fairness I'm going with a brand new check. Let's use the one I'm beating my head against in OpenNMS - MySQL Replication status. So I have a script written. It runs some queries and gives me replication status. Additionally, I'm grabbing replication latency. I use this information to determine a more "fine grained" status. Being up or down is pointless if my slaves are 4 hours behind the master.

So I make my script take a few parameters - A warning level, a critical level and the hostname. So I have the following bits of configuration information:

define command {
    command_name check_mysql_replication
    command_line $USER99$/check_mysql_replication -h $HOSTADDRESS$ -u $USER1$ -p $USER2$ -w $ARG1$ -c $ARG2$
}

What the above gives me is the most flexible way to monitor replication on a system. Each hostgroup or host can have different levels for warning or critical. I could even change the USER macros into ARG so that I can pass the credentials. Oh and that script can be written in any language I like.
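
To make the warning/critical idea concrete, here is a minimal shell sketch of the threshold logic such a plugin could use. It is not the actual check_mysql_replication script: the function name `check_lag` is hypothetical, and it assumes the lag in seconds has already been pulled from the slave. The return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the `label=value;warn;crit` perfdata after the pipe follow the standard Nagios plugin conventions.

```shell
# Classify a replication lag against warning and critical thresholds.
# check_lag is a hypothetical name; warn and crit are assumed to be integers.
check_lag() {
    lag=$1; warn=$2; crit=$3
    case $lag in
        ''|*[!0-9]*)
            # A stopped slave reports NULL, so there is no number to compare.
            echo "REPLICATION UNKNOWN - could not read lag"
            return 3
            ;;
    esac
    if [ "$lag" -ge "$crit" ]; then
        state=CRITICAL; code=2
    elif [ "$lag" -ge "$warn" ]; then
        state=WARNING; code=1
    else
        state=OK; code=0
    fi
    # Text before the pipe is the status line, text after it is perfdata.
    echo "REPLICATION $state - ${lag}s behind master | lag=${lag}s;${warn};${crit}"
    return "$code"
}
```

A real plugin would end with `exit` on that code so Nagios sees it as the process exit status.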

Then I add the service to my host/hostgroup. Let's assume that all my slaves are in a hostgroup together:

define service {
    use production-service
    hostgroup_name mysql-slaves
    service_description MySQL Replication Status
    check_command check_mysql_replication!200!400
}

There. Now replication is being monitored for that group of hosts. I get performance data. I get alerts based on warning vs. critical. All done.

EVIDENTLY, in OpenNMS I simply can't do this. To be fair, there's a different set of eyes that should be used with OpenNMS. It's a network monitoring system. Network. Network. Everything is centered around that.

So to accomplish the same thing in OpenNMS, I've gotten about this far:

1) Create a script. No hardship. I'm doing it anyway. This script looks a little different though. Let me paste the actual one here:

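#!/bin/bash
#. /opt/sysops/lib/init

# Pick the target host out of the arguments; anything else is ignored.
POLLHOST=""
while [ "$1" != "" ]; do
    if [ "$1" = "--hostname" ]; then
        POLLHOST=$2
        shift
    fi
    shift
done

if [ "$POLLHOST" = "" ]; then
    echo FAIL no host specified
    exit 1
fi

# Ask the slave for its status as XML and keep the Slave_IO_Running value.
QUERYPFX="mysql -u USERNAME --password=PASSWORD -h ${POLLHOST} --xml -Bse"
SLAVESTATUS=`${QUERYPFX} "show slave status;" | grep Slave_IO_Running | awk -F'[<|>]' '{print $3}'`

# OpenNMS matches the banner on stdout; the detail goes to stderr.
if [ "$SLAVESTATUS" = "Yes" ]; then
    printf "SUCCESS\n"
    exit 0
else
    printf "FAIL\n"
    printf "Status check returned: %s\n" "${SLAVESTATUS}" 1>&2
    exit 1
fi

#. /opt/sysops/lib/uninit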

Now I have to edit an XML file (poller-configuration.xml). I had to do this as a package because I didn't want it applied to ALL boxes where MySQL is discovered. Only slaves:

<package name="MySQL-Replication-Slaves">
  <filter>IPADDR IPLIKE *.*.*.*</filter>
  <include-url xmlns="">file:/opt/opennms/include/mysql-replication-slaves.cfg</include-url>
  <rrd step="300">
    <rra xmlns="">RRA:AVERAGE:0.5:1:2016</rra>
    <rra xmlns="">RRA:AVERAGE:0.5:12:1488</rra>
    <rra xmlns="">RRA:AVERAGE:0.5:288:366</rra>
    <rra xmlns="">RRA:MAX:0.5:288:366</rra>
    <rra xmlns="">RRA:MIN:0.5:288:366</rra>
  </rrd>
  <service name="MySQL-Replication" interval="300000"
           user-defined="false" status="on">
    <parameter key="script" value="/scripts/cacti/checkrepl.sh"/>
    <parameter key="banner" value="SUCCESS"/>
    <parameter key="retry" value="1"/>
    <parameter key="timeout" value="3000"/>
    <parameter key="rrd-repository" value="/opt/opennms/share/rrd/response"/>
    <parameter key="ds-name" value="replication-status"/>
  </service>
  <outage-calendar xmlns="">Nightly backup of atlsvrdbs03</outage-calendar>
  <downtime begin="0" end="60000" interval="30000"/>
  <downtime begin="60000" end="43200000" interval="60000"/>
  <downtime begin="43200000" end="432000000" interval="600000"/>
  <downtime begin="432000000" delete="true"/>
</package>

And additionally:

<monitor service="MySQL-Replication" class-name="org.opennms.netmgt.poller.monitors.GpMonitor"/>

There. Oh but wait, I also have to add something to a file called capsd-configuration.xml:

<protocol-plugin protocol="MySQL-Replication" class-name="org.opennms.netmgt.capsd.plugins.GpPlugin" scan="on" user-defined="true">
  <property key="script" value="/scripts/cacti/checkrepl.sh" />
  <property key="banner" value="SUCCESS" />
  <property key="timeout" value="3000" />
  <property key="retry" value="1" />
</protocol-plugin>

I think that's it. Now I have to wait for the scanner to run (or force it) to tie that poll to the servers in the range I defined. One thing you'll note is this GpPlugin that's being used. That's called the General Purpose Poller. It's basically the scripting interface. If you want to poll some arbitrary data that isn't via a predefined plugin or via SNMP, that's the way you have to do it.

The limitation of this is that it handles the poll in binary only. Either it's up or down. This goes back to the origins as a network monitoring system. The port is up or the port is down. The host is up or the host is down.

Over the years it appears that they've added other "plugins" that can handle information differently. These plugins support thresholding for alarms but really only in the area of latency in polling the service. Additionally, they appear to be simple port opens to the remote service - basically check_tcp in the Nagios world. There are some exceptions. I think the DNS plugin actually does a lookup. Some of the L2 related plugins do things like threshold on bandwidth. There's also a disk usage plugin that thresholds on free space. The Radius plugin actually tries to authenticate against the Radius server.

Then there's probably my biggest gripe. These are all written in Java. I'm not a Java programmer. I don't want to have to write a godforsaken polling plugin in Java. If I need something faster than a Perl/Ruby/bash script for my plugin then I'll see about writing it in C but I've yet to come across that case.

So now I'm sitting at a point where at least OpenNMS knows when replication isn't running. I can modify my script to check latency and if it's over a certain point throw a FAIL but that's script-wide. I can't set it on a host by host basis. Replication is a bad example but it's not hard to imagine a situation where two servers running the same service would have different thresholds. In the OpenNMS case, I'd have to account for all that logic in my script.
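
To show what "account for all that logic in my script" ends up looking like, here is a hedged sketch of per-host limits baked into the poller script. The function names and both limits are invented for illustration; only the SUCCESS/FAIL banners and the atlsvrdbs03 hostname come from the config above.

```shell
# Look up the lag limit (seconds) for a host. The numbers are made up.
lag_limit_for() {
    case $1 in
        atlsvrdbs03) echo 3600 ;;   # hypothetical: the backup slave may lag an hour
        *)           echo 60 ;;     # hypothetical default for everything else
    esac
}

# Print the banner OpenNMS matches on for a given host and lag.
poll_verdict() {
    host=$1; lag=$2
    case $lag in
        ''|*[!0-9]*) echo FAIL; return ;;   # no usable lag value
    esac
    limit=$(lag_limit_for "$host")
    if [ "$lag" -le "$limit" ]; then
        echo SUCCESS
    else
        echo FAIL
    fi
}
```

Every new host or threshold means editing the script, which is exactly the bookkeeping the Nagios service definition handled with two arguments.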

"But John, OpenNMS supports Nagios plugins," you might be thinking.

No they don't. They support NRPE. This means I have to have all my scripts installed on every single host I monitor AND I have to install the agent on the system itself. Why should I have to do that when, with other systems, I can do all the work from the script on the monitoring server?

Oh but you can use extTable.extEntry in snmpd.conf. Same problem as NRPE without the agent hassle. I still have to copy the scripts to the remote server.

So after spending all morning googling, I might be able to do what I want (i.e. thresholding) if I hook another program into my script that's distributed with OpenNMS - send-event.pl. However the documentation on send-event.pl is SORELY lacking. Maybe I don't speak OpenNMS well enough for my google-fu to work but I can describe the exact same set of results I get every single time I look for ANYTHING related to send-event.pl:

1) Forcing the OpenNMS engine to discover a new node or restart a specific process within opennms using send-event.pl
2) Vague references with no actual code on using send-event.pl and SEC or swatch for log parsing. Of course these are all dated as OpenNMS has actually added a syslog plugin now.

And evidently, there's some XML editing that has to go on before I can actually do anything with send-event.pl. My experimentations this morning went like this:

./send-event.pl uei.opennms.org/internal/capsd/updateService localhost \
--interface -n 107 --service MySQL-Replication --severity 6 --descr "Replication lagging"

Now I know I didn't get it entirely right. Let's forget for a moment the cryptic uei.opennms.org stuff. All I'm trying to do is send an event (send-event.pl sounds like the right tool) to opennms for a given node. I want that to be the equivalent of an SNMP trap. I know the uei line is wrong but I can't find a simple list of the uei strings that are valid to take a stab. There's a shell script called UEIList.sh but it complains about missing dependent shell scripts AND fires up java to do the work. Why is it so hard to have that list in the OpenNMS wiki with some notes about each one?

So after all these machinations, I'm still left with a half-assed implementation of replication monitoring in OpenNMS.

I would love for someone to point me to a rough outline of what I'm trying to accomplish in OpenNMS. I've hit every single link I could find. I've viewed EVERY SINGLE RESULT from google and the mailing lists that had send-event.pl in them. Maybe I'm going down the wrong track. Maybe I'm not looking at it the right way.

I asked my boss yesterday if there was any emotional attachment to OpenNMS and he said there wasn't. Right now they have cacti and opennms running. Double polling on most things just to get graphs into Cacti for trending purposes. At least by switching to Nagios, I can have the perfdata dumped to an RRD and just have cacti around for display purposes. Or I can have cacti continue to do some of the polling and have nagios check the RRD for data to alert on.

I'm giving OpenNMS another day and if I can't make any headway, I'm switching to Nagios.

Tuesday, May 12, 2009

Annoying sign in Roswell

So my work phone doesn't have a camera built in. Couldn't snap a picture but I saw a sign in downtown Roswell today at lunch that really pissed me off.

"Be patriotic. Stimulate the economy"

This was on the front of one of the local shops. My first reaction, which I dutifully followed through with, was to call my wife and bitch and moan.

I dread the day when my spending becomes the measure of my patriotism.

Monday, May 11, 2009

Icinga musings

So I just heard about the Icinga fork of Nagios. Looking at the motivation, I can't say that I disagree with the fork. Obviously there is a standard set of "things" that most people add to Nagios. Many times those "things" feel like they should be standard and others are just fluff.

Looking over the Icinga project team and the project goals, we can get a feel for the things that they want to add on:

  • PNP
  • NagVis
  • Grapher
  • NagTrap
  • more NDO

I can get behind most of those. I think they're all wonderful addons. However, I have some reservations about one of the ones not listed. This is from the Icinga page:

The most significant modification and difference to Nagios is a completely new web interface based on PHP. This allows a wider circle of developers to contribute to the web interface and easier adjustments to be made by users to their individual environments.
I honestly just cannot get behind that in principle. I fully understand the concerns and problems people have. The Nagios interface is "ugly" in a sense but change for change's sake is just silly.

Why PHP? Why not perl (see Groundwork) or ruby or python? It's an arbitrary decision. I like my Nagios installations slim and doing what they do best, monitoring and alerting. You don't even HAVE to have a web interface. I don't want to have to bog down my monitoring server with YAP (yet another package). For all its warts, I like the way the Nagios web interface worked. There's nothing wrong with CGI scripts. They work. The Nagios cgis worked.

Was it a bitch to deal with them? Of course but moving to PHP isn't going to immediately make it better unless there's a framework or an API or standards to work against. I have full faith in the Icinga team to make an outstanding interface but I'm wondering what sort of process is going to be in place to make sure the interface is "stable". One thing that can be argued in favor of the current setup is that it's not at the whims of a constantly changing language like PHP.*

I guess my feeling is that the Icinga folks want to make something MORE of Nagios. Make it more than what it is at the core - network monitoring. There's a valid argument to be made that an "enterprise" monitoring system should have an SNMP trap handler but I personally don't think snmptt is the way to go. If it's that important, it should be something NOT written in a scripting language. If handling traps is of the utmost importance, it should be able to handle whatever volume of traps per second you throw at it. I can't find any performance numbers for snmptt so I can't tell you.

I think the biggest problem I've had with Nagios is that it isn't modular enough. It lacks something we've all come to appreciate these days - the concept of plugins. Admittedly, it's one guy. If he doesn't see a need for it, then we probably won't ever see it. Nagios really needs a standard way for people to plug in to it. Right now we have bolt-on solutions that never REALLY feel integrated. Maybe that's what Icinga wants to do. I can appreciate, however, the lean-ness that Nagios has had for this long. Maybe times have changed and monitoring doesn't just encompass monitoring anymore. I don't know but in my mind, monitoring is still a distinct entity from trending. They go hand in hand but Nagios has never billed itself as an all in one monitoring and trending solution. It monitors, and it alerts. Occasionally it "event handles" but long term storage and analysis of the data is out of scope.

Anyway, much of this has been a ramble based on first blush. I'm sure I'll have more to say. I'll follow the project closely and see what it does. I fully expect a lot of people to switch over just for the "completeness" and "aesthetic" factor. Groundwork has clients after all. The demand is there. However, I'm just not sure if I'll make the switch myself.

Maybe the whole thing will prompt Ethan to respond in a positive way and make my wish list come true ;)

- API into the monitoring system
- Native support for RRD storage of perfdata information

Those are my two biggest. I would LOVE to have an API into the live core of the engine to make changes to resources. One thing that I loved about Groundwork (I think it was Groundwork) was that it had a command-line API for adding and removing hosts. I'm really hoping that in the end, we end up with Nagios as a framework with its own basic functionality but that better allows the design of solutions built on top of it. Want to build your own interface? Pull a list of hosts from the API. Pull a list of last known states for each host. Display it.
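To make the wish a little more concrete, here's a minimal sketch of the kind of API I mean. To be clear, none of this exists in Nagios: MonitoringCore, its methods and render_status are all hypothetical names, and the "engine" is just an in-memory dictionary standing in for the live core.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MonitoringCore:
    """Hypothetical stand-in for a live monitoring engine (not a real Nagios API)."""

    _states: Dict[str, str] = field(default_factory=dict)

    def add_host(self, name: str) -> None:
        # New hosts start as PENDING until a check result comes in.
        self._states.setdefault(name, "PENDING")

    def remove_host(self, name: str) -> None:
        self._states.pop(name, None)

    def record_state(self, name: str, state: str) -> None:
        if name not in self._states:
            raise KeyError(f"unknown host: {name}")
        self._states[name] = state

    def list_hosts(self) -> List[str]:
        return sorted(self._states)

    def last_state(self, name: str) -> str:
        return self._states[name]


def render_status(core: MonitoringCore) -> str:
    """A 'build your own interface' example: one line per host."""
    return "\n".join(f"{host}: {core.last_state(host)}" for host in core.list_hosts())
```

The point is that render_status never touches config files or CGI output. It only asks the API for hosts and states, so a web page, a CLI tool or a dashboard could all be built the same way.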

* By constantly changing, I mean compared to traditional languages like C. PHP also has (and many developers will admit this) inconsistencies and other "gotchas" left over from years of backwards compatibility.

Wednesday, April 22, 2009

An idea for hosting WoW addons

So this curse/wowi/wowmatrix thing has gotten a bit out of hand. It's pitting technical people (addon authors, site administrators) against non-technical users (addon consumers).

The one thing that people don't get is the cost of bandwidth involved. I think I've come up with a way that might satisfy everyone.

I want to build an addon hosting infrastructure built entirely on Amazon AWS. Here's sort of what I envision:

- EC2 for hosting a web frontend/gateway to the site.
- All the addons are hosted on S3
- People must have an S3 account to access the addons
- Billing is done to the downloader's S3 account
- Alternately, you could use devpay

Then we would see how much people think bandwidth is "free"
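A back-of-the-envelope sketch of the math a hosting site faces. The function name is made up, and the per-GB price is a parameter on purpose: plug in whatever your provider actually charges. I'm not quoting a real AWS rate here.

```python
def monthly_transfer_cost(downloads: int, addon_size_mb: float, price_per_gb: float) -> float:
    """Estimate outbound transfer cost for one month.

    Assumes 1 GB = 1024 MB and a flat per-GB price (real pricing is usually tiered).
    """
    transferred_gb = downloads * addon_size_mb / 1024
    return round(transferred_gb * price_per_gb, 2)
```

Multiply that across every popular addon and every patch day and "free" stops looking free pretty quickly.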

I need to flesh it out more and actually "design" it.

Wednesday, April 1, 2009

Tuesday, March 31, 2009

Jonathan Coulton a rip-off artist?


I'm curious about this. Anyone have any more information?

Listening to the Code Monkey mp3 is pretty incriminating. I fancy myself a google ninja and I could find nothing about this site or the guy mentioned. I'm going to start searching for some of his songs next.

EDIT: So Ernie Wade is/was from Moosup, CT according to the link. Coulton hails from Colchester, CT. Moosup is a village in Plainfield, CT.

Here's a mapping courtesy of Google:

35 miles. Don't know CT that well so I'm not sure how often someone would make that drive.

This Ernie Wade guy was 7 years older than Coulton.

Anyway, the whole thing could be a hoax. I c.b.a. to do any more research than that and I don't have any more source material since the original link was a tweet.

EDIT2: Site exceeded bandwidth so no way to verify email link. Agreeing with @danco that it's a hoax. *puts tinfoil hat back on the shelf*

Saturday, March 28, 2009

Tales from the job search

So I've gotten quite a few emails from people who saw my update on LinkedIn about finding a new position. Everyone seems to be surprised that I found something so quick.

Still no luck on my end yet but seeing that you have a new gig, definitely gives me hope that companies are still hiring regardless of what the media is telling us.
That's from a friend of mine that I've known for ten or so years. We worked together at Butler and he saw me in the lobby the day I interviewed at MediaOcean.

The last part is the most telling. Regardless of political affiliation, all we seem to hear from the media and the White House is doom and gloom. You'd have to be an idiot to say that the economy is doing well. Either that or have a severe case of rectal cranial inversion. However, let's be realistic for a moment. Just because the economy is doing poorly doesn't mean that:

1) All sectors of the economy are frozen
2) There are no jobs to be had

I'll try to be as general as possible but understand that I come from the IT side of the house. Companies still need employees. There are plenty of established companies that have survived worse than this. There are plenty of NEW companies springing up because right now is the PERFECT time to start a new company. The cost of market entry is amazingly low.

What companies AREN'T doing is hiring entry level or basic skill set people. They want to get the biggest bang for the buck. Example - There are very few positions for Linux Administrators available but there ARE several positions for Linux Engineers available. The difference is in the details. Companies don't want some guy whose only Linux experience is from a help desk perspective. They want people who've implemented Linux. Engineers.

Enterprise experience is a phrase you see a lot. We're talking multi-server management. Production experience. High availability. Clustering. Those types of concepts. This is NOT the market for 2 years of experience. This is the market for five or ten years of experience. It's great that you can install Ubuntu but can you build out a kickstart infrastructure for thousands of servers? Sure you've set up MySQL but have you set up and managed a 500+GB PostgreSQL data warehouse?

Additionally, companies want experience outside of the area they're hiring. SAN experience. Networking experience. Load balancer experience. Experience with products running on top of Linux. J2EE application servers. Commercial databases. Message queues. These are the things that set you apart from the next guy.

Community involvement speaks a lot as well. I actually have a section under my resume called Community Contributions. It's nothing but a list of links to things I've done online that relate to the resume. In my case, links to my Nagios plugins. Links to my monitoring notes for DB2 and Websphere. Links to some Linux tips and tricks I've written. Links to scripts. You may be the only person who has ever read this stuff but it gives a prospective employer some insight into your capabilities. Just be prepared to answer questions about it.

One last thing I can suggest is to not burn any bridges. When you leave a company, formally ask peers and non-direct managers if you can use them as a reference. If you leave on good terms with your direct manager, feel free to ask them as well. Many companies have an official policy on references for legal reasons but everyone I've ever worked for has provided a reference off-hours in an unofficial capacity for me.

Anyway, that's all I really wanted to say. The market is rough but companies are hiring. You just need to set yourself apart from the rest of the crowd.

Wednesday, March 25, 2009

Mission Accomplished

I've found gainful employment. Living outside of my means can begin again!

I'll provide details once I start but I found something interesting about myself during the whole process.

I had a few opportunities on my plate. A couple were with different teams/organizations within the same company. While these teams had a "small company" feel, they were still under the umbrella of a large (REALLY large) company.

Right now, the biggest thing for me is "getting my head" back in the game. While I LOVED the time I had at MediaOcean (and would go back in a heartbeat), I got "stale" there. By that I mean that they had some pretty clear delineation between departments. Obviously at some point, you have to have that. One person can't do it all. The problem is that some of my skills got a bit stale. Having not used them in over a year, it made for rough going during the interviewing process. There were some questions that I should have easily answered and yet, because it wasn't current in my mind, I blanked.

What's spanning tree?

I know this. I explained to the guy that while I knew what it was, the best I could provide as an answer was that it was used on switches and vaguely related to redundant interswitch links. I just couldn't recall the exact definition. He gave me "credit" so to speak on the answer since I was in the general area. When he said "loop free", my mind recalled everything it could.

A year ago, I could have told you how to configure Cisco switches for redundant uplinks and which ports you would enable or disable spanning tree on. Today, not so much. I'd have to dig.

I ran across this when recruiters would call me:

"Are you available for a network engineering position?"
"I am but I wouldn't feel comfortable with it."
"Why not?"
Insert long story about not having current experience, not being able to drop right in and get going and thus not being justified in the salary range I'm currently in.

Same thing happened when I was asked about AIX or HACMP or god forbid HPUX. I haven't used HPUX in over 10 years. Yes, unix is unix but if you dropped me in front of an HPUX box, I'd have to navigate around a bit to remind myself of where things were. I'd have to fire up SAM and renavigate all the menus. Same goes for AIX. When I was working with those technologies, I never had to use the menu-driven admin tools. I could lscfg, lsattr and set_parms with the best of em. I knew exactly what options to pass to the Volume Manager to create new VGs. I could start smitty right at the area I wanted.

So I decided to go with a smaller company. The shop is entirely Linux so no hope of a refresh of my unix skills. However, I'll have involvement and control of the network gear (including the one major load balancer vendor I've not used - F5) so I can bring that skillset current.

After this place, who knows? Small companies have risks. Small companies that are internet companies have more risks. I might be right back in the market in 6 months or a year. I don't foresee that but if I am, I'll have more options to work with.

Thursday, March 19, 2009

Unfathomable Job Descriptions

I saw this one a while ago but it seems to have been resubmitted. I guess they couldn't find anyone the first time around:

An exciting and highly successful company here in town is looking for a Senior Linux Engineer to add to the team. This person needs to be one of the best of the best in the Linux world. You should be able to learn the software and hardware architecture and help with future platform architecture. You must have a track record of building mission critical web applications in a Linux environment, which will include expertise in PHP, XML, etc. You must have strong networking experience including TCP/IP, DNS, etc. as well as experience with Load Balancing and Switch Management. You should understand Web Services and know Object Oriented Programming techniques. This is a perm job, so a good job history is required. For immediate consideration, please e-mail your resume to the information below.
Let's run down the list of "problems" with this job description:

1) It's a "Senior Linux Engineer" position and yet nothing in the job description points to Linux
2) They want the best of the best in the Linux world. Again, nothing in the job description points to Linux

The job description then goes on to point to requiring expertise in PHP and XML among other things. Beyond that it wants experience with Load Balancing and switch management and some general networking concepts. While very important skill sets to have, NONE of these are Linux specific.

In fact, the only Linux-specific thing about the job seems to be that the environment runs Linux.

This is a fairly misleading job title. We have four distinct roles here:
1) Linux System Engineer (assumed from the title)
2) Network Engineer (more than just an administrator)
3) Developer (specifically it appears PHP and XML. They hint at general OO concepts)
4) Architect ("You should be able to learn the software and hardware architecture and help with future platform architecture")

Now I don't know about anyone else but this is a pretty tall order to fill. At least they didn't ask you to be the DBA as well. I'm a many hats kind of guy but this is insane. Based on my experience I can surmise the following about the position:

1) It's a small shop.
Very small, imho. I've been in small shops where I did both the system and network side of things but they actually had a developer if not multiples. In larger shops, the network and systems teams are distinct.
2) You will never stop working.
They want one person to do EVERYTHING. Let's forget for a moment that, while you might find someone who has two of the items on the list, finding three is a big order and finding all of them is next to impossible. Let me explain why.

Someone who has the experience they want has probably targeted his career to one specific area. While I've done all of these things at various points, I've really nailed down the final path I want to take. My path ahead lies in architecture. I detailed this in another post so I won't rehash here.
There are exceptions but let's address the second implication of that. Someone with that level of experience has been in the industry a while. That means they're probably older. You might have someone who came right out of high school which might put them in the late 20s by this point. If they haven't "settled down" yet, they probably will eventually. Even if that doesn't mean a family, it does mean a more "healthy" life and work balance. Again, there are exceptions.

There's also the question of quality. Can I do all of those things? Yes. Would my performance in one area be lackluster compared to the others? Sure. That would be in PHP. While I've had the positions where I did three of those things, the PHP I've done in recent years is nowhere near the level of the others. As you progress in the "corporate world", you get more focused via corporate structure. Many of the earlier companies had me doing all of those types of things but as I went along, my responsibilities got targeted through sheer necessity. Again, at most companies there's at least a division between development and systems/network and at the larger companies, development, systems, databases and networking are all distinct departments.

The kicker for the whole thing is the salary range: $95,000.00 - $105,000.00

That's not that bad in Atlanta. I have no information on the location of the company. The problem is that, depending on how much expertise they want in any of those areas, that salary is good for just ONE of those positions.

I'm not going to be all hypothetical about what the situation is at the company but my guess is that they're a small shop and one guy WAS doing it all. Maybe for a lot less than the salary posted. He moved on for whatever reasons and now they're in the lurch. He was working on X application and it had all of these components to it. Maybe he built out an environment and now they don't have anyone to finish it. The other option is that this person is still with the company and is getting burned out quickly and in need of relief.

More than likely, this just happens to be an over-aggressive recruiter. I could be wrong but this sounds like a bunch of requirements and preferences lumped into one position.

Side note: PHP is not a Linux technology. Nor is XML.

Thursday, March 12, 2009

Redoing my office

One thing that suffered while I was gainfully employed was the state of my office at home. My desktop* stayed booted into Vista pretty much all the time since the only thing I used it for was WoW. My laptop** was booted into Ubuntu/Hardy for doing work from home stuff.

Now that I'm looking for jobs, I've had to turn my desktop back into a "workstation" instead of a gaming machine. Everything was fairly ergonomic already since I wanted to be comfortable if I was going to be sitting in front of the machine for a few hours at a time. But it only really did one thing, run WoW.
So what did I do?

- The first thing was to fix GRUB where the Vista bootloader overwrote it. I had been hacking my way into Linux when I needed to get back to it.

One thing I had done not too long ago was upgrade from Hardy to Intrepid. The upgrade went smoothly so the system wasn't broken by any stretch. It just happened to be sitting with the default Gnome desktop and whatever was on my desktop. I didn't really have many third-party applications installed. It was basic but not too functional for long-term use.

- The next thing I did was modify my desktop to be similar to the layout I had on my workstation at MediaOcean. I've come to fall in love with a sidebar. I don't use it on Vista but I actually swapped to KDE from Gnome because the screenlets sidebar was so useless. This was pretty painless with a quick apt-get on all the kubuntu packages.

So now that I had the Kubuntu packages installed, I had to take the time to lay everything out. Remove the Dolphin desktop plasmoids, add the sidebar and put the stuff I needed there. The other thing I did was add the 6 desktops I like to have available. I also had to configure the KDE counterparts to the various apps I used (Akregator instead of Liferea, Kopete instead of Pidgin, Kmail)

The one thing that took the longest was setting up all my RSS feeds again. Any smart company, when they do a layoff or termination, disables access for the now-former employee as soon as possible. This is just smart business. When I got back to my desk, my workstation was already turned off. I didn't bother to turn it back on because I didn't really have anything personal on there other than some images of Gus for my screensaver.

I did, however, have a pretty extensive list of bookmarks in Firefox and a very large list of RSS feeds. While I had *JUST* finished setting up Xmarks with my own WebDAV server, I hadn't recently done an OPML export of my feed list. This meant having to rebuild it from memory. That sucked. I know I've missed some.

- Install some of my "development" tools. Since I'm not a developer in the programming sense, this isn't so much setting up my IDE as it is getting the machine ready for concept work. MY development environment includes, but is not limited to:

So what's left to do?
- Install VMware
- Update EC2 API tools
- Install Firefox extensions (S3fox,ElasticFox)
- Decide between TweetDeck and Twhirl
- Install Ruby toolchain
- Back it ALL up and install 64-bit Intrepid ;)

* When I originally purchased my desktop several years ago, I tried to future proof it as much as possible. I basically have a server sitting next to my desk. It's a dual-proc Opteron box with 5GB of memory and a metric arseload of diskspace. I built it this way so I could have a lab in a desktop.

** Dell XPS M1710

*** I'm currently using VirtualBox because I haven't gotten around to installing VMware Workstation. I prefer VMware Workstation mainly for the teaming feature but VirtualBox is growing on me.

Wednesday, March 11, 2009

How high is too high (level)?

One of the interesting things about being back on the market is dealing with job descriptions.

Many recruiters are not technical (nor should they always be) and you sometimes get the type of offer that was done by a full-text search for key words. The person never bothered to even read my resume before contacting me, nor did they bother to see that I have no interest in moving to Malaysia.

Too Technical?

I had an interesting turn of events today that was truly a first for me. I was contacted by a recruiter who thought (as did I based on the req he read me) I would be a good fit for a position. He said they had one other person they submitted but the client wanted to see a few different resumes.

The job was for an architecture position. The client had a government contract for a health care system. I didn't get much more than that and I really didn't have any time to tailor my resume for them to highlight the architecture work I had done. I've got plenty of architecture work over the past 8 years from low level to what I assumed was high level. It's the route I want to take long term so I was very interested to get in front of the client and show them my Visio collection ;)

I got a call back from the recruiter who said the client wasn't really interested because they considered me "too technical".

Come again?

He read some of the keywords that the other guy had on his resume and it was definitely buzzwordy. Phrases like "alignment" and "vertical". Mine had no such words.


I told the recruiter that there were no hard feelings but I was really curious what they meant by "too technical". Mind you I know how to talk to non-technical people. I've done customer presentations for my designs and talked to executives about things. I can get high level and consider myself pretty good at preventing that glazed over look that you see so many times when a non-technical executive is inundated with "geek speak".

The way it was explained to me is that they considered me the type of person who would take what the person they were looking for designed and implement it. They actually wanted someone who hadn't been hands-on in a while.

I couldn't wrap my head around it. Why would you want someone who WASN'T technical designing something technical? What possible big-picture architecture could someone possibly create without some idea of what's actually POSSIBLE with technology? I've done work for those types of people and it's never ended well. I distinctly remember a person designing something for a client and promising something that simply wasn't technologically possible.

He explained that they didn't want specifics designed. He kept using the phrases "really big pictures" and "really high level". He stressed that they didn't want technology designs to be a part of this. No talk of platforms or specific technology (i.e. AJAX or XML). No user stories or use cases. This really got me thinking. What exactly can you do at a high level that has value?

My attempt at "really high level"

So this has been bugging me all day. Here's what I know. It's a health care system for the government. Let's assume it's the Obama health care database that was part of the "stimulus" package.

What do we know about it? Well not much. The language is pretty vague. Here's what a google search turned up:

Health Care. Health care has been a recession proof industry but the implications making health care available to more people should stimulate new projects, new money available for research and all of the outpatient and auxiliary services, such as testing and analytics labs, that accompany this. The provisions of the Stimulus Package package that call for an extension of COBRA for the unemployed and the development of a national patient database, while not an immediate priority, will require the that technology be developed long before its potential implementation.
So we have a national patient database. From what I understand from other sources, this was to keep track of health care information for preventative purposes. E-health records and other such interesting stuff. Working off of that and trying to be "really high level" and "not technical", this is the best I could come up with:

Forgive me for using Gliffy. I don't have my copy of Visio installed right now since I put Vista64 on my Windows partition. Also forgive the lack of inappropriate UML.

Intemperate side note: Anyone who bought his own copy of Visio Professional with his own money has to have some love for Information Architecture.

So anyway, that's as high level as I can get it. Obviously this would be wrapped up in a document explaining (high level of course) the various aspects of the actors and messages (physical contact vs. an "internet" message) but really there isn't much else to be said. That's assuming of course that this is the type of high level they want. The problem is that I don't see the value in something THIS high-level. Obviously, when designing something like this you have to know what's possible with current technology so you have, even if you're 10 years removed from trends, an idea of how it might be done. I would argue that if you're 10 years removed from technology, your design isn't going to be that good because it's legacy before it even comes to fruition.

While I was rocking Gus back to sleep (who woke up halfway during the last paragraph), I actually did think about it a bit harder. There's a lot that can be said without getting down to "specific technologies" used but I just don't see the value in it. Let me give you an example of what I would consider high level without naming specific technologies:

It's not much different than the first one but I learned Gliffy the second time around so it's got COLOR!

My document would outline generic methods for each message (public web interface, e-health API - without details, XML not even mentioned), the concept of self-service and authorization (patient ultimately controls access to information and level of access to third-parties) and other information.
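For the curious, here's what the "patient ultimately controls access" idea looks like one level below the diagram. This is only a toy sketch for illustration: the Access levels, the PatientRecord class and the party names are all invented here, and nothing about it reflects how any real e-health system is or would be built.

```python
from enum import IntEnum
from typing import Dict


class Access(IntEnum):
    NONE = 0
    SUMMARY = 1
    FULL = 2


class PatientRecord:
    """Toy model: only the patient can grant or revoke third-party access."""

    def __init__(self, patient_id: str) -> None:
        self.patient_id = patient_id
        self._grants: Dict[str, Access] = {}

    def grant(self, requested_by: str, third_party: str, level: Access) -> None:
        if requested_by != self.patient_id:
            raise PermissionError("only the patient can grant access")
        self._grants[third_party] = level

    def revoke(self, requested_by: str, third_party: str) -> None:
        if requested_by != self.patient_id:
            raise PermissionError("only the patient can revoke access")
        self._grants.pop(third_party, None)

    def can_read(self, party: str, needed: Access) -> bool:
        # The patient always sees their own record; everyone else needs a grant.
        if party == self.patient_id:
            return True
        return self._grants.get(party, Access.NONE) >= needed
```

Notice there's still no mention of a database, a wire format or a web framework. The rule itself (no grant, no access) is the design; the rest is implementation.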

At this point I've not really gotten TOO technical. I still haven't mentioned any specific technologies. I've not said anything about XML vs. REST vs. SOAP. I've not provided a UI mockup of the web interface (but I have conceded an internet presence). I've not addressed the issue of security (but I would at a high level).

To me, this is high level. The next level down would probably cover technology without being specific. I would probably mention the web frontend in generic language, the concept of web services and a few methods for secure interaction from service providers (VPN, MPLS, HTTPS) and provide different Visio diagrams going into that detail for each one.

It would be a pretty large document. This is the level I enjoy working at. I don't care if the backend is IIS/SQL Server or Tomcat/MySQL/AXIS. I can design the system without needing to know any of that.

Leave it to the people doing the implementation to determine the specific technologies that they know best. I wouldn't presume to dictate that it be done in Java any more than I would presume to dictate it be done in Rails. I have my opinions on each but a good dotnet programmer can write secure and functional code just as well as an experienced Java programmer can.

Mind you I also enjoy working on the level below that one as well but that's a different scope. If I were tasked with implementing my second level design in a highly available and secure method, I'd have a Visio page with firewalls and load balancers and clustering design but again, I wouldn't dictate the technology. I'm not going to say "This is a Citrix Netscaler. This is a Cisco IDS. This is a Juniper firewall.". Of course I can do that level of design as well. Down to the redundant switching mesh.

The Payoff

Having said all of that and having actually worked through the problem, I have no doubt I could have done the job. Of course the flip side is I'm adamantly opposed to the whole thing. I don't like the idea of the government having access to the information. I don't like the idea of insurance companies having unfettered access to patient information (if it were designed that way). I don't like the idea of a "national health board" that reviews the information periodically.

I COULD design such a system that would satisfy my concerns but I don't know that I would feel comfortable trusting it to the federal government. This is the same government that can't seem to get people off TSA watch lists because they have the same name as a known terrorist.

But this isn't the place for a political rant.

I'm curious if anyone who might actually READ this could give me some insight into what exactly "really high level" is.