Tuesday, June 2, 2009

Fun with Foundry - Load balancing MySQL

I've worked with quite a few load balancers over the years from Coyote Point and Cisco to Citrix. One I've never worked with is Foundry.

So I have a task I'm working on - balance our read-only MySQL slaves behind the Foundry ServerIron 4G. Fun stuff, and it gets me in front of the units, which I've been itching to do.

As with any load balancer, simple balancing is pretty easy. Define multiple real servers that listen on a given port. Define a virtual server that consists of those real servers. Profit.

However, this is what I'll call "dumb" load balancing. There's no actual intelligence to it. The health checks behind it are simple Layer 4 checks. Is the real server listening on the port I defined? Yep.

Most load balancers provide something a bit higher up the OSI model for testing if a "real" server is alive and able to service connections. The most popular example is an HTTP check: the load balancer requests a page from the web server. This is much more intelligent because we know whether the whole application stack is working. You could define a custom controller to handle a given URL and do something as simple as serve up static content with a string that the load balancer matches against, or get complex and have it run some database queries to go whole hog.

But these are predefined health checks in the load balancers. What about something like a MySQL slave database? We have a small issue with replication lag. We've coded around most cases where it might be an issue, but there are still cases where we need to deal with it transparently. Additionally, we want to enforce some business logic in the check. That goes beyond what most load balancers can do. Short of installing MySQL Proxy or some other external solution, we're pretty limited.

So here's what I ended up doing, but first a few "facts":

  1. Foundry has support for custom health checks. These are somewhat limited but you can do things like the aforementioned HTTP GET/POST checks or you can do something akin to Expect scripting.
  2. We have two read-only MySQL slaves and one Master server.
  3. The code has two "connection pools" (it's PHP, so not really, but I'm using that term anyway). The first is a DSN for writes going to the master as well as immediate reads based on that write. The second is for general SELECT statements and other read operations.
  4. We do have a memcached layer in there but it's irrelevant to this discussion.
  5. One of the slaves goes offline each night for backups.
  6. Sending all traffic to the Master is NOT an option except in EXTREME situations.
  7. We will have replication lag spikes on the order of 20 seconds every so often due to batch operations running against the database (building summary tables and the like). These queries are as optimized as they're going to get, but 5.0 statement-based replication is the reason things lag.
  8. Upgrading to 5.1 and row-based replication is NOT an option at present.

So there you have the parameters I have to work with. The first step was finding out how flexible the Foundry was at custom health checks. The way Foundry works is you bind ports to real servers. Optionally, a port definition can take the name of a health check as the determining factor. Look at this example:


server real server01 10.0.101.108
port dns
port dns keepalive
port dns addr_query "server01.mydomain.com"


That sets up a real server providing DNS as a service. DNS is a known service so it has its own set of rules. We tell it that we want to do a lookup on server01.mydomain.com to determine if DNS is working. Here's an example for a web server:


server real searchserver01 10.0.101.102
source-nat
port 8080
port 8080 keepalive
port 8080 url "HEAD /solr/solrselect/admin/ping"


We're connecting to a tomcat instance and pulling up a static page inside the container to determine if it's working properly.

Now take a look at this example:


server real dbs02 10.0.101.113
source-nat
port 3306
port 3306 healthck dbs02mhs
port 3306 keepalive


This is saying that for port 3306 on the server, I want to use a health check called dbs02mhs. This is a custom health check that I've defined for this purpose. So what's in dbs02mhs?


healthck dbs02mhs tcp
dest-ip 10.0.101.113
port 10001
port 10001 content-check mhs
l7-check


We're connecting to port 10001 on IP 10.0.101.113 and performing a content check called mhs. Additionally, we're saying that this is a layer 7 check only. Here's the contents of the mhs content check:


http match-list mhs
up simple 0
down simple 1


Ignore the http part for a minute. It's a bit misleading. We're not actually checking via an http call. What this match list says is: if I get a 0 as my response, the server is up. If I get a 1, the server is down. By binding it to the health check above and subsequently to the real server, we're saying this:

"Connect to port 10001 on IP 10.0.101.113. If you get a 0 back, everything is good. If you get a 1 back, things are not good. In fact, to determine if port 3306 is available on this IP, I want you to do the check this way."

Interesting, no? Probably not. So what's listening on port 10001 on the database server? MHS.
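Before getting to MHS itself, here is the load balancer's half of that exchange as a rough Python sketch. This is not how the ServerIron implements match lists internally. The `MATCH_LIST` table and the `classify` function are names I made up for illustration, and treating the response as an exact match after stripping whitespace is an assumption.

```python
from typing import Optional

# Hypothetical mirror of the "mhs" match list: response text -> server state.
MATCH_LIST = {"0": "up", "1": "down"}


def classify(response: bytes) -> Optional[str]:
    """Map the raw bytes read from port 10001 to "up" or "down".

    Returns None when the response matches neither entry; what the
    load balancer does in that case is not covered by this sketch.
    """
    text = response.decode("ascii", errors="replace").strip()
    return MATCH_LIST.get(text)
```

The point is only that the check reduces to a lookup on a one-character response, which is why the service on port 10001 can be so small.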

MHS is a little perl daemon I wrote that encapsulates the more fine-grained logic we need in determining if the slave database should be handling queries. I'm going to post the code for mhs. I warn you now, this is a hack. It needs to be cleaned up and have some code style enforced. I'm actually working on a set of internal perl modules to move much of this stuff out. We already have a bash version of what I'm calling our "scripting framework".

---> MHS

As I said, this was more of a P.o.C. Now that I know it works, I can clean it up. Basically, MHS does three things currently:

  • Check if the database server is alive. We do this by doing a "SELECT 0" from the database.
  • Check if there is a backup in progress. If so, this slave will be lagged and so we don't want to use it for reads.
  • Check if replication is lagging more than 60 seconds. If it is, let's not use it for now.
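The three checks boil down to a small decision function. The sketch below is Python, not the Perl in MHS, and the names (`mhs_response`, `MAX_LAG_SECONDS`) are hypothetical. It assumes lag is read as a number of seconds, with `None` standing for "replication is not running", and that this case should also count as bad. The list above does not spell out either detail.

```python
from typing import Optional

MAX_LAG_SECONDS = 60  # threshold taken from the third check above


def mhs_response(db_alive: bool,
                 backup_locked: bool,
                 lag_seconds: Optional[int]) -> str:
    """Return "0" if the slave should take reads, "1" if it should not."""
    if not db_alive:
        return "1"  # the "SELECT 0" probe failed
    if backup_locked:
        return "1"  # a backup is running, so the slave will be lagged
    # Assumption: an unknown lag (None) is treated the same as too much lag.
    if lag_seconds is None or lag_seconds > MAX_LAG_SECONDS:
        return "1"
    return "0"
```

For example, `mhs_response(True, False, 20)` returns `"0"`, so a 20 second lag spike from a batch job would not pull the slave out of rotation.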

As I said, this is pretty specific to us. Every MySQL instance we have has a database called sysops. This is a non-replicated, local database. There's also a sysops database on a central backend MySQL instance that we use for storing script execution details and job logging, but on every other database server it currently holds a single table: job_locks. Our shell scripting framework (and soon the perl framework) has a method/function for locking the database, so to speak. In this case our backup job, in addition to writing execution time and whatnot to the central sysops database, also writes a lock to the server that is being backed up. The job_locks table currently has one row:


+--------+--------+
| locked | name   |
+--------+--------+
|      0 | backup |
+--------+--------+
1 row in set (0.00 sec)
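Reading that lock is a one-row query. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so it runs anywhere. Against the real server you would use a MySQL driver and qualify the table as sysops.job_locks. The function names are hypothetical, and treating a missing row as "not locked" is my assumption.

```python
import sqlite3


def backup_in_progress(conn) -> bool:
    """Return True when the 'backup' row in job_locks has locked = 1."""
    cur = conn.cursor()
    cur.execute("SELECT locked FROM job_locks WHERE name = 'backup'")
    row = cur.fetchone()
    # Assumption: no row at all means no lock is held.
    return bool(row and row[0])


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory stand-in for the local job_locks table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE job_locks (locked INTEGER, name TEXT)")
    conn.execute("INSERT INTO job_locks VALUES (0, 'backup')")
    return conn
```

The backup job would flip `locked` to 1 when it starts and back to 0 when it finishes.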


The reason for storing this table directly on the server is that our sysops database instance is not redundant and doesn't need to be. It's for storing one-off databases. If it goes offline, we can't have our checks failing. By putting the table on the actual server being backed up, we keep the check self-contained: the daemon runs on the same box as mysql, and the tables it checks are on that server.
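To make the shape of such a daemon concrete, here is a minimal Python sketch of a TCP service that answers every connection with a single status character. It is not the Perl MHS code. `current_status` is a hypothetical placeholder that always reports "0", where a real check would look at the local database. Port 10001 is taken from the health check definition earlier.

```python
import socketserver


def current_status() -> bytes:
    """Placeholder status: a real check would query the local database."""
    return b"0"


class StatusHandler(socketserver.BaseRequestHandler):
    def handle(self) -> None:
        # Send the one-character status, then let the connection close.
        self.request.sendall(current_status() + b"\n")


def make_server(host: str = "127.0.0.1",
                port: int = 10001) -> socketserver.TCPServer:
    """Create (but do not start) the status server."""
    return socketserver.TCPServer((host, port), StatusHandler)
```

Calling `make_server().serve_forever()` would start it. To be reachable from the load balancer it would have to bind to the server's own address rather than the loopback default used here.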

One thing I'm running into with the Foundry is needing to set up what they call a boolean check. You can use operators such as AND and OR as well as group multiple checks together.

My ruleset needs to basically say this:

If 3306 Layer 4 is true and mhs Layer 7 is true, server is good.
If 3306 Layer 4 is true and mhs Layer 4 is false, server is good.
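Written out as a truth function, the ruleset looks like the Python sketch below. This is only the logic I want, not Foundry syntax. The function and parameter names are hypothetical. The two cases the rules do not state explicitly (3306 down, or mhs answering with a 1) are filled in as "bad", which is my reading of the intent.

```python
from typing import Optional


def server_is_good(mysql_l4_up: bool,
                   mhs_l4_up: bool,
                   mhs_l7_ok: Optional[bool]) -> bool:
    """Combine the three health check results into one up/down verdict."""
    if not mysql_l4_up:
        return False  # nothing is listening on 3306
    if not mhs_l4_up:
        # The mhs port is unreachable: fall back to trusting the slave.
        return True
    # mhs answered, so its Layer 7 result decides.
    return bool(mhs_l7_ok)
```

Note that `mhs_l7_ok` only matters when the mhs port is reachable, which is why it is allowed to be `None` otherwise.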

The reasoning is that we don't want to fail out a slave if the perl daemon crashes. We want to make the default assumption that the slave is good and current unless explicitly told otherwise, either because it isn't listening on port 3306 or because the mhs service says it's bad.

I don't THINK I can do that but if there are any Foundry experts who read this post, please let me know.

1 comment:

  1. http://community.brocade.com/home/docs/DOC-1577

    This has helped me in this situation.
