Wednesday, January 5, 2011

Chef and Encrypted Data Bags - Revisited

In my previous post here I described the logic behind wanting to store data in an encrypted form in our Chef data bags. I also described some general encryption techniques and gotchas for making that happen.

I've since done quite a bit of work in that regard and implemented this at our company. I wanted to go over a bit of detail about how to use my solution. Fair warning, this is a long post. Lots of scrolling.

A little recap

As I mentioned in my previous post, the only reliable way to do the encryption of data bag items in an automated fashion is to handle key management yourself outside of Chef. I mentioned two techniques:

  • storing the decryption key on the server in a flat file
  • calling a remote resource to grab the key

The biggest problem here is key management and, in an optimal world, how to automate it reliably. For this demonstration, I've gone with storing a flat text file on the server. As I also said in my previous post, this assumes you tightly control access to that server. We're going with the original assumption that if a malicious person gets on your box, you're screwed no matter what.

Creating the key file

I used the knife command to handle my key creation for now:

knife ssh '*:*' interactive
echo "somedecryptionstringblahblahblah" > /tmp/.chef_decrypt.key
chmod 0640 /tmp/.chef_decrypt.key
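
The commands above write a fixed string. As a minimal sketch of an alternative, the passphrase could be generated randomly and the file restricted to its owner. The helper name `write_key_file` and the 0600 mode are assumptions for illustration, not part of the original setup:

```ruby
require 'securerandom'

# Hypothetical helper: write a random passphrase to key_path and make the
# file readable and writable by its owner only.
def write_key_file(key_path, bytes = 32)
  passphrase = SecureRandom.hex(bytes) # 2 * bytes hex characters
  # Request 0600 at creation time so the file is never world-readable.
  File.open(key_path, File::WRONLY | File::CREAT | File::TRUNC, 0600) do |f|
    f.puts(passphrase)
  end
  File.chmod(0600, key_path) # enforce the mode even if the file already existed
  passphrase
end

# Example (path taken from the post):
# write_key_file('/tmp/.chef_decrypt.key')
```

The same passphrase still has to reach the workstation that runs the rake task, so key distribution remains a manual step here.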

Setting up the databags and the rake tasks

One of the previous things I mentioned is knowing when and what to encrypt. Be sensible and keep it simple. We don't want to throw out the baby with the bath water. The Chef platform has lots of neat search capabilities that we'd like to keep. In this vein, I've created a fairly opinionated method for storing the encrypted data bag items.

We're going to want to create a new databag called "passwords". The format of the data bag is VERY simple:

Each item has an "id" that we want to use and a "data" field holding the plaintext value that we want to encrypt.

Rake tasks

In my local chef-repo, I've created a 'tasks' folder. In that folder, I've added the following file:

As you can see, this requires a rubygem called encrypted_strings. I've done a cursory glance over the code and I can't see anything immediately unsafe about it. It only provides an abstraction to the native OpenSSL support in Ruby with an additional String helper. However I'm not a cryptographer by any stretch so you should do your own due diligence.
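
The rake file itself is not reproduced here. As a rough sketch of what its core might do, the following uses Ruby's standard OpenSSL bindings directly instead of the encrypted_strings gem. The function names, the AES-256-CBC/PBKDF2 choice, and the Base64(salt + IV + ciphertext) layout are all assumptions for illustration:

```ruby
require 'openssl'
require 'json'

# Hypothetical helper, NOT the encrypted_strings API. Derives an AES-256 key
# from the passphrase with PBKDF2 and returns Base64(salt + iv + ciphertext).
def encrypt_value(plaintext, passphrase)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.encrypt
  salt = OpenSSL::Random.random_bytes(16)
  iv = cipher.random_iv
  cipher.key = OpenSSL::PKCS5.pbkdf2_hmac_sha1(passphrase, salt, 20_000, cipher.key_len)
  [salt + iv + cipher.update(plaintext) + cipher.final].pack('m0') # strict Base64
end

# Reads <dir>/<name>.json, encrypts its "data" field, and writes
# <dir>/<name>_crypted.json next to it. Returns the path of the new file.
def encrypt_databag_item(name, passphrase, dir = 'data_bags/passwords')
  item = JSON.parse(File.read(File.join(dir, "#{name}.json")))
  item['data'] = encrypt_value(item['data'], passphrase)
  out = File.join(dir, "#{name}_crypted.json")
  File.write(out, JSON.pretty_generate(item))
  out # a real task would then upload this file to the Chef server
end
```

The "id" is left in the clear on purpose so that data bag search keeps working; only the "data" value is encrypted.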

At the end of your existing Rakefile, add the following:

load File.join(TOPDIR, 'tasks','encrypt_databag_item.rake')

If you now run rake -T you should see the new task listed:

rake encrypt_databag[databag_item]  # Encrypt a databag item in the passwords databag

If you didn't already create a sample data bag and item, do so now:

mkdir data_bags/passwords/
echo '{"id":"supersecretpassword","data":"mysupersecretpassword"}' > data_bags/passwords/supersecretpassword.json

Now we run the rake task:

rake encrypt_databag[supersecretpassword]

Found item: supersecretpassword. Encrypting
Encrypted data is <some ugly string>
Uploading to Chef server
INFO: Updated data_bag_item[supersecretpassword_crypted.json]

You can test that the data was uploaded successfully:

knife data bag show passwords supersecretpassword

"data": "<some really ugly string>",
"id": "supersecretpassword"

Additionally, you should have in your 'data_bags/passwords' directory a new file called 'supersecretpassword_crypted.json'. The reason for keeping both files around is key management. Should you need to change your passphrase/key, you'll need the original file around to reencrypt with the new key. You can decide to remove the unencrypted file if you want, as long as you have a way of recreating it.

Using the encrypted data

So now that we have a data bag item uploaded that we need to use, how do we get it on the client?
That will require two cookbooks:

The general idea is that, in any cookbook where you need decrypted data, you essentially do three things:
  • include the decryption recipe
    include_recipe "databag_decrypt::default"
  • assign the crypted data to a value via databag search
    password = search(:passwords, "id:supersecretpassword").first
  • assign the decrypted data to a value for use in the rest of the recipe
    decrypted_password = item_decrypt(password[:data])

From there, it's no different than any other recipe. Here's an example of how I use it to securely store Amazon S3 credentials as databag items:

include_recipe "databag_decrypt::default"
# Look up each encrypted item and decrypt its "data" value
s3_access_key = item_decrypt(search(:passwords, "id:s3_access_key").first[:data])
s3_secret_key = item_decrypt(search(:passwords, "id:s3_secret_key").first[:data])
# Fetch the package from S3 using the decrypted credentials
s3_file erlang_tar_gz do
  bucket "our-packages"
  object_name erlang_file_name
  aws_access_key_id s3_access_key
  aws_secret_access_key s3_secret_key
  checksum erl_checksum
end
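
The `item_decrypt` helper used above comes from the decryption cookbook, which is not reproduced here. A hypothetical stand-in might look like the following. It assumes the stored value is Base64(salt + IV + AES-256-CBC ciphertext) with a PBKDF2-derived key, which is an illustrative format and not necessarily what the cookbook or the encrypted_strings gem produces:

```ruby
require 'openssl'

# Hypothetical stand-in for the cookbook's item_decrypt helper.
# Assumed layout of the stored value: Base64(salt[16] + iv[16] + ciphertext).
def item_decrypt(encoded, key_file = '/tmp/.chef_decrypt.key')
  passphrase = File.read(key_file).strip # strip the newline left by `echo`
  raw = encoded.unpack('m0').first       # strict Base64 decode
  salt = raw[0, 16]
  iv = raw[16, 16]
  data = raw[32..-1]
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.decrypt
  cipher.key = OpenSSL::PKCS5.pbkdf2_hmac_sha1(passphrase, salt, 20_000, 32)
  cipher.iv = iv
  (cipher.update(data) + cipher.final).force_encoding('UTF-8')
end
```

In a real cookbook a helper like this would live in a library file so that every recipe including the decryption recipe can call it.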

Changing the key

Should you need to change the key, you'll need to jump through a few hoops:

  • Update the passphrase on each client. Ease depends on your method of key distribution
  • Update the passphrase in the rake task
  • Reencrypt all your data bag items.

The last one can be a pain in the ass. Since Chef currently doesn't support multiple items in a data bag json file, I created a small helper script in my chef-repo called 'split-em.rb'.
I store all of my data bag items in large json files and use split-em.rb to break them into individual files. Those files I upload with knife:

bin/split-em.rb -f data_bags/passwords/passwords.json -d passwords -o

Parsing data for svnpass into file data_bags/passwords/svnpass.json
Parsing data for s3_access_key into file data_bags/passwords/s3_access_key.json
Parsing data for s3_secret_key into file data_bags/passwords/s3_secret_key.json
#Run the following command to load the split bags into the passwords in chef
for i in svnpass s3_access_key s3_secret_key; do knife data bag from file passwords $i.json; done

You could then run that through the rake task to reupload the encrypted data:

for i in svnpass s3_access_key s3_secret_key; do rake encrypt_databag[$i]; done
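
split-em.rb is not shown in this post. As a minimal sketch of the splitting step only, assuming the combined file is a JSON array of items that each carry an "id" (the real file layout and the script's options may differ):

```ruby
require 'json'

# Hypothetical re-creation of the splitting step. Assumes combined_file holds
# a JSON array of data bag items, each with an "id" key.
def split_items(combined_file, out_dir)
  items = JSON.parse(File.read(combined_file))
  items.map do |item|
    id = item.fetch('id') # fail loudly if an item has no id
    path = File.join(out_dir, "#{id}.json")
    puts "Parsing data for #{id} into file #{path}"
    File.write(path, JSON.pretty_generate(item))
    id
  end
end

# Example:
# split_items('data_bags/passwords/passwords.json', 'data_bags/passwords')
```

The returned list of ids can feed the same kind of shell loop shown above.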

Limitations/Gotchas/Additional Tips

Take note of the following, please.

Key management

The current method of key management is somewhat cumbersome. Ideally, the passphrase should be moved outside of the rake task. Additionally, the rekey process should be made a distinct rake task. I imagine a workflow similar to this:

  • rake accepts a path to the encryption key
  • additional rake task to change the encryption key in the form of oldpassfile/newpassfile.
  • Existing data is decrypted using oldpassfile, reencrypted using new passfile and sent back to the chef server.
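
The rekey step in that workflow could be sketched roughly as follows. None of this exists yet; the function names are hypothetical, and the AES-256-CBC/PBKDF2 scheme with a Base64(salt + IV + ciphertext) layout is an assumption chosen for illustration:

```ruby
require 'openssl'

# Hypothetical low-level helper: run AES-256-CBC in the given mode with a
# key derived from the passphrase via PBKDF2.
def aes(mode, passphrase, salt, iv, data)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  mode == :encrypt ? cipher.encrypt : cipher.decrypt
  cipher.key = OpenSSL::PKCS5.pbkdf2_hmac_sha1(passphrase, salt, 20_000, 32)
  cipher.iv = iv
  cipher.update(data) + cipher.final
end

# Decrypt one stored value with the passphrase in old_passfile and
# re-encrypt it with the passphrase in new_passfile.
def rekey_value(encoded, old_passfile, new_passfile)
  old_pass = File.read(old_passfile).strip
  new_pass = File.read(new_passfile).strip
  raw = encoded.unpack('m0').first
  plaintext = aes(:decrypt, old_pass, raw[0, 16], raw[16, 16], raw[32..-1])
  salt = OpenSSL::Random.random_bytes(16) # fresh salt and IV for the new value
  iv = OpenSSL::Random.random_bytes(16)
  [salt + iv + aes(:encrypt, new_pass, salt, iv, plaintext)].pack('m0')
end
```

A rake task built on this would loop over the items in the data bag, call `rekey_value` on each "data" field, and upload the result, which removes the need to keep the plaintext files around.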

Optimally, the rake task would understand the same attributes that the decryption cookbook does so it can handle key management on the client for you. I'd also like to make the cipher selection configurable and integrate it into the above steps.

Duplicate work

Seth Falcon at Opscode is already in the process of adding official support for encrypted data bags to Chef. His method involves converting the entire databag sans "id" to YAML and encrypting it. I wholeheartedly support that effort but that would obviously require a universal upgrade to Chef as well. The purpose of my cookbook and tasks is to work with the existing version.


If you're an Amazon EC2 user, you should start using IAM NOW. Stop putting your master credentials into recipes and limit your risk. I've created a 'chef' user who I give limited access to certain AWS operations. You can see the policy file here. It gives the chef user read-only access to 'my_bucket' and 'my_other_bucket'.
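
The linked policy file is not reproduced here. As an illustration only, a read-only S3 policy for those two buckets might be generated like this. The function name is hypothetical, the statement layout is an assumption rather than the original policy, and "2012-10-17" is the current IAM policy language version, which postdates this post:

```ruby
require 'json'

# Hypothetical helper: build an IAM policy document that allows listing the
# given buckets and reading their objects, and nothing else.
def read_only_s3_policy(buckets)
  {
    'Version' => '2012-10-17',
    'Statement' => [
      { 'Effect' => 'Allow',
        'Action' => ['s3:ListBucket'],
        'Resource' => buckets.map { |b| "arn:aws:s3:::#{b}" } },
      { 'Effect' => 'Allow',
        'Action' => ['s3:GetObject'],
        'Resource' => buckets.map { |b| "arn:aws:s3:::#{b}/*" } }
    ]
  }
end

puts JSON.pretty_generate(read_only_s3_policy(%w[my_bucket my_other_bucket]))
```

Listing is granted on the bucket ARN and reads on the object ARN (`/*`), since the two actions apply to different resource types.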
If you wanted to get REALLY sneaky, you could use fake two-factor authentication to store your key in S3:

  • Encrypt data bag items with "credential B" password except for one item "s3_credentials"
  • s3_credentials (credential A) is encrypted with a passphrase and managed similar to this article
  • Use transient credentials to access S3 and grab a passphrase file (credential B)
  • Decrypt data with secondary credentials

You would have to heavily modify the cookbook to do this. I think the current implementation is fine.

File-based passphrases

I'm not a big fan of the file-based passphrase method. While we agreed that you should consider yourself screwed if someone gets on the box, that still leaves poorly coded applications running as an attack vector. Imagine you have an application that must run as root. Now it can read the passphrase. Should that application become remotely exploitable, the passphrase file is vulnerable. I'm leaning toward the method of a private server that allows RESTful access to grab the key. I've already added support in the cookbook for a passphrase type of 'url'.
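
A minimal sketch of how a passphrase lookup could switch between the two sources. The 'file' and 'url' type names follow this post; the function name, its arguments, and the injectable `http_get` callable are assumptions for illustration, not the cookbook's actual interface:

```ruby
require 'net/http'
require 'uri'

# Hypothetical dispatcher: fetch the passphrase from a local file or from a
# remote endpoint. http_get is injectable so the lookup can be stubbed.
def fetch_passphrase(type, location, http_get = ->(uri) { Net::HTTP.get(uri) })
  case type
  when 'file' then File.read(location).strip
  when 'url'  then http_get.call(URI.parse(location)).strip
  else raise ArgumentError, "unknown passphrase type: #{type}"
  end
end

# Examples (the URL is a placeholder):
# fetch_passphrase('file', '/tmp/.chef_decrypt.key')
# fetch_passphrase('url', 'https://keys.example.internal/chef')
```

With the 'url' type the passphrase never touches the node's disk, but the endpoint then needs its own access control, for example by restricting it to known client addresses.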


I think that covers everything. I'd love some feedback on what people think. We've already implemented this in a limited scope for using IAM credentials in our cookbooks. I can easily revoke those should they get compromised without having to generate all new master keys.

1 comment:

Tom Halligan said...

Thanks for this, we're looking into Chef and need to be sure we can keep our sensitive config information safe!