Tarsnap critical security bug

Tarsnap versions 1.0.22 through 1.0.27 have a critical security bug. It may be possible for me, Amazon, or US government agencies with access to Amazon's datacenters to decrypt data stored with those versions of Tarsnap. This is an absolutely unacceptable compromise of Tarsnap's security principles, and I sincerely apologize to everyone affected.

There's a lot to say about this, and it's entirely possible that I'll miss covering some important points in this post; if I've missed something, please email me or post a comment below and I'll do my best to add the necessary information.

The bug

Tarsnap archives data by first converting it into a series of "chunks" of average size 64 kB; next compressing and encrypting each chunk; and finally uploading those chunks. The encryption is performed using a per-session AES-256 key in CTR mode.

In versions 1.0.22 through 1.0.27 of Tarsnap, the CTR nonce value is not incremented after each chunk is encrypted. (The CTR counter is correctly incremented after each 16 bytes of data is processed, but this counter is reset to zero for each new chunk.) As a result, every chunk encrypted within a session is encrypted with the same keystream.
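To make the distinction between the nonce and the counter concrete, here is a minimal sketch of how a CTR-mode input block can be assembled. The layout (a 64-bit nonce followed by a 64-bit block counter, both big-endian) and the names put_be64, make_ctr_block and same_keystream_inputs are assumptions made for illustration; they are not taken from the Tarsnap source. If the nonce stays at one value while the block counter restarts at zero, two chunks feed the cipher the same sequence of input blocks and therefore receive the same keystream.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 64-bit value in big-endian byte order. */
void
put_be64(uint8_t out[8], uint64_t x)
{
    int i;

    for (i = 7; i >= 0; i--) {
        out[i] = (uint8_t)(x & 0xff);
        x >>= 8;
    }
}

/*
 * Build the 16-byte block which CTR mode feeds to the block cipher.
 * Assumed layout: 8 bytes of per-chunk nonce, then 8 bytes of block counter.
 */
void
make_ctr_block(uint8_t out[16], uint64_t nonce, uint64_t blockno)
{
    assert(out != NULL);
    put_be64(out, nonce);
    put_be64(out + 8, blockno);
}

/*
 * Return 1 if two chunks would use identical cipher input blocks (and thus
 * identical keystream) for their first nblocks blocks, 0 otherwise.  The
 * block counter starts at zero for both chunks, as described above.
 */
int
same_keystream_inputs(uint64_t nonce_a, uint64_t nonce_b, uint64_t nblocks)
{
    uint8_t a[16], b[16];
    uint64_t i;

    for (i = 0; i < nblocks; i++) {
        make_ctr_block(a, nonce_a, i);
        make_ctr_block(b, nonce_b, i);
        if (memcmp(a, b, 16) != 0)
            return (0);
    }
    return (1);
}
```

Under these assumptions, same_keystream_inputs(n, n, 4096) returns 1 for any n, while same_keystream_inputs(n, n + 1, 4096) returns 0: incrementing the nonce between chunks is the only thing that keeps the keystreams apart.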

How the bug happened

Up to version 1.0.21 of Tarsnap, AES-CTR was used in two places: First, to encrypt each chunk of data; and second, in the Tarsnap client-server protocol. In version 1.0.22 of Tarsnap, I introduced passphrase-protected key files, which used AES-CTR encryption (with a key computed using scrypt).

In order to simplify the Tarsnap code — and in the hopes of reducing the potential for bugs — I took this opportunity to "refactor" the AES-CTR code into a new file (lib/crypto/crypto_aesctr.c in the Tarsnap source code) and modified the existing places where AES-CTR was used to take advantage of these routines.

It was at this point that the bug slipped into the chunk-encryption code (crypto_file_enc in lib/crypto/crypto_file.c):

        /* Encrypt the data. */
-       aes_ctr(&encr_aes->key, encr_aes->nonce++, buf, len,
-           filebuf + CRYPTO_FILE_HLEN);
+       if ((stream =
+           crypto_aesctr_init(&encr_aes->key, encr_aes->nonce)) == NULL)
+               goto err0;
+       crypto_aesctr_stream(stream, buf, filebuf + CRYPTO_FILE_HLEN, len);
+       crypto_aesctr_free(stream);

The encr_aes->nonce++ turned into encr_aes->nonce, and as a result the same nonce value was used repeatedly. (The other places where Tarsnap uses AES-CTR — in the client-server protocol and in the handling of passphrase-protected key files — are not affected by this bug.)
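The effect of the lost ++ can be reproduced in a small self-contained sketch. Everything in it is hypothetical: toy_block is a stand-in mixing function (it is not AES and it is not secure), and struct toy_session, toy_encrypt_chunk and the advance_nonce flag are invented names which do not appear in Tarsnap. The sketch only shows that, with a per-chunk counter which restarts at zero, identical chunks encrypt to identical ciphertexts unless the nonce advances.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-session state: a key plus the next chunk nonce. */
struct toy_session {
    uint64_t key;
    uint32_t nonce;
};

/* Stand-in for the AES block function.  NOT AES and NOT secure. */
uint64_t
toy_block(uint64_t key, uint64_t in)
{
    uint64_t x = in ^ key;

    x ^= x >> 30;
    x *= 0xBF58476D1CE4E5B9ULL;
    x ^= x >> 27;
    x *= 0x94D049BB133111EBULL;
    x ^= x >> 31;
    return (x);
}

/*
 * Encrypt one chunk in CTR style using 8-byte toy blocks.  The block
 * counter restarts at zero for every chunk, so only the nonce keeps the
 * keystreams of different chunks distinct.  With advance_nonce == 0 this
 * mimics the bug; with advance_nonce == 1 it mimics the nonce++ behaviour.
 */
void
toy_encrypt_chunk(struct toy_session *s, int advance_nonce,
    const uint8_t *in, uint8_t *out, size_t len)
{
    uint64_t ks = 0;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < len; i++) {
        /* Generate 8 fresh keystream bytes at each block boundary. */
        if (i % 8 == 0)
            ks = toy_block(s->key,
                ((uint64_t)s->nonce << 32) | (uint32_t)(i / 8));
        out[i] = in[i] ^ (uint8_t)(ks >> (8 * (i % 8)));
    }
    if (advance_nonce)
        s->nonce++;
}
```

Encrypting the same chunk twice with advance_nonce set to 0 produces two identical ciphertexts; with advance_nonce set to 1 the second ciphertext differs from the first.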

Impact of the bug

As stated above: It may be possible for me, Amazon, or US government agencies with access to Amazon's datacenters to decrypt data stored with affected versions of Tarsnap. Other individuals/agencies are unlikely to be able to decrypt data for the simple reason of being unable to access the encrypted data: Amazon Web Services is considered to be sufficiently secure to handle medical records and credit cards, and while I often remind people that regulatory compliance is not at all the same thing as security, in this case I think they align fairly accurately. (Note that since the Tarsnap client-server protocol is encrypted, being able to intercept Tarsnap client-server traffic does not provide an attacker with access to the data.)

There are two ways of decrypting AES-CTR data when the nonce is reused: By comparing two ciphertexts, or by using a known plaintext. In the first case, the ciphertexts A xor C and B xor C (where C is the shared keystream) are XORed together to yield the exclusive OR of the two plaintexts, A xor B. If the plaintexts are English text or otherwise have a small amount of entropy, this is usually enough to allow both plaintexts to be extracted; in fact, this is one of the methods which was used by British codebreakers in the Second World War. However, the blocks which Tarsnap encrypts do not have low entropy: Tarsnap compresses its chunks of data before encrypting them. While the compression is not perfect (there are, for instance, some predictable header bits), I do not believe that enough information is leaked to make such a ciphertext-only attack feasible.
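The ciphertext-comparison case rests on a single identity: (A xor C) xor (B xor C) = A xor B. A minimal sketch follows; xor_bytes and keystream_cancels are hypothetical helper names, and the keystream C is just whatever bytes the caller supplies rather than real AES output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* out[i] = a[i] ^ b[i] for i in [0, len). */
void
xor_bytes(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t len)
{
    size_t i;

    assert(a != NULL && b != NULL && out != NULL);
    for (i = 0; i < len; i++)
        out[i] = a[i] ^ b[i];
}

/*
 * Check that XORing two ciphertexts made with the same keystream C cancels
 * C and leaves A xor B.  Handles at most 64 bytes; returns 1 if the
 * identity holds and 0 otherwise (including when len is too large).
 */
int
keystream_cancels(const uint8_t *A, const uint8_t *B, const uint8_t *C,
    size_t len)
{
    uint8_t ctA[64], ctB[64], lhs[64], rhs[64];
    size_t i;

    if (len > sizeof(ctA))
        return (0);
    xor_bytes(A, C, ctA, len);      /* ciphertext of A */
    xor_bytes(B, C, ctB, len);      /* ciphertext of B */
    xor_bytes(ctA, ctB, lhs, len);  /* what the attacker can compute */
    xor_bytes(A, B, rhs, len);      /* XOR of the two plaintexts */
    for (i = 0; i < len; i++)
        if (lhs[i] != rhs[i])
            return (0);
    return (1);
}
```

The attacker never needs C to compute the left-hand side; whether A and B can then be separated depends entirely on how much structure the plaintexts have, which is why compression matters here.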

Given a known plaintext, however (that is, if the attacker knows any block of data which was encrypted), the attack is trivial: They need only compare the plaintext against the corresponding ciphertext block to recover the AES-CTR keystream, which can then be used to decrypt other blocks of data. If Tarsnap is used to perform complete system backups, there will be many such plaintexts (files belonging to the operating system and the Tarsnap binary itself are obvious examples), but if Tarsnap is used selectively then it is possible that the attacker will have no such plaintext at their disposal.
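The known-plaintext case can be sketched in the same way. The names recover_keystream and decrypt_with_keystream are hypothetical, and the sketch ignores the compression step: since Tarsnap compresses chunks before encrypting them, the "known plaintext" an attacker needs would be the compressed form of a chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Recover keystream bytes from a known plaintext and its ciphertext. */
void
recover_keystream(const uint8_t *known_pt, const uint8_t *known_ct,
    uint8_t *ks, size_t len)
{
    size_t i;

    assert(known_pt != NULL && known_ct != NULL && ks != NULL);
    for (i = 0; i < len; i++)
        ks[i] = known_pt[i] ^ known_ct[i];
}

/*
 * Decrypt the first min(ct_len, ks_len) bytes of another ciphertext which
 * was encrypted with the same keystream.  Returns the number of bytes
 * recovered; bytes beyond the recovered keystream remain unknown.
 */
size_t
decrypt_with_keystream(const uint8_t *ct, size_t ct_len,
    const uint8_t *ks, size_t ks_len, uint8_t *pt)
{
    size_t i, n = (ct_len < ks_len) ? ct_len : ks_len;

    assert(ct != NULL && ks != NULL && pt != NULL);
    for (i = 0; i < n; i++)
        pt[i] = ct[i] ^ ks[i];
    return (n);
}
```

Note that a known plaintext of length n yields only the first n bytes of keystream, so in this sketch a short known chunk exposes only the matching prefix of every other chunk encrypted with the same key and nonce.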

Because Tarsnap uses per-session AES keys for encrypting blocks of data, this bug affects only data uploaded using affected versions of Tarsnap, and the known-plaintext attack endangers only data uploaded in the same archiving session as the known plaintext; so it is possible that an attacker would be able to decrypt some data but not all.

What Tarsnap users should do

Tarsnap users should immediately upgrade to version 1.0.28.

Tarsnap users who wish to re-encrypt their stored data should register a new machine using tarsnap-keygen, upload their data using the newly generated keys, and then delete the old data by running tarsnap --nuke with the old keys. (Note that creating a new archive with the same set of keys will not cause data to be re-encrypted and uploaded, since Tarsnap's de-duplication will recognize the duplicated data.) Anyone wishing to do this should contact me via email so that I can provide a Tarsnap account credit to cover the bandwidth fees which would otherwise be charged. (Of course, if the US government wants your data, re-encrypting it and deleting the old version from Tarsnap won't force them to delete any copies they have made — but it might help you if the US government doesn't realize that it wants your data yet.)

Tarsnap users who wish to stop using Tarsnap should delete their stored data by running tarsnap --nuke and contact me via email for a refund.

Tarsnap users with any other questions or concerns should contact me via email, Twitter, IRC, or any other convenient form of communication.

What I'm doing about this

After being contacted on Friday afternoon and confirming the bug, I immediately re-checked all of the Tarsnap cryptographic code; I found no other bugs. Of course, this can't guarantee that there are no subtle issues lurking; but at least it makes it very unlikely that other similarly obvious problems exist.

I've also added "double-check all changes to critical security code, even if they are 'cosmetic' or 'refactoring' changes" to my pre-release checklist. When I wrote the original chunk-encryption code, I reviewed my work very carefully to make sure that I got it right — and it was right for two years, until I accidentally introduced this bug while making what I thought was an insignificant change. This is an important lesson to learn: Mistakes can happen any time a piece of code is modified.

Finally, I am instituting a Tarsnap bug bounty (complete details to follow in a later blog post). This bug was found and reported to me by someone who was reading the Tarsnap source code purely out of curiosity. I'm a great fan of curiosity, but I've also learned that money can help to encourage curiosity. While I hope that this is the last time I have to pay out a bounty for a security bug, if there are other bugs I hope this bounty will result in them being found sooner rather than later.

Final remarks

I will not attempt to decrypt and read your data. Amazon claims that it does not inspect Amazon Web Services users' data. And the US government is theoretically bound by a constitution which prohibits unreasonable searches. This is all, however, entirely irrelevant: The entire point of Tarsnap's security is to remove the need for such guarantees. You shouldn't need to trust me; you shouldn't need to trust Amazon; and you most certainly shouldn't need to trust the US government.

This was a very easy mistake to make. Anyone could have made it. It was also a very easy mistake to find. I should have found it, 19 months ago, before releasing version 1.0.22 of Tarsnap. I didn't, and I'm sorry.

I'd like to thank Taylor R Campbell for bringing this bug to my attention.

Q&A

Some questions I've been hearing, aggregated here so that people can stop asking them:

Is the updated Tarsnap in the FreeBSD ports tree?

Yes. It wasn't when this announcement first went up, but I committed the update at 21:23 UTC.

Is there any way to download all the data for a machine, re-encrypt, and re-upload?

This is theoretically possible, but needs some new code to be written, and I didn't want to delay announcing this bug for the time it would take to write that code. If you don't want to take the 'upload a new archive and nuke the old ones' approach (e.g., if you have important history to keep), you'll have to wait a few days at least.

UPDATE: This can be done using the new tarsnap-recrypt utility in version 1.0.29 of the Tarsnap client code.

How do I generate new keys?

tarsnap-keygen --keyfile /root/new-tarsnap.key --user me@example.com --machine mybox

So are my keys compromised now?

This bug affected data stored on Tarsnap, not the keys used to encrypt it. If you delete all your data and then re-upload, it will be encrypted securely; the only reason to need new keys is if you have data already stored and need to make sure that Tarsnap's deduplication doesn't prevent the data from being re-uploaded.

One caveat to this: If your Tarsnap keys were in an archive you stored, they might be compromised that way.

I'm not worried about you, Amazon, or the US government reading my data; all I'm concerned about is keeping it safe from script kiddies. Do I need to worry about this?

Script kiddies aren't going to be able to access Tarsnap's backing storage on S3, so this issue shouldn't affect you. (Whether your lack of worry about me, Amazon, and the US government is justified is another matter, but that's for you to judge, not me.)

I don't want to create new keys; can I keep my existing key file and nuke first then upload?

Yes. The purpose of creating a new key is to ensure that new data isn't deduplicated against old (insecurely encrypted) data, so if you delete all the old data first you'll be fine. Unless, of course, your computer dies between deleting the old archives and uploading the new ones...

Posted at 2011-01-18 21:05

Daemonic Dispatches