make ec2ami

As my regular readers will be aware, I have been working on bringing FreeBSD to the Amazon EC2 platform for many years. While I have been providing pre-built FreeBSD/EC2 AMIs for a while (and last year I wrote about my process for building images), I have been told repeatedly that FreeBSD users would like to have a more streamlined process for building images. As of a few minutes ago, this is now available in the FreeBSD src tree: make ec2ami
This code makes use of the FreeBSD release-building framework — in particular the work done by Glen Barber to create virtual-machine images — and the bsdec2-image-upload tool I wrote a few months ago. Once you have installed bsdec2-image-upload (from pkg or the ports tree) and set some AWS parameters (AWS key file, region, and S3 bucket to be used for staging) in /etc/make.conf, building an AMI takes just two commands:
# cd /usr/src && make buildworld buildkernel
# cd /usr/src/release && make ec2ami
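As a sketch of the make.conf setup mentioned above — note that the variable names here are assumptions on my part, so check the release Makefiles in your src tree for the authoritative names:

```
# /etc/make.conf -- hypothetical variable names; verify against
# the EC2 release Makefile in your own src tree
AWSKEYFILE=/root/aws.key      # file containing AWS access key ID and secret
AWSREGION=us-east-1           # EC2 region in which the AMI is built
AWSBUCKET=my-staging-bucket   # S3 bucket used for staging the disk image
```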
This will build an AMI in the EC2 region you specified; if you want to copy the AMI out to all EC2 regions and make it public, you can do this by adding an extra option:
# cd /usr/src && make buildworld buildkernel
# cd /usr/src/release && make ec2ami EC2PUBLIC=YES
I did this a few hours ago, and after about an hour of bits being copied around, I had this on my console:
Created AMI in us-east-1 region: ami-aa0532c2
Created AMI in us-west-1 region: ami-bb2cccff
Created AMI in us-west-2 region: ami-534b6263
Created AMI in sa-east-1 region: ami-1315ae0e
Created AMI in eu-west-1 region: ami-55b12b22
Created AMI in eu-central-1 region: ami-4a97aa57
Created AMI in ap-northeast-1 region: ami-1de21f1d
Created AMI in ap-southeast-1 region: ami-a66f5cf4
Created AMI in ap-southeast-2 region: ami-2df98a17
Building an AMI in one command: Could it really get any easier?
FreeBSD 10 iwn problems

Apologies to my regular readers: This post will probably not interest you; rather than an item of general interest, I'm writing here for the benefit of anyone who is running into a very specific bug I encountered on FreeBSD. Hint for Googlebot: If someone is looking for FreeBSD 10 iwn dies or iwn stops working on FreeBSD 10, this is the right place.
A few weeks ago I updated an old laptop from FreeBSD 9.3 to FreeBSD 10.1, and I soon found that the Intel Centrino Advanced-N 6205 wifi (using the iwn0 driver) would stop working after a few hours. Or to be more precise: After being connected to my wifi network for a while, it would drop the connection and attempt to reconnect; but wpa_supplicant would not be able to rejoin the network. Restarting the device via ifconfig iwn0 down; ifconfig iwn0 up would make it work again — until the next time it died, that is.
It turns out that there are some bugs in the 802.11n support in the FreeBSD 10.1 iwn driver and/or 802.11 infrastructure. These are fixed in HEAD, but on 10.1 they can be worked around by disabling 802.11n, i.e., adding -ht to the ifconfig line for the interface. In my /etc/rc.conf I now have:
wlans_iwn0="wlan0"
ifconfig_wlan0="-ht WPA"

and everything is working perfectly.
Thanks to Adrian Chadd for helping me track down the problem and suggesting the workaround.
When security goes right

I've written a lot over the years about ways that companies have gotten security wrong; as a pedagogical technique, I find that it is very effective, since people tend to remember those stories better. Today, I'd like to tell a different story: A story about how a problem was fixed.
A few weeks ago, I heard that my slow and unreliable ISP was raising their prices, and I decided to look around for a better ISP. I found one in the form of Novus, which has fibre runs to towers around the Vancouver area. I signed up; they came to the building and placed the necessary patch cable in the basement telecommunications room; and I had 50/10 Mbps internet access costing less than my previous 25/2.5 Mbps.
Then I went onto Novus' account management website and found my bandwidth utilization graph. "Cool", I thought, "they're using RRDtool. I wonder if they're generating the standard day/week/month/year graphs, or just the monthly graph." Right click, open in new tab, look at URL. "Huh, that's interesting, it's just a path to a .gif file, not a CGI invocation. Well, maybe they're rewriting usage_graphs to the directory with the graphs for whichever user is logged in. What do I get if I issue a request for the directory itself?"
Oops. Rather than a list of traffic graphs for my account, I received an Apache directory listing with thousands upon thousands of traffic graphs — all the traffic graphs ever created for their customers, in fact — and a few random clicks confirmed that I could indeed download those graphs. Being familiar with RRDtool, I knew exactly how this had happened: One of the common configurations generates traffic graphs on demand but writes them into a directory to be fetched separately. This can save CPU time compared to always generating graphs, since RRDtool can point at an existing graph rather than regenerating it; but, as this case demonstrates, it isn't always a safe configuration to use.
Time to do something about this; but how? I've learned from experience that front-line technical support should only be contacted as a last resort; alas, the Shibboleet backdoor is not widely implemented. Not only do front-line technical support personnel tend to waste time because they don't understand the issues involved, they often mangle problem reports while forwarding them — more than once I've cornered an engineer and said "so, you never fixed this issue I reported a few months ago..." only to get an amazed "so that's what that ticket was about" reply.
Since I didn't want to phone the technical support line, I looked for another contact. No "security" contact on their website, unfortunately... I could have contacted them on Twitter, but I didn't really want to explain the issue in public (and wouldn't necessarily have reached anyone any more clueful)... Aha! They have valid whois data for their domain. Even better, they have separate "Administrative" and "Technical" contacts — and when companies do this, it's a good indication that the "Technical" contact will go to someone who is actually technical. (For ISPs, whois data for IP address blocks is also often useful; that would have taken me to the same person.)
He wasn't answering his phone when I first called, but ten minutes later I got through:
"Hi, I got your phone number from Novus' whois data. The RRDtool graphs for all your customers are visible. If I go to $URL I can see a list of the graphs and download any of them."

"That's not good... uhh, who are you?"

After I identified myself (and explained that I was a new customer) he promised to get the problem fixed right away, and within ten minutes Apache was sending 403 (Forbidden) responses for the directory. This solved most of the problem, but the file names were predictably based on the customer account number, so knowing someone's account number could still allow an attacker to fetch their traffic graphs. A short time later they started checking Referer headers; but as I pointed out to them, sending a fake Referer header is easy, so that only helps against the most careless attackers. I suggested several options for solving this, and they chose the simplest: Use a cron job to delete all the generated graphs every minute.

While this does leave a very slight exposure, it's narrow enough to be inconsequential: In order to illicitly fetch a customer traffic graph, an attacker would need to be a logged-in Novus customer (that filtering was in place from the start), know the account number of their victim, and issue the request within 60 seconds of the victim viewing their traffic graph. (A completely robust approach would be to have code which checks the graph file name against the account number of the logged-in user, but writing new code always risks introducing new bugs, so I really can't fault them for taking the 99.99% solution.)
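The cron-based cleanup described above could be as simple as a single crontab entry; the path and filename pattern here are hypothetical, purely to illustrate the shape of the fix:

```
# Hypothetical crontab entry: purge all generated RRDtool graphs
# every minute, limiting the exposure window to at most 60 seconds.
* * * * * rm -f /var/www/usage_graphs/*.gif
```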
After fixing the issue, they sent me an email thanking me, and — much to my surprise — telling me that as a sign of their appreciation they were crediting a month of free internet access to my account. While I award bounties to people who report bugs (security or otherwise) in my company's product, I don't expect other companies to do likewise... especially if they don't publish any such bounties.
Obviously, Novus made a mistake in their original configuration, but in the spirit of Christmas, this post isn't about mistakes: It's about what went right. So what went right here?
First, they were using widely used open source code; if I hadn't been familiar with it, I wouldn't have noticed the problem. Bad guys will always dig deeper than unpaid good guys; so if you're going to benefit from having many eyeballs looking at what you're doing, it's much better if your bugs are shallow.
Second, I saw something and I said something. This is just good citizenship; I could have saved myself half an hour by ignoring the issue, but having seen it I felt that I had a moral duty to report it.
Third, I was able to find a way to report the problem to someone who would understand it and be able to fix it. If you're running a company and you're serious about security, put a page on your website with a GPG key and a security contact email address — this will guarantee that people who find issues will know where to send them, as well as announcing that you will take the issues seriously. If you don't want to go that far, at a minimum you should make sure that you have a valid technical whois contact for your domain: People who think to look up domain whois records are the people you want to hear from.
Fourth, I didn't ask for a "consulting fee" or any other remuneration for disclosing the vulnerability. This is something people new to the field often do, and it's a bad idea. Seriously kids, don't do this. You're not likely to get any money out of it, and you are likely to cross the line into extortion and end up in jail.
Fifth, while I had to "exploit" the vulnerability — loading other customers' traffic graphs — in order to confirm that it existed, I did so only to the minimum extent necessary. Unlike Andrew "weev" Auernheimer, upon finding that information had been accidentally made publicly available I did not proceed to download all of it; I knew that I was not intended to have that information, so I had no moral right to access it regardless of the questionable legality of doing so.
And finally, Novus immediately took the issue seriously and remained in contact with me to ensure that the issue was resolved. Many companies would file this into a bug tracker and have a sysadmin look at it months later; Novus gave it the attention it deserved, and when I wrote back to point out that their Referer-checking fix was inadequate they went back to apply a further fix.
Understanding the technical aspects of security is great, but soft skills matter too. Goodwill to all isn't just a Christmas message; it's also good security.
The missing ImportVolume documentation

As a general rule, the documentation provided by Amazon Web Services is very good; in many ways, they set the standard for what documentation for public APIs should look like. Occasionally, however, important details are inexplicably absent from the documentation, and — I suspect in part due to Amazon's well known culture of secrecy — it tends to be very difficult to get those details. One such case is the EC2 ImportVolume API call.
As the maintainer of the FreeBSD/EC2 platform, I wanted to use this API call to simplify the process of building EC2 images for FreeBSD. For several years my build process has involved launching an EC2 instance, building a FreeBSD disk image directly onto an EC2 volume, and then converting that volume into an EC2 machine image ("AMI"); but in order to integrate better with the FreeBSD release process, it was essential to be able to build the disk image offline and then upload it — ideally without ever launching an EC2 instance. The ImportVolume API call is intended for exactly this purpose; but one of the mandatory parameters to this call — Image.ImportManifestUrl — came without any documentation of what data the "disk image manifest" file should contain. Without that crucial documentation, it's impossible to create a manifest file; without creating a manifest file, it's impossible to make use of the ImportVolume API; and without the ImportVolume API I could not streamline the FreeBSD/EC2 build process.
Fortunately, developers inside Amazon have access to better documentation. While the ImportVolume API is not implemented in the AWS CLI, it is implemented in the — much older, and now rarely used — EC2 API Tools package. Despite the EC2 API Tools not being useful for the FreeBSD release process — being written in Java, they are far too cumbersome — they did allow me to figure out the structure of the requisite manifest file; and so I am able to present the missing ImportVolume documentation:
One interesting thing about the manifest file is that it does not contain any information to allow the disk image parts stored on S3 to be validated; nor, for that matter, is there any mechanism in the EC2 ImportVolume API call to allow the manifest file itself to be validated. Presumably this is because data stored in S3 is assumed to be safe from any tampering; if I were designing this API, I would add a <sha256> tag into each <part> and add an Image.ImportManifestSHA256 parameter to the ImportVolume API call. In fact, these could easily be added as optional parameters, in order to provide security without compromising backwards compatibility.
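To be clear, the <sha256> tag suggested above is not part of the real API; purely as an illustration of the idea, the per-part digest that such a tag could carry might be computed like this:

```python
import hashlib

def part_sha256(part_bytes):
    """Hex SHA-256 digest that a hypothetical <sha256> tag could carry.

    Each disk image part would be hashed before upload, and the digest
    embedded in the manifest so that EC2 could verify the data it
    fetches from S3 against the manifest.
    """
    return hashlib.sha256(part_bytes).hexdigest()

digest = part_sha256(b"example part contents")
```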
EC2 ImportVolume manifest file format

The EC2 ImportVolume manifest file is a standalone XML file containing a top-level <manifest> tag. This tag contains:
- A <version> tag with the value "2010-11-15".
- A <file-format> tag containing the image file format, as in the Image.Format parameter to ImportVolume: "RAW", "VHD", or "VMDK".
- An <importer> tag providing information about the software used to generate the manifest file; it contains <name>, <version> and <release> tags which seem to be purely informational.
- A <self-destruct-url> tag containing an S3 URL which is pre-signed for issuing a DELETE request on the manifest file object.
- An <import> tag containing a <size> tag (with the size in bytes of the disk image, as in the Image.Bytes parameter to ImportVolume), a <volume-size> tag (with the size in GB of the volume to be created, as in the Volume.Size parameter to ImportVolume), and a <parts> tag with a count attribute set to the number of <part> tags which it contains.
Each <part> tag corresponds to a portion of the disk image, has an index attribute identifying the position of this part in sequence (numbered starting at 0), and contains:
- A <byte-range> tag with start and end attributes specifying the position of the first and last bytes of this part in the disk image.
- A <key> tag containing the S3 object name for this part. [I'm not sure what purpose this serves, and it's possible that these could just be any unique names for the parts.]
- <head-url>, <get-url>, and <delete-url> tags containing S3 URLs which are pre-signed for issuing HEAD, GET, and DELETE requests respectively on the S3 object containing this part.
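The part layout described above is mechanical: the image is split into fixed-size pieces, each identified by a zero-based index and an inclusive byte range. As a minimal sketch (the 10 MiB part size matches the example below, but the uploader is free to choose another size):

```python
def part_ranges(image_size, part_size=10 * 1024 * 1024):
    """Yield (index, start, end) tuples for each disk image part.

    `end` is inclusive, matching the start/end attributes of the
    <byte-range> tags in the ImportVolume manifest.
    """
    index = 0
    start = 0
    while start < image_size:
        end = min(start + part_size, image_size) - 1
        yield (index, start, end)
        index += 1
        start = end + 1

# A 1 GiB image split into 10 MiB parts yields 103 parts, the last of
# which covers bytes 1069547520 through 1073741823.
ranges = list(part_ranges(1073741824))
```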
Example

The following is an example of a manifest file (with a repetitive section elided):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<manifest>
  <version>2010-11-15</version>
  <file-format>RAW</file-format>
  <importer>
    <name>ec2-upload-disk-image</name>
    <version>1.0.0</version>
    <release>2010-11-15</release>
  </importer>
  <self-destruct-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.imgmanifest.xml?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&Expires=1417078408&Signature=LZi1Hzkq%2FfC%2BUJrxj7m1DfozTUI%3D</self-destruct-url>
  <import>
    <size>1073741824</size>
    <volume-size>1</volume-size>
    <parts count="103">
      <part index="0">
        <byte-range end="10485759" start="0"/>
        <key>c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part0</key>
        <head-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part0?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&Expires=1417078408&Signature=%2FFHSYRWxr5F7h1yoUuT1uV4lXD0%3D</head-url>
        <get-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part0?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&Expires=1417078408&Signature=KulBDTq%2BoHeWJzXZ2iOPSjk%2FN%2BQ%3D</get-url>
        <delete-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part0?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&Expires=1417078408&Signature=NgupOVUophQxUzgkBl9t%2BngofuM%3D</delete-url>
      </part>
      . . .
      <part index="102">
        <byte-range end="1073741823" start="1069547520"/>
        <key>c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part102</key>
        <head-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part102?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&Expires=1417078408&Signature=L7XD9KPWd1O%2B3Yt%2BLBJUzRXA4HA%3D</head-url>
        <get-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part102?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&Expires=1417078408&Signature=0Nl%2BRFguwfnRuYDS22F0noJ8BwE%3D</get-url>
        <delete-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part102?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&Expires=1417078408&Signature=S%2BuNzWlxA9%2F8P7549s4O2NbwVss%3D</delete-url>
      </part>
    </parts>
  </import>
</manifest>
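Given the format documented above, generating a manifest is straightforward. The following is a minimal sketch, not the tool I actually wrote: the importer name and the placeholder URLs are invented for illustration, and the pre-signed S3 URLs would in practice be generated separately with an S3 client.

```python
import xml.etree.ElementTree as ET

def build_manifest(file_format, image_size, volume_size_gb, parts):
    """Build an ImportVolume manifest as documented above.

    `parts` is a list of dicts with keys: key, start, end, head_url,
    get_url, delete_url.  URLs here are caller-supplied placeholders;
    real ones must be pre-signed for the corresponding S3 requests.
    """
    m = ET.Element("manifest")
    ET.SubElement(m, "version").text = "2010-11-15"
    ET.SubElement(m, "file-format").text = file_format   # RAW, VHD, or VMDK
    importer = ET.SubElement(m, "importer")              # purely informational
    ET.SubElement(importer, "name").text = "example-importer"
    ET.SubElement(importer, "version").text = "1.0.0"
    ET.SubElement(importer, "release").text = "2010-11-15"
    # Pre-signed DELETE URL for the manifest object itself (placeholder):
    ET.SubElement(m, "self-destruct-url").text = "https://example-bucket/manifest-url"
    imp = ET.SubElement(m, "import")
    ET.SubElement(imp, "size").text = str(image_size)          # Image.Bytes
    ET.SubElement(imp, "volume-size").text = str(volume_size_gb)  # Volume.Size
    ps = ET.SubElement(imp, "parts", count=str(len(parts)))
    for i, p in enumerate(parts):
        part = ET.SubElement(ps, "part", index=str(i))
        ET.SubElement(part, "byte-range",
                      start=str(p["start"]), end=str(p["end"]))
        ET.SubElement(part, "key").text = p["key"]
        ET.SubElement(part, "head-url").text = p["head_url"]
        ET.SubElement(part, "get-url").text = p["get_url"]
        ET.SubElement(part, "delete-url").text = p["delete_url"]
    return ET.tostring(m, encoding="unicode")

demo_parts = [{"key": "example/disk.img.part0", "start": 0, "end": 10485759,
               "head_url": "https://example-bucket/p0-head",
               "get_url": "https://example-bucket/p0-get",
               "delete_url": "https://example-bucket/p0-delete"}]
xml_text = build_manifest("RAW", 1073741824, 1, demo_parts)
```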
Having determined the format of this file, I was then able to return to my original task — streamlining the FreeBSD EC2 image build process. To that end, I wrote a BSD EC2 image upload tool; and yesterday I finished preparing patches to integrate EC2 builds into the FreeBSD release process. While I still have to negotiate with the FreeBSD release engineering team about these — they are, naturally, far more familiar with the release process than I am, and there may be some adjustments needed for my work to fit into their process — I am confident that when the FreeBSD 10.2 release cycle starts there will be EC2 images built by the release engineering team rather than by myself.
And, of course, in the spirit of open source: My code and the formerly missing ImportVolume documentation are now available for anyone who might find either of them useful.
Thoughts on Startup School

Last weekend, I attended Y Combinator's Startup School. When the event was announced, I was distinctly ambivalent about attending — in fact I had decided against attending many previous such events due to the cost (in both time and money) of travelling down to the San Francisco bay area — but everybody I asked told me that it was well worth attending (even for someone from outside the valley), so I took their advice and signed up.
Perhaps my expectations were misaligned, but I was astonished at how geek-unfriendly the event was. At conferences like BSDCan, one of the major considerations is the availability of power sockets; at Startup School, not only were the power sockets very limited — we were in a theatre with perhaps a dozen outlets around the perimeter — but when I plugged in my laptop I was firmly told by one of the ushers (yes, ushers) that I wasn't allowed to do that. For that matter, the ushers were also telling people that they weren't allowed to stand at the side or back of the room, and anyone who left was told that they wouldn't be allowed back in until the current item on the program finished. An attitude reasonable for a symphony concert, to be sure; but entirely misplaced in this context.
The talks themselves, while occasionally entertaining, were somewhat lacking in useful information. When they weren't being outright contradictory — Trust in your vision! But know when to pivot! — there was little to be learned which couldn't already be gleaned from Paul Graham's Essays: Good investors are important far beyond the money they bring; having good co-founders (or, at a minimum, not having bad co-founders) is critical; and the best way to grow a startup is to create an excellent product. There were only two moments which stuck out in my mind: First, Ron Conway's comment that while older venture capitalists are better at giving advice and performing due diligence, it is the youngest people who are best at picking promising startups; and second, the founder of Groupon announcing, to my great astonishment, that he doesn't like people who build unsustainable businesses.
More than anything else, I would categorize the talks as "inspirational" — within very deliberate scare quotes — since while I did not find the talks to be particularly inspiring, the amount of applause directed at any mention of millions or billions of dollars suggested that many attendees were impressed by the speakers' financial successes. I came away wondering if this was in fact the real purpose of Startup School: Rather than being a "one day conference where you'll hear stories and practical advice", as the website billed it, perhaps it — taking place just a few days before the deadline for Y Combinator applications — was really just a recruiting program for YC.
On the other hand, maybe the talks aren't the point of attending Startup School: After all, you can watch them online. Startup School brought together about 1700 people who are interested in startup companies; there must be interesting people to talk to in such a group, yes? Indeed there were; the challenge was to find them amidst the crowd. While I did have some very good conversations (and met several Tarsnap users, including a few who, much to my amusement, did not know that I was the developer behind it), there were far more people who I would charitably describe as "business-oriented". While this is a problem I've seen in Vancouver — lots of people who want to run startups, but far fewer people who want to build products — I had always thought that the concentration of technical talent in the bay area produced a better ratio. At the end of the day, I was left longing for the post-dot-com-bubble days, when the business people had fled and only the geeks were left behind.
I don't plan on attending next year's Startup School, and I would hesitate to recommend it to any other startup founder. If you're considering launching a startup and you need some "inspiration" to push you into going ahead, then by all means attend. For that matter, if you're looking for an audience to practice your "elevator pitch" on, you could certainly do worse. But if you're already working on a startup? Your time is probably better spent staying home and writing code.