FreeBSD/EC2 AMI Systems Manager Public Parameters

In June, I posted a EC2 Wishlist with three entries: "AWS Systems Manager Public Parameters", "BootMode=polyglot", and "Attaching multiple IAM Roles to an EC2 instance". I am happy to say that my first wish has been granted!

The necessary flags were recently set within AWS, and a few days ago I added code to FreeBSD's build system to register 14.0-CURRENT AMI Ids as Public Parameters. (I'll be merging this code to 13-STABLE and 12-STABLE in the coming weeks.) I've also "backfilled" the parameters for releases from 12.0 onwards.

This means that you can now

$ aws --region us-east-1 ssm get-parameter
    --name /aws/service/freebsd/arm64/base/ufs/13.0/RELEASE |
    jq -r '.Parameter.Value'
ami-050cc11ac34def94b
(using the jq tool to extract the Value field from the JSON blog returned by the AWS CLI) to look up the arm64 AMI for 13.0-RELEASE, and also
$ aws ec2 run-instances
    --image-id resolve:ssm:/aws/service/freebsd/arm64/base/ufs/13.0/RELEASE
    ... more command line options here ...
to look up the AMI and launch an instance — no more grepping the release announcement emails to find the right AMI Id for your region! Assuming everything works as expected, this will also be very useful for anyone who wants to run the latest STABLE or CURRENT images, since every time a new weekly snapshot is published the Public Parameter will be updated.

Many thanks to David and Arthur at AWS for their assistance in liaising with the Systems Manager team — I wouldn't have been able to do this without them!

This work was supported by my FreeBSD/EC2 Patreon; if you find it useful, please consider contributing so that I have more "funded hours" to spend on FreeBSD/EC2 work.

Posted at 2021-08-31 04:30 | Permanent link | Comments

EC2 boot time benchmarking

Last week I quietly released ec2-boot-bench, a tool for benchmarking EC2 instance boot times. This tool is BSD licensed, and should compile and run on any POSIX system with OpenSSL or LibreSSL installed. Usage is simple — give it AWS keys and tell it what to benchmark:
usage: ec2-boot-bench --keys <keyfile> --region <name> --ami <AMI Id>
    --itype <instance type> [--subnet <subnet Id>] [--user-data <file>]
and it outputs four values — how long the RunInstances API call took, how long it took EC2 to get the instance from "pending" state to "running" state, how long it took once the instance was "running" before port TCP/22 was "closed" (aka. sending a SYN packet got a RST back), and how long it took from when TCP/22 was "closed" to when it was "open" (aka. sending a SYN got a SYN/ACK back):
RunInstances API call took: 1.543152 s
Moving from pending to running took: 4.904754 s
Moving from running to port closed took: 17.175601 s
Moving from port closed to port open took: 5.643463 s

Once I finished writing ec2-boot-bench, the natural next step was to run some tests — in particular, to see how FreeBSD compared to other operating systems used in EC2. I used the c5.xlarge instance type and tested FreeBSD releases since 11.1-RELEASE (the first FreeBSD release which can run on the c5.xlarge instance type) along with a range of Linux AMIs mostly taken from the "quick launch" menu in the AWS console. In order to perform an apples-to-apples comparison, I passed a user-data file to the FreeBSD instances which turned off some "firstboot" behaviour — by default, FreeBSD release AMIs will update themselves and reboot to ensure they have all necessary security fixes before they are used, while Linuxes just leave security updates for users to install later:

>>/etc/rc.conf
firstboot_freebsd_update_enable="NO"
firstboot_pkgs_enable="NO"

For each of the AMIs I tested, I ran ec2-boot-bench 10 times, discarded the first result, and took the median values from the remaining 9 runs. The first two values — the time taken for a RunInstances API call to successfully return, and the time taken after RunInstances returns before a DescribeInstances call says that the instance is "running" — are consistent across all the AMIs I tested, at roughly 1.5 and 6.9 seconds respectively; so the numbers we need to look at for comparing AMIs are just the last two values reported by ec2-boot-bench, namely the time before the TCP/IP stack is running and has an IP address, and the time between that point and when sshd is running.

The results of my testing are as follows:
AMI Id (us-east-1) AMI Name running to port closed closed to open total
ami-0f9ebbb6ab174bc24Clear Linux 346401.230.001.23
ami-07d02ee1eeb0c996cDebian 106.264.0910.35
ami-0c2b8ca1dad447f8aAmazon Linux 29.551.5411.09
ami-09e67e426f25ce0d7Ubuntu Server 20.04 LTS7.394.6512.04
ami-0747bdcabd34c712aUbuntu Server 18.04 LTS10.644.3014.94
ami-03a454637e4aa453dRed Hat Enterprise Linux 8 (20210825)13.162.1115.27
ami-0ee02acd56a52998eUbuntu Server 16.04 LTS12.765.4218.18
ami-0a16c2295ef80ff63SUSE Linux Enterprise Server 12 SP516.326.9623.28
ami-00be86d9bba30a7b3FreeBSD 12.2-RELEASE17.096.2223.31
ami-00e91cb82b335d15fFreeBSD 13.0-RELEASE19.005.1324.13
ami-0fde50fcbcd46f2f7SUSE Linux Enterprise Server 15 SP218.136.7624.89
ami-03b0f822e17669866FreeBSD 12.0-RELEASE19.825.8325.65
ami-0de268ac2498ba33dFreeBSD 12.1-RELEASE19.936.0926.02
ami-0b96e8856151afb3aFreeBSD 11.3-RELEASE22.615.0527.66
ami-70504266FreeBSD 11.1-RELEASE25.724.3930.11
ami-e83e6c97FreeBSD 11.2-RELEASE25.455.3630.81
ami-01599ad2c214322aeFreeBSD 11.4-RELEASE55.194.0259.21
ami-0b0af3577fe5e3532Red Hat Enterprise Linux 813.4352.3165.74

In the race to accept incoming SSH connections, the clear winner — no pun intended — is Intel's Clear Linux, which boots to a running sshd in a blistering 1.23 seconds after the instance enters the "running" state. After Clear Linux is a roughly three way tie between Amazon Linux, Debian, and Ubuntu — and it's good to see that Ubuntu's boot performance has improved over the years, dropping from 18 seconds in 16.04 LTS to 15 seconds in 18.04 LTS and then to 12 seconds with 20.04 LTS. After the Amazon Linux / Debian / Ubuntu cluster comes SUSE Linux and FreeBSD; here, interestingly, SUSE 12 is faster than SUSE 15, while FreeBSD 12.2 and 13.0 (the most recent two releases) are noticeably faster than older FreeBSD.

Finally in dead last place comes Red Hat — which brings up its network stack quickly but takes a very long time before it is running sshd. It's possible that Red Hat is doing something similar to the behaviour I disabled in FreeBSD, in downloading and installing security updates before exposing sshd to the network — I don't know enough to comment here. (If someone reading this can confirm that possibility and has a way to disable that behaviour via user-data, I'll be happy to re-run the test and revise this post.)

UPDATE: Turns out that Red Hat's terrible performance was due to a bug which was fixed in the 2021-08-25 update. I tested the new version and it now lands in the middle of the pack of Linuxes rather than lagging far behind.

Needless to say, FreeBSD has some work to do to catch up here; but measurement is the first step, and indeed I already have work in progress to further profile and improve FreeBSD's boot performance, which I'll write about in a future post.

If you find this useful, please consider supporting my work either via my FreeBSD/EC2 Patreon or by sending me contributions directly. While my work on the FreeBSD/EC2 platform originated from the needs of my Tarsnap online backup service, it has become a much larger project over the years and I would be far more comfortable spending time on this if it weren't taking away so directly from my "paid work".

Posted at 2021-08-12 04:15 | Permanent link | Comments

My EC2 wishlist

I've been using Amazon EC2 since 2006, and I've been maintaining the FreeBSD/EC2 platform for over a decade. Over those years I've asked Amazon for many features; some of them, like HVM support (EC2 originally only support Xen/PV) and bidirectional serial console support (EC2 originally had an "output-only" serial console) eventually arrived, but I'm still waiting for others — some of which should be very easy for AWS to provide and would yield very large benefits.

While I've made engineers inside Amazon aware of all of these at various times, I think it's time to post my wishlist here — both so that a wider audience inside Amazon can hear more about these, and so that the FreeBSD community (especially the people who are financially supporting my work) can see what I'm aiming towards.


AWS Systems Manager Public Parameters

FreeBSD release announcements currently include a long list of AMI IDs — two for each EC2 region — and I would publish more AMIs if it weren't for the impracticality of putting all the AMI IDs into the announcements. One might say "there's got to be a better solution" — and indeed there is: AWS Systems Manager Public Parameters. Amazon publishes AMI IDs for Amazon Linux and Windows via the AWS Systems Manager Parameter Store, and Ubuntu AMI IDs are also published via the same mechanism (I assume by Canonical). I wrote code over a year ago to allow FreeBSD to publish AMI IDs the same way, but we can't use it until Amazon authorizes the FreeBSD release engineering account to publish these parameters — and we're still waiting.

In addition to allowing us to publish multiple AMIs (e.g. ZFS and cloud-init), if we had this then we could publish updated AMIs after every security update — using the Parameter Store to allow users to look up the latest updated version — which would dramatically speed up the process of launching new FreeBSD/EC2 instances.

Wishlist item #1: Please give the FreeBSD release engineering account access to store AWS Systems Manager Public Parameters.


BootMode=polyglot

A few months ago, Amazon started supporting UEFI booting on newer x86 instances. (ARM instances already used UEFI.) This is great news for FreeBSD, since we can boot much faster on UEFI than via the "legacy" BIOS boot mode — I/O is much faster since UEFI doesn't need to bounce disk reads through a small buffer in the bottom 1 MB of address space, and console output is much faster since we can use the UEFI console rather than a shockingly slow emulated VGA text mode. In fact, the total loader + kernel time (starting when the boot loader starts running, and stopping when the init process is spawned) drops from 10.9 seconds down to 3.9 seconds!

There's just one problem with this: AMIs are marked as either "legacy-bios" or "uefi", and while legacy-bios AMIs can boot on all of the x86 instance types, the UEFI-flagged AMIs can only boot on the instance types which support UEFI. FreeBSD's AMIs are built from disk images which support both boot methods — but when we make the EC2 RegisterImage API call, we have to specify one or the other. While we would love to make FreeBSD AMIs boot faster, we don't want to drop support for customers who are using older instance types.

Wishlist item #2: Please add a new "BootMode=polyglot" option, which marks AMIs as supporting both legacy-bios and uefi boot modes, with UEFI being used on instances where it is available and legacy-bios being used otherwise.


Attaching multiple IAM Roles to an EC2 instance

IAM Roles for EC2 are a very powerful — but very dangerous — feature, making credentials available to any process on the instance which can open a TCP connection to 169.254.169.254:80. Last year, I released imds-filterd, which allows access to the EC2 Instance Metadata Service (and thereby IAM Roles) to be locked down; as a result, you can now attach an IAM Role to an EC2 instance without the risk that a user-nobody privilege escalation allows an attacker to access the credentials.

There's only one problem: You can only attach a single IAM Role. This means that — even with imds-filterd restricting what each process can access in the metadata service — there's no way to give different credentials to different processes. This becomes a problem if you want to use the AWS Systems Manager Agent, since it requires credentials exposed as an IAM Role; there's no way to use the SSM Agent and another process which also uses IAM Role credentials without them both having access to each other's privileges. This even became a problem for Amazon a few years ago when they wanted to provide "extra" credentials to EC2 instances which could be used to manage SSH host keys: Because these credentials couldn't be attached as an IAM Role, they were exposed via the Instance Metadata Service as meta-data/identity-credentials/ec2/security-credentials/ec2-instance which Amazon's documentation helpfully marks as "[Internal use only]".

As it turns out, the EC2 API already supports attaching an array of IAM Roles to an instance, and the Instance Metadata Service already supports publishing credentials with different names — but the EC2 API throws an error if the array of IAM Roles has more than one name listed in it. Get rid of that restriction, and it will become much easier to properly effect privilege separation... and also easier for Amazon to provide credentials to code it has running on customer instances.

Wishlist item #3: Allow multiple IAM Roles to be attached to a single EC2 instance.


If you work at Amazon and can make one or all of these wishes come true, please get in touch (cperciva@FreeBSD.org). I really don't think any of these should be very difficult to provide on Amazon's side, and they would provide a huge benefit to FreeBSD. Alternatively, if you work at Amazon and you're screaming at your laptop "it's not that simple Colin!", please get in touch anyway (yes, I've signed the necessary NDAs).

And if you don't work at Amazon but you work at a large AWS customer: Please draw this list to the attention of your Amazonian contacts. Eventually we'll find someone who can make these happen!

Posted at 2021-06-16 01:45 | Permanent link | Comments

107 Lightbulbs

I bought a house last summer, and after moving in on October 1st, one of my first priorities was to replace all of the old (mostly incandescent) light bulbs with efficient LED bulbs. This turned into a five month saga.

Growing up, I thought that light bulbs were generally interchangeable. OK, there were fluorescent tube bulbs; but aside from that, while bulbs came in 40W, 60W, and 100W varieties, they were all the same size and screwed into the same sockets. Well, maybe that was the norm in the late 70s, but the house I bought was built in the late 90s and things definitely changed; there turned out to be no less than 14 different types of bulbs needing replacement:

If you're wondering what a house worth of assorted light bulbs looks like, here you go:

In total, replacing these bulbs cost $856.80 — but the power consumption (if I hypothetically had all the lights turned on at once) dropped from 5.8 kW down to 800 W. Based on an average 3 hours/day of usage (the Internet gives me estimates ranging from 1.5 hours/day up to 5 hours/day), the bulbs will pay for themselves in just one year — quite apart from the fact that the LED bulbs should last much longer than the incandescent bulbs they replaced.

Posted at 2021-03-04 03:20 | Permanent link | Comments

On the use of a life

In a recent discussion on Hacker News, a commenter posted the following question:
Okay, so, what do we think about TarSnap? Dude was obviously a genius, and spent his time on backups instead of solving millennium problems. I say that with the greatest respect. Is this entrepreneurship thing a trap?
I considered replying in the thread, but I think it deserves an in-depth answer — and one which will be seen by more people than would notice a reply in the middle of a 100+ comment thread.

First, to dispense with the philosophical argument: Yes, this is my life, and yes, I'm free to use — or waste — it however I please; but I don't think there's anything wrong with asking if this is how my time could be best spent. That applies doubly if the question is not merely about the choices I made but is rather a broader question: Is our society structured in a way which encourages people to make less than the greatest contribution they could?

That said, I do object somewhat to the premise of the question — specifically the statement that I "spent [my] time on backups".

On one hand, it is true: Tarsnap has been my full time job since 2006. I do occasional consulting — more in the early years than I have recently — but financially speaking that's a rounding error; it's Tarsnap which pays the bills (including allowing me to buy the house which I'll be moving into next week). On the other hand: My work on Tarsnap has extended considerably into adjacent fields.

In 2009, having had many users ask for passphrase-protected Tarsnap key files, and having determined that the current state of the art of password based key derivation was sorely lacking, I invented scrypt — and in the process, opened up a whole new field of cryptography. Sure, I was doing this because it was something I could do to make Tarsnap more secure; but it would be a stretch to place this under the umbrella of "spending my time working on backups".

In 2011, wanting to connect daemons on disparate hosts together securely, and not being happy with the existing TLS-based options, I wrote spiped. While I think it remains largely underappreciated by the world at large, I nonetheless consider it a significant contribution to computer security — and, like scrypt, while I created it to serve Tarsnap's needs, it would be a stretch to place such a general-purpose open-source tool under the narrow umbrella of "working on backups".

Around the same time, I started working on kivaloo, my high performance key-value data store. This may be the least used of all of the software I've written — I'm not aware of anyone else using it at present (although being open source software that doesn't necessarily preclude the possibility) — but I consider it to be some of my best code and I think it may become used in more niches than merely Tarsnap in the future.

Starting in 2006, and accelerating significantly after Amazon launched the "M3" family of HVM-enabled EC2 instances in 2012, I've been creating and maintaining the FreeBSD/EC2 platform. While I don't have any precise statistics on its usage, a survey last year found that 44% of people running FreeBSD in the cloud use Amazon EC2; so — despite the fact that only 22 people currently provide sponsorship for my efforts — it's clear that my work here has been productive. Again, while I was working on this because I wanted to run FreeBSD in EC2 for Tarsnap, I don't think it can be placed entirely into the category of "working on backups".

Of course, the question at hand isn't whether I've done anything useful, but rather whether this was the most useful way I could have spent these years. Judging by the reference to the Millennium Problems, I imagine that the specific alternative they had in mind was a research career; indeed, between my Undergraduate studies in number theory under the late Peter Borwein and my Doctoral studies in Oxford I might have considered seriously working on the Birch and Swinnerton-Dyer conjecture had my life taken a different path. (A very different BSD from the one with which I am currently involved!)

So why am I not an academic? There are many factors, and starting Tarsnap is certainly one; but most of them can be summarized as "academia is a lousy place to do novel research". In 2005, I made the first publication of the use of shared caches in multi-threaded CPUs as a cryptographic side channel, and in 2006 I hoped to continue that work. Having recently received my doctorate from Oxford University and returned home to Canada, I was eligible for a post-doctoral fellowship from Canada's National Sciences and Engineering Research Council, so I applied, and... I didn't get it. My supervisor cautioned me of the risks of doing work which was overly novel as a young academic: Committees don't know what to make of you, and they don't have any reputational prior to fall back upon. Indeed, I ran into this issue with my side channel attack: Reviewers at the Journal of Cryptology didn't understand why they were being asked to read a paper about CPU design, while reviewers at a computer hardware journal didn't understand why they were being asked to read about cryptography. It became clear, both from my own experiences and from advice I received, that if I wanted to succeed in academia I would need to churn out incremental research papers every year — at very least until I had tenure.

In many ways, starting my own company has given me the sort of freedom which academics aspire to. Sure, I have customers to assist, servers to manage (not that they need much management), and business accounting to do; but professors equally have classes to teach, students to supervise, and committees to attend. When it comes to research, I can follow my interests without regard to the whims of granting agencies and tenure and promotion committees: I can do work like scrypt, which is now widely known but languished in obscurity for several years after I published it; and equally I can do work like kivaloo, which has been essentially ignored for close to a decade, with no sign of that ever changing.

Is there a hypothetical world where I would be an academic working on the Birch and Swinnerton-Dyer conjecture right now? Sure. It's probably a world where high-flying students are given, upon graduation, some sort of "mini-Genius Grant". If I had been awarded a five-year $62,500/year grant with the sole condition of "do research", I would almost certainly have persevered in academia and — despite working on the more interesting but longer-term questions — have had enough publications after those five years to obtain a continuing academic position. But that's not how granting agencies work; they give out one or two year awards, with the understanding that those who are successful will apply for more funding later.

In short, academic institutions systemically promote exactly the sort of short-term optimization of which, ironically, the private sector is often accused. Is entrepreneurship a trap? No; right now, it's one of the only ways to avoid being trapped.

Posted at 2020-09-20 22:10 | Permanent link | Comments

Recent posts

Monthly Archives

Yearly Archives


RSS