FreeBSD on the Graviton 3

Amazon announced the Graviton 3 processor and C7g instance family in November 2021, but it took six months before they were ready for general availability; in the meantime, however, as the maintainer of the FreeBSD/EC2 platform I was able to get early access to these instances.
As far as FreeBSD is concerned, Graviton 3 is mostly just a faster version of the Graviton 2: Most things "just work", and the things which don't work on Graviton 2 — hotplug devices and cleanly shutting down an instance via the EC2 API — also don't work on Graviton 3. (Want to help get these fixed? Sponsor my work on FreeBSD/EC2 so I have a few more paid hours to work on this.) The one notable architectural difference with the Graviton 3 is the addition of Pointer Authentication, which makes use of "unused" bits in pointers to guard against some vulnerabilities (e.g. buffer overflows which overwrite pointers). Andrew Turner recently added support for arm64 pointer authentication to FreeBSD.
But since Graviton 3 is largely a "faster Graviton 2", the obvious question is "how much faster" — so I launched a couple instances (c6g.8xlarge and c7g.8xlarge) with 500 GB root disks and started comparing.
The first performance test I always run on FreeBSD is a quick microbenchmark
of hashing performance: The md5 command (also known as
sha1, sha256, sha512, and many other things) has
a "time trial" mode which hashes 100000 blocks of 10000 bytes each. I ran
a few of these hashes:
| hash   | Graviton 2 | Graviton 3 | speedup |
|--------|------------|------------|---------|
| md5    | 2.26 s     | 2.16 s     | 1.05x   |
| sha1   | 2.67 s     | 1.94 s     | 1.38x   |
| sha256 | 0.82 s     | 0.81 s     | 1.01x   |
| sha512 | 2.87 s     | 1.03 s     | 2.79x   |
The first two of these hashes (md5 and sha1) are implemented in FreeBSD as pure C code; here we see Graviton 3 pulling slightly ahead. The sha256 and sha512 hashes make use of the arm64 cryptographic extensions (which have special instructions for those operations) so it's no surprise that sha256 has identical performance on both CPUs; for sha512 however it seems that Graviton 3 has far more optimized implementations of the arm64 extensions, since it beats Graviton 2 by almost a factor of 3.
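Since the time trial hashes 100,000 blocks of 10,000 bytes (10^9 bytes in total), the timings above translate directly into throughput and speedup figures; a quick sketch of that arithmetic in Python, using the numbers from the table:

```python
# The md5 -t time trial hashes 100000 blocks of 10000 bytes = 1e9 bytes.
TOTAL_BYTES = 100_000 * 10_000

# Wall-clock timings in seconds, taken from the table above:
# (Graviton 2, Graviton 3)
timings = {
    "md5":    (2.26, 2.16),
    "sha1":   (2.67, 1.94),
    "sha256": (0.82, 0.81),
    "sha512": (2.87, 1.03),
}

for alg, (g2, g3) in timings.items():
    throughput = TOTAL_BYTES / g3 / 1e9   # GB/s on Graviton 3
    speedup = g2 / g3                     # Graviton 3 vs Graviton 2
    print(f"{alg}: {throughput:.2f} GB/s on Graviton 3, {speedup:.2f}x speedup")
```

Running this confirms the figures quoted later: sha256 runs at roughly 1.2 GB/s, and sha512 speeds up by a factor of 2.79.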
Moving on, the next thing I did was to get a copy of the FreeBSD src and
ports trees. Three commands: First, install git using the
pkg utility; second, git clone the FreeBSD src tree;
and third, use portsnap to get the latest ports tree (this last
one is largely a benchmark of fork(2) performance since
portsnap is a shell script):
| command                | Graviton 2 real | Graviton 2 CPU | Graviton 3 real | Graviton 3 CPU | real speedup | CPU speedup |
|------------------------|-----------------|----------------|-----------------|----------------|--------------|-------------|
| pkg install git        | 19.13 s         | 4.76 s         | 18.14 s         | 3.40 s         | 1.05x        | 1.40x       |
| git clone src          | 137.76 s        | 315.79 s       | 120.99 s        | 240.09 s       | 1.14x        | 1.32x       |
| portsnap fetch extract | 159.56 s        | 175.22 s       | 124.41 s        | 133.02 s       | 1.28x        | 1.32x       |
These commands are all fetching data from FreeBSD mirrors and extracting files to disk, so we should expect that changing the CPU alone would yield limited improvements; and indeed that's exactly what we see in the "real" (wall-clock) time. The pkg command drops only from 19.13 to 18.14 seconds — a 1.05x speedup — because most of the time pkg is running the CPU is idling anyway. The speedup in CPU time, in contrast, is a factor of 1.40x. Similarly, the git clone and portsnap commands spend some of their time waiting for network or disk; but their CPU time usage drops by a factor of 1.32x.
Well, now that I had a FreeBSD source tree cloned, I had to run the
most classic FreeBSD benchmark: Rebuilding the FreeBSD base system (world
and kernel). I checked out the 13.1-RELEASE source tree (normally I would
test building HEAD, but for benchmarking purposes I wanted to make sure
that other people would be able to run exactly the same compile later) and
timed make buildworld buildkernel -j32 (the -j32 tells
the FreeBSD build to make use of all 32 cores on these systems):
| build                | Graviton 2 real | Graviton 2 CPU | Graviton 3 real | Graviton 3 CPU | real speedup | CPU speedup |
|----------------------|-----------------|----------------|-----------------|----------------|--------------|-------------|
| FreeBSD world+kernel | 849.09 s        | 21892.79 s     | 597.14 s        | 14112.62 s     | 1.42x        | 1.55x       |
Here we see the Graviton 3 really starting to shine: While there's some disk I/O to slow things down, the entire compile fits into the disk cache (the src tree is under 1 GB and the obj tree is around 5 GB, while the instances I was testing on have 64 GB of RAM), so almost all of the time spent is on actual compiling. (Or waiting for compiles to finish! While we run with -j32, the FreeBSD build is not perfectly parallelized, and on average only 26 cores are being used at once.) The FreeBSD base system build completes on the Graviton 3 in 9 minutes and 57 seconds, compared to 14 minutes and 9 seconds on the Graviton 2 — a 1.42x speedup (and 1.55x reduction in CPU time).
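The "only 26 cores on average" figure falls out of dividing total CPU time by wall-clock time; a quick sanity check of that arithmetic in Python, using the build timings from the table above:

```python
# Build timings from the table above: (wall-clock seconds, CPU seconds).
builds = {
    "Graviton 2": (849.09, 21892.79),
    "Graviton 3": (597.14, 14112.62),
}

for name, (real, cpu) in builds.items():
    # Average number of busy cores = total CPU time / elapsed time;
    # with -j32 a perfectly parallel build would keep all 32 busy.
    avg_cores = cpu / real
    print(f"{name}: {avg_cores:.1f} cores busy on average (of 32)")
```

This works out to roughly 26 busy cores on the Graviton 2 and 24 on the Graviton 3.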
Ok, that's the base FreeBSD system; what about third-party packages? I
built the apache24, Xorg, and libreoffice
packages (including all of their dependencies, starting from a clean
system each time). In the interest of benchmarking the package builds
rather than the mirrors holding source code, I ran a make fetch-recursive
for each of the packages (downloading source code for the package and
all of its dependencies) first and only timed the builds.
| package     | Graviton 2 real | Graviton 2 CPU | Graviton 3 real | Graviton 3 CPU | real speedup | CPU speedup |
|-------------|-----------------|----------------|-----------------|----------------|--------------|-------------|
| apache24    | 502.77 s        | 1103.50 s      | 369.34 s        | 778.52 s       | 1.36x        | 1.42x       |
| Xorg        | 3270.62 s       | 41005.32 s     | 2492.98 s       | 28649.06 s     | 1.31x        | 1.43x       |
| libreoffice | 10084.95 s      | 106502.80 s    | 7306.28 s       | 74385.83 s     | 1.38x        | 1.43x       |
Here again we see a large reduction in CPU time — by a factor of 1.42 or 1.43 — from the Graviton 3, although as usual the "real" time shows somewhat less improvement; even with the source code already downloaded, a nontrivial amount of time is spent extracting tarballs.
All told, the Graviton 3 is a very nice improvement over the Graviton 2: With the exception of sha256 — which, at 1.2 GB/s, is likely more than fast enough already — we consistently see a CPU speedup of between 30% and 55%. I look forward to moving some of my workloads across to Graviton 3 based instances!
FreeBSD/EC2: What I've been up to

I realized recently that there's very little awareness of the work which goes into keeping FreeBSD working on Amazon EC2 — for that matter, I often have trouble remembering what I've been fixing. As an experiment I'm going to start trying to record my work, both for public consumption and to help myself; I might end up posting monthly, but to start with I'm going to report on what I've done in January through March of 2022.
- I committed code I started working on 4.5 years earlier which speeds up the x86 boot process (including EC2) by roughly 2 seconds.
- Working with a few other FreeBSD developers, I helped to fix qemu breakage which was preventing EC2 arm64 images from building.
- I reported benchmarking results to Amazon which helped them fix a performance issue in their EFI boot code.
- I kicked the Lightsail team about updating their FreeBSD images.
- I continued kicking the Lightsail team about updating their FreeBSD images.
- I handed out AWS credit codes (from my "AWS Hero" quota) to FreeBSD developers.
- I liaised with an Amazon developer working on fixing hotplug on arm64. (Work not ready for commit yet.)
- I committed a patch (not done by me, although I helped to review it) for obtaining entropy from EFI in the boot loader and passing it to the kernel; this ensures that arm64 EC2 instances have enough entropy for key generation when they first boot.
- I helped to debug more breakage affecting the release engineering AMI builds.
- I updated my EC2 boot scripts to fix the formatting of the SSH host keys, which had been broken by changes in the logger utility.
- Lightsail finally updated to FreeBSD 12.3. I encouraged them to add a FreeBSD 13 offering as well.
- I investigated a bug report concerning encrypted EBS volumes; it seems to be an AWS bug and I convinced Amazonians to investigate.
- I closed a bug report concerning clock stability on T3 family instances; it resulted from an AWS bug which I have been told has now been fixed.
- I fixed a glitch in the release engineering build process which was resulting in 13.1 BETA AMIs not being registered in the Systems Manager Parameter Store.
- I wrote a patch to fix the console on EC2 arm64 instances; currently pending review.
This work is supported by my FreeBSD/EC2 Patreon.
FreeBSD/EC2 AMI Systems Manager Public Parameters

In June, I posted an EC2 Wishlist with three entries: "AWS Systems Manager Public Parameters", "BootMode=polyglot", and "Attaching multiple IAM Roles to an EC2 instance". I am happy to say that my first wish has been granted!
The necessary flags were recently set within AWS, and a few days ago I added code to FreeBSD's build system to register 14.0-CURRENT AMI Ids as Public Parameters. (I'll be merging this code to 13-STABLE and 12-STABLE in the coming weeks.) I've also "backfilled" the parameters for releases from 12.0 onwards.
This means that you can now run

$ aws --region us-east-1 ssm get-parameter \
    --name /aws/service/freebsd/arm64/base/ufs/13.0/RELEASE \
    | jq -r '.Parameter.Value'
ami-050cc11ac34def94b

(using the jq tool to extract the Value field from the JSON blob returned by the AWS CLI) to look up the arm64 AMI for 13.0-RELEASE, and also
$ aws ec2 run-instances \
    --image-id resolve:ssm:/aws/service/freebsd/arm64/base/ufs/13.0/RELEASE \
    ... more command line options here ...

to look up the AMI and launch an instance — no more grepping the release announcement emails to find the right AMI Id for your region! Assuming everything works as expected, this will also be very useful for anyone who wants to run the latest STABLE or CURRENT images, since every time a new weekly snapshot is published the Public Parameter will be updated.
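For anyone who would rather not depend on jq, the Value field can be extracted from the CLI's JSON output in a few lines of Python; the response below is a hand-written example matching the shape that ssm get-parameter returns:

```python
import json

# Example JSON of the shape returned by `aws ssm get-parameter`;
# the Value is the 13.0-RELEASE arm64 AMI Id quoted above.
response = '''
{
  "Parameter": {
    "Name": "/aws/service/freebsd/arm64/base/ufs/13.0/RELEASE",
    "Type": "String",
    "Value": "ami-050cc11ac34def94b"
  }
}
'''

# Equivalent of `jq -r '.Parameter.Value'`:
ami_id = json.loads(response)["Parameter"]["Value"]
print(ami_id)  # ami-050cc11ac34def94b
```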
Many thanks to David and Arthur at AWS for their assistance in liaising with the Systems Manager team — I wouldn't have been able to do this without them!
This work was supported by my FreeBSD/EC2 Patreon; if you find it useful, please consider contributing so that I have more "funded hours" to spend on FreeBSD/EC2 work.
EC2 boot time benchmarking

Last week I quietly released ec2-boot-bench, a tool for benchmarking EC2 instance boot times. This tool is BSD licensed, and should compile and run on any POSIX system with OpenSSL or LibreSSL installed. Usage is simple — give it AWS keys and tell it what to benchmark:
usage: ec2-boot-bench --keys <keyfile> --region <name> --ami <AMI Id>
                      --itype <instance type> [--subnet <subnet Id>] [--user-data <file>]

and it outputs four values — how long the RunInstances API call took, how long it took EC2 to get the instance from "pending" state to "running" state, how long it took once the instance was "running" before port TCP/22 was "closed" (aka. sending a SYN packet got a RST back), and how long it took from when TCP/22 was "closed" to when it was "open" (aka. sending a SYN got a SYN/ACK back):
RunInstances API call took: 1.543152 s
Moving from pending to running took: 4.904754 s
Moving from running to port closed took: 17.175601 s
Moving from port closed to port open took: 5.643463 s
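When collecting many runs it can be handy to parse that output programmatically; a minimal sketch, assuming the exact four-line format shown above:

```python
import re

# Sample ec2-boot-bench output, copied from the example above.
sample = """\
RunInstances API call took: 1.543152 s
Moving from pending to running took: 4.904754 s
Moving from running to port closed took: 17.175601 s
Moving from port closed to port open took: 5.643463 s
"""

# Pull out the four durations (in seconds) and total them.
durations = [float(m) for m in re.findall(r"took: ([0-9.]+) s", sample)]
total = sum(durations)
print(f"{len(durations)} phases, {total:.2f} s from API call to open port")
```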
Once I finished writing ec2-boot-bench, the natural next step was to run some tests — in particular, to see how FreeBSD compared to other operating systems used in EC2. I used the c5.xlarge instance type and tested FreeBSD releases since 11.1-RELEASE (the first FreeBSD release which can run on the c5.xlarge instance type) along with a range of Linux AMIs mostly taken from the "quick launch" menu in the AWS console. In order to perform an apples-to-apples comparison, I passed a user-data file to the FreeBSD instances which turned off some "firstboot" behaviour — by default, FreeBSD release AMIs will update themselves and reboot to ensure they have all necessary security fixes before they are used, while Linuxes just leave security updates for users to install later:
>>/etc/rc.conf
firstboot_freebsd_update_enable="NO"
firstboot_pkgs_enable="NO"
For each of the AMIs I tested, I ran ec2-boot-bench 10 times, discarded the first result, and took the median values from the remaining 9 runs. The first two values — the time taken for a RunInstances API call to successfully return, and the time taken after RunInstances returns before a DescribeInstances call says that the instance is "running" — are consistent across all the AMIs I tested, at roughly 1.5 and 6.9 seconds respectively; so the numbers we need to look at for comparing AMIs are just the last two values reported by ec2-boot-bench, namely the time before the TCP/IP stack is running and has an IP address, and the time between that point and when sshd is running.
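The aggregation described above (discard the first run as a warm-up, then take the median of the remaining nine) can be sketched as follows; the timing values here are made up for illustration:

```python
from statistics import median

# Ten "running to port closed" timings from one hypothetical AMI;
# the first run is discarded as a warm-up, per the methodology above.
runs = [14.2, 9.6, 9.5, 9.7, 9.4, 9.6, 9.8, 9.5, 9.3, 9.6]

remaining = runs[1:]        # drop the first result
result = median(remaining)  # median of the remaining 9 runs
print(f"{result:.2f} s")
```

Using the median of nine runs keeps a single anomalously slow boot from skewing the reported number.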
The results of my testing are as follows:
| AMI Id (us-east-1)    | AMI Name                               | running to port closed (s) | closed to open (s) | total (s) |
|-----------------------|----------------------------------------|----------------------------|--------------------|-----------|
| ami-0f9ebbb6ab174bc24 | Clear Linux 34640                      | 1.23                       | 0.00               | 1.23      |
| ami-0c2b8ca1dad447f8a | Amazon Linux 2                         | 9.55                       | 1.54               | 11.09     |
| ami-09e67e426f25ce0d7 | Ubuntu Server 20.04 LTS                | 7.39                       | 4.65               | 12.04     |
| ami-0747bdcabd34c712a | Ubuntu Server 18.04 LTS                | 10.64                      | 4.30               | 14.94     |
| ami-03a454637e4aa453d | Red Hat Enterprise Linux 8 (20210825)  | 13.16                      | 2.11               | 15.27     |
| ami-0ee02acd56a52998e | Ubuntu Server 16.04 LTS                | 12.76                      | 5.42               | 18.18     |
| ami-0a16c2295ef80ff63 | SUSE Linux Enterprise Server 12 SP5    | 16.32                      | 6.96               | 23.28     |
| ami-0fde50fcbcd46f2f7 | SUSE Linux Enterprise Server 15 SP2    | 18.13                      | 6.76               | 24.89     |
| ami-0b0af3577fe5e3532 | Red Hat Enterprise Linux 8             | 13.43                      | 52.31              | 65.74     |
In the race to accept incoming SSH connections, the clear winner — no pun intended — is Intel's Clear Linux, which boots to a running sshd in a blistering 1.23 seconds after the instance enters the "running" state. After Clear Linux is a rough three-way tie between Amazon Linux, Debian, and Ubuntu — and it's good to see that Ubuntu's boot performance has improved over the years, dropping from 18 seconds in 16.04 LTS to 15 seconds in 18.04 LTS and then to 12 seconds with 20.04 LTS. After the Amazon Linux / Debian / Ubuntu cluster comes SUSE Linux and FreeBSD; here, interestingly, SUSE 12 is faster than SUSE 15, while FreeBSD 12.2 and 13.0 (the most recent two releases) are noticeably faster than older FreeBSD.
Finally in dead last place comes Red Hat — which brings up its network stack quickly but takes a very long time before it is running sshd. It's possible that Red Hat is doing something similar to the behaviour I disabled in FreeBSD, in downloading and installing security updates before exposing sshd to the network — I don't know enough to comment here. (If someone reading this can confirm that possibility and has a way to disable that behaviour via user-data, I'll be happy to re-run the test and revise this post.)
UPDATE: Turns out that Red Hat's terrible performance was due to a bug which was fixed in the 2021-08-25 update. I tested the new version and it now lands in the middle of the pack of Linuxes rather than lagging far behind.
Needless to say, FreeBSD has some work to do to catch up here; but measurement is the first step, and indeed I already have work in progress to further profile and improve FreeBSD's boot performance, which I'll write about in a future post.
If you find this useful, please consider supporting my work either via my FreeBSD/EC2 Patreon or by sending me contributions directly. While my work on the FreeBSD/EC2 platform originated from the needs of my Tarsnap online backup service, it has become a much larger project over the years and I would be far more comfortable spending time on this if it weren't taking away so directly from my "paid work".
My EC2 wishlist

I've been using Amazon EC2 since 2006, and I've been maintaining the FreeBSD/EC2 platform for over a decade. Over those years I've asked Amazon for many features; some of them, like HVM support (EC2 originally only supported Xen/PV) and bidirectional serial console support (EC2 originally had an "output-only" serial console) eventually arrived, but I'm still waiting for others — some of which should be very easy for AWS to provide and would yield very large benefits.
While I've made engineers inside Amazon aware of all of these at various times, I think it's time to post my wishlist here — both so that a wider audience inside Amazon can hear more about these, and so that the FreeBSD community (especially the people who are financially supporting my work) can see what I'm aiming towards.
AWS Systems Manager Public Parameters

FreeBSD release announcements currently include a long list of AMI IDs — two for each EC2 region — and I would publish more AMIs if it weren't for the impracticality of putting all the AMI IDs into the announcements. One might say "there's got to be a better solution" — and indeed there is: AWS Systems Manager Public Parameters. Amazon publishes AMI IDs for Amazon Linux and Windows via the AWS Systems Manager Parameter Store, and Ubuntu AMI IDs are also published via the same mechanism (I assume by Canonical). I wrote code over a year ago to allow FreeBSD to publish AMI IDs the same way, but we can't use it until Amazon authorizes the FreeBSD release engineering account to publish these parameters — and we're still waiting.
In addition to allowing us to publish multiple AMIs (e.g. ZFS and cloud-init), if we had this then we could publish updated AMIs after every security update — using the Parameter Store to allow users to look up the latest updated version — which would dramatically speed up the process of launching new FreeBSD/EC2 instances.
Wishlist item #1: Please give the FreeBSD release engineering account access to store AWS Systems Manager Public Parameters.
BootMode=polyglot

A few months ago, Amazon started supporting UEFI booting on newer x86 instances. (ARM instances already used UEFI.) This is great news for FreeBSD, since we can boot much faster on UEFI than via the "legacy" BIOS boot mode — I/O is much faster since UEFI doesn't need to bounce disk reads through a small buffer in the bottom 1 MB of address space, and console output is much faster since we can use the UEFI console rather than a shockingly slow emulated VGA text mode. In fact, the total loader + kernel time (starting when the boot loader starts running, and stopping when the init process is spawned) drops from 10.9 seconds down to 3.9 seconds!
There's just one problem with this: AMIs are marked as either "legacy-bios" or "uefi", and while legacy-bios AMIs can boot on all of the x86 instance types, the UEFI-flagged AMIs can only boot on the instance types which support UEFI. FreeBSD's AMIs are built from disk images which support both boot methods — but when we make the EC2 RegisterImage API call, we have to specify one or the other. While we would love to make FreeBSD AMIs boot faster, we don't want to drop support for customers who are using older instance types.
Wishlist item #2: Please add a new "BootMode=polyglot" option, which marks AMIs as supporting both legacy-bios and uefi boot modes, with UEFI being used on instances where it is available and legacy-bios being used otherwise.
Attaching multiple IAM Roles to an EC2 instance

IAM Roles for EC2 are a very powerful — but very dangerous — feature, making credentials available to any process on the instance which can open a TCP connection to 169.254.169.254:80. Last year, I released imds-filterd, which allows access to the EC2 Instance Metadata Service (and thereby IAM Roles) to be locked down; as a result, you can now attach an IAM Role to an EC2 instance without the risk that a user-nobody privilege escalation allows an attacker to access the credentials.
There's only one problem: You can only attach a single IAM Role. This means that — even with imds-filterd restricting what each process can access in the metadata service — there's no way to give different credentials to different processes. This becomes a problem if you want to use the AWS Systems Manager Agent, since it requires credentials exposed as an IAM Role; there's no way to use the SSM Agent and another process which also uses IAM Role credentials without them both having access to each other's privileges. This even became a problem for Amazon a few years ago when they wanted to provide "extra" credentials to EC2 instances which could be used to manage SSH host keys: Because these credentials couldn't be attached as an IAM Role, they were exposed via the Instance Metadata Service as meta-data/identity-credentials/ec2/security-credentials/ec2-instance which Amazon's documentation helpfully marks as "[Internal use only]".
As it turns out, the EC2 API already supports attaching an array of IAM Roles to an instance, and the Instance Metadata Service already supports publishing credentials with different names — but the EC2 API throws an error if the array of IAM Roles has more than one name listed in it. Get rid of that restriction, and it will become much easier to properly effect privilege separation... and also easier for Amazon to provide credentials to code it has running on customer instances.
Wishlist item #3: Allow multiple IAM Roles to be attached to a single EC2 instance.
If you work at Amazon and can make one or all of these wishes come true, please get in touch (cperciva@FreeBSD.org). I really don't think any of these should be very difficult to provide on Amazon's side, and they would provide a huge benefit to FreeBSD. Alternatively, if you work at Amazon and you're screaming at your laptop "it's not that simple Colin!", please get in touch anyway (yes, I've signed the necessary NDAs).
And if you don't work at Amazon but you work at a large AWS customer: Please draw this list to the attention of your Amazonian contacts. Eventually we'll find someone who can make these happen!