Apache memory usage bogosity
For several months I've been seeing some very strange behaviour from Apache: As each day progressed, the child processes would use more and more memory, resetting to their initial size each morning when Apache restarted (due to daily HTTP log rotation), resulting in a very clear sawtooth pattern in my MRTG graphs of CPU and memory usage. This wasn't an unmanageable problem -- by setting a MaxRequestsPerChild limit in httpd.conf, I instructed Apache to "recycle" its children, which eventually recovered the lost memory -- but it was irritating enough that I wanted to find a better solution.
I found that solution this morning. Working via the procfs pseudo-filesystem (also known as "rootmefs" due to its long history of security flaws), I compared the memory maps of a 2MB Apache process and a 9MB Apache process and discovered that, as expected, all of the increased memory usage was within the region used by malloc(3) calls. Inspecting the contents of this memory, I was surprised to find that it was almost all used by a long list of file names -- specifically, the names of the 18,000+ files in the /f/ directory on the portsnap mirror.
Since the growth in memory usage took the form of individual processes suddenly growing much larger, rather than all the processes slowly growing, it was clear that I was looking for a single request which caused Apache to generate a long list of file names... at which point the answer is immediately obvious. Looking at the access logs, I confirmed my guess: Someone was issuing a request which mapped to the directory, and Apache was helpfully responding with a directory listing -- after first allocating 7MB of memory to store the list of files in memory so that they could be sorted. Apache was presumably "freeing" this memory afterwards, but since it was not advising the kernel that the memory was unneeded, the additional megabytes continued to be marked as "in use" (and would ultimately be paged to disk when memory pressure made it necessary).
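The access-log check described above can be sketched in a few lines of Python. This is only an illustration, not the command I actually ran: it assumes the log is in the Common Log Format, and the function name directory_requests is made up for the example. It picks out successful GET requests whose path ends in a slash, which are the ones Apache may answer with a generated index.

```python
import re

# Request and status fields of a Common Log Format line (assumed format).
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3})')


def directory_requests(log_lines):
    """Return the paths of successful GET requests that end in '/'.

    Hypothetical helper: such requests map to a directory, so Apache
    may respond with a generated directory listing.
    """
    paths = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # not a recognisable request line
        path = match.group("path").split("?", 1)[0]  # drop any query string
        if (match.group("method") == "GET"
                and match.group("status") == "200"
                and path.endswith("/")):
            paths.append(path)
    return paths
```

Feeding it the lines of the access log and counting the results per path would show quickly whether one directory accounts for the large responses.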
The simplest fixes are the best fixes, and this one is certainly simple: In the .htaccess file at the root of portsnap2.freebsd.org, I added the line "options -Indexes". No more large indices being generated; no more large memory allocations; and in the past 14 hours the memory usage on the server has remained nearly exactly constant.
BSDCan'06
For the past four days I've been at BSDCan'06, talking to other FreeBSD developers and users, meeting the rest of the FreeBSD Security Team, attending talks (mostly from other FreeBSD developers), and presenting my paper.
To quickly summarize the paper: Difficult questions often arise when handling security problems, and they usually result from poor specifications. We're never going to convince everybody to write formal specifications for every interface they provide; but as far as security is concerned, what we really need is clear and precise specifications not for what a program should do, but instead for what it is guaranteed to do. In short, we need to add "fine print" into our API contracts, and define a security flaw to be any bug which violates the guarantees provided in the fine print.
Now that the conference is over, I'm going to spend the day touring Ottawa before my evening flight back to Vancouver; on Monday I'll start work on rewriting FreeBSD Update.
Starting work
I've started my summer of FreeBSD work with three improvements to the mirror-selection code in the portsnap client: It now understands the output of host(1) from BIND 8, will switch to a different mirror if the snapshot tag download fails, and if it is using an HTTP proxy, portsnap will always pick mirrors in the same order -- this ensures that a collection of machines behind a proxy will only result in the data being fetched and cached by the proxy once instead of once for each mirror in use. These three changes will be MFCed after 6.1-RELEASE and 5.5-RELEASE, and at that time I'll also update the version in the ports tree. Nothing very dramatic here, but I said that I would keep this page updated with my progress, and that means including the routine as well as the dramatic.
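The idea behind the last two changes can be sketched as follows. This is a rough Python illustration, not the portsnap code itself: the names order_mirrors and fetch_tag are invented for the example, and sorting the names is just one assumed way of getting a fixed order. Behind a proxy every client tries mirrors in the same order, so the proxy caches one copy; otherwise the order is shuffled, and in either case a failed snapshot tag download moves on to the next mirror.

```python
import random


def order_mirrors(mirrors, behind_proxy, rng=None):
    """Return the order in which to try mirrors (hypothetical helper)."""
    if behind_proxy:
        # Same order on every machine, so a shared proxy fetches and
        # caches the data from one mirror only. Sorting is an assumption.
        return sorted(mirrors)
    shuffled = list(mirrors)
    (rng or random).shuffle(shuffled)  # spread clients across mirrors
    return shuffled


def fetch_tag(mirrors, fetch):
    """Try each mirror in turn; return (mirror, tag) from the first success.

    `fetch` is a caller-supplied function that downloads the snapshot tag
    from one mirror and raises OSError on failure.
    """
    for mirror in mirrors:
        try:
            return mirror, fetch(mirror)
        except OSError:
            continue  # this mirror failed; switch to the next one
    raise OSError("no mirror could supply the snapshot tag")
```

With two machines behind the same proxy, both calls to order_mirrors(..., behind_proxy=True) return the same list regardless of the order in which the mirror names were discovered.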
Next item to get finished: my BSDCan paper!