Write opinionated workarounds

A few years ago, I decided that I should aim for my code to be as portable as possible. This generally meant targeting POSIX; in some cases I required slightly more, e.g., "POSIX with OpenSSL installed and cryptographic entropy available from /dev/urandom". This dedication made me rather unusual among software developers; grepping the source code for the software I have installed on my laptop, I cannot find any other examples of code with strictly POSIX-compliant Makefiles, for example. (I did find one other Makefile which claimed to be POSIX-compatible, but in fact it used a GNU extension.) As far as I was concerned, strict POSIX compliance meant never having to say you're sorry for portability problems; if someone ran into problems with my standards-compliant code, well, they could fix their broken operating system.

And some people did. Unfortunately, despite the promise of open source, many users were unable to make such fixes themselves, and for a rather large number of operating systems the principle of standards compliance seems to be more aspirational than actual. Given the limits which would otherwise be imposed on the user base of my software, I eventually decided that it was necessary to add workarounds for some of the more common bugs. That said, I decided upon two policies:

1. Workarounds should be disabled by default, and only enabled upon detecting an afflicted system.
2. Users should be warned that a workaround is being applied.

The first policy is essential for preventing a scenario often found in older software: A workaround is added for one system, but then that workaround introduces a problem on a second system and so a workaround is added for the workaround, and then a problem is found with that second workaround... and ten years later there's a stack of workarounds to workarounds which nobody dares to remove, even though the original problem which was being worked around has long since been corrected. If a workaround is disabled by default, it's less likely to provoke such a stack of workarounds — and it's going to be much easier to remove them once they're no longer needed.
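To make "disabled by default" concrete, here is a minimal sketch in C of an opt-in workaround for a system whose <sys/socket.h> lacks MSG_NOSIGNAL. The macro WORKAROUND_MSG_NOSIGNAL and the helpers send_nosigpipe and demo_closed_peer are hypothetical names chosen for illustration; the assumption is that a build-time probe defines the macro only after failing to compile code which uses MSG_NOSIGNAL. This is a sketch of the pattern, not necessarily how any particular project implements it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * WORKAROUND_MSG_NOSIGNAL is a hypothetical macro: the build system is
 * assumed to define it only on systems where a probe found MSG_NOSIGNAL
 * missing.  On compliant systems none of the workaround code is compiled.
 */
#ifdef WORKAROUND_MSG_NOSIGNAL
#ifndef MSG_NOSIGNAL
#define MSG_NOSIGNAL 0
#endif
#endif

/* Send a buffer without being killed by SIGPIPE if the peer has gone away. */
static ssize_t
send_nosigpipe(int fd, const void * buf, size_t len)
{
#ifdef WORKAROUND_MSG_NOSIGNAL
	/* Workaround path: the flag is a no-op, so ignore SIGPIPE instead. */
	static int sigpipe_ignored = 0;

	if (!sigpipe_ignored) {
		signal(SIGPIPE, SIG_IGN);
		sigpipe_ignored = 1;
	}
#endif
	return (send(fd, buf, len, MSG_NOSIGNAL));
}

/* Return 1 if writing to a closed peer fails with EPIPE; -1 on setup error. */
static int
demo_closed_peer(void)
{
	int sv[2];
	int ok;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv))
		return (-1);
	close(sv[1]);
	ok = (send_nosigpipe(sv[0], "x", 1) == -1 && errno == EPIPE);
	close(sv[0]);
	return (ok);
}
```

Because the workaround lives entirely behind one macro, removing it once the afflicted systems are fixed means deleting the two #ifdef blocks and the probe, with no effect on the code path compliant systems were already using.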

The second policy is important as a matter of education: Users deserve to know that they're running a broken operating system. And running broken operating systems they are. Here are some of the warnings people will see, along with explanations (more for the benefit of people who arrive here via Google than for my regular readership):

WARNING: POSIX violation: make's CC doesn't understand -lxnet
WARNING: POSIX violation: make's CC doesn't understand -lrt
The POSIX C compiler is required to accept the options -lxnet and -lrt even if those libraries do not exist. On many systems the functionality implied by those options is included in libc and is thus always available, but those options are not properly ignored.
WARNING: POSIX violation: <time.h> not defining CLOCK_REALTIME
Up to POSIX.1-2004, CLOCK_REALTIME was part of the optional "Timers" component; but it is now a mandatory part of the standard, although the (arguably far more useful) CLOCK_MONOTONIC clock remains optional.
WARNING: POSIX violation: <sys/socket.h> not defining MSG_NOSIGNAL
Another historical portability problem, MSG_NOSIGNAL became mandatory starting in POSIX.1-2008.
#warning Working around bug in LLVM optimizer
#warning For more details see https://llvm.org/bugs/show_bug.cgi?id=27190
LLVM is known to miscompile code paths containing longjmp or siglongjmp calls. I'm actually rather shocked that this wasn't noticed and fixed a long time ago; longjmp doesn't get used very often, but the places where it does get used tend to be places where having miscompiled code is even scarier than normal.
WARNING: Applying workaround for Docker signal-handling bug
Unlike the others, this warning appears at run-time; it refers to a problem where SIGTERM and SIGINT are disabled for a process running as init in a Docker container.
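A run-time workaround can follow the same two policies. The sketch below, in C, assumes that "afflicted" can be detected by checking whether the process ID is 1, and that installing explicit handlers for SIGTERM and SIGINT is enough to make those signals terminate the process again. The function names die_on_signal and maybe_workaround_init_signals are hypothetical, and this is one plausible approach rather than a description of how the software emitting the warning above actually does it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Terminate immediately; _exit is async-signal-safe, unlike exit. */
static void
die_on_signal(int sig)
{
	_exit(128 + sig);
}

/*
 * Apply the workaround only if the given PID indicates we are running as
 * init.  Returns 1 if the workaround was applied, 0 if it was not needed,
 * and -1 if installing a handler failed.
 */
static int
maybe_workaround_init_signals(pid_t pid)
{
	struct sigaction sa;

	/* Policy 1: do nothing unless the system looks afflicted. */
	if (pid != 1)
		return (0);

	/* Policy 2: tell the user that a workaround is being applied. */
	fprintf(stderr,
	    "WARNING: Applying workaround for Docker signal-handling bug\n");

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = die_on_signal;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGTERM, &sa, NULL) || sigaction(SIGINT, &sa, NULL))
		return (-1);
	return (1);
}
```

A program would call maybe_workaround_init_signals(getpid()) once at startup; taking the PID as a parameter keeps the detection logic easy to exercise outside a container.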

But as passionate as I am about user education, there's a far more important reason for that second policy: Getting things fixed. All of these are problems we could have worked around silently; indeed, with the exception of the LLVM bug (which I don't think anyone else has noticed) all of them have been worked around silently. But while silent workarounds solve the immediate problem for one piece of software, they do nothing to help the next developer who trips over those bugs. Warnings, on the other hand, can help to get bugs fixed: Indeed, a few months ago I fixed a bug in FreeBSD for the sole reason that I was getting annoyed by one of my own warning messages! Even if the vast majority of people who see those warnings disregard them, any chance that the right developer will get the message and fix a bug is better than none.

My regular readers will know that I care deeply about producing correct code, offering bounties for issues as trivial as misplaced punctuation in comments. But it isn't just my own code I care about; I'm affected by bugs in all of the code I run, and even by bugs in code I don't run if I rely on someone else who does. So please, if you find a bug, don't just work around it; shout it from the rooftops in the hope that the right people will hear.

Because if we all stop accepting broken code, we might eventually end up with less broken code.

Posted at 2016-04-11 10:00

Daemonic Dispatches
