A packet trace of portsnap.

Out of curiosity and to make sure that everything was functioning as I expected, I used tcpdump to capture a trace of network activity while running portsnap. Portsnap updated its compressed snapshot from 02:55:24 UTC to 14:04:14 UTC; during this period, 15 ports were modified and 5 new ports were added to the tree.

The first packet was sent at 15:49:56.569740 and the last packet was received at 15:50:03.316998, slightly less than seven seconds later. In total, 98 packets were sent and received: 5 outbound DNS requests (a total of 408 bytes including IP headers), 5 inbound DNS responses (1253 bytes including IP headers), 45 outbound TCP packets forming 5 HTTP connections and 32 HTTP requests (9152 bytes including IP headers), and 43 inbound TCP packets, from those same HTTP connections, containing 32 HTTP responses (36182 bytes including IP headers). The TCP:UDP ratios are 8.8:1 (packets) and 27.3:1 (bytes); the inbound:outbound ratios are 0.96:1 (packets) and 3.9:1 (bytes) -- slightly more outbound traffic than ideal for a client application (broadband internet access often has a 10:1 bandwidth ratio), but not too bad.
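The totals and ratios above can be recomputed from the four per-category counts. This is only a sanity-check sketch: the counts and timestamps are copied from the paragraph above, and the dictionary layout and helper name are my own.

```python
from datetime import datetime

# (protocol, direction) -> (packets, bytes including IP headers), as reported above.
traffic = {
    ("udp", "out"): (5, 408),     # DNS requests
    ("udp", "in"): (5, 1253),     # DNS responses
    ("tcp", "out"): (45, 9152),   # HTTP requests and ACKs
    ("tcp", "in"): (43, 36182),   # HTTP responses and ACKs
}

def total(proto=None, direction=None):
    """Sum packets and bytes over the categories matching the given filters."""
    packets = size = 0
    for (p, d), (n, b) in traffic.items():
        if proto in (None, p) and direction in (None, d):
            packets += n
            size += b
    return packets, size

all_packets, all_bytes = total()
tcp_packets, tcp_bytes = total(proto="tcp")
udp_packets, udp_bytes = total(proto="udp")
in_packets, in_bytes = total(direction="in")
out_packets, out_bytes = total(direction="out")

# Elapsed time between the first and last packet of the trace.
fmt = "%H:%M:%S.%f"
elapsed = (datetime.strptime("15:50:03.316998", fmt)
           - datetime.strptime("15:49:56.569740", fmt)).total_seconds()

print(f"packets: {all_packets}, elapsed: {elapsed:.3f} s")
print(f"TCP:UDP  {tcp_packets / udp_packets:.1f}:1 (packets), "
      f"{tcp_bytes / udp_bytes:.1f}:1 (bytes)")
print(f"in:out   {in_packets / out_packets:.2f}:1 (packets), "
      f"{in_bytes / out_bytes:.1f}:1 (bytes)")
```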

If each of the necessary files had been fetched separately instead of using pipelined HTTP, the same updating would have taken over twice as many packets (a minimum of 224 packets for 32 HTTP connections), at least 50% more time (and far longer than that if this system didn't happen to have an unusually short round-trip time of 30ms to the server), and roughly 8000 more bytes in TCP/IP overhead. For updates over a longer period of time or from a greater distance, a lack of pipelined HTTP could result in a factor of ten slowdown.
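The packet comparison can be checked with a few lines of arithmetic. This sketch assumes only what the text states: 224 packets for 32 connections implies 7 packets per one-request connection (the exact breakdown of those 7 is not given here), and the pipelined run used 45 + 43 TCP packets.

```python
REQUESTS = 32
MIN_PACKETS_UNPIPELINED = 224              # figure quoted in the text
OBSERVED_TCP_PACKETS = 45 + 43             # outbound + inbound, pipelined run

# Implied minimum cost of a connection carrying a single request.
packets_per_connection = MIN_PACKETS_UNPIPELINED // REQUESTS

# How many times more packets the unpipelined approach would need.
ratio = MIN_PACKETS_UNPIPELINED / OBSERVED_TCP_PACKETS

print(f"{packets_per_connection} packets per connection")
print(f"{ratio:.2f}x as many TCP packets without pipelining")
```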

The approach of "fetch lots of independent small pieces over HTTP and knit them together" is very useful and increasingly popular; it's worth noting that Google Maps does exactly this with its tiled map, and in so doing manages to use far less bandwidth and be far more responsive than systems which fetch an entire new map from a server every time the user scrolls around. Whenever this approach is used, however, it is essential to carefully consider the round-trip time associated with each HTTP connection, and to make sure that pipelined HTTP is used where necessary. Unfortunately, over six years after RFC 2616 was published, client-side support for pipelined HTTP is still quite rare.
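To make "pipelined HTTP" concrete: the client writes several requests onto one connection before reading any response, so the round-trip cost is paid once rather than once per file. The following is a minimal sketch of building such a request batch, not portsnap's actual code; the function name, host name and paths are hypothetical.

```python
def build_pipelined_requests(host, paths):
    """Return the bytes of several HTTP/1.1 GET requests, back to back.

    Sending this buffer in one go over a single TCP connection is what
    pipelining means; the server answers the requests in order.
    """
    requests = []
    for i, path in enumerate(paths):
        lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
        if i == len(paths) - 1:
            # Ask the server to close the connection after the final response.
            lines.append("Connection: close")
        requests.append("\r\n".join(lines) + "\r\n\r\n")
    return "".join(requests).encode("ascii")

# Hypothetical host and file names, for illustration only.
batch = build_pipelined_requests("example.org", ["/piece-1.gz", "/piece-2.gz", "/piece-3.gz"])
print(batch.decode("ascii"))
```

A real client would then read the responses in order from the same socket, using each response's Content-Length (or chunked encoding) to find where one ends and the next begins.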

It has been said that if you build a better mousetrap, the world will beat a path to your door; what people fail to mention is that whether your mousetrap is called "pipelined HTTP", "FreeBSD", "portsnap", or "the subset-sum self-initializing quadratic sieve", the path-beating is likely to take years if not decades before it is complete.

Posted at 2005-12-28 17:45

Liberal election goofs.

What are they drinking at the Canadian Liberal party headquarters?

First Scott Reid, the party leader's director of communications, suggested that parents would spend a no-strings-attached child-care grant promised by the opposition Conservatives on "beer and popcorn"; then party officials distributed material accusing Stephen Harper (the Conservative leader) and Gilles Duceppe (the leader of the separatist Bloc Quebecois) of making deals to break up the country -- along with a photograph of the two talking at a Holocaust memorial; and now Mike Klander, the executive vice-president of the Liberal party's Ontario wing, has written in his blog that he thinks "Jack Layton is an asshole" [Layton is the leader of the third-place New Democratic Party] and suggested that Layton's wife, Olivia Chow, was "separated at birth" from a chow chow dog.

I remember pundits suggesting that the 2004 Federal election would have been won by Stephen Harper and the Conservative party if they could simply have kept their mouths closed; I wonder if we're going to look back at the 2006 election and say the same thing about Paul Martin and the Liberal party.

Posted at 2005-12-27 06:00

What is the population of Quebec?

For the past week I have been rather puzzled by Canadian Federal Election Opinion Polls, and in particular by the difference between the numbers reported by SES and The Strategic Counsel for the Bloc Quebecois. While Strategic Counsel has reported daily numbers showing Bloc support at 13-14% nationally, SES reported a drop down to 11% -- a difference which would likely have a profound effect upon the outcome of the election. (Note to non-Canadian readers: The Bloc Quebecois is a separatist party which only runs candidates within the province of Quebec; as a result of the first-past-the-post system used in Canada, they received 12.4% of the popular vote but 17.5% of the seats in the 2004 federal election.)

Now, Strategic Counsel reports their 95% confidence interval as being ±2.5%, while SES reports theirs as ±3.2%, so this difference isn't entirely outside the range of random sampling error; but statistically speaking the confidence interval on the BQ vote should be roughly two-thirds the size of the confidence interval given for the entire poll -- as a result of the BQ vote share being smaller -- so these numbers are suspicious enough to deserve investigation.
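How the margin of error shrinks with vote share can be checked with the usual normal-approximation formula, in which the margin is proportional to sqrt(p(1-p)) and the headline figure is the worst case at p = 0.5. This sketch assumes simple random sampling, which may not match either pollster's actual methodology; the function name is my own.

```python
import math

def relative_margin(p):
    """Margin of error at vote share p, as a fraction of the worst-case
    (p = 0.5) margin that pollsters usually quote for the whole poll."""
    return math.sqrt(p * (1 - p)) / 0.5

# Headline 95% margins (percentage points) quoted by the two pollsters.
headline = {"Strategic Counsel": 2.5, "SES": 3.2}

for share in (0.11, 0.125, 0.14):
    print(f"share {share:.1%}: {relative_margin(share):.2f} of the headline margin")

for name, margin in headline.items():
    scaled = margin * relative_margin(0.125)
    print(f"{name}: about ±{scaled:.1f} points at a 12.5% share")
```

For shares between 11% and 14% the factor works out to roughly 0.63 to 0.69.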

Looking at the detailed numbers from the two companies provides some insight. For the days December 1st through December 4th, Strategic Counsel reports national support for the Bloc of 14% and support within Quebec of 54% (presumably these are rounded to the nearest integer percent). On December 8th, SES reports national support of 11% but provincial support of 50% for the Bloc. Ignore the difference in the days when the polls were performed, and just look at those numbers for a moment: How can a 4% difference in support within a province which is slightly less than a quarter the size of Canada translate into a 3% difference in the national result?

If you believe the numbers from Strategic Counsel, and assume that the numbers published were rounded to the nearest integer percent, then Quebec constitutes between 24.7% and 27.2% of Canada. If you believe the numbers from SES, then Quebec constitutes between 20.8% and 23.3% of Canada. In contrast, if you believe Wikipedia, the correct number is 23.6%, while Statistics Canada numbers indicate a value of 23.5%.
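The implied ranges follow from dividing the Bloc's national support by its support within Quebec, since all Bloc votes come from Quebec. A rough sketch, assuming each published figure was rounded to the nearest integer percent (so the true value lies within half a point of it); the function name is my own.

```python
def implied_share_bounds(national_pct, provincial_pct, half_width=0.5):
    """Bounds on Quebec's share of the sample, in percent, given rounded
    national and in-province support figures for a Quebec-only party."""
    low = (national_pct - half_width) / (provincial_pct + half_width)
    high = (national_pct + half_width) / (provincial_pct - half_width)
    return low * 100, high * 100

polls = {
    "Strategic Counsel": (14, 54),
    "SES": (11, 50),
}
reference_shares = (23.5, 23.6)  # Statistics Canada and Wikipedia figures quoted above

for name, (national, provincial) in polls.items():
    low, high = implied_share_bounds(national, provincial)
    consistent = any(low <= ref <= high for ref in reference_shares)
    print(f"{name}: {low:.2f}% to {high:.2f}% (consistent with population: {consistent})")
```

Computed this way the bounds agree with the ranges quoted above to within about a tenth of a percentage point, and neither range contains the actual population share.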

Obviously, without examining the detailed methodology behind the reported numbers, I can't do anything beyond pointing out that something very odd is going on with these numbers. Nevertheless, I hope this will serve as a reminder both to those who publish and to those who read such polls: Pollsters are only human, and even if you trust them not to present you with lies or damn lies, they are quite capable of making mistakes.

Posted at 2005-12-13 07:00
