Deep packet inspection rears its ugly head

Last Friday I started noticing error messages in my production environment. URLs were being mangled, with two consecutive characters replaced by 0x80 and 0x01 or by 0x80 and 0x04. This caused UTF-8 decode exceptions to be logged, as well as failures of the cryptographic hash function we use to secure our URLs. As a general principle, I take any such unexpected exceptions very seriously, so I started investigating them, one concern being that some of our custom C extensions to nginx could be responsible for data corruption under heavy load.

I ran snoop (a Solaris utility similar to tcpdump) on one of our production servers, and after combing through 180MB of packet traces with Wireshark, it turned out the data was being corrupted before even hitting our web servers. While it was a relief to find out our own infrastructure was not to blame, I still had to identify the culprit, e.g. whether our hosting provider’s switches, firewalls or load-balancers were to blame.

TCP has built-in checksums, so a malfunctioning switch working at layers 1–3 would not cause this problem: a corrupted packet would be dropped and resent, with a slight hit on performance but no errors. Thus the problem would have to lie in a layer 4 or higher device such as a load balancer, which can rewrite the payload and recompute the checksum.
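To illustrate the checksum argument, here is a minimal Python sketch of the 16-bit ones' complement checksum described in RFC 1071, which TCP uses. It is a simplification: a real TCP checksum also covers the TCP header and a pseudo-header, and the request line and the position of the corrupted bytes below are made up for the example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical request line; not taken from the real packet traces.
original = b"GET /img/abcdef0123456789.jpg HTTP/1.1\r\n"
# Simulate the observed damage: two consecutive bytes become 0x80 0x01.
corrupted = original[:10] + b"\x80\x01" + original[12:]

# The receiver recomputes the checksum, sees a mismatch and drops the segment.
print(internet_checksum(original) != internet_checksum(corrupted))
```

A device that only forwards packets cannot hide this kind of damage from the receiver. One that terminates or rewrites the TCP stream computes a fresh checksum over the already corrupted data, so the receiver accepts it.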

I added some extra logging and let it run over the weekend. After analyzing the data, it turns out the problem is very circumscribed (76 requests out of hundreds of millions), and all the affected IP addresses come from the same ISP, Singapore Telecom Magix (AS9506). The only plausible explanation is that SingTel is running some sort of deep packet inspection gear, and some of the DPI gateways have corrupt memory or software bugs that are corrupting the data flowing through them.
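The analysis amounts to flagging mangled URLs and grouping them by client address. A rough Python sketch follows; the (client IP, raw URL bytes) record format, the function names and the sample data are all hypothetical, and mapping addresses to an ISP or AS number would need a separate lookup that is not shown.

```python
from collections import Counter

# The two byte pairs seen in place of the original characters.
CORRUPTION_MARKERS = (b"\x80\x01", b"\x80\x04")


def is_mangled(raw_url: bytes) -> bool:
    """True if the URL contains a marker pair and is not valid UTF-8."""
    if not any(marker in raw_url for marker in CORRUPTION_MARKERS):
        return False
    try:
        raw_url.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def count_by_client(records):
    """Count mangled requests per client IP.

    records: iterable of (client_ip, raw_url_bytes) pairs (assumed format).
    """
    hits = Counter()
    for client_ip, raw_url in records:
        if is_mangled(raw_url):
            hits[client_ip] += 1
    return hits


# Made-up sample using documentation-only IP addresses.
sample = [
    ("203.0.113.7", b"/photo/ab\x80\x01ef?sig=1234"),
    ("198.51.100.2", b"/photo/abcdef?sig=1234"),
    ("203.0.113.7", b"/photo/12\x80\x0456?sig=abcd"),
]
print(count_by_client(sample).most_common())
```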

Deep Packet Inspection is a scourge the general public is insufficiently aware of. At a high level, DPI gateways watch over your shoulder as you use the Internet. They decode the data packets passing through them and reconstruct unencrypted HTTP requests (in other words, they spy on your browsing history). In their transparent proxy incarnation, they can rewrite the requests or responses. Verizon Wireless uses the technology to resize and recompress images or videos requested by smartphones. Back when I used to work for France Telecom (circa 1996-1999), vendors would regularly approach us to peddle their wares and explain how they would allow us to price-gouge our customers more effectively. Hardware has progressed dramatically since, and a single Xeon processor is capable of inspecting at least 10 Gbps of data.

The whole premise of DPI and other snooping devices is profoundly repugnant to me as a former network engineer, on both moral and technical grounds. Any additional “bump in the wire” slows things down and is yet another potential point of failure, as shown by this incident, but the potential for abuse is the real concern. Not to mince words, the legitimate purposes for the technology, such as fighting cybercrime, are just rationalizations; it was really developed for purposes most people would consider abusive.

When I joined FT, I had to go to a Paris courthouse and swear a solemn oath to defend the privacy of our customers’ communications, and report any infringement of the same. DPI technology originates in spy agencies, and is much beloved of authoritarian governments. China uses the technology, combined with voice recognition, to drop calls at the merest mention of the word “protest”. The Ben Ali regime in Tunisia used it to snoop Facebook users’ authentication cookies. Singapore’s government has a well-demonstrated intolerance of criticism, and who knows what SingTel is doing with their defective gear? Western companies like Cisco were disgracefully eager to sell censorware to dictatorships, but those governments now have homegrown capabilities from the likes of Huawei.

For telco oligopolies, the endgame is to practice perfect price discrimination, e.g. charge you more for packets that carry a voice over IP call or a Netflix video on demand session that compete with the carriers’ own services. Telcos and cablecos cannot be permitted to use their stranglehold over public networks for what is essentially racketeering. Strowger invented the automatic telephone switch because the operator at his manual exchange would divert his calls to one of his competitors, her husband. Telcos, in their monopolistic arrogance, feel a sense of entitlement to all the value the network creates, even when they are not responsible, and want to reverse this. Letting them get away with it, as is consistently the case in the US, is a recipe for long-term economic stagnation.

What can we as the general public do to fight back? The telcos are one of the largest lobbies in Washington, and wireless spectrum auction fees are one of the crutches propping up Western budgets, so help is unlikely to come from the venal legislatures. The most practical option is to start using SSL and DNSSEC for everything. Google now offers an encrypted search option and Facebook has an option to use SSL for the entire session, not just for login.

Posted in Network, Soapbox | Leave a comment

Doing my bit for the Internet

My first IPv6 connectivity, courtesy of Hurricane Electric’s Tunnel Broker. It only took me three years…

Posted in IT, Mac, Network | Leave a comment

Hey Apple…

Some improvements you should consider:

  • Sync iPods, iPhones and iPads over WiFi. Cables are so twentieth century. Palm had Bluetooth sync working ten years ago, and 802.11n has the same real-world speed as USB. You could then simply extend this to sync the device to the cloud instead of a specific computer.
  • Ditching DVDs to offer an OS reinstall USB flash drive on the new MacBook Airs and Pros is a good idea, but the stick is easy to misplace. How about soldering a read-only USB drive directly onto the motherboard so it can never be lost?
  • When someone enters an address in a Calendar entry on iOS, make it clickable and linked to the Maps app, the way addresses in Contacts are. Copying and pasting them manually is a drag.
  • Stop adding useless frills like “stationery” to Mail.app, and make the default chronological sort order switchable to “most recent on top”.
  • Add HDMI CEC support to the AppleTV. It would be nice to have an HDTV automatically switch over to the AppleTV’s HDMI input when you try to access it. Speaking of which, it would be nice to have an option to disable the audio out on HDMI, e.g. if you have a decent surround sound system connected to it over Toslink and don’t want the TV’s tinny speakers to kick in.

Posted in Mac, Soapbox | Leave a comment

Is this a Google Street View car?

Update (2011-05-12): the answer is no, it’s a Navteq 3D mapping car with a LIDAR array. Thanks to Darrell Kresge for the clarification.

As I was walking to lunch today, I caught sight of this weird contraption, and had just enough presence of mind to grab a few snaps of it.

One strange feature is a spinning white cylinder inside the arm, canted at a 45-degree angle. It doesn’t look like any of the Google Street View vehicles captured before, nor does it have the Google markings. The Michigan license plate is a bit odd as well. A prototype, perhaps? Or is some other company getting into this racket, perhaps Microsoft?

Posted in Photo, San Francisco | 3 Comments

Changing the WordPress table prefix

This may be of use to people experiencing the dreaded “You do not have sufficient permissions to access this page.” message when trying to reach WordPress’ admin page, even when logging in as a proper administrator. WordPress embeds the table prefix in 4 different locations:

  • The wp-config.php file
  • The name of the tables
  • The name of user metadata keys
  • The name of blog options

Thus if you want to change the prefix, you have to:

  1. Edit wp-config.php to change the prefix
  2. Rename your tables
  3. Rename your user metadata
  4. Rename your site options

Skipping step 1 or 2 will cause WordPress not to find the tables, and it will go through the initial install process again.

Skipping step 3 will cause the account to lose its roles, and thus not be authorized to do much besides read public posts.

Skipping step 4 is more insidious: WordPress looks up the option foo_user_roles, the link between roles and capabilities, and no longer finds the one stored as wp_user_roles. Thus even if your account is an administrator, it is no longer authorized for anything.

It feels quite clunky to embed the database prefix in column values, not just tables, just like WordPress’ insistence on converting relative links to absolute links. The former makes moving tables around (e.g. when consolidating multiple blogs on a single MySQL database) harder than necessary. The latter makes moving a blog around in a site’s URL hierarchy break internal links. I suppose there are security reasons underlying Automattic’s design choice, but security by obscurity of the WordPress table prefix is hardly a foolproof measure.

If you are renaming the tables, say, from the default prefix wp_ to foo_, the MySQL commands necessary for steps 2–4 would be the following:

-- Steps 3 and 4: rename the prefixed keys while the tables still have their old names.
-- The backslash stops LIKE from treating '_' as a single-character wildcard,
-- and CONCAT/SUBSTRING replaces only the leading prefix (SUBSTRING is 1-indexed).
UPDATE wp_usermeta SET meta_key=CONCAT('foo_', SUBSTRING(meta_key, 4))
WHERE meta_key LIKE 'wp\_%';
UPDATE wp_options SET option_name=CONCAT('foo_', SUBSTRING(option_name, 4))
WHERE option_name LIKE 'wp\_%';
-- Step 2: rename the tables. The wp_redirection_* tables come from a plugin;
-- adjust the list to match your own install.
ALTER TABLE wp_commentmeta RENAME TO foo_commentmeta;
ALTER TABLE wp_comments RENAME TO foo_comments;
ALTER TABLE wp_links RENAME TO foo_links;
ALTER TABLE wp_options RENAME TO foo_options;
ALTER TABLE wp_postmeta RENAME TO foo_postmeta;
ALTER TABLE wp_posts RENAME TO foo_posts;
ALTER TABLE wp_redirection_groups RENAME TO foo_redirection_groups;
ALTER TABLE wp_redirection_items RENAME TO foo_redirection_items;
ALTER TABLE wp_redirection_logs RENAME TO foo_redirection_logs;
ALTER TABLE wp_redirection_modules RENAME TO foo_redirection_modules;
ALTER TABLE wp_term_relationships RENAME TO foo_term_relationships;
ALTER TABLE wp_term_taxonomy RENAME TO foo_term_taxonomy;
ALTER TABLE wp_terms RENAME TO foo_terms;
ALTER TABLE wp_usermeta RENAME TO foo_usermeta;
ALTER TABLE wp_users RENAME TO foo_users;
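
For other prefixes or a different set of tables, the statements can be generated rather than typed by hand. The following Python sketch is only an illustration: the function name and the table list are my own assumptions, the prefixes are pasted into the SQL unquoted (so use only trusted values), and you should review the output and back up the database before running any of it. It escapes LIKE wildcards in the old prefix and rewrites only the leading prefix of each key.

```python
def prefix_migration_sql(old, new, tables):
    """Build the SQL for steps 2-4.

    old, new: table prefixes such as "wp_" and "foo_" (trusted input only).
    tables: table names without their prefix, e.g. ["posts", "users"].
    """
    # Escape LIKE wildcards so '_' in the prefix only matches a literal '_'.
    pattern = (
        old.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_") + "%"
    )
    start = len(old) + 1  # SQL SUBSTRING positions are 1-indexed
    statements = []
    # Steps 3 and 4: rename keys while the tables still carry the old prefix.
    for table, column in (("usermeta", "meta_key"), ("options", "option_name")):
        statements.append(
            f"UPDATE {old}{table} "
            f"SET {column}=CONCAT('{new}', SUBSTRING({column}, {start})) "
            f"WHERE {column} LIKE '{pattern}';"
        )
    # Step 2: rename the tables themselves.
    statements += [f"ALTER TABLE {old}{t} RENAME TO {new}{t};" for t in tables]
    return statements


# Assumed table list; add any plugin tables present in your own install.
CORE_TABLES = [
    "commentmeta", "comments", "links", "options", "postmeta", "posts",
    "term_relationships", "term_taxonomy", "terms", "usermeta", "users",
]

for statement in prefix_migration_sql("wp_", "foo_", CORE_TABLES):
    print(statement)
```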

Posted in Web | Leave a comment

I love my ISP

Not only do Webpass give me fast 45Mbps symmetrical access for $45/month, with no capricious restrictions or anticompetitive shenanigans, but they are also real mensches.

Posted in San Francisco, Soapbox | 2 Comments

Putting customers first

When you visit the Dell website, the first thing they force you to decide is whether you are a Home, Small Business or “Enterprise” business customer. At one point, the thin and light laptops were only available in the Enterprise section—perhaps plebs and small businesses are judged unworthy of appreciating the finer things in life, unlike the kleptocrats who run large corporations. We hoi polloi presumably should be content with our fate and make do with last year’s (decade’s?) technology.

When you search for products on Amazon, you have to select a “Department” to enable sorting by price. What do I care whether a microfiber cloth was filed under “Automotive” or “Electronics”? Taxonomies are inherently subjective, a fact librarians know well, but one that is surprisingly poorly understood outside the field.

Both cases illustrate what happens when a self-centered organization puts its internal structure and implementation details ahead of its customers.

Posted in Soapbox | 1 Comment

Dear Parallels

Since you keep hitting me with these spammy popups no matter how many times I click on “Do not show again”, you leave me no choice but to switch to VirtualBox (much better software in any case, and less Windows integration means less chance of a virus breaking out of the virtualized Windows ghetto).

Oh, and installing MacFuse without asking permission (unlike VMware Fusion): not cool.

Don’t let the door hit you on the way out.

Posted in Mac, Soapbox | Leave a comment

Will Adobe ever learn?

In a triumph of hope over experience, I recently “upgraded” from Adobe CS3 Design Standard to CS5 Design Standard. I hardly ever use Photoshop any more since I started using Aperture and Lightroom (originally a Macromedia product, no matter what the lame “Adobe Photoshop Lightroom” face-saving branding may try to claim); the main driver for the purchase was actually InDesign CS5 and its ePub functions.

Of course, this is Adobe. Previous versions gratuitously included crud like a full Opera install (an older version, insecure, naturally) just to display a splash screen, or a full MySQL install to power Acrobat search. I never install Acrobat, of course, since that bloated and bug-ridden piece of garbage managed to steal the crown for most insecure software from Internet Explorer, no small feat.

Adobe does not want to confuse users with streamlined and efficient software, so they decided to include the mostly useless Growl on-screen notification program to nag you into registering. Increasing bloat and attack surface for malware is not a good idea, nor is breaking creative people’s flow with interruptions. Of course, helping clients Get Things Done is a low priority at Adobe, as evidenced by their product choices.

You have to pity the Growl developers, whose reputation will suffer from guilt by association. I dislike interruptions and do not find it useful, but many people do and rave about it. They installed it by choice, not as a sneaky drive-by install for slimy marketing purposes.

Some more annoyances in CS5:

  • The pricing for the suite is more than the sum of its parts: $200 each for Photoshop, InDesign or Illustrator, $700 for Design Standard. I suppose they must think Acrobat and their online tie-ins have a value of $100 (hint: they forgot the negative sign).
  • Of course, they won’t let you upgrade individual component applications.
  • On the plus side, they now have the decency to include Acrobat on a separate CD, so you can discard it immediately and not risk installing it as a side-effect of installing the apps that are actually useful.
  • The icons were designed by the world’s laziest and most creatively bankrupt designer, just as with CS3 and CS4.
  • Performance on a high-end 8-core or 12-core Mac is actually slower than on a lower-end configuration, thanks to legacy cruft and incompetence.
  • It is slower to load on my wife’s MacBook Pro. Each successive version of OS X is faster on the same hardware, whereas Microsoft and Adobe deliver software that gets progressively slower.

In other words, unlike Lightroom, CS5 is designed to be endured, not to delight.

Posted in IT, Mac, Soapbox | 2 Comments

Organizing with Delicious Library

Delicious Library is one of the slickest apps on the Mac, and has won countless design accolades. Essentially, it is a database for your books, CDs and DVDs (version 2 added gadgets), and it looks glorious on a large monitor like mine. It seems like a novelty for collector-fondlers, and I myself unfairly dismissed it as a toy in 2008, but behind its playful user interface lies a remarkably powerful organizational tool, and the new 2.5 version has made major improvements in stability and performance after 2 years of relative neglect.

Screenshot of Delicious Library

My wife and I are both avid readers—one of our common dreams is to someday have a home with a dedicated room for a library. We are squarely in the demographic for the Bookshelf porn website. Here is a montage of mine alone, not including the books I reluctantly had to consign to storage, or those in my parents’ basement back in France:

My bookshelves

With well over 900 books, I needed a system to manage them. At some point I discovered Delicious Library has a writeable location field in its database for every item, and you can create virtual shelves to organize your books. I literally have one DL shelf for each shelf in my bookshelves, one for each box in storage, and one for all the books I keep at work. This way, I can browse shelf by shelf or box by box, or conversely look up the location of a book I need.

Location data for a book

You may think recording the location of each book would be a mind-numbing task, but Delicious Library makes it effortless. It can read ISBN bar codes using the iSight camera included in most Macs, but a better option is to use the Microvision RoV Bluetooth barcode scanner they sell and support. It is quite expensive at well over $200, especially compared to the inexpensive Symbol CS1504, but the convenience makes it well worth the price if you have a serious library to wrangle.

If you select a shelf on the right sidebar, then scan a barcode, the book is automatically added to the selected shelf. I filed a feature request with them in 2008 for precisely this, and I do not know when they added it, but it makes a world of difference. The scanner has a memory, so you can zip into an adjoining room, scan all the barcodes on a shelf, zip back and let the scanner pour the shelf’s inventory into DL. It even reads out the titles as they are added. Repeat the process and quite quickly you can compile an inventory of your entire library.

The search function in DL is somewhat primitive, but Smart Bookshelves (similar to iTunes’ Smart Playlists) help. I have Smart Bookshelves for:

  • Books I have already read
  • Books I have yet to read
  • Books that now have significant resale values on Amazon, to identify candidates for decluttering (you would be surprised to see the markups some art or technical books can fetch once they go out of print, even temporarily, such as my old copy of DeMarco’s Peopleware).
  • Books signed by the author
  • Science-Fiction and Fantasy books, using Amazon’s (probably Bookscan’s) categorization
  • Computer books, similarly

Delicious Library tries hard to use cutting-edge functionality in OS X, which is why version 2 only supported Snow Leopard and later. It has a top-notch AppleScript implementation. I am no AppleScript guru, but I relatively easily wrote my own scripts to:

  • Copy the bookshelf name into all the books it contains. This essentially eliminates the data entry, but you have to be careful to make sure a book is not misfiled into two shelves at the same time. The sample script Highlight shelves containing selected media is helpful in this respect.
  • Find the BookMooch entry for books I want to give away.
  • Find the book in the San Francisco Public Library, in case they do not have it and might be interested in a donation.

Thanks to the user interface, browsing books on the Mac is almost as fun as doing so on the bookshelves, and infinitely faster. Of course, this is true of most book catalog programs, including many fine free options available for Windows and Mac, but most others do it with less aplomb.

Posted in Mac, Stuff | 1 Comment