2013-10-20

Federal IT project comparisons

Stewart Baker at the esteemed Volokh Conspiracy argues that not all big Federal IT projects are disasters:

... it isn't impossible, even with stiff political opposition, to manage big public-facing federal IT projects successfully. I can think of three fairly complex IT projects that my old department delivered despite substantial public/Congressional opposition in the second half of George W. Bush's administration. They weren't quite as hard as the healthcare problem, but they were pretty hard and the time pressure was often just as great.
He cites three examples:
  1. ESTA: international visa waiver, serving 20M foreign customers per year and serving results to US border ports;
  2. E-verify: US employers checking entitlement to work, about 0.5M transactions per year;
  3. US-VISIT: electronic fingerprint checks at US borders, about 45M queries per year.

ESTA is a pretty good comparison to the health exchange: the user creates something like an account, uploads their identity information for offline consideration and conducts a financial transaction (paying the application fee). 20 million visitors per year sounds like a lot, but the traffic is spread fairly evenly across the day, week and year because its source is world-wide. 20M users over the roughly 31.5M seconds in a year is an average of about 0.6 users per second, and there are only a couple of pages on the site, so average queries per second are in single figures. You could serve this with about 6 reasonably-specced PCs spread across three physically separate locations, so that you always have at least two locations active, and at least one PC active in each, even allowing for planned and unplanned outages. This is a couple of orders of magnitude less than the health exchange traffic. It's not a bad system to evaluate in preparing for implementation of the health exchange, but you can't expect to just translate across the systems and code. The unofficial rule of thumb is that a well-designed system built for traffic level X should scale fine to 10X, but by the time you approach 100X you need a completely different system. The serving of results to border checks is at a similar scale: most visitors with an ESTA visit the US about once per year, so you expect about 20M border checks per year, or a little under 1 query per second.
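For anyone who wants to check the arithmetic, here is a rough Python sketch that turns the annual volumes quoted above into average per-second rates. The figure of 5 page requests per ESTA visit is my own assumption for illustration, not a number from Baker's post, and these are averages only: real traffic will have peaks.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000

# Assumption for illustration only: page requests made by one ESTA applicant.
ASSUMED_REQUESTS_PER_VISIT = 5


def average_per_second(per_year: float) -> float:
    """Average rate per second, assuming traffic is spread evenly over the year."""
    return per_year / SECONDS_PER_YEAR


# Annual volumes as quoted in the list of examples.
annual_volumes = {
    "ESTA applications": 20_000_000,
    "E-verify transactions": 500_000,
    "US-VISIT queries": 45_000_000,
}

rates = {name: average_per_second(v) for name, v in annual_volumes.items()}
esta_page_requests = rates["ESTA applications"] * ASSUMED_REQUESTS_PER_VISIT

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} per second")
print(f"ESTA page requests (assumed): {esta_page_requests:.1f} per second")
```

Even if the assumed requests per visit were doubled, the ESTA page rate would stay in single figures.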

E-verify can be dismissed immediately as not comparable: it's an extremely lightweight check and has very low traffic levels.

US-VISIT is more interesting: although it's only a couple of queries per second, fingerprint matching is well known to be computationally intensive. Fortunately it's very easy to scale. You "shard" the fingerprint database by easily identified characteristics, breaking it into (possibly overlapping) subgroups; say, everyone with a clockwise whorl on their right thumb and anticlockwise spiral on their left index finger goes into subgroup 1. Then your frontend, on receiving a fingerprint set, can identify the appropriate subgroup and query one of a pool of machines that holds all the fingerprint sets matching that characteristic. You have a few machines in each pool in three separate sites, as above.
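A toy Python sketch of that sharding idea follows. Everything in it is hypothetical: the `PrintSet` fields, the pattern labels, and especially the exact-equality comparison, which merely stands in for a real (and expensive) fingerprint matcher. It also ignores the overlapping subgroups a real system would need for ambiguous prints.

```python
from collections import defaultdict
from typing import NamedTuple


class PrintSet(NamedTuple):
    person_id: str
    right_thumb: str  # coarse pattern class, e.g. "clockwise_whorl"
    left_index: str   # coarse pattern class, e.g. "anticlockwise_spiral"
    detail: str       # stand-in for the full fingerprint data


def shard_key(prints: PrintSet) -> tuple:
    """Cheap, easily identified characteristics that pick the subgroup."""
    return (prints.right_thumb, prints.left_index)


class ShardedIndex:
    """Each shard would live on its own pool of machines; here it is a list."""

    def __init__(self) -> None:
        self._shards = defaultdict(list)

    def add(self, prints: PrintSet) -> None:
        self._shards[shard_key(prints)].append(prints)

    def candidates(self, probe: PrintSet) -> list:
        """Only the records in the probe's shard need the expensive comparison."""
        return self._shards.get(shard_key(probe), [])

    def match(self, probe: PrintSet) -> list:
        # Equality on `detail` is a placeholder for a real matching algorithm.
        return [c.person_id for c in self.candidates(probe) if c.detail == probe.detail]


index = ShardedIndex()
index.add(PrintSet("alice", "clockwise_whorl", "anticlockwise_spiral", "a1"))
index.add(PrintSet("bob", "clockwise_whorl", "anticlockwise_spiral", "b2"))
index.add(PrintSet("carol", "arch", "loop", "c3"))

probe = PrintSet("unknown", "clockwise_whorl", "anticlockwise_spiral", "b2")
print(index.match(probe), "found among", len(index.candidates(probe)), "candidates")
```

The point is only that the frontend does a cheap classification first, so the costly comparison runs against one subgroup rather than the whole database.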

These are interesting applications, and I agree that they are reasonable examples of federal IT projects that work. But they are relatively simple to design and build, and they did not have the huge publicity and politically imposed deadlines that the health exchanges have. If any lesson comes from these projects, it's that well-defined scopes, low traffic levels and relaxed performance requirements seem to be key to keeping federal IT projects under control.
