2013-04-09

Linus Torvalds to probe accounting irregularities at the Bank of England

Ok, I lied. The truth, however, is no stranger than fiction. The Financial Conduct Authority, replacement for the spectacularly ineffective Financial Services Authority and hence desperate to prove itself, has decided to investigate the recent (short-lived) IT meltdowns at RBS, NatWest and Ulster Bank:

The Financial Conduct Authority (FCA), which took over regulation of financial services companies at the start of April, said it had begun an enforcement investigation into the breakdown, which affected accounts at RBS, NatWest and Ulster bank; the latter two are also part of the RBS group.
The authority said it would "reach its conclusions in due course and will decide whether or not enforcement action should follow that investigation". If it does find that there were systemic failures behind the technology problems, the bank could face a fine, or individuals could be censured and banned.

They've got no idea what they're doing, have they?

There are almost always systemic failures behind any publicly visible technology problem. Such problems are not random - to be so visible, they have to be the result of significant infrastructure failures. Examples might include:

  • locating all the serving infrastructure within one physical location, which loses power due to an external failure (power company error / overly curious cat in the transformers / large bird in the overhead lines);
  • multiple locations for the serving infrastructure but a failure in the software which routes queries to the best location;
  • rolling out a software change to all locations at once, or at least without a suitable gap between locations to detect problems;
  • employing inept monkeys to write business-critical software;
  • anything involving words from the list "Capita", "Fujitsu", "Accenture".

What intrigues me is how the FCA thinks it's going to improve matters by "investigating" the problem. Believe me, RBS / NatWest / Ulster Bank have already gone over the causes of these incidents with a fine-tooth comb. And even if the IT management didn't care about the systemic problems which led to these outages, the odds of the FCA employing actual experts in the area who could do better are painfully low. If you're a very talented software engineer, or (better) a software engineering project manager with decades of experience and knowledge of fsck-ups, why the heck would you work for the FCA on a government salary? If they brought in known experts on risk and critical systems to lead the technical aspects of the enquiry, such as John Rushby or Anthony Hall, I would listen to what they had to say. Chance of this happening? Zero.

I do like - in principle - the idea of "banning" individuals from future involvement in UK bank IT, but fear that the chance of actually identifying the truly dangerous individuals - every large company has at least several - is negligible. Every experienced software engineer has joked about revoking someone's licence to code / commit code to source control / write documentation, but that kind of constructive dismissal would never fly with management. I don't see the FCA changing this. If anything, the bans will be directed at programmers whom the current management wants to get rid of anyway. It's a great way to let management fire those programmers with cause ("the FCA blamed you for the outage!") and hence avoid vesting their contractual restricted share units.

I would bet £100 (my limit on sure bets) that the final FCA report will a) contain a number of platitudes about general software engineering practice and risk management and b) repeat near-verbatim the contents of the already-existing RBS, NatWest and Ulster Bank reports into the incidents. Total amount of new information: epsilon (as near to zero as makes no difference).

I'm hoping against hope, however, that the FCA has the testicular fortitude to publish at least substantial subsections of the internal bank reports into the failures, despite the inevitable cries of "security risk" from the executives involved. It would make no difference to the public, but it would give the critical systems community a fascinating set of data points on current practice in retail banking IT systems and the failure modes thereof.
