Don't worry about opaque algorithms; you already don't know what anything is doing, or why

2017-05-24

Machine learning algorithms are opaque, difficult to audit, unconstrained by ethics, and there's always the possibility they'll do the unthinkable when facing the unexpected. But that's true of most of our society's code base, and, in a way, they are the most secure part of it, because we haven't yet talked ourselves into a false sense of security about them.

There's a technical side to this argument: contemporary software is so complex, and the pressures under which it's developed so strong, that it's materially impossible to make sure it'll always behave the way you want it to. Your phone isn't supposed to freeze while you're making a call, and your webcam shouldn't send real-time surveillance to some guy in Ukraine, and yet here we are.

But that's not the biggest problem. Yes, some Toyota vehicles decided on their own to accelerate at inconvenient times because their software systems were mind-bogglingly and unnecessarily complex, but nobody outside the company knew how complex they were, because access to the code was so legally restricted that even after the crashes it had to be inspected by an outside expert under conditions usually reserved for high-level intelligence briefings.

And there was the hidden code in VW engines designed to fool emissions tests, the programs Uber uses to track you even while it says it doesn't, and Facebook's convenient tools to help advertisers target the emotionally vulnerable.
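To make the VW example concrete: a "defeat device" is nothing exotic, just ordinary branching logic. The sketch below is a purely hypothetical illustration in Python, not VW's actual code; every name and threshold in it is invented. Public reports describe the real device detecting lab-test conditions (steady speed, an unmoving steering wheel, the standard test cycle) and switching emissions controls accordingly, which is the general shape sketched here.

```python
# Purely hypothetical sketch of a "defeat device" pattern; not VW's code.
# All function names and thresholds are invented for illustration.

def looks_like_emissions_test(speed_kmh, steering_angle_deg, minutes_running):
    """Guess that the car is on a dynamometer running a test cycle:
    wheels turning, steering wheel never moving, typical test duration."""
    return speed_kmh > 0 and abs(steering_angle_deg) < 1.0 and minutes_running < 30

def select_engine_calibration(speed_kmh, steering_angle_deg, minutes_running):
    # Clean, compliant behavior while being watched; a dirtier, higher-performance
    # calibration the rest of the time. The "ethics" live entirely in this branch.
    if looks_like_emissions_test(speed_kmh, steering_angle_deg, minutes_running):
        return "low_nox_calibration"
    return "normal_driving_calibration"
```

Nothing in that branch is cognitively opaque; the only opacity was legal and organizational.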

The point is, the main problem right now isn't what a self-driving car _might_ do when it has to make a complex ethical choice guided by ultimately unknowable algorithms, but what the car is doing at every other moment, reflecting ethical choices made by corporate executives that might be unknowable in a philosophical, existential sense, but are worryingly familiar in an empirical one. You don't know most of what your phone is doing at any given time, not to mention your other devices; it can be illegal to try to find out, and illegal, if not impossible, to change it even if you did.

And a phone is a thing you hold in your hand and can, at least in theory, put in a drawer somewhere if you want to have a discreet chat with a Russian diplomat. Even more serious are all the hidden bits of software running in the background, like the ones that can automatically flag you as a national security risk, or are constantly weighing whether you should be allowed to turn on your tractor. Even if the organization that developed or runs the software did its job uncommonly well and knows what it's doing down to the last bit, you don't and most likely never will.

This situation, perhaps first and certainly most forcefully argued against by Richard Stallman, is endemic to our society, and entirely independent of the otherwise world-changing Open Source movement. Very little of the code in our lives runs on anything resembling a personal computer, after all, and even when it does, it mostly works by connecting to remote infrastructures whose key algorithms are jealously guarded business secrets. Emphasis on secret, with a hidden subtext of _especially from users_.

So let's not get too focused on the fact that we don't really understand how a given neural network works. It might suddenly decide to accelerate your car, but "old-fashioned" code could, and as a matter of fact did, and in any case there's very little practical difference between not knowing what something is doing because it's a cognitively opaque piece of code, and not knowing what something is doing because the company controlling the thing you bought doesn't want you to know, and has the law on its side if it wants to send you to jail for trying to find out.

Going forward, our approach to software as users, and, increasingly, as citizens, cannot but be empirical paranoia. Just assume everything around you is potentially doing everything it's physically capable of (noting that being remotely connected to huge amounts of computational power makes even simple hardware far more powerful than you'd think), and if any of that is something you don't find acceptable, take external steps to prevent it, above and beyond toggling a dubiously effective setting somewhere. Recent experience shows that FOIA requests, lawsuits, and the occasional whistleblower might be more important for adding transparency to our technological infrastructure than your choice of operating system or clicking a "do not track" checkbox.