A throwaway note on haunted algorithms

An algorithm is like a map: it means as much through what it leaves out as through what it includes. A subway map shows the topology of the network but not distances – it leaves physical distance out as a way of claiming, in and through silence, that it’s fast enough to make physical distance irrelevant. A physical map showing no political boundaries and a political map showing no ecosystem distributions are making political and ecological statements, respectively.

When we use gender in a prediction model but not access to healthcare, the model is still using both: whatever access to healthcare contributes to the outcome gets attributed to gender, to the extent the two are correlated. A content recommendation algorithm that relies on popularity among demographically similar users, and gives you no way to say what you find annoying or disturbing, is the implementation of a deliberate choice to annoy or disturb you whenever convenient.

I say deliberate because all ignorance is deliberate, and the more resources a person or organization has, the more universally this holds. Whatever a large company or state leaves out of its computations, it leaves out because it wants to.

And the algorithm is therefore haunted by what it leaves out. This is a pragmatic statement: any relevant feature you leave out of an algorithm will, once the algorithm is deployed, impact its behavior and consequences in ways that will be unexplainable if you only look at the algorithm itself, at its data and its assumptions.
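A toy simulation can make the haunting concrete. What follows is a minimal sketch, not a real model: the numbers, the function names (simulate, group_means) and the binary "group" feature standing in for gender are all assumptions made up for illustration. The outcome depends only on access to healthcare, yet the model is given only the group. During training the two are correlated, so the model learns a gap between groups; once the correlation changes in deployment, the model is wrong in a way nothing in its own inputs can explain.

```python
import random

def simulate(n, p_access_by_group, rng):
    """Return (group, outcome) rows. The outcome depends ONLY on access
    to healthcare, never on group; access is then deliberately dropped."""
    rows = []
    for _ in range(n):
        group = rng.randint(0, 1)
        access = 1 if rng.random() < p_access_by_group[group] else 0
        outcome = 2.0 * access + rng.gauss(0.0, 0.5)
        rows.append((group, outcome))  # the model never sees `access`
    return rows

def group_means(rows):
    """Least-squares fit of the outcome on one binary feature (with an
    intercept) reduces to predicting the mean outcome of each group."""
    means = {}
    for g in (0, 1):
        values = [y for grp, y in rows if grp == g]
        means[g] = sum(values) / len(values)
    return means

rng = random.Random(0)  # fixed seed so the run is repeatable

# Training world (hypothetical rates): group 1 has much better access.
train = simulate(20000, {0: 0.3, 1: 0.8}, rng)
model = group_means(train)
learned_gap = model[1] - model[0]

# Deployment world: access has been equalized, but the model cannot know.
deployed = simulate(20000, {0: 0.8, 1: 0.8}, rng)
actual = group_means(deployed)
actual_gap = actual[1] - actual[0]
error_group0 = actual[0] - model[0]

print(f"gap the model learned between groups: {learned_gap:.2f}")
print(f"gap actually observed in deployment:  {actual_gap:.2f}")
print(f"mean prediction error for group 0:    {error_group0:.2f}")
```

Under these assumed rates the learned gap should come out near 1.0 and the deployed gap near zero, leaving group 0 mispredicted by about 1.0. Looking only at the model and its single feature, that error has no visible cause; it only makes sense once the forgotten variable is put back on the table.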

“Where is the bug?” somebody may ask.

It’s not a bug, not in that sense. The algorithm is simply haunted in the same way landscapes, people, and societies are: influenced, and sometimes ruined, by that which was deliberately forgotten.

The good news, then, is that improving an algorithm may be easier than expected: there’s important information to be added, and it’s information that was already potentially available, or it wouldn’t have had to be forgotten.

The bad news is that the barrier is psychological rather than technical. We don’t choose to ignore what we’re comfortable with. And dealing with its (our) own psychological discomfort is not something the IT industry has usually excelled at.