When good AI ethics is just good AI engineering

Most issues in AI ethics are side effects of subpar engineering. This doesn’t mean they aren’t ethical issues: if your bridge falls and kills people because you tried to save money using bad materials, that is an ethical problem. But it’s one that can be prevented by using the right materials — by getting the engineering right — without having to invent an “ethical bridge” that doesn’t fall or set up a Bridge Ethics Committee inside your building company. The bridge didn’t have an ethics problem; the people managing the project did.

There isn’t a complete taxonomy of ethical issues in AI, and as more of our world gets built with, run by, or just called AI, it’ll only get harder to build one. The following are just a couple of examples of issues in AI ethics that are, at bottom, the ethically questionable side effects of improper AI engineering.

Biased data collection

This is the best-known one: if your data set is a biased reflection of the world, then any model trained on that data without taking this into account will replicate that bias. The tools to deal with this are also well known, e.g. data augmentation, or even the less glamorous but solid technique of not trying to build and then oversell something you don’t have proper data for; but they all begin by understanding and modeling the data collection process as part of the general modeling workflow. It’s not just a matter of data provenance but almost of data philology: understanding how the data was collected in order to figure out what problems the collection process might have introduced into the data, and then dealing with them.
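
As a minimal sketch of what “modeling the data collection process” can look like in practice, here is one standard mitigation in that family, reweighting training examples by group frequency (a cousin of the data augmentation mentioned above). The file name, the column names and the use of scikit-learn are illustrative assumptions, not a recipe.

```python
# Sketch: counter a known sampling bias by reweighting, so that under-collected
# groups aren't simply drowned out during training. "training_data.csv", the
# "group" and "label" columns, and scikit-learn are assumptions for illustration.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("training_data.csv")  # hypothetical data set

# Step 1: quantify how the collection process skewed the sample.
observed_shares = df["group"].value_counts(normalize=True)
print("observed group shares:\n", observed_shares)

# Step 2: weight each row inversely to its group's share, so the training
# objective behaves as if the groups had been evenly represented.
weights = 1.0 / df["group"].map(observed_shares)

features = df.drop(columns=["group", "label"])
model = LogisticRegression(max_iter=1000)
model.fit(features, df["label"], sample_weight=weights)
```

The interesting work is the part the snippet glosses over: knowing which shares you should be reweighting toward, which is exactly the data philology described above.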

Biased data generation

While most analysts are familiar with biased data sets, biased data generation is a subtler problem. There’s an implicit expectation in AI and data science that if we could just “Have All The Data” then we could build our model, no problem. The best way to understand why this isn’t always sufficient is to consider the past as a series of indifferently well designed experiments, not just an impartial collection of data. Even if you had full access to all the data “collected” by history (from deep data sets to somebody’s game logs), this data was the result of conditions shaped by human decisions that were rarely, if ever, diverse enough to count as a good experimental setup.

Consider the “question” of the interaction between gender and mathematical ability. While it’s true that female mathematicians — even very recent ones — are less represented in the historical record than their achievements merit, a hypothetical omniscient historical log is still likely to show more male professional mathematicians (or the equivalent in their societies) than female ones, simply because women were systematically at best discouraged and at worst prevented from studying mathematics. The problem isn’t just the limitations of the data we have: the data was generated in a way that systematically filtered out the information relevant to the question (and, in fact, when you look at the data for which conditions are balanced enough, this supposed gap in ability goes away).

Another way to put this, if a drier technical description is more palatable, is that the noise in most data sets is far more correlated with other variables than our tools are comfortable with: if you try to model, say, chances of promotion as ability plus effort plus noise, then in a racist environment that “noise” will vary systematically with race. Even with a large and complete data set, it’s impossible to build an unbiased model without explicitly modeling the bias in the process that generated that data.
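
A toy simulation, with entirely invented numbers, makes this concrete: generate promotions from ability, effort and a group-dependent penalty, then fit the naive “ability plus effort plus noise” model and look at where the supposed noise ends up.

```python
# Hypothetical data-generating process: promotions depend on ability, effort,
# and a systematic penalty applied to one group.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

group = rng.integers(0, 2, size=n)    # 0 = favoured group, 1 = disfavoured group
ability = rng.normal(0.0, 1.0, size=n)
effort = rng.normal(0.0, 1.0, size=n)
penalty = 0.5 * group                  # the part of the "noise" that isn't noise
promoted = (ability + effort - penalty + rng.normal(0.0, 0.5, size=n) > 1.0).astype(float)

# Fit the naive model: promotion from ability and effort only (a linear
# probability model, enough to make the point).
X = np.column_stack([np.ones(n), ability, effort])
coef, *_ = np.linalg.lstsq(X, promoted, rcond=None)
residuals = promoted - X @ coef

# The supposed noise isn't neutral: it is systematically negative for group 1.
print("mean residual, group 0:", residuals[group == 0].mean())
print("mean residual, group 1:", residuals[group == 1].mean())
```

Even with a hundred thousand clean, complete rows, the model has quietly absorbed the discrimination into its error term.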

Anybody familiar with causal analysis will recognize the problem presented this way, and there is a literature, and there are tools, for working out whether and how something can be learned from such data, including how to use our knowledge of the data generation process to improve our model. These tools aren’t used as often as they should be, partly because of the unconscious mythology of “if we had all the data…” and partly because the domain knowledge involved — history, anthropology, sociology — often lies outside the training and interests of the institutions and people working on these models. Yet this is a technical problem, failing to include in the model a key process that influences its performance, with ethically negative implications; it is not a failure of ethics know-how.
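
Continuing that same toy simulation (the snippet below reuses the arrays defined above), the simplest version of such a fix is to write the biasing process into the model instead of leaving it in the error term:

```python
# Continues the previous snippet: reuses n, ability, effort, group, promoted.
import numpy as np

# Model the biasing process explicitly instead of calling it noise.
X_full = np.column_stack([np.ones(n), ability, effort, group])
coef_full, *_ = np.linalg.lstsq(X_full, promoted, rcond=None)
residuals_full = promoted - X_full @ coef_full

# The group coefficient now absorbs the systematic penalty, and the residuals
# no longer differ by group.
print("group coefficient:", coef_full[3])
print("mean residual, group 0:", residuals_full[group == 0].mean())
print("mean residual, group 1:", residuals_full[group == 1].mean())
```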

The “true” ethical trade-off

As the examples above show, there are many issues in AI that are not the result of technical difficulties in making an “ethical AI” but rather of ethical difficulties while we are making the AI. In these cases there’s no trade-off between ethics and engineering (or, in the way it’s usually phrased, between fairness and accuracy): a more ethical engineering process would create a model with better performance.

There’s a set of problems where this isn’t the case, at least superficially. I dislike Trolley-like problems very, very much, but consider a simplified Titanic: a single lifeboat with space for one person, a passenger list of one healthy adult and one unhealthy kid, and no knowledge of when rescue will arrive. An AI algorithm maximizing the expected number of survivors will put the adult in the boat — and, in some cultures and time periods, that would be the ethical thing to do. I would put the kid in it, and let’s assume for the sake of the discussion that the reader would too.

If you want something less stark, consider the choice between two almost equally skilled candidates for a job, where hiring the slightly more capable one requires spending some money to make the office wheelchair-accessible.

Here we truly don’t have a technical problem: the model’s predictions of survival probabilities, or of net profit from the employee, would be accurate. It would still be, at least for some people, an ethical problem, in the sense of a negative outcome. Here the trade-off between fairness and accuracy is real, but I would suggest that it comes from a mis-specified utility function. If you consider it ethically unacceptable to put the adult in the boat instead of the kid, simply add “with kids going first” to the solution constraints for the AI.
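
A deliberately trivial sketch, with invented names and numbers, of how that constraint changes the problem rather than degrading the solution:

```python
# Toy sketch: the same objective with and without the ethical constraint
# written into the problem definition. All names and numbers are invented.
passengers = [
    {"name": "adult", "is_kid": False, "p_survive_in_boat": 0.95},
    {"name": "kid",   "is_kid": True,  "p_survive_in_boat": 0.60},
]

def pick_unconstrained(people):
    # Maximize expected survivors: take the highest survival probability.
    return max(people, key=lambda p: p["p_survive_in_boat"])

def pick_with_kids_first(people):
    # Same objective, but "kids go first" is a hard constraint, not a preference.
    kids = [p for p in people if p["is_kid"]]
    pool = kids if kids else people
    return max(pool, key=lambda p: p["p_survive_in_boat"])

print(pick_unconstrained(passengers)["name"])     # -> adult
print(pick_with_kids_first(passengers)["name"])   # -> kid
```

The second function isn’t a less accurate version of the first; it’s an accurate solution to a differently, and better, specified problem.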

In some sense this is a facile tautology, but the way most AI ethics problems are framed tends to obscure this possibility by separating a narrowly defined utility metric from binding ethical constraints. It’s a version of the rather extreme notion that corporate officers have only a fiduciary duty to their company — the idea that, say, if you own a clinic and slashing its epidemic preparation resources would be more profitable than keeping them, you are ethically obligated to do so, and any potential deaths this causes, if not a crime ex ante, cannot enter your consideration.

With this sort of mindset, whether explicit or implicit (the “Why was it our problem to prevent genocide? We sell online ads” doctrine), it’s to be expected that AIs will be perceived to have an ethics-utility trade-off even after discounting the cases where it’s just bad engineering. Here the solution isn’t technical but social and political: AIs can adapt to ethical constraints as well as to any other properly defined characteristic of the potential solution space. The thing to do is to have the engineers build those constraints into the AIs from the ground up (or rather, to have the business managers ask them to), and not to wait until the systems are running to start discussing philosophical questions about the limits of AI.

The good news

This is all good news. There are many areas of life where, sometimes, there are trade-offs between doing the ethically correct thing and doing what’s more convenient for us. How we navigate those is a very large and often complicated part of who we are, as individuals and as groups. But, despite the increasing importance of AI ethics as a field and as a practice, in many cases (I think most) there is no tension. The right thing and the profitable thing are the same. You may have to spend a bit more time and technical resources building the models, but that’s more than compensated for by the better models you get in return. If you don’t do it, your competitors will.

And, to quote Kirk, if change is inevitable, predictable, beneficial, doesn’t logic demand that you be a part of it?