How to reverse-engineer your organization for fun and profit

2024-09-08

A bit of self-understanding is worth a terabyte of data. This doesn't just sound good as a quote on a slide: it also points to a practical way of deploying algorithmic analysis that's not (yet) widely used, so it's worth walking through a stylized example of how it looks.

To make the story clearer — and to avoid any NDA issues — I will cheat a bit:

If this were a commercial demo it'd be a rigged one [although not worse than a lot of what even the big companies show...]. But the outputs of every computation I show will be real: the example I will work on is staged, but the methods work.

The bad news: The country is rife with ghosts. The good news: The magic amulet industry is booming, and at the center of it is the oldest and biggest conglomerate in the business, Amulets, Conjurations, and Magic Enterprises (known for short as ACME).

Thanks to centuries of experience they have accumulated a vast database of amulet sales, which helps them predict how well an amulet will sell based on variables like price and the local density of ghosts. Yet their profit predictions are not as good as they could be, which is a problem in a very high-risk business where it's almost as easy to lose money as it is to make it:

On average, in fact, their predictions miss the mark by 1465 gold coins.

Wisely, they decide to spend some of those coins on extra help to get it right.

The first question, as usual, is "do you have data?" They do! They are a large company with lots of accumulated experience, so they were able to give us data on about a million different amulet projects. This is the data the sales department uses to predict demand, engineering uses to model the expected cost of producing the amulets, and finance, integrating both predictions, uses to estimate the revenue from the amulet's sales:
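To keep the walkthrough concrete, here is a minimal sketch of what that dataset and the finance step might look like in code. The column names and the way finance combines the two predictions are my own assumptions for illustration, not ACME's actual pipeline.

```python
import pandas as pd

# Hypothetical schema for the ~1M amulet projects (all names are assumptions):
#   price, ghost_density, material, size   -> project variables
#   acme_demand_pred                       -> sales department's output
#   acme_cost_pred                         -> engineering's output
#   acme_profit_pred, realized_profit      -> finance's output and reality
df = pd.read_csv("amulet_projects.csv")

# One plausible way the finance step could integrate the two predictions
# (illustrative only; the real integration rule isn't given in the post):
df["finance_profit_pred"] = df["acme_demand_pred"] * (df["price"] - df["acme_cost_pred"])

# The baseline quoted above: how far ACME's own profit predictions land
# from what actually happened, on average.
baseline_mae = (df["acme_profit_pred"] - df["realized_profit"]).abs().mean()
print(f"ACME's mean prediction error: {baseline_mae:.0f} gold coins")
```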

We have a prediction problem and we have relevant data for it. Surely we can do better than a centuries-old company? Filled with confidence, we use the data to fit a solid, standard prediction model and get...

That doesn't look better. And it's not: while the average prediction error of ACME's own methods is 1465 gold coins, ours is 1483.
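For concreteness, this is roughly what that "solid, standard model" attempt looks like, assuming scikit-learn and the hypothetical column names from the sketch above. The exact error depends on the data, but the point is that it lands no better than ACME's own number.

```python
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

df = pd.read_csv("amulet_projects.csv")

# One-hot encode the categorical variables; predict realized profit directly.
features = pd.get_dummies(df[["price", "ghost_density", "material", "size"]])
target = df["realized_profit"]

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, random_state=0
)

model = HistGradientBoostingRegressor(random_state=0).fit(X_train, y_train)
our_mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"Our mean prediction error: {our_mae:.0f} gold coins")  # ~1483 in the story
```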

A common reaction in this situation is to reach for more sophisticated algorithms ("surely with AI...") but that rarely works. What's going on in this case (I know for sure, because I coded both the "real world" and the organization) is that ACME's predictions use variables that are not in the data set.

That's not a data trick to rig the demo. Unless you are replacing one well-known software model with another, no algorithm ever uses all the information an organization has. It can certainly use data the organization hasn't used before (and spending money to acquire the right variable can be more useful than spending it on more complex models), but employees everywhere, just like ACME's, have information, perceptions, and experience that are simply not reflected in the databases and can't be substituted by more of the same sort of data.

Do we give up? The data we have doesn't allow us to improve on what the company's already doing. What else can we do?

But we do have some data we haven't used so far - not about the amulet market but about ACME itself. More specifically, we know a bit about how ACME thinks.

Recall what we said above about ACME's internal process. Graphically, it went like this:

We won't attempt to build a model to replace any of those elements: we already know they are using information and knowledge we don't have access to. But we can ask which parts of the information we do have they are actually using.

And because it should never be assumed that a person or organization is fully aware of how they think, we will use the data to reverse-engineer at least an aspect of what's going on inside those teams.

For that we will... build models, yes. The difference is that we don't really expect the model to be very good at prediction. What we want to know, and if you do your model-building with care you can figure this out, is what information is flowing where in practice, regardless of what calendar invitations and reports say.

So, if we build a model of ACME's demand predictions using only the variables we have and their historical prediction records, we can see that those predictions are influenced only by price and ghost density; statistically speaking, we see no evidence that the amulet's material or size has any impact on them:
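One simple way to run that kind of check, sketched here with statsmodels and the same hypothetical column names as before (the real analysis can be fancier, but the idea is the same): regress ACME's historical demand predictions, not realized demand, on the variables we have, and see which coefficients carry any statistical weight.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("amulet_projects.csv")

# The target is ACME's *prediction*, not what actually happened. Variables
# whose coefficients are indistinguishable from zero are, in practice,
# not flowing into the sales team's demand estimates.
internal_model = smf.ols(
    "acme_demand_pred ~ price + ghost_density + C(material) + size",
    data=df,
).fit()
print(internal_model.summary())  # in the story: only price and ghost_density register
```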

I'm skipping over the mathematics involved not because it isn't fun but because it isn't the bottleneck: once you know you can do this, the how isn't just pressing a button or calling an API, but it's not basic research either. Tools for things like Bayesian network structure learning or causal effect estimation have improved enormously over the last few years; paired with a bit of domain knowledge and a lot of systematic skepticism, we can learn a great deal about how variables inform or impact each other.
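For a flavor of what those tools look like, here is a rough sketch of score-based structure learning with the pgmpy library (API details vary a bit between versions). The binning and column names are assumptions, and in real work you would stress-test the learned graph with domain knowledge rather than take it at face value.

```python
import pandas as pd
from pgmpy.estimators import HillClimbSearch, BicScore

df = pd.read_csv("amulet_projects.csv")

# Score-based structure learners expect discrete variables, so bin the
# continuous ones first (five quantile bins here, an arbitrary choice).
discrete = pd.DataFrame({
    "price": pd.qcut(df["price"], 5, labels=False),
    "ghost_density": pd.qcut(df["ghost_density"], 5, labels=False),
    "material": df["material"],
    "size": df["size"],
    "acme_demand_pred": pd.qcut(df["acme_demand_pred"], 5, labels=False),
    "realized_demand": pd.qcut(df["realized_demand"], 5, labels=False),
})

search = HillClimbSearch(discrete)
dag = search.estimate(scoring_method=BicScore(discrete))

# Edges pointing into the prediction and demand nodes tell us which
# variables, according to the data, actually inform them.
print(sorted(dag.edges()))
```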

As I said, we won't attempt to make a better demand prediction model than ACME's. But we do have the realized demand numbers, so we can see which of the handful of variables we have actually influence them:

Repeating it because it's the key part: we didn't really use our data to build a model to predict amulet demand. We used it to see which of our variables impact amulet demand; we don't even really need to know how (although having no clue about it should be a red flag in your process).
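In code, this can be as plain as repeating the earlier regression with realized demand as the target instead of ACME's predictions (again a sketch with assumed column names); the contrast between the two summaries is where the finding lives.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("amulet_projects.csv")

# Same right-hand side as before, but the target is what actually happened.
# In the story, material registers clearly here even though it showed no
# effect on ACME's own demand predictions: that gap is the insight.
realized_model = smf.ols(
    "realized_demand ~ price + ghost_density + C(material) + size",
    data=df,
).fit()
print(realized_model.summary())
```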

And here's how we earn our consulting coins. We don't build a model: we tell ACME to start using material information to predict demand, and if they think they are already doing it, we help them figure out why they aren't. It's not at all what's expected from a data-and-algorithms analysis, so we might need some finesse to explain it.

But if we manage to do it, and ACME then uses material information not just to predict costs but also to predict demand (which I can simulate by simply going into the organization code I wrote and modifying it to use this variable when predicting demand):

That is much better! From a mean prediction error of 1465 gold coins they have improved to just 1196.
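Since the whole scenario is simulated, that "intervention" really is just a code change. The toy sketch below shows the shape of it, with a flag that gives the simulated sales department access to one more variable; the function, variables, and effect sizes are invented for illustration and are not the actual simulation code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented multipliers: how much each material helps an amulet sell.
MATERIAL_APPEAL = {"iron": 0.8, "silver": 1.1, "moonstone": 1.4}

def sales_demand_prediction(price, ghost_density, material, use_material=False):
    """Noisy demand estimate from the simulated sales department."""
    base = 50.0 * ghost_density - 2.0 * price + rng.normal(0.0, 5.0)
    if use_material:
        # The seven-word fix, in code form: let material inform demand too.
        base *= MATERIAL_APPEAL[material]
    return max(base, 0.0)

# Before and after the advice is adopted:
print(sales_demand_prediction(price=30.0, ghost_density=2.5, material="moonstone"))
print(sales_demand_prediction(price=30.0, ghost_density=2.5, material="moonstone",
                              use_material=True))
```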

We didn't, and couldn't, help them do this by building models to replace parts of their organization. Rather, we built models to understand empirically some aspects of how the organization thinks (the ones we have data for, not everything they think with), found a misalignment between how information flows in the company and the real-world relationships, and then suggested a conceptual fix. Saying "use the material to help predict demand" is a far cry from a cloud-sized profit prediction AI API, yet with the data we have, those seven words of advice are more effective than the large-scale model.

Rather than looking deeper into the example, let's stop and zoom out.

We – the IT industry for sure but even society in a general sense – don't accumulate huge data sets, train neural networks, or build sophisticated quantitative models for the sake of it. We do it because how well we think is one of the strongest constraints on how effectively we act, and those are tools that help us think better.

They aren't the only ones. In many senses, they aren't the critical ones. Most decisions in business and government, and even in our personal lives, are made through organizational structures. Even when you are directly interfacing with an AI, how it was built, what it can do, and what errors it can make are all influenced by how the organization that built it thinks, what it can do, what errors it can make, what it optimizes and what it believes it optimizes, and so on.

At the competitive frontier, or in the last extremes of corporate survival, it's an error to consider, say, meeting agendas, AI development, and HR staffing as different concerns. They are just the short- and long-term processes the organization uses to think with. The usefulness of "AI people" (or whatever label comes next) is rooted in the ability to help organizations understand themselves in this light, to reverse-engineer their own minds and then improve them.

Sometimes it will be a sophisticated AI infrastructure. Sometimes it'll be seven words in an email. Over the long term it always ends up involving both, and everything in between; it's the disciplined focus on the organization's thinking as a whole, rather than on the technical specs of the software, that makes transformative advances possible.