Why does AI monitoring of job performance generally backfire? Blame how much current AI has improved.
The conventional wisdom that data is the most important asset for building an AI is not entirely true. The key input is having a way to check how well a candidate program works. Data, if you are careful and use it properly, can help you do that to a partial degree, but it's not trivial, and it's not strictly necessary; witness the success of self-play architectures like AlphaZero.
In other words: if you can evaluate it with an AI, you should have an AI do it.
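To make the point concrete, here is a toy sketch of my own, not taken from any particular system: given nothing but an evaluation function, even a trivial search loop can improve candidates with no training data at all, which is the same principle self-play systems exploit at vastly greater scale. The objective and step size below are invented purely for illustration.

```python
import random

def evaluate(candidate: float) -> float:
    """Stand-in scorer: how well does this candidate perform?
    In practice this is the hard part; here it's a toy objective."""
    return -(candidate - 3.0) ** 2  # best performance at candidate == 3.0

def improve(candidate: float, steps: int = 1000) -> float:
    """Keep whatever the evaluator says is better. No dataset required."""
    best, best_score = candidate, evaluate(candidate)
    for _ in range(steps):
        trial = best + random.gauss(0, 0.1)   # propose a small variation
        score = evaluate(trial)
        if score > best_score:                # the evaluator decides
            best, best_score = trial, score
    return best

print(improve(0.0))  # converges near 3.0, driven purely by evaluation
```

Everything interesting lives in `evaluate`. If you can write a trustworthy version of it for a job, you are most of the way to automating that job; if you can't, no amount of surveillance data will stand in for it.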
Clearly Xsolla doesn't know how to automate away their developers. That's alright -- nobody does. But pretending that you can will drive you to measure and evaluate things that neither explain performance nor reliably correlate with it, like counting keypresses and interactions with project management software. We know what happens then; we have known since we began building organizations complex enough to depend on reporting. People optimize whatever they are evaluated on, so if you focus on something that isn't correlated with performance, you end up with better KPIs and worse organizations.
AI doesn't change that logic. Kafka at computer speed just makes the collapse faster and harder to fight, because AI gives arbitrary rules a false sheen of objectivity.
But AI also amplifies institutional culture in the other direction: hire good people and give them AIs that work for and with them, and they'll beat your competitors - and your expectations of what they can do.