Measuring Down
The saying “be careful what you wish for” is a reminder to be thoughtful about how we measure our work.
After our startup was acquired, my team had to make adjustments. Our focus had been on gaining customers and revenue. But the acquiring company did not buy ours in order to sell more of our subscriptions. Our product was meant to be packaged into their larger offering.
We adopted a focus on 30-day Monthly Active Users (MAU) as an indicator of our growth. We counted MAUs as unique subscribers who logged into the site OR who had content delivered to them in the last 30 days.
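To make that definition concrete, here is a minimal sketch of how such a count could be computed. It is not our actual pipeline: the event tuples, the `login` and `delivery` labels, and the `count_mau` name are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical event kinds that qualify a subscriber as "active".
ACTIVE_KINDS = {"login", "delivery"}


def count_mau(events, as_of, window_days=30):
    """Count unique subscribers with a qualifying event in the window.

    events: iterable of (subscriber_id, kind, timestamp) tuples
            (an assumed shape, not a real schema).
    as_of:  the datetime at which the 30-day window ends.
    """
    cutoff = as_of - timedelta(days=window_days)
    active = {
        subscriber_id
        for subscriber_id, kind, timestamp in events
        if kind in ACTIVE_KINDS and cutoff < timestamp <= as_of
    }
    return len(active)


# Made-up sample data for illustration only.
as_of = datetime(2024, 1, 31)
sample_events = [
    ("a", "login", as_of - timedelta(days=2)),
    ("a", "delivery", as_of - timedelta(days=1)),   # same subscriber, counted once
    ("b", "delivery", as_of - timedelta(days=10)),  # never logged in, still active
    ("c", "login", as_of - timedelta(days=45)),     # outside the 30-day window
    ("d", "billing", as_of - timedelta(days=3)),    # not a qualifying event kind
]

print(count_mau(sample_events, as_of))
```

Using a set means a subscriber who both logs in and receives content is counted once, which is what "unique subscribers" implies.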
A sale shows that you have gained a customer. But is the customer gaining value? Are we present in their lives? Are they building a habit that makes them more likely to keep their subscription?
MAU isn’t perfect, but it can be a proxy for many of these outcomes. Gradually, our counts of users and dollars became less prominent, and our MAU grew fast.
It seemed we were too successful. As our usage eclipsed that of others, there was skepticism that we could actually have as many active users as we did.
I got an email informing me that our users would only be counted as “active” if they actually logged into the website. I was devastated: We’d lose about half of our MAUs overnight.
There was no reason for users to log in! Content was being delivered without any action on their part. That was kind of the whole point! Do you want to log into the Spotify.com website every time you listen to music?
I felt sad because it reflected a misunderstanding about how cloud services like ours created value for users. And it created incentives for our team directly opposed to the user’s goals.
There’s an idea in Artificial Intelligence called “Specification Gaming” that says that unpredictable behavior can arise when a system is given a goal that does not correspond with the intended outcome.
For example, a robotic claw is trained to move a box on a table, but learns that it can achieve its goal simply by banging into the table. I’m linking an article and video that give more examples, many of them quite amusing.
Less amusing is finding that your measurement and your outcomes are not aligned. It’s crushing for a team to feel that their work is not being fairly judged.
I find it helpful to ask myself “Could I imagine an unhappy customer incrementing this metric?” or “Could we still fail if this number is going straight up?”
If so, be careful. Metrics can easily become incentives, which are susceptible to gaming and unintended consequences. Our customers are looking at outcomes before metrics — let’s make sure we are too.
Specification gaming: the flip side of AI ingenuity
Specification gaming is a behaviour that satisfies the literal specification of an objective without achieving the…
deepmindsafetyresearch.medium.com