Code coverage is a perverse incentive

While code coverage can be a useful measurement for teams to improve their own results, the moment it’s tracked by people external to the team, particularly management, it becomes a perverse incentive.

“A perverse incentive is an incentive that has an unintended and undesirable result that is contrary to the intentions of its designers.” - Wikipedia

An example of a perverse incentive that turns up in the news every couple of years is that of volunteer firefighters who are only paid when they are called out to a fire. The incentive is monetary, and the only way they can get paid is if there’s a fire somewhere in their coverage area. We’ve now encouraged these people to light fires in their area that they can then get paid to put out. Obviously the majority of firefighters don’t do this, and yet we see enough cases in the news where this exact thing has happened.

This is a case where people do the wrong thing anyway, even though there are penalties for lighting the fire if they get caught.

What about the situation where there are no penalties for doing the wrong thing? This is the exact case for code coverage.

Code coverage is the percentage of code that gets executed during a test run. It doesn’t tell us anything about the quality of the tests or of the code under test. It merely tells us how much of that code got executed.
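To make "percentage of code executed" concrete, here is a minimal sketch of how a line-coverage number could be computed for a single function. It is not how any particular coverage tool works; classify, BODY_LINES, executed_lines and coverage_percent are all hypothetical names invented for this illustration, and the set of body lines is listed by hand.

```python
import sys


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


# Assumption: offsets of classify's three body lines, counted from its "def" line.
BODY_LINES = {1, 2, 3}


def executed_lines(func, *args):
    """Call func once and return the line offsets that ran inside it."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        # Ignore every frame except the function we are measuring.
        if frame.f_code is not code:
            return None
        if event == "line":
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return hit


def coverage_percent(hit):
    """Executed body lines as a percentage of all body lines."""
    return 100 * len(hit & BODY_LINES) / len(BODY_LINES)


if __name__ == "__main__":
    one_call = executed_lines(classify, 5)
    print(f"{coverage_percent(one_call):.1f}%")
```

A single call with a non-negative argument runs two of the three body lines, so it reports roughly 67%. Notice that nothing in the calculation looks at what classify returned or whether anyone checked it.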

When management is looking at this number, we are all encouraged to make the number higher. Some people will make it higher by writing better, more comprehensive tests, but that’s not what’s being measured. Most people will do what it takes to improve the thing being measured, which means executing more code during the test run, whether or not that improves the quality of the code.

We frequently see tests that have no assertions at all. This leads to great code coverage numbers but doesn’t actually do anything to improve the quality of the product. I recall one client where there were hundreds of tests that didn’t validate anything useful and when I suggested that they delete these tests, their objection was that it would lower their code coverage numbers. Not that it would make the code worse in any way, but that it would make code coverage look worse and that’s what they cared about satisfying.
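A small sketch of what such a test looks like. The function and both tests are hypothetical, made up for this illustration, and the bug is deliberate. Both tests execute every line of apply_discount, so both would yield the same coverage number, but only one of them can ever fail.

```python
def apply_discount(price, percent):
    """Hypothetical function with a deliberate bug."""
    discount = price * percent / 100
    return price + discount  # bug: the discount should be subtracted


def no_assertion_test():
    # Runs every line of apply_discount, but checks nothing about the result.
    apply_discount(100, 10)


def asserting_test():
    # Runs exactly the same lines, and also verifies the result.
    assert apply_discount(100, 10) == 90


def passes(test):
    """Return True if the test finishes without an AssertionError."""
    try:
        test()
    except AssertionError:
        return False
    return True


if __name__ == "__main__":
    print("no assertions:", passes(no_assertion_test))
    print("with assertion:", passes(asserting_test))
```

The test without assertions passes even though the function is wrong, while the asserting test fails and exposes the bug. Deleting the first test would lower the coverage number only if nothing else exercised that code, and it would cost nothing in actual protection.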

The moment the team thinks anyone, particularly management, is looking at their code coverage numbers, they’ll start to game them, and the metric becomes a perverse incentive. Code coverage can be a useful metric for helping the team improve, but if we allow it to become a perverse incentive then we lose all that value.

Keep this one to yourself. Code coverage shouldn’t be shared outside the team.
