If you tell enough stories, perhaps the moral will show up.

2009-06-04

In Favour of Delinquency

Antivirus software only works if it's installed, running, and updating its signatures. What with one thing and another, it's hard to keep AV installed and running on every machine, so we need a metric to manage by.

It's conventional to measure coverage: "90% of our machines have updated their signature file within the last week". The number and the age are arbitrary -- it could be 80% or 99% or whatever within a day or a month. (But it certainly seems hard to stay above 90% with McAfee....)

But I think coverage is an inadequate target, especially for servers. You have to watch it, certainly, but it's not enough. The problem is that a coverage report says nothing about how long machines are out of compliance -- you risk being satisfied while some machines never, ever, have current AV scanners. Imagine a network with a thousand machines -- if everything is up to date except for two file servers and the DCs, then your coverage is over 99%, but your overall situation is not at all pretty.

Worse, coverage isn't a good guide to the best next action. Are you going to fix the agent on that critical server with its rare maintenance window, or patch up a couple of workstations? If you just want to get the coverage up you're going to choose the workstations, and you'll be wrong to do so.

Delinquency is a different metric. It measures the proportion going unfixed. It's the percentage of the non-compliant machines in the latest snapshot that were also unfixed at an earlier one, and haven't been fixed in between. The lower the delinquency the better -- a high delinquency means that AV installs are breaking and not getting fixed, a low one means that you are keeping up with the workload.
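To make the definition concrete, here is a minimal sketch in Python. It assumes you keep each compliance report as a set of non-compliant machine names, and it treats "not fixed in between" as "non-compliant in every snapshot within the lookback period", which is only as good as your snapshot frequency. The function name and the machine names are invented for illustration.

```python
def delinquency(snapshots):
    """Return delinquency as a percentage.

    snapshots: non-empty list of sets of non-compliant machine names,
    oldest first, covering the lookback period (an assumed data layout).
    A machine counts as delinquent if it is non-compliant in every
    snapshot, i.e. it was broken at the start and never seen fixed.
    """
    if not snapshots:
        raise ValueError("need at least one snapshot")
    latest = snapshots[-1]
    if not latest:
        return 0.0  # nothing is broken, so nothing is going unfixed
    # Machines that appear in every snapshot of the lookback window.
    delinquent = set(snapshots[0]).intersection(*snapshots[1:])
    return 100.0 * len(delinquent) / len(latest)


# Hypothetical reports, oldest first.
week_ago = {"dc01", "fs01", "ws17", "ws42"}
midweek = {"dc01", "fs01", "ws42", "ws88"}
today = {"dc01", "ws42", "ws88", "ws90"}

print(delinquency([week_ago, midweek, today]))  # 50.0
```

Here dc01 and ws42 were broken all week, so two of today's four non-compliant machines are delinquent. The set of delinquent machines is also the list of names to chase first.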

The levels I like are these:

  • For servers, I think the delinquency should be zero, but the lookback period should allow for the time taken to get a maintenance slot on a server. For us, that's seven weeks. It's simply a claim that everything should be fixed in one maintenance cycle, so you can't leave those DCs without current AV.
  • For workstations, some delinquents are acceptable. So we say 10%, with a lookback of one week.

It's not ideal. It's harder to compute as you need historical data. But it does tell you what to do first.
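The levels above could be written down as a small table and checked automatically. This is only a sketch: the class names, the `Target` type and the `over_target` function are my own inventions, and it assumes the delinquency percentage for each class has already been measured over that class's lookback period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    max_delinquency_pct: float  # highest acceptable delinquency
    lookback_days: int          # how far back the earlier snapshot is


# The levels from the list above; seven weeks is our maintenance cycle.
TARGETS = {
    "server": Target(max_delinquency_pct=0.0, lookback_days=7 * 7),
    "workstation": Target(max_delinquency_pct=10.0, lookback_days=7),
}


def over_target(measured):
    """Return the machine classes whose delinquency exceeds its target.

    measured: dict mapping class name -> delinquency percentage,
    each computed over that class's own lookback period.
    """
    return sorted(
        cls for cls, pct in measured.items()
        if pct > TARGETS[cls].max_delinquency_pct
    )


# Hypothetical measurements for one reporting run.
print(over_target({"server": 2.5, "workstation": 8.0}))  # ['server']
```

Even a small server delinquency trips the check, while the workstations pass at 8%, which matches the priorities argued for here.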

And coverage? Well, if you're fixing the breaks, it hardly matters. But like all metrics, delinquency can be gamed if it's your only target, so the best plan is to keep a coverage target as well: set something easy like 90% and leave it at that.
