Friday, August 10, 2007

the first rule of anti-virus fight club is don't let that kurt guy find out about it

awww, randy beat me to it, darn my self-imposed no-blog-posting-at-work rule... more than a few potential posts have wound up indefinitely stalled because of that rule but oh well... i'll post the thoughts i took down earlier anyways since there's a thing or two i can say that someone who's actually in the industry probably can't (or shouldn't)...

so i stumbled across this itblogwatch article about a group called untangle untangling av's mysteries and i was stunned... testing with 35 viruses? half of which were submitted by the audience, so we don't even know if they're really viruses? and clamav won? there are so many things wrong with this picture it's not even funny...

first, 35 viruses - 17 controlled (for lack of a better word), 17 submitted by the audience, and 1 test file - isn't enough to do anything except mislead people... this is the kind of test pc magazines used to run that was so laughably bad people were told to ignore the results...
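
(to put some numbers behind that - this is my own back-of-the-envelope math, nothing from untangle's write-up: with only 35 samples a single extra hit or miss moves a product's score by almost 3 percentage points, and the uncertainty around any measured detection rate is enormous... here's a quick python sketch using the standard wilson score interval to show how wide the error bars really are...)

```python
import math

def wilson_interval(detected, total, z=1.96):
    """95% wilson score confidence interval for a detection rate
    measured on 'total' samples (z=1.96 corresponds to 95%)."""
    p = detected / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2))
    return centre - half, centre + half

# a product detecting 33 of 35 samples "scores" 94.3%...
lo, hi = wilson_interval(33, 35)
print(f"n=35:     94.3% measured, 95% ci roughly {lo:.1%} .. {hi:.1%}")

# ...versus the same rate measured on a realistically sized test set
lo, hi = wilson_interval(94286, 100000)
print(f"n=100000: 94.3% measured, 95% ci roughly {lo:.1%} .. {hi:.1%}")
```

a 'winner' measured on 35 samples could plausibly be detecting anywhere from roughly 81% to 98% of the same population - error bars that wide swamp any real differences between the products...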

next, using samples submitted by the audience - can you see the audience sitting quietly for hours while the integrity of the samples they submitted was verified? no? me neither... ergo their integrity probably wasn't verified and so the results of that part of the test are next to useless...
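
(for the curious, verifying sample integrity isn't exotic, it's just slow - at minimum you'd check each submitted file against a reference set of already-verified samples before letting it anywhere near a test, and anything unrecognized would need proper analysis and replication testing first... here's a minimal python sketch of that first step only - the hash entry and directory name are hypothetical placeholders, this is not anything untangle actually ran...)

```python
import hashlib
from pathlib import Path

# hypothetical reference set: hashes of samples already verified
# (e.g. confirmed to actually replicate) in a controlled lab
KNOWN_VERIFIED = {
    "44d88612fea8a8f36de82e1278abb02f",  # placeholder entry
}

def md5_of(path: Path) -> str:
    """hash a file in chunks so large samples don't load into memory."""
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def vet_submissions(sample_dir: str) -> None:
    """flag anything that isn't a known, verified sample --
    those files need full analysis before they can be used."""
    for path in sorted(Path(sample_dir).iterdir()):
        status = "known" if md5_of(path) in KNOWN_VERIFIED else "UNVERIFIED"
        print(f"{status:>10}  {path.name}")

vet_submissions("audience_submissions")  # hypothetical directory
```

and that's just the fast part - anything flagged UNVERIFIED would still need real analysis, which is exactly the hours-long wait the audience wasn't going to sit through...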

further, who are the people submitting these samples? where do they come from and what company(ies) or organization(s) are they associated with? do they have ties of any kind with any of the products being tested? there's a potential for some significant sample selection bias here (magnified considerably by the ridiculously small sample size)...

and clamav? i generally don't bash actual products, but clamav? while some people may argue that anti-virus products are a commodity now, that doesn't mean that absolutely all of them are created equal - even if they are a commodity there are still going to be products on the fringes for which such general statements don't apply, and unfortunately clamav is just such a product... the reason for this is that while av products may be a commodity, the expertise involved in creating virus scanning engines is not... it's highly specialized and few people have those skills - those that do are employed commercially and bound by employment contracts in such a way that they can't contribute to the clamav open source project...

and that's just the engine development - there's also considerable expertise involved in processing malware samples to add to a product's database, and the people with those skills are also generally employed commercially... why do you think the number of entries in clamav's database has only ever been a fraction (generally about 1/3) of what commercial products have... there are all sorts of projects for which open source works, but in order for it to work the knowledge required has to be fairly common and widely available in the community - something which just isn't true for av scanning engine/signature development...

at first blush it seemed to me that this av fight-club was a publicity stunt more than a reliable comparative review... then i dug a little deeper and found a couple of things:
  1. the organization (untangle) conducting the fight-club uses clamav in their own product and so cannot be trusted not to cook the results in that engine's favour in some way or another (e.g. sample selection, test design, etc.)...
  2. the organization that conducted the test is distributing live viruses to the public, which is totally unacceptable - the centers for disease control doesn't hand out free samples and neither should people dealing with computer viruses...


the untangle documentation on their av fight club spins an interesting yarn about their dealings with other testing bodies and how those bodies don't want to test clamav... though i can't speak for the more well-regarded testing organizations, if you want to know why they don't bother testing clamav you needn't look any further than the paltry 144,368 entries in its database (at the time of writing)... many products have passed 300,000 and some are as high as 500,000 (all figures include non-viral malware, obviously)... i'm sure they probably recheck clamav every once in a while to see if it's significantly improved (ie. can it reliably detect polymorphic viruses yet?) but so long as it's only detecting a fraction of what the other products detect it's not worth the time required to test it regularly...

and before anyone starts jumping up and down about how clam is only missing the 'old' malware that shouldn't be a problem anymore, let me first reiterate that old viruses never die (people are still looking for stoned.empire.monkey removal help 15 years after it was released), and let me also point out that considering the growth rate of malware, the 'old' stuff should be a minority of what's out there, not the majority...

lastly, and this is not something i thought to mention originally but randy made a good point: including the eicar test file in this sort of test shows just how seriously (or rather, how un-seriously) you should take these test results... it's sort of like giving the competitors marks for getting their names right... not only is detection of the test file almost guaranteed for all products, its detection has no bearing on a product's ability to detect viruses because the test file isn't a virus or even virus-like in any way...
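
(to make that point concrete: the eicar test file is nothing but a fixed, publicly documented 68-byte ascii string that vendors agreed to flag on purpose... here's a python sketch of a complete, working eicar 'detector' - one string comparison, which is exactly why detecting it proves nothing about detecting viruses...)

```python
# the entire eicar "virus": a published 68-byte ascii string
# (per the spec it must start the file, optionally followed by whitespace)
EICAR = rb"X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*"

def is_eicar(path: str) -> bool:
    """a complete eicar 'detector': one fixed-string comparison,
    no code analysis of any kind."""
    with open(path, "rb") as f:
        return f.read(68) == EICAR

print(is_eicar("eicar.com"))  # True for the standard test file
```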

Update: it seems the composition of the test set may be different than i thought... mcafee are reporting that there are actually 6 eicar samples... they may be right, i don't know - none of the documentation says exactly how many of their samples came from eicar, and since eicar really only has the one test file i assumed it only contributed that much to the test... the mcafee folks (in an effort to replicate the results) downloaded and examined the actual test set, something i refused to do (i wasn't going to denounce the distribution of viruses in one breath and then greedily gobble them up in the next, it's just not me), so they're probably right about how many of the samples are the eicar file... i can't imagine how they got 6 eicar samples, though, since eicar only provides the one test file in 3 different forms (4 if you think changing the file extension qualifies)... at any rate, sextuple-counting the eicar test file is just weird...

4 comments:

Anonymous said...

Yes, it's hard to knock open source at the best of times without looking like the big bully - since it is free, after all! - and these sneaky vendors know it. Well done for your article.

kurt wismer said...

well, i certainly don't want to knock open source, especially not open source in general, as i use quite a few open source tools...

but the more i think about it, the more i'm convinced that in the open source world, quality requires that the community have talent to spare (ie. more talent than the commercial market can easily absorb)... there are so many commercial entities in the scanner market, and so many more looking to break into it, that i can't see where the spare talent for an open source project would come from - unless perhaps there's a really talented iconoclast out there somewhere...

thankfully that hasn't been a problem for browsers, operating systems, media players, office suites, etc... any market where there are relatively few players (strangely, just about anything microsoft has been in for an extended period of time) seems to have plenty of talent to spare...

Anonymous said...

I also use open source and would not knock 99% of open source products. I think you have misunderstood my post.
You ARE knocking an open source product. Don't be ashamed of it! Your reaction explains my point beautifully! It is politically incorrect to criticise, or be seen to criticise, ANY open source, and the vendor knows this, hence they get away with a highly dubious "test".

kurt wismer said...

it doesn't seem to me like they're getting away with anything... i'm not the only one pointing out the problems with the test... i'm just the only one so far who can touch this part of it without it sounding like sour grapes/vendor rivalry...