Anti-Virus Community

Malware authors and people in the anti-virus community have one thing in common: there isn't a lot written about either. The anti-virus community is shaped by a number of external forces, including outside perceptions of the community, customer demands, and legal minefields.

The most common perception about the anti-virus community is a conspiracy theory. Anti-virus companies have the most to gain by a steady stream of malware, so the argument goes, and anti-virus companies conveniently know how to defend against any new threats.

No evidence supports this theory, and the evidence that does exist contradicts it. If the theory were true, then anti-virus companies would want to boost revenue with the least amount of effort on their part - a rational plan.

Any malware that wasn't noted by current or potential customers would therefore be wasted effort, and anti-virus researchers would work no more than was necessary. There is lots and lots of malware that doesn't attract attention, though; not just variants but entire families of malware can go unnoticed by most anti-virus customers.

Monitoring anti-virus updates and comparing that information to malware-related media stories is a good demonstration of this fact. The sheer volume of malware is inconsistent with the conspiracy theory, too, because far more effort is being expended by anti-virus researchers than would be necessary to sustain the industry.

Anti-virus researchers do benefit from staying ahead of malware writers, even if they don't produce the malware themselves. Researchers may monitor web sites frequented by malware writers for up-and-coming threats, especially so-called "virus exchange" or "vX" sites.

Malware writers have also been known to send their latest creations directly to anti-virus companies, which tends to support the motivation of malware writing as an intellectual thrill rather than a destructive pursuit.

A workday for an anti-virus researcher is long, to begin with. An 80-hour work week is not uncommon for researchers, which can obviously exact a personal toll. Samples of potential malware candidates can be captured by anti-virus companies' own defensive systems, like firewalls and honeypots.

Malware samples may also be submitted by customers. Conceptually, there are two databases kept: one with known malware, the other with known malware-free, or "clean" files. Any submission is first checked against both these databases, in order to avoid re-analyzing a submission and to respond to customers as quickly as possible.

If the submission is absent from both databases, then it must be analyzed. There is still no guarantee that the submission is malicious, so this is the next thing to determine; if the answer is negative, then the clean file database can be updated with the result.
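
The two-database check described above can be sketched as follows. This is a minimal illustration rather than any vendor's actual pipeline: keying both databases by SHA-256 digest, and the names `Verdict`, `fingerprint`, `triage`, and `record_clean`, are assumptions made for the example.

```python
import hashlib
from enum import Enum


class Verdict(Enum):
    KNOWN_MALWARE = "known malware"
    KNOWN_CLEAN = "known clean"
    NEEDS_ANALYSIS = "needs analysis"


def fingerprint(sample: bytes) -> str:
    """Identify a sample by its SHA-256 digest (an assumed lookup key)."""
    return hashlib.sha256(sample).hexdigest()


def triage(sample: bytes, malware_db: set, clean_db: set) -> Verdict:
    """Check a submission against both databases before any analysis."""
    key = fingerprint(sample)
    if key in malware_db:
        return Verdict.KNOWN_MALWARE
    if key in clean_db:
        return Verdict.KNOWN_CLEAN
    # Absent from both databases: a researcher has to look at it.
    return Verdict.NEEDS_ANALYSIS


def record_clean(sample: bytes, clean_db: set) -> None:
    """Analysis found nothing malicious, so remember the file as clean."""
    clean_db.add(fingerprint(sample))
```

Only submissions that come back as `NEEDS_ANALYSIS` cost researcher time; once one has been judged harmless and passed to `record_clean`, a repeat submission of the same file is answered immediately.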

Otherwise, for replicating malware, a large number of samples are produced to ensure that all manifestations of the malware variant can be detected. (Virus writers can try to derail this process by having their viruses mutate slowly.) Adding detection to the anti-virus software comes next.

The result is verified against both databases, because detection of the new malware shouldn't interfere with existing detection, nor should it cause false positives. Testing will also try to catch problems on any of the platforms that the anti-virus software runs on.
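
The verification step can be pictured as a regression test over both databases. The sketch below is hypothetical: it assumes the updated detection logic is available as a function from a sample's bytes to a yes/no answer, and the name `verify_update` and the problem labels are invented for illustration.

```python
from typing import Callable, Iterable, List, Tuple


def verify_update(
    detector: Callable[[bytes], bool],
    known_malware: Iterable[bytes],
    known_clean: Iterable[bytes],
    new_samples: Iterable[bytes],
) -> List[Tuple[str, bytes]]:
    """Return a list of problems; an empty list means the update passes."""
    problems = []
    # Existing detection must not be disturbed by the new addition.
    for sample in known_malware:
        if not detector(sample):
            problems.append(("lost detection", sample))
    # Nothing in the clean file database may be flagged.
    for sample in known_clean:
        if detector(sample):
            problems.append(("false positive", sample))
    # Every generated sample of the new malware must be caught.
    for sample in new_samples:
        if not detector(sample):
            problems.append(("missed new sample", sample))
    return problems
```

In practice the same check would be repeated on each platform the product supports, which this sketch does not attempt to model.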

For this reason alone, writing anti-virus software is more challenging than malware writing, because malware doesn't have a customer base that complains if something goes wrong. Finally, the malware database gets updated and the customer is notified.

Most anti-virus companies have online "malware encyclopedias" which provide details about malware to the public, and these would also be updated at this time. While a workday for an anti-virus researcher may be long, the workday for an anti-virus company may be endless.

Anti-virus companies may maintain offices worldwide, strategically located in different time zones, so that around-the-clock security coverage can be given to their customers.

Anti-virus customers have certain expectations of their anti-virus software, which can be simply stated: 100% perfect detection of known and unknown threats, with no false positives. This is an impossible task, of course.

Complicating matters is that different customers may want different "threats" to be detected. Techniques used by anti-virus software may be applied more generally to locate many types of programs - this is called gray area detection. Anti-virus software may be employed to look for:

  • Jokes and games. "Joke" executables and games may be completely harmless, yet having them may violate corporate IT policies.
  • Cracking tools. The legitimacy of programs like password crackers and port scanners may depend on context. System administrators can use these programs to check for vulnerabilities and weak passwords in their own systems, but other users possessing these may be cause for alarm.
  • Adware. Spyware is now largely recognized as a threat, but adware may also pose a risk of leaking information to another party. Some people see adware as performing a useful function, and it's not always obvious what programs have been installed quietly, and what programs have been deliberately installed by a user.
  • Remote administration tools. Again, RATs may provide a useful service, but their presence may also constitute a security breach or a policy violation.
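
Because different customers want different gray area programs reported, one way to picture this is as a per-customer policy. The sketch below is purely illustrative: the category names, the `GrayAreaPolicy` class, and the idea of exempting administrators are assumptions, not a description of any real product's configuration.

```python
from dataclasses import dataclass, field

# Hypothetical category labels for the gray area programs listed above.
GRAY_AREA_CATEGORIES = frozenset(
    {"joke", "cracking_tool", "adware", "remote_admin"}
)


@dataclass
class GrayAreaPolicy:
    # Categories this customer wants reported.
    flagged: set = field(default_factory=set)
    # Categories tolerated when the program belongs to an administrator.
    admin_exempt: set = field(default_factory=set)

    def should_report(self, category: str, owner_is_admin: bool = False) -> bool:
        if category not in GRAY_AREA_CATEGORIES:
            raise ValueError(f"unknown gray area category: {category}")
        if owner_is_admin and category in self.admin_exempt:
            return False
        return category in self.flagged
```

A corporate policy might flag jokes, cracking tools, and remote administration tools while exempting cracking tools on administrators' machines, whereas a home user might flag only adware.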

Gray area detection is a delicate matter, because vendors of legitimate software may object to having their product negatively classified by anti-virus software, and there may be legal ramifications for doing so.

Some anti-virus vendors attempt to forestall legal action, especially for spyware, through an appeals process which software producers can follow if they feel that their software has been misclassified. More generally, the threat of legal action is possible for any false positive.

Engineering

Malware is often categorized based on where it's located. Malware is said to be in the wild if it's actively spreading or otherwise functioning on anyone's computer.

Malware not in the wild, which only exists in malware collections and anti-virus research labs, is in the zoo. Accurately determining whether malware is actually in the wild requires omniscience in the general case, so an approximation is used.

An organization called the WildList Organization has a worldwide membership of anti-virus experts who verify malware occurrences and report their data, which is combined to form the WildList, a (presumably close) approximation of the malware in the wild at any given time.

Malware on the WildList is confusingly referred to as being In the Wild (ItW). This means that malware can be in the wild but not In the Wild, but something In the Wild must be in the wild. Hopefully that clarifies things.
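
The relationship between the two terms is a subset relationship, which a few lines of code can make concrete. The malware names here are made up, and the "truly in the wild" set is exactly the thing that cannot be known in practice; it appears only to illustrate the terminology.

```python
def describe(name: str, truly_in_wild: set, wildlist: set) -> str:
    """Classify a piece of malware using the terminology above."""
    if name in wildlist:
        return "In the Wild (ItW)"
    if name in truly_in_wild:
        return "in the wild, but not In the Wild"
    return "in the zoo"


# Hypothetical names; the first set is unknowable in the general case.
truly_in_wild = {"Alpha", "Beta", "Gamma"}
wildlist = {"Alpha", "Beta"}

# Everything on the WildList has been verified as spreading,
# so the WildList is a subset of what is really in the wild.
assert wildlist <= truly_in_wild
```

Here "Gamma" is spreading but has not been reported to the WildList, and anything in neither set exists only in collections and research labs.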

An argument can be made, from an engineering point of view, that the only threats that need to be detected are those that are in the wild, since anything in the zoo cannot pose a direct threat.

Anti-virus software could potentially be made smaller and faster by only detecting malware in the wild, whose numbers can be several orders of magnitude lower than the total number of threats.

From a marketing point of view, however, this would be a bad idea. If company A advertises that they protect against 100,000 threats, and company B's product only guards against 500 threats - even if they're really the only ones that are in the wild - then company B is at a competitive disadvantage.

Marketing is somewhat of a sore spot in the anti-virus community in any case. Product claims of detecting 100% of known and unknown threats are obviously silly, and misrepresentation is one possible legal concern.

Open Questions

There are a number of interesting questions which (at least at this time) have no obvious answer.

  • Anti-virus products are installed in computer systems in an ideal place to perform any number of tasks, like gray area detection. Should anti-virus software...
      • ... supply a firewall? This is clearly in the realm of computer security, yet integrating firewall and anti-virus software may make both defenses vulnerable to attack by reducing the amount of software diversity.
      • ... provide content filtering? As a further form of gray area detection, content filtering would block objectionable content - or any content that might violate IT policy - from being received. Filtering might also watch outgoing content, since sending offensive material (either intentionally, or through zombies) could damage a company's reputation.
      • ... perform spam detection? Anti-spam is a growing concern for anti-virus companies, although spam detection techniques have comparatively little overlap with malware detection techniques.
      • ... apply software patches? Where technical weaknesses are exploited by worms, for example, anti-virus disinfection may only be temporary if the vulnerability used as an infection vector is still present. Even so, the safest approach for anti-virus software is probably to leave the relevant software patches unapplied, since applying them may accidentally break software, introducing more customer support and liability issues.
  • Anti-virus researchers perform reverse engineering and decompilation legitimately as part of their jobs, and also routinely decipher "protection measures." It's unlikely that any malware author will take them to task for this, but researchers may also trace into legitimate code or need to understand undocumented software mechanisms. At what point does this run afoul of copyright laws?
  • Users of anti-virus software may occasionally be presented with quarantined files to assess. Are there situations in which looking at these files, or the data within them, violates privacy laws? This may be even riskier in the case of a false positive.
  • For computer owners, use of anti-virus software is a widespread practice. Does this mean that computer owners are liable for negligence if they don't use anti-virus software? Do anti-virus companies have a captive market?