Anti-Virus Techniques - Anti-Stealth

One assumption made up to this point is that anti-virus software sees an accurate picture of the data being checked for viruses. But what if a virus is using stealth to hide?

Anti-stealth techniques are countermeasures used against stealth viruses. There are two options:

  1. Detect and disable the stealth mechanism. For example, calls to the operating system can be examined to make sure they're going to the "right" place.
  2. Bypass the usual mechanisms to call the operating system in favor of unsubvertible ones. For Unix, this would mean that anti-virus software only uses direct system calls (assuming, of course, that the operating system kernel is secure); for MS-DOS systems, this could mean making direct BIOS calls to get disk data.
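As a rough illustration of the first option, the check below treats the operating system's dispatch table as a mapping from call number to handler address, and flags any entry that points outside the address ranges where legitimate handlers are expected to live. The table contents, the addresses, and the name find_hooked_entries are all hypothetical; real anti-stealth code would have to read the table from the live system, which this sketch does not attempt.

```python
def find_hooked_entries(dispatch_table, trusted_ranges):
    """Return call numbers whose handler lies outside every trusted range.

    dispatch_table: dict mapping call number -> handler address (simulated).
    trusted_ranges: list of (start, end) half-open address ranges assumed
                    to hold legitimate operating system code.
    """
    hooked = []
    for number, address in sorted(dispatch_table.items()):
        if not any(start <= address < end for start, end in trusted_ranges):
            hooked.append(number)
    return hooked


# Simulated table: entry 0x21 has been redirected to an unexpected address.
table = {0x13: 0xF0001234, 0x21: 0x0009F800, 0x2F: 0xF0004000}
trusted = [(0xF0000000, 0xF0010000)]
suspicious = find_hooked_entries(table, trusted)  # [0x21]
```

A check like this is only as good as its notion of the "right" place: the trusted ranges must come from a source the virus could not have tampered with.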

Macro Virus Detection

Macro viruses present some interesting problems for anti-virus software. Macros are in source form, which makes them easy to change and allows a lot of freedom with formatting. Macro language interpreters can be extremely robust in terms of bullishly continuing execution in the face of errors; a missing or damaged macro won't necessarily keep a macro virus from operating. Some specific problems with macro viruses:

  • Accidental or deliberate changes to a macro virus, even to its formatting, may create a new macro virus. This may even happen automatically: Microsoft Word converts documents from one version of Word to another, and this conversion has created new macro viruses in the process.
  • Bugs in macro virus propagation, or incomplete disinfection of a macro virus, can create new macro virus variants. Anti-virus software can accidentally create viruses if it's not careful!
  • A macro virus can accidentally "snatch" macros from an environment it infects, becoming a new virus. In one case, a Word macro virus even swiped two macros from Microsoft's software that protects against macro viruses.

Macro viruses, despite these problems, have one redeeming feature. Macros operate in a restricted domain, so anti-virus detection can determine what constitutes "normal" behavior with a very high degree of confidence. This limits the number of false positives that might otherwise be incurred by detection.

All of the same ideas have been trotted out for macro viruses as have been used for other types of virus, including signature scanning, static heuristics, behavior blocking, and emulation.

Due to variability in formatting, methods looking for static signatures are facilitated by removing whitespace and comments, or by translating the macro into some equivalent canonical form first. A similar need for canonicalization arises from macro languages which aren't case sensitive, where foo, FOO, and Foo would all refer to the same variable.
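A minimal sketch of such canonicalization follows. It assumes a VBA-like syntax in which an apostrophe starts a comment that runs to the end of the line and double quotes delimit string literals; it does not handle Rem comments or line continuations. The function names canonicalize and signature are illustrative only.

```python
import hashlib


def canonicalize(source):
    """Drop comments and whitespace and fold case, outside string literals."""
    lines = []
    for line in source.splitlines():
        out = []
        in_string = False
        for ch in line:
            if ch == '"':
                in_string = not in_string
                out.append(ch)
            elif in_string:
                out.append(ch)          # string contents are kept verbatim
            elif ch == "'":
                break                   # comment: ignore the rest of the line
            elif not ch.isspace():
                out.append(ch.lower())  # the language is case-insensitive
        if out:                         # skip lines that became empty
            lines.append("".join(out))
    return "\n".join(lines)


def signature(source):
    """Hash of the canonical form, usable as a static signature."""
    return hashlib.sha256(canonicalize(source).encode("utf-8")).hexdigest()


variant_a = 'Sub AutoOpen()\n    FOO = 1  \' set flag\n    MsgBox "Hi There"\nEnd Sub\n'
variant_b = 'sub autoopen()\n\nfoo=1\nmsgbox "Hi There"\n\' comment only\nEND SUB'
same = signature(variant_a) == signature(variant_b)  # True
```

The two variants differ in case, spacing, blank lines, and comments, yet reduce to the same canonical text, so one signature covers both.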

More systemic approaches to macro virus detection periodically examine documents on a system, and build a database of the documents and their properties. In particular, macros in documents can be tracked; the sudden appearance of macros in a document, a change to known macros in a document, or a number of documents with the same changes to their macros are all signals that a macro virus may be active.
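The three signals just described can be sketched as follows. This is an illustration under simplifying assumptions, not a description of any particular product: each document is modeled as a dictionary from macro name to macro source, the "database" is an in-memory dictionary, and the names macro_fingerprints, scan, and spread_threshold are invented for the example.

```python
import hashlib
from collections import defaultdict


def macro_fingerprints(macros):
    """Map each macro name to a hash of its source text."""
    return {name: hashlib.sha256(src.encode("utf-8")).hexdigest()
            for name, src in macros.items()}


def scan(database, snapshot, spread_threshold=2):
    """Compare a snapshot of documents with the database and update it.

    Returns a list of (document or tuple of documents, signal) pairs.
    Documents seen for the first time only establish a baseline.
    """
    alerts = []
    new_digest_docs = defaultdict(set)
    for doc, macros in snapshot.items():
        current = macro_fingerprints(macros)
        previous = database.get(doc)
        if previous is not None:
            if not previous and current:
                alerts.append((doc, "macros appeared"))
            for name, digest in current.items():
                if name in previous and previous[name] != digest:
                    alerts.append((doc, "macro changed: " + name))
            # Remember which documents gained which new macro content.
            for digest in set(current.values()) - set(previous.values()):
                new_digest_docs[digest].add(doc)
        database[doc] = current
    for docs in new_digest_docs.values():
        if len(docs) >= spread_threshold:
            alerts.append((tuple(sorted(docs)),
                           "same macro change in several documents"))
    return alerts


db = {}
first = scan(db, {"a.doc": {}, "b.doc": {"Fmt": "x"}})   # baseline, no alerts
second = scan(db, {"a.doc": {"AutoOpen": "evil"},
                   "b.doc": {"Fmt": "x", "AutoOpen": "evil"}})
```

In the second scan, a.doc is reported because macros appeared in it, and both documents are reported together because they gained identical macro content.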

Macro viruses have not been parasitic, meaning they have not inserted viral code into legitimate code, but have acted more like companion viruses. (Nothing prevents macro viruses from being parasitic; it's just slightly more effort to implement.) Disinfection strategies for macro viruses have consequently tended towards deletion-based approaches:

  • Delete all macros in the infected document, including any unfortunate, legitimate user macros.
  • Delete macros known to be associated with the virus found. This requires a known-macro-virus database.
  • For macro viruses detected using heuristics, remove the macros found to contain the offending behavior.
  • Emulator-based detection can track the macros seen to be used by the macro virus and delete them.
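The trade-off between these strategies can be seen in a small sketch. It again models a document's macros as a dictionary from name to source, and assumes a hypothetical database of hashes of known viral macros; none of the function names come from a real tool.

```python
import hashlib


def digest(source):
    """Hash of a macro's source, used as the key into the virus database."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def delete_all_macros(macros):
    """Bluntest strategy: legitimate user macros are lost as well."""
    return {}


def delete_known_macros(macros, known_virus_digests):
    """Remove only macros whose hash appears in the known-virus database."""
    return {name: src for name, src in macros.items()
            if digest(src) not in known_virus_digests}


def delete_flagged_macros(macros, flagged_names):
    """Remove macros singled out by heuristics or observed by an emulator."""
    return {name: src for name, src in macros.items()
            if name not in flagged_names}


document = {"AutoOpen": "viral code", "FormatTable": "user code"}
known = {digest("viral code")}
cleaned = delete_known_macros(document, known)  # keeps only FormatTable
```

Only the first strategy needs no knowledge of the virus, and it pays for that by destroying the user's own macros.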

Applications supporting macros now treat them in a much more guarded fashion than they once did, and as a result macro viruses are a much less prominent threat than they used to be.

Compiler Optimization

Compiler techniques have natural overlaps with anti-virus detection. For example, some scanning algorithms are applied to match patterns in trees, for code generation; scanning and parsing are needed for macro virus detection; work on efficient interpretation is applicable to emulation, and interpreting special-purpose code in the anti-virus engine.

One suggestion which rears its head occasionally is the possibility of using compiler optimizations for detection of viruses. Given that a number of compiler optimization techniques perform some sophisticated analyses, it isn't surprising to consider applying them to the problem of virus detection:

  • Constant propagation replaces variables which are defined as constants with the constants themselves. This increases the information available about code being analyzed, and facilitates other optimizations.
  • Dead code is code which is executed, but the results are never used. Polymorphic viruses tend to exhibit a lot of dead code - more than 25% - especially when compared to non-viral code, so dead code analysis can make a useful heuristic to help with polymorphic virus detection.
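Both analyses can be sketched on a toy intermediate form. The sketch assumes straight-line code with no branches and no side effects, where each instruction is a tuple (op, dest, sources...) and a source is either an integer constant or a register name; the instruction format and function names are invented for illustration. Dead code is measured with a backward liveness pass, starting from the registers assumed to be needed at the end.

```python
def propagate_constants(code):
    """Replace registers holding known constants and fold constant adds."""
    known = {}
    result = []
    for op, dest, *srcs in code:
        srcs = [known.get(s, s) if isinstance(s, str) else s for s in srcs]
        if op == "mov" and isinstance(srcs[0], int):
            known[dest] = srcs[0]
        elif op == "add" and all(isinstance(s, int) for s in srcs):
            known[dest] = srcs[0] + srcs[1]
            op, srcs = "mov", [known[dest]]   # the add became a constant load
        else:
            known.pop(dest, None)             # dest no longer a known constant
        result.append((op, dest, *srcs))
    return result


def dead_code_fraction(code, live_out):
    """Fraction of instructions whose result is never used."""
    live = set(live_out)
    dead = 0
    for op, dest, *srcs in reversed(code):
        if dest in live:
            live.discard(dest)
            live.update(s for s in srcs if isinstance(s, str))
        else:
            dead += 1                         # result is overwritten or unused
    return dead / len(code) if code else 0.0


code = [
    ("mov", "eax", 5),
    ("mov", "ebx", 7),             # feeds only the junk instruction below
    ("add", "ecx", "eax", "eax"),
    ("mov", "edx", "ebx"),         # junk: edx is never read
    ("add", "eax", "eax", "ecx"),
]
before = dead_code_fraction(code, {"eax"})      # 0.4
folded = propagate_constants(code)
after = dead_code_fraction(folded, {"eax"})     # 0.8
```

Two of the five original instructions are dead. After constant propagation the final instruction becomes a plain load of 15, which makes everything before it dead as well; this is one way in which propagation increases the information available to later analyses.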

However, some problems loom. Compiler optimization algorithms are not known for efficiency, with the exception of algorithms designed specifically for use in dynamic, or just-in-time, compilers. Such algorithms tend to trade speed increases for decreases in accuracy, though.

It is often possible to concoct programs which exercise the worst-case performance of optimization algorithms, or programs which make the task of precise analysis undecidable. Virus writers will undoubtedly take advantage of this if anti-virus software's use of compiler optimization becomes widespread.