Malware Defenition

It would be nice to present a clever taxonomy of malicious software, one that clearly shows how each type of malware relates to every other type. However, a taxonomy would give the quaint and totally incorrect impression that there is a scientific basis for the classification of malware.

In fact, there is no universally-accepted definition of terms like "virus" and "worm," much less an agreed-upon taxonomy, even though there have been occasional attempts to impose mathematical formalisms onto malware. Instead of trying to pin down these terms precisely, the common characteristics each type of malware typically has are listed.

Malware Types

Malware can be roughly broken down into types according to the malware's method of operation. Anti-"virus" software, despite its name, is able to detect all of these types of malware. There are three characteristics associated with these malware types.

  1. Self-replicating malware actively attempts to propagate by creating new copies, or instances, of itself. Malware may also be propagated passively, by a user copying it accidentally, for example, but this isn't self-replication.
  1. The population growth of malware describes the overall change in the number of malware instances due to self-replication. Malware that doesn't selfreplicate will always have a zero population growth, but malware with a zero population growth may self-replicate.
  1. Parasitic malware requires some other executable code in order to exist. "Executable" in this context should be taken very broadly to include anything that can be executed, such as boot block code on a disk, binary code in applications, and interpreted code. It also includes source code, like application scripting languages, and code that may require compilation before being executed.

Logic Bomb

  • Self-replicating: no.
  • Population growth: zero.
  • Parasitic: possibly.

A logic bomb is code which consists of two parts:

  1. A pay load, which is an action to perform. The payload can be anything, but has the connotation of having a malicious effect.
  1. A trigger, a boolean condition that is evaluated and controls when the payload is executed. The exact trigger condition is limited only by the imagination, and could be based on local conditions like the date, the user logged in, or the operating system version.

Triggers could also be designed to be set off remotely, or - like the "dead man's switch" on a train - be set off by the absence of an event.

Logic bombs can be inserted into existing code, or could be standalone. A simple parasitic example is shown below, with a payload that crashes the computer using a particular date as a trigger.

legitimate code if date is Friday the 13th: crashcomputer() legitimate code

Logic bombs can be concise and unobtrusive, especially in millions of lines of source code, and the mere threat of a logic bomb could easily be used to extort money from a company.

In one case, a disgruntled employee rigged a logic bomb on his employer's file server to trigger on a date after he was fired from his job, causing files to be deleted with no possibility of recovery. He was later sentenced to 41 months in prison.

Another case alleges that an employee installed a logic bomb on 1000 company computers, date-triggered to remove all the files on those machines; the person allegedly tried to profit from the downturn in the company's stock prices that occurred as a result of the damage.

Trojan Horse

  • Self-replicating: no.
  • Population growth: zero.
  • Parasitic: yes.

There was no love lost between the Greeks and the Trojans. The Greeks had besieged the Trojans, holed up in the city of Troy, for ten years. They finally took the city by using a clever ploy: the Greeks built an enormous wooden horse, concealing soldiers inside, and tricked the Trojans into bringing the horse into Troy. When night fell, the soldiers exited the horse and much unpleasantness ensued.

In computing, a Trojan horse is a program which purports to do some benign task, but secretly performs some additional malicious task. A classic example is a password-grabbing login program which prints authentic-looking "username" and "password" prompts, and waits for a user to type in the information.

When this happens, the password grabber stashes the information away for its creator, then prints out an "invalid password" message before running the real login program. The unsuspecting user thinks they made a typing mistake and reenters the information, none the wiser.

Trojan horses have been known about since at least 1972, when they were mentioned in a well-known report by Anderson, who credited the idea to D. J. Edwards.

Back Door

  • Self-replicating: no.
  • Population growth: zero.
  • Parasitic: possibly.

A back door is any mechanism which bypasses a normal security check. Programmers sometimes create back doors for legitimate reasons, such as skipping a time-consuming authentication process when debugging a network server. As with logic bombs, back doors can be placed into legitimate code or be standalone programs.

One special kind of back door is a RAT, which stands for Remote Administration Tool or Remote Access Trojan, depending on who's asked. These programs allow a computer to be monitored and controlled remotely.

Users may deliberately install these to access a work computer from home, or to allow help desk staff to diagnose and fix a computer problem from afar. However, if malware surreptitiously installs a RAT on a computer, then it opens up a back door into that machine.

Virus

  • Self-replicating: yes.
  • Population growth: positive.
  • Parasitic: yes.

A virus is malware that, when executed, tries to replicate itself into other executable code; when it succeeds, the code is said to be infected. The infected code, when run, can infect new code in turn. This self-replication into existing executable code is the key defining characteristic of a virus.

When faced with more than one virus to describe, a rather silly problem arises. There's no agreement on the plural form of "virus." The two leading contenders are "viruses" and "virii;" the latter form is often used by virus writers themselves, but it's rare to see this used in the security community, who prefer "viruses.

If viruses sound like something straight out of science fiction, there's a reason for that. They are. The early history of viruses is admittedly fairly murky, but the first mention of a computer virus is in science fiction in the early 1970s, with Gregory Benford's The Scarred Man in 1970, and David Gerrold's When Harlie Was One in 1972.

Both stories also mention a program which acts to counter the virus, so this is the first mention of anti-virus software as well. The earliest real academic research on viruses was done by Fred Cohen in 1983, with the "virus" name coined by Len Adleman.

Cohen is sometimes called the "father of computer viruses," but it turns out that there were viruses written prior to his work. Rich Skrenta's Elk Cloner was circulating in 1982, and Joe Dellinger's viruses were developed between 1981-1983; all of these were for the Apple II platform.

Some sources mention a 1980 glitch in Arpanet as the first virus, but this was just a case of legitimate code acting badly; the only thing being propagated was data in network packets.

Gregory Benford's viruses were not limited to his science fiction stories; he wrote and released nonmalicious viruses in 1969 at what is now the Lawrence Livermore National Laboratory, as well as in the early Arpanet. Some computer games have featured self-replicating programs attacking one another in a controlled environment.

Core War appeared in 1984, where programs written in a simple assembly language called Redcode fought one another; a combatant was assumed to be destroyed if its program counter pointed to an invalid Redcode instruction. Programs in Core War existed only in a virtual machine, but this was not the case for an earlier game, Darwin.

Darwin was played in 1961, where a program could hunt and destroy another combat ant in a non-virtual environment using a well-defined interface. In terms of strategy, successful combatants in these games were hard-to-find, innovative, and adaptive, qualities that can be used by computer viruses too.

Traditionally, viruses can propagate within a single computer, or may travel from one computer to another using human-transported media, like a floppy disk, CD-ROM, DVD-ROM, or USB flash drive. In other words, viruses don't propagate via computer networks; networks are the domain of worms instead.

However, the label "virus" has been applied to malware that would traditionally be considered a worm, and the term has been diluted in common usage to refer to any sort of self-replicating malware. Viruses can be caught in various stages of self-replication. A germ is the original form of a virus, prior to any replication.

A virus which fails to replicate is called an intended. This may occur as a result of bugs in the virus, or encountering an unexpected version of an operating system. A virus can be dormant, where it is present but not yet infecting anything - for example, a Windows virus can reside on a Unix-based file server and have no effect there, but can be exported to Windows machines."

Worm

  • Self-replicating: yes.
  • Population growth: positive.
  • Parasitic: no.

A worm shares several characteristics with a virus. The most important characteristic is that worms are self-replicating too, but self-replication of a worm is distinct in two ways. First, worms are standalone, and do not rely on other executable code. Second, worms spread from machine to machine across networks.

Like viruses, the first worms were fictional. The term "worm" was first used in 1975 by John Brunner in his science fiction novel The Shockwave Rider, (Interestingly, he used the term "vims" in the book too.)

Experiments with worms performing (non-malicious) distributed computations were done at Xerox PARC around 1980, but there were earlier examples. A worm called Creeper crawled around the Arpanet in the 1970s, pursued by another called Reaper which hunted and killed off Creepers.

A watershed event for the Internet happened on November 2, 1988, when a worm incapacitated the fledgling Internet. This worm is now called the Internet worm, or the Morris worm after its creator, Robert Morris, Jr. At the time, Morris had just started a Ph.D. at Cornell University.

He had been intending for his worm to propagate slowly and unobtrusively, but what happened was just the opposite. Morris was later convicted for his worm's unauthorized computer access and the costs incurred to clean up from it. He was fined, and sentenced to probation and community service.

Rabbit

  • Self-replicating: yes.
  • Population growth: zero.
  • Parasitic: no.

Rabbit is the term used to describe malware that multiplies rapidly. Rabbits may also be called bacteria, for largely the same reason. There are actually two kinds of rabbit.

The first is a program which tries to consume all of some system resource, like disk space. A "fork bomb," a program which creates new processes in an infinite loop, is a classic example of this kind of rabbit. These tend to leave painfully obvious trails pointing to the perpetrator, and are not of particular interest.

The second kind of rabbit, which the characteristics above describe, is a special case of a worm. This kind of rabbit is a standalone program which replicates itself across a network from machine to machine, but deletes the original copy of itself after replication.

In other words, there is only one copy of a given rabbit on a network; it just hops from one computer to another. Rabbits are rarely seen in practice.

Spyware

  • Self-replicating: no.
  • Population growth: zero.
  • Parasitic: no.

Spyware is software which collects information from a computer and transmits it to someone else. Prior to its emergence in recent years as a threat, the term "spyware" was used in 1995 as part of a joke, and in a 1994 Usenet posting looking for "spy-ware" information.

The exact information spyware gathers may vary, but can include anything which potentially has value:

  1. Usernames and passwords. These might be harvested from files on the machine, or by recording what the user types using a key logger. A keylogger differs from a Trojan horse in that a keylogger passively captures keystrokes only; no active deception is involved.
  2. Email addresses, which would have value to a spammer.
  3. Bank account and credit card numbers.
  4. Software license keys, to facilitate software pirating.

Viruses and worms may collect similar information, but are not considered spy ware, because spy ware doesn't self-replicate. Spy ware may arrive on a machine in a variety of ways, such as bundled with other software that the user installs, or exploiting technical flaws in web browsers.

The latter method causes the spyware to be installed simply by visiting a web page, and is sometimes called a drive-by download.

Adware

  • Self-replicating: no.
  • Population growth: zero.
  • Parasitic: no.

Adware has similarities to spyware in that both are gathering information about the user and their habits. Adware is more marketing-focused, and may pop up advertisements or redirect a user's web browser to certain web sites in the hopes of making a sale.

Some adware will attempt to target the advertisement to fit the context of what the user is doing. For example, a search for "Calgary" may result in an unsolicited pop-up advertisement for "books about Calgary."

Adware may also gather and transmit information about users which can be used for marketing purposes. As with spyware, adware does not self-replicate.

Hybrids, Droppers, and Blended Threats

The exact type of malware encountered in practice is not necessarily easy to determine, even given these loose definitions of malware types. The nature of software makes it easy to create hybrid malware which has characteristics belonging to several different types.

A classic hybrid example was presented by Ken Thompson in his ACM Turing award lecture. He prepared a special C compiler executable which, besides compiling C code, had two additional features:

  1. When compiling the login source code, his compiler would insert a back door to bypass password authentication.
  2. When compiling the compiler's source code, it would produce a special compiler executable with these same two features.

His special compiler was thus a Trojan horse, which replicated like a virus, and created back doors. This also demonstrated the vulnerability of the compiler tool chain: since the original source code for the compiler and login programs wasn't changed, none of this nefarious activity was apparent.

Another hybrid example was a game called Animal, which played twenty questions with a user. John Walker modified it in 1975, so that it would copy the most up-to-date version of itself into all user-accessible directories whenever it was run. Eventually, Animals could be found roaming in every directory in the system.

The copying behavior was unknown to the game's user, so it would be considered a Trojan horse. The copying could also be seen as self-replication, and although it didn't infect other code, it didn't use a network either - not really a worm, not really a virus, but certainly exhibiting viral behavior.

There are other combinations of malware too. For example, a dropper is malware which leaves behind, or drops, other malware. A worm can propagate itself, depositing a Trojan horse on all computers it compromises; a virus can leave a back door in its wake.

A blended threat is a virus that exploits a technical vulnerability to propagate itself, in addition to exhibiting "traditional" characteristics. This has considerable overlap with the definition of a worm, especially since many worms exploit technical vulnerabilities.

These technical vulnerabilities have historically required precautions and defenses distinct from those that anti-virus vendors provided, and this rift may account for the duplication in terms. The Internet worm was a blended threat, according to this definition.

Zombies Computers that have been compromised can be used by an attacker for a variety of tasks, unbeknownst to the legitimate owner; computers used in this way are called zombies.

The most common tasks for zombies are sending spam and participating in coordinated, large-scale denial-of-service attacks. Sending spam violates the acceptable use policy of many Internet service providers, not to mention violating laws in some jurisdictions.

Sites known to send spam are also blacklisted, marking sites that engage in spam-related activity so that incoming email from them can be summarily rejected. It is therefore ill-advised for spammers to send spam directly, in such a way that it can be traced back to them and their machines.

Zombies provide a windfall for spammers, because they are a free, throwaway resource: spam can be relayed through zombies, which obscures the spammer's trail, and a blacklisted zombie machine presents no hardship to the spammer.

As for denials of service, one type of denial-of-service attack involves either flooding a victim's network with traffic, or overwhelming a legitimate service on the victim's network with requests.

Launching this kind of attack from a single machine would be pointless, since one machine's onslaught is unlikely to generate enough traffic to take out a large target site, and traffic from one machine can be easily blocked by the intended victim.

On the other hand, a large number of zombies all targeting a site at the same time can cause grief. A coordinated, network-based denial-of-service attack that is mounted from a large number of machines is called a distributed denial-of-service attack, or DDoS attack.

Networks of zombies need not be amassed by the person that uses them; the use of zombie networks can be bought for a price. Another issue is how to control zombie networks.

One method involves zombies listening for commands on Internet Relay Chat (IRC) channels, which provides a relatively anonymous, scalable means of control. When this is used, the zombie networks are referred to as botnets, named after automated IRC client programs called bots.