Eighth-Generation Processors

As of 2001, it had been about 15 years since PCs had begun to support 32-bit processors (all processors from the 80386 up through the Intel Pentium 4 and AMD Athlon XP). However, in 2001, Intel introduced the first 64-bit processor for servers—the Itanium—followed in 2002 by the improved Itanium 2.

In 2003, AMD introduced the first 64-bit processor for desktop computers—the Athlon 64—followed by its first 64-bit server processor, the Opteron. The following sections discuss the major features of these processors and the different approaches taken by Intel and AMD to bring 64-bit computing to the PC server and desktop.

Intel Itanium and Itanium 2

Introduced on May 29, 2001, the Itanium was the first processor in Intel's IA-64 (Intel Architecture 64-bit) product family, and it incorporated innovative performance-enhancing architecture techniques, such as predication and speculation. It and its newer sibling, the Itanium 2 (introduced in June 2002), are the highest-end processors from Intel and are designed mainly for the server market.

If Intel were still using numbers to designate its processors, the Itanium family might be called the 886 because the Itanium and Itanium 2 are the eighth-generation processors in the Intel family, and they represent the most significant processor architecture advancement since the 386.

Itanium and Itanium 2 Specifications

Intel's IA-64 product family is designed to expand the capabilities of the Intel architecture to address the high-performance server and workstation market segments.

As with previous new processor introductions, the Itanium and Itanium 2 are not designed to replace the Pentium 4 or Pentium III, at least not at first. They feature an all-new design that is initially expensive and is found only in the highest-end systems, such as file servers or workstations.

The Itanium and Itanium 2 are the first Intel processors with three levels of integrated cache. Even though a few previous system designs featured L3 cache, the L3 cache was located on the motherboard and was therefore much slower. By building L3 cache into the cartridge (Itanium) or onto the processor die (Itanium 2), all three cache levels run at the full processor speed.

The following features apply to both Itanium and Itanium 2 processors:

  • 16TB (terabytes) physical memory addressing (44-bit address bus).

  • Full 32-bit instruction compatibility in hardware.

  • EPIC (explicitly parallel instruction computing) technology, which enables up to 20 operations per cycle.

  • Two integer and two memory units that can execute four instructions per clock.

  • Two FMAC (floating-point multiply accumulate) units with 82-bit operands.

  • Each FMAC unit is capable of executing two floating-point operations per clock.

  • Two additional MMX units are capable of executing two single-precision FP operations each.

  • A total of eight single-precision FP operations can be executed every cycle.

  • 128 integer registers, 128 floating-point registers, 8 branch registers, 64 predicate registers.
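
The per-clock floating-point figures and the address-bus width in the list above can be checked with a little arithmetic. The following is a minimal sketch that only restates the numbers quoted in the list; the function names are illustrative, and the clock speed passed to peak_sp_gflops() is an arbitrary example input rather than a figure from this section.

```python
# Figures taken from the feature list above.
FMAC_UNITS = 2       # floating-point multiply accumulate units
OPS_PER_FMAC = 2     # FP operations each FMAC unit executes per clock
MMX_UNITS = 2        # additional MMX units
OPS_PER_MMX = 2      # single-precision FP operations per MMX unit per clock
ADDRESS_BITS = 44    # width of the physical address bus

TB = 1 << 40         # one terabyte, counted in binary units


def peak_sp_ops_per_clock() -> int:
    """Peak single-precision FP operations per clock across all units."""
    return FMAC_UNITS * OPS_PER_FMAC + MMX_UNITS * OPS_PER_MMX


def peak_sp_gflops(clock_ghz: float) -> float:
    """Theoretical peak in GFLOPS for a given clock (example input only)."""
    return peak_sp_ops_per_clock() * clock_ghz


def addressable_terabytes(address_bits: int = ADDRESS_BITS) -> int:
    """Physical memory reachable with the given number of address bits."""
    return (1 << address_bits) // TB


if __name__ == "__main__":
    print(peak_sp_ops_per_clock())    # 8 operations per clock
    print(addressable_terabytes())    # 16 (TB)
```

These are theoretical peaks; sustained throughput depends on how well the compiler keeps every unit busy.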

Intel and Hewlett-Packard began jointly working on the Itanium processor in 1994. In October 1997, more than three years after they first disclosed their plan to work together on a new microprocessor architecture, Intel and HP officially announced some of the new processor's technical details.

Itanium is the first microprocessor based on the Intel Architecture 64-bit (IA-64) specification, which is also supported by Itanium 2. IA-64 is a completely different processor design that uses Very Long Instruction Words (VLIW), instruction predication, branch elimination, speculative loading, and other advanced techniques for extracting parallelism from program code. The Itanium series features elements of both CISC and RISC design.

The Itanium series incorporates a new architecture Intel calls explicitly parallel instruction computing (EPIC), which enables the processor to execute parallel instructions—several instructions at the same time. In the Itanium and Itanium 2, three instructions can be encoded in one 128-bit word, so that each instruction has a few more bits than today's 32-bit instructions.

The extra bits let the chip address more registers and tell the processor which instructions to execute in parallel. This approach simplifies the design of processors with many parallel-execution units and should let them run at higher clock rates.
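
The 128-bit instruction word described above can be sketched in a few lines. This example assumes the bundle layout given in the IA-64 architecture documentation rather than in this section: a 5-bit template in the low bits followed by three 41-bit instruction slots (5 + 3 x 41 = 128). The function names and the slot values are placeholders for illustration, not real IA-64 opcodes.

```python
TEMPLATE_BITS = 5     # assumed from IA-64 documentation: unit/stop hints
SLOT_BITS = 41        # assumed from IA-64 documentation: one instruction
SLOTS_PER_BUNDLE = 3
BUNDLE_BITS = TEMPLATE_BITS + SLOTS_PER_BUNDLE * SLOT_BITS   # 128

TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1
SLOT_MASK = (1 << SLOT_BITS) - 1


def pack_bundle(template: int, slots: tuple) -> int:
    """Pack a template and three instruction slots into one 128-bit word."""
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template must fit in 5 bits")
    if len(slots) != SLOTS_PER_BUNDLE:
        raise ValueError("a bundle holds exactly three instructions")
    word = template
    for index, slot in enumerate(slots):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("each instruction must fit in 41 bits")
        word |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return word


def unpack_bundle(word: int) -> tuple:
    """Split a 128-bit word back into (template, (slot0, slot1, slot2))."""
    template = word & TEMPLATE_MASK
    slots = tuple(
        (word >> (TEMPLATE_BITS + index * SLOT_BITS)) & SLOT_MASK
        for index in range(SLOTS_PER_BUNDLE)
    )
    return template, slots


if __name__ == "__main__":
    bundle = pack_bundle(0x10, (0x123, 0x456, 0x789))   # placeholder values
    print(BUNDLE_BITS, unpack_bundle(bundle))
```

The template field is where the compiler records which execution units the three instructions need and where groups of parallel instructions end, which is the "explicit" part of EPIC.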

Besides being capable of executing several instructions in parallel within the chip, the Itanium can be linked to other Itanium chips in a parallel processing environment, as can the Itanium 2. In addition to offering new features and running a completely new 64-bit instruction set, Itanium and Itanium 2 feature full backward compatibility with the current 32-bit Intel x86 software.

In this way, they support 64-bit instructions while retaining full compatibility with today's 32-bit applications. Full backward compatibility means they will run all existing applications as well as any new 64-bit applications. Unfortunately, because this is not the native mode for the processor, performance is not as good when executing 32-bit instructions as it is in the Pentium 4 and earlier chips.

To use the IA-64 instructions, programs must be recompiled for the new instruction set. This is similar to what happened in 1985, when Intel introduced the 386, the first 32-bit PC processor. The 386 gave us a platform for an advanced 32-bit operating system that tapped this new power.

To ensure immediate acceptance, the 386 and future 32-bit processors still ran 16-bit code. To take advantage of the 32-bit capability first found in the 386, new software would have to be written. Unfortunately, software evolves much more slowly than hardware. It took Microsoft a full 10 years after the 386 debuted to release Windows 95, the first mainstream 32-bit operating system for Intel processors.

That won't happen with the Itanium and Itanium 2, which already have support from four operating systems, including Microsoft Windows (XP 64-bit Edition and 64-bit Windows Advanced Server Limited Edition 2002), Linux (from four distributor companies: Red Hat, SuSE, Caldera, and Turbo Linux), and two Unix versions (Hewlett-Packard's HP-UX and IBM's AIX).

Despite the immediate OS support, it will likely take several years before the mainstream software market shifts to 64-bit operating systems and software. The installed base of 32-bit processors is simply too great.

The backward-compatible 32-bit mode of the Itanium family runs 32-bit software directly in hardware rather than through software emulation. Still, this does not perform as well as a native 32-bit chip.

Itanium and Itanium 2 initially have been based on 0.18-micron technology; however, later versions will move to 0.13-micron, allowing for higher speeds and larger caches. Itaniums come in a new package called the pin array cartridge (PAC). This cartridge includes L3 cache and plugs into a PAC418 (418-pin) socket on the motherboard and not a slot.

The package is about the size of a standard index card, weighs about 6oz. (170g), and has an alloy metal on its base to dissipate the heat. Itanium has clips on its sides, enabling four of them to be hung from a motherboard, both below and above.

The Itanium's pin array cartridge

Itanium 2 was code named McKinley and officially introduced in June 2002. Because Itanium 2 has a significantly higher CPU bus bandwidth (6.4GBps), a higher clock speed, and an integrated on-die L3 cache that is twice as wide (128 bits) as that of the original Itanium, the Itanium 2 is about twice as fast in overall processing.

The Itanium 2 integrates all three levels of cache inside the processor die, so a cartridge is unnecessary. The Itanium and Itanium 2 are not interchangeable and are supported by different Intel chipsets. Following McKinley will be Madison, a version of the Itanium 2 based on 0.13-micron technology.

The Itanium 2 is a more compact design than the original Itanium. Photograph used by permission of Intel Corporation.

AMD Athlon 64

The AMD Athlon 64, introduced in the second half of 2003, is the first 64-bit processor for desktop computers. Originally code named ClawHammer, the Athlon 64 is the desktop element of AMD's two-part 64-bit processor family, which also includes the Opteron (code named SledgeHammer) server processor.

The major features of the Athlon 64 design include

  • Single-channel 72-bit (64-bit plus ECC support) PC2700 (DDR 333) memory interface integrated into the processor (instead of the North Bridge or MCP, as in other recent chipsets)

  • 128KB L1 cache

  • 256KB or 1MB L2 cache

  • Actual clock speeds of 1.6GHz–2GHz initially

  • New x86-64 architecture (extends 32-bit x86 architecture)

  • 6.4GBps HyperTransport link to chipset

  • Addressable memory size that exceeds the 4GB limit imposed by 32-bit processors
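
The PC2700 rating in the memory interface bullet above follows from the transfer rate and the bus width. The sketch below works through that arithmetic; it assumes DDR333 means two transfers per cycle of a 166.67MHz clock (about 333 million transfers per second) and that only the 64 data bits of the 72-bit interface carry payload, with the other 8 reserved for ECC. The function name is illustrative.

```python
from fractions import Fraction

# Assumption: DDR333 = 2 transfers per clock at 166 2/3 MHz.
DDR333_MEGATRANSFERS = Fraction(1000, 3)   # million transfers per second
DATA_BITS = 64                             # 72-bit interface minus 8 ECC bits


def peak_bandwidth_mb_s(megatransfers, data_bits: int) -> float:
    """Peak memory bandwidth in decimal MB per second."""
    bytes_per_transfer = Fraction(data_bits, 8)
    return float(megatransfers * bytes_per_transfer)


if __name__ == "__main__":
    bandwidth = peak_bandwidth_mb_s(DDR333_MEGATRANSFERS, DATA_BITS)
    # Roughly 2667MBps, which the module naming rounds to "PC2700".
    print(round(bandwidth))
```

The result is a theoretical ceiling for a single channel; real transfers are slowed by refresh cycles and access latency.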

Although AMD has been criticized by many, including me, for its confusing performance-rating processor names in the Athlon XP series, AMD also uses this naming scheme with the Athlon 64. As I suggest with the Athlon XP, you should look at the actual performance of the processor with the applications you use most to determine whether the Athlon 64 is right for you and which model is best suited to your needs.

The integrated memory bus in the Athlon 64 means that the Athlon 64 connects to memory more directly than any 32-bit chip and makes North Bridge design simpler. AMD offers its own chipsets for the Athlon 64, but most Athlon 64 motherboards and systems use third-party chipsets from the same vendors that now produce Athlon XP chipsets.

The Athlon 64 introduces a new processor socket to PCs: Socket 754. Socket 754, which at 754 pins is the largest socket ever developed for an x86 processor, looks similar to the newer Pentium 4 Socket 478 and Xeon Socket 603 and uses the same type of mPGA connectors to improve performance and reliability.

To improve cooling, often a problem with the faster Athlon XP processors, the Athlon 64 has a 20% larger die surface. As with the Pentium 4 processor, motherboards for the Athlon 64 also require the ATX12V connector to provide adequate 12V power.

The initial version of the Athlon 64 is built on a 0.13-micron process. The second version, code named San Diego (desktop version) and Odessa (portable version), is due in late 2003 and will be AMD's first processor to use the 0.09-micron process.

AMD Opteron

The AMD Opteron is the workstation and server counterpart to the AMD Athlon 64, supporting the same x86-64 architecture as the Athlon 64. The Opteron was introduced in the spring of 2003.

The following are the major features of the Opteron:

  • 128KB L1 cache

  • 1MB L2 cache

  • Initial clock speeds of 1.8GHz–2GHz

  • Three 6.4GBps HyperTransport links to chipset

  • 940-pin socket

  • Integrated memory controller

  • 128-bit plus ECC dual-channel memory bus

  • Maximum addressable memory of 1 terabyte (40-bit physical) and 256 terabytes (48-bit virtual)

  • x86-64 architecture
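
The addressable-memory figures in the list above follow directly from the number of address bits. As a minimal sketch (the function name and unit labels are illustrative), the following converts an address width into a memory size in binary units, and also shows where the 4GB ceiling of 32-bit processors comes from.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB"]


def address_space(bits: int) -> str:
    """Return the memory reachable with the given number of address bits."""
    size = 1 << bits          # 2**bits distinct byte addresses
    unit = 0
    # Scale down by 1,024 while the value divides evenly.
    while size >= 1024 and size % 1024 == 0 and unit < len(UNITS) - 1:
        size //= 1024
        unit += 1
    return f"{size}{UNITS[unit]}"


if __name__ == "__main__":
    print(address_space(32))   # 4GB: the limit of a 32-bit address
    print(address_space(40))   # 1TB: Opteron physical addressing
    print(address_space(48))   # 256TB: Opteron virtual addressing
```

Each additional address bit doubles the reachable memory, so going from 32 to 40 bits multiplies the physical limit by 256.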

Unlike the Itanium series, which has been supported primarily by Intel chipsets, the Opteron has broad third-party chipset support from companies such as VIA, SiS, ALi, NVIDIA, and ATI (just like the Athlon 64 does).

AMD's x86-64 Architecture

Although the Athlon 64, like the Intel Itanium and Itanium 2, is a 64-bit processor, it takes a completely different approach to 64-bit operation than Intel does. As you saw in the previous section, the Itanium/Itanium 2 processors support a new architecture called IA-64, which requires the recompilation of software for full performance.

Instead of requiring that software be recompiled for best performance, the Athlon 64 supports AMD's new x86-64 architecture. Because x86-64 is based on the current x86 architecture but is optimized for better access to large amounts of RAM, processors that use x86-64 architecture, such as the Athlon 64 and Opteron, run existing 32-bit code faster than current 32-bit processors and support new and extended instructions.

The advantages of the x86-64 architecture over the Itanium series' IA-64 architecture include

  • Immediate speed increases with current software.

  • The instruction set is an extension of the x86 architecture for easier recompilation to support x86-64.

  • Gradual movement to 64-bit processing.
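
Whether a given program has actually made the move to 64 bits depends on how it was compiled, not only on the processor it runs on. As a rough illustration, the sketch below reports the pointer width of the running Python interpreter by using the standard struct and platform modules; the function name is illustrative, and Python is used here only as a convenient way to show the idea.

```python
import platform
import struct


def pointer_bits() -> int:
    """Width in bits of a native pointer in this interpreter build."""
    # "P" is the struct format code for a native pointer (void *).
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    # A 32-bit build reports 32 even on a 64-bit processor, because
    # 32-bit code keeps running unchanged on x86-64 hardware.
    print(platform.machine(), pointer_bits())
```

A 32-bit build remains limited to a 32-bit address space until it is recompiled, which is the gradual migration path x86-64 is designed to allow.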

It will be interesting to see whether x86-64 or IA-64 becomes the more popular method for moving the PC world into 64-bit computing. Given the history of gradual changes with backward compatibility in mind, x86-64 might be the winner, in spite of the theoretical advantages of IA-64.