HTML Versus XHTML

As you’ve seen, HTML is a markup language designed to combine text, multimedia, and hyperlinks to create a Web page. HTML is also a moving target, though, because several different versions have been introduced since it first appeared in the early 1990s.

Although each version has built upon the last, and most Web browsers are designed to be “backward-compatible” with previous versions, it’s important to know something about current and future versions of HTML. The World Wide Web Consortium (W3C) is responsible for creating the specifications that other companies adhere to (for the most part) when creating such things as Web browser applications and devices for viewing Web pages.

The W3C is an industry group, founded by Tim Berners-Lee, that includes most of the major players in the corporate world of Web development (such as Microsoft, Netscape, AOL, and AT&T). One of the tasks the W3C undertakes is maintaining the HTML specification. Because technology is always changing, the W3C constantly works on new versions of the HTML standard.

Every so often, it publishes working drafts that attempt to codify the advances in technology and capabilities of HTML and the Web, while keeping in mind the needs of the majority of Web browsers and users. (For instance, the W3C might reject or alter an element that one of the browser companies invents because it only works in visual Web browsers, leaving out users of text-based browsers or browsers for the visually impaired.)

After a working draft has been published and bandied about by peers and the public for a while, it becomes final and is published as the official recommendation. Web browsers and authoring tools then implement any parts of the recommendation that they haven’t already and release new versions of their products. (By the time the specification is official, most companies have already rolled in the majority of the new elements discussed at the recommendation stage.)

While browser companies aren’t forced to follow the specification set by the W3C, failing to do so means that pages created by Web authors may not work the same way from one browser to the next. So, most of the browser companies attempt to keep up with the standards. The HTML specification has gone through this updating process many times, from the earliest version (informally known as HTML 1.0) to the most recent HTML 4.01 standard (finished in 1999).

Since then, HTML development has been focused on making HTML’s core elements compatible with XML (eXtensible Markup Language), a newer standard that is designed to be a foundation for many other markup languages. XML can be used to create and define markup languages that are specific to certain applications, industries, and so on.

Because of the power of XML, one of the W3C’s recent goals has been to recast, or rewrite, HTML in XML so that the standards are compatible. At the same time, it’s done everything it could to keep the new HTML as similar to the old HTML as possible, so as not to introduce too many compatibility problems.

The result of this recasting of HTML is called XHTML. While it may seem that changing the name to XHTML would mean it’s a really big deal, the truth is that the current version, XHTML 1.0, is only slightly different from its predecessor, HTML 4.01. XHTML does have a few differences, but mostly it’s just a bit stricter than HTML has been in the past, requiring that authors be more diligent in the way they write their Web pages.
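To make that extra strictness concrete, here is a minimal sketch in Python. The language choice, the sample markup, and the helper name is_well_formed are illustrative assumptions, not part of the original text. Because XHTML is XML, an ordinary XML parser can read it, while loosely written HTML with unclosed elements or unquoted attribute values is rejected. Note that this only checks XML well-formedness; it does not validate against the XHTML DTD, so it would not catch other XHTML rules such as lowercase element names.

```python
import xml.etree.ElementTree as ET

# Older, looser HTML habits: unclosed <p> and <br>, unquoted attribute values.
LOOSE_HTML = "<body><p>First paragraph<p>Second line<br><img src=logo.gif alt=Logo></body>"

# The same content written the XHTML way: every element closed, every value quoted.
XHTML = '<body><p>First paragraph</p><p>Second line<br /><img src="logo.gif" alt="Logo" /></p></body>'


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(LOOSE_HTML))  # False
print(is_well_formed(XHTML))       # True
```

A Web browser would happily display both fragments; the difference only shows up when the markup is handed to software that expects XML.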

Overall, though, it’s easy enough to grasp. Why the new standard? Essentially, as more Web browsers support XML, XHTML will become just one of many XML-based markup languages that can be understood and displayed by browsers and other applications. That makes it possible, for example, to create a math-specific markup language to display complex mathematical formulae in pages that are rendered by XML-compliant applications.
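As a rough illustration of how two XML-based languages can share one page, the sketch below embeds a small formula written in MathML (the W3C's math markup language, used here as one example of a math-specific language) inside an XHTML document. The page content and the helper name vocabulary are assumptions made for this example. Each language is identified by its XML namespace, so an XML-aware program can tell which elements belong to which language.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# A hypothetical page: XHTML for the text, MathML for the formula A = pi * r^2.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Area of a circle</title></head>
  <body>
    <p>The area is:</p>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mi>A</mi><mo>=</mo><mi>&#960;</mi><msup><mi>r</mi><mn>2</mn></msup>
    </math>
  </body>
</html>"""


def vocabulary(element: ET.Element) -> str:
    """Return the namespace part of an ElementTree tag like '{namespace}name'."""
    return element.tag[1:].split("}")[0]


# Count how many elements belong to each markup language.
counts = {}
for element in ET.fromstring(PAGE).iter():
    namespace = vocabulary(element)
    counts[namespace] = counts.get(namespace, 0) + 1

print(counts[XHTML_NS])   # 5 (html, head, title, body, p)
print(counts[MATHML_NS])  # 7 (math, mi, mo, mi, msup, mi, mn)
```

Whether a given browser actually draws the formula depends on its MathML support; the point here is only that the combined document is still ordinary XML.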

Strict adherence to the XHTML standard will also make the future a bit easier to cope with. Already many different types of devices and applications are used to access the Web, from phones and handheld computers to devices used by the physically challenged. XHTML is designed to take all those browsers into account. The better your code conforms to the standard, the better it will render in a variety of circumstances.

It may seem obvious that you should use the latest standard, XHTML 1.0, but it actually isn’t quite that simple. The problem is that even within the XHTML 1.0 standard, there are two basic approaches: a strict approach and a transitional approach. While using strict XHTML would seem ideal, doing so can have an unintended drawback: it can fail to work in older Web browser applications.

Although the vast majority of computer users upgrade their Web browsers fairly regularly, there are still quite a few older computers out there, with older browsers that may not recognize all the changes XHTML requires. So, you have to decide if you’ll work with strict XHTML or transitional XHTML. In fact, you have to declare one or the other within your Web document.
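That declaration takes the form of a DOCTYPE line at the very top of the document. The sketch below shows the two XHTML 1.0 DOCTYPE declarations inside otherwise hypothetical sample pages, along with a small helper (declared_flavor, a name invented for this example) that reports which one a document uses. It is a simple pattern match for illustration, not a validator.

```python
import re

# Sample page declaring strict XHTML 1.0.
STRICT_PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Strict page</title></head>
<body><p>Hello</p></body>
</html>"""

# Sample page declaring transitional XHTML 1.0.
TRANSITIONAL_PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Transitional page</title></head>
<body><p>Hello</p></body>
</html>"""

DOCTYPE_PATTERN = re.compile(
    r'<!DOCTYPE\s+html\s+PUBLIC\s+"-//W3C//DTD XHTML 1\.0 (Strict|Transitional)//EN"'
)


def declared_flavor(document: str) -> str:
    """Return 'strict', 'transitional', or 'unknown' based on the DOCTYPE line."""
    match = DOCTYPE_PATTERN.search(document)
    return match.group(1).lower() if match else "unknown"


print(declared_flavor(STRICT_PAGE))        # strict
print(declared_flavor(TRANSITIONAL_PAGE))  # transitional
print(declared_flavor("<html></html>"))    # unknown
```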