The Web Design Group

Structure of an XHTML 1.0 Document

Elements and Tags

Elements are the structures that describe parts of an XHTML document. For example, the p element represents a paragraph while the em element gives emphasized content.

An element has three parts: a start tag, content, and an end tag. A tag is special text--"markup"--that is delimited by "<" and ">". An end tag includes a "/" after the "<". For example, the em element has a start tag, <em>, and an end tag, </em>. The start and end tags surround the content of the em element:

<em>This is emphasized text</em>

Element names must always be lower-case in XHTML, so <em> is allowed, but <eM> and <EM> are not.

Elements cannot overlap each other. If the start tag for an em element appears within a p, the em's end tag must also appear within the same p element.
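Because XHTML is XML, the nesting rule can be checked mechanically: an XML parser rejects overlapping elements. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The helper name is_well_formed is hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The em element starts and ends inside the same p element: properly nested.
print(is_well_formed("<p>Some <em>emphasized</em> text</p>"))

# The em element starts inside p but ends after it: the elements overlap.
print(is_well_formed("<p>Some <em>emphasized</p> text</em>"))

# Element names are case-sensitive, so </EM> does not close <em>.
print(is_well_formed("<p><em>text</EM></p>"))
```

The first call prints True and the other two print False. Note that this checks only well-formedness, not whether the elements used are legal XHTML.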

Some elements have no content and therefore need no separate end tag. These elements, such as the br element for line breaks, are written as a single tag that ends with " />", as in <br />. They are said to be empty.

Attributes

An element's attributes define various properties for the element. For example, the img element takes a src attribute to provide the location of the image and an alt attribute to give alternate text for those not loading images:

<img src="wdglogo.gif" alt="Web Design Group" />

An attribute is included in the start tag only--never the end tag--and takes the form attribute-name="attribute-value". In XHTML the attribute value must always be quoted, using either single or double quotes.

Attribute names must be lower-case. Attribute values may be case-sensitive.

Special Characters

Certain characters in XHTML are reserved for use as markup and must be escaped to appear literally. The "<" character may be represented with an entity, &lt;. Similarly, ">" is escaped as &gt;, and "&" is escaped as &amp;. If an attribute value contains a double quotation mark and is delimited by double quotation marks, then the quote should be escaped as &quot;.
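When XHTML is generated by a program, the escaping described above is usually delegated to a library routine. As a minimal sketch, Python's standard html.escape function performs these replacements. The wrapper names escape_text and escape_attribute are hypothetical, introduced only to separate the two cases.

```python
from html import escape


def escape_text(text: str) -> str:
    """Escape <, > and & for use in element content."""
    return escape(text, quote=False)


def escape_attribute(value: str) -> str:
    """Escape <, >, & and quotation marks for use in a quoted attribute value."""
    # With quote=True, html.escape also rewrites the single quote as &#x27;.
    return escape(value, quote=True)


print(escape_text("if a < b && b > c"))
# if a &lt; b &amp;&amp; b &gt; c

alt = escape_attribute('The "WDG" logo')
print(f'<img src="wdglogo.gif" alt="{alt}" />')
# <img src="wdglogo.gif" alt="The &quot;WDG&quot; logo" />
```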

Other entities exist for special characters that cannot easily be entered on some keyboards. For example, the copyright symbol ("©") may be represented with the entity &copy;. See the Entities section for a complete list of XHTML 1.0 entities.

As an alternative to entities, authors may also use numeric character references. Any character may be represented by a numeric character reference based on its "code position" in Unicode. For example, one could use &#169; for the copyright symbol or &#1575; for the Arabic letter ALEF.
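The number in a decimal numeric character reference is simply the character's Unicode code position. A short sketch in Python illustrates the correspondence, where the built-in ord function returns the code position. The function name numeric_reference is hypothetical.

```python
from html import unescape


def numeric_reference(char: str) -> str:
    """Return the decimal numeric character reference for one character."""
    return f"&#{ord(char)};"


# U+00A9 COPYRIGHT SIGN has code position 169.
print(numeric_reference("\u00a9"))  # &#169;

# U+0627 ARABIC LETTER ALEF has code position 1575.
print(numeric_reference("\u0627"))  # &#1575;

# Decoding the reference gives back the original character.
print(unescape("&#169;") == "\u00a9")  # True
```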

Comments

Comments in XHTML follow a simple rule: begin a comment with "<!--", end it with "-->", and do not use "--" within the comment.

<!-- An example comment -->
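For generated documents, the comment rule can be enforced before the comment is written out. The following Python sketch assumes a hypothetical helper, make_comment, that refuses any text containing "--".

```python
def make_comment(text: str) -> str:
    """Wrap text in comment delimiters, rejecting text that contains '--'."""
    if "--" in text:
        raise ValueError('comment text must not contain "--"')
    # The surrounding spaces keep a trailing "-" in the text away from "-->".
    return f"<!-- {text} -->"


print(make_comment("An example comment"))
# <!-- An example comment -->

try:
    make_comment("first part -- second part")
except ValueError as error:
    print("rejected:", error)
```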

A Complete XHTML 1.0 Document

An XHTML 1.0 document begins with a DOCTYPE declaration that declares the version of XHTML to which the document conforms. The html element follows and contains the head and body. The head contains information about the document, such as its title and keywords, while the body contains the actual content of the document, made up of block-level elements and inline elements. A basic XHTML 1.0 document takes on the following form:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>The document title</title>
  </head>
  <body>
    <h1>Main heading</h1>
    <p>A paragraph.</p>
    <p>Another paragraph.</p>
    <ul>
      <li>A list item.</li>
      <li>Another list item.</li>
    </ul>
  </body>
</html>

In a Frameset document, the frameset element replaces the body element.

Validating your XHTML

Each XHTML document should be validated to check for errors such as missing quotation marks (<a href="oops.html>Oops</a>), misspelled element or attribute names, and invalid structures. Such errors are not always apparent when viewing a document in a browser since browsers are designed to recover from an author's errors. However, different browsers recover in different ways, sometimes resulting in invisible text on one browser but not on others.

The WDG HTML Validator checks the validity of XHTML 1.0 documents.

Note that some programs claim to be validators but really are not. A validator checks a document against a formal document type definition (DTD), while other programs, such as linters, warn about valid but unsafe XHTML. Both kinds of programs are useful, but validation should never be skipped.
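A quick well-formedness check can catch some of the errors mentioned above, such as the missing quotation mark, before a document is sent to a validator. The sketch below uses Python's standard xml.etree.ElementTree module. The function name first_error is hypothetical. This is not a validator: it does not read the DTD, so it cannot detect misspelled element names or invalid structures.

```python
import xml.etree.ElementTree as ET


def first_error(markup: str):
    """Return (line, column) of the first well-formedness error, or None."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as error:
        # ParseError.position holds the line and column of the error.
        return error.position
    return None


good = '<p><a href="ok.html">OK</a></p>'
bad = '<p><a href="oops.html>Oops</a></p>'

print(first_error(good))  # None
print(first_error(bad) is not None)  # True: the unclosed quote is detected
```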