Published: Wednesday, December 03, 2003

The XHTML Way
By Vlad Alexander


Introduction
If you're starting a new Web project or enhancing an existing Web site, you face the same dilemma: Which version of HTML should I use? Many developers shy away from the latest Web standards because they think it's an either/or choice and aren't ready to commit 100% to the new standards. This article will give you the background you need to make an informed decision and show you how you can gradually transition to the latest Web standards by successfully combining HTML 4 with the latest XHTML.


A Brief History
I don't like to look back at old technology, but putting XHTML into its proper context requires a quick trip down memory lane. The timeline below shows key milestones in the evolution of HTML. With each milestone, there have been significant changes to the standard. In 2000, a change to the standard brought a change in the name (XHTML) and version number (1.0). Some saw this as the death of HTML and the birth of a new markup syntax. Others (myself included) see it as just another milestone in the evolution of HTML.

This chart shows the significant milestones in the history of XHTML.

HTML 4
HTML 4 is the markup syntax we are all familiar with. Along with new features like scripting, HTML 4 introduced support for Cascading Style Sheets (CSS) and made it easier to write more accessible code for users with disabilities. Since the language was very easy to write, a wave of WYSIWYG editors sprang up, permitting non-technical users to author rich content for the first time. However, because the language was so easy to write, it also encouraged mistakes and, in their rush to imitate word processors, WYSIWYG editors generated markup that was considered "dirty."

The problem stems from the fact that HTML itself does not enforce any formatting or structuring guidelines. Add to this the fact that browsers will gleefully render sloppy and malformed HTML, and you have yourself a recipe for disaster. Instead of tackling the problem at its source and making sure that the markup these editors generated was clean, tools like HTML Tidy were used to clean up dirty markup after the fact.

XHTML 1.0
What was missing in HTML 4 was a sense of professionalism: a mechanism that would enforce the rules of the language and prevent WYSIWYG authoring tools from generating bad code in the first place. XML, a sister standard to HTML, provided this mechanism. If the syntax of an XML document is incorrect, for example if tags are improperly nested or closing tags are missing, the document is not well-formed and an XML parser will reject it. Applying the rigorous rules of XML to HTML reformulated the language, and what emerged was XHTML 1.0. (For more information on XML and its formatting rules, be sure to read the FAQ What is XML?)
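
To see this enforcement in action, here is a minimal sketch that feeds three fragments to the XML parser in Python's standard library. The helper name is_well_formed and the sample fragments are my own illustrations, and a real XHTML check would also validate against the DTD, which this does not attempt.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False if it rejects it."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Illustrative fragments (not taken from any real page).
good = "<p>Hello <b>world</b></p>"
bad_nesting = "<p>Hello <b>world</p></b>"   # tags overlap instead of nesting
missing_close = "<p>Hello <b>world</p>"     # <b> is never closed

for fragment in (good, bad_nesting, missing_close):
    print(fragment, "->", is_well_formed(fragment))
```

A typical HTML 4 browser would render all three fragments without complaint; the XML parser accepts only the first.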

Confusingly, XHTML 1.0 came in three flavors and you could specify which flavor of the language you were using by inserting a line in the beginning of the document.

  • "XHTML 1.0 Strict" dropped elements like <font> and <basefont>, which HTML 4 had already declared outdated ("deprecated"), and allowed formatting only through Cascading Style Sheets: external, embedded or inline CSS.
  • "XHTML 1.0 Transitional" was less strict and retained most of the formatting model of HTML, including the use of the <font> element.
  • "XHTML 1.0 Frameset" was similar to XHTML 1.0 Transitional but also permitted the use of frames. (Frames are used to partition a browser's window into sections, with each section displaying content from a different Web page.)
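
The line that selects the flavor is the document type declaration (doctype). As a rough sketch, the lookup below maps each flavor to the doctype published in the W3C XHTML 1.0 Recommendation. The dictionary and the function doctype_for are hypothetical helpers of my own, not part of any library.

```python
# Doctype declarations for the three XHTML 1.0 flavors, as published by the W3C.
XHTML_DOCTYPES = {
    "strict": (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    ),
    "transitional": (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
    ),
    "frameset": (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">'
    ),
}


def doctype_for(flavor: str) -> str:
    """Return the doctype line for a flavor name (hypothetical helper)."""
    return XHTML_DOCTYPES[flavor.lower()]


print(doctype_for("Strict"))
```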

One advantage of XHTML 1.0 was that it displayed pages in Web browsers much faster than HTML 4 pages, the difference being most apparent in very long documents. This was because XHTML 1.0 followed the rules of XML, so parsing Web pages became much easier and required fewer CPU resources. Also, browsers did not need to clean up the structure of code before displaying the Web page, because Web pages written in XHTML 1.0 were well formed. Some WYSIWYG editors that natively generated HTML 4 were able to convert their code to XHTML 1.0 using clean-up tools like HTML Tidy.

XHTML 1.1
Apart from loading Web pages into browsers faster, most developers saw few other benefits to adopting XHTML 1.0. However, XHTML 1.1 offers developers one very significant benefit: it cleanly separates data from formatting. It does this by deprecating the style attribute, which discourages inline formatting. Instead, formatting is applied through CSS rules in external or embedded style sheets, typically tied to the content through the class attribute.

For developers of medium to large Web sites, the benefits of separating data from formatting are huge. First, in its "raw" state data becomes immediately more available to a wide range of devices and applications. Second, separating data from formatting has significant advantages for Web design. For instance, if you have ever maintained a Web site with many contributing authors, you know that some can't tell the difference between Arial and Times Roman. Some like 11 point font while others prefer putting everything in 14 point. And if you give a non-technical user a color-picker, you can be sure that no color on the palette will go unused. Since XHTML 1.1 does not permit random inline formatting of this type, but regulates presentation through external or embedded CSS, it is much easier to maintain the common look and feel of Web sites. Modifying the look and feel of entire Web pages or Web sites is also much simpler. Both can be achieved by making a few simple changes to one or more CSS files.

True, XHTML 1.1 requires a change in the way that Web pages are served, but the change is slight. It involves the "media type" information that the Web server normally returns to the browser when a page is requested. For HTML Web pages, the media type is text/html. For XHTML 1.1 Web pages, the media type should be application/xhtml+xml. For the many browsers that don't yet recognize this new media type, a W3C Note allows the continued use of the old text/html media type. However, rather than serving up XHTML 1.1 with the old media type, it's better practice to keep the old media type and serve up XHTML 1.1 content as XHTML 1.0 Strict. Do this by changing the doctype.
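
One way to act on this advice is to look at the Accept header the browser sends and pick the media type and doctype together. The sketch below is a simplified illustration under my own assumptions: the function names are hypothetical, and the header parsing handles only the q parameter rather than the full HTTP content negotiation rules.

```python
XHTML_MEDIA_TYPE = "application/xhtml+xml"
HTML_MEDIA_TYPE = "text/html"

XHTML11_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" '
    '"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">'
)
XHTML10_STRICT_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)


def accepts_xhtml(accept_header: str) -> bool:
    """True if the header lists application/xhtml+xml with a non-zero q value."""
    for item in accept_header.split(","):
        media_range, _, params = item.strip().partition(";")
        if media_range.strip().lower() != XHTML_MEDIA_TYPE:
            continue
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    return float(value) > 0  # q=0 means "not acceptable"
                except ValueError:
                    return False
        return True  # listed without a q parameter
    return False


def choose_response(accept_header: str) -> tuple:
    """Pick (media type, doctype) for a request (hypothetical helper)."""
    if accepts_xhtml(accept_header):
        return XHTML_MEDIA_TYPE, XHTML11_DOCTYPE
    # Older browsers: keep text/html and serve the page as XHTML 1.0 Strict.
    return HTML_MEDIA_TYPE, XHTML10_STRICT_DOCTYPE


print(choose_response("text/html,application/xhtml+xml,*/*;q=0.8")[0])
print(choose_response("text/html, image/gif, */*")[0])
```

The same page content is served either way; only the media type and the doctype line change.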

Content-managed Web Sites
Before deciding which version of HTML is right for you, let's look at how content-managed Web sites are built. Typically, they are built using a set of layout templates (ASP, PHP, ColdFusion, etc). These templates provide the general look and feel and navigation for the site, with placeholders for content (script that fetches data). When a site visitor requests a page, the layout template is combined with the data to produce the HTML Web page (see the diagram below).

This diagram shows content combined with a template to produce a Web page.

This is a solid and time-tested approach and virtually all content-managed sites are built in this way. Some store content in the database, others store it in XML documents on the file system or in plain text files, but the approach is essentially the same. However, over time, content usually needs to be re-purposed, syndicated and inserted into different page layouts. So while the way in which data is presented will change over time, content itself needs to remain highly available to any layout that needs it. The diagram below demonstrates this point.

This diagram shows the same content inserted into two different page layout templates.
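
The idea can be sketched in a few lines. The example below uses Python's string.Template as a stand-in for the ASP, PHP or ColdFusion templates mentioned earlier; the two layouts and the article record are invented for illustration.

```python
from string import Template

# Content as it might come from a database or an XML file (illustrative data).
article = {
    "title": "The XHTML Way",
    "body": '<p><span class="person">John Smith</span> spoke today.</p>',
}

# Two hypothetical layout templates with placeholders for the same content.
full_page = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1>$body</body></html>"
)
syndication = Template('<div class="teaser"><h2>$title</h2>$body</div>')


def render(template: Template, content: dict) -> str:
    """Combine a layout template with content to produce markup."""
    return template.substitute(content)


print(render(full_page, article))
print(render(syndication, article))
```

Because the body carries no formatting of its own, it drops into either layout unchanged.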

Only content that is free from formatting can be easily re-used in this way. In theory, every version from HTML 4 right through to XHTML 1.1 supports the separation of data from formatting, but only XHTML 1.1 actually enforces it. In the real world of content authoring, however, most WYSIWYG editors still generate code that fuses data and formatting together. This makes data more difficult to parse and reuse. Take this simple illustration. Let's say that an author decides to present people's names, within a news article, in the color green. This will generate the following code:

<font color="green">John Smith</font>

or

<span style="color: green">John Smith</span>

Problem: what if another Web site's policy is to display people's names in blue? On the surface, the solution seems easy – a simple "search and replace" on the word "green" within a color or a style attribute. But what if green is also being used to colorize something else? How confident would you be that your search and replace has not mistakenly replaced something it was not supposed to?
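
Here is a small sketch of how that search and replace can go wrong. The second green item, a program name, is a hypothetical addition of mine to show the collision.

```python
# Content with inline formatting: green is used for a person's name and,
# hypothetically, for a program name as well.
content = (
    '<p><font color="green">John Smith</font> praised the '
    '<font color="green">Recycling Program</font>.</p>'
)

# The "simple" fix: change every green to blue.
recolored = content.replace('color="green"', 'color="blue"')

print(recolored)
print("items recolored:", recolored.count('color="blue"'))
```

Both items turn blue, because nothing in the markup says which green meant "person" and which meant something else.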

A far better approach is to author content in such a way that the data is not compromised by inline formatting, by using an external or embedded style sheet. For example:

<span class="person">John Smith</span>

Each Web site that uses the data "John Smith" is now free to define the CSS rule that formats the person class in a way that meets its own common look and feel policy. For example:

span.person {color: blue}

Taking this one step further, what if a Web site for some reason wants to revert to using the <font> tag, instead of using CSS? Even this is quite easy to do by using an XSLT rule that transforms <span class="person"> to a <font> tag:

<xsl:template match="span[@class = 'person']">
   <font color="green"><xsl:value-of select="."/></font>
</xsl:template>

This example reveals one self-evident truth: it is possible to convert semantically rich markup to semantically barren markup, but not vice versa.
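
The same downgrade can be done without XSLT. As a rough Python equivalent (not the XSLT rule itself, and with a function name of my own choosing), the sketch below rewrites person spans as font elements using the standard library's ElementTree.

```python
import xml.etree.ElementTree as ET


def person_spans_to_font(xhtml_fragment: str, color: str) -> str:
    """Rewrite <span class="person"> elements as <font color="..."> elements."""
    root = ET.fromstring(xhtml_fragment)
    # Collect matches first so the tree is not modified while it is traversed.
    for element in list(root.iter("span")):
        if element.get("class") == "person":
            element.tag = "font"
            element.attrib.clear()  # drop the class attribute
            element.set("color", color)
    return ET.tostring(root, encoding="unicode")


fragment = '<p><span class="person">John Smith</span> spoke today.</p>'
print(person_spans_to_font(fragment, "green"))
```

Going the other way is the hard part: once only the font element is left, nothing records that the green text was a person's name.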

Fortunately, there are XHTML 1.1-compliant WYSIWYG editors, ones that enforce the separation of content and style. In Part 2 we'll look at one XHTML WYSIWYG editor in particular, and look at some general rules you can apply to your HTML markup today to help prepare it for a future of XHTML.

  • Read Part 2!

