XML

An XSLT Primer

Because XSL-FO support in current major web browsers is extremely limited, practical use of XSL on the Web must focus on XSLT for the time being. This isn't entirely a bad thing when you consider the learning curve for XSL in general. By staggering the adoption of the two technologies, the W3C may be inadvertently giving developers time to get up to speed with XSLT before tackling XSL-FO. The remainder of this tutorial focuses on XSLT and how you can use it to transform XML documents.

As you now know, the purpose of an XSLT style sheet is to process the nodes of an XML document and apply a pattern-matching mechanism to determine which nodes are to be transformed. Both the pattern-matching mechanism and the details of each transformation are spelled out in an XSLT style sheet. More specifically, an XSLT style sheet consists of one or more templates that describe patterns and expressions, which are used to match XML content for transformation purposes. The three fundamental constructs in an XSL style sheet are as follows:

  • Templates

  • Patterns

  • Expressions

Before getting into these constructs, however, you need to learn about the xsl:stylesheet element and learn how the XSLT namespace is used in XSLT style sheets. The stylesheet element is the document (root) element for XSL style sheets and is part of the XSLT namespace. You are required to declare the XSLT namespace in order to use XSLT elements and attributes. Following is an example of declaring the XSLT namespace inside of the stylesheet element:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

This example namespace declaration sets the prefix for the XSLT namespace to xsl, which is the standard prefix used in XSLT style sheets. You must precede the names of all XSLT elements with this prefix; the attributes of those elements, such as version in this example, are written without it. Notice in the code that the XSLT namespace is http://www.w3.org/1999/XSL/Transform. Another important aspect of this code is the version attribute, which sets the version of XSLT used in the style sheet. Web browsers natively support only XSLT 1.0, so you should set the version attribute to 1.0 in style sheets intended for them; the later versions, XSLT 2.0 and 3.0, require a processor that supports them.
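To put the declaration in context, here is a minimal sketch of a complete style sheet file. It contains no templates yet; the XML declaration and the UTF-8 encoding are assumptions made for the sake of a self-contained example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The xsl prefix is bound to the XSLT namespace on the root element -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Templates go here; every XSLT element name carries the xsl prefix -->
</xsl:stylesheet>
```

Notice that the stylesheet element must be closed like any other XML element, and that everything else in the style sheet lives inside it.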

By the Way

The XSLT namespace is specific to XSLT and does not apply to all of XSL. If you plan on developing style sheets that use XSL-FO, you'll also need to declare the XSL-FO namespace, which is http://www.w3.org/1999/XSL/Format and typically has the prefix fo. Furthermore, if you plan on using XSLT to transform web pages, it's a good idea to declare the XHTML namespace: http://www.w3.org/1999/xhtml.


Templates

A template is an XSL construct that describes output to be generated based upon certain pattern-matching criteria. The idea behind a template is to define a transformation mechanism that applies to a certain portion of an XML document, which is a node or group of nodes. Although it is possible to create style sheets consisting of a single template, you will more than likely create multiple templates to transform different portions of the XML document tree.

Templates are defined in XSL style sheets using the xsl:template element, which is primarily a container element for patterns, expressions, and transformation logic. The xsl:template element uses an optional attribute named match to match patterns and expressions in an XSLT style sheet. You can think of the match attribute as specifying a portion of the XML tree for a document. The widest possible match for a document is to set the match attribute to /, which indicates that the root of the tree is to be matched. This results in the entire tree being selected for transformation by the template, as the following code demonstrates:

<xsl:template match="/">
...
</xsl:template>

If you have any experience with databases, you might recognize the match attribute as being somewhat similar to a query in a database language. To understand what I mean by this, consider the following example, which uses the match attribute to match only elements named state:

<xsl:template match="state">
...
</xsl:template>

This template is useful for XML documents that have elements named state. For example, the template would match the state element in the following XML code:

<contact>
  <name>Frank Rizzo</name>
  <address>1212 W 304th Street</address>
  <city>New York</city>
  <state>New York</state>
  <zip>10011</zip>
</contact>

Matching a portion of an XML document wouldn't mean much if the template didn't carry out any kind of transformation. Transformation logic is created using several template constructs that are used to control the application of templates in XSL style sheets. These template constructs are actually elements defined in the XSLT namespace. Following are some of the more commonly used XSLT elements:

  • xsl:value-of Inserts the value of an element or attribute

  • xsl:if Performs a conditional test and processes its content only when the test is true

  • xsl:for-each Loops through the elements in a document

  • xsl:apply-templates Applies a template in a style sheet

A crucial part of XSLT document transformation is the insertion of document content into the result tree, which is carried out with the xsl:value-of element. The xsl:value-of element provides the mechanism for transforming XML documents because it allows you to output XML data in virtually any context, such as within HTML markup. The xsl:value-of element requires an attribute named select that identifies the specific content to be inserted. Following is an example of a simple template that uses the xsl:value-of element and the select attribute to output the value of an element named title:

<xsl:template match="title">
  <xsl:value-of select="."/>
</xsl:template>

In this example, the select attribute is set to ., which indicates that the current node is to be inserted into the result tree. The value of the select attribute works very much like the path of a file on a hard drive. For example, a file on a hard drive might be specified as \docs\letters\lovenote.txt. This path indicates the folder hierarchy of the file lovenote.txt. In a similar way, the select attribute specifies the location of the node to be inserted in the result tree. A dot (.) indicates a node in the current context, as determined by the match attribute. An element or attribute name indicates a node beneath the current node, whereas two dots (..) indicate the parent of the current node. This approach to specifying node paths using a special expression language is covered in much greater detail in Tutorial 22.
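The three path forms can be sketched together in one small style sheet. This example assumes an input document whose root element is a single book element with title and author children; the literal text " wrote " is included only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input: <book><title>...</title><author>...</author></book> -->
  <xsl:template match="book">
    <!-- A name selects a child of the current node (the book) -->
    <xsl:apply-templates select="author"/>
  </xsl:template>

  <xsl:template match="author">
    <!-- A dot selects the current node (the author) -->
    <xsl:value-of select="."/>
    <xsl:text> wrote </xsl:text>
    <!-- Two dots step up to the parent book; title is then its child -->
    <xsl:value-of select="../title"/>
  </xsl:template>
</xsl:stylesheet>
```

Inside the author template, the current node is the author element, so the title has to be reached by stepping up to the parent book first.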

To get an idea as to how the previous example template (matching title elements) can be used to transform XML code, take a look at the following code excerpt:

<book>
  <title>All The King's Men</title>
  <author>Robert Penn Warren</author>
</book>
<book>
  <title>Atlas Shrugged</title>
  <author>Ayn Rand</author>
</book>
<book>
  <title>Ain't Nobody's Business If You Do</title>
  <author>Peter McWilliams</author>
</book>

Applying the previous template to the title elements in this code produces the following output:

All The King's Men
Atlas Shrugged
Ain't Nobody's Business If You Do

As you can see, the titles of the books are plucked out of the code because the template matched title elements and then inserted their contents into the resulting document.
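If you want to try this yourself, the title template needs to live in a complete style sheet. The following is one possible sketch. It assumes the book elements are wrapped in a root element named books, which is a hypothetical name because the excerpt doesn't show the root element. It also uses a template for the root to select only the title elements; without that step, an XSLT processor's built-in rules would copy the author text to the output as well.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Process only the title elements; "books" is an assumed wrapper element -->
  <xsl:template match="/">
    <xsl:apply-templates select="books/book/title"/>
  </xsl:template>

  <!-- Emit each title followed by a line break -->
  <xsl:template match="title">
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The xsl:output element asks for plain text output, and the character reference in the xsl:text element adds the line break after each title.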

In addition to inserting XML content using the xsl:value-of element in a style sheet, it is also possible to conditionally carry out portions of the logic in a style sheet. More specifically, the xsl:if element is used to perform conditional tests in templates. Unlike the xsl:template element, which uses the match attribute, the xsl:if element uses a required attribute named test, which holds an expression that is evaluated as true or false; the content of the element is processed only if the result is true. In the expression, a string such as TN must be enclosed in single quotes; without them, TN would be interpreted as the name of a child element. Following is an example of how the xsl:if element is used to test whether the value of a state attribute is equal to TN:

<xsl:if test="@state='TN'">
  <xsl:apply-templates select="location"/>
</xsl:if>

This code might be used as part of an online mapping application. Notice in the code that the state attribute is preceded by an "at" symbol (@); this symbol is used in XPath to identify an attribute, as opposed to an element. Another important aspect of this code is the manner in which the location template is applied only if the state attribute is equal to TN. The end result is that only the location elements whose state attribute is set to TN are processed for transformation.
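Keep in mind that xsl:if has no "else" branch. When you need an either/or decision, XSLT provides the xsl:choose element with xsl:when and xsl:otherwise children. The following sketch shows both side by side; it assumes location elements that carry city and state attributes, and the paragraph text is made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Assumed input: location elements carrying city and state attributes -->
  <xsl:template match="location">
    <!-- xsl:if has no else branch: this paragraph appears only for TN -->
    <xsl:if test="@state = 'TN'">
      <p><xsl:value-of select="@city"/> is in Tennessee.</p>
    </xsl:if>

    <!-- xsl:choose covers the either/or case -->
    <xsl:choose>
      <xsl:when test="@state = 'TN'">
        <p>In state</p>
      </xsl:when>
      <xsl:otherwise>
        <p>Out of state</p>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```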

If you have any programming experience, you are no doubt familiar with loops, which allow you to repeatedly perform an operation on a number of items. If you don't have programming experience, understand that a loop is a way of performing an action over and over. In the case of XSLT, loops are created with the xsl:for-each element, which is used to loop through elements in a document. The xsl:for-each element requires a select attribute that determines which elements are selected as part of the loop's iteration. Following is an example of using the xsl:for-each element to iterate through a list of locations:

<xsl:for-each select="locations/location">
  <h1><xsl:value-of select="@city"/>, <xsl:value-of select="@state"/></h1>
  <h2><xsl:value-of select="description"/></h2>
</xsl:for-each>

In this example, the xsl:for-each element is used to loop through location elements that are stored within the parent locations element. Within the loop, the city and state attributes are inserted into the result tree, along with the description element. This template is interesting in that it uses carefully placed HTML elements to transform the XML code into HTML code that can be viewed in a web browser. Following is some example code to which you might apply this template:

<locations>
  <location city="Washington" state="DC">
    <description>The United States Capital</description>
  </location>
  <location city="Nashville" state="TN">
    <description>Music City USA</description>
  </location>
  <location city="Chicago" state="IL">
    <description>The Windy City</description>
  </location>
</locations>

Applying the previous template to this code yields the following results:

<h1>Washington, DC</h1>
<h2>The United States Capital</h2>
<h1>Nashville, TN</h1>
<h2>Music City USA</h2>
<h1>Chicago, IL</h1>
<h2>The Windy City</h2>

As you can see, the template successfully transforms the XML code into XHTML code that is capable of being viewed in a web browser. Notice that the cities and states are combined within large heading elements (h1), followed by the descriptions, which are coded in smaller heading elements (h2).

In order for a template to be applied to XML content, you must explicitly apply the template with the xsl:apply-templates element. The xsl:apply-templates element supports the familiar select attribute, which performs a similar role to the one it does in the xsl:for-each element. When the XSLT processor encounters an xsl:apply-templates element in a style sheet, it takes each node selected by the expression in the select attribute and applies the template whose match pattern fits that node, which means that relevant document data is fed into the template and transformed. Following is an example of applying a template using the xsl:apply-templates element:

<xsl:apply-templates select="location"/>

This code results in the template for the location element being invoked for each location child of the current node.

By the Way

The exception to the rule of having to use the xsl:apply-templates element to apply templates in an XSLT style sheet is the root node (/), whose template is automatically applied if one exists.
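To see how xsl:apply-templates and xsl:template work as a pair, consider the following sketch of a complete style sheet. It assumes the locations document shown earlier in this tutorial, and it should produce the same headings as the xsl:for-each version, wrapped in html and body elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Applied automatically: matches the root of the document -->
  <xsl:template match="/">
    <html>
      <body>
        <!-- Hand each location element to the template below -->
        <xsl:apply-templates select="locations/location"/>
      </body>
    </html>
  </xsl:template>

  <!-- Runs once for every location selected above -->
  <xsl:template match="location">
    <h1><xsl:value-of select="@city"/>, <xsl:value-of select="@state"/></h1>
    <h2><xsl:value-of select="description"/></h2>
  </xsl:template>
</xsl:stylesheet>
```

Whether you loop with xsl:for-each or delegate with xsl:apply-templates is largely a matter of style in a small style sheet like this one. Separate templates tend to pay off as the number of element types grows.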

Patterns and Expressions

Patterns and expressions are used in XSLT templates to perform matches and are ultimately responsible for determining what portions of an XML document are passed through a particular template for transformation. A pattern describes a branch of an XML tree, which in turn consists of a set of hierarchical nodes. Patterns are used throughout XSL to describe portions of a document tree for applying templates. Patterns can be constructed to perform relatively complex pattern-matching tasks. When you think of patterns in this light, they form somewhat of a mini-query language that can be used to provide exacting controls over the portions of an XML document that are selected for transformation in templates.

As you learned earlier, the syntax used by XSL patterns is somewhat similar to that used when specifying paths to files on a disk drive. For example, the contacts/contact/phone pattern selects phone elements that are children of a contact element, which itself is a child of a contacts element. It is possible, and often useful, to select the entire document tree in a pattern, which is carried out with a single forward slash (/). This pattern is also known as the root pattern. A path that begins with a forward slash, such as /contacts/contact/phone, starts at the root of the document, which means that contacts must be the root element for the document. A path without the leading slash is not anchored to the root: in a select attribute it is evaluated relative to the current node, and in a match attribute it matches the described nodes wherever they appear in the document.
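The following sketch shows a few patterns at work in a single style sheet. It assumes a document whose root element is contacts, containing contact elements that in turn contain phone elements; that structure is taken from the pattern above rather than from a real document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- The root pattern: matches the root of the document tree -->
  <xsl:template match="/">
    <!-- From the root, this path reaches every contact under contacts -->
    <xsl:apply-templates select="contacts/contact"/>
  </xsl:template>

  <!-- Matches contact elements -->
  <xsl:template match="contact">
    <!-- Relative path: the phone children of the current contact -->
    <xsl:apply-templates select="phone"/>
  </xsl:template>

  <!-- Matches phone elements whose parent is a contact element -->
  <xsl:template match="contact/phone">
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```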

Expressions are similar to patterns in that they also impact which nodes are selected for transformation. However, expressions are capable of carrying out processing of their own, such as mathematical calculations, text processing, and conditional tests. XSL includes numerous built-in functions that are used to construct expressions within style sheets. Following is a simple example of an expression:

<xsl:value-of select="sum(*/@price)"/>

This code demonstrates how to use the standard sum() function to calculate the sum of the price attributes of the child elements of the current node. This could be useful in a shopping cart application that needs to calculate a subtotal of the items located in the cart.
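As a rough sketch of the shopping cart idea, the following style sheet combines a calculation, a count, and a conditional test. The cart and item element names, the sample prices, and the free shipping threshold are all hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Assumed input:
       <cart>
         <item name="Pen" price="1.50"/>
         <item name="Notebook" price="4.25"/>
       </cart> -->
  <xsl:template match="cart">
    <!-- count() returns the number of item children -->
    <p>Items: <xsl:value-of select="count(item)"/></p>
    <!-- sum() adds up the price attribute of every item child -->
    <p>Subtotal: <xsl:value-of select="sum(item/@price)"/></p>
    <!-- The greater-than sign is escaped as &gt; inside the attribute -->
    <xsl:if test="sum(item/@price) &gt; 5">
      <p>Qualifies for free shipping</p>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

For the assumed input, the count is 2 and the subtotal is 5.75, so the free shipping paragraph is included in the output.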

Admittedly, this discussion isn't the last word on XSL patterns and expressions. Fortunately, you will learn a great deal more about patterns and expressions in Tutorial 22. In the meantime, this introduction will get you started creating XSLT style sheets.