XML

XPath

Identifying sections of XML documents is an essential part of using Extensible Stylesheet Language (XSL) and XPath. XSL is based on the idea of identifying sections of an XML document and transforming them according to a set of rules. XPath provides a means for identifying sections of an XML document.

XPath is based on the idea of repeating patterns. XML documents develop distinctive patterns in the way their elements are presented and ordered. For example, in Chapter 5 we put an a element within a p element in the body element of an XML document to create an anchor to the top of the document. This established a pattern in which the a element was always included within a p element and the p element was included within the body element. We also had a pattern in which the a element was included in a p element within a td element; in this instance, the a element was used as a hyperlink. Thus, the body, p, a pattern represents an anchor and the td, p, a pattern represents a hyperlink.

If you can identify these two patterns in the document, you can use XSL to transform the two a elements in different ways. For example, the hyperlinks can be underlined and displayed in a specified color, and the anchor can be made invisible in the document. Pattern identification enables XSL to find specific elements and transform them in a specified manner. We'll have a detailed discussion of XSL in Chapter 12.
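The two patterns can be told apart with path expressions. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. The sample document and the names anchors and hyperlinks are made up for illustration and are not the Chapter 5 document itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical document containing both patterns described in the text.
SAMPLE = """
<html>
  <body>
    <p><a name="top"/></p>
    <table>
      <tr>
        <td><p><a href="http://example.com/">Example</a></p></td>
      </tr>
    </table>
  </body>
</html>
"""

root = ET.fromstring(SAMPLE)  # root is the html element

# body, p, a pattern: an a element in a p that is a direct child of body.
anchors = root.findall("./body/p/a")

# td, p, a pattern: an a element in a p that is a direct child of any td.
hyperlinks = root.findall(".//td/p/a")

print("anchors:", [a.attrib for a in anchors])
print("hyperlinks:", [a.attrib for a in hyperlinks])
```

Because each a element is matched by only one of the two paths, a stylesheet or program can treat the two groups differently.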

You can also use patterns to select and link to specific sections of a document. For example, you could create a link that finds all of the item_name elements in a purchase order document and returns a reference to these elements. In both XLink and XSL, XPath is used to identify portions of an XML document. Let's begin with location paths.
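As an illustration of the item_name example, here is a minimal sketch that again assumes Python's xml.etree.ElementTree module. Only the item_name element name comes from the text; the rest of the purchase order is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order; only item_name is taken from the text.
PURCHASE_ORDER = """
<purchase_order>
  <item><item_name>Widget</item_name><quantity>2</quantity></item>
  <item><item_name>Gadget</item_name><quantity>1</quantity></item>
</purchase_order>
"""

order = ET.fromstring(PURCHASE_ORDER)

# .//item_name selects every item_name element at any depth below the root.
item_names = order.findall(".//item_name")

# The result is a list of references to the matching elements,
# in document order.
for element in item_names:
    print(element.tag, element.text)
```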

Location Paths

The XPath specification is designed to address different parts of an XML document through the use of location paths. A location path provides instructions for navigating to any location in an XML document. You can use XPath to specify an absolute location or a relative location. An absolute location points to a specific place in the document structure. A relative location points to a place that depends on a starting location. If you were giving directions, an absolute location would be "12 Main Street," whereas a relative location would be "drive 1 mile up Main Street from the intersection of Oak Street and Main Street." In an XML document, an absolute location would be the root or the second customer element. A relative location would be the fourth child node of a starting node, such as the root.
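The difference between the two kinds of location can be sketched as follows, assuming Python's xml.etree.ElementTree module and a hypothetical customers document. ElementTree does not accept paths that begin with a slash when searching from an element, so the absolute location is written here as a path anchored at the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only the customer element name comes from the text.
CUSTOMERS = """
<customers>
  <customer id="c1">
    <name>Ann</name><street>12 Main Street</street>
    <city>Springfield</city><zip>00001</zip>
  </customer>
  <customer id="c2">
    <name>Bob</name><street>3 Oak Street</street>
    <city>Springfield</city><zip>00002</zip>
  </customer>
</customers>
"""

root = ET.fromstring(CUSTOMERS)

# Absolute-style location: always the second customer element under the
# root, no matter which node a program is currently working with.
second_customer = root.find("./customer[2]")

# Relative location: the same path gives a different result for each
# starting node, because it is evaluated from that node.
names = [customer.find("./name").text for customer in root]

# "The fourth child" of a starting node. Plain Python indexing stands in
# for an XPath position test here, since ElementTree's position
# predicates must follow a tag name.
fourth_child = second_customer[3]

print(second_customer.get("id"), names, fourth_child.tag)
```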

The entire XML document is represented by the root node of a treelike structure; the outermost element of the document is called the document element. Location paths return sets of nodes on node axes, and movement occurs up and down these axes.
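Movement along the tree can be sketched as follows, again assuming Python's xml.etree.ElementTree module and a tiny hypothetical document. ElementTree exposes only part of what XPath axes offer, so this shows just downward movement (children and descendants) and upward movement (parent).

```python
import xml.etree.ElementTree as ET

# Tiny hypothetical tree: root has children a and d; a has children b and c.
doc = ET.fromstring("<root><a><b/><c/></a><d/></root>")

# Moving down one level: the children of the starting node.
children = [node.tag for node in doc]

# Moving down any number of levels: all descendants, in document order.
# iter() yields the starting node first, so it is skipped here.
descendants = [node.tag for node in doc.iter()][1:]

# Moving up: ".." steps from each matched b element to its parent.
parents_of_b = [node.tag for node in doc.findall(".//b/..")]

print(children, descendants, parents_of_b)
```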