XML

Node Axes

Each element in the XML document can be considered a point in the tree structure. These element points can be seen as having a set of axes, each containing nodes extending from the element point. For example, you could have the following XML fragment:

  <message>
     <customer customerID = "c1" customerName = "John Smith"/>
     <customer customerID = "c2" customerName = "William Jones"/>
     <order orderID = "o100" customerID = "c1"/>
  </message>

As shown in Figure 6-1, this message element has a child axis that consists of three nodes: two customer element nodes and one order element node.

Figure 6-1. Representation of a child axis consisting of three nodes.
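As a minimal sketch of the child axis, the following Python snippet parses a trimmed copy of the fragment above and lists the element nodes that sit on the child axis of the message element. Using Python and its standard xml.etree.ElementTree module is an assumption made for illustration only. The names FRAGMENT, message, and child_axis are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the fragment above (only the ID attributes are kept).
FRAGMENT = """
<message>
   <customer customerID="c1"/>
   <customer customerID="c2"/>
   <order orderID="o100"/>
</message>
"""

message = ET.fromstring(FRAGMENT)

# Iterating over an element yields its child elements in document order,
# which corresponds to the element nodes on the child axis.
child_axis = [node.tag for node in message]
print(child_axis)  # ['customer', 'customer', 'order']
```

ElementTree models element nodes only, so text and namespace nodes do not appear in the result.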

Thus, each axis moves through the element tree, selecting a set of nodes based on its axis type (child in this example), and places these elements on the axis. Using node axes, you can select a set of elements within the document. The syntax for an XPath node axis is shown here:

  context axis::name

The name can be the name of an element, attribute, or other node. The context is the starting point of the path, which is usually the root of the XML document. The axis is the type of the axis that you want to select. As you will see, this format is extremely flexible and will allow you to select any pattern of elements within your XML documents.

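The root of the document can be represented by a slash (/). In XPath 1.0 the slash denotes the root node, which is the parent of the document element rather than the document element itself; the document element of the fragment above is therefore selected with /child::message.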

XPath defines the following types of node axes:

  • child The child axis selects all children of the context element in document order.
  • descendant The descendant axis contains all the descendants of the context node in document order. A descendant can be a child, a child of a child, and so on. The descendant axis does not contain attribute or namespace nodes.
  • parent The parent axis contains the parent of the context node.
  • following-sibling The following-sibling axis contains the following siblings of the context node in document order. A sibling is a node that has the same parent as the context node. If the context node is either an attribute or a namespace node, the following-sibling axis is empty.
  • preceding-sibling The preceding-sibling axis contains the preceding siblings of the context node. If the context node is either an attribute or a namespace node, the preceding-sibling axis is empty.
  • following The following axis contains all the nodes in the same document as the context node that come after the context node in document order. Attribute, namespace, and descendant nodes are not included on the following axis.
  • preceding The preceding axis contains all the nodes in the same document as the context node that come before the context node in document order. Attribute, namespace, and ancestor nodes are not included on the preceding axis.
  • ancestor The ancestor axis contains all the ancestors of the context node: its parent, the parent's parent, and so on. The ancestor axis holds these nodes in reverse document order.
  • attribute The attribute axis contains the attributes of the context node. The axis is commonly used in three ways. With the following syntax, the axis contains the attribute node named attributeName:

      attribute::attributeName

    With the following syntax, the attribute axis is used in a predicate that selects all the elementName elements that have an attribute named attributeName:

      elementName[attribute::attributeName]

    Finally, the following syntax selects all the elementName elements whose attributeName attribute is equal to attributeValue; the value is written as a quoted string:

      elementName[attribute::attributeName='attributeValue']

    The attribute axis will be empty unless the context node is an element node.

  • namespace The namespace axis contains the namespace nodes of the context node; the order will be defined by the implementation. This axis will be empty unless the context node is an element node.
  • self The self axis contains only the context node.
  • ancestor-or-self The ancestor-or-self axis contains the context node and all the context node's ancestors, in reverse document order.
  • descendant-or-self The descendant-or-self axis contains the context node and all the context node's descendants.
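The axis definitions above can be sketched by hand to make the inclusion and exclusion rules concrete. The following Python code is an illustrative approximation, not an XPath processor: it uses the standard xml.etree.ElementTree module (an assumption, not something the text prescribes), it covers element nodes only, and the document and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested document, used only to exercise the axes.
DOC = """
<message>
   <date>01-01-2001</date>
   <customer customerID="c1"><order orderID="o100"/></customer>
   <customer customerID="c2"><order orderID="o101"/></customer>
</message>
"""

root = ET.fromstring(DOC)

# ElementTree stores no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}
# Position of every element in document order (a pre-order walk).
doc_order = {elem: index for index, elem in enumerate(root.iter())}


def ancestor(node):
    """Parent, parent's parent, and so on (reverse document order)."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found


def descendant(node):
    """Children, children of children, and so on (document order)."""
    return [elem for elem in node.iter() if elem is not node]


def following_sibling(node):
    """Later elements that share the parent of the node."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]


def preceding_sibling(node):
    """Earlier elements that share the parent (listed in document order)."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[:siblings.index(node)]


def following(node):
    """Elements after the node in document order, minus its descendants."""
    skipped = set(node.iter())  # the node itself and its descendants
    return [e for e in root.iter()
            if doc_order[e] > doc_order[node] and e not in skipped]


def preceding(node):
    """Elements before the node in document order, minus its ancestors."""
    skipped = set(ancestor(node))
    return [e for e in root.iter()
            if doc_order[e] < doc_order[node] and e not in skipped]


def tags(nodes):
    return [node.tag for node in nodes]


date = root.find("date")
first_customer, second_customer = root.findall("customer")
last_order = second_customer.find("order")

print(tags(ancestor(last_order)))                # ['customer', 'message']
print(tags(following_sibling(date)))             # ['customer', 'customer']
print(tags(preceding_sibling(second_customer)))  # ['date', 'customer']
print(tags(following(first_customer)))           # ['customer', 'order']
print(tags(preceding(last_order)))               # ['date', 'customer', 'order']
```

Note how following(first_customer) skips the order element nested inside the first customer, and how preceding(last_order) skips the enclosing customer and message elements; these are the descendant and ancestor exclusions described in the list.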

When an axis contains more than one element, you can select an element by using [position()=positionNumber]. The first element is assigned a positionNumber value of 1.
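Position predicates can be tried out with a short Python sketch. It relies on the standard xml.etree.ElementTree module, which is an assumption for illustration; that module accepts the abbreviated predicate [n], which XPath defines as shorthand for [position()=n], rather than the position() function itself. The document and variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

message = ET.fromstring(
    '<message>'
    '<customer customerID="c1"/>'
    '<customer customerID="c2"/>'
    '</message>'
)

# customer[1] is the abbreviated spelling of child::customer[position()=1].
first = message.find("customer[1]")
second = message.find("customer[2]")
# last() addresses the final customer element on the child axis.
last = message.find("customer[last()]")

print(first.get("customerID"))   # c1
print(second.get("customerID"))  # c2
print(last is second)            # True
```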

The following XML document fragment will be used to demonstrate how these axes can be used:

  <message>
     <date>01-01-2001</date>
     <customer customerID = "c1" customerName = "John Smith">
        <order orderID = "o100"/>
     </customer>
     <customer customerID = "c2" customerName = "William Jones" >
        <order orderID = "o101"/>
     </customer>
  </message>

As you can see, this document has a message root element, one date child element, and two customer child elements; each customer element has one order child element. Let's take a look at how an axis selects a set of nodes when it navigates through the element tree according to the instruction provided by the location path. The following table lists the example location paths and the element nodes selected based on these location paths.

Example Location Paths

Location Path Description
/child::message/child::customer

/child::message selects the message document element, and child then selects all of its children (the date element and the two customer elements). /child::message/child::customer selects all the customer elements that are children of the message element; in this case, there are two customer elements.

/descendant::order

/descendant selects all the descendants of the root (the message element, the date element, the two customer elements, and the two order elements). /descendant::order selects the two order elements.

/descendant-or-self::message

/descendant-or-self selects the root and all of its descendants (the message element, the date element, the two customer elements, and the two order elements). /descendant-or-self::message selects the message element.

/child::message/child::customer[attribute::customerID='c1']

Selects the customer element whose customerID attribute has a value of c1.

/child::message/child::customer[attribute::customerID='c1'][position() = 1]

Selects the first customer element whose customerID attribute has a value of c1. In this case it is the first customer element, which is also the only customer element with a customerID equal to c1.

/child::message/child::customer[attribute::customerID='c1'][position() = 1]/following-sibling::customer

Selects all the customer elements that are following siblings of the customer element whose customerID attribute has a value of c1; in this case, the second customer element (customerID = c2).

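/child::message/child::customer[attribute::customerID='c2'][position() = 1]/preceding-sibling::customer

Selects all of the customer elements that are preceding siblings of the customer element whose customerID attribute has a value of c2; in this case, the first customer element (customerID = c1). The position is 1 rather than 2 because position() counts only the nodes that remain after the customerID predicate has been applied, and only one customer element has a customerID of c2.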

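/child::message/child::date/following::customer

Selects the two customer elements, both of which follow the date element in document order. The following axis of the root itself is empty, because every other node in the document is a descendant of the root.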

/child::message/child::customer[attribute::customerID='c1'][position() = 1]/preceding::date

Selects the date elements that precede the first customer element whose customerID attribute has a value of c1. Our example document has only one.

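/child::message/self::message

Selects the message element. The self axis contains only the context node, which here is the message element selected by /child::message.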
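Results like those in the table can be checked by running paths against the example document. The sketch below uses Python's standard xml.etree.ElementTree module, which is an assumption made for illustration: it implements only a subset of XPath in abbreviated syntax, so child::customer is written as customer, attribute::customerID as @customerID, and a descendant search as .//order. Every path is evaluated with the message element as the context, and the document is a copy of the example in which each customer element encloses its order element. All variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """
<message>
   <date>01-01-2001</date>
   <customer customerID="c1" customerName="John Smith">
      <order orderID="o100"/>
   </customer>
   <customer customerID="c2" customerName="William Jones">
      <order orderID="o101"/>
   </customer>
</message>
"""

message = ET.fromstring(DOC)

# child::customer, with the message element as the context node
customers = message.findall("customer")
# descendant::order, with the message element as the context node
orders = message.findall(".//order")
# child::customer[attribute::customerID='c1']
c1 = message.findall("customer[@customerID='c1']")
# child::customer[attribute::customerName] (any customer with that attribute)
named = message.findall("customer[@customerName]")

print([c.get("customerID") for c in customers])  # ['c1', 'c2']
print([o.get("orderID") for o in orders])        # ['o100', 'o101']
print([c.get("customerName") for c in c1])       # ['John Smith']
print(len(named))                                # 2
```

The sibling, following, and preceding axes are not part of the subset that ElementTree supports, so paths that use them need a full XPath 1.0 processor.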