An XML application can certainly determine if a document is well formed without any other information, but it requires a schema in order to assess document validity. This schema typically comes in the form of a DTD (Document Type Definition) or XSD (XML Schema Definition), which you learned about in Tutorial 3, "Defining Data with DTD Schemas," and Tutorial 7, "Using XML Schema." To recap, schemas allow you to do the following, which together form the ground rules that XML documents must adhere to in order to be considered valid:
- Establish the elements that can appear in an XML document, along with the attributes that can be used with each
- Determine whether an element is empty or contains content (text and/or child elements)
- Determine the number and sequence of child elements within an element
- Set the default value for attributes
It's probably safe to say that you have a good grasp on the usefulness of schemas, but you might be wondering about the details of how an XML document is actually validated with a schema. This task begins with the XML processor, which is typically a part of an XML application. The job of an XML processor is to process XML documents and somehow make the results available for further processing or display within an application. A modern web browser, such as Internet Explorer, Firefox, Safari, or Opera, includes an XML processor that is capable of processing an XML document and displaying it using a style sheet. The XML processor knows nothing about the style sheet; it just hands over the processed XML content for the browser to render.
The actual processing of an XML document is carried out by a special piece of software known as an XML parser. An XML parser is responsible for the nitty-gritty details of reading the characters in an XML document and resolving them into meaningful tags and relevant data. There are two types of parsers capable of being used during the processing of an XML document:
- Standard (non-validating) parser
- Validating parser
A standard XML parser, or non-validating parser, reads and analyzes a document to ensure that it is well formed. A standard parser checks to make sure that you've followed the basic language and syntax rules of XML. Standard XML parsers do not check to see if a document is valid; that's the job of a validating parser. A validating parser picks up where a standard parser leaves off by comparing a document with its schema and making sure it adheres to the rules laid out in the schema. Because a document must be well formed as part of being valid, a standard parser is still used when a document is being validated. In other words, a standard parser first checks to see if a document is well formed, and then a validating parser checks to see if it is valid.
In actuality, a validating parser includes a standard parser so that there is technically only one parser that can operate in two different modes.
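To make the distinction concrete, here is a minimal sketch in Python that uses the standard library's `xml.etree.ElementTree` module as a stand-in for a standard (non-validating) parser. The `note`, `to`, and `body` element names, the sample documents, and the `is_well_formed` helper are all hypothetical and chosen only for illustration. The sketch assumes the underlying parser reads the internal DTD without enforcing it, so a document that breaks the DTD's rules is still accepted as long as it is well formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal DTD: a <note> must contain <to> followed by <body>.
DTD = """<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
"""

# Well formed, but invalid against the DTD because <body> is missing.
WELL_FORMED_BUT_INVALID = DTD + "<note><to>Ann</to></note>"

# Not well formed: the <to> element is never closed.
NOT_WELL_FORMED = DTD + "<note><to>Ann</note>"


def is_well_formed(xml_text: str) -> bool:
    """Return True if a non-validating parser can read the document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed(WELL_FORMED_BUT_INVALID))
    print(is_well_formed(NOT_WELL_FORMED))
```

The first document passes this check even though it violates its own DTD, which is exactly the gap that a validating parser is meant to close.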
When you begin looking for a means to validate your documents, make sure you find an XML application that includes a validating parser. Without a validating parser, there is no way to validate your documents. You can still see if they are well formed by using a standard parser only, which is certainly important, but it's generally a good idea to go ahead and carry out a full validation.
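One way to find out whether a given parser can validate is to ask it. The sketch below uses Python's standard `xml.sax` interface, which defines a validation feature that a parser may or may not support. The `supports_validation` helper is a hypothetical name introduced here, and the sketch assumes that a parser lacking validation support rejects the request with a SAX exception. Python's default parser is built on expat, which is non-validating, so it is expected to report no validation support.

```python
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException
from xml.sax.handler import feature_validation


def supports_validation(parser) -> bool:
    """Return True if the SAX parser accepts a request to turn validation on."""
    try:
        parser.setFeature(feature_validation, True)
    except (SAXNotRecognizedException, SAXNotSupportedException):
        # The parser either does not know the feature or cannot enable it.
        return False
    return True


if __name__ == "__main__":
    default_parser = xml.sax.make_parser()
    print(supports_validation(default_parser))
```

If the check comes back negative, the parser can still confirm that a document is well formed, but full validation requires a different tool that includes a validating parser.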