XML

Writing Programs That Use SAX Parsers

Unless you really develop an interest in XML parsing, chances are you won't be writing a SAX parser. Rather, you'll be writing a program that interacts with a SAX parser. Writing a program that works with a SAX parser is in some ways similar to writing a program with a graphical user interface (GUI), such as a traditional application for Windows or Macintosh. When you write a GUI program, the GUI library turns the actions that the user takes into events and passes them to your program. Your job as a programmer is then to write event handlers that respond to incoming events. For example, certain elements on a web page can generate events that JavaScript code can handle. Links generate click and mouseover events, which are handled through the onclick and onmouseover handlers. There are also document-wide events in JavaScript, such as the load event, which is handled through onload.

When it comes to event handling, SAX works the same way conceptually as JavaScript. As a SAX parser reads through an XML document, it fires events based on the data it is currently parsing. All of the SAX methods listed previously are callbacks: you implement them, and the parser calls each one when the associated event occurs. It's up to you as the application programmer to decide what action to take when each event arrives.
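To make the callback idea concrete, here is a minimal sketch that uses Python's standard xml.sax module. Python is chosen here only for illustration, since the text does not tie itself to one language. The EventLogger class name and the sample document are made up for this example. The handler simply records each event in the order the parser fires it.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler that records every event the parser fires."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self.events.append(f"startElement: {name}")

    def characters(self, content):
        # Skip whitespace-only text so the log stays readable.
        if content.strip():
            self.events.append(f"characters: {content.strip()}")

    def endElement(self, name):
        self.events.append(f"endElement: {name}")

    def endDocument(self):
        self.events.append("endDocument")


# Made-up sample document, used only for this illustration.
SAMPLE = b"<book><title>Learning XML</title></book>"

handler = EventLogger()
xml.sax.parseString(SAMPLE, handler)  # the parser calls the methods above

for event in handler.events:
    print(event)
```

Notice that the program never calls startElement or endElement itself. It hands the handler to the parser, and the parser decides when each method runs, much as a browser decides when a click handler runs.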

For example, you might want to print out just the contents of all of the title elements in a document, or you might want to construct a complex data structure based on all of the information you find in the document. The SAX parser doesn't care; it just provides you with all of the data in the document in a linear manner so that you can do whatever you like with it.
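The title example might look something like the following sketch, again using Python's xml.sax module. The TitleCollector class, the extract_titles helper, and the sample library document are all hypothetical names invented for this illustration. Because the parser may deliver an element's text in more than one characters call, the handler buffers the pieces and joins them when the element ends.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Hypothetical handler that keeps only the text of title elements."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect the pieces.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer).strip())
            self._in_title = False


def extract_titles(xml_bytes):
    """Parse the document and return the title texts in document order."""
    handler = TitleCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.titles


# Made-up sample data for the illustration.
LIBRARY = b"""<library>
  <book><title>Dune</title><author>Frank Herbert</author></book>
  <book><title>Emma</title><author>Jane Austen</author></book>
</library>"""

for title in extract_titles(LIBRARY):
    print(title)
```

Everything else in the document, such as the author elements, still passes through the handler as events. The handler just chooses to ignore it.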

You might be asking yourself at this point why you would ever care to parse an XML document at such a low level. In other words, why would you ever want to print out just the contents of the title elements in a document? The main answer to this question has to do with data maintenance and integrity. As you continue to build and maintain larger and larger XML documents, you may find that you need to extract and study portions of the documents to find editorial errors or any other inconsistencies that are difficult to find when viewing raw XML code. A custom application built around a SAX parser can be used to drill down into an XML document and spit out any subset of the data that you want.
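As one possible illustration of this kind of integrity check, the sketch below reports the line number of every title element that is empty or contains only whitespace. It assumes Python's xml.sax module and relies on the locator object that the parser passes to setDocumentLocator. The EmptyTitleFinder class, the find_empty_titles helper, and the catalog data are hypothetical and exist only for this example.

```python
import xml.sax


class EmptyTitleFinder(xml.sax.ContentHandler):
    """Hypothetical checker that notes where blank title elements start."""

    def __init__(self):
        super().__init__()
        self.problem_lines = []
        self._locator = None
        self._text = None       # None means "not inside a title"
        self._start_line = None

    def setDocumentLocator(self, locator):
        # The parser supplies a locator that reports the current position.
        self._locator = locator

    def startElement(self, name, attrs):
        if name == "title":
            self._text = []
            self._start_line = self._locator.getLineNumber()

    def characters(self, content):
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "title":
            if not "".join(self._text).strip():
                self.problem_lines.append(self._start_line)
            self._text = None


def find_empty_titles(xml_bytes):
    """Return the line numbers of title elements that have no real text."""
    handler = EmptyTitleFinder()
    xml.sax.parseString(xml_bytes, handler)
    return handler.problem_lines


# Made-up sample data with two blank titles.
CATALOG = b"""<library>
  <book><title>Dune</title></book>
  <book><title>   </title></book>
  <book><title></title></book>
</library>"""

for line in find_empty_titles(CATALOG):
    print(f"Empty title element starting on line {line}")
```

A check like this is tedious to do by eye in a large file, but it takes only a few lines once the parser is feeding you the document one event at a time.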