XML

Obtaining a SAX Parser

If you want to write an application that uses SAX, the first thing you have to do is obtain a SAX parser. Several SAX parsers are available, and which one you should use depends on your specific development needs. You'll also need to consult the documentation for the parser you choose to find out how to integrate it with your applications. Following are several of the more popular SAX parsers you might want to consider using:

  • Xerces

  • libxml

  • Python SAX API

The next few sections provide more information about these SAX parsers, along with how to download and install them.

Xerces

Xerces is the XML parser from the Apache Software Foundation. It's used as part of several other Apache XML and Java-related projects and can be used by itself as well. In addition to supporting SAX, it also supports DOM Level 2, which you learned about in the previous tutorial, as well as XML Schema validation.

You can obtain Xerces, along with lots of other open source XML-related software, at http://xml.apache.org/. Xerces is completely free as it is open source software released under the Apache Software License.

The Xerces library is available in both .tar.gz and .zip formats; download the one that's appropriate for your platform. Included in the package are xercesImpl.jar and xml-apis.jar, which contain the compiled class files for the Xerces library itself, and xercesSamples.jar, which contains compiled versions of the sample programs that come with Xerces. The package also includes documentation, source code for the sample programs, and some sample data files.

A .JAR file is a lot like a .ZIP file except that it is typically used to package compressed Java programs for distribution; JAR stands for Java ARchive.

In order to use the Xerces library, you just need to include the two aforementioned .JAR files (xercesImpl.jar and xml-apis.jar) in your class path when compiling and running programs that use it.

libxml

libxml (distributed on CPAN under the name libxml-perl) is a package of Perl modules that contains a number of XML processing libraries. One of these is XML::Parser::PerlSAX. The easiest way to install the package is to download it from CPAN (http://www.cpan.org/) and follow the instructions to install it on your local system. The methods provided by the PerlSAX module are basically identical to those in the Java version of SAX; they both implement the same interface in ways appropriate to Perl and Java, respectively.

Python

If you're a Python programmer, things are particularly easy for you. Python 2.0 and later versions provide built-in support for SAX without any additional software. To use the SAX library in your programs, you just need to include the following line:

from xml.sax import saxutils
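
To give a rough idea of how the built-in SAX support fits together, here is a minimal sketch written for current Python 3. It is only an illustration, not a recommended design: the TagCounter class and the SAMPLE document are hypothetical names made up for this example, while xml.sax.ContentHandler, xml.sax.parseString, and saxutils.escape come from the standard library.

```python
import xml.sax
from xml.sax import saxutils


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler that counts how often each element name appears."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # The parser calls this once for every opening tag it encounters.
        self.counts[name] = self.counts.get(name, 0) + 1


# A small made-up document, kept in memory so the example is self-contained.
SAMPLE = b"<pets><pet>Rex &amp; Co</pet><pet>Tom</pet></pets>"

handler = TagCounter()
xml.sax.parseString(SAMPLE, handler)  # parse the bytes, firing handler callbacks
print(handler.counts)

# saxutils provides helpers such as escape() for writing text back out as XML.
print(saxutils.escape("Rex & Co"))
```

The parser never builds a tree of the document; it simply reports each opening tag to the handler as it reads, which is why the handler has to keep its own running totals.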