DTD, W3C Schema, or BizTalk Schema?
So far, we have looked at three different ways of defining XML documents. We now need to ask which one is best for our current systems. The W3C schema is not yet implemented in many applications because it is not a finalized standard. By the time this book is printed, however, the W3C standard should be in candidate status, and applications will likely begin to use it. The W3C standard is the future, and once parsers such as the Microsoft, SUN, and IBM XML parsers begin supporting it, others will follow.
Until the W3C schema standard is implemented, the BizTalk schema can be used. When the W3C schema standard is complete, none of the XML documents based on a BizTalk schema will need to be changed. The only thing that will have to change is the schema located in the repository. An application that automatically converts a BizTalk schema to a W3C schema will likely become available, or you could write one yourself. In the meantime, the BizTalk schema is the ideal interim candidate: it is simple, easy to use, and provides a set of data types.
There is little question that moving data across boundaries is one of the greatest challenges facing large and small corporations today. As we have mentioned, XML is the best available solution for moving data across boundaries, and we can use schemas to validate XML documents. You could, of course, spend the next year writing applications that do not use schemas, or that don't even work with XML, and then upgrade all of your applications when the W3C schema specification is released. Rewriting your applications at that point, however, will likely incur a large expense and waste a great deal of development time over the next one or two years. A better choice is to write your applications using BizTalk schemas now and then upgrade only your schemas when the new specification is complete. Using the BizTalk schema now will result in applications that integrate with the next generation of applications and require little expense to upgrade.
Even though BizTalk schemas offer many advantages over DTDs, DTDs still have a place in current XML development. On the minus side, DTDs are much more complicated than schemas, especially when entities are used. They are also written in their own syntax rather than in well-formed XML. On the plus side, DTDs have been in use for some time and are widely implemented across virtually every platform.
The large software corporations are currently divided on how to implement XML. As you know, Microsoft, working with other organizations such as SAP and CommerceOne, has developed BizTalk so that corporations can begin to implement XML solutions using schemas now. In response to BizTalk, IBM, SUN, and other organizations have worked through a consortium called OASIS to create their own schema repository, which can be found at http://www.xml.org. They are developing ebXML, which is similar to BizTalk but works only with the XML 1.0 standard; that is, only with DTDs.
As we have mentioned, the main purpose of schemas and DTDs is to validate documents. The easiest means of accomplishing this is to have parsers, such as the Microsoft XML parser, that can validate XML documents using schemas. If a cross-platform parser for BizTalk schemas were available, the schemas could be used to validate any XML document on any platform. Unfortunately, only the Microsoft XML parser currently supports BizTalk schemas. If the information is passing across a Microsoft system boundary into another Microsoft system, support is not an issue. When information is moving over a boundary and across platforms, however, you need to find another way to validate XML documents. You have three options.
The first option is to extend the Java or C++ parsers provided by SUN and IBM so that they use schemas instead of DTDs to validate documents. This means that all the non-Microsoft systems must have this custom-built parser installed on their servers. This may be an acceptable solution when you are working with a corporate partner or building internal solutions.
The second option is to pass the information across a boundary to a Microsoft BizTalk server, which will then pass the information over another system boundary to the non-Microsoft systems. The BizTalk server will perform the validations and then send the appropriate information to the correct server and method. When Microsoft releases its new BizTalk server, this solution will probably be the best. This too is an acceptable solution when you are working with a corporate partner or building internal solutions.
The third option is to use DTDs. Ideally, you want to be able to build a solution that is platform independent, meaning that you don't need to know whether the person on the receiving end has a BizTalk server. In many circumstances, you will be dealing with multiple organizations and will have no idea what platform they are using. For example, if you are publishing real estate information to all the real estate brokers in a certain area, you will have no way of knowing what system each broker is using. In this case, you need a platform-independent way of moving information to the recipients. As mentioned, BizTalk won't currently work in this situation. Until schemas become a widely accepted standard across all platforms, DTDs can still be used to validate XML documents. When you are working with UNIX and mainframes, DTDs will probably be the best solution until the W3C standard is released and used by all organizations.
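To make the third option concrete, the following is a minimal sketch of DTD validation on a non-Microsoft platform, using the standard JAXP classes that ship with current Java runtimes. The `listing` document, its element names, and the class name `DtdValidationSketch` are hypothetical, invented only for this illustration; a real real estate feed would define its own DTD, most likely in an external file rather than inline.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class DtdValidationSketch {

    // Hypothetical DTD for a real estate listing, declared inline
    // so that the example needs no external files.
    static final String DTD =
        "<!DOCTYPE listing [\n"
        + "  <!ELEMENT listing (address, price)>\n"
        + "  <!ELEMENT address (#PCDATA)>\n"
        + "  <!ELEMENT price (#PCDATA)>\n"
        + "]>\n";

    // Parses the document with DTD validation turned on and returns
    // the validity errors the parser reported (empty if valid).
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // validate against the DTD in the DOCTYPE
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                // Warnings are ignored in this sketch.
            }
            public void error(SAXParseException e) {
                // Validity errors are recoverable, so collect them.
                errors.add(e.getMessage());
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                // A document that is not well formed cannot be processed.
                throw e;
            }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String good = DTD
            + "<listing><address>12 Elm St</address><price>250000</price></listing>";
        String bad = DTD
            + "<listing><price>250000</price></listing>"; // address is missing

        System.out.println("good document valid: " + validate(good).isEmpty());
        System.out.println("bad document valid: " + validate(bad).isEmpty());
    }
}
```

Because the receiving broker needs only a validating XML 1.0 parser, the same check could be performed on any platform that has one, which is the point of choosing DTDs here.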
When all platforms have applications that work with schemas, schemas will be the better choice. This shift will probably take place within the next year or two. Whenever possible, use BizTalk schemas when building new applications. When cross-platform issues prevent the use of schemas, DTDs will have to be used for now.