XML Processing Description Language (XPDL)

Simon St. Laurent


ABSTRACT

XML provides a core set of tools for identifying document types, and the MIME infrastructure provides additional support for naming and describing them. Although these core standards are very useful, they also have significant flaws. Individual documents can override declarations that are critical to processing a whole set of documents, which is at times more flexibility than is wanted. Similarly, non-validating processors aren't required to read external declarations, so default attribute values can go missing and cause additional, surprising problems. MIME type identifiers may not provide enough information for both generic XML processors and vocabulary-specific applications. Although the advent of schemas may help clean up some of these problems, a solution that describes document processing rules directly may provide more information to applications that need tighter descriptions of classes of documents for reliable, automated processing.
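The default-attribute problem can be illustrated with a minimal Java sketch. It assumes the JAXP DOM parser bundled with the JDK, and it uses an internal DTD subset as a stand-in for an external declaration that a non-validating processor chose not to read. The class name, element name, and attribute name are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of XPDL or its Bean.
class DefaultAttributeDemo {

    // Returns the root element's "status" attribute as a non-validating parser reports it.
    static String statusOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // getAttribute returns "" when the attribute has no specified or default value.
            return doc.getDocumentElement().getAttribute("status");
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        // The declaration is visible to the parser, so the default is applied.
        String declared = "<!DOCTYPE memo [<!ATTLIST memo status CDATA \"draft\">]><memo/>";
        // Same content, but the declaration was never read (simulated by omitting it).
        String undeclared = "<memo/>";

        System.out.println("[" + statusOf(declared) + "]");   // prints [draft]
        System.out.println("[" + statusOf(undeclared) + "]"); // prints []
    }
}
```

The same element yields different attribute values depending only on whether the processor saw the declaration, which is the kind of inconsistency a shared processing description is meant to rule out.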

XML Processing Description Language (XPDL) is one attempt at solving these problems by describing the characteristics of sets of documents. XML Processing Descriptions (XPDs) provide a machine-readable and extensible description of document types, allowing DTD and schema developers to create complete descriptions of their expectations for document processing. In some ways, XPDs are RFCs written for XML processors rather than for humans, providing guidelines that are more complete and more centralized than the current document-centered tools allow. XPDs describe document types on several levels, from the identification of schemas for the type and the rules for processing them to default style sheets that applications can use to render a document in the absence of a document-specific style sheet declaration. This presentation will describe the (ongoing) development of XPDL and a Java Bean that supports it, and will discuss what needs to be done to integrate XPDL with current parsing and application environments. The new options made available by SAX2 (and, to some extent, DOM Level 2) make possible closer coordination between applications and parsers, and make it easier to create control objects that manage the parsing process more closely.
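As a rough illustration of a control object built on SAX2, the following Java sketch applies a small set of processing rules to an XMLReader through the standard SAX2 feature URIs. The ProcessingRules class and its two flags are hypothetical stand-ins invented for this example. They do not reflect XPDL's actual vocabulary or the interface of the Bean, and the sketch assumes the JAXP SAX parser bundled with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for rules that might be read from an XPD.
final class ProcessingRules {
    final boolean validate;
    final boolean namespaces;

    ProcessingRules(boolean validate, boolean namespaces) {
        this.validate = validate;
        this.namespaces = namespaces;
    }

    // Creates an XMLReader and applies the rules using standard SAX2 feature URIs.
    XMLReader newConfiguredReader() {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature("http://xml.org/sax/features/validation", validate);
            reader.setFeature("http://xml.org/sax/features/namespaces", namespaces);
            return reader;
        } catch (Exception e) {
            throw new IllegalStateException("Parser does not support the requested rules", e);
        }
    }
}

// Hypothetical demo: the application never touches parser settings directly.
class ControlledParseDemo {

    // Parses the document under the given rules and returns the root element's namespace URI.
    static String rootNamespace(ProcessingRules rules, String xml) {
        final String[] found = { null };
        try {
            XMLReader reader = rules.newConfiguredReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    if (found[0] == null) {
                        found[0] = uri; // record only the first (root) element
                    }
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample document", e);
        }
        return found[0];
    }

    public static void main(String[] args) {
        ProcessingRules rules = new ProcessingRules(false, true);
        String xml = "<m:memo xmlns:m=\"urn:example:memo\"/>";
        System.out.println(rootNamespace(rules, xml)); // prints urn:example:memo
    }
}
```

Keeping the parser settings in one object means every application that handles a given class of documents configures its parser the same way, instead of relying on each document or each application to get the settings right.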

(Current information on XPDL, not yet updated for the Bean, is available at http://purl.oclc.org/NET/xpdl.)