Data Descriptors by Example

Len Berman, IBM

ABSTRACT

Data Descriptors by Example (DDbE) is a package of components and a simple command-line application that facilitates the automatic generation of data descriptors (DTDs, Schemas, etc.) from a set of well-formed XML example documents. The example documents are valid under the resulting descriptor. A pre-release version can be downloaded from http://www.alphaworks.ibm.com/tech/DDbE. (Version 1.0, which includes documentation, will be available before the meeting.)
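To make the idea concrete, the following is a minimal sketch of descriptor generation from examples. It is not DDbE code and does not reflect DDbE's interfaces; the function name infer_dtd and its strategy are assumptions made for illustration. It reads well-formed XML strings and emits a deliberately loose DTD: every child seen under an element is allowed any number of times in any order, and every attribute seen is declared optional. Namespaces are not handled.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_dtd(xml_texts):
    """Infer a loose DTD from well-formed XML strings (hypothetical helper)."""
    children = defaultdict(list)  # element name -> child names, first-seen order
    attrs = defaultdict(list)     # element name -> attribute names, first-seen order
    has_text = defaultdict(bool)  # element name -> character data was observed
    order = []                    # element names, first-seen order

    for xml_text in xml_texts:
        root = ET.fromstring(xml_text)
        for elem in root.iter():
            if elem.tag not in order:
                order.append(elem.tag)
            text = elem.text or ""
            # Whitespace between child elements is ignorable, but any
            # character data inside a leaf element rules out EMPTY.
            if text.strip() or (text and len(elem) == 0):
                has_text[elem.tag] = True
            for child in elem:
                if child.tag not in children[elem.tag]:
                    children[elem.tag].append(child.tag)
                # Text following a child belongs to the parent's content.
                if child.tail and child.tail.strip():
                    has_text[elem.tag] = True
            for name in elem.attrib:
                if name not in attrs[elem.tag]:
                    attrs[elem.tag].append(name)

    lines = []
    for tag in order:
        kids = children[tag]
        if kids and has_text[tag]:
            model = "(#PCDATA | " + " | ".join(kids) + ")*"
        elif kids:
            model = "(" + " | ".join(kids) + ")*"
        elif has_text[tag]:
            model = "(#PCDATA)"
        else:
            model = "EMPTY"
        lines.append(f"<!ELEMENT {tag} {model}>")
        for name in attrs[tag]:
            lines.append(f"<!ATTLIST {tag} {name} CDATA #IMPLIED>")
    return "\n".join(lines)


examples = [
    '<book id="1"><title>XML</title><author>A</author></book>',
    "<book><title>DTD</title><note/></book>",
]
print(infer_dtd(examples))
```

Because the content models are this permissive, the input examples are valid under the generated DTD. A real tool would try to infer tighter models (sequences, optionality, cardinality), which is where most of the difficulty lies.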

The function supplied by the DDbE core:

Extensions to the DDbE core are concentrated in two areas:

  1. Expanded type inferencing to utilize the data typing capabilities of XML Schemas. Initially, extensions will focus on the primitive and built-in data types specified in the Schema Data Types draft. We expect an initial implementation capable of inferring basic types and facets to be in place at the time of the meeting. Work on more complex facets, such as inference of lexical representation, is expected to still be in the planning stage.

  2. Utilization of repositories of pre-existing data descriptor fragments. Problems include searching repositories for relevant documents and adapting external declarations for local use. This phase of the work is in the planning stage.
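As an illustration of the first extension, here is a minimal sketch of inferring a basic type and simple facets from sample values. It is an assumption about how such inference could work, not a description of the DDbE implementation. The function name infer_type, the candidate order, and the type names (integer, decimal, boolean, date, string, taken from the later XML Schema datatypes vocabulary rather than the draft mentioned above) are all chosen for illustration. The date check is purely lexical and does not validate month or day ranges.

```python
import re
from decimal import Decimal

# Candidate types from narrowest to widest. The lexical forms "0" and "1"
# are deliberately treated as integers here, not booleans (an assumption).
_CANDIDATES = [
    ("integer", re.compile(r"[+-]?[0-9]+")),
    ("decimal", re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")),
    ("boolean", re.compile(r"true|false")),
    ("date", re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")),
]


def infer_type(values):
    """Return (type_name, facets) for the first candidate matching all values."""
    cleaned = [v.strip() for v in values]
    for name, pattern in _CANDIDATES:
        if cleaned and all(pattern.fullmatch(v) for v in cleaned):
            facets = {}
            if name in ("integer", "decimal"):
                # Range facets derived from the observed minimum and maximum.
                numbers = [Decimal(v) for v in cleaned]
                facets = {
                    "minInclusive": str(min(numbers)),
                    "maxInclusive": str(max(numbers)),
                }
            return name, facets
    # Fall back to string, recording the longest observed value.
    return "string", {"maxLength": max((len(v) for v in values), default=0)}


print(infer_type(["3", "-12", "7"]))
print(infer_type(["1.5", "2"]))
print(infer_type(["abc", "12"]))
```

Facets inferred this way only describe the examples seen so far, so a practical tool would likely let the user widen or discard them before committing them to a schema.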

In our presentation we discuss DDbE, with an emphasis on the set of interfaces that model the processing required to incrementally construct and manipulate data descriptors. The examination of the interfaces will follow their evolution from DTDs to Schemas, as well as the extensions necessary to integrate repositories.