Efficient Encoding of XML Updates

F. Curbera, D. Epstein, T. Poon, IBM


ABSTRACT

We have designed XUL as a general-purpose XML update language. Its purpose is to provide a standardized representation of the difference between two documents. In XUL, each operation is embedded in a fragment of contextual information extracted from the original document, enough to specify completely how the operation is to be performed. All the relevant fragments are then linked together, preserving the structural relationships they have in the original document. As a result, an XUL document provides an abbreviated snapshot of the original, showing only those sections that require modification while retaining all the significant structural relationships among them.
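The idea of operations embedded in a pruned copy of the source structure can be illustrated with a minimal sketch. This is not XUL syntax, which the abstract does not show; the `UpdateNode` class, its fields, and `apply_update` are hypothetical names chosen for illustration, and locating an element by tag and sibling index is an assumption about what "contextual information" might contain.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UpdateNode:
    """Hypothetical context node mirroring one element of the source document."""
    tag: str                                   # tag of the source element to match
    index: int = 0                             # position among same-tag siblings
    set_text: Optional[str] = None             # new text for the element, if any
    append: List[str] = field(default_factory=list)            # XML fragments to add as children
    children: List["UpdateNode"] = field(default_factory=list)  # nested context nodes


def apply_update(elem: ET.Element, node: UpdateNode) -> None:
    """Apply the operations carried by `node` to `elem`, in place."""
    if elem.tag != node.tag:
        raise ValueError(f"context mismatch: expected {node.tag}, found {elem.tag}")
    # Resolve every child context against the ORIGINAL children before any
    # insertion, so the indices always refer to the source document.
    targets = [(elem.findall(c.tag)[c.index], c) for c in node.children]
    for target, child in targets:
        apply_update(target, child)
    if node.set_text is not None:
        elem.text = node.set_text
    for fragment in node.append:
        elem.append(ET.fromstring(fragment))


source = ET.fromstring(
    "<customers>"
    "<customer><name>Ann</name><city>Oslo</city></customer>"
    "<customer><name>Bob</name><city>Rome</city></customer>"
    "</customers>"
)

# The update names only the parts of the source that change: the second
# customer's city, plus one new customer appended to the root.
update = UpdateNode(
    "customers",
    append=["<customer><name>Cy</name></customer>"],
    children=[
        UpdateNode("customer", index=1,
                   children=[UpdateNode("city", set_text="Paris")]),
    ],
)

apply_update(source, update)
```

The first customer never appears in the update, which is the sense in which such an encoding is an abbreviated snapshot of the original.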

This encoding provides several key advantages. First, the changes made to the document are easy to visualize, because the encoding reproduces the modified sections of the source and shows the new information in its correct context. The ability to encode complex operations as concise high-level commands further improves readability and keeps the encoding compact. Second, the structure of an XUL document expresses the interdependencies between the operations in the update, allowing a consumer application to infer a correct order of execution and to exploit potential parallelism. Finally, since XUL references only the initial and final states of the document, the encoding is effectively independent of the actual implementation of the update mechanism.
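How a consumer might derive an execution order from the structure can be sketched as follows. The dependency rule used here is an assumption, not something the abstract states: an operation attached to a node waits for every operation in that node's subtree, and operations in disjoint subtrees are treated as independent. The `Tree` type and the `schedule` function are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical update tree: (label of the operation's context node, child subtrees).
Tree = Tuple[str, List["Tree"]]


def schedule(tree: Tree) -> List[List[str]]:
    """Group node labels into batches; a batch depends only on earlier batches.

    Assumed rule: a node is ready once all nodes below it are done, so nodes
    are grouped by their height above the deepest leaf of their own subtree.
    """
    batches: Dict[int, List[str]] = {}

    def height(node: Tree) -> int:
        label, children = node
        # A leaf has height 0; an inner node sits one level above its tallest child.
        h = 1 + max((height(child) for child in children), default=-1)
        batches.setdefault(h, []).append(label)
        return h

    height(tree)
    return [batches[h] for h in sorted(batches)]


tree: Tree = (
    "customers",
    [
        ("customer[1]", [("name", []), ("city", [])]),
        ("customer[2]", []),
    ],
)

plan = schedule(tree)
# plan == [["name", "city", "customer[2]"], ["customer[1]"], ["customers"]]
```

Under this assumed rule, everything inside one batch could be executed in parallel, while the batches themselves run in sequence.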

We have implemented programs both to generate efficiently the XUL representation of the difference between two documents and to apply an XUL update to a source XML document. Our future work turns toward relating this structural description of the differences to the underlying semantics of the document. The goal is to obtain an encoding of the difference between two documents that is meaningful in the semantic framework in which the documents originate. Consider, for instance, an update to an XML customer database. We would like to be able to express not only that a new element was added (which XUL expresses efficiently), but also that this modification implies the opening of a new customer account.