LT PyXML: A Fast Validating XML Parser Embedded in Python
Henry Thompson, University of Edinburgh
ABSTRACT
By the time DevCon happens, the HCRC Language Technology Group will
have put out a major set of new releases of our free (for
non-commercial use) XML API (LT XML), XML editor (XED) and, for the
first time, the bridge between the two (LT PyXML). This presentation
will describe the form the embedding of our LT XML C API into Python
takes, and illustrate its use with at least the following three
applications:
1) XED
   An XML-smart text editor which maintains well-formedness at all
   times, supports fast keyboard-only XML document authoring and, with
   the forthcoming release, makes DTD-compliant authoring fast and easy;
2) XML Schema workbench
   A simple Python tool using LT PyXML to graph the archetype lattice
   implicit in any XML Schema (WD of 6 May) schema, and to output a
   normalised XML DTD as close as possible in coverage to the schema;
3) XML DTD normaliser
   An even simpler Python tool using LT PyXML to normalise XML DTDs for
   comparison with the output of (2).
I'll finish with a comparison between the LT XML API and the DOM,
particularly with reference to access to the DTD and to streaming
(i.e. not whole-tree) access to large documents.
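
To make the streaming versus whole-tree distinction concrete, here is a minimal sketch that uses only the Python standard library (xml.dom.minidom and xml.etree.ElementTree). It does not use the LT XML or LT PyXML API, whose calls are not shown in this abstract. The sample document and both function names are hypothetical, chosen only for illustration.

```python
import io
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Hypothetical sample document, built in memory purely for illustration.
DOC = b"<corpus>" + b"".join(
    b"<item n='%d'>text %d</item>" % (i, i) for i in range(1000)
) + b"</corpus>"


def count_items_tree(data):
    """Whole-tree access: build the complete DOM first, then query it."""
    dom = xml.dom.minidom.parseString(data)
    try:
        return len(dom.getElementsByTagName("item"))
    finally:
        dom.unlink()  # release the tree once we are done with it


def count_items_stream(data):
    """Streaming access: handle each element as it closes, then empty it."""
    count = 0
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "item":
            count += 1
        elem.clear()  # drop the element's text and children to reduce memory
    return count


if __name__ == "__main__":
    print(count_items_tree(DOC), count_items_stream(DOC))
```

Both functions return the same count. The difference is that the first holds the whole document in memory before answering, while the second never needs more than the element currently being closed, which is the property that matters for large documents.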