XML Applications at CERN
Michel Goossens, IT/API
Abstract
Several XML applications (commercial and public domain) are already
installed at CERN. They are used in various areas of electronic data
handling, such as document production, database queries, electronic
data exchange, etc. This article provides an update about the
availability of some XML-related free tools.
XML-related tools are becoming available at a steady pace. I
have evaluated a few of them and installed a couple of interesting
Java-based offerings on the central Unix systems. Please send
suggestions for other packages to add to the author.
Below you will find a list of packages that have been updated
recently, each with a brief description. More
information is available in the documentation. Please note that the
version numbers of the packages can change when a more recent version is
installed. Hence you may have to look in the directory named after the
package, e.g., /usr/local/doc/JAVA/xerces for
Xerces.
- xerces: Free source validating XML parser distributed by the
Apache Project. The Xerces Java Parser 1.1.3 supports the XML 1.0
recommendation and contains advanced parser functionality, such as XML
Schema, DOM Level 2 version 1.0, and SAX version 2, in addition to
supporting the industry-standard DOM Level 1 and SAX version 1 APIs.
Documentation for the current version (presently 1.1.3) is at
http://xml.apache.org/xerces-j/index.html. At CERN we have
installed only this Java version, but work is ongoing on C++ and
Perl versions.
At CERN the possible command parameters are displayed by typing the
xerces command. Examples of using xerces are:
xerces -saxcount -v invitationfr.xml
invitationfr.xml: 2003 ms
(14 elems, 0 attrs, 18 spaces, 384 chars)
xerces -saxcount invitationfr.xml
invitationfr.xml: 1693 ms
(14 elems, 0 attrs, 18 spaces, 384 chars)
The first command, with the -v switch, validates the file while
counting its elements; without that switch the parser merely
checks the file for well-formedness.
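The same kind of count can be obtained from a short Java program. The following is only a minimal sketch: it assumes the standard JAXP and SAX 2 interfaces (javax.xml.parsers and org.xml.sax) rather than the Xerces 1.1.3 classes behind the xerces command, and the class name SaxCountSketch and the inline sample document are hypothetical stand-ins for invitationfr.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class SaxCountSketch {

    // Hypothetical sample document with an internal DTD so that it can be validated.
    static final String SAMPLE =
          "<!DOCTYPE invitation [\n"
        + "<!ELEMENT invitation (par+)>\n"
        + "<!ATTLIST invitation to CDATA #REQUIRED>\n"
        + "<!ELEMENT par (#PCDATA)>\n"
        + "]>\n"
        + "<invitation to=\"Anna\">\n"
        + "  <par>Hello</par>\n"
        + "  <par>World</par>\n"
        + "</invitation>\n";

    /** Tallies elements, attributes, ignorable whitespace and character data. */
    static class Counter extends DefaultHandler {
        long elems, attrs, spaces, chars;

        @Override
        public void startElement(String uri, String local, String qName, Attributes a) {
            elems++;
            attrs += a.getLength();
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            chars += length;
        }

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            spaces += length;
        }

        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e; // report validity errors instead of silently ignoring them
        }
    }

    /** Parses the document; validate=true plays the role of the -v switch. */
    static Counter count(String xml, boolean validate) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validate);
        SAXParser parser = factory.newSAXParser();
        Counter counter = new Counter();
        parser.parse(new InputSource(new StringReader(xml)), counter);
        return counter;
    }

    public static void main(String[] args) throws Exception {
        Counter c = count(SAMPLE, true);
        System.out.println(c.elems + " elems, " + c.attrs + " attrs, "
                + c.spaces + " spaces, " + c.chars + " chars");
    }
}
```

With validate set to false the parser only reports well-formedness errors; with validate set to true a document that violates its DTD makes count throw a SAXParseException.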
- xalan: Free source XSLT processor written in Java and distributed by
the Apache Project. Documentation for the current version is at
http://xml.apache.org/xalan/overview.html; the API is defined
at
http://xml.apache.org/xalan/apidocs/index.html.
The presently installed version is 1.1.
A transformation of an XML file by an XSL stylesheet is obtained as
follows:
xalan invitationfr.xml invlat1fr.xsl a.tex
========= Parsing file:invlat1fr.xsl ==========
Parse of file:invlat1fr.xsl took 3851 milliseconds
========= Parsing invitationfr.xml ==========
Parse of invitationfr.xml took 1132 milliseconds
=============================
Transforming...
transform took 345 milliseconds
XSLProcessor: done
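A transformation of this kind can also be run from within a Java program. The sketch below is an illustration under stated assumptions, not the Xalan 1.1 API documented at the URL above: it uses the generic javax.xml.transform interfaces shipped with current Java distributions, and the class name XsltSketch, the sample document and the small stylesheet (which emits LaTeX-like text, as invlat1fr.xsl presumably does) are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {

    // Hypothetical input document.
    static final String XML =
        "<invitation><to>Anna</to><par>Hello</par><par>World</par></invitation>";

    // Hypothetical stylesheet producing plain text with LaTeX markup.
    static final String XSL =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/invitation\">"
        + "<xsl:text>\\begin{letter}{</xsl:text>"
        + "<xsl:value-of select=\"to\"/>"
        + "<xsl:text>}&#10;</xsl:text>"
        + "<xsl:apply-templates select=\"par\"/>"
        + "<xsl:text>\\end{letter}&#10;</xsl:text>"
        + "</xsl:template>"
        + "<xsl:template match=\"par\">"
        + "<xsl:value-of select=\".\"/>"
        + "<xsl:text>&#10;&#10;</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the document and returns the result as a string. */
    static String transform(String xml, String xsl) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Compile the stylesheet, then run it against the input document.
        Transformer transformer = factory.newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(transform(XML, XSL));
    }
}
```

To work on files rather than strings, the StreamSource and StreamResult objects can be constructed from java.io.File instances instead.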
- saxonxsl: Michael Kay's XSLT processor. Documentation for the
latest version (presently 5.4) is at
http://users.iclway.co.uk/mhkay/saxon/.
A transformation of an XML file by an XSL stylesheet is obtained as
follows:
saxonxsl -t invitationfr.xml invlat1fr.xsl > a.tex
SAXON from Michael Kay of ICL
Version 5.4
Elapsed time: 2235 milliseconds
- oraxsl: Oracle's XSLT processor (oraxml is the
corresponding XML parser). Information about the current version
(presently 2.0.2.9, check for the latest version if needed) is at http://technet.oracle.com/tech/xml/parser_java2/index.htm.
From that page one can find further documentation, in particular
the API definitions for the Java classes.
At CERN sample Java code is available in the directory
/usr/local/doc/JAVA/oraxmlxsl/2.0.2.9/sample/.
A transformation of an XML file by an XSL stylesheet is obtained as
follows:
oraxsl -v invitationfr.xml invlat1fr.xsl a.tex
1 XML document will be transformed using XSLT stylesheet
specified in invlat1fr.xsl with 1 thread
Parsing file invlat1fr.xsl
Parsing file invitationfr.xml
Transforming XML document specified in invitationfr.xml
- fop: Open source converter from XSL formatting objects to PDF,
developed by the Apache Project. Documentation for the current
version (presently 0.13.0) is available at
http://xml.apache.org/fop/index.html. However, most of the
time you will just type a command like the following to obtain a PDF
file from an XML file, using an XSL stylesheet that transforms your XML
elements into XSL formatting objects:
fop invitation.xml invfo1.xsl a.pdf
FOP 0.12.2 [dev]
using SAX parser org.apache.xerces.parsers.SAXParser
using renderer org.apache.fop.render.pdf.PDFRenderer
using element mapping org.apache.fop.fo.StandardElementMapping
using element mapping org.apache.fop.svg.SVGElementMapping
building formatting object tree
setting up fonts
formatting FOs into areas
[1]
rendering areas to PDF
writing out PDF
The PDF file a.pdf can be viewed with acroread or
gf and printed if needed.
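To give an idea of what such a stylesheet produces before the PDF is rendered, here is a small sketch of the first stage only, the transformation of XML elements into formatting objects. Everything in it is an assumption made for illustration: the class name FoSketch, the sample document and the stylesheet are hypothetical (they are not invitation.xml or invfo1.xsl), the formatting-object names follow the XSL 1.0 Recommendation and may differ from what a given FOP release accepts, and the code uses the generic javax.xml.transform and DOM interfaces rather than any FOP class.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FoSketch {

    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    // Hypothetical input document.
    static final String XML =
        "<invitation><par>Hello</par><par>World</par></invitation>";

    // Hypothetical stylesheet: one A4 page master, one fo:block per <par>.
    static final String XSL =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:fo=\"" + FO_NS + "\">"
        + "<xsl:template match=\"/invitation\">"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name=\"page\""
        + " page-height=\"29.7cm\" page-width=\"21cm\" margin=\"2cm\">"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference=\"page\">"
        + "<fo:flow flow-name=\"xsl-region-body\">"
        + "<xsl:apply-templates select=\"par\"/>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "<xsl:template match=\"par\">"
        + "<fo:block font-size=\"12pt\" space-after=\"6pt\">"
        + "<xsl:value-of select=\".\"/>"
        + "</fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms the input document into a formatting-object document. */
    static String toFo(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    /** Re-parses the result with a namespace-aware DOM parser and counts fo:block elements. */
    static int countBlocks(String fo) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(fo)));
        return doc.getElementsByTagNameNS(FO_NS, "block").getLength();
    }

    public static void main(String[] args) throws Exception {
        String fo = toFo(XML);
        System.out.println(fo);
        System.out.println(countBlocks(fo) + " fo:block elements");
    }
}
```

The string returned by toFo is the kind of intermediate document that a formatter such as fop then lays out into pages and writes as PDF.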
- svgtools: A few tools supporting the Scalable Vector
Graphics (SVG) XML language. The command svgview-csiro
opens an SVG editor (you need Java 1.2.2 or higher to run this
program). Samples of SVG sources are (at CERN) in the directory
/usr/local/doc/JAVA/svgtoolkit/20000712/samples and its
subdirectories. Another utility, svg2jpg-csiro, translates
an SVG file into a JPEG image.
- DocBook: The latest DocBook SGML DTD with its associated DSSSL
stylesheets, and the corresponding XML DTD version with its associated
XSL stylesheets, are available below the directories
/usr/local/share/docbooksgml,
/usr/local/share/docbookdsssl,
/usr/local/share/docbookxml, and
/usr/local/share/docbookxsl, respectively.
- TEI: The XML version of the Text Encoding Initiative (TEI) DTD and
its associated XSL stylesheets are available below the directory
/usr/local/share/tei.
A series of four lectures on XML that I gave in the framework of
the Academic Training Program in May 2000 is at
http://home.cern.ch/goossens/xml2000.html.
General XML-related information is available at http://web.cern.ch/xml/
About the author:
Michel Goossens is a CERN authority on LaTeX, XML and Electronic
Document Publishing techniques in general. He has written several
books on the subject.