XStandoff Toolkit

The XStandoff Toolkit is a set of XSLT 2.0 stylesheets for constructing XStandoff instances. Note that these stylesheets do not support every feature of XStandoff, only a subset of the most commonly used ones (for example, there is no support for discontinuous segments). For most tasks this subset should be sufficient.
All XSLT stylesheets are released under the GNU Lesser General Public License (LGPL v3) and can be obtained from the Download section.

The XStandoff Toolkit is a moving target. If you run into any problems, feel free to contact us (maikATxstandoffDOTnet).

Using the XStandoff Toolkit

Warning: The included XSLT stylesheets rely on extensions provided by the Saxon XSLT processor. Either use the latest free version of Saxon-B that supports these extensions (version, download for Java or .NET) or obtain a license for Saxon-PE 9.2 or Saxon-EE 9.2 (or above) from Saxonica (highly recommended). Saxon-HE 9.2 (and above) does not support these extensions!
If you decide to use the schema-aware merge stylesheet (mergeXSF-sa.xsl), you have to use Saxon-EE 9.2 (or above) from Saxonica.

Since September 29, 2010, you have two options for merging XStandoff instances:

  1. either use the non-schema-aware mergeXSF.xsl or
  2. use the schema-aware mergeXSF-sa.xsl.

Follow these steps to create XStandoff instances:

  1. [Optional] If you want to check that your inline annotation file is valid with respect to the provided primary data file, use pd-check.xsl:
    saxon -s:input-1.xml -xsl:pd-check.xsl primary-data=input.txt
    Note: The same check will be executed each time you convert an inline annotation to the corresponding XStandoff instance (see below).
  2. Use inline2XSF.xsl for converting an XML instance containing a single annotation tree into an XStandoff instance containing the standoff representation of this tree:
    saxon -o:output-1.xml -s:input-1.xml -xsl:inline2XSF.xsl primary-data=input.txt
  3. Repeat the step above to create a second XStandoff instance spanning an annotation over the same primary data:
    saxon -o:output-2.xml -s:input-2.xml -xsl:inline2XSF.xsl primary-data=input.txt
    1. Combine the two created XStandoff instances into a single file using the mergeXSF.xsl stylesheet:
      saxon -o:combined-output.xml -s:output-1.xml -xsl:mergeXSF.xsl merge-with=output-2.xml
    2. Alternatively, use the schema-aware mergeXSF-sa.xsl stylesheet:
      saxon -sa -val -o:combined-output.xml -s:output-1.xml -xsl:mergeXSF-sa.xsl merge-with=output-2.xml
  4. [Optional] If you want to remove or extract an annotation level/layer use extractXSFcontent.xsl:
    1. either use the remove-ID parameter to remove the respective level/layer:
      saxon -o:rest-output.xml -s:combined-output.xml -xsl:extractXSFcontent.xsl remove-ID=<ID>
    2. or use the extract-ID parameter to extract the respective level/layer and export it into a new file:
      saxon -o:extracted-output.xml -s:combined-output.xml -xsl:extractXSFcontent.xsl extract-ID=<ID>
  5. [Optional] Convert the XStandoff instance into an inline representation using XSF2inline.xsl:
    saxon -o:combined-output-inline.xml -s:combined-output.xml -xsl:XSF2inline.xsl
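
The conversion and merge steps above can be scripted. The following is a minimal Python sketch, not part of the toolkit: it only assembles the command lines shown in steps 2 and 3 for two inline files. The function name build_pipeline is hypothetical, and the sketch assumes that a saxon wrapper command is on the PATH, as in the commands above.

```python
from typing import List


def build_pipeline(inline_files: List[str], primary_data: str,
                   saxon: str = "saxon") -> List[List[str]]:
    """Assemble the Saxon calls for converting two inline files and merging them.

    Hypothetical helper: it mirrors the command lines from the steps above
    and does not execute anything itself.
    """
    if len(inline_files) != 2:
        raise ValueError("this sketch handles exactly two inline annotation files")

    commands = []
    outputs = []
    for index, inline_file in enumerate(inline_files, start=1):
        output = f"output-{index}.xml"
        # Step 2 and 3: inline annotation -> XStandoff instance
        commands.append([saxon, f"-o:{output}", f"-s:{inline_file}",
                         "-xsl:inline2XSF.xsl", f"primary-data={primary_data}"])
        outputs.append(output)

    # Non-schema-aware merge of the two XStandoff instances
    commands.append([saxon, "-o:combined-output.xml", f"-s:{outputs[0]}",
                     "-xsl:mergeXSF.xsl", f"merge-with={outputs[1]}"])
    return commands


if __name__ == "__main__":
    for command in build_pipeline(["input-1.xml", "input-2.xml"], "input.txt"):
        print(" ".join(command))
```

To actually run the pipeline, each returned list could be passed to subprocess.run(command, check=True), so that a failing conversion stops the run before the merge.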

Keep in mind that all of your inline annotation instances have to be bound to an XML namespace and must be valid according to an XML schema. In addition, providing the primary data (stored in a plain-text .txt file) may help in case of errors (the primary data file is mandatory for the execution of the inline2XSF.xsl stylesheet).
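
To illustrate why the primary data file matters, here is a minimal Python sketch of the underlying idea. It is not the implementation of pd-check.xsl or inline2XSF.xsl. It assumes that an inline annotation is consistent with the primary data when its concatenated text content equals the primary data, and it reports each element as a character span with a zero-based start and an exclusive end (both conventions are assumptions of this sketch). The function names and the example namespace are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def matches_primary_data(inline_xml: str, primary_data: str) -> bool:
    """Return True if the text content of the annotation equals the primary data."""
    root = ET.fromstring(inline_xml)
    return "".join(root.itertext()) == primary_data


def element_spans(inline_xml: str) -> List[Tuple[str, int, int]]:
    """Return (local name, start, end) for every element, in document order."""
    root = ET.fromstring(inline_xml)
    spans: List[Tuple[str, int, int]] = []

    def walk(element: ET.Element, offset: int) -> int:
        position = len(spans)
        spans.append(("", 0, 0))            # placeholder keeps document order
        start = offset
        offset += len(element.text or "")
        for child in element:
            offset = walk(child, offset)
            offset += len(child.tail or "")  # text between sibling elements
        local_name = element.tag.split("}")[-1]  # drop the "{namespace}" prefix
        spans[position] = (local_name, start, offset)
        return offset

    walk(root, 0)
    return spans


if __name__ == "__main__":
    example = '<s xmlns="http://example.org/ns"><w>Drive</w> <w>my</w> <w>car</w></s>'
    print(matches_primary_data(example, "Drive my car"))
    for span in element_spans(example):
        print(span)
```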

Visualizing XStandoff instances (2D)

The XSF2SVG.xsl XSLT 2.0 stylesheet uses an XStandoff instance as input file and generates an SVG instance. This is based on work demonstrated by Wendell Piez during Digital Humanities 2010 and Balisage 2010's Beer & Demo Jam (winning the 1st prize). Start the conversion with the following command:

saxon -o:combined-output.svg -s:combined-output.xml -xsl:XSF2SVG.xsl

An example visualization of the "Drive my car" XStandoff instance can be seen below:

SVG visualisation of the "Drive my car" example

The complete representation (including JavaScript mouseover functionality) can be found here (provided that your browser supports SVG images).
The XSF2SVG.xsl stylesheet supports neither discontinuous nor empty elements and should be considered a demonstration. However, it is quite useful for visualizing smaller XStandoff instances, especially for demonstrating overlapping structures.

For details, see the online documentation of the XStandoff Toolkit.

Visualizing XStandoff instances (3D prototype)

In our Balisage 2011 paper, "Visualization of concurrent markup. From trees to graphs, from 2D to 3D", we demonstrated a prototype for visualizing concurrent hierarchies in 3D space, using the z-axis to separate tree structures that span the very same primary data.

We use the experimental open source (MIT/GPL dual license) X3DOM framework and the X3D royalty-free open standards serialization format in combination with HTML5 (note that the resulting instance is neither valid XHTML nor HTML5 at the moment).
The transformation is accomplished by the following command:

saxon -o:drive_my_car.x3d.html -s:drive_my_car.xsf.xml -xsl:XSF2X3D.xsl

A more detailed description of the prototype can be found in the paper mentioned above.
Below is a screenshot of the prototype rendering the "Drive my car" XStandoff instance:

X3D visualisation of the "Drive my car" example

The complete representation (including JavaScript mouseover functionality and draggable tree structures) can be found here. A recent version of Google Chrome (9+), Apple Safari (5.1+, OS X version only) or Firefox (4+) with JavaScript enabled is required. There is a Flash fallback for Internet Explorer 9 (see the Browser support page at http://www.x3dom.org for further details).

Warning: The X3DOM JS Runtime file x3dom.js and the X3DOM CSS file x3dom.css have to be present in the output directory. You may obtain both files at the Download page at X3DOM.
The stylesheet has been tested with the production version 1.2 of the X3DOM JS Runtime.
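
Because a missing runtime file leaves the generated page without its 3D scene, it can help to check the output directory before opening the result. The following is a minimal Python sketch under that assumption. The function name missing_x3dom_files is hypothetical, and only the two file names mentioned above are checked.

```python
from pathlib import Path
from typing import List

# File names taken from the warning above
REQUIRED_FILES = ("x3dom.js", "x3dom.css")


def missing_x3dom_files(output_dir: str) -> List[str]:
    """Return the required X3DOM files that are not present in output_dir."""
    directory = Path(output_dir)
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]


if __name__ == "__main__":
    missing = missing_x3dom_files(".")
    if missing:
        print("Missing in output directory: " + ", ".join(missing))
    else:
        print("All required X3DOM files are present.")
```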

This stylesheet is not documented in the online documentation of the XStandoff Toolkit yet.

Analyzing XStandoff instances

A small XQuery script, analyzeXSF.xq, is available as a starting point for analyzing an XStandoff instance. It takes a target element (by default, elements on the first layer of the first level) and compares it with elements from other annotation layers with regard to their position in the character stream of the primary data, outputting elements with an identical text span, an identical start or end point, or an overlap.

The script selects the first leaf element (that is, an element without any children) of the first layer as the first target element and compares its character positions with the positions of elements on the same layer and on other layers. The result is a list of elements with identical start and end positions, an identical start position, an identical end position, inclusion, or classic overlap.
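
The relations named above can be stated precisely in terms of character positions. The following Python sketch is an illustration only and is not derived from analyzeXSF.xq. It assumes each element is given as a (start, end) pair with an exclusive end, adds a "disjoint" case for spans that do not touch, and reports a shared start or end point before checking for inclusion. The function name classify is hypothetical.

```python
from typing import Tuple

Span = Tuple[int, int]  # (start, end) character positions, end exclusive (assumption)


def classify(target: Span, other: Span) -> str:
    """Name the positional relation between a target span and another span."""
    t_start, t_end = target
    o_start, o_end = other

    if (t_start, t_end) == (o_start, o_end):
        return "identical span"
    if t_start == o_start:
        return "identical start"
    if t_end == o_end:
        return "identical end"
    if t_end <= o_start or o_end <= t_start:
        return "disjoint"
    # One span lies strictly inside the other
    if (t_start < o_start and o_end < t_end) or (o_start < t_start and t_end < o_end):
        return "inclusion"
    # The spans share characters, but neither contains the other
    return "overlap"


if __name__ == "__main__":
    target = (0, 8)
    for other in [(0, 8), (0, 12), (5, 8), (2, 6), (6, 12), (9, 12)]:
        print(target, other, classify(target, other))
```

The last case, where two elements share characters but neither contains the other, is the one that cannot be expressed in a single well-formed XML tree.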

To use the script, either modify the $doc variable or provide a data directory (the default value) containing the file(s) to process, located on the same directory level as the directory that contains the XQuery script.
The target element (that is, the first element to be processed) can be changed as well by adapting the $target variable. Afterwards, run the script with the following command:

xquery -o:queryout.xml -s:combined-output.xml -q:analyzeXSF.xq

The output file queryout.xml contains the necessary information.

The XQuery script is available under the GNU Lesser General Public License (LGPL v3) at the Download section.