A gist for converting OHCO2 text archives
05 Jun 2015
The abstract OHCO2 model of citable text can be equivalently represented in many formats, but in practical terms, local XML files are a convenient representation for human editors.
This gist from Neel Smith gives you a single command you can use to convert a set of local XML files to equivalent representations as tabular structures in delimited text files, and to an RDF graph expressed in Turtle (TTL) notation.
cts.groovy is a Groovy script: you’ll need Groovy to run it. It resolves and retrieves library dependencies dynamically, so you’ll also need to be connected to the internet. Finally, since it creates new output files, you’ll need to run it in a directory where you have write permission.
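Two of those prerequisites can be checked locally before you start. The following is only a sketch: check_prereqs is a hypothetical helper that is not part of the gist, and it does not test your network connection.

```shell
# Hypothetical pre-flight helper; not part of cts.groovy.
# Usage: check_prereqs [OUTPUT_DIR] [COMMAND]
# Defaults: the current directory and the "groovy" command.
check_prereqs() {
    out_dir=${1:-.}
    cmd=${2:-groovy}
    status=0
    # The script can only run if this command is on the PATH.
    if ! command -v "$cmd" >/dev/null 2>&1; then
        echo "missing command: $cmd"
        status=1
    fi
    # New output files are created in the directory you run from.
    if [ ! -d "$out_dir" ] || [ ! -w "$out_dir" ]; then
        echo "not a writable directory: $out_dir"
        status=1
    fi
    return $status
}
```

Called with no arguments, check_prereqs tests the current directory and the groovy command; it prints one line per problem and returns a non-zero status if it finds any.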
The script takes three pieces of information as input: a TextInventory file documenting your texts and their citation schemes; the root directory of the files where your XML editions are stored; and a Relax NG schema for validating the TextInventory. As the prefatory comment in the script shows, you can run it with
groovy cts.groovy INVENTORYFILE ARCHIVEDIRECTORY SCHEMAFILE
and you’ll get as output a new directory cts-tabulated containing one delimited text file for each text, and a new file cts.ttl expressing your entire archive in a single RDF graph.
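If you run the conversion often, it can help to confirm that the three inputs exist before calling the script, and to list the results afterwards. This is a minimal sketch under stated assumptions: check_cts_inputs and run_cts are hypothetical names, and only the groovy invocation and the two output names come from the description above.

```shell
# Hypothetical wrapper functions; not part of cts.groovy.
# Usage: check_cts_inputs INVENTORYFILE ARCHIVEDIRECTORY SCHEMAFILE
check_cts_inputs() {
    inventory=$1
    archive=$2
    schema=$3
    [ -f "$inventory" ] || { echo "TextInventory file not found: $inventory"; return 1; }
    [ -d "$archive" ] || { echo "archive directory not found: $archive"; return 1; }
    [ -f "$schema" ] || { echo "schema file not found: $schema"; return 1; }
}

# Usage: run_cts INVENTORYFILE ARCHIVEDIRECTORY SCHEMAFILE
# Assumes cts.groovy is in the current directory.
run_cts() {
    check_cts_inputs "$1" "$2" "$3" || return 1
    # Same argument order as the script's own usage line.
    groovy cts.groovy "$1" "$2" "$3" || return 1
    # Outputs described in the post: one delimited file per text, plus the graph.
    ls cts-tabulated
    ls -l cts.ttl
}
```

The input check stops at the first missing item, so a mistyped path is reported before Groovy starts downloading dependencies.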