CiteEXchange

Parse data in the delimited-text CEX format.

The Cite EXchange format (CEX) is a plain-text format for serializing citable scholarly resources. CEX organizes data in one or more blocks, each introduced by a CEX header line. With the CiteEXchange package, you can work with data from CEX sources as labelled Blocks with associated lines of metadata and data, extract data contents by CEX block type, and filter contents by URN.
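As a rough illustration of this block structure, the following sketch splits CEX text into labelled blocks in plain Julia, without using the package. It assumes that header lines begin with `#!` followed by the block label, and that blank lines and lines beginning with `//` are skipped. `SketchBlock` and `sketchblocks` are hypothetical names, not part of CiteEXchange, and the sample text is an abbreviated, hand-assembled excerpt rather than a real file.

```julia
# Hypothetical stand-in for the package's Block type: a label plus its lines.
struct SketchBlock
    label::String
    lines::Vector{String}
end

# Split CEX text into blocks.
# Assumptions: header lines start with "#!"; blank lines and lines
# starting with "//" are ignored; lines before any header are dropped.
function sketchblocks(text::AbstractString)
    result = SketchBlock[]
    for raw in split(text, '\n')
        line = strip(raw)
        if isempty(line) || startswith(line, "//")
            continue
        elseif startswith(line, "#!")
            # Everything after "#!" is taken as the block label.
            push!(result, SketchBlock(String(strip(line[3:end])), String[]))
        elseif !isempty(result)
            push!(last(result).lines, String(line))
        end
    end
    result
end

sample = """
#!ctscatalog
urn|citationScheme|groupName|workTitle|versionLabel|exemplarLabel|online|language

// an illustrative comment line
#!ctsdata
urn:cts:greekLit:tlg5026.burney86.normed:8.73r_1.ref|urn:cts:greekLit:tlg0012.tlg001.burney86:8.title
urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος
"""

sketched = sketchblocks(sample)
```

The package's own parser may handle details this sketch ignores, so treat it only as a picture of how header lines partition a source into blocks.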

Quick introduction

You can use the blocks function to read source data into a Vector of Block objects. This example uses the FileReader type from CitableBase to indicate that we should read from a local file. The file in this example has two blocks, one labelled ctscatalog and one labelled ctsdata.

using CiteEXchange
using CitableBase
blocklist = blocks(f, FileReader)
2-element Vector{Block}:
 Block("ctscatalog", SubString{String}["urn|citationScheme|groupName|workTitle|versionLabel|exemplarLabel|online|language", "urn:cts:greekLit:tlg5026.burney86.hmt:|book, scholion|Scholia to the Iliad|Main scholia to the Iliad of British Library, Burney 86|British Library, Burney 86||true|grc", "urn:cts:greekLit:tlg5026.burney86int.hmt:|book,scholion,section|Scholia to the Iliad|Interior scholia of British Library, Burney 86|British Library, Burney 86||true|grc"])
 Block("ctsdata", SubString{String}["urn:cts:greekLit:tlg5026.burney86.normed:8.73r_1.ref|urn:cts:greekLit:tlg0012.tlg001.burney86:8.title", "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_1.comment|Τὴν ῥαψῳδίαν κῶλον μάχην καλοῦσι: συντέμνει γὰρ τὴν διήγησιν συναχθόμενος τοῖς Ἀχαιοῖς ⁑", "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος"])
Note

The file f in the example above is test/assets/burneyex.cex in this GitHub repository.

Each Block has a label and an array of data lines. You can work directly with the array of blocks:

blocklist[1].label
blocklist[1].lines
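Because each block exposes only a label and its lines, ordinary Julia collection code is enough to select and unpack them. The sketch below is self-contained: named tuples with `label` and `lines` fields stand in for Block objects, and `linesfor` is a hypothetical helper, not a package function. Treating the first ctscatalog line as a row of column names is an assumption based on the example output shown earlier.

```julia
# Named tuples stand in for Block objects; only `label` and `lines` are used.
blocklist = [
    (label = "ctscatalog", lines = [
        "urn|citationScheme|groupName|workTitle|versionLabel|exemplarLabel|online|language",
        "urn:cts:greekLit:tlg5026.burney86.hmt:|book, scholion|Scholia to the Iliad|Main scholia to the Iliad of British Library, Burney 86|British Library, Burney 86||true|grc",
    ]),
    (label = "ctsdata", lines = [
        "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος",
    ]),
]

# Collect the lines of every block carrying a given label.
function linesfor(blocklist, label)
    collected = String[]
    for b in blocklist
        b.label == label && append!(collected, b.lines)
    end
    collected
end

catalog = linesfor(blocklist, "ctscatalog")

# Split on the pipe delimiter; empty columns are kept by default.
columns = split(catalog[1], '|')
fields = split(catalog[2], '|')

# Pair each column name with the corresponding value of the first entry.
record = Dict(zip(columns, fields))
```

With a real Vector of Block objects, the same `b.label` and `b.lines` accesses should apply, since those are the two fields shown in the example.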

In more detail

The CiteEXchange package provides two main functions for working with CEX data:

  • the blocks function parses and filters CEX sources into lists of Blocks
  • the data function parses and filters CEX sources, and extracts only the data lines from the resulting Blocks

They are documented on the following pages.
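To give a sense of what filtering data lines by URN involves, here is a minimal sketch in plain Julia. It is not the package's implementation: `sketchurnfilter` is a hypothetical helper that does a naive string-prefix test on the first pipe-delimited column, whereas real URN matching is likely to follow stricter containment rules.

```julia
# Keep lines whose first column starts with the given URN string.
# This is a naive string-prefix test, not real URN containment logic.
function sketchurnfilter(lines, urnprefix; delimiter = '|')
    filter(lines) do line
        startswith(first(split(line, delimiter)), urnprefix)
    end
end

# Two data lines taken from the ctsdata block in the example above.
datalines = [
    "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_1.ref|urn:cts:greekLit:tlg0012.tlg001.burney86:8.title",
    "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος",
]

# Select everything belonging to scholion 8.73r_1.
selected = sketchurnfilter(datalines, "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_1.")
```

The trailing period in the prefix keeps the naive test from also matching a passage such as 8.73r_10, which is one reason proper URN comparison is preferable to string matching.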