CiteEXchange
Parse data in the delimited-text CEX format.
Cite EXchange format (CEX) is a plain-text format for serializing citable scholarly resources. CEX organizes data in one or more blocks, each defined by a CEX header line. Using the CiteEXchange package, you can work with data from CEX sources as labelled Blocks with associated lines of metadata and data, extract data contents by CEX block type, and filter contents by URN.
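To make the block structure concrete, here is a minimal sketch in plain Julia that does not use the package's own parser. It assumes that a CEX header line has the form #!label and that blank lines can be skipped. The names DemoBlock and demoblocks are hypothetical and are not part of CiteEXchange; the sample lines are taken from the example output shown later on this page.

```julia
# Hypothetical stand-in for the package's Block type: a label plus its lines.
struct DemoBlock
    label::String
    lines::Vector{String}
end

# Split CEX text into labelled blocks.
# Assumption: a header line looks like "#!label"; blank lines are ignored.
function demoblocks(src::AbstractString)
    result = DemoBlock[]
    for ln in split(src, '\n')
        s = strip(ln)
        isempty(s) && continue
        if startswith(s, "#!")
            # A header line opens a new block.
            push!(result, DemoBlock(String(s[3:end]), String[]))
        elseif !isempty(result)
            # Any other line belongs to the most recent block.
            push!(result[end].lines, String(s))
        end
    end
    result
end

cex = """
#!ctscatalog
urn|citationScheme|groupName|workTitle|versionLabel|exemplarLabel|online|language
#!ctsdata
urn:cts:greekLit:tlg5026.burney86.normed:8.73r_1.ref|urn:cts:greekLit:tlg0012.tlg001.burney86:8.title
urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος
"""

demo = demoblocks(cex)
```

Each element of demo pairs a label with the lines that follow its header, which is the same shape as the Block objects described below.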
Quick introduction
You can use the blocks function to read source data into a Vector of Block objects. This example uses the FileReader type from CitableBase to indicate that we should read from a local file. The file in this example has two blocks, one labelled ctscatalog and one labelled ctsdata.
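using CiteEXchange
using CitableBase
# Path to the sample file, relative to the root of the package repository
f = joinpath("test", "assets", "burneyex.cex")
blocklist = blocks(f, FileReader)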
2-element Vector{Block}:
Block("ctscatalog", SubString{String}["urn|citationScheme|groupName|workTitle|versionLabel|exemplarLabel|online|language", "urn:cts:greekLit:tlg5026.burney86.hmt:|book, scholion|Scholia to the Iliad|Main scholia to the Iliad of British Library, Burney 86|British Library, Burney 86||true|grc", "urn:cts:greekLit:tlg5026.burney86int.hmt:|book,scholion,section|Scholia to the Iliad|Interior scholia of British Library, Burney 86|British Library, Burney 86||true|grc"])
Block("ctsdata", SubString{String}["urn:cts:greekLit:tlg5026.burney86.normed:8.73r_1.ref|urn:cts:greekLit:tlg0012.tlg001.burney86:8.title", "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_1.comment|Τὴν ῥαψῳδίαν κῶλον μάχην καλοῦσι: συντέμνει γὰρ τὴν διήγησιν συναχθόμενος τοῖς Ἀχαιοῖς ⁑", "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος"])
The file f in the example above is test/assets/burneyex.cex in this GitHub repository.
Each Block has a label and an array of data lines. You can work directly with the array of blocks:
blocklist[1].label
blocklist[1].lines
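As a sketch of what you might do with these two fields, the following plain-Julia snippet splits pipe-delimited lines into columns. It uses a hypothetical NamedTuple in place of a real Block so that it runs without the package or the test file; the two lines are copied from the ctscatalog block in the output above.

```julia
# Hypothetical stand-in with the same two fields as a Block (label, lines).
catalog = (
    label = "ctscatalog",
    lines = [
        "urn|citationScheme|groupName|workTitle|versionLabel|exemplarLabel|online|language",
        "urn:cts:greekLit:tlg5026.burney86.hmt:|book, scholion|Scholia to the Iliad|Main scholia to the Iliad of British Library, Burney 86|British Library, Burney 86||true|grc",
    ],
)

# In this block the first line names the columns; the rest are records.
header = String.(split(catalog.lines[1], '|'))
records = [String.(split(ln, '|')) for ln in catalog.lines[2:end]]

# Pair each column name with its value in the first record.
firstrecord = Dict(zip(header, records[1]))
println(firstrecord["workTitle"])
```

Empty columns, such as exemplarLabel in this record, are kept as empty strings because split retains empty fields by default.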
In more detail
The CiteEXchange package provides two main functions for working with CEX data:
- the blocks function parses and filters CEX sources into lists of Blocks
- the data function parses and filters CEX sources, and extracts only the data lines from the resulting Blocks
They are documented on the following pages.
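The idea of extracting data lines by block type and filtering them by URN can be sketched in plain Julia as follows. This is not the package's data function, whose actual signature is given in its own documentation. Here demodata is a hypothetical helper, blocks are represented as label => lines pairs, and URN filtering is simplified to a string-prefix test on the first column.

```julia
# Hypothetical stand-in: each block is a label => lines pair.
sample = [
    "ctscatalog" => ["urn|citationScheme|groupName|workTitle|versionLabel|exemplarLabel|online|language"],
    "ctsdata" => [
        "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_1.ref|urn:cts:greekLit:tlg0012.tlg001.burney86:8.title",
        "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος",
    ],
]

# Collect the lines of every block with the given label, then keep only
# lines whose first column starts with `urnprefix` (a simplification of
# real URN matching). An empty prefix keeps every line.
function demodata(blocks, label; urnprefix = "")
    lines = String[]
    for (lbl, blocklines) in blocks
        lbl == label && append!(lines, blocklines)
    end
    filter(ln -> startswith(first(split(ln, '|')), urnprefix), lines)
end

alldata = demodata(sample, "ctsdata")
lemmas = demodata(sample, "ctsdata"; urnprefix = "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2")
```

A plain prefix test is only an approximation: genuine URN comparison has to account for the structure of the URN, which is why the package's own functions are the better choice for real data.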