CiteEXchange
Parse data in the delimited-text CEX format.
Cite EXchange format (CEX) is a plain-text format for serializing citable scholarly resources. CEX organizes data in one or more blocks defined by a CEX header line. Using the CiteEXchange package, you can work with data from CEX sources as labelled Blocks with associated lines of metadata and data, can extract data contents by CEX block type, and can filter contents by URN.
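To make the block structure concrete, here is a minimal sketch in plain Julia that splits CEX text into labelled blocks. It does not use the CiteEXchange package and is not its implementation: the LabelledBlock type and the splitblocks function are hypothetical names introduced only for illustration. It also assumes that a header line begins with #! followed by the block label, and that lines beginning with // are comments.

```julia
# Hypothetical stand-in for illustration; not the package's Block type.
struct LabelledBlock
    label::String
    lines::Vector{String}
end

# Split CEX text into labelled blocks.
# Assumption: header lines start with "#!" and comment lines with "//".
function splitblocks(text::AbstractString)
    result = LabelledBlock[]
    for raw in split(text, '\n')
        ln = strip(raw)
        if isempty(ln) || startswith(ln, "//")
            continue  # skip blank lines and comments
        elseif startswith(ln, "#!")
            # a header line opens a new block named by the text after "#!"
            push!(result, LabelledBlock(String(strip(ln[3:end])), String[]))
        elseif !isempty(result)
            # any other line belongs to the most recently opened block
            push!(result[end].lines, String(ln))
        end
    end
    result
end

sample = """
#!ctscatalog
urn|citationScheme|groupName|workTitle|versionLabel|exemplarLabel|online|language
urn:cts:greekLit:tlg5026.burney86.hmt:|book, scholion|Scholia to the Iliad|Main scholia to the Iliad of British Library, Burney 86|British Library, Burney 86||true|grc

#!ctsdata
urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος
"""

parsed = splitblocks(sample)
println([b.label for b in parsed])
```

Each resulting value pairs a label with the lines that follow its header, which is the same shape as the Block objects described in this documentation.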
Quick introduction
You can use the blocks function to read source data into a Vector of Block objects. This example uses the FileReader type from CitableBase to indicate that we should read from a local file. The file in this example has two blocks, one labelled ctscatalog and one labelled ctsdata.
using CiteEXchange
using CitableBase
blocklist = blocks(f, FileReader)
2-element Vector{Block}:
Block("ctscatalog", SubString{String}["urn|citationScheme|groupName|workTitle|versionLabel|exemplarLabel|online|language", "urn:cts:greekLit:tlg5026.burney86.hmt:|book, scholion|Scholia to the Iliad|Main scholia to the Iliad of British Library, Burney 86|British Library, Burney 86||true|grc", "urn:cts:greekLit:tlg5026.burney86int.hmt:|book,scholion,section|Scholia to the Iliad|Interior scholia of British Library, Burney 86|British Library, Burney 86||true|grc"])
Block("ctsdata", SubString{String}["urn:cts:greekLit:tlg5026.burney86.normed:8.73r_1.ref|urn:cts:greekLit:tlg0012.tlg001.burney86:8.title", "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_1.comment|Τὴν ῥαψῳδίαν κῶλον μάχην καλοῦσι: συντέμνει γὰρ τὴν διήγησιν συναχθόμενος τοῖς Ἀχαιοῖς ⁑", "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος"])
The file f in the example above is test/assets/burneyex.cex in this GitHub repository.
Each Block has a label and an array of data lines. You can work directly with the array of blocks:
blocklist[1].label
blocklist[1].lines
In more detail
The CiteEXchange package provides two main functions for working with CEX data:
- the blocks function parses and filters CEX sources into lists of Blocks
- the data function parses and filters CEX sources, and extracts only the data lines from the resulting Blocks
They are documented on the following pages.
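As a rough illustration of what extracting data lines by block type and filtering by URN involve, the sketch below works on a plain Vector of label => lines pairs rather than on the package's own types. The names parsedblocks, linesfor and urnfilter are hypothetical and are not part of CiteEXchange; the URN filter is a simple string-prefix match on the first pipe-delimited column, which is only an approximation of real URN comparison.

```julia
# Hypothetical in-memory stand-in for parsed blocks: label => data lines.
parsedblocks = [
    "ctscatalog" => [
        "urn|citationScheme|groupName|workTitle|versionLabel|exemplarLabel|online|language",
    ],
    "ctsdata" => [
        "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_1.ref|urn:cts:greekLit:tlg0012.tlg001.burney86:8.title",
        "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος",
    ],
]

# Collect the data lines of every block carrying the given label.
function linesfor(blockpairs, label)
    collected = String[]
    for (lbl, lns) in blockpairs
        lbl == label && append!(collected, lns)
    end
    collected
end

# Keep only lines whose first pipe-delimited column starts with `prefix`.
# Simplification: a string-prefix test, not true URN containment.
function urnfilter(lns, prefix)
    filter(ln -> startswith(first(split(ln, '|')), prefix), lns)
end

ctslines = linesfor(parsedblocks, "ctsdata")
lemmas = urnfilter(ctslines, "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2")
println(length(ctslines))
println(lemmas)
```

Selecting by label first and filtering by URN second keeps the two concerns separate, so either step can be used on its own.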