The data function

The data function selects the data lines for a specified block type, either from a CEX source or from a list of Blocks.

It always returns a (possibly empty) Vector of string values representing CEX data lines.
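
To make the idea of "data lines for a block type" concrete, here is a minimal sketch that does not use CiteEXchange at all. It assumes that a CEX block opens with a `#!` heading naming its type, and that blank lines and lines starting with `//` carry no data. The function name `sketch_data_from_string` and the sample URNs are hypothetical, and the library's own implementation may differ (for example, this sketch does not drop header lines).

```julia
# Hypothetical helper, not part of CiteEXchange: collect the lines that
# follow each "#!<blocktype>" heading in a CEX-formatted string.
function sketch_data_from_string(cex::String, blocktype::String)
    selected = SubString{String}[]
    current = ""                              # type of the block being read
    for ln in split(cex, '\n')
        stripped = strip(ln)
        if startswith(stripped, "#!")
            current = strip(stripped[3:end])  # block type named after "#!"
        elseif isempty(stripped) || startswith(stripped, "//")
            continue                          # assumed: blanks and // comments are not data
        elseif current == blocktype
            push!(selected, stripped)
        end
    end
    return selected
end

# Invented sample source with one ctsdata block.
cex = """
#!cexversion
3.0

#!ctsdata
// a comment line
urn:cts:demo:group.work.ed:1|First passage
urn:cts:demo:group.work.ed:2|Second passage
"""

sketch_data_from_string(cex, "ctsdata")    # the two passage lines
sketch_data_from_string(cex, "citedata")   # empty Vector: no such block
```

As with data, asking for a block type that does not occur gives an empty Vector rather than an error.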

Select data lines from CEX sources

In this example, we work with a CEX source that contains several kinds of CEX blocks, including two ctsdata blocks with passages from two different texts. We can collect all of the text data lines using the same syntax as for the blocks function: pass a string directly, or pass a source together with a reader type (StringReader, FileReader or UrlReader).

using CiteEXchange
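# Assumption: root is the top-level directory of a local copy of the CiteEXchange.jl package.
root = pkgdir(CiteEXchange)
f = joinpath(root, "test", "assets", "burneyex.cex")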
str = read(f, String)
simplelines = data(str, "ctsdata")
3-element Vector{SubString{String}}:
 "urn:cts:greekLit:tlg5026.burney" ⋯ 39 bytes ⋯ "tlg0012.tlg001.burney86:8.title"
 "urn:cts:greekLit:tlg5026.burney" ⋯ 144 bytes ⋯ "ιν συναχθόμενος τοῖς Ἀχαιοῖς ⁑"
 "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος"
using CitableBase: StringReader
stringlines = data(str, StringReader, "ctsdata")
3-element Vector{SubString{String}}:
 "urn:cts:greekLit:tlg5026.burney" ⋯ 39 bytes ⋯ "tlg0012.tlg001.burney86:8.title"
 "urn:cts:greekLit:tlg5026.burney" ⋯ 144 bytes ⋯ "ιν συναχθόμενος τοῖς Ἀχαιοῖς ⁑"
 "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος"
using CitableBase: FileReader
filelines = data(f, FileReader, "ctsdata")
3-element Vector{SubString{String}}:
 "urn:cts:greekLit:tlg5026.burney" ⋯ 39 bytes ⋯ "tlg0012.tlg001.burney86:8.title"
 "urn:cts:greekLit:tlg5026.burney" ⋯ 144 bytes ⋯ "ιν συναχθόμενος τοῖς Ἀχαιοῖς ⁑"
 "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος"
using CitableBase: UrlReader
url = "https://raw.githubusercontent.com/cite-architecture/CiteEXchange.jl/dev/test/assets/laxlibrary1.cex"
urllines = data(url, UrlReader, "ctsdata")
7-element Vector{SubString{String}}:
 "urn:cts:citedemo:gburg.bancroft" ⋯ 153 bytes ⋯ "hat all men are created equal."
 "urn:cts:citedemo:gburg.bancroft" ⋯ 357 bytes ⋯ "proper that we should do this."
 "urn:cts:citedemo:gburg.bancroft" ⋯ 434 bytes ⋯ "ve thus far so nobly advanced."
 "urn:cts:citedemo:gburg.bancroft" ⋯ 420 bytes ⋯ "all not perish from the earth."
 "urn:cts:greekLit:tlg5026.burney" ⋯ 39 bytes ⋯ "tlg0012.tlg001.burney86:8.title"
 "urn:cts:greekLit:tlg5026.burney" ⋯ 144 bytes ⋯ "ιν συναχθόμενος τοῖς Ἀχαιοῖς ⁑"
 "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος"
simplelines == stringlines == filelines == urllines
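false

The comparison is false because urllines was read from a different file, laxlibrary1.cex, which yields seven ctsdata lines. The three collections read from burneyex.cex (simplelines, stringlines and filelines) each show the same three lines.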

Note in particular that citerelationset blocks have three lines of metadata before the relations data. These three lines appear in the lines field of a block, but are not included in the output of data.

relationsurl = "https://raw.githubusercontent.com/cite-architecture/CiteEXchange.jl/dev/test/assets/laxlibrary1.cex"
relblocks = blocks(relationsurl, UrlReader, "citerelationset")
relblocks[1].lines
5-element Vector{SubString{String}}:
 "urn|urn:cite2:hmt:dse.v1:msBil8"
 "label|Collection of DSE records for Iliad 8 in the Venetus B"
 "passage|imageroi|surface"
 "urn:cts:greekLit:tlg0012.tlg001" ⋯ 72 bytes ⋯ "05667|urn:cite2:hmt:msB.v1:103r"
 "urn:cts:greekLit:tlg0012.tlg001" ⋯ 72 bytes ⋯ "03061|urn:cite2:hmt:msB.v1:103r"
data(relationsurl, UrlReader, "citerelationset")
2-element Vector{SubString{String}}:
 "urn:cts:greekLit:tlg0012.tlg001" ⋯ 72 bytes ⋯ "05667|urn:cite2:hmt:msB.v1:103r"
 "urn:cts:greekLit:tlg0012.tlg001" ⋯ 72 bytes ⋯ "03061|urn:cite2:hmt:msB.v1:103r"

Select data lines from a list of Blocks

Instead of a CEX source, you can pass a list of blocks directly. In that case no "reader" type is needed.

blockgroup = blocks(relationsurl, UrlReader)
blocklines = data(blockgroup, "ctsdata")
7-element Vector{SubString{String}}:
 "urn:cts:citedemo:gburg.bancroft" ⋯ 153 bytes ⋯ "hat all men are created equal."
 "urn:cts:citedemo:gburg.bancroft" ⋯ 357 bytes ⋯ "proper that we should do this."
 "urn:cts:citedemo:gburg.bancroft" ⋯ 434 bytes ⋯ "ve thus far so nobly advanced."
 "urn:cts:citedemo:gburg.bancroft" ⋯ 420 bytes ⋯ "all not perish from the earth."
 "urn:cts:greekLit:tlg5026.burney" ⋯ 39 bytes ⋯ "tlg0012.tlg001.burney86:8.title"
 "urn:cts:greekLit:tlg5026.burney" ⋯ 144 bytes ⋯ "ιν συναχθόμενος τοῖς Ἀχαιοῖς ⁑"
 "urn:cts:greekLit:tlg5026.burney86.normed:8.73r_2.lemma|κροκόπεπλος"