case class Corpus(nodes: Vector[CitableNode]) extends LogSupport with Product with Serializable

A corpus of citable texts.

nodes

Contents of the citable corpus

Annotations
@JSExportAll()
Linear Supertypes
Product, Equals, LogSupport, LazyLogger, LoggingMethods, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new Corpus(nodes: Vector[CitableNode])

    Create a new corpus with a vector of CitableNode objects.

    nodes

    Contents of the citable corpus

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. def ++(corpus2: Corpus): Corpus

    Create a new corpus by adding a second corpus to this one.

    corpus2

    second corpus with contents to be added.

  4. def --(corpus2: Corpus): Corpus

    Create a new corpus by subtracting a second corpus from this one.

    corpus2

    second corpus with contents to be removed from this one.
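The two operators above can be pictured with a small stand-in model. The sketch below is illustrative only: `Node` and `MiniCorpus` are hypothetical simplifications of CitableNode and Corpus, URNs are plain strings, and dropping exact duplicates on addition is an assumption rather than something this page states.

```scala
object CorpusAlgebraSketch {
  // Hypothetical stand-ins for CitableNode and Corpus; URNs are plain strings here.
  final case class Node(urn: String, text: String)

  final case class MiniCorpus(nodes: Vector[Node]) {
    // Assumed semantics: append the other corpus's nodes, skipping exact duplicates.
    def ++(other: MiniCorpus): MiniCorpus =
      MiniCorpus((nodes ++ other.nodes).distinct)

    // Assumed semantics: keep only nodes that do not occur in the other corpus.
    def --(other: MiniCorpus): MiniCorpus =
      MiniCorpus(nodes.filterNot(n => other.nodes.contains(n)))
  }

  def main(args: Array[String]): Unit = {
    val a = MiniCorpus(Vector(
      Node("urn:cts:demo:g.w.v:1.1", "alpha"),
      Node("urn:cts:demo:g.w.v:1.2", "beta")))
    val b = MiniCorpus(Vector(
      Node("urn:cts:demo:g.w.v:1.2", "beta"),
      Node("urn:cts:demo:g.w.v:1.3", "gamma")))
    println((a ++ b).nodes.map(_.urn))
    println((a -- b).nodes.map(_.urn))
  }
}
```

Both operations return a new value and leave the operands unchanged, which matches the "create a new corpus" wording of the two entries.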

  5. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  6. def >=(urn: CtsUrn): Corpus

    Create a new corpus of nodes that are contained by a given URN.

    urn

    CtsUrn to use in filtering the corpus.

  7. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  8. def cex(delimiter: String = "#"): String

    Two-column serialization of this Corpus, formatted for CEX.

    delimiter

    String value to separate two columns.

  9. def chunkByCitation(drop: Int = 1): Vector[Corpus]

    Split a Corpus into a Vector[Corpus] by citation (will first chunk by text).

    drop

    How many levels of the passage-hierarchy, from the right, to drop when grouping
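As a rough illustration of what dropping levels "from the right" means, the sketch below groups nodes by their URN with the last `drop` passage levels removed. Everything in it is hypothetical: `Node` stands in for CitableNode, URNs are plain strings, and grouping only consecutive nodes is an assumption about the intended behavior.

```scala
object ChunkSketch {
  // Hypothetical stand-in for CitableNode; URNs are plain strings here.
  final case class Node(urn: String, text: String)

  // Passage component: everything after the last colon, e.g. "1.2".
  def passage(urn: String): String = urn.substring(urn.lastIndexOf(':') + 1)

  // Grouping key: the text part of the URN plus the passage reference
  // with the `drop` rightmost levels removed.
  def key(urn: String, drop: Int): String = {
    val text = urn.substring(0, urn.lastIndexOf(':'))
    text + ":" + passage(urn).split('.').dropRight(drop).mkString(".")
  }

  // Group consecutive nodes that share a key, preserving document order.
  def chunk(nodes: Vector[Node], drop: Int = 1): Vector[Vector[Node]] =
    nodes.foldLeft(Vector.empty[Vector[Node]]) { (acc, n) =>
      acc.lastOption match {
        case Some(last) if key(last.head.urn, drop) == key(n.urn, drop) =>
          acc.init :+ (last :+ n)
        case _ =>
          acc :+ Vector(n)
      }
    }

  def main(args: Array[String]): Unit = {
    val nodes = Vector(
      Node("urn:cts:demo:g.w.v:1.1", "a"),
      Node("urn:cts:demo:g.w.v:1.2", "b"),
      Node("urn:cts:demo:g.w.v:2.1", "c"))
    println(chunk(nodes).map(_.map(_.urn)))
  }
}
```

With book.line citations and `drop = 1`, nodes are grouped by book; because the text part of the URN is included in the key, nodes from different texts never share a chunk.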

  10. def chunkByText: Vector[Corpus]

    Split a Corpus into a Vector[Corpus] by distinct text (versions and exemplars).

  11. def citedWorks: Vector[CtsUrn]

    List all versions or exemplars cited in a corpus.

  12. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  13. def compressReff(urns: Vector[CtsUrn]): Vector[CtsUrn]

    Compress a Vector[CtsUrn] so that any sequence of URNs that can be expressed as a range is expressed as a range.

    urns

    Vector of URNs to compress.

  14. def concrete(urn: CtsUrn): Set[CtsUrn]

    Find the set of all concrete texts for a given URN.

    urn

    URN to find concrete texts for.

  15. def concreteMap: Map[CtsUrn, Corpus]

    Map each concrete text's URN to a Corpus of that text's CitableNodes.

  16. def containedNodes(u: CtsUrn): Corpus

    Create a new corpus comprising nodes contained by a given URN.

    u

    A CtsUrn at either version or exemplar level.

  17. def contents: Vector[String]

    Project text contents of the corpus to a vector of Strings.

  18. macro def debug(message: Any, cause: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingMethods
  19. macro def debug(message: Any): Unit
    Attributes
    protected
    Definition Classes
    LoggingMethods
  20. val dupes: Iterable[CtsUrn]

    Erroneously duplicated URN values.

  21. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  22. macro def error(message: Any, cause: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingMethods
  23. macro def error(message: Any): Unit
    Attributes
    protected
    Definition Classes
    LoggingMethods
  24. def exemplarToVersion(newVersionId: String): Corpus

    Creates a new corpus by reducing exemplar-level URNs to version-level URNs. Order of exemplar-level nodes is maintained in the flattened, version-level corpus.

    newVersionId

    Value for version identifier of newly generated version.

  25. def exemplars(urn: CtsUrn): Set[CtsUrn]

    Find the set of exemplars in the present corpus matching a given URN.

    urn

    URN to find exemplars for.

  26. def find(v: Vector[String]): Corpus

    Create a new corpus containing citable nodes with content matching all of a list of strings. This is equivalent to successively filtering from a given corpus for nodes matching each string. E.g., corpus.find(Vector(s1, s2)) is equivalent to corpus.find(s1).find(s2).

    v

    Strings to search for.
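The equivalence described above amounts to a left fold over the search strings. The following is a minimal sketch under stated assumptions: `Node` is a hypothetical stand-in for CitableNode, and "matching" is taken to mean plain substring containment.

```scala
object FindAllSketch {
  // Hypothetical stand-in for CitableNode; URNs are plain strings here.
  final case class Node(urn: String, text: String)

  // Keep nodes whose text contains the given string (assumed meaning of "matching").
  def find(nodes: Vector[Node], s: String): Vector[Node] =
    nodes.filter(_.text.contains(s))

  // Filter successively by each string: find(s1), then find(s2) on the result, and so on.
  def findAll(nodes: Vector[Node], strings: Vector[String]): Vector[Node] =
    strings.foldLeft(nodes)((remaining, s) => find(remaining, s))

  def main(args: Array[String]): Unit = {
    val nodes = Vector(
      Node("urn:cts:demo:g.w.v:1.1", "sing goddess the wrath"),
      Node("urn:cts:demo:g.w.v:1.2", "the wrath of Achilles"),
      Node("urn:cts:demo:g.w.v:1.3", "son of Peleus"))
    println(findAll(nodes, Vector("wrath", "Achilles")).map(_.urn))
  }
}
```

An empty list of strings leaves the input unchanged, since the fold never applies a filter.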

  27. def find(v: Vector[String], currentCorpus: Corpus): Corpus

    Create a new corpus containing citable nodes with content matching all strings in a given list by recursively finding matches for the first string in the list.

    v

    Strings to search for.

    currentCorpus

    Corpus to search in.

  28. def find(str: String): Corpus

    Create a new corpus containing citable nodes with content matching a given string.

    str

    String to search for.

    returns

    A Corpus object.

  29. def findToken(t: String, omitPunctuation: Boolean = true): Corpus

    Create a new corpus containing citable nodes with content matching a whitespace-delimited token. Optionally, ignore punctuation characters.

    omitPunctuation

    True if punctuation should be ignored.

  30. def findTokens(v: Vector[String], currentCorpus: Corpus, omitPunctuation: Boolean = true): Corpus

    Create a new corpus with nodes containing all tokens in a given list by recursively finding matches for the first token in the list. Optionally omit or include punctuation in token definition.

    v

    Tokens to search for.

    currentCorpus

    Corpus to search in.

    omitPunctuation

    True if punctuation should be omitted from tokens.

  31. def findTokensWithin(v: Vector[String], distance: Int, omitPunctuation: Boolean = true): Corpus

    Create a new corpus containing citable nodes with content matching all of a list of whitespace-delimited tokens within a given number of words of each other.

    v

    Vector of tokens.

    distance

    Maximum size of the window of consecutive tokens within which all tokens in v must fall.
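One way to read the `distance` parameter is as a sliding-window test over a node's tokens. The sketch below is an illustration under that assumption; the helper names are hypothetical, and punctuation stripping uses Java's `\p{Punct}` class, which covers ASCII punctuation only.

```scala
object TokensWithinSketch {
  // Split on whitespace; optionally strip ASCII punctuation from each token.
  def tokenize(text: String, omitPunctuation: Boolean = true): Vector[String] = {
    val raw = text.trim.split("\\s+").toVector.filter(_.nonEmpty)
    if (omitPunctuation) raw.map(_.replaceAll("\\p{Punct}", "")).filter(_.nonEmpty)
    else raw
  }

  // True if some window of `distance` consecutive tokens contains every wanted token.
  def within(text: String, wanted: Vector[String], distance: Int,
             omitPunctuation: Boolean = true): Boolean = {
    val tokens = tokenize(text, omitPunctuation)
    distance > 0 &&
      tokens.sliding(distance).exists(window => wanted.forall(w => window.contains(w)))
  }

  def main(args: Array[String]): Unit = {
    val line = "Sing, goddess, the wrath of Achilles"
    println(within(line, Vector("Sing", "wrath"), 4))
    println(within(line, Vector("Sing", "wrath"), 3))
  }
}
```

A corpus-level search would then keep only the nodes whose text passes this test.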

  32. def findWhiteSpaceTokens(v: Vector[String]): Corpus

    Create a new corpus containing citable nodes with content matching all of a list of whitespace-delimited tokens. This is equivalent to successively filtering from a given corpus for nodes matching each token. E.g., searching for Vector(s1, s2) is equivalent to searching for s1 and then searching the result for s2.

    v

    Strings to search for.

  33. def findWordTokens(v: Vector[String]): Corpus

    Create a new corpus containing citable nodes with content matching all of a list of whitespace-delimited tokens, ignoring punctuation ("word" tokens). This is equivalent to successively filtering from a given corpus for nodes matching each token. E.g., searching for Vector(s1, s2) is equivalent to searching for s1 and then searching the result for s2.

    v

    Strings to search for.

  34. def first: CitableNode

    Find the first citable node in the corpus. It is an exception if the corpus does not include at least one citable node.

  35. def firstNode(filterUrn: CtsUrn): CitableNode

    Find first citable node in a passage. It is an exception if the passage does not include at least one citable node.

    filterUrn

    URN identifying the passage.

  36. def firstNodeIndex(urn: CtsUrn): Option[Int]

    Find index in this corpus of a URN's first node. If urn is a leaf node, it's simply the index of the node, but for a containing node, it's the first contained leaf node.

    urn

    First node of a range.
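The leaf-versus-container distinction can be sketched with plain strings. This is a simplified illustration, not the library's matching logic: URNs are modeled as strings, and "contained" is approximated as "extends the given URN at a `.` boundary". The function names are hypothetical; the second one mirrors the behavior described for lastNodeIndex.

```scala
object NodeIndexSketch {
  // Crude containment test: same URN, or a deeper citation under it.
  def matches(nodeUrn: String, urn: String): Boolean =
    nodeUrn == urn || nodeUrn.startsWith(urn + ".")

  // Index of the node itself (leaf) or of the first contained leaf (container).
  def firstIndex(urns: Vector[String], urn: String): Option[Int] = {
    val i = urns.indexWhere(u => matches(u, urn))
    if (i >= 0) Some(i) else None
  }

  // Index of the node itself (leaf) or of the last contained leaf (container).
  def lastIndex(urns: Vector[String], urn: String): Option[Int] = {
    val i = urns.lastIndexWhere(u => matches(u, urn))
    if (i >= 0) Some(i) else None
  }

  def main(args: Array[String]): Unit = {
    val urns = Vector(
      "urn:cts:demo:g.w.v:1.1",
      "urn:cts:demo:g.w.v:1.2",
      "urn:cts:demo:g.w.v:2.1")
    println(firstIndex(urns, "urn:cts:demo:g.w.v:1"))
    println(lastIndex(urns, "urn:cts:demo:g.w.v:1"))
  }
}
```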

  37. def firstNodeOption(filterUrn: CtsUrn): Option[CitableNode]

    Find first citable node in a passage. Option is None if no citable nodes are found for the requested passage.

    filterUrn

    URN identifying the passage.

  38. def flattenTriple(v: Vector[(String, CitableNode, Int)], newVersion: String): (Int, CitableNode)

    Pairs a CitableNode with a sequential index number for that node.

    v

    Vector of triples, comprised of passage identifier (a String value), a citable node, and a sequence number within the passage node.

    newVersion

    Version identifier for the new node.

  39. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  40. macro def info(message: Any, cause: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingMethods
  41. macro def info(message: Any): Unit
    Attributes
    protected
    Definition Classes
    LoggingMethods
  42. def isEmpty: Boolean

    True if citable nodes vector is empty.

  43. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  44. def last: CitableNode

    Find the last citable node in the corpus. It is an exception if the corpus does not include at least one citable node.

  45. def lastNode(filterUrn: CtsUrn): CitableNode

    Find the last citable node in a passage. It is an exception if the passage does not include at least one citable node.

    filterUrn

    URN identifying the passage.

  46. def lastNodeIndex(urn: CtsUrn): Option[Int]

    Find index in this corpus of a URN's last node. If urn is a leaf node, it's simply the index of the node, but for a containing node, it's the last contained leaf node.

    urn

    Last node of a range.

  47. def lastNodeOption(filterUrn: CtsUrn): Option[CitableNode]

    Find the last citable node in a passage. Option is None if no citable nodes are found for the requested passage.

    filterUrn

    URN identifying the passage.

  48. macro def logAt(logLevel: LogLevel, message: Any): Unit
    Attributes
    protected
    Definition Classes
    LoggingMethods
  49. lazy val logger: Logger
    Attributes
    protected[this]
    Definition Classes
    LazyLogger
  50. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  51. def next(filterUrn: CtsUrn): Vector[CitableNode]

    Find nodes following a passage. The number of nodes will equal the number of nodes in the passage unless fewer than that number of nodes follow the passage. In that case, all following nodes will be returned. If no nodes follow the passage, an empty vector is returned.

    filterUrn

    Passage to find nodes after.
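The sizing rule described above (as many nodes as the passage holds, or fewer when the corpus runs out) can be sketched with index arithmetic. The functions below are hypothetical helpers that take the inclusive start and end indices of the passage directly; resolving a URN to those indices is left out. The mirror-image `prev` follows the same rule in the other direction.

```scala
object NeighborSketch {
  // Nodes following the passage at [start, end]: up to as many as the passage holds.
  def next[A](nodes: Vector[A], start: Int, end: Int): Vector[A] = {
    val size = end - start + 1
    nodes.slice(end + 1, end + 1 + size) // slice clamps at the end of the vector
  }

  // Nodes preceding the passage at [start, end], under the same sizing rule.
  def prev[A](nodes: Vector[A], start: Int, end: Int): Vector[A] = {
    val size = end - start + 1
    nodes.slice(math.max(0, start - size), start)
  }

  def main(args: Array[String]): Unit = {
    val nodes = Vector("a", "b", "c", "d", "e")
    println(next(nodes, 1, 2))
    println(prev(nodes, 1, 2))
  }
}
```

Because `slice` clamps its bounds, a passage at the very end or start of the corpus yields an empty vector rather than an error.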

  52. def nextUrn(filterUrn: CtsUrn): Option[CtsUrn]

    Find URN for nodes following a passage.

    filterUrn

    Passage to find nodes after.

  53. def ngramHisto(str: String, n: Int, threshhold: Int, dropPunctuation: Boolean): StringHistogram

    Create a histogram of ngrams of size n, occurring more than threshold times, and including a specified string.

    str

    String that must be part of indexed ngram.

    n

    size of ngram desired

    threshhold

    only include ngrams that occur more than threshhold times. (A value of 0 therefore collects all ngrams of the given size.)

    dropPunctuation

    true if punctuation should be omitted from ngrams

    returns

    a vector of word+count pairs sorted from high to low

  54. def ngramHisto(n: Int, threshhold: Int = 0, dropPunctuation: Boolean = true): StringHistogram

    Create a histogram of ngrams of size n, occurring more than threshold times.

    n

    size of ngram desired

    threshhold

    only include ngrams that occur more than threshhold times. (Default value of 0 therefore collects all ngrams of the given size.)

    dropPunctuation

    true if punctuation should be omitted from ngrams

    returns

    a vector of word+count pairs sorted from high to low
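The counting step behind both ngramHisto overloads can be sketched with standard collections. This is an illustration under stated assumptions: the result is modeled as a plain `Vector[(String, Int)]` rather than a StringHistogram, ngrams are assumed not to cross passage boundaries, and punctuation stripping covers ASCII only.

```scala
object NGramSketch {
  // Split on whitespace; optionally strip ASCII punctuation from each token.
  def tokenize(text: String, dropPunctuation: Boolean): Vector[String] = {
    val raw = text.trim.split("\\s+").toVector.filter(_.nonEmpty)
    if (dropPunctuation) raw.map(_.replaceAll("\\p{Punct}", "")).filter(_.nonEmpty)
    else raw
  }

  // Count ngrams of size n (n >= 1) within each passage, keep those occurring
  // more than `threshhold` times, and sort by count from high to low.
  def histo(passages: Vector[String], n: Int, threshhold: Int = 0,
            dropPunctuation: Boolean = true): Vector[(String, Int)] = {
    val grams = passages.flatMap { p =>
      val tokens = tokenize(p, dropPunctuation)
      if (tokens.size < n) Vector.empty[String]
      else tokens.sliding(n).map(_.mkString(" ")).toVector
    }
    grams.groupBy(identity)
      .map { case (gram, occurrences) => (gram, occurrences.size) }
      .filter { case (_, count) => count > threshhold }
      .toVector
      .sortBy { case (gram, count) => (-count, gram) } // ties broken alphabetically
  }

  def main(args: Array[String]): Unit = {
    println(histo(Vector("the cat, the cat", "the dog"), 2))
  }
}
```

The overload that takes `str` could be sketched by additionally keeping only ngrams that contain the given string.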

  55. val nodes: Vector[CitableNode]
  56. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  57. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  58. def passageVersions(urn: CtsUrn): Vector[CtsUrn]

    Find all versions of a given CtsUrn in this corpus.

    urn

    URN to find versions for

  59. def passagesToWords(skipPunct: Boolean = true): Vector[Vector[String]]

    Convert strings to vectors of words, tokenizing on whitespace. Optionally, omit punctuation characters from the result.

    skipPunct

    true if punctuation should be omitted.

  60. def pointIndex(urn: CtsUrn): Int

    Find index in nodes of a given CtsUrn.

  61. def prev(filterUrn: CtsUrn): Vector[CitableNode]

    Find nodes preceding a passage. The number of nodes will equal the number of nodes in the passage unless fewer than that number of nodes precede the passage. In that case, all preceding nodes will be returned. If no nodes precede the passage, an empty vector is returned.

    filterUrn

    passage to find nodes before

  62. def prevUrn(filterUrn: CtsUrn): Option[CtsUrn]

    Find URN for nodes preceding a passage.

    filterUrn

    Passage to find nodes before.

  63. def rangeExtract(urn: CtsUrn): Corpus

    Create a new corpus from a single URN identifying a range. The given URN must refer to a concrete text.

    urn

    Range URN identifying corpus to extract.

  64. def rangeIndex(urn: CtsUrn): RangeIndex

    Find beginning and end index in this corpus of a given range URN. Beginning and end references of ranges may either be node references or containing references.

  65. def relation(u1: CtsUrn, u2: CtsUrn): TextPassageTopology.Value

    Computes topological relation of passage components of two CtsUrns.

    u1

    First CtsUrn to compare.

    u2

    Second CtsUrn to compare.

  66. def size: Int

    Number of citable nodes in the corpus.

  67. def sortPassages(passages: Iterable[CtsUrn]): Vector[CtsUrn]

    Given an Iterable[CtsUrn] return a Vector[CtsUrn] sorted by document order according to the order in the Corpus. If any URNs in the parameter Iterable are range-URNs, this expands them to leaf-nodes before sorting.
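Sorting by document order reduces to looking up each URN's position in the corpus. The sketch below shows only that step, with URNs as plain strings; the range expansion mentioned above is omitted, and silently dropping URNs that do not occur in the corpus is an assumption. The function name is hypothetical.

```scala
object SortPassagesSketch {
  // Order the given URNs by their position in the corpus's own URN sequence.
  def sortByDocumentOrder(corpusUrns: Vector[String],
                          passages: Iterable[String]): Vector[String] = {
    val position: Map[String, Int] = corpusUrns.zipWithIndex.toMap
    passages.toVector
      .filter(p => position.contains(p)) // assumption: ignore URNs not in the corpus
      .sortBy(p => position(p))
  }

  def main(args: Array[String]): Unit = {
    val corpusUrns = Vector(
      "urn:cts:demo:g.w.v:1.1",
      "urn:cts:demo:g.w.v:1.2",
      "urn:cts:demo:g.w.v:2.1")
    println(sortByDocumentOrder(corpusUrns,
      List("urn:cts:demo:g.w.v:2.1", "urn:cts:demo:g.w.v:1.1")))
  }
}
```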

  68. def sumCorpora(corpora: Vector[Corpus], sumCorpus: Corpus): Corpus

    Create a single Corpus by summing up the contents of a vector of corpora.

    corpora

    Corpus instances to concatenate.

  69. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  70. def textContents(filter: String, connector: String = "\n"): String

    Format text contents of passages matching a given string as a single string.

    connector

    String value separating citable nodes in the resulting string.

  71. def to2colString(delimiter: String): String

    Represent the Corpus in two-column delimited-text format.

    delimiter

    String value used to separate URN strings from text contents.
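A two-column layout of this kind is one line per node, with the URN and the text joined by the delimiter. The sketch below assumes exactly that and nothing more (no header line, no escaping of delimiter characters inside the text); `Node` is a hypothetical stand-in for CitableNode.

```scala
object TwoColumnSketch {
  // Hypothetical stand-in for CitableNode; URNs are plain strings here.
  final case class Node(urn: String, text: String)

  // One line per node: URN, delimiter, text.
  def twoColumn(nodes: Vector[Node], delimiter: String): String =
    nodes.map(n => n.urn + delimiter + n.text).mkString("\n")

  def main(args: Array[String]): Unit = {
    val nodes = Vector(
      Node("urn:cts:demo:g.w.v:1.1", "alpha"),
      Node("urn:cts:demo:g.w.v:1.2", "beta"))
    println(twoColumn(nodes, "#"))
  }
}
```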

  72. def to82xfString(delimiter: String): String

    Represent the Corpus in 82XF format.

    delimiter

    String value to use as a column separator.

  73. def to82xfVector: Vector[XfRow]

    Create a vector of edu.holycross.shot.ohco2.XfRow instances equivalent to the present corpus.

  74. macro def trace(message: Any, cause: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingMethods
  75. macro def trace(message: Any): Unit
    Attributes
    protected
    Definition Classes
    LoggingMethods
  76. def urns: Vector[CtsUrn]

    Project all URNs in the corpus to a vector.

  77. def urnsForNGram(gram: String, threshhold: Int = 2, dropPunctuation: Boolean = true): Vector[CtsUrn]

    Find passages, identified by URN, where a given ngram occurs. The value of n is derived from the number of whitespace-delimited tokens in gram.

    gram

    The desired ngram, with white space separating tokens.

    dropPunctuation

    True if punctuation should be omitted.

  78. def validReff(urn: CtsUrn): Vector[CtsUrn]

    Extract all URNs for all citable nodes identified by a given URN. Note that it is not an error if the resulting Vector is empty.

    urn

    URN identifying passage for which to find node URNs.

  79. def validReff2(filterUrn: CtsUrn): Vector[CtsUrn]
  80. def versions(urn: CtsUrn): Set[CtsUrn]

    Find the set of versions in the present corpus matching a given URN.

    urn

    URN to find versions for.

  81. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  82. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  83. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  84. macro def warn(message: Any, cause: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingMethods
  85. macro def warn(message: Any): Unit
    Attributes
    protected
    Definition Classes
    LoggingMethods
  86. def ~=(filterUrn: CtsUrn): Corpus

    Create a new corpus of nodes that are URN-similar to a given CtsUrn, limited to a given Version or Exemplar. Collect all texts where this URN is cited, then collect citable nodes for the cited version. Note that chaining these filters therefore successively filters the corpus and can be thought of as filtering by logically ANDing the URNs.

    filterUrn

    URN identifying a set of nodes to select from this corpus.

  87. def ~~(urnV: Vector[CtsUrn], resultCorpus: Corpus): Corpus

    Recursively add to a given corpus all nodes in the present corpus that are URN-similar to the first URN in a given vector of URNs. When all nodes in the vector have been applied, the result is the final accumulation of all added nodes.

    urnV

    vector of URNs to use in filtering the corpus.

  88. def ~~(urnV: Vector[CtsUrn]): Corpus

    Create a new corpus of nodes that are URN-similar to any CtsUrn in a given vector of CtsUrns. Note that this can be thought of as filtering by logically ORing the CtsUrns in the Vector.

    urnV

    vector of URNs to use in filtering the corpus.

  89. def ~~(filterUrn: CtsUrn): Corpus

    Create a new corpus of nodes that are URN-similar to a given CtsUrn. Collect all texts where this URN is cited, then collect citable nodes for the cited version. Note that chaining these filters therefore successively filters the corpus and can be thought of as filtering by logically ANDing the URNs.

    filterUrn

    URN identifying a set of nodes to select from this corpus.
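The AND/OR reading of the three ~~ variants can be illustrated with a deliberately crude model. In the sketch below, URNs are plain strings and "URN-similar" is approximated as "equal to the filter, or extending it at a `.` or `:` boundary"; real CTS URN similarity is richer than this. All names are hypothetical.

```scala
object UrnFilterSketch {
  // Hypothetical stand-in for CitableNode; URNs are plain strings here.
  final case class Node(urn: String, text: String)

  // Crude stand-in for URN similarity (an assumption, not the CTS definition).
  def similar(nodeUrn: String, filterUrn: String): Boolean =
    nodeUrn == filterUrn ||
      nodeUrn.startsWith(filterUrn + ".") ||
      nodeUrn.startsWith(filterUrn + ":")

  // Single-URN filter: chaining calls narrows the result (logical AND).
  def filterOne(nodes: Vector[Node], filterUrn: String): Vector[Node] =
    nodes.filter(n => similar(n.urn, filterUrn))

  // Vector-of-URNs filter: a node is kept if any URN matches (logical OR).
  def filterAny(nodes: Vector[Node], filterUrns: Vector[String]): Vector[Node] =
    nodes.filter(n => filterUrns.exists(f => similar(n.urn, f)))

  def main(args: Array[String]): Unit = {
    val nodes = Vector(
      Node("urn:cts:demo:g.w.v:1.1", "a"),
      Node("urn:cts:demo:g.w.v:1.2", "b"),
      Node("urn:cts:demo:g.w.v:2.1", "c"))
    val anded = filterOne(filterOne(nodes, "urn:cts:demo:g.w.v:1"), "urn:cts:demo:g.w.v:1.2")
    val ored = filterAny(nodes, Vector("urn:cts:demo:g.w.v:1.1", "urn:cts:demo:g.w.v:2.1"))
    println(anded.map(_.urn))
    println(ored.map(_.urn))
  }
}
```

Filtering the original nodes in a single pass keeps the OR result in document order, whatever the order of the filter URNs.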

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated @deprecated
    Deprecated

    (Since version ) see corresponding Javadoc for more information.
