A CITE library for the JVM, version 0.96.0: CTS URNs

Constructing CTS URN objects

The CTS URN specification defines the technology-independent representation of a CTS URN as a string of characters. You can construct an object representation of a CTS URN from any string that conforms to that specification. Trying to construct a CTS URN from a String that does not conform to the specification throws an Exception.

Examples

Less obvious examples

Invalid index on subreference: substrings are indexed with positive integers. The string urn:cts:greekLit:tlg0012.tlg001.msA:1.1@Μῆνιν[0] is therefore invalid. Passing it to a constructor generates an Exception.

Wrong number of components: the colon-delimited syntax of a CTS URN always identifies four components after the required prefix urn: (the protocol string cts, the CTS namespace, the work component and the passage component), even if the final passage component is empty. The string urn:cts:greekLit:tlg0012.tlg001.msA includes only three of these components, so trying to make a CTS URN from it generates an Exception. The string urn:cts:greekLit:tlg0012.tlg001.msA: has four components, even though the final one is empty, so it is possible to construct a CTS URN object from it.
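The two rules above can be illustrated with a minimal Java sketch. It does not use the CITE library: the class CtsUrnSyntaxSketch and all of its methods are hypothetical names introduced only for this illustration, and the choice of IllegalArgumentException is an assumption (the text above says only that an Exception is generated). The sketch also assumes that an index may not have leading zeros, which the text does not state.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of the CITE library. It only mirrors the
// two syntax rules described in the surrounding text.
class CtsUrnSyntaxSketch {

    // Captures whatever sits between square brackets, e.g. "0" in "[0]".
    private static final Pattern INDEX = Pattern.compile("\\[([^\\]]*)\\]");

    /** Counts the colon-delimited components that follow the "urn:" prefix. */
    static int componentCount(String urn) {
        if (!urn.startsWith("urn:")) {
            throw new IllegalArgumentException("Not a URN: " + urn);
        }
        // The limit -1 keeps trailing empty strings, so an empty passage
        // component is still counted. A plain split(":") would drop it.
        return urn.substring("urn:".length()).split(":", -1).length;
    }

    /** True if every bracketed subreference index is a positive integer. */
    static boolean hasValidSubreferenceIndexes(String urn) {
        int at = urn.indexOf('@');
        if (at < 0) {
            return true; // no subreference, so nothing to check
        }
        Matcher m = INDEX.matcher(urn.substring(at + 1));
        while (m.find()) {
            // Assumption of this sketch: no leading zeros are allowed.
            if (!m.group(1).matches("[1-9][0-9]*")) {
                return false;
            }
        }
        return true;
    }

    /** Throws if either rule is violated (exception type is an assumption). */
    static void requireValid(String urn) {
        if (componentCount(urn) != 4) {
            throw new IllegalArgumentException("Expected 4 components after urn: in " + urn);
        }
        if (!hasValidSubreferenceIndexes(urn)) {
            throw new IllegalArgumentException("Subreference index must be positive in " + urn);
        }
    }

    public static void main(String[] args) {
        String work = "urn:cts:greekLit:tlg0012.tlg001.msA";
        // The Greek word "Menin" is written with Unicode escapes so the
        // source file stays ASCII-only.
        String zeroIndex = work + ":1.1@\u039C\u1FC6\u03BD\u03B9\u03BD[0]";

        System.out.println(componentCount(work));        // prints 3
        System.out.println(componentCount(work + ":"));  // prints 4
        System.out.println(hasValidSubreferenceIndexes(zeroIndex)); // prints false
    }
}
```

The limit argument to split is the detail most worth noticing: without it, Java discards trailing empty strings, and the URN ending in a colon would be miscounted as having three components.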

Unicode normalization

CTS URN objects can be constructed from strings in any Unicode form, but all output representations of the CTS URN or parts of the CTS URN as strings are normalized to Unicode form NFC.

Examples

Compare the two CTS URNs in this table: they look identical when printed, but one is in pre-composed form (Unicode form NFC), the other in decomposed form (Unicode form NFD).

Input string | Unicode form of input | Length of input in bytes (UTF-8) | Output string identical to input string
urn:cts:greekLit:tlg0012.tlg001:1.1@μῆνιν | NFC (composed) | 47 | Yes
urn:cts:greekLit:tlg0012.tlg001:1.1@μῆνιν | NFD (decomposed) | 48 | No
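The byte counts in the table can be reproduced with the JDK's own java.text.Normalizer, independently of the CITE library. The following is a minimal sketch: the class UrnNormalizationSketch and its methods are hypothetical names, and measuring length in UTF-8 is an assumption of the sketch, chosen because it yields the 47 and 48 shown above. The two inputs are written with Unicode escapes so the difference is visible: U+1FC6 is the precomposed eta with perispomeni, while U+03B7 followed by U+0342 is a plain eta plus a combining perispomeni.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical helper, NOT part of the CITE library. It shows how any
// input form can be normalized to NFC for output.
class UrnNormalizationSketch {

    /** Returns the NFC (composed) form of the given string. */
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    /** Length of the string in bytes when encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String prefix = "urn:cts:greekLit:tlg0012.tlg001:1.1@";
        // Same word in two forms: precomposed (NFC) and decomposed (NFD).
        String nfc = prefix + "\u03BC\u1FC6\u03BD\u03B9\u03BD";
        String nfd = prefix + "\u03BC\u03B7\u0342\u03BD\u03B9\u03BD";

        System.out.println(utf8Length(nfc)); // prints 47
        System.out.println(utf8Length(nfd)); // prints 48

        // NFC input comes back unchanged; NFD input does not.
        System.out.println(toNfc(nfc).equals(nfc)); // prints true
        System.out.println(toNfc(nfd).equals(nfd)); // prints false

        // After normalization both inputs yield the same string.
        System.out.println(toNfc(nfd).equals(nfc)); // prints true
    }
}
```

Comparing URN strings only after normalizing both sides avoids treating the two rows of the table as different URNs.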