The technology-independent representation of a CTS URN as a string of characters is defined by the CTS URN specification. You can construct an object representation of a CTS URN from any string that conforms to that specification. Trying to construct a CTS URN from a string that does not conform to the specification generates an Exception.
Invalid index on subreference: substrings are indexed with positive integers. The string urn:cts:greekLit:tlg0012.tlg001.msA:1.1@Μῆνιν[0] is therefore invalid. Passing it to a constructor generates an Exception.
Wrong number of components: the colon-delimited syntax of a CTS URN always identifies four components after the required string cts:, even if the final passage component is empty. The string urn:cts:greekLit:tlg0012.tlg001.msA only includes three CTS components, so trying to make a CTS URN from it generates an Exception. The string urn:cts:greekLit:tlg0012.tlg001.msA: has four components, even though the final one is empty; it is possible to construct a CTS URN object from it.
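The two failure cases above can be sketched in a few lines of Python. This is only an illustration, not the library's implementation: the names parse_cts_urn, CtsUrnError and INDEX_PATTERN are hypothetical, and the sketch checks only these two rules rather than the full CTS URN specification. It accepts a string only when splitting on colons yields exactly five fields beginning with urn and cts, which matches the two example strings above.

```python
import re


class CtsUrnError(ValueError):
    """Hypothetical exception type for strings that fail the checks below."""


# Matches a bracketed index on a subreference, e.g. the "[0]" in "@Μῆνιν[0]".
INDEX_PATTERN = re.compile(r"\[(-?\d+)\]")


def parse_cts_urn(text: str) -> dict:
    """Split a CTS URN string into named fields, or raise CtsUrnError.

    Only two rules are checked (an assumption of this sketch):
    the number of colon-delimited fields, and positive subreference indexes.
    """
    fields = text.split(":")
    # A trailing colon produces an empty final field, so an empty
    # passage component still counts as present.
    if len(fields) != 5 or fields[0] != "urn" or fields[1] != "cts":
        raise CtsUrnError(f"wrong number of components: {text!r}")
    namespace, work, passage = fields[2], fields[3], fields[4]
    # Subreference indexes must be positive integers (1 or greater).
    for index in INDEX_PATTERN.findall(passage):
        if int(index) < 1:
            raise CtsUrnError(f"invalid subreference index [{index}]: {text!r}")
    return {"namespace": namespace, "work": work, "passage": passage}
```

With this sketch, parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001.msA:") returns a result whose passage is the empty string, while the same string without the trailing colon, or a passage ending in [0], raises CtsUrnError.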
CTS URN objects can be constructed from strings in any Unicode form, but all output representations of the CTS URN or parts of the CTS URN as strings are normalized to Unicode form NFC.
Compare the two CTS URNs in this table: they look identical when printed, but one is in pre-composed form (Unicode form NFC), the other in decomposed form (Unicode form NFD).
Input string | Unicode form of input | Length of input in bytes | Output string identical to input string |
---|---|---|---|
urn:cts:greekLit:tlg0012.tlg001:1.1@μῆνιν | NFC (composed) | 47 | Yes |
urn:cts:greekLit:tlg0012.tlg001:1.1@μῆνιν | NFD (decomposed) | 48 | No |
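The byte counts in the table can be reproduced with Python's standard unicodedata module. The sketch below assumes UTF-8 encoding for the byte lengths, and output_string is a hypothetical stand-in for whatever method the CTS URN object uses to produce its string output. The Greek word is written with escape sequences so that the example does not depend on how this page itself is normalized.

```python
import unicodedata

# "urn:cts:greekLit:tlg0012.tlg001:1.1@μῆνιν" with the Greek word escaped;
# \u1fc6 is the pre-composed eta with perispomeni.
composed = "urn:cts:greekLit:tlg0012.tlg001:1.1@\u03bc\u1fc6\u03bd\u03b9\u03bd"
# NFD splits \u1fc6 into eta (\u03b7) plus combining perispomeni (\u0342).
decomposed = unicodedata.normalize("NFD", composed)


def output_string(urn_text: str) -> str:
    """Hypothetical output step: always normalize to Unicode form NFC."""
    return unicodedata.normalize("NFC", urn_text)


for label, text in (("NFC", composed), ("NFD", decomposed)):
    byte_length = len(text.encode("utf-8"))  # assumes UTF-8
    unchanged = output_string(text) == text
    print(label, byte_length, "output identical to input:", unchanged)
```

Both inputs yield the same NFC output string, so two URNs that differ only in Unicode form have identical string output.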