Identification with URNs

Summary

The task: ISBN numbers uniquely identify published editions of a book. We want to create a type representing a 10-digit ISBN number, and be able to compare ISBN numbers using URN logic.

The implementation:

  • define a new URN type representing an ISBN-10 number
  • implement the UrnComparisonTrait for the new type

Defining the Isbn10Urn type

The Urn abstract type models a Uniform Resource Name (URN). We'll follow the requirements of the URN standard to create a URN type for ISBN-10 numbers. Its URN strings will have three colon-delimited components, beginning with the required prefix urn, then a URN type we'll call isbn10, followed by a 10-digit ISBN number. For example, the URN for Distant Horizons by Ted Underwood will be urn:isbn10:022661283X. (Yes, the last "digit" of an ISBN number can be X.)

We will make the new type a subtype of Urn, so that we can use it freely with other packages that recognize URNs.

using CitableBase
struct Isbn10Urn <: Urn
    isbn::AbstractString
end

Note on the ISBN format and our `Isbn10Urn` type

There is in fact a URN namespace for ISBN numbers identified by the isbn namespace identifier. (See this blogpost about citing publications with URNs.) This guide invents an isbn10 URN type solely to illustrate how you could create your own URN type using the CitableBase package.

Parsing the full ISBN-10 format is complicated: ISBN-10 numbers have four components, each of which is variable in length! In this user's guide example, we'll restrict ourselves to ISBNs for books published in English-, French- or German-speaking countries, indicated by an initial digit of 0 or 1 (English), 2 (French) or 3 (German). In a real program, we would enforce this in the constructor, but to keep our example brief and focused on the CitableBase package, we blindly accept any string value for the isbn field of our type.
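
As a rough illustration of the kind of check a stricter constructor might perform, the following sketch tests a URN string against the restrictions described above. The function name looks_like_isbn10_urn and the regular expression are assumptions made for this example; they are not part of CitableBase, and the check ignores both the ISBN check digit and the variable-length internal components.

```julia
# Hypothetical helper, not part of CitableBase: returns true if `s` has the
# form urn:isbn10:NNNNNNNNNC, where the first digit is 0-3 and the final
# character C is a digit or X.
function looks_like_isbn10_urn(s::AbstractString)
    occursin(r"^urn:isbn10:[0-3][0-9]{8}[0-9X]$", s)
end

looks_like_isbn10_urn("urn:isbn10:022661283X")  # true
looks_like_isbn10_urn("urn:isbn10:922661283X")  # false: initial digit outside 0-3
looks_like_isbn10_urn("isbn10:022661283X")      # false: missing the urn prefix
```

An inner constructor could call a check like this one and throw an ArgumentError when it fails.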

Our new type is a subtype of Urn.

supertype(Isbn10Urn)
Urn

As is common in Julia, we'll override the default show method for our type. (Note that in Julia this requires importing the specific function we are extending, not just using the package.)

import Base: show
function show(io::IO, u::Isbn10Urn)
    print(io, u.isbn)
end
show (generic function with 289 methods)

Now when we create objects of our new type, the display in our REPL (or other contexts) will be easily recognizable as an Isbn10Urn.

distanthorizons = Isbn10Urn("urn:isbn10:022661283X")
urn:isbn10:022661283X

Defining the UrnComparisonTrait

Subtypes of Urn are required to implement the UrnComparisonTrait and its three functions. CitableBase uses the "Holy trait trick" to dispatch functions implementing URN comparison.

The Tim Holy Trait Trick

See this post on Julia Bloggers for an introduction to the "Tim Holy Trait Trick" (THTT).
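
For readers who have not met the pattern before, here is a minimal, package-independent sketch of it. Every name in the block (SpeechTrait, CanSpeak, CannotSpeak, Parrot, Rock, speechtrait, canspeak, canspeak_impl) is invented for illustration and has no counterpart in CitableBase.

```julia
# All names here are hypothetical and independent of CitableBase.
abstract type SpeechTrait end
struct CanSpeak <: SpeechTrait end      # singleton: the type has the trait
struct CannotSpeak <: SpeechTrait end   # singleton: the type lacks the trait

struct Parrot end
struct Rock end

# Trait function: maps a type to a trait value. Types opt in explicitly.
speechtrait(::Type) = CannotSpeak()
speechtrait(::Type{Parrot}) = CanSpeak()

# Public function: look up the trait value for the argument's type,
# then let dispatch choose a method based on that value.
canspeak(x) = canspeak_impl(speechtrait(typeof(x)))
canspeak_impl(::CanSpeak) = true
canspeak_impl(::CannotSpeak) = false

canspeak(Parrot())  # true
canspeak(Rock())    # false
```

The trait value is a property of the type rather than of its place in the type hierarchy, so unrelated types can share a behavior without sharing a supertype.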

We first define a subtype of the abstract UrnComparisonTrait. It's a singleton type with no fields, which we'll use as the trait value for our ISBN type. CitableBase provides the urncomparisontrait function to determine whether a type implements the UrnComparisonTrait, so we'll import urncomparisontrait and define a method that returns the concrete value IsbnComparable() for the type Isbn10Urn.

struct IsbnComparable <: UrnComparisonTrait end

import CitableBase: urncomparisontrait
function urncomparisontrait(::Type{Isbn10Urn})
    IsbnComparable()
end
urncomparisontrait (generic function with 7 methods)

Let's test it.

urncomparisontrait(typeof(distanthorizons))
Main.IsbnComparable()

This lets us use CitableBase's boolean function urncomparable to test specific objects.

urncomparable(distanthorizons)
true

Implementing the logic of URN comparison

To fulfill the contract of the UrnComparisonTrait, we must implement three boolean functions for three kinds of URN comparison: urnequals (for equality), urncontains (for containment) and urnsimilar (for similarity). Because we have defined our type as implementing the UrnComparisonTrait, CitableBase can dispatch to functions that take an Isbn10Urn as the first parameter.

Equality

The == function of Julia Base is overridden in CitableBase for all subtypes of Urn. This makes it trivial to implement urnequals once we use CitableBase and import urnequals.

import CitableBase: urnequals
function urnequals(u1::Isbn10Urn, u2::Isbn10Urn)
    u1 == u2
end
urnequals (generic function with 8 methods)

dupe = distanthorizons
urnequals(distanthorizons, dupe)
true

enumerations = Isbn10Urn("urn:isbn10:022656875X")
urnequals(distanthorizons, enumerations)
false

Why do we need 'urnequals' when we already have '==' ?

Our implementation of urnequals uses two parameters of the same type to compare two URNs and produce a boolean result. In the following section, we will implement the functions of UrnComparisonTrait with one URN parameter and one parameter giving a citable collection. In those implementations, we can filter the collection by comparing the URN parameter to the URNs of items in the collection. We will reserve == for comparing the contents of two collections, and use urnequals to filter a collection's content.
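
The two shapes of comparison described here can be sketched without CitableBase. The names DemoUrn and demoequals are stand-ins invented for this example, and a plain Vector takes the place of a citable collection; the real functions and collection types may differ.

```julia
# Stand-in type and comparison so the sketch runs without CitableBase;
# `DemoUrn` and `demoequals` are hypothetical names.
struct DemoUrn
    value::String
end

# Two URNs: a boolean result.
demoequals(u1::DemoUrn, u2::DemoUrn) = u1.value == u2.value

# One URN and a collection: the items of the collection that match the URN.
function demoequals(u::DemoUrn, collection::Vector{DemoUrn})
    filter(item -> demoequals(u, item), collection)
end

books = [DemoUrn("urn:isbn10:022661283X"), DemoUrn("urn:isbn10:022656875X")]
matches = demoequals(DemoUrn("urn:isbn10:022661283X"), books)
length(matches)  # 1
```

Keeping the pairwise method separate from == leaves == free to mean "these two collections have the same contents".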

Containment

For our ISBN type, we'll define "containment" as true when two ISBNs belong to the same initial-digit group (0 through 3 in this example). We'll use the components function from CitableBase to extract the third part of each URN string, and compare their first characters.

import CitableBase: urncontains
function urncontains(u1::Isbn10Urn, u2::Isbn10Urn)
    initial1 = components(u1.isbn)[3][1]
    initial2 = components(u2.isbn)[3][1]

    initial1 == initial2
end
urncontains (generic function with 7 methods)

Both Distant Horizons and Enumerations are in ISBN group 0.

urncontains(distanthorizons, enumerations)
true

But Can We Be Wrong? is in ISBN group 1.

wrong = Isbn10Urn("urn:isbn10:1108922036")
urncontains(distanthorizons, wrong)
false

Similarity

We'll define "similarity" as belonging to the same language area. In this definition, both 0 and 1 indicate English-language countries.

# True if ISBN starts with `0` or `1`
function english(urn::Isbn10Urn)
    langarea = components(urn.isbn)[3][1]
    langarea == '0' || langarea == '1'
end

import CitableBase: urnsimilar
function urnsimilar(u1::Isbn10Urn, u2::Isbn10Urn)
    initial1 = components(u1.isbn)[3][1]
    initial2 = components(u2.isbn)[3][1]

    (english(u1) && english(u2)) ||  initial1 == initial2
end
urnsimilar (generic function with 7 methods)

Both Distant Horizons and Can We Be Wrong? are published in English-language areas.

urnsimilar(distanthorizons, wrong)
true

But they are coded for different ISBN groups, so containment is still false.

wrong = Isbn10Urn("urn:isbn10:1108922036")
urncontains(distanthorizons, wrong)
false

Optional methods

The CtsUrn (from CitableText) and the Cite2Urn (from CitableObject) illustrate two optional behaviors: support for versioning of URNs, and support for subreferences. Although our ISBN numbers don't require either of those features, we'll illustrate how they are implemented.

Versioning

The three relevant functions are supportsversion, addversion and dropversion. We'll indicate that our URN type supports versioning but for this demonstration will just pass the URN through unchanged.

import CitableBase: supportsversion
function supportsversion(u::Isbn10Urn)
    true
end

import CitableBase: addversion
function addversion(u::Isbn10Urn, versioninfo::AbstractString)
    u
end

import CitableBase: dropversion
function dropversion(u::Isbn10Urn)
    u
end
dropversion (generic function with 2 methods)

supportsversion(wrong)
true

addversion(wrong, "v2")
urn:isbn10:1108922036

dropversion(wrong)
urn:isbn10:1108922036
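
The pass-through methods above satisfy the interface but do no real work. As a sketch of what non-trivial versioning could look like, the following functions operate on plain URN strings under an invented convention: the version is an optional fourth colon-delimited component. That convention and the names addversion_sketch and dropversion_sketch are assumptions made for this example, not behavior defined by CitableBase or by the ISBN standard.

```julia
# Hypothetical convention, invented for this sketch: a version is stored as
# an optional fourth colon-delimited component, e.g. urn:isbn10:1108922036:v2.
# Both functions assume the string has at least three components.
function addversion_sketch(urnstring::AbstractString, versioninfo::AbstractString)
    parts = split(urnstring, ":")
    # Keep the three required components and replace any existing version.
    string(join(parts[1:3], ":"), ":", versioninfo)
end

function dropversion_sketch(urnstring::AbstractString)
    parts = split(urnstring, ":")
    join(parts[1:3], ":")
end

versioned = addversion_sketch("urn:isbn10:1108922036", "v2")  # "urn:isbn10:1108922036:v2"
dropversion_sketch(versioned)                                 # "urn:isbn10:1108922036"
```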

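Subreferences

The relevant functions are supportssubref, dropsubref, hassubref and subref. As with versioning, we'll indicate that our URN type supports subreferences, but our demonstration methods never find one: dropsubref passes the URN through unchanged, hassubref returns false, and subref returns nothing.

import CitableBase: supportssubref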
function supportssubref(u::Isbn10Urn)
    true
end

import CitableBase: dropsubref
function dropsubref(u::Isbn10Urn)
    u
end

import CitableBase: hassubref
function hassubref(u::Isbn10Urn)
    false
end

import CitableBase: subref
function subref(u::Isbn10Urn)
    nothing
end
subref (generic function with 3 methods)

supportssubref(wrong)
true

hassubref(wrong)
false

dropsubref(wrong)
urn:isbn10:1108922036

subref(wrong)

Recap: identifiers

On this page, we defined the Isbn10Urn type as a subtype of Urn and identified our type as implementing the UrnComparisonTrait. You can test this with urncomparable.

urncomparable(distanthorizons)
true

We implemented the trait's required functions to compare pairs of URNs based on URN logic for equality, similarity and containment, and return boolean values.

The next page will make use of our URN type to define a citable object identified by Isbn10Urn.