Download as pdf or txt
Download as pdf or txt
You are on page 1of 15

Int J Digit Libr (2012) 13:3–16

DOI 10.1007/s00799-012-0098-8

SharedCanvas: a collaborative model for digital facsimiles


Robert Sanderson · Benjamin Albritton ·
Rafael Schwemmer · Herbert Van de Sompel

Received: 30 September 2011 / Revised: 14 August 2012 / Accepted: 3 September 2012 / Published online: 30 September 2012
© Springer-Verlag (outside the USA) 2012

Abstract In this article, we present a model based on Keywords Annotation · Medieval manuscript ·
the principles of Linked Data that can be used to describe Digital facsimile · Linked Data · Data model
the interrelationships of images, texts and other resources
to facilitate the interoperability of repositories of medieval 1 Introduction
manuscripts or other culturally important handwritten doc-
uments. The model is designed from a set of requirements There are many repositories and Digital Libraries that main-
derived from the real world use cases of some of the largest tain digitized images of medieval manuscript pages or other
digitized medieval content holders, and instantiations of the historically important handwritten documents. These images
model are intended as the description to be provided as input are often the only way in which scholars and students can
to collection-independent page turning and scholarly presen- interact with the material, due to the expense of travel or the
tation interfaces. A canvas painting paradigm, such as in PDF fragility of the original. Use of a digital surrogate comprised
and SVG, was selected due to the lack of a one to one correla- of the images and other resources increases the likelihood
tion between image and page, and to fulfill complex require- of its persistence, and as interactions with the physical copy
ments such as when the full text of a page is known, but only decrease its usable lifetime, the use of surrogates is attrac-
fragments of the physical object remain. The model is imple- tive to the owning institution as well as to humanities travel
mented using technologies such as OAI-ORE Aggregations budget managers. Therefore, it is essential that the digital
and Open Annotations, as the fundamental building blocks surrogate be as rich an experience as possible for the scholar,
of emerging Linked Digital Libraries. The model and imple- with access to the existing scholarship about the manuscript,
mentation are evaluated through prototypes of both content as discussed by Audenaert and Furuta [1]: the surrogate for
providing and consuming applications. Although the system the physical object is the humanist’s primary research data.
was designed from requirements drawn from the medieval Institutions holding medieval manuscripts have long
manuscript domain, it is applicable to any layout-oriented known the value of digitization and millions of grant and
presentation of images of text. industry dollars have been spent generating images of decay-
ing physical pages, yet, according to an estimate of Dr
Christopher de Hamel, Fellow Librarian of Corpus Christi
College, Cambridge, less than 1 % of existing medieval doc-
uments have been digitized to date. While the digitized man-
R. Sanderson (B) · H. Van de Sompel uscripts are unique, much of the effort to display the digi-
Research Library, Los Alamos National Laboratory,
Los Alamos, USA
tized material has been duplicated across institutions, with
e-mail: azaroth42@gmail.com each recreating very similar basic page-turning applications;
the only differences being for institutional branding and the
B. Albritton seemingly unique complexities of their documents. Further,
Stanford University, Stanford, CA, USA
practically all of the presentation effort has been used for
R. Schwemmer navigation within the institutional silos of images and texts,
e-codices, Fribourg, Switzerland rather than to implement cross-collection capabilities.

123
4 R. Sanderson et al.

At a series of meetings1 of content providers, scholars and interoperability between repositories, tools, services and
technologists from organizations such as the British Library, presentation systems. The primary domain of use is medieval
the National Library of France, Google Freebase, the Uni- manuscripts that have had their pages individually digitized
versity of Oxford, and the authors’ institutions, it was recog- and their text transcribed; the physical item is the artifact of
nized that to reduce the duplication of effort, a single shared interest, rather than the work in the FRBR sense [16]. Sev-
model for description of manuscripts was necessary. Such eral basic requirements were derived from this domain and
a model could be instantiated for various digital manuscript then expanded upon via the examination of more complex
collections and provided to a conforming display application. use cases, which were beyond the capabilities of existing
Furthermore, the “unique” rendering complexities of many systems. These extended requirements inform the design of
institutions’ documents were, in fact, shared to a very large the model, and therefore will be discussed in detail.
degree across many different collections and hence such a Requirements were identified in four main areas:
display application would require minimal, if any, customiza-
tion beyond branding. If this shared data model was at the 1. Images and their relationships with the physical object
same time sufficiently simple to allow for comprehension and 2. Texts and their relationships with the images
implementation, and sufficiently expressive to facilitate the 3. Sequencing of the images and texts
description of the multi-structured documents, then access 4. Rendering of the images and texts
to the digital surrogates would be greatly improved and the
duplicated time and effort could be rededicated to further dig- 2.1 Image requirements
itization or more complete descriptions and transcriptions.
A second goal identified at the meetings was to extend While in the most basic case there is exactly one image per
the notion of interoperability between manuscript reposito- manuscript page, multiple images will often exist. Those
ries from sharing of their resources to seamless integration images may differ in color, resolution, lighting conditions
between them. While displaying an image is a common base- or scale due to the positioning of either the page or the cam-
line capability, the experience of the scholar can be greatly era during digitization, and rectification through dewarping
enriched by involving materials from multiple repositories. transformations should be possible, as discussed by Baumann
Other repositories may contain the transcription of the text, and Seales [5]. Hence, the model must contain sufficient
processing services may be able to discover locations within information to determine the most appropriate image for the
the image of the transcribed text, and it should be possible user automatically, or at least be sufficiently well described to
to integrate this data into the display. Further value would allow the user to select the correct image for their purposes.
be added by links to Digital Libraries containing publica- To avoid unnecessary duplication of information, the model
tions regarding the manuscript or related material, scientific must also provide a means to avoid attaching information
data about the subject matter of the manuscript, or scholarly directly to any single image if it also pertains to other images.
annotations. Such capabilities would yield a coherent land- Images may also exist that depict only parts of the manu-
scape of interconnected systems, rather than the current set script page, such as very high-resolution images of beautiful
of disparate content silos. illuminations or decorated initial letters. In these cases, the
This article describes the steps taken towards meeting entire page may be left undigitized, only available at a much
these goals and, in Sect. 2, details the use cases and require- lower resolution, or only in black and white. This is espe-
ments that were generated from the medieval manuscript cially true of older digitization projects where microfilms
domain. The background work is described in Sect. 3, and of the non-illuminated pages are available. From a modeling
the abstract SharedCanvas model in Sect. 4. The technolo- perspective, the ability to connect the high-quality image with
gies used to instantiate the model are discussed in Sect. 5, and the appropriate part of the page in the full image is important.
experiments evaluating both the expressiveness for describ- The converse is also true, where part of the image depicts
ing manuscripts and the investment required for implemen- the entire page. Depending on the digitization workflow, the
tation are described in Sect. 6. image may depict more than just the page, such as parts of
the scanning bed or calibration tools such as color strips or
rulers, which would not be desirable to show to all users. In
2 Requirements and use cases this case, it is necessary for the model to be able to describe
the area within the image that depicts only the page.
The goal of this research is to provide a standardized descrip- A single image may depict multiple pages, such as an
tion of the digital resources that are surrogates for culturally image of an open spread of two pages. If these two pages
important, primarily text-bearing, physical objects to enable have pictures or text that cross between them (as in Fig. 1), it
is important to have mappings from the parts of the image to
1 See http://dmstech.group.stanford.edu/ the page, and a method for describing the spread as a whole.

123
SharedCanvas 5

Fig. 1 Open spread of two pages; Y112 [24]

Again, the model must avoid the duplication of information


between the spread and the individual pages.
There are many cases in which only fragments (parts of
the original page) remain. As an example, Fig. 2 depicts a Fig. 2 Two fragments; Cod. Sang. 1394 p. 31 [17]
manuscript from the Abbey Library of St. Gall, where two
fragments, likely not from the same original page, have been
• Part Page/Whole Image: Images of only the illumination
bound together. The volume collects together fragments from
within a page.
the fourth through fifteenth centuries, and was assembled in
• Part Page/Part Image: Fragment(s) of a page depicted in
1822 [17]. When a fragment is digitized separately, the image
an image.
depicts only part of the original page. A single image may
• Whole Page/No Image: Page no longer exists or is not
also depict multiple small fragments, regardless of where
digitized.
they were originally located. The fragments might be housed
together in the same container, or stuck to a further page or
glass slide. To display the fragment in conjunction with other 2.2 Text requirements
fragments from the same original page, individual parts of the
image must be able to be mapped in the model to the appro- The semantic properties of text have long been understood
priate segments of the page. Fragments of pages are typically and appropriate markup languages exist to describe these
irregularly shaped and hence the model must also be able to features, so this is not a direct research concern. Instead the
make use of arbitrary polygons rather than just rectangular focus is on the relationships between text and image, or the
bounding boxes to describe the parts of the images. text and the original, physical document.
Equally, there may not be any image that depicts a page All of the relationships between image and the physical
at all. This might be because the digitization process would resource depicted exist in parallel for the transcribed text and
irrevocably damage the page, or it may no longer exist but its depiction. The text may exist in multiple copies, with dif-
either is known or hypothesized to have existed in the past. ferences of format, encoding and editorial intervention, and
Despite the lack of an image, the model must still be able to these different versions would be appropriate for different
account for the existence of the page and allow other infor- users. To step briefly through the combinations of whole and
mation to be associated with it. part again:
These use cases encapsulate the full spectrum of combina-
tions of whole versus part, and image versus page, with each • Whole Page/Whole Text: A transcription of the entire
depiction being present in either a single or multiple images, page, and only the one page. The presentation layer might
and parts of images being either rectangular or polygonal: present this in a two pane layout, similar to the facing page
transcription of the print world.
• Whole Page/Whole Image: The simplest scenario where • Whole Page/Part Text: The page is transcribed as part
only the page is depicted. of a larger work, such as a TEI [34] description of the
• Whole Page/Part Image: Such as cropping out rulers, or entire manuscript, where the granularity is only at the
dividing spreads. page level.

123
6 R. Sanderson et al.

• Part Page/Whole Text: A transcription of a column, line, are collected together in quires (sets of normally 8 leaves),
word or character, perhaps as an annotation or note. The and the location of the boundaries between these is important
area on the page may be rectangular, but warping and information to scholars. Textual sections such as chapters or
decorative text also provides non-rectangular use cases verses are equally important for the humanist interested in
for text. particular parts of the text. Other features, such as the range
• Part Page/Part Text: Alignment of part of a larger tran- of pages at the beginning or end of a manuscript that do
scription to a part of the page at some level of granularity, not contain any text, are also important for navigation and
including area of the entire text, column, line, word or display of an automatically generated table of contents. To
character. enable the description of sections at any granularity, these
• Whole Page/No Text: Unfortunately, the most common sub-lists must be able to include parts of a page, since textual
scenario as very few manuscripts have been fully tran- and other features do not necessarily align with page breaks.
scribed. When the text is transcribed line by line, it is important
to be able to explicitly describe the reading order of those
In the case where an image of the page is not available, lines. Visual clues, such as size, location and color of the
the text may still be known from other copies, and this writing, may make the order clear for human readers, but
information should be available to the reader without a are difficult to interpret for a machine due to writing in the
depiction. margins (marginalia), writing between regular lines (inter-
Among the most challenging textual use cases are stitial text), decorated initials and the scribal tricks used to
palimpsests, manuscripts where one or more texts have been manually justify text into columns. The ability to express the
erased and the pages reused for another text. The original correct sequence of textual resources is just as important as
texts can be recovered using techniques such as multispec- the order in which to display the pages.
tral imaging as in the case of the Archimedes Palimpsest,
fully described by Reviel Netz and William Noel [25]. Some 2.4 Rendering requirements
images may therefore depict two or more completely sep-
arate texts, often perpendicular to each other. The identity Many of the rendering requirements have already been dis-
of each text is important, and could require rotation of either cussed with respect to the Images, Texts and Sequencing.
the image or rendered text for verisimilitude. The model must Further requirements include the ability to create and dis-
record this information and not assume a one to one relation- play scholarly annotations that discuss the images and text
ship between page or image and text, nor any mandatory at a very fine level of granularity.
rotational alignment. If the manuscript leaves were rebound A requirement expressed by many scholars is the abil-
when the second text was written, the first text would also ity to pan and zoom into an image to investigate minute
need a different page order. This brings us to page and text details that might give a clue as to the provenance of the
ordering requirements. manuscript, the identity of the scribe(s) or illuminator(s),
or just to assist in transcribing a particularly difficult pas-
2.3 Sequencing requirements sage. With the alignment of other resources such as the tran-
scribed text, this requires the rendering environment to be
Most page turning applications assume a single correct order able to maintain the alignments correctly at multiple res-
for the pages, yet many manuscripts have been disassembled olutions and regardless of the part of the facsimile being
and rebound over the centuries by well-meaning curators, displayed.
and intentionally or not, the page order has changed over The creation and maintenance of the manuscript descrip-
time. It should be possible to reconstruct the order as it was tion should be possible in a highly distributed and collabora-
at a particular point in time, without duplicating all of the tive environment, to enable the sharing of content and exper-
resources. tise between different communities and individuals. The ren-
The presentation of alternate paths within the same order dering application must be prepared to consume and display
is also an important use case when, as per Fig. 1, a spread resources from across many locations, not limited to a single
exists as a single image but also as separate pages. An ani- file or content silo as is the case currently.
mated page turning application, for example, should not try Although the discussion is in terms of page, image and
to display the image of the spread as a single page, nor should text, there are many other sorts of resource that would be
it animate the turn using the entire image. Yet for scholars, relevant, and bear the same properties as discussed above.
access to the image of the full spread is important to get as For example, a manuscript containing a musical score would
accurate a depiction as possible. benefit from having its score transcribed, perhaps in to Music
The description of subsets of pages from the ordered list is XML [26] or MEI [29], or an audio file with a performance of
also important. In the manuscript construction, process pages the music associated with the appropriate parts of the pages.

123
SharedCanvas 7

This enrichment of the original manuscript is only possible multiple versions and overlapping hierarchies, and Rehbein
with a digital facsimile, providing another motivating factor [28] considers the problem from the change of the text over
for scholars and the lay-person alike. time, as a medieval codex of law is updated over many years
Visualizing these resources and the relationships between as the laws of the town were revised.
them in an innovative way that promotes scholarship is an
interesting research challenge for future work. 3.3 Sequencing

The modeling of multiple page orders is one of the aspects


3 Background research tackled by Bauer and Hernath [4] along with the use of off-
set annotations to describe differences of opinion about con-
Research was needed to discover if any existing systems tentious transcriptions. Their system, tested with the previ-
or models met the above requirements. While much related ously mentioned Archimedes Palimpsest, has an Ordering
work has been done, interoperability and the complexities of and Indexing Layer separate from the text and images to
medieval manuscripts have not been primary research topics. enable multiple sequences. Their system does not have any
Again, we discuss in terms of the four areas of concern. image requirements, and overlays the annotations on top of
the standard, tree-structured TEI.
3.1 Image layout analysis
3.4 Rendering
Baechler et al.’s recent work on a layout model [2] for
manuscripts looks promising in its acknowledgement of We do not consider the exact user interface or the details of
the complexities of the issues and its demonstration of the simulating interaction with a physical object in this research.
need for non-rectangular bounding areas for line segmenta- Marshall [22] and Liesaputra [20] have studied the page turn-
tion. Unfortunately, their model is designed for evaluating ing experience extensively and we defer to their knowledge.
automated methods of segmentation, and not as an inter- Beyond the area of manuscripts, there are many well-
operability mechanism. Their four layers of text, physical known layout oriented systems capable of rendering images
medium, illuminations and commentary would be insuffi- and text. The most well known is, of course, PDF [18] where
cient for describing palimpsests, and their XML document images and text can be laid out on a blank, page-sized canvas.
structure inappropriate for ease of cross-institution collab- The canvas notion is also used in SVG [12] (an XML descrip-
oration given the expectation with XML of a single, self- tion format primarily for vector graphics) and HTML5’s can-
contained document. vas element [15] that can be drawn upon using javascript
The analysis of the archive of the Dutch queen [10], which functions.
focuses on extracting information from a handwritten table of Beyond the domain of documents altogether, the user
contents document, is also focused on layout. It demonstrates interfaces of software applications are also often built up on
complexities such as arbitrarily shaped boundaries, but is not empty canvases by adding layout boxes, controls, text and
a general solution. images. Mozilla’s Firefox browser’s user interface is con-
structed from a series of XML files describing the layout.2
3.2 Text and image linking Programming libraries such as GTK3 function in essentially
the same way, but with function invocations rather than XML
The work of Brugman et al. [9] provides a strong base- elements.
line model for linking resources describing cultural heritage
objects through the use of annotations; their first example is 4 SharedCanvas model
transcription of an image with additional semantics linked to
the text. In their description of ARMARIUS [13], Doumat At the foundation of the SharedCanvas model is the insight
and colleagues give a model for web-based annotation of that there is not a one to one correlation between manuscript
digitized medieval manuscripts. However, in both of these page and the depicting image. The common approach of link-
cases the annotations are modeled as graph relationships ing a transcription or annotation to a point on a particular
between individual images and texts, and there is no notion image is not appropriate, as convincingly illustrated by the
of a method to transcribe without an image, of transferabil- case in which an image for the page does not exist. Also,
ity of information between equivalent images, or of services in cases where multiple images do exist, such linking would
providing scaled and tiled images.
Other modeling focuses only on the textual require-
2
ments and ignores the relationship with images. Schmidt See: http://www.mozilla.org/projects/xul/.
and Colomb [33] look at models for online text with 3 See: http://www.gtk.org/.

123
8 R. Sanderson et al.

depicted with a blue dotted line within the Canvas. The “Text
Anno” node represents a Text Annotation that associates the
transcribed text with the region of the Canvas in which the
text is located, potentially to a curve rather than straight line.
The Image or Text itself may also have Segment Information
if only part of a larger resource is needed, such as a section
of a TEI transcription. A non-rectangular manuscript would
use this approach to paint only to the appropriate segments
of the Canvas.
The body of an annotation may also be a set of equiv-
alent resources from which a choice is made by the pre-
sentation application. A Choice set could then contain
all the equivalent images of a page, to be applied to a
Canvas, at different sizes, different color depths and so
forth. The alignment of these multiple images is enabled
by the independent coordinate system of the Canvas. The
same construction would allow for a choice of multi-
Fig. 3 Canvas, image and text; MS fr. 190/1 [8] ple Texts to be associated with a specific line segment.
The application would then select the appropriate text
need to be repeated for each of the images, some of which
based on properties such as author or the edition it is
may not be known by the system making the connection.
derived from, or a default as specified by the descrip-
For fragments, it must be possible to create annotations that
tion and present the other options to the user to choose
reference both digitized and undigitized sections of the man-
from.
uscript.
Scholarly annotations can also be applied to any of the
resources, as appropriate. If the annotation should be dis-
4.1 Image and text played regardless of the images and texts being used to ren-
der the Canvas, then it should target the Canvas. On the
Following the lead from canvas-based layout systems, the other hand, a criticism of a particular transcription should be
SharedCanvas model starts with a blank Canvas to be drawn attached to the transcription itself, so that if the transcription
on, which stands for a page in the manuscript. In Fig. 3 the is replaced, the annotation would no longer be displayed.
Canvas is depicted as a rectangle with a dashed black border. These annotations can come from any source and pull in
The Canvas has its own dimensions or aspect ratio that may or external resources that otherwise would not have been known
may not be the same as any image, however, the top left hand about.
corner of the canvas is chosen to correspond to the top left The Type of annotation is used to convey the expected
hand corner of a bounding box around the manuscript page, presentation behavior to the presentation system. There are
and similarly for the bottom right hand corners of canvas and two major types of annotation in this regard, ContentAn-
page. The most common case is a rectangular page, however, notation and CommentAnnotation. The former associates
non-rectangular manuscripts exist such as the heart-shaped content to be rendered as part of the digital facsimile with
chansonnier with the shelfmark BNF Rothschild MS 2973 the appropriate part of the Canvas, whereas the latter is a
[37]. comment or discussion about the facsimile or object that
Images and texts are then overlaid on top of the blank it represents. The different classes allow for the easy dis-
Canvas using a stand-off Annotation Paradigm, following the tinction between annotations that overlay images or tran-
approach taken by Brugman, Bauer and others. Using anno- scriptions on the canvas and more traditional scholarly
tations to paint the canvas, the number of technology depen- annotations. By recognizing the differences and similarities
dencies is reduced, as the parsing and rendering of scholarly between these types of annotation, the display system should
annotations is also required. For example, the body of an require less implementation effort than if the layout were
Image Annotation is the Image and the target is the Canvas. done in a completely different method from regular annota-
In Fig. 3, the “Img Anno” circle represents an annotation tions.
of this type that associates the Image resource with the full Areas, similar to a Canvas in that they have dimensions,
Canvas. may be defined that allow multiple annotations of any type
To paint either an image or text on the appropriate region to be grouped together into a logical unit and then further
of the Canvas, Segment Information is used. In the example, operated upon. These Zones are the targets of the Anno-
the information records the location of the bounding box tations, and are then painted onto one or more Canvases

123
SharedCanvas 9

via annotations, in the same method as all other resources.


This functionality facilitates many of the more challenging
use cases, such as when pages should be displayed together
as a spread, or in the palimpsest case when some images
contain both texts and others only one. The Zones will
maintain all of their associated annotations, permitting the
display of the information without having to repeat it in
multiple locations.

4.2 Sequencing

The annotated Canvas method allows us to build up the view


of a single page piece by piece and in a distributed fashion.
The order in which the multiple Canvases that make up the
entire manuscript should be presented to the user is recorded
in a Sequence. In Fig. 4, the Sequence is depicted as a light
green circle labeled “Seq” at the top of the diagram. The
Sequence is not necessarily a single, linear list as there may
be alternative paths from one canvas to the next, such as either
Fig. 4 Sequence and range of canvases
through two Canvases or via a single Canvas that represents
the combined spread.
The same Canvas may appear in multiple Sequences to
allow for rebinding or competing theories of provenance.
Multiple Sequences might also be used to provide better
navigation for different viewing platforms, such as to make
the most appropriate use of a multi-column manuscript on a
smart-phone’s limited width display.
Other groupings of Canvases are also desirable to model
textual or physical boundaries within the manuscript. These
groupings are called Ranges, where all of the resources in
the Range are Canvases or parts of Canvases from a single
Sequence. A Range is depicted in Fig. 4 as the darker green
node labeled “Rng” at the bottom of the diagram. The exam-
ple Range includes all of Canvas 3, and the lower part of
Canvas 2. The reuse of the segmentation concept, depicted
with the blue dotted line within Canvas 2, would enable the
inclusion of the region in which the beginning of a new chap-
ter is displayed, for example.
As multiple Ranges may overlap, there is not a strict
hierarchy of Sequence, then Range, then Canvas. Instead Fig. 5 Complete SharedCanvas model
the Range must link to the Sequence of which it is part,
and the Sequence should list all of the Ranges it knows purposes, collections of the Image Annotations would also
about. be useful (“Img List”), as would the set of Zone Anno-
Groupings above the level of the Sequence are also impor- tations (“Zone List”). Finally, as there may be multiple
tant for discovery and presentation. In Fig. 5, these group- Sequences for a single manuscript, a single top-level Manifest
ings are the nodes depicted in the Discovery section above is introduced that collects them together along with these sets
the dashed line. If the manuscript has been transcribed line of annotations.
by line, or even word by word, there will be many thou-
sands of small annotations linking each piece of text with
the appropriate region of a Canvas. To satisfy the explicit 5 Instantiation
reading order requirement, there must be an ordered collec-
tion at least per Canvas, if not across the entire Sequence. The SharedCanvas concepts were conceived to address
This is the “Text Ordr” node in Fig. 5. For discovery the core functional requirements, but these need to be

123
10 R. Sanderson et al.

instantiated using specific technologies. Although most An example of Open Annotation where a rectangular sec-
manuscript transcription and description work has taken tion of an image should be overlaid on top of a Canvas, in
place to date using XML, and notably with the TEI, METS the relatively readable Turtle [6] syntax:
[27] and ALTO [3] schemas, many of the use cases require
a graph rather than an XML tree. To permit the distributed
creation and use of the model simultaneously by multiple
tools and repositories, an approach that follows the archi-
tecture of the web [19] is necessary. Linked Open Data [7]
using RDF [21] fits these requirements for a web-centric
graph.

5.1 Open Annotations for layout

The main choice of ontology is for the annotations used


to overlay resources on the canvas and to permit schol-
ars to annotate them. The W3C Open Annotation Commu-
nity Group,4 formed initially by the Open Annotation Col-
laboration (OAC)5 and Annotation Ontology6 groups, has
defined an RDF-based ontology [31]. The Open Annota-
tion ontology allows the annotation of any resource with
any other resource irrespective of media type, unlike many
of its predecessors in which the body is required to be tex- As support for non-rectangular regions is also crucial, it is
tual. This feature is essential to overlay both images and important that the Open Annotation model provides for this
transcriptions on the blank canvas using the same method. via more extensive Selectors. For non-rectangular areas, the
It is also important for the use case where multiple equiva- recommended method is to use an SVG description, either
lent images are available, as the body is then a set of those embedded within the annotation document or referenced as
images (see Sect. 5.2). For commentary style annotations, an external resource. The same techniques used for images
this facility is forward looking as it allows audio or video could easily be applied to Canvases, as they share the essential
comments. properties of height and width.
Furthermore, the Open Annotation model has a grace-
ful method of identifying selected parts of resources to be
the target or the body of the annotation. This feature is 5.2 OAI-ORE Aggregations for sequencing
crucial for the many use cases that require segmentation,
such as manuscript fragments, removing the ruler or dig- The immediately obvious choice of ontology for a Sequence
itization housing from the presentation without modifying would be the use of an OAI-ORE Aggregation [38]. Aggre-
the images that may not be under the control of the pre- gations are sets of resources plus some metadata, and the
sentation layer, and for presenting zones within canvases to OAI-ORE ontology has been increasingly adopted by Digi-
ensure that data does not have to be added to or from multiple tal Library systems since its publication. However, the ubiq-
locations. uity of the ordering requirement in the use cases is troubling.
The segmentation method is based upon Selector objects As order is local to the Aggregation (the same resource may
that describe the region of interest, and SpecificResource be in a different order in another Aggregation, an important
objects that identify the region of interest. The distinction concept given the rebinding use case), the ORE model pre-
between these two roles is important in RDF so that they scribes the use of Proxy nodes that stand for the resource as
can be addressed separately. The most common Selector is it occurs in the Aggregation. The mandatory use of Proxies
a FragmentSelector, which encapsulates a URI fragment as for every aggregated resource would almost double the num-
the description of the region of interest. Using the W3C’s ber of nodes, and almost triple the number of relationships
Media Fragment specification [35], rectangular regions can required for even the simplest scenario.
be described using the coordinates of the upper left corner, Thankfully, for this particular case, an alternative solution
the height and width. exists. RDF objects may have multiple classes, and thus a sin-
gle resource may be an ORE Aggregation at the same time
4 See: http://www.w3c.org/community/openannotation/. as it is an RDF List. The fundamental RDF List construc-
5 See: http://www.openannotation.org/. tion imposes an order through the use of many anonymous
6 See: http://code.google.com/p/annotation-ontology/. nodes, and due to its definition at the core of the standard,

123
SharedCanvas 11

This construction can deal with orderings ranging from no


order through to multiple pathways, the same approach can be
used to model the other sets and lists needed by the model:
Ranges, Manifests, the different Choices, and the Lists of
annotations.
For a more formal definition of the instantiated model,
please see [32].

6 Experimentation

The majority of experimentation with the model has been to


attempt to describe increasingly complex use cases. Initial
prototype implementations were then created to produce the
RDF serializations, and to in turn consume them and ren-
der the images and texts for a user. The implementations
do not provide the full capabilities envisioned for a schol-
arly environment, only enough functionality to prove that
the descriptions can be rendered at least as well as in their
current interfaces.

Fig. 6 Multi-classed list, Aggregation, Sequence resource 6.1 Image and text layout

6.1.1 Morgan Library Manuscript 804


serializations have ways to avoid minting identifiers for all
of these nodes. The anonymous nodes become part of the The first manuscript description generated was derived from
plumbing that is conveniently taken care of by RDF libraries, Sanderson’s Ph.D. work [30] on an electronic edition of
rather than something that needs to be handled by every Froissart’s Chronicles. This description demonstrates most
application. Again in the Turtle syntax, a resource that is of the basic requirements throughout a full manuscript.
both an Aggregation and a List would be expressed as: The manuscript described is held at the Morgan Library,
with shelfmark M.804 [14]. The electronic resources used
consist of 515 low-quality black and white images derived
from a microfilm with each depicting a single side of a page,
ten color images depicting illuminated pages, the text marked
up in TEI, and hand crafted locations of each line within each
image.
In this way other systems that understand Aggregations The model generated for the first page is depicted in Fig. 7.
can process the descriptions, further systems can easily A Canvas was created for the page, and the representative
process the List, and the thousands of Proxy URIs do not black and white image associated with it via a ContentAn-
need to be minted. The complete model, with all of the blank notation. As the first page is richly illuminated, there is also
nodes expanded, is depicted in Fig. 6. a color image available, and thus the Annotation’s body is a
A further advantage of this approach is that there may be Choice that includes both of these images. The size and color
aggregated resources that are not part of the order expressed depth were recorded in properties that are not depicted.
in the List. This facility would allow for images of the edges, The text was split up into lines and linked to the appropri-
spine, container of the manuscript to be included, but not ate region of the Canvas via further annotations. Only the first
directly part of the page-turning sequence. four are depicted, and the different size and color represents
There is one case in which proxy nodes are still required: aspects of the physical text recorded with additional proper-
the alternate paths use case. In this case, there is not a single ties. The location of the lines of text was transferred to the
linear List, and the Proxy construction from ORE comes to Canvas coordinates from those of the image with appropri-
the rescue. From the requirements analysis, this use case is ate scaling. These became the rectangular bounding boxes
important but does not occur in the majority of manuscripts. used as segment information within the Canvas, and thus
Thus, the use of ORE plus Lists provides an easy transition the targets of the text annotations were described with Frag-
from the unordered set, through to simple lists, and on to mentSelectors that recorded x and y coordinates, plus height
more complex multi-pathways. and width.

123
12 R. Sanderson et al.

Fig. 8 Collected Fragments; Cod. Sang. 1394 p. 63 [17]

came from several different pages in one or more other man-


uscripts.
This experiment demonstrates the model’s solution to non-
Fig. 7 Model for Morgan Library M.804, f1r [14] rectangular regions on both the body (Image) and target
(Canvas) of the annotations, along with the lack of a one
Other color images depicted only an illumination, and to one relationship between images and pages, or even man-
not the entire page. These detail images were treated as the uscripts. The fragments are delimited using non-rectangular
body of image annotations that targeted the appropriate loca- regions, and image annotations created that link to the equiv-
tion within the Canvas, again FragmentSelectors, and could alent region on a Canvas.
thus be overlaid above the lower quality digitized microfilm Also shown is the mechanism for scholars to disagree
images. about the location of fragments. The lighter image annota-
A single Sequence was generated that ordered all of the tion, labeled “Img Ann1” and the regular “Img Ann2” both
515 Canvases, the green ’M804’ node in Fig. 7. refer to the same segment of the image, but to regions in dif-
Not depicted in Fig. 7 is the discovery layer, which con- ferent canvases. The display layer would be able to make a
sists of one Text Order aggregation of annotations per page, choice as to how to display this difference of opinion, again
one aggregation of the image annotations, and a Manifest that through additional metadata associated with the annotations
collects these aggregations along with the Sequence labeled as to author, certainty and so forth. The same approach can
“M804”. While these were not absolutely necessary, their be applied to any other debatable annotated piece of infor-
existence greatly facilitated the creation of the experimen- mation.
tal consuming applications, as otherwise full knowledge of
the relevant triples, in the order of half a million, would be
required to find the appropriate ones to facilitate the display 6.1.3 Kantonsbibliothek Thurgau Y112
of the current Canvas.
As the full text of the manuscript and scholarly notes The spread use case from Fig. 1 demonstrates the need for
about the illuminations and marginalia including locations annotations that associate Zones with multiple Canvases, and
are available from the previous work, this experiment pro- for alternate paths through the set of Canvases. The dotted
vided a good demonstration that the model can be used suc- lines in Fig. 9 show the order in which the Canvases should
cessfully to reproduce the status quo. be displayed: either through the spread to the top, or the two
separate pages below. This order is implemented using ORE
6.1.2 Codex Sangallensis 1394 Proxies, but is too convoluted to depict.
Zone annotations (yellow circles in the center of the dia-
An extreme case of page fragments collected together comes gram) are used to link any non-image annotations between
from another page of Codex Sangallensis 1394, the manu- the lower per-page Canvases and the appropriate locations
script depicted in Fig. 2. These fragments are stuck to a white in the spread Canvas. This ensures that all of the relevant
paper page, depicted along with the model in Fig. 8, which information is maintained regardless of which path is taken
is bound into the current volume. The fragments originally without duplication.

123
SharedCanvas 13

Fig. 9 Map Spread; Y112 f3v-f4r [24]

Fig. 10 Audio annotations; Parker CCC 008 [39]

6.1.4 Parker CCC 008

Musical scores are often present in manuscripts, and this


presents a unique opportunity with digital representations
to render more than just image and text. As an example of
how the rendering model can be extended, annotations which
associate part of an audio file with the part of the canvas that
correlates to location of the score on the page were used.
The rendering application can then allow the user to play
the performance of the music. Another extension would be
to annotate the score with transcribed music notation, either
in a modern notation or a direct transcription of the original
medieval notation.
This example, depicted in Fig. 10 also demonstrates that
existing transcriptions, such as TEI encoded text, can be used
directly. Here, the line of the TEI document that transcribes
a line of the text is referenced via an XPointer-based URI
fragment. The rendering engine would then retrieve the TEI
transcription, process the XPointer to extract the text, and
could then render it in the same way as if it had been embed-
ded directly within the TextAnnotation. Fig. 11 Missing Pages; Parker CCC 286 [36]

6.2 Sequencing Here, Canvases representing the missing leaf are added to
the Sequence that describes the manuscript’s original, rather
6.2.1 Parker CCC 286 than current, state. In addition, some content is known about
the verso side of the missing page—it contained an illustrated
Many examples exist of manuscripts that have had pages frontispiece to the Book of Matthew—which can be attached
physically removed or destroyed at some point in their his- to the Canvas through an annotation.
tory. For example, Parker CCC 286 [36] is missing a leaf This modeling is depicted in Fig. 11. There are two
between the current second and third pages. To properly rep- Sequences, “286 Curr” that represents the current state of
resent this manuscript in its original state, the fact that the the manuscript, and “286 Orig” that represents the origi-
page has been physically removed at some point in time must nal state with the additional leaf. The missing leaf has two
be made explicit. Canvases, for recto and verso-labeled Canvas 2a and 2b,

123
14 R. Sanderson et al.

respectively, with text annotations that describe the text rather


than transcribe it such as in the previous example or for the
existing pages. The distinction between description and tran-
scription would be made clear with additional properties on
the annotation that are not depicted in the diagram.

6.2.2 BNF F.Fr 113–116

An example from the Bibliothque Nationale de France


(BNF) further demonstrates the requirements for multiple
Sequences over time, as well as for Ranges. The BNF
currently holds four separate manuscripts, with shelfmarks
Fonds Franais 113 through 116, however, until 1682 the text-
bearing pages were bound together in a single large volume.
The pages have continuous numbering from folio 1 through
735, and the text is the story of Lancelot. When the original
manuscript was divided up into its current state, additional
fly-leaves were added at the beginning and end of each vol-
ume, likely to protect the beautifully illuminated initial page.
This rebound, multi-volume text is another scenario in
which multiple Sequences are crucial, but also one in which
Ranges are important. Each of the individual manuscripts
thus has their own Sequence (f.fr 113–116 in Fig. 12), and a
fifth sequence exists for the original single volume, labeled
“L. du Lac”. Each individual manuscript also has a Range,
Fig. 12 Lancelot du Lac; BNF f.fr 113–116 [23]
labeled “Content”, that collects together the content-bearing
pages, to enable a display application to skip the empty pages
at beginning and end. An equivalent content Range for the Canvas. Although these implementations were not complete,
single volume is not needed, as the extra leaves were not they were able to duplicate and extend existing applications’
present at that point. functionality.
A web-based rendering application8 was developed from
6.3 Implementation scratch to provide an easy-to-understand reference imple-
mentation. The design requirements for the implementation
Implementation of the experimental descriptions was done were that it should make use of commonly used and well-
both automatically from existing data and by hand for the documented libraries whenever possible and should require
more complex use cases where the existing data was not rich no server side components. It should implement as much of
enough to represent what can now be expressed using the the model as possible directly from RDF serializations, but
SharedCanvas model. Python’s rdflib7 library was used for not be overly concerned with additional desirable features
production and subsequent consumption of the RDF serial- such as panning and zooming of the display, or discovery of
izations, and XSLT [11] stylesheets were explored for trans- the canvases to display.
forming from the internal XML formats of the Parker and The implementation made heavy use of JQuery,9 plus
e-codices collections into RDF/XML directly. extension libraries, to provide support for the most common
Experimental consuming applications were also imple- tasks in Javascript. The extension libraries added support for
mented to demonstrate the ease of adapting existing presen- RDF and RDFA, audio and video playing, and mobile touch-
tation software or implementing a full suite from scratch. based interfaces. RaphaelJS10 was used for SVG support, as
Three implementations were completed, including render- a more commonly used and better documented library com-
ing in both PDF and HTML. The similarities with the PDF pared to JQuery’s SVG extension.
layout model made the transformation straightforward, with Each instance of the implementation allows the user to
the exception of inverted canvas axes of PDF. The traditional interact with a single manuscript, however, there may be
homegrown page turning application was also relatively sim-
8 See: http://www.shared-canvas.org/impl/.
ple to adapt, even though it originally had no concept of a
9 See: http://jquery.com/.
7 See: http://www.rdflib.net/. 10 See: http://raphaeljs.com/.

123
SharedCanvas 15

necessary to produce consistent models, and this will be


developed as more manuscripts are described.
In the cultural heritage domain, there are many important
texts recorded using media that do not have pages, such as
clay tablets, or continuous scrolls. And even if it is bound in a
volume today, the original object may have been a scroll that
was subsequently cut up and rebound. The presentation of the
reconstruction of the original form would be very different
to the current paged view, even if it uses the same images.
These use cases without pages can also be described with the
SharedCanvas model.
In the process of designing and testing the SharedCanvas
model, it was also noted that although the focus is on medieval
texts, the model is reusable for any sequence of digitized
images of text, regardless of era or medium. The importance
of the physical object depicted in the image is obvious when
it is a beautifully illuminated manuscript, however, rapidly
Fig. 13 Screenshot of implementation for Archimedes Palimpsest [25] degrading nineteenth century newspapers such as those held
at the Library of Congress11 and the British Library12 are
equally deserving of study.
multiple Sequences available for that manuscript. The RDF In other domains, further types of data in the body of the
serializations of the model, starting at the Manifest level, are overlay annotations would also be appropriate such as the
pulled in via AJAX, parsed and processed. Images and any scientific data that is depicted in a chart, or a video of a dance
additional files needed to render the annotations, such as TEI could be attached to the point in a text where it is described.
XML transcriptions, are also pulled in. The system then ren- The fact that the model could be used to describe other such
ders the annotations over top of a blank canvas, scaled to resources demonstrates that it is very general at its core, and
the width of the browser. This is depicted in Fig. 13, which is thus likely to scale to unforeseen requirements.
displays the TEI transcription over top of an image of the Being designed from the ground up to be distributed using
Archimedes Palimpsest. The multiple depictions of the same the most appropriate technologies, even if they are not tradi-
folio are able to be selected by the user from a list rendered tional for the domain, proved to enable the collaborative con-
from an ImageChoice. struction methods desired. Participants at the meetings from
The user can then add their own annotations via the Anno- different institutions are already working together using the
tate button, either commentary notes or rendered associations model to share images and the automatically extracted line
of resources. These can be stored in different services, either segment information provided by a remote service. This dis-
via the GData API as a blog post with RDFA, or as RDF/XML tribution of information promotes a cohesive landscape of
in a public annotation storage service. The implementation linked content, rather than each institution providing only a
can then consume and redisplay them as appropriate. custom built HTML interface just for resources held in their
repository.
Implementations of the model will provide unprecedented
7 Conclusions access to digital surrogates of important cultural heritage
objects, distributed amongst libraries and special collections
The annotated canvas paradigm adopted in the SharedCanvas around the world. Using the web to bring the humanists’ pri-
model was demonstrated to be successful in providing solu- mary data and the related scholarship to them, their research
tions to the challenging use cases derived from the medieval capacity is enhanced and the shared annotation space brings
manuscript domain. Multiple equivalent images, fragmen- them together with other scholars working in the same realm.
tary pages, missing pages, different page orders over time
and the description of ranges of content within a set of pages Acknowledgments This research was funded by the Andrew W. Mel-
lon Foundation, through the “Open Annotation Collaboration” grant
were all able to be described using only the two basic prim- and the “Defining a Modular and Interoperating Environment for Col-
itives of ordered ORE aggregations and OAC annotations. lections of Digitized Medieval Manuscripts, Tools and Users” grant.
Future work on the model will involve further experimen- The authors would like to thank the participants at the DMSTech
tation with larger-scale descriptions, including more detailed
11
work with the Archimedes Palimpsest. It was noted dur- See: http://chroniclingamerica.loc.gov/.
ing the experiments that best practice guidelines will be 12 See: http://newspapers.bl.uk/blcs/.

123
16 R. Sanderson et al.

meetings, and the institutions that own the manuscripts for allowing 19. Jacobs, I., Walsh, N.: Architecture of the World Wide Web, vol.
their use in this publication, including Christopher De Hamel, Don- 1. Technical Report, W3C Recommendation, W3C, 15 December
nelly Fellow Librarian, and the Master and Fellows of Corpus Christi 2004 (2004)
College, Cambridge. 20. Liesaputra, V., Witten, I.: Seeking information in realistic books:
a user study. In: Proceedings of the 8th Joint Conference on Digital
Libraries, USA (2008)
21. Manola, F., Miller, E., (eds): RDF Primer, W3C Recommendation,
References W3C, 10 February 2004 (2004)
22. Marshall, C., Bly, S.: Turning the page on navigation. In: Proceed-
1. Audenaert, N., Furuta, R.: What humanists want: how scholars ings of 5th Joint Conference on Digital Libraries, USA (2005)
use source materials. In: Proceedings of 10th Joint Conference on 23. Moap, G.: Lancelot du Lac, c. 1470. Held at: Bibliotheque
Digital Libraries, Australia (2010) Nationale de France, Shelfmark: Fonds Francais, pp. 113–116.
2. Baechler, M., Ingold, R.: Medieval manuscript layout model. http://gallica.bnf.fr/ark:/12148/btv1b60000903
In: Proceedings of ACM Symposium on Document Engineering, 24. Murer, H.: Diuae Virg: Mariae Sintlacisaugia, c. 1627. Held at
England, pp. 275–278 (2010) Frauenfeld, Kantonsbibliothek Thurgau, Shelfmark Y 112. doi:10.
3. Bauer, J., et al. (eds.): ALTO Technical Metadata for Optical Char- 5076/e-codices-kbt-y112
acter Recognition, Library of Congress (2010). http://www.loc. 25. Netz, R., Noel, W.: The Archimedes Codex: Revealing the Blue-
gov/standards/alto print for Modern Science, Phoenix (2008). ISBN 0753823721
4. Bauer, P., Hernath, Z.: A data model supporting conflict resolving 26. Recordare MusicXML 3.0 Specification. http://www.recordare.
in cooperative text editions-HypereiDoc. In: Proceedings of the com/musicxml/specification. Accessed 29 Sep 2011 (15:00:00
First International Conference on Networked Digital Technologies, MST)
Czech Republic, pp. 476–481 (2009) 27. METS Editorial Board.: METS: Metadata Encoding & Transmis-
5. Baumann, R., Seales, W.B.: Robust registration of manuscript sion Standard. http://www.loc.gov/standards/mets/
images. In: Proceedings of 9th Joint Conference on Digital 28. Rehbein, M.: Reconstructing the textual evolution of a medieval
Libraries, USA, pp. 263–266 (2009) manuscript. Lit. Linguist. Comput. 24(3), 319–327 (2009). doi:10.
6. Beckett, D., Berners-Lee, T.: Turtle-Terse RDF Triple Language, 1093/llc/fqp020
W3C Team Submission, 14 January 2008 (2008) 29. Roland, P.: The Music Encoding Initiative, MEI. In: Proceedings of
7. Bizer, C., Cyganiak, R., Heath, T.: How to publish linked the First International Conference on Musical Applications Using
data on the Web (2007). http://sites.wiwiss.fu-berlin.de/pub/ XML (2002)
LinkedDataTutorial/ 30. Sanderson, R.: Linking past and future: an application of dynamic
8. Boccace, J.: Des cas des nobles hommes et femmes, c. 1410, Paris. HTML for medieval manuscript editions. Ph.D. Thesis, University
Held at Bibliotheque de Geneve, Geneva. Shelfmark: Ms. fr. 190/1. of Liverpool (2003)
doi:10.5076/e-codices-bge-ft0190-1 31. Sanderson, R., Ciccarese, P., Van de Sompel, H., (eds): Open Anno-
9. Brugman, H., Malaise, V., Hollink, L.: A common multimedia tation Core Data Model, W3C Community Draft, 09 May 2012.
annotation framework for cross linking cultural heritage digital http://www.openannotation.org/spec/core/ (2012)
collections. In: Proceedings of the 6th International Conference of 32. Sanderson, R., Albritton, B., (eds): Shared Canvas Data Model,
Language Resources and Evaluation, Morocco (2008) June 01 2012. http://www.shared-canvas.org/datamodel/spec/
10. Bulacu, M., et al.: Layout analysis of handwritten historical docu- (2012)
ments for searching the archive of the Cabinet of the Dutch Queen. 33. Schmidt, D., Colomb, R.: A data structure for representing multi-
In: Proceedings of the 9th International Conference on Document version texts online. Int. J. Hum. Comput. Stud. 67, 497–514
Analysis and Recognition, Brazil, pp. 357–361 (2007) (2009). doi:10.1016/j.ijhcs.2009.20.001
11. Clark, J.: XSL Transformation (XSLT) Version 1.0, W3C Recom- 34. TEI Consortium (eds): TEI P5: guidelines for electronic text encod-
mendation, 16 November 1999, W3C (1999) ing and interchange, TEI Consortium, November 2007. http://
12. Dahlstrom, E., et al. (eds): Scalable Vectore Graphics (SVG) 1.1, www.tei-c.org/Guidelines/P5/ (2007)
W3C Working Draft, 22 June 2010, W3C (2010) 35. Troncy, R., Mannens, E., (eds): Media Fragments URI 1.0, W3C
13. Doumat, R., Egyed-Zsigmond, E., Pinon, J.: Digitized ancient doc- Working Draft, W3C, 17 December 2009 (2009)
uments.. what’s next?, pp. 195–210. INFORSID, France (2009) 36. Unknown Gospels of St. Augustine c. 600 Held at Parker Library,
14. Froissart, J.: Chroniques, c. 1400. Held at Morgan Library, New Corpus Christi College, Cambridge. Shelfmark: CCCC MS 286
York City, NY. Shelfmark: M.804 37. Unknown Chansonnier Cordiforme c. 1475. Held at Bibliotheque
15. Hickson, I.: HTML5. A vocabulary and associated APIs for HTML Nationale de France, Shelfmark: BNF Rothschild MS 2973
and XHTML, W3C Working Draft, W3C, 19 October 2010 (2010) 38. Van de Sompel, H., Lagoze, C., et al.: Adding eScience assets to the
16. IFLA Study Group on FRBR: Functional requirements for biblio- Data Web, linked data on the Web Workshop. In: 18th International
graphic records. Final Report, International Federation of Library World Wide Web Conference, Spain (2009)
Associations and Institutions (2009) 39. Vincent of Beauvais, Speculum Historiale, c. 1325. Held at Corpus
17. von Arx, I.: Veterum Fragmentorum Manuscriptis Codicibus Christi College, Cambridge, Parker Library, Shelfmark CCCC MS
detractorum collectio Tom. I., 1822, Held at St. Gallen, 008. http://purl.stanford.edu/cv176gb0028
Stiftsbibliothek, Shelfmark Cod. Sang. 1394. doi:10.5076/
e-codices-csg-1394
18. Iso, TC 171/SC 2: ISO 32000–1:2008 Document Management-
Portable Document Format-Part 1: PDF 1.7. http://www.iso.org/
iso/catalogue_detail.htm?csnumber=51502 (2008)

123
Copyright of International Journal on Digital Libraries is the property of Springer Science & Business Media
B.V. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright
holder's express written permission. However, users may print, download, or email articles for individual use.

You might also like