Lis6711 Research Paper by Crystal Stephenson
A Comparative Analysis:
Crystal Stephenson
“framework developed under the auspices of the Library of Congress to exert bibliographic
control over traditional and web resources in an increasingly digital world.” (Tharani, 2015)
Slated to replace the outdated Machine-Readable Cataloging format, or MARC, used for the
standards to a linked data model, in order to make bibliographic information more useful both
within and outside the library community.” (Taniguchi, 2017) Meanwhile, Schema.org launched
with the objective to “allow Web publishers a means to express rich metadata” and improve
search engine retrieval in the “evolution towards the data Web.” (Miller et al., 2012) In 2012, the
highlights their overlapping goals in linked data initiatives. While the foundation of Schema.org
is not specifically “aligned with the objectives of the Library community, the potential impact
this work may have on the library community is enormous.” The aim of this paper is to review
and compare the BIBFRAME and Schema.org models of information organization, concluding
with a brief analysis of why, according to the published research, the Library of Congress
might remain committed to the BIBFRAME model over the Schema.org framework.
Launched in 2011, BIBFRAME is a data model lauded as “the foundation for the future
of bibliographic description that happens on the web and in the networked world.” (Taniguchi,
develop BIBFRAME, projecting it to replace the MARC standards of description created in the
“translate bibliographic data to a linked data model while also incorporating emerging data
standards and models including Functional Requirements for Bibliographic Records (FRBR) and
Resource Description and Access (RDA).” (Mitchell, 2013) Created to “make library records
accessible to the Web at large” and promising “to better accommodate RDA and its vocabulary
than MARC 21” (El-Sherbini, 2017), BIBFRAME was designed to “deconstruct MARC 21
While librarians created FRBR, the outsourced BIBFRAME was introduced for
simplification of group entities with the intent of being more user-friendly, and coded to define
classes, instances, and properties of a resource. The BIBFRAME model is expressed in RDF,
consisting of three main classes of abstraction, including Work, Instance, and Item, with the
addition of the three key concepts of Agent, Subject, and Event, in relation to the core entities.
Work “identifies the conceptual essence of something” (El-Sherbini, 2017), while the Instance
“reflects the material embodiment of a Work” and the Item is the actual copy of it. Agents are
defined as the “people, organizations, jurisdictions, etc., that are associated with a Work or
Instance through roles such as” authorship, and Subjects identify the concepts, including topics,
places, or events, the latter being occurrences, “the recording of which may be the content of a
Work.”
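As a rough illustration of the Work, Instance, and Item hierarchy described above, the following Python sketch prints the three core classes as RDF-style triples. The example.org identifiers, the describe_resource helper, and the sample title are hypothetical and serve only for illustration. The class and property names (Work, Instance, Item, instanceOf, itemOf) are assumed to follow the published BIBFRAME 2.0 vocabulary. This is a sketch under those assumptions, not the Library of Congress's own tooling.

```python
# Namespace assumed for the BIBFRAME 2.0 vocabulary.
BF = "http://id.loc.gov/ontologies/bibframe/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical base URI for the resources being described.
EX = "http://example.org/"


def triple(subject, predicate, obj):
    """Format one subject-predicate-object statement in N-Triples style."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def describe_resource(slug):
    """Return triples linking an Item to its Instance and the Instance to its Work."""
    work = f"{EX}work/{slug}"                   # the conceptual essence
    instance = f"{EX}instance/{slug}-hardcover"  # one material embodiment
    item = f"{EX}item/{slug}-copy1"              # one actual copy
    return [
        triple(work, RDF_TYPE, BF + "Work"),
        triple(instance, RDF_TYPE, BF + "Instance"),
        triple(instance, BF + "instanceOf", work),
        triple(item, RDF_TYPE, BF + "Item"),
        triple(item, BF + "itemOf", instance),
    ]


triples = describe_resource("moby-dick")
for line in triples:
    print(line)
```

Each level points upward to the one it depends on, so a single Work can be shared by many Instances, and a single Instance by many Items.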
In short, “FRBR recognizes entities, attributes and relationships between entities like
web resources” (Hallo et al., 2016), while BIBFRAME “is defined in RDF, identifying all
BIBFRAME “enables the use of annotations such as mapping to other vocabularies” like Dublin
Core, resulting in improved catalog retrieval. In an effort to simplify or streamline the FRBR
work concept,” but essentially marries both the Work and Expression of FRBR, while
BIBFRAME’s Instance entity “can roughly be described as a conflation between Expression and
open web visibility of the resources they offer their patrons” (Pesch & Miller, 2016), since most
people “start their quests for knowledge in the web instead of going to a physical site or custom
library or information provider website.” With this rationale in mind, it is natural to see why the
MARC standard for describing materials employed by librarians is outdated, since it was
developed decades before the web. “Generic tools used for discovering information on the web
were not designed to ingest and index MARC readily, so library content is not visible in the
search engines users use to find information on the web.” BIBFRAME was not the only
conceptual model to identify this disconnect and adapt to the times, however. Schema.org was
founded by the world’s top search engines, Google, Microsoft Bing, Yahoo and Yandex, with the
belief that a “shared vocabulary makes it easier for webmasters and developers to decide a
schema and get maximum benefit for their efforts.” (Schema.org, 2018) Schema.org is a
“collaborative, community activity with a mission to create, maintain, and promote schemas for
structured data on the Internet, on web pages, in email messages, and beyond.” The vocabulary
can essentially be used in a variety of encodings, including RDFa and JSON-LD, covering
“entities, relationships between entities and actions, and can easily be extended through a well-
Schema.org’s initiative provides a “core ontology for search engines to normalize the
markup of webpages in a way that reduces ambiguity about what the pages are describing and
makes the integration of the data into search engines more efficient.” (Fons et al., 2012) OCLC
A COMPARATIVE ANALYSIS 5
took notice of the developments in the search engine industry and “realized that it could be an
important tool to more effectively represent the collective collections of libraries on the Open
Web.” And since 2011, “OCLC researchers have been experimenting with Schema.org as a
vehicle for exposing library metadata to Web search engines in a format they seek and
understand.” (Godby & Denenberg, 2015) Their efforts “led to the 2012 publication of Schema.org
metadata elements expressed as linked data on 300 million catalog records accessible from
WorldCat.org.” In other words, it would appear that Schema.org has become “an ideal tool to
mediate” the complexity between a web user’s search and the content to be delivered, and “more
efficiently connect end users to the content they desire” (Fons et al., 2012).
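To make the idea of exposing catalog data in a format search engines understand more concrete, the following Python sketch builds a small Schema.org description of a book in the JSON-LD encoding mentioned earlier. The book_jsonld helper and the sample values, including the placeholder ISBN, are hypothetical. The property names (name, author, isbn) are assumed to match the Schema.org Book type. The sketch does not reproduce the markup OCLC actually published on WorldCat.org.

```python
import json


def book_jsonld(name, author, isbn):
    """Build a minimal Schema.org Book description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": name,
        "author": {"@type": "Person", "name": author},
        "isbn": isbn,
    }


# Placeholder values for illustration only; the ISBN is not a real identifier.
record = book_jsonld("Moby-Dick", "Herman Melville", "9780000000000")
markup = json.dumps(record, indent=2)
print(markup)
```

On a web page, output of this kind is typically embedded in a script element of type application/ld+json so that a crawler can read it without parsing the visible text.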
complementary, the “coverage of Schema.org is necessarily broad but shallow because library
resources must compete with creative works offered by many other communities in the
information landscape.” (Godby & Denenberg, 2015) In comparison, the coverage of BIBFRAME “is
deep because it contains the vocabulary required of the next-generation standard for describing
library collections.” Similarities do exist, “particularly in the definition of entities such as Work,
Instance, Organization, and Person.” But the linked data models being developed by OCLC
“optimize descriptions of library resources for discovery on the Web beyond libraries, using the
vocabulary designed for consumption by general-purpose search engines.” Likewise, the broad
vocabulary defined by Schema.org serves the greater information-seeking public, but “may not
include many of the details defined by BIBFRAME, which aims more to address the needs of
long-term curation by libraries and other cultural heritage institutions.” (Godby & Denenberg, 2015)
When an individual begins their hunt for information “with a search engine or social
network, whose objective is to help users locate information, then cultural heritage organizations
need to help those engines and networks direct users to answers, especially those held by
libraries.” (Miller et al., 2012) The BIBFRAME model is specifically designed to meet these
standards head on, by coordinating “the cataloging and metadata that libraries create with these
efforts, and connect them.” The BIBFRAME model is therefore “the library community’s formal
One may ask why the Library of Congress has not simply adopted the Schema.org model
with appropriate alterations and extensions to the existing vocabulary for cataloging purposes.
OCLC is still in its experimental phase, so the research is inconclusive at present.
As previously mentioned, however, Schema.org is very broad in scope and formatted to make “it easier
for search engines to specialize the way pages are listed” (Pesch & Miller, 2016), but has “less
to do with the actual surfacing of otherwise hidden resources,” which is the goal of
BIBFRAME’s development. BIBFRAME affords the Library of Congress the “richness and
flexibility required for catalogers and archivists, but also renders the described resource to be
machine- and web-readable.” This is an ideal framework that encompasses the necessities of
linked data principles, practices, and technologies, in order to stay ahead of the curve, while also
working “as a bridge between the description component and open web discovery.” BIBFRAME
descriptions are also more “detailed because they include the specialized vocabulary required for
professional curation” (Godby & Denenberg, 2015). Their “focus on vocabulary development to support
this model apart from Schema.org, which “can refer to this description and enhance its own
simply by adding a ‘same as’ assertion containing the BIBFRAME URI.” But it should be noted
that, “to generate comparable descriptions or to pass them through OCLC’s data processing
stream without loss of information,” Schema.org must “use the BIBFRAME vocabulary
directly.” BIBFRAME supplies a “depth of description” that may always be missing from similar
data models optimized for discovery. For instance, a map can be defined as a resource in
Schema.org, “but the list of defined properties is too sketchy to meet the stewardship needs of
librarianship” as of yet.
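The “same as” assertion described above can be sketched in a few lines of Python. The example assumes a sparse Schema.org description of a map and a hypothetical BIBFRAME URI on example.org. The with_same_as helper is illustrative only, and sameAs is assumed to be the Schema.org property that carries such a link. The sketch shows the linking mechanism, not OCLC's data processing stream.

```python
import json


def with_same_as(description, bibframe_uri):
    """Return a copy of a Schema.org description that points to a BIBFRAME URI."""
    enriched = dict(description)
    existing = enriched.get("sameAs", [])
    if isinstance(existing, str):
        existing = [existing]  # normalize a single value to a list
    if bibframe_uri not in existing:
        existing = list(existing) + [bibframe_uri]
    enriched["sameAs"] = existing
    return enriched


# A deliberately sparse Schema.org description of a map (sample values).
map_description = {
    "@context": "https://schema.org",
    "@type": "Map",
    "name": "Sample county survey map",
}

# Hypothetical URI standing in for the richer BIBFRAME description.
BIBFRAME_URI = "http://example.org/bibframe/works/sample-map"

linked = with_same_as(map_description, BIBFRAME_URI)
print(json.dumps(linked, indent=2))
```

The Schema.org record stays short and discovery-oriented, while the link lets a consumer follow the URI to the more detailed description maintained for curation.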
Libraries must be flexible, responsive, and “serve their users in exciting new ways” (Xu
et al., 2018), or those who “cling to outdated standards,” like MARC, “will find it increasingly
difficult to serve their clients as they expect or deserve.” As one study opined, “there remains a
widespread perception that continued adherence to MARC is causing libraries to fall behind the
technological curve and to miss out on collaborative opportunities with other metadata
discussion of linked data, and while they share many similarities, there remains enough
case of the Library of Congress. More research is required and experimentation necessary before
a determination can be made as to whether the Library of Congress should consider integrating a
Schema.org model. But while the debate forges on, it is clear that the shift towards linked data is
most advantageous in our increasingly digital age. To be certain, we “are beginning to see the
recognition of metadata as crucial in this world where the web gives us much of our
information.” (Dull, 2016) Librarians have long feared their jobs would become obsolete
in the shadow of Google, but on the contrary, they are more relevant now than ever, “given
their skills in creating and managing metadata and the need to make these rich resources
accessible” to all.
References
Dull, M. E. (2016). Moving Metadata Forward with BIBFRAME: An Interview with Rebecca
Fons, T., Penka, J., & Wallis, R. (2012). OCLC’s Linked Data Initiative: Using Schema.org to
Make Library Data Relevant on the Web. Information Standards Quarterly. Vol. 24,
Godby, C., & Denenberg, R. (2015). Common Ground: Exploring Compatibilities Between the
Linked Data Models of the Library of Congress and OCLC. Dublin, Ohio: Library of
https://www.oclc.org/content/dam/research/publications/2015/oclcresearch-loc-linked-
data-2015.pdf.
Hallo, M., Luján-Mora, S., Maté, A., & Trujillo, J. (2016). Current State of Linked Data in
doi:10.1177/0165551515594729
Miller, E., Ogbuji, U., Mueller, V., & MacDougall, K. (2012). Bibliographic Framework as a
Web of Data: Linked Data Model and Supporting Services (PDF) (Report). Library of
11-21-2012.pdf.
Reports, 49(5), 26-43.
Pesch, O., & Miller, E. (2016). Using BIBFRAME and Library Linked Data to Solve Real
doi:10.1080/0361526X.2016.1183159
Schema.org. (2018). About Schema.org. Schema.org. Retrieved July 20, 2018, from
https://schema.org/
Taniguchi, S. (2017). Examining BIBFRAME 2.0 from the Viewpoint of RDA Metadata
doi:10.1080/01639374.2017.1322161
Tharani, K. (2015). Linked Data in Libraries: A Case Study of Harvesting and Sharing
5-19.
Xu, A., Hess, K., & Akerman, L. (2018). From MARC to BIBFRAME 2.0:
doi:10.1080/01639374.2017.1388326