AACR (Anglo-American Cataloguing Rules)
A data content standard for describing bibliographic materials. http://www

algorithm
A formula or procedure for solving a problem or carrying out a task. An algorithm is a set of steps in a very specific order, such as a mathematical formula or the instructions in a computer program.

application profile
A set of metadata elements, policies, and guidelines defined for a particular application or community. The elements may be from one or more element sets, thus allowing a given application to meet its functional requirements by using metadata from several element sets, including locally defined elements.

authentication
A human or machine process that verifies that an individual, computer, or information object is who or what it purports to be.

authority file
A file, typically electronic, that serves as a source of standardized forms of names, terms, titles, and so on. Authority files should include references or links from variant forms to preferred forms. For example, in the Library of Congress Name Authority File (LCNAF), “Schiavone, Andrea” is the preferred name form for a Dalmatian artist active in Italy during the sixteenth century, while “Medulić, Andrija,” “Lo Schiavone,” and several other forms are listed as variant names. Authority files regulate usage but also provide additional access points, thus increasing both the precision and the recall of many searches.

back-end database
A database that contains and manages data for an information system, distinct from the presentation or interface components of that system.

CCO (Cataloging Cultural Objects)
A data content standard for describing works of art, architecture, and material culture. index.html.

CDWA (Categories for the Description of Works of Art)
A set of metadata categories and recommendations that may be used to design information systems and to do cataloging for art, architecture, objects of material culture, and archaeological and archival materials. conducting_research/standards/cdwa/.

CDWA Lite
An XML schema for core records for art, architecture, and material culture designed to work with the OAI-PMH; the elements are based on a subset of the full element set of Categories for the Description of Works of Art (CDWA). http://www standards/cdwa/cdwalite.html.

CGI script
A computer program, most frequently written in C, Perl, or a shell scripting language, that uses the Common Gateway Interface (CGI) standard and provides an interactive interface between a user or an external computer application and a World Wide Web server. CGI scripts are most commonly used to develop forms that allow users to submit information to a Web server.

CIDOC CRM (CIDOC Conceptual Reference Model)
An object-oriented ontology for the mediation and interchange of heterogeneous cultural heritage information.

client
An application that retrieves and/or renders resources or resource manifestations. Often used to denote a computer or other kind of device connected to a network, equipped with software that enables users to access resources available on another computer connected to the same network, called a server. See also server.

conceptual data model
An abstract model or representation of data for a particular domain, business enterprise, field of study, etc., independent of any specific software or information system. Usually expressed in terms of entities and relationships. See also logical data model.

crosswalk
A chart or table (visual or virtual) that represents the semantic mapping of fields or data elements in one data standard to fields or data elements in another standard that has a similar function or meaning. Crosswalks make it possible to convert data between databases that use different metadata schemes and enable heterogeneous databases to be searched simultaneously with a single query as if they were a single database (semantic interoperability). Also known as field mapping. See also metadata mapping.

DACS (Describing Archives: A Content Standard)
A data content standard for describing archival collections. http://www asp?objectID=1279.

data content standard
Rules that determine the vocabulary, syntax, or format of content entered into data fields or metadata elements, for example, Anglo-American Cataloguing Rules (AACR), ISO 8601 (rules for recording date and time), Describing Archives: A Content Standard (DACS), Cataloging Cultural Objects (CCO).

data provider (OAI nomenclature)
An organization that exposes metadata records in one or more repositories (specially configured servers) for harvesting by service providers.

Deep Web
See Hidden Web.

default values
Values that are assumed or supplied automatically, for example, by a computer system, if a value is not specified.

digital signatures
A form of electronic authentication of a digital document. Digital signatures are created and verified using public key cryptography and serve to tie the document being signed to the signer.

digital surrogate
A digital “copy” of an original work or item, for example, a JPEG or TIFF image of a painting or sculpture or a PDF file of an article or book. In OAI nomenclature, digital surrogates are often referred to as “resources.”

DTD (Document Type Definition)
A collection of markup declarations that define the structure, elements, and attributes that can be used in encoding certain types of documents in SGML or, more commonly, in XML. Examples of DTDs include the EAD DTD, the HTML DTD, and the TEI DTD. XML DTDs are gradually being replaced by the newer XML schemas.

Dublin Core Metadata Element Set
A set of 15 metadata elements that can be assigned to information resources, optimized for resource discovery on the World Wide Web. Also often used as a “lowest common denominator” in metadata mapping. http://dublincore

dynamically generated
Refers to a Web page, metadata record, or other information object that is generated on demand, typically from content stored in a database, and usually either in response to a user’s input or from dynamic data sources that are refreshed periodically. The expression “on the fly” is often used in relation to dynamically generated content.

EAD (Encoded Archival Description)
A data structure standard for encoding archival finding aids in SGML or XML according to the EAD DTD or EAD XML schema, making it possible for the semantic contents of a hierarchically structured finding aid to be machine processed.

encryption
An encoding mechanism used to prevent unauthorized users from reading digital information and also for user and document authentication. Only designated users or recipients have the capability to decode encrypted materials.

entity-relationship model
A type of conceptual data model that represents structured data in terms of entities and relationships. An entity-relationship diagram can be used to represent information objects and their relationships visually. Because the constructs used in the entity-relationship model can easily be transformed into relational tables, this type of model is often used in database design.

EXIF (Exchangeable Image File Format)
A specification for an image file format for digital cameras that provides the ability to attach image metadata to JPEG, TIFF, and RIFF images. As of this writing, EXIF is not maintained by any industry or standards organization but is widely used by camera manufacturers. http://

field mapping
See crosswalk.

FTP (File Transfer Protocol)
A TCP/IP protocol that allows data files to be copied directly from one computer to another over the Internet.

finding aid
A descriptive tool widely used in archives. Finding aids typically take the form of hierarchical narrative descriptions of cohesive groups of archival records or collections of manuscript materials. Finding aids traditionally were paper documents; EAD is a structured way of expressing finding aids as machine-readable data.

FRBR (Functional Requirements for Bibliographic Records)
A set of requirements and a conceptual entity-relationship model developed by the International Federation of Library Associations and Institutions (IFLA) to support bibliographic access and control.

FRBRoo
A joint initiative of the International Federation of Library Associations and Institutions (IFLA) and the International Council of Museums–International Documentation Committee (ICOM-CIDOC) to create an object-oriented ontology that both captures the semantics of bibliographic information and harmonizes those concepts in common with the CIDOC CRM, thus facilitating information interchange between the museum and library communities. http://cidoc.ics

folksonomy
An assemblage of concepts, represented by terms and names (called “tags”), the result of social tagging. Note that a folksonomy is not a true taxonomy. See also social tagging, taxonomy.

Google Sitemap
Metadata about the content of a Web site that assists the Googlebot Web crawler in indexing a site more efficiently and comprehensively. .com/webmasters/sitemaps/.
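
Example (crosswalk). The crosswalk entry above describes a table that maps fields in one standard to fields in another. The following is a minimal sketch of that idea, not a published mapping: the local field names and the record values are invented for illustration, and only the target names (creator, title, date) are taken from the Dublin Core Metadata Element Set.

```python
# Hypothetical crosswalk: local collection-database fields -> Dublin Core elements.
# The local field names are assumptions made up for this sketch; the target
# names (creator, title, date) are Dublin Core elements.
CROSSWALK = {
    "artist_name": "creator",
    "work_title": "title",
    "creation_date": "date",
}


def apply_crosswalk(record, crosswalk):
    """Return a new record keyed by the target standard's element names.

    Fields with no mapping are dropped, which is one way that converting
    to a smaller element set can lose information.
    """
    return {
        crosswalk[field]: value
        for field, value in record.items()
        if field in crosswalk
    }


local_record = {
    "artist_name": "Schiavone, Andrea",  # preferred form from the authority file entry
    "work_title": "Untitled study",      # placeholder value, not a real work
    "creation_date": "1550",             # placeholder year
    "storage_location": "Room 12",       # placeholder; has no mapping in this crosswalk
}

dc_record = apply_crosswalk(local_record, CROSSWALK)
print(dc_record)
```

In this sketch the unmapped field is silently discarded; a real conversion might instead log it or carry it in a local extension element.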

granularity
The level of detail at which an information object or resource is viewed or described.

harvester (OAI nomenclature)
A computer system that sends OAI-PMH requests to OAI data providers’ repositories and harvests metadata records from them.

header metadata
Metadata embedded in the header part of a digital file.

Hidden Web (also known as Deep Web, Invisible Web)
The sum of the Web pages that are not accessible to Web crawlers, usually because they are either dynamically generated by a user querying a database or password-protected.

hostname
An identifier for a specific machine on the Internet. The hostname identifies not only the machine but also its subnet and domain. See also domain name.

HTML (HyperText Markup Language)
An SGML-derived markup language used to create documents for World Wide Web applications. HTML has evolved to emphasize design and appearance rather than the representation of document structure and metadata elements.

HTTP
HyperText Transfer Protocol, the standard protocol that enables users with Web browsers to access HTML documents and related media.

hyperlink
An abbreviated reference to a “hypertext link,” a method of creating nonlinear pathways between related digital documents or of linking to related objects such as image or audio files.

information object
A digital item or group of items referred to as a unit, regardless of type or format, that a computer can address or manipulate as a single discrete object.

Internet
A global collection of computer networks that exchange information by the TCP/IP suite of networking protocols.

Internet directory
A thematically organized list of descriptive links to Internet sites, often created by humans who have classified sites by their content. Yahoo! provides numerous such directories.

interoperability
The ability of different information systems to work together, particularly in the correct interpretation of data semantics and functionality. See also semantic interoperability.

Invisible Web
See Hidden Web.

legacy system
An information system that has been developed and modified over a period of time and has become outdated and difficult and costly to maintain but that holds important information and involves processes that are deeply ingrained in an organization. Legacy systems usually are eventually replaced by a new hardware and software configuration.

link resolver
Software that uses the OpenURL standard to automatically redirect a user’s request to the most appropriate copy of a networked digital object. Typically, link resolvers are used by libraries to direct their patrons from bibliographic records or abstracts to licensed subscription-based resources such as full-text electronic versions of articles and books. detail.cfm?std_id=783.

logical data model
A data model that includes all entities and the relationships among them based on the structures identified in a conceptual data model and that specifies all attributes for each entity. The data is described in as much detail as possible, without regard to how it will be physically implemented in a specific database.

MARC (Machine-Readable Cataloging format)
A set of standardized data structures for describing bibliographic materials that facilitates cooperative cataloging and data exchange in bibliographic information.

markup language
A formal way of annotating a document or collection of digital data using embedded encoding tags to indicate the structure of the document or datafile and the contents of its data elements. This markup also provides a computer with information about how to process and display marked-up documents. HTML, XML, and SGML are examples of standardized markup languages.

memory institution
A generic term used to describe an institution that has a responsibility to collect, care for, and provide access to the human record, for example, museums, libraries, and archives.

metadata mapping
A formal identification of equivalent or nearly equivalent metadata elements or groups of metadata elements within different metadata schemas, carried out in order to facilitate semantic interoperability.

metadata mining
The automated extraction of metadata from electronic documents.

metasearch
Searching of diverse databases on diverse platforms with diverse metadata in real time by means of one or more protocols. The NISO MetaSearch Initiative defines metasearch as “search and retrieval to span multiple databases, sources, platforms, protocols, and vendors at once.” Metasearch enables users to enter search criteria once and access several search engines simultaneously. With metasearch, fresh records are always available, because searching is in real time, in a distributed environment.
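
Example (metasearch). The metasearch entry above describes entering search criteria once and querying several sources at the same time. The sketch below illustrates only that fan-out-and-merge pattern, under stated assumptions: the two sources are in-memory lists with invented titles, and the network layer (a protocol such as Z39.50 or SRU) is omitted entirely.

```python
from concurrent.futures import ThreadPoolExecutor

# Two hypothetical sources, simulated as in-memory lists of invented titles.
# A real metasearch tool would reach each source over a search protocol.
SOURCES = {
    "library_catalog": ["Book history: an introduction", "Maps of Venice"],
    "museum_database": ["Printing press, wood", "Book history exhibition poster"],
}


def search_source(name, query):
    """Search one source and tag each hit with the source it came from."""
    hits = [title for title in SOURCES[name] if query.lower() in title.lower()]
    return [(name, title) for title in hits]


def metasearch(query):
    """Send one query to every source in parallel and merge the results."""
    with ThreadPoolExecutor() as pool:
        # pool.map yields results in the same order as the source names.
        result_lists = pool.map(lambda name: search_source(name, query), SOURCES)
        merged = [hit for hits in result_lists for hit in hits]
    return merged


results = metasearch("book history")
for source, title in results:
    print(f"{source}: {title}")
```

Because each source is queried at search time rather than from a pre-built index, the merged list reflects whatever the sources hold at that moment, which is the "fresh records" property the entry mentions.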

meta tag
An HTML tag that enables metadata to be embedded invisibly on Web pages, for example, Description, Keywords.

meta tag spamming
The deliberate misuse of meta tags in order to attract traffic to a site, for example, by boosting its ranking in search results.

METS (Metadata Encoding and Transmission Standard)
A standard for encoding descriptive, administrative, and structural metadata relating to objects in a digital library, expressed in XML. METS enables the “packaging” of complex digital objects that include a range of metadata as well as related digital surrogates. http://www

MODS (Metadata Object Description Schema)
An XML schema for bibliographic records, developed and maintained by the Library of Congress. http://www

namespace
The set of unique names used to identify objects within a well-defined domain, particularly relevant for XML applications. XML Namespaces is a W3C recommendation for providing uniquely named elements and attributes in an XML instance. A namespace is declared using the reserved XML attribute xmlns, the value of which must be a URI (Uniform Resource Identifier) reference. For example, the Dublin Core Metadata Element Set, Version 1.1 (original 15 elements) has the approved DCMI namespace URI http://purl.org/dc/elements/1.1/.

nesting
The way in which subelements may be contained within larger elements, resulting in multiple levels of metadata.

network bandwidth
Derived from the term used to describe the size or “width” of the frequencies used to carry analog communications such as television and radio. For Internet purposes, bandwidth is generally (and incorrectly) used to refer to the rate of data transfer.

OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting)
A protocol used to harvest or collect metadata records from data providers.

object-oriented
A programming or data modeling methodology that utilizes the notion of classes and their properties. Members (or instances) of a class share the same properties, for example, color or weight. Although members of a class all share the same properties, the values of those properties do not need to be the same. Classes can contain subclasses, members of which inherit the properties of the parent or “superclass.”

ontology
A formal, machine-readable specification of a conceptual model, in which concepts, properties, relationships, functions, constraints, and axioms are all explicitly defined.

OPAC (Online Public Access Catalog)
A computerized inventory of a library’s holdings.

Open WorldCat
A subset of the WorldCat union bibliographic database made available by OCLC to certain Web search engines and online book retailers. http://www.oclc

PageRank™ (Google)
A proprietary link-analysis algorithm developed by Google founders Larry Page and Sergey Brin to assign a numerical score to each document in a set of hypertext documents based on the number of referring links. The algorithm also takes into account the rank of the referring page, such that a link from a high-ranking page counts more than a link from a low-ranking page. .com/technology/.

precision
A measure of search effectiveness expressed as the ratio of relevant records or documents retrieved from a database to the total number retrieved in response to the query; for example, in a database containing 100 records relevant to the topic “book history,” a search retrieving 50 records, 25 of which are relevant to the topic, would have 50 percent precision (25/50). (Definition from ODLIS, Online Dictionary for Library and Information Science.) See also recall.

protocol
A specification, often a standard, that describes how computers communicate with each other, for example, the TCP/IP suite of communication protocols.

RDF (Resource Description Framework)
An application of XML that enables the creation of rich, structured, machine-readable resource descriptions. http://

RDF schema
A set of semantics within a defined namespace for use with specific applications of RDF.

recall
A measure of the effectiveness of a search expressed as the ratio of the number of relevant records or documents retrieved in response to the query to the total number of relevant records or documents in the database; for example, in a database containing 100 records relevant to the topic “book history,” a search retrieving 50 records, 25 of which are relevant to the topic, would have 25 percent recall (25/100). (Definition from ODLIS, Online Dictionary for Library and Information Science, odlis/.) See also precision.

relevance
The extent to which information retrieved in a search of a library collection or other resource, such as an online catalog or a bibliographic database, is judged by the user to be applicable to (“about”) the subject of the query. Relevance depends on the searcher’s subjective perception of the degree to which the document fulfills the information need, which may or may not have been expressed fully or with precision in the search statement. Measures of the effectiveness of information retrieval, such as precision and recall, depend on the relevance of search results. (Definition from ODLIS, Online Dictionary for Library and Information Science.)

relevance ranking
The algorithmic process, a feature of many search software applications, by which the records in a result set are sorted or ranked according to their relevance. In OPACs, for example, relevance is computed based upon the number of occurrences of the search term in the record that is retrieved, and the weight assigned to the field(s) in which the search term appears. (Definition from ODLIS, Online Dictionary for Library and Information Science, odlis/.) Google’s PageRank™ is an example of a relevance ranking algorithm.

resource discovery
The process of searching for specific information objects on the Web.

robot
See Web crawler.

schema
A set of rules for encoding information that supports specific communities of users. Also called “scheme.” The plural forms of the word schema are schemas and schemata. See also XML schema.

schema registry
An authoritative source of names, semantics, and syntaxes for one or more schemas.

screen scraping
A technique in which display data (usually unstructured) is automatically retrieved and extracted, for example, from a Web page.

search engine
A computer program that allows users to search electronic resources. In the context of the World Wide Web, the term usually refers to a program that searches a large index of Web pages generated by an automated Web crawler. See also Web search engine.

semantic interoperability
The ability of different agents, services, and applications to communicate data while ensuring accuracy and preserving the meaning of the data (definition based on Marcia Bates and Mary Niles Maack, Encyclopedia of Library and Information Sciences, 3rd ed. [New York: Marcel Dekker, forthcoming]).

Semantic Web
An evolving, collaborative effort led by the W3C whose goal is to provide a common framework that will allow data to be shared and re-used across various applications as well as across enterprise and community boundaries. It derives from W3C director and inventor of the World Wide Web Sir Tim Berners-Lee’s vision of the Web as a universal medium for data, information, and knowledge exchange.

server
An application that supplies resources or resource manifestations. Often used to refer to a networked computer that acts as a source of data and/or applications used by multiple client computers or devices. See also client.

service provider (OAI nomenclature)
An institution or organization that harvests metadata from data providers and uses the aggregated metadata as a basis for building value-added services.

SGML (Standard Generalized Markup Language)
International Organization for Standardization standard ISO/IEC 8879:1986; a markup language first used by the publishing industry, for defining, specifying, and creating digital documents that can be delivered, displayed, linked, and manipulated in a system-independent manner. XML and HTML are derived from SGML.

social bookmarking
The decentralized practice and method by which individuals and groups create, classify, store, discover, and share Web bookmarks or “favorites” in an online “social” environment.

social tagging
The decentralized practice and method by which individuals and groups create, manage, and share terms, names, and so on (called tags), to annotate and categorize digital resources in an online “social” environment. A folksonomy is the result of social tagging. Also referred to as collaborative tagging, social classification, social indexing, mob indexing, folk categorization. See also folksonomy, tagging.

spamming
Used in reference to meta tags. The abuse of metadata that creators include in the HTML header area of their Web pages in order to increase the number of visitors to a Web site. Keyword spamming entails repeating keywords multiple times in order to appear at the top of search engine result listings or listing keywords that are irrelevant to the site in order to attract visitors under false pretenses.

spider
See Web crawler.

SRU/SRW (Search and Retrieve via URL/Search and Retrieve Web Service)
Companion protocols for Web search queries utilizing CQL (Common Query Language). http://www.loc.gov/standards/sru/.

surrogate
See digital surrogate.

tagging
In the context of the Web, the act of associating terms (called tags) with an information object (e.g., a Web page, an image, a streaming video clip), thus describing the item and enabling keyword-based classification and retrieval. Tags (a form of user-generated metadata) from communities of users can be aggregated and analyzed, providing useful information about the collection of objects with which the tags have been associated. See also social tagging.

taxonomy
An orderly classification that explicitly expresses the relationships, usually hierarchical (e.g., genus/species, whole/part, class/instance), between and among the things being classified.

TCP/IP (Transmission Control Protocol/Internet Protocol)
The suite of network protocols, specified by the Internet Engineering Task Force (IETF) rather than by ISO, that enables information systems to communicate with other information systems on the Internet, regardless of their computer platforms.

TEI (Text Encoding Initiative)
An international cooperative effort to develop guidelines for standard encoding schemes (i.e., the TEI and TEI Lite DTDs) for literary and linguistic texts. http://

URI (Uniform Resource Identifier)
A short string that uniquely identifies a resource such as an HTML document, an image, a downloadable file, or a service. URLs and URNs are types of URIs.

URL (Uniform Resource Locator)
A type of URI consisting of an Internet address that tells users how and where to locate a specific file on the World Wide Web. A URL includes not only the name of a file but also the name of the host computer, the directory path to get to that file, and the protocol needed in order to use it (e.g., intrometadata/intro.html specifies that the hypertext transfer protocol “http” should be used to retrieve the document intro.html from the host in the directory research/conducting_research/standards/intrometadata).

URN (Uniform Resource Name)
A type of URI consisting of a unique, location-independent identifier of a file available on the Internet. The file remains accessible by its URN regardless of changes that might occur in its host and directory path. For example, urn:issn:0167-6423 is the URN for the journal Science of Computer Programming.

Visible Web
The subset of the World Wide Web that is visible to Web browsers and indexable by search engines’ Web crawlers. To be accessible to Web crawlers, the pages must be accessible simply by following links (i.e., not generated dynamically in response to user input) and not protected by a password.

VRA Core 4.0
An XML schema for describing works of art and architecture and their visual surrogates.

W3C (World Wide Web Consortium)
The main international standards organization for the World Wide Web.

Web 2.0
A phrase used loosely by the Web development community to refer to a perceived “second generation” of Web technologies and applications. Wikis, folksonomies, gaming, podcasting, blogging, and so on, are all considered Web 2.0 applications.

Web browser
A software application that enables users to view and interact with information and media files on the Web. Internet Explorer, Mozilla Firefox, and Netscape Navigator are examples of Web browsers.

Web crawler (robot, spider)
A software program that systematically traverses the Web, either for the purpose of generating a searchable index of Web content or to gather statistics.

Web server
A computer that is able to respond to HTTP requests from clients known as Web browsers and return the appropriate HTTP responses, most typically serving an HTML page.

Web search engine/Internet search engine
A software program that collects data taken from the content of files available on the Web and puts them in an index or database that Web users can search in a variety of ways. The search results provide links back to the pages matching the user’s search in their original location.

wiki
A collaborative Web site that contains pages that any authorized user can edit. Wikis typically retain all former versions of each page, allowing the revision history of a page to be tracked and unwanted revisions to be reversed.

Wikipedia
A free, collaborative, volunteer-driven Web-based encyclopedia that utilizes wiki software to allow anyone to edit articles.

World Wide Web
A vast distributed wide-area client-server architecture for retrieving hypermedia documents over the Internet.

XHTML (Extensible HyperText Markup Language)
A reformulation of HTML in XML.

XML (Extensible Markup Language)
A simple, flexible markup language derived from SGML. Originally designed for large-scale electronic publishing, XML is now playing an increasingly important role in the publication and exchange of a wide variety of data on the Web.

XML schema
A machine-readable definition of the structure, elements, and attributes allowed in a valid instance of a conforming XML document. XML schemas are expressed using the XML Schema Definition language, a W3C standard. http://www

XMP (Extensible Metadata Platform)
A markup language, based on RDF, for recording and embedding metadata about digital assets. Developed by Adobe Systems and supported across the company’s range of software products and file formats.

Z39.50
An ISO 23950 and ANSI/NISO Z39.50 standard information retrieval protocol. Z39.50 is a client/server-based protocol for searching and retrieving information from remote databases.
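
Example (precision and recall). The precision and recall entries in this glossary share one worked example: 100 relevant records in the database, 50 records retrieved, 25 of them relevant. The short sketch below reproduces that arithmetic; the function and variable names are illustrative only.

```python
def precision(relevant_retrieved, total_retrieved):
    """Share of the retrieved records that are relevant."""
    return relevant_retrieved / total_retrieved


def recall(relevant_retrieved, total_relevant):
    """Share of the relevant records in the database that were retrieved."""
    return relevant_retrieved / total_relevant


# Figures from the glossary's "book history" example.
total_relevant = 100     # relevant records held in the database
total_retrieved = 50     # records returned by the search
relevant_retrieved = 25  # returned records that are actually relevant

p = precision(relevant_retrieved, total_retrieved)
r = recall(relevant_retrieved, total_relevant)
print(f"precision = {p:.0%}, recall = {r:.0%}")  # precision = 50%, recall = 25%
```

The same search scores differently on the two measures because they use different denominators: precision divides by what was retrieved, recall by what could have been retrieved.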

