Knowledge Organization: Library Tools and Taxonomies For The Web
3
Librarians work in
corporate settings
Yahoo.com (directory)
Northern Light.com
(search engine)
Amazon.com (e-book seller)
Microsoft.com
4
OCLC Library Corporation
Cooperatively Catalogs:
5
Traditional Library Tools
on the Web
Medical Subject Headings 1996
6
Importance of controlled
vocabulary as metadata
American Library Association
Subject Analysis Committee (SAC)
Subcommittee on Metadata and
Subject Analysis recommendations
http://www.ala.org/alcts/organization/ccs/metarept2.html
7
Controlled Vocabularies
Why We Need Them
Used “behind” search engines
Standard in online databases
New adherents (e.g., Web content managers
using taxonomies)
They work!
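A minimal sketch of how a controlled vocabulary can sit "behind" a search box: variant terms typed by users are mapped to one preferred heading before the index is consulted. The vocabulary entries, document IDs, and function names below are hypothetical and only illustrate the idea.

```python
# Hypothetical mini vocabulary: each variant term points to one preferred heading.
PREFERRED = {
    "heart attack": "Myocardial Infarction",
    "myocardial infarct": "Myocardial Infarction",
    "cardiac infarction": "Myocardial Infarction",
    "myocardial infarction": "Myocardial Infarction",
}

# Hypothetical index: documents are filed under preferred headings only.
INDEX = {"Myocardial Infarction": {"doc-17", "doc-42"}}


def normalize(term):
    """Return the preferred heading for a user's term, or None if it is not controlled."""
    key = " ".join(term.lower().split())  # fold case and collapse whitespace
    return PREFERRED.get(key)


def search(query, index):
    """Look up documents under the preferred heading rather than the raw query string."""
    heading = normalize(query)
    if heading is None:
        return []
    return sorted(index.get(heading, []))


if __name__ == "__main__":
    print(search("Heart attack", INDEX))
```

Because every variant resolves to the same heading, a searcher who types "heart attack" and one who types "cardiac infarction" retrieve the same set of documents.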
8
Sherry Vellucci, Associate
Professor, St. John’s Univ., during
the Conference on Bibliographic
Control for the New Millennium:
“authority control is not only
wonderful, but critical. Controlled
vocabulary mediating tools
should cover Subjects, Genres,
Gazetteers, Names and Titles,
etc.”
9
Metathesauri/Subject Correlations
Unified Medical Language System
(UMLS) maps over 60 medical and
health care thesauri into one Metathesaurus
http://www.nlm.nih.gov/pubs/factsheets/umlsmeta.html
Classification Web
Correlations between Library of Congress
Subject Headings and LC Classification
http://classweb.loc.gov
21
Mapping:
Standard information exchange
systems
Dublin Core to MARC
http://lcweb.loc.gov/marc/dccross.html
MARC to Dublin Core
http://www.loc.gov/marc/marc2dc.html
XMLMARC Crosswalk
http://lcweb.loc.gov/marc/marcsgml.html (must download files)
MARC to XML to MARC Converter
http://www.logos.com/marc/default.asp
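A crosswalk of this kind is essentially a lookup table from one scheme's elements to another's fields. The sketch below assumes a small illustrative subset of Dublin Core to MARC pairings (the tag and subfield choices are my reading of the unqualified crosswalk and should be checked against the published table at the URL above); the function name and record layout are hypothetical.

```python
# Illustrative subset only: Dublin Core element -> (MARC tag, subfield code).
# Verify these pairings against the published crosswalk before relying on them.
DC_TO_MARC = {
    "title": ("245", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
    "rights": ("540", "a"),
}


def dc_to_marc(record):
    """Convert a dict of Dublin Core values into sorted (tag, subfield, value) tuples.

    Returns the mapped fields plus a list of elements the table does not cover.
    """
    fields = []
    unmapped = []
    for element, values in record.items():
        target = DC_TO_MARC.get(element.lower())
        if target is None:
            unmapped.append(element)
            continue
        tag, code = target
        # Dublin Core elements are repeatable, so accept one value or a list.
        value_list = values if isinstance(values, list) else [values]
        for value in value_list:
            fields.append((tag, code, value))
    return sorted(fields), unmapped


if __name__ == "__main__":
    sample = {
        "Title": "Knowledge Organization",
        "Subject": ["Taxonomies", "Metadata"],
        "Audience": "Librarians",
    }
    print(dc_to_marc(sample))
```

Reporting unmapped elements, rather than silently dropping them, makes the gaps between the two schemes visible when records are exchanged.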
22
Mapping:
Specialized information exchange
systems
Standard Industrial Classification
(SIC codes)
to
North American Industry Classification
System (NAICS codes)
24
SIC Code Example
Major group 73=Business services
737=Computer programming, data processing, and other computer related services
7372=Prepackaged software
Equivalent NAICS codes are:
Sector 51=Information
511=Publishing industries
5112=Software publishers (with cross-reference to
Sector 42 for reselling packaged software)
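The example above can be sketched as two small hierarchies plus a concordance between them. Only the codes and titles shown on the slide are included; the dictionary layout and function names are assumptions made for illustration, and a real concordance would be loaded from the official tables.

```python
# Codes and titles taken from the slide; everything else is omitted.
SIC_TITLES = {
    "73": "Business services",
    "737": "Computer programming, data processing, and other computer related services",
    "7372": "Prepackaged software",
}
NAICS_TITLES = {
    "51": "Information",
    "511": "Publishing industries",
    "5112": "Software publishers",
}
# Concordance: one SIC code may map to several NAICS codes, hence the list.
SIC_TO_NAICS = {"7372": ["5112"]}


def hierarchy(code, titles):
    """Return (code, title) pairs for each known prefix of the code, broadest first."""
    prefixes = [code[:n] for n in range(2, len(code) + 1)]
    return [(p, titles[p]) for p in prefixes if p in titles]


def crosswalk(sic_code):
    """Return the NAICS hierarchy for every NAICS code mapped from the SIC code."""
    return [hierarchy(naics, NAICS_TITLES) for naics in SIC_TO_NAICS.get(sic_code, [])]


if __name__ == "__main__":
    for level in hierarchy("7372", SIC_TITLES):
        print("SIC  ", level)
    for path in crosswalk("7372"):
        for level in path:
            print("NAICS", level)
```

Both schemes encode the hierarchy in the code itself, so the broader categories can be recovered by truncating digits.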
25
Using old and new tools for
knowledge organization on the
Web
27
History of Taxonomies
Aristotle
384 - 322 B.C.
Library of Alexandria
28
“Classification” is
used much more
frequently than
“Taxonomy”, in all
fields of study.
29
Numerous formal
taxonomies are
maintained by
government and
commercial
enterprises
30
Taxonomies are used in:
31
Service Codes
CODE TITLE
A Research and Development
B Special Studies and Analysis ‑ Not R&D
C Architect and Engineering Services ‑ Construction
D Information Technology Services, including Telecommunication Services
E Purchase of Structures and Facilities
F Natural Resources and Conservation Services
G Social Services
H Quality Control, Testing and Inspection Services
J Maintenance, Repair, and Rebuilding of Equipment
K Modification of Equipment
L Technical Representative Services
M Operation of Government‑Owned Facilities
N Installation of Equipment
P Salvage Services
Q Medical Services
R Professional, Administrative and Management Support Services
S Utilities and Housekeeping Services
T Photographic, Mapping, Printing, and Publication Services
U Education and Training Services
V Transportation, Travel and Relocation Services
W Lease or Rental of Equipment
X Lease or Rental of Facilities
Y Construction of Structures and Facilities
Z Maintenance, Repair or Alteration of Real Property
36
How do we define
taxonomies in a wired world?
Taxonomy: a classification of elements within a domain
Domain: a sphere of knowledge, influence, or activity
Classification: the operation of grouping elements and establishing
relationships between them (or the product of that operation)
Relationship: a defined linkage between two elements
Element: an object or concept
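These definitions translate fairly directly into a data structure. The sketch below is one possible reading, not a standard model: the class names mirror the slide's terms, while the "narrower-than" relationship type and the method names are assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Element:
    """An object or concept."""
    name: str


@dataclass(frozen=True)
class Relationship:
    """A defined linkage between two elements."""
    source: Element
    kind: str  # e.g. "narrower-than" (hypothetical relationship type)
    target: Element


@dataclass
class Taxonomy:
    """A classification of elements within a domain."""
    domain: str
    elements: set = field(default_factory=set)
    relationships: list = field(default_factory=list)

    def relate(self, source, kind, target):
        """Classify: register both elements and record the linkage between them."""
        self.elements.update({source, target})
        self.relationships.append(Relationship(source, kind, target))

    def narrower(self, element):
        """Return the elements recorded as narrower than the given element."""
        return [r.source for r in self.relationships
                if r.kind == "narrower-than" and r.target == element]


if __name__ == "__main__":
    services = Element("Business services")
    software = Element("Prepackaged software")
    tax = Taxonomy(domain="Industry classification")
    tax.relate(software, "narrower-than", services)
    print(tax.narrower(services))
```

Keeping relationships as typed records, rather than a fixed parent pointer, leaves room for linkages other than strict hierarchy.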
39
More Challenges
Certification of the taxonomy by an
authoritative body.
Finding common ground across multiple
taxonomies or schemas with similar terms
and different meanings.
Ensuring the ongoing integrity of the
taxonomy with constant maintenance.
Acceptance by developers of tagging tools.
Integrating with a legacy system and
external content.
40
The core expertise required for
constructing a taxonomy is:
Systems Analyst who understands
specifications for creating taxonomies
Domain expert/Subject expert in the subject
of the taxonomy
Computational linguist, AI engineer
Linguist and/or Lexicographer
Database/Application Development Expert
Administrative Support
Review Support
41
Example of a custom taxonomy marked up in XBRL (fragment; the opening <schema> tag is not shown):
<element name="propertyPlantAndEquipmentGrossNote.purchasedSoftwareForInternalUse"
         type="monetary">
  <annotation>
    <documentation>this is software that...</documentation>
    <appinfo>
      <xbrl:rollup
          to="ci:propertyPlantAndEquipmentNetNote.propertyPlantAndEquipmentGrossNote"
          weight="1" order="7.5" />
      <xbrl:label xml:lang="en">Purchased software for internal use</xbrl:label>
      <xbrl:reference name="GPSI" number="73" chapter="11" paragraph="b"
          subparagraph="i" />
    </appinfo>
  </annotation>
</element>
</schema>
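Markup like this can be read back with any XML parser. The sketch below wraps a trimmed copy of the fragment in a `<schema>` element so that it parses on its own; the namespace URI bound to the `xbrl` prefix is a placeholder, not the real one, and the `summarize` function is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, declared only so the prefixed tags can be parsed.
XBRL_NS = "http://example.org/xbrl"

SNIPPET = f"""
<schema xmlns:xbrl="{XBRL_NS}">
  <element name="propertyPlantAndEquipmentGrossNote.purchasedSoftwareForInternalUse"
           type="monetary">
    <annotation>
      <appinfo>
        <xbrl:rollup
            to="ci:propertyPlantAndEquipmentNetNote.propertyPlantAndEquipmentGrossNote"
            weight="1" order="7.5" />
        <xbrl:label xml:lang="en">Purchased software for internal use</xbrl:label>
      </appinfo>
    </annotation>
  </element>
</schema>
"""


def summarize(xml_text):
    """Return one dict per <element>: its name, type, label, and roll-up parent."""
    root = ET.fromstring(xml_text.strip())
    ns = {"xbrl": XBRL_NS}
    rows = []
    for el in root.iter("element"):
        label = el.find(".//xbrl:label", ns)
        rollup = el.find(".//xbrl:rollup", ns)
        rows.append({
            "name": el.get("name"),
            "type": el.get("type"),
            "label": label.text if label is not None else None,
            "parent": rollup.get("to") if rollup is not None else None,
        })
    return rows


if __name__ == "__main__":
    for row in summarize(SNIPPET):
        print(row)
```

The `rollup` element is what ties this custom element into the parent taxonomy, so extracting its `to` attribute recovers the hierarchy.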
43
Recommendations:
Actively seek out existing taxonomies in the target discipline or
subject area. If your needs are met in part by an existing
taxonomy, use it and build on it.
Look at the intended purpose of the taxonomy and select
appropriate software tools.
Consider scalability of the taxonomy. Look at the big picture
and see how the taxonomy will be able to hook into others.
Consider utilizing numerical taxonomy as a schema in the
metadata in order to merge documents in foreign languages.
Accommodate new standards whenever possible.
Document “Best Practices” while creating the taxonomy and
review them regularly.
Maintain and update the taxonomy continually.
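The recommendation about numerical taxonomies can be illustrated with a short sketch: if documents in different languages carry the same numeric code in their metadata, they can be merged on the code and labeled per language afterward. The document records, the French label, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical label table: one numeric code, one label per language.
LABELS = {
    "5112": {"en": "Software publishers", "fr": "Éditeurs de logiciels"},
}

# Hypothetical documents whose metadata carries a numeric classification code.
DOCUMENTS = [
    {"id": "doc-en-1", "lang": "en", "code": "5112"},
    {"id": "doc-fr-1", "lang": "fr", "code": "5112"},
    {"id": "doc-en-2", "lang": "en", "code": "511"},
]


def merge_by_code(documents):
    """Group document IDs by numeric code, regardless of document language."""
    groups = defaultdict(list)
    for doc in documents:
        groups[doc["code"]].append(doc["id"])
    return dict(groups)


def label(code, lang, fallback="en"):
    """Return the label for a code in the requested language, else a fallback."""
    names = LABELS.get(code, {})
    return names.get(lang) or names.get(fallback) or code


if __name__ == "__main__":
    for code, ids in merge_by_code(DOCUMENTS).items():
        print(code, label(code, "fr"), ids)
```

The code, not the wording, acts as the join key, so adding a language means adding labels rather than reclassifying documents.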
44
Diagram: how taxonomies and schemas relate
Meta Model (describes how taxonomies are created)
Existing Taxonomy in your Field
Your Agency Taxonomy
Related Taxonomy of other agency in same field
Related Taxonomy of other agency hooked to one above
Core Schema (describes how document is to be created)
Electronic Document in XML
45
Efficient Web information
retrieval systems
in the form of search engines
or Web portals
require continued support and
improvement of:
46
Web-based classification and
numerical taxonomic tools to use in
Web-based cataloging tools such as
CORC, which provides metadata
based on
Taxonomies such as controlled
vocabularies/thesauri which will be
hooked together using
Metathesauri and standard
information exchange systems such
as MARC-XML
47
And this is the house that
Jack built…
48
Knowledge Organization:
Library Tools and
Taxonomies for the Web
Jan Herd jher@loc.gov
Business Reference Services
Science, Technology & Business Division
The Library of Congress
49