Abstracting, Indexing and Thesaurus Construction: Lecture 2


CIS 3301
INDEXING LANGUAGES

 An indexing language is a set of codes and their related words and phrases
used for representing the content of documents as well as users' queries.
 An indexing language (IL) is an artificial language adapted to the requirements
of indexing.
 The function of an IL is to do whatever a natural language (NL) does and, in
addition, to organize intellectual content through a different expression,
providing a point of access to seekers of information.
STRUCTURE OF INDEXING LANGUAGE

 Information systems are concerned with the communication of information
about documents to the potential users of those documents. The means of
communication is an indexing language.
 An indexing language is not just a listing of index terms or descriptors
acceptable to information users; it also covers the techniques for
structuring and using those terms.
 Like a natural language, an indexing language consists of three elements:
(a) controlled vocabulary, (b) syntax, and (c) semantics.
INDEXING LANGUAGE VOCABULARIES

 Controlled indexing vocabulary


 The term “controlled vocabulary” refers to a limited set of terms that must be
used to index documents, and to search for these documents in a library or a
particular system.
 It may be defined as a list of terms showing their relationships and indicating
ways in which they may usefully be combined to represent the specific subject of a
document.
 A certain degree of structure is introduced in a controlled vocabulary so that
terms whose meanings are related are brought together or linked in some way.

 The vocabulary of an IL is either verbal or coded. For example, subject heading
lists and thesauri use verbal controlled vocabulary.
 A classification scheme employs coded vocabulary in the form of its notation.
 For example, in the Colon Classification (CC) schedule ‘Indian History’ is marked
as V44. In the Sears List of Subject Headings, which employs verbal vocabulary,
it is indicated as: India - History.
 The vocabulary in the indexing language is controlled for standardization of
terms.

 One concept should be indicated by only one term. This is done by
controlling synonyms, near-synonyms and word forms.
 Terms accepted as standard terms are to be linked with their respective
alternative terms, for instance: HIV “See” or “Use” AIDS.
How are indexing language vocabularies controlled?
 The first step is controlling synonyms. This is achieved simply by choosing
one of the possible alternatives as the “preferred term” and referring to it
from the alternative terms with a “See” or “Use” reference, e.g. car See vehicle.
 The second step is the use of “See also” references among terms that are most
closely related semantically, e.g. HIV See also AIDS.
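
The two kinds of reference can be sketched as simple lookup tables. This is a minimal illustration only: the table names `SEE` and `SEE_ALSO`, the helper functions, and the extra entry "automobile" are assumptions made for the example, not part of any particular subject heading list.

```python
# Hypothetical "See"/"Use" references: non-preferred term -> preferred term.
SEE = {
    "car": "vehicle",
    "automobile": "vehicle",  # assumed extra synonym, for illustration only
}

# Hypothetical "See also" references between semantically related terms.
SEE_ALSO = {
    "HIV": ["AIDS"],
}


def preferred(term):
    """Follow a See/Use reference if one exists; otherwise keep the term."""
    return SEE.get(term, term)


def see_also(term):
    """Return the related terms recorded for the preferred form of `term`."""
    return SEE_ALSO.get(preferred(term), [])


print(preferred("car"))   # vehicle
print(see_also("HIV"))    # ['AIDS']
```

A "See" reference replaces the term the user started with, whereas a "See also" reference only suggests additional headings to consult.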

 Syntax
 Syntax, in simple terms, means ‘putting things together in an orderly
manner’.
 In the context of an indexing language, syntax refers to a set of rules or
grammar which governs the sequence of words in a subject heading, or of
notations in a classification number.
 When a number of terms have to be used to represent a subject, syntax is
necessary to put the terms in the most helpful, predictable and searchable order.
We can say that the syntax of an indexing language provides the pattern of
relationships which we recognize between the terms used in the system.
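
One way to picture such a rule is a fixed order in which the parts of a heading are cited. The sketch below is an assumption-laden illustration: the facet names (`place`, `topic`, `period`), their order, and the " - " separator are chosen only so that the output matches the "India - History" pattern quoted earlier; real systems define their own rules.

```python
# Assumed citation order: which facet is written first, second, third.
CITATION_ORDER = ["place", "topic", "period"]


def build_heading(facets, separator=" - "):
    """Arrange the supplied facet terms in the fixed citation order."""
    parts = [facets[name] for name in CITATION_ORDER if name in facets]
    return separator.join(parts)


# The order in which the indexer supplies the terms does not matter;
# the syntax rule decides the final sequence.
print(build_heading({"topic": "History", "place": "India"}))  # India - History
```

Because every indexer applies the same order, a searcher can predict where a compound subject will file.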

 Semantics
 Semantics refers to the systematic study of how meaning is structured,
expressed and understood in the use of an indexing language.
 Various types of semantic relationships are evident in an indexing
language.
 These relationships include equivalence and hierarchy. The meaning of a
term can be derived from its place in the hierarchy.
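
As a rough sketch of how a hierarchy conveys meaning, the chain of broader terms above a term can be listed from the most general downwards. The sample terms and the names `BROADER`, `EQUIVALENT` and `hierarchy_path` are hypothetical, and the sketch assumes the broader-term links contain no cycles.

```python
# Hypothetical hierarchical links: term -> its immediate broader term.
BROADER = {
    "dogs": "mammals",
    "mammals": "animals",
}

# Hypothetical equivalence links: non-preferred term -> preferred term.
EQUIVALENT = {"canines": "dogs"}


def hierarchy_path(term):
    """Return the chain from the most general term down to the preferred term."""
    current = EQUIVALENT.get(term, term)  # resolve equivalence first
    path = [current]
    while current in BROADER:             # climb one broader term at a time
        current = BROADER[current]
        path.append(current)
    return list(reversed(path))


print(hierarchy_path("canines"))  # ['animals', 'mammals', 'dogs']
```

Reading the path from left to right shows the context in which the term is meant to be understood.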
SUBJECT AUTHORITY FILE

A subject authority file consists of subject authority records that ensure
uniformity and consistency in subject heading terminology and cross-references.
The subject authority file serves as the source of indexing vocabulary and as the
means of verifying headings assigned to individual indexing records. It helps to
ensure that:
a) the same heading is assigned to all works on the same subject,
b) each heading represents only that particular subject, and
c) all headings assigned to indexing records conform to the established forms.
TYPES OF INDEXING CONTROLLED AUTHORITY LIST

 Subject headings lists


 It is the vocabulary control master list of words/terms that can be assigned to
documents.
 A subject heading list is alphabetically arranged, with appropriate cross references
and notes.
 A list of other semantically related terms or phrases is displayed under each
term or phrase, introduced by ‘see also’ (or RT, for related term).
 Subject headings are to be chosen keeping in mind the needs of the users who
are likely to use the index file.

 Thesauri

 A thesaurus is a synonym dictionary used for finding synonyms. Thesauri are often
used by indexers to help find the terms that express an idea.
 A thesaurus (plural: thesauri) is also meant for information retrieval and is
used as a valuable vocabulary control device for indexing and searching in a
specific subject area.
 Thesauri provide lists of terms, often indicating structural relationships
between these terms.
 A thesaurus is used in translating terms and phrases from the natural language
of documents into a more controlled ‘system language’.
TYPES OF INDEXING LANGUAGE

 An index serves as a link between the information contained in a document
and the potential users. The language which serves as the medium of
communication between the two is the indexing language.
 There is always a need for an artificial language to be used by the indexer
and the searcher to describe a document, since the terms or concepts
identified in a book are represented by words or phrases. The two types of
indexing language used are:
 Natural Indexing Language
 The indexer uses the exact words and phrases used by the author of the document.
 In natural language indexing, any term that appears in the title, abstract or text of
a document record may be used as an index term.
 There is no mechanism to control the use of terms for this indexing. Similarly, the
searcher is not expected to use any controlled list of terms.
 Natural indexing language is used mainly in back-of-book indexes and
computerized indexes such as Keyword in Context (KWIC) and Keyword out of
Context (KWOC) indexes.
 The purpose of this type of language is to ensure that the indexer and the
searcher operate at the same level by using the same language.
 This is very easy for the indexer and the searcher to use, but the major problem
is that there is no control over synonyms, meanings (semantics), or singular and
plural forms. This type of indexing tends to scatter documents on the same subject
where the authors have used different terms.
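
To make the computerized case concrete, here is a minimal sketch of a keyword-in-context style index built from a title. It is a simplified rotated-title variant, not a full KWIC layout with an aligned keyword column, and the stop list, the "/" end-of-title marker and the sample title are assumptions made for the example.

```python
# Assumed stop list: words too common to serve as index entries.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def kwic(title):
    """Return one rotated entry per significant word, sorted by keyword."""
    words = title.split()
    entries = []
    for i, word in enumerate(words):
        if word.lower() in STOP_WORDS:
            continue
        # Rotate the title so the keyword leads; "/" marks the original end.
        rotated = words[i:] + (["/"] + words[:i] if i else [])
        entries.append((word.lower(), " ".join(rotated)))
    return [entry for _, entry in sorted(entries)]


for line in kwic("Indexing languages for libraries"):
    print(line)
# Indexing languages for libraries
# languages for libraries / Indexing
# libraries / Indexing languages for
```

Every entry uses the author's own words, which is why a title that says "automobiles" and one that says "cars" end up in different places in such an index.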
 Controlled Indexing Language
 The terms used to represent subjects are assigned to particular
documents by an indexer, who controls which terms may be used.
 The indexer exercises this control by assigning only terms that have been listed as
possible index terms.
 There is generally a prepared standard list of terms to be used for a particular system.
 When an indexer has identified terms that represent the document, he/she consults this
standard list to ensure that the terms used are consistent. This list is called a subject
authority list.
 There are two types of subject authority list. Two common examples are the subject
headings list and the thesaurus.
 The first type is the alphabetical controlled list, in which the terms are arranged
alphabetically.
 The second type is the classification scheme, which assigns notation to
subject terms.
 The searcher is expected to consult the same controlled list during
formulation of a search strategy.
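
The practical difference between the two types of indexing language can be sketched by building the same small index twice, once with the authors' own words and once through a subject authority list. The documents, their keywords and the authority list below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical documents with the words their authors happened to use.
DOCUMENTS = {
    "doc1": ["car"],
    "doc2": ["automobile"],
    "doc3": ["cars"],
}

# Hypothetical subject authority list: author's word -> authorized term.
AUTHORITY_LIST = {"car": "vehicle", "cars": "vehicle", "automobile": "vehicle"}


def build_index(documents, authority=None):
    """Map each index term to the sorted list of documents filed under it."""
    index = defaultdict(list)
    for doc_id, words in documents.items():
        for word in words:
            # With an authority list, the authorized term replaces the word.
            term = authority.get(word, word) if authority else word
            if doc_id not in index[term]:
                index[term].append(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


natural = build_index(DOCUMENTS)
controlled = build_index(DOCUMENTS, AUTHORITY_LIST)
print(natural)     # {'car': ['doc1'], 'automobile': ['doc2'], 'cars': ['doc3']}
print(controlled)  # {'vehicle': ['doc1', 'doc2', 'doc3']}
```

The natural-language index scatters the three documents under three entries, while the controlled index brings them together under one heading, provided the searcher consults the same list.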
FUNCTIONS OF AN INDEX

Indexing of all information resources, whether in print or electronic form, must
fulfil certain functions if the resulting indexes are to retrieve or find a
particular name, term or passage in a text that the user has either read before, or
that is presumed to contain the desired information.
 The functions of indexes are to:
 provide users with an efficient and systematic means for locating documents
or parts of documents which address their information needs or requests
 identify and locate potentially relevant information in the document or
collection

 analyze concepts covered in a document so as to produce a suitable index
 use headings based on their terminology
 indicate relationships among topics in the information resource
 group together information on topics scattered by the arrangement of the
document or collection
 direct users seeking information under terms not chosen as index
headings to terms that have been chosen, by means of see references

 suggest to users related topics to look up, by means of
see also references
 arrange entries into a systematic and helpful order
 serve as a guide to what literature exists in a given field or by a given
author
