Information Storage, Retrieval, Indexing
DATABASE MANAGEMENT I
CSC 226
▪ The first Information Retrieval Systems originated with the need to organize information in central repositories
(e.g., libraries). Catalogues were created to facilitate the identification and retrieval of items.
Dr. Akputu Oryina Kingsley
ISRS
Genesis of ISRS
▪ IRS gained popularity in the research community only in the early 1960s,
when computers were being introduced into information handling and
management.
▪ These early information retrieval systems were essentially document retrieval
systems: they were designed to retrieve bibliographic information about documents
stored in databases in response to a user's search request.
▪ Though the basics of IRS remain the same, the application of modern advanced
techniques has greatly widened the role and scope of IRS.
Genesis of ISRS (cont’d)
▪ The connotation of information retrieval has therefore changed, and information
professionals and researchers have given it various names, such as:
Genesis of ISRS (cont’d)
Modern IRS
▪ The modern connotation implies that IRS now deals not only with
textual information but also with multimedia information comprising text,
audio, images and video.
▪ While many features of conventional text retrieval systems are equally applicable
to multimedia information retrieval, the specific nature of audio, image and video
information has called for the development of many new tools and techniques
for information retrieval.
CHARACTERISTICS OF ISR
1. Information Facilitator
The ISAR system should act as a facilitator between the information (contained
in documents) and the users.
▪ If a user approaches with a subject term, the name of a contributor, the title of
a document and so on, the system should help them obtain the
desired information.
▪ The result could be the exact information or a reference to a document
that contains the information.
2. Non-Ambiguous
The system should be organized so that search results are free from ambiguity.
This requires identifying terms, setting their context and indexing them properly.
3. User friendliness:
▪ Ease of use is an important consideration for any ISAR system.
▪ The search interface should make it easy to frame searches such as:
✓ Keyword search
✓ Author and title search
✓ Combination search (using Boolean operators)
✓ Proximity search, etc.
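The keyword, Boolean and proximity search types listed above can be illustrated with a small positional inverted index. This is only a minimal sketch: the three-document collection `DOCS`, the function names and the `max_gap` parameter are assumptions made for illustration and are not part of the course material. Author and title search would work the same way over separate author and title fields.

```python
from collections import defaultdict

# Hypothetical mini-collection; document IDs and texts are illustrative only.
DOCS = {
    1: "information retrieval systems organize library catalogues",
    2: "database management systems store structured records",
    3: "multimedia information retrieval covers audio image and video",
}

def build_index(docs):
    """Map each term to {doc_id: [word positions]} (a positional inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index

def keyword_search(index, term):
    """Return the set of documents containing the term."""
    return set(index.get(term.lower(), {}))

def boolean_and(index, a, b):
    """Documents containing both a and b."""
    return keyword_search(index, a) & keyword_search(index, b)

def boolean_or(index, a, b):
    """Documents containing a or b (or both)."""
    return keyword_search(index, a) | keyword_search(index, b)

def boolean_not(index, a, b):
    """Documents containing a but not b."""
    return keyword_search(index, a) - keyword_search(index, b)

def proximity_search(index, a, b, max_gap=1):
    """Documents where a and b occur within max_gap word positions of each other."""
    hits = set()
    for doc_id in boolean_and(index, a, b):
        pos_a = index[a.lower()][doc_id]
        pos_b = index[b.lower()][doc_id]
        if any(abs(i - j) <= max_gap for i in pos_a for j in pos_b):
            hits.add(doc_id)
    return hits

INDEX = build_index(DOCS)
print(keyword_search(INDEX, "retrieval"))                   # {1, 3}
print(boolean_and(INDEX, "information", "multimedia"))      # {3}
print(boolean_not(INDEX, "systems", "database"))            # {1}
print(proximity_search(INDEX, "information", "retrieval"))  # {1, 3}
```

Storing word positions in the index is what makes proximity search possible; a plain term-to-document index would be enough for keyword and Boolean search alone.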
CHARACTERISTICS OF ISR (Cont’d)
Systems should be made as readily usable as possible for their clientele.
Objective of ISRS
The general objective of an Information Retrieval System is to minimize the overhead of a
user locating needed information.
Overhead can be expressed as the TIME a user spends in all of the steps leading to
reading an item containing the needed information (e.g., query generation, query
execution, scanning the query results to select items to read, reading non-relevant items).
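Read this way, overhead is simply the sum of the time spent in each step. The sketch below adds up per-step timings for one search session; the `session` dictionary and its numbers are hypothetical values chosen only to illustrate the arithmetic.

```python
# Hypothetical timings (in seconds) for one search session. The step names
# follow the definition above; the numbers are made up for illustration.
session = {
    "query generation": 40.0,
    "query execution": 2.0,
    "scanning results": 60.0,
    "reading non-relevant items": 90.0,
}

def overhead(step_times):
    """Total time spent before the user reaches an item with the needed information."""
    return sum(step_times.values())

print(overhead(session))  # 192.0
```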
How, then, can ISRS performance (overhead, in this case) be benchmarked?
ISRS Performance Measure
The two major measures commonly associated with information systems
are precision and recall.
Precision = TP / (TP + FP)
TP = True positive, FP = False positive
Precision should ideally be 1 (high) for a good classifier or retrieval system. Precision becomes
1 only when the numerator and denominator are equal, i.e., TP = TP + FP, which also means FP is
zero. As FP increases, the denominator becomes greater than the numerator and
the precision value decreases.
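As a small worked example, precision can be computed as TP / (TP + FP) from the set of retrieved items and the set of relevant items. This is a sketch under stated assumptions: the document IDs are hypothetical, and returning 0 when nothing is retrieved is a convention chosen here, not something the slides specify.

```python
def precision(retrieved, relevant):
    """TP / (TP + FP), where TP + FP is everything the system retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved:
        return 0.0  # assumption: treat precision as 0 when nothing is retrieved
    tp = len(retrieved & relevant)  # retrieved and relevant
    fp = len(retrieved - relevant)  # retrieved but not relevant
    return tp / (tp + fp)

relevant_docs = {"d1", "d2", "d3"}  # hypothetical relevance judgements
print(precision({"d1", "d2"}, relevant_docs))              # 1.0 (FP = 0)
print(precision({"d1", "d2", "d7", "d8"}, relevant_docs))  # 0.5 (FP = 2)
```

The second call shows the effect described above: TP is unchanged, but two false positives enlarge the denominator and precision falls from 1 to 0.5.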
ISRS Performance Measure (cont’d)
Recall = TP / (TP + FN)
TP = True positive, FN = False negative
Recall should ideally be 1 (high) for a good classifier or retrieval system. Recall becomes 1 only
when the numerator and denominator are equal, i.e., TP = TP + FN, which also means FN is zero. As
FN increases, the denominator becomes greater than the numerator and the recall value
decreases.
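Recall can be worked through in the same way, as TP / (TP + FN) computed from the retrieved and relevant sets. Again this is only a sketch: the document IDs are hypothetical, and returning 0 when there are no relevant items is an assumed convention.

```python
def recall(retrieved, relevant):
    """TP / (TP + FN), where TP + FN is everything that is actually relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not relevant:
        return 0.0  # assumption: treat recall as 0 when nothing is relevant
    tp = len(retrieved & relevant)  # relevant and retrieved
    fn = len(relevant - retrieved)  # relevant but missed
    return tp / (tp + fn)

relevant_docs = {"d1", "d2", "d3", "d4"}  # hypothetical relevance judgements
print(recall({"d1", "d2", "d3", "d4", "d9"}, relevant_docs))  # 1.0 (FN = 0)
print(recall({"d1"}, relevant_docs))                          # 0.25 (FN = 3)
```

In the first call the extra item `d9` does not affect recall, because false positives do not appear in the recall formula; only missed relevant items (FN) lower it.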