Information Storage and Retrieval - Professional Practice
Submitted to – Submitted by –
By nature man is social, and society is a web of social relationships. Being a social animal, man
wants to communicate, and among all animals only man is endowed with the gift of speech. If speech
was the first step forward in the development of human communication, the second great milestone
was the invention of writing.
In the beginning, man developed methods of recording his experiences on
clay tablets, wax tablets, papyrus sheets, parchment rolls, codices, etc. The third great leap forward came
with the invention of printing, by means of which what was written could be reproduced and distributed
in quantity, thus disseminating information and learning among ever-widening circles of the
community. A bewildering amount of progress has accompanied the development of electronic systems
of communication: the telegraph, the telephone and especially radio, television and satellites.
More recently, the electronic computer and telecommunication technologies have opened up many other
revolutionary possibilities.
Introduction:
An information retrieval system is developed to help users discover relevant information
in a storehouse containing a collection of documents.
The idea of information retrieval assumes that there exist documents or records containing data
that have been arranged in a suitable order for easy retrieval. The storehouse typically contains
bibliographic information, which is quite different from other kinds of information or data. In such
cases the retrieval system is designed to search for and retrieve specific facts or data.
The main objective of a database is to enable the user to search for specific records that match
one or more conditions or search criteria, for example, details of a certain recipe
containing a particular ingredient, or details of a specific product within a specific range of market
price. The main purpose of designing an information retrieval system is to meet the user's
requirements: it retrieves documents in order to answer the users' queries.
Uses:
-Regulatory Compliance-
A well-organized information storage and retrieval system that follows compliance regulations and tax
record-keeping guidelines significantly increases a business owner’s confidence that the business is fully
compliant.
A good information storage and retrieval system, including an effective indexing system, not only decreases
the chances that information will be misfiled but also speeds up the storing and retrieval of information. The
resulting time savings increase office efficiency and productivity.
-Environment-
A well-organized information storage and retrieval system improves the working environment. It helps an
office maintain a healthy working environment and avoid stressful situations.
Electronic vs. Manual Systems-
Although a very small business may choose to institute a manual system, the importance of electronic
information storage and retrieval systems lies in the fact that they reduce storage space
requirements and decrease equipment and labor costs. In contrast, a manual system requires budgetary
allotments for storage space, filing equipment and administrative expenses to maintain an organized
filing system. Additionally, with an electronic system it can be significantly easier to provide and monitor
internal controls designed to deter fraud, waste and abuse, and to ensure that the business complies with
information privacy requirements.
Information Storage -
Organizations process data to derive the information required for their day-to-day operations. Storage is
a repository that enables users to persistently store and retrieve this digital data.
Types of Data
Structured data: Organized in rows and columns in a rigidly defined format so that applications can retrieve
and process it efficiently. It is stored using a database management system (DBMS).
Unstructured data: Elements cannot be stored in rows and columns, which makes the data difficult for
applications to query and retrieve. A vast majority of new data being created today is unstructured.
NOTE -Data, whether structured or unstructured, does not fulfill any purpose for individuals or
businesses unless it is presented in a meaningful form. Information is the intelligence and knowledge
derived from data.
Storage-
• Data created by individuals or businesses must be stored so that it is easily accessible for further
processing.
• In a computing environment, devices designed for storing data are termed storage devices or
simply storage. Examples:
i. Individuals: Digital camera, Cell phone, DVDs, Hard disks
ii. Businesses: Hard Disks, External Disk Arrays, Tape Library
iii. Centralized: Mainframe Computers
iv. Decentralized: Client-Server Model (Data spread across many servers)
v. Centralized: Storage Networking
Architecture-
Historically, organizations had centralized computers (mainframes) and information storage devices
(tape reels and disk packs) in their data center-
1. Server-centric storage architecture - The storage was typically internal to the server and could not be
shared with any other servers.
Managing a data center involves many tasks. The key management activities include the following:
Virtualization and cloud computing have dramatically changed the way data center infrastructure
resources are provisioned and managed. Continuous cost pressure on IT and on-demand data processing
requirements have resulted in the adoption of cloud computing.
Information Retrieval
An information retrieval system is developed to help users discover relevant information
in a storehouse containing a collection of documents. Information retrieval is the activity of
obtaining information resources relevant to an information need from a collection of information
resources.
An information retrieval process begins when a user enters a query into the system. Queries are formal
statements of information needs. User queries are matched against the database information. Most IR
systems compute a numeric score on how well each object in the database matches the query, and rank
the objects according to this value.
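The matching and ranking step described above can be sketched as follows. This is only a minimal illustration, assuming a simple term-count score; the function names (tokenize, score, rank) and the sample documents are hypothetical and are not taken from any particular IR system.

```python
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into words on whitespace."""
    return text.lower().split()

def score(query, document):
    """Count how many times the distinct query terms occur in the document."""
    doc_counts = Counter(tokenize(document))
    return sum(doc_counts[term] for term in set(tokenize(query)))

def rank(query, documents):
    """Return (score, document) pairs with a nonzero score, best first."""
    scored = [(score(query, doc), doc) for doc in documents]
    matches = [pair for pair in scored if pair[0] > 0]
    return sorted(matches, key=lambda pair: pair[0], reverse=True)

# Hypothetical sample collection
documents = [
    "library catalog search",
    "information retrieval and information storage",
    "storage devices for business data",
]
results = rank("information storage", documents)
for value, doc in results:
    print(value, doc)
```

Real systems usually use more refined scores than a raw term count, but the flow is the same: score each object against the query, then sort by the score.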
Major Components of IR
• Information retrieval can be divided into several major components, which include:
For example, if we maintain a database of information about an institution, all we have are the different
types of records and related facts, such as the names of students, faculty and staff, their positions,
qualifications and so on.
iv. Interface
The interface largely determines whether or not an information retrieval system is considered user friendly.
The AND operator narrows down a search.
The OR operator broadens a search.
The NOT operator excludes unwanted data.
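The effect of the three Boolean operators can be illustrated with set operations. The sketch below assumes a small in-memory index that maps each word to the set of documents containing it; the build_index helper and the sample documents are hypothetical.

```python
def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

# Hypothetical sample collection
documents = {
    1: "digital library systems",
    2: "digital storage devices",
    3: "library catalog records",
}
index = build_index(documents)

digital = index.get("digital", set())
library = index.get("library", set())

and_result = digital & library  # AND narrows: both terms are required
or_result = digital | library   # OR broadens: either term is enough
not_result = digital - library  # NOT excludes: digital but not library

print(sorted(and_result), sorted(or_result), sorted(not_result))
```

Here AND returns only document 1, OR returns all three documents, and "digital NOT library" returns only document 2.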
Basic Retrieval Techniques
Case Sensitivity Searching - Text sometimes exhibits case sensitivity; that is, words can differ in
meaning depending on the use of uppercase and lowercase letters. Words with capital letters do not
always have the same meaning when written with lowercase letters.
For example, Bill is the first name of former U.S. president William Clinton, who could sign a bill.
The opposite of “case-sensitive” is “case-insensitive”.
For example, Google searches are generally case-insensitive, whereas passwords (such as a Gmail password)
are case-sensitive.
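The difference between the two modes can be shown with a short sketch. It assumes a whole-word match and a hypothetical search function with a case_sensitive switch; the two sample sentences are made up for illustration.

```python
def search(term, documents, case_sensitive=False):
    """Return the documents that contain term as a whole word."""
    needle = term if case_sensitive else term.lower()
    hits = []
    for doc in documents:
        # Drop full stops so that "document." matches "document"
        words = doc.replace(".", "").split()
        if not case_sensitive:
            words = [word.lower() for word in words]
        if needle in words:
            hits.append(doc)
    return hits

documents = ["Bill signed the document.", "The bill was passed."]
sensitive_hits = search("Bill", documents, case_sensitive=True)
insensitive_hits = search("Bill", documents, case_sensitive=False)
print(sensitive_hits)
print(insensitive_hits)
```

The case-sensitive search finds only the sentence about the person, while the case-insensitive search returns both sentences.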
Truncation - Truncation allows a search to be conducted for all the different forms of a word that share a
common root.
• Symbols such as the question mark (?), asterisk (*) and pound sign (#) are used for truncation.
• Several truncation options are available: left truncation, right truncation and middle truncation.
• Left truncation retrieves all the words that share the same characters at the right-hand end; for example,
*hyl retrieves words such as “methyl” and “ethyl”.
• Right truncation retrieves all the words that share the same beginning; for example, the query Network*
retrieves documents on networks and networking.
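Truncation can be tried out with Python's standard fnmatch module, which treats the asterisk as "any run of characters". This is a sketch only: the truncation_search function and the word list are hypothetical, and the middle-truncation pattern wom*n is an added illustration that does not come from the text above.

```python
from fnmatch import fnmatchcase

def truncation_search(pattern, vocabulary):
    """Return the words matching a pattern in which * stands for any characters."""
    return [word for word in vocabulary
            if fnmatchcase(word.lower(), pattern.lower())]

# Hypothetical vocabulary of indexed terms
vocabulary = ["methyl", "ethyl", "ethanol", "network",
              "networks", "networking", "woman", "women"]

left = truncation_search("*hyl", vocabulary)       # left truncation
right = truncation_search("Network*", vocabulary)  # right truncation
middle = truncation_search("wom*n", vocabulary)    # middle truncation

print(left)
print(right)
print(middle)
```

Note that the right-truncated query also matches the bare root "network", because the asterisk can stand for zero characters.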
Proximity Searching - A proximity search allows you to specify how close two (or more) words must
be to each other in order to register a match.
There are three types of proximity searches:
• Word proximity
• Sentence proximity
• Paragraph proximity
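Word proximity can be sketched as a check on the distance between word positions. The within_words function and its max_distance parameter are hypothetical names chosen for this illustration; sentence and paragraph proximity would apply the same idea to sentences or paragraphs instead of single words.

```python
def within_words(text, term_a, term_b, max_distance):
    """Return True if term_a and term_b occur at most max_distance words apart."""
    words = [word.strip(".,;:!?").lower() for word in text.split()]
    positions_a = [i for i, word in enumerate(words) if word == term_a.lower()]
    positions_b = [i for i, word in enumerate(words) if word == term_b.lower()]
    return any(abs(a - b) <= max_distance
               for a in positions_a for b in positions_b)

text = "Information storage and retrieval systems support library users."

near = within_words(text, "information", "retrieval", 3)  # positions 0 and 3
far = within_words(text, "information", "library", 3)     # positions 0 and 6
print(near, far)
```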
Range Searching - It is most useful with numerical information. The following options are
usually available for range searching:
• greater than (>)
• less than (<)
• equal to (=)
• not equal to (/= or <>)
• greater than or equal to (>=)
• less than or equal to (<=)
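The comparison options listed above map directly onto the functions in Python's standard operator module. The sketch below assumes records stored as dictionaries with a numeric field; the range_search function and the sample records are hypothetical.

```python
import operator

# Map each range-search symbol to the matching comparison function
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
    "/=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
}

def range_search(records, field, symbol, value):
    """Return the records whose field satisfies the comparison with value."""
    compare = OPERATORS[symbol]
    return [record for record in records if compare(record[field], value)]

# Hypothetical bibliographic records
records = [
    {"title": "A", "year": 1995},
    {"title": "B", "year": 2005},
    {"title": "C", "year": 2015},
]

recent = [r["title"] for r in range_search(records, "year", ">=", 2005)]
older = [r["title"] for r in range_search(records, "year", "<", 2005)]
others = [r["title"] for r in range_search(records, "year", "/=", 2005)]
print(recent, older, others)
```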
Fuzzy Searching - It is designed to find terms that are misspelled at the point of data entry or in the
query. For example, the term computer could be misspelled as compter, compiter, or comuter. Text produced
by Optical Character Recognition (OCR) or recovered from compressed text can also contain errors. Fuzzy
searching is designed to detect and correct spelling errors, including those that result from OCR and text
compression.
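One common way to implement fuzzy matching is edit distance, that is, the number of single-character insertions, deletions or substitutions needed to turn one word into another. The sketch below is an assumption about how such a search could work and is not the method of any specific system; edit_distance, fuzzy_search, max_edits and the vocabulary are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]

def fuzzy_search(term, vocabulary, max_edits=1):
    """Return the vocabulary words within max_edits edits of term."""
    return [word for word in vocabulary
            if edit_distance(term, word) <= max_edits]

# Hypothetical vocabulary of correctly spelled index terms
vocabulary = ["computer", "catalog", "library", "retrieval"]
for misspelling in ["compter", "compiter", "comuter"]:
    print(misspelling, fuzzy_search(misspelling, vocabulary))
```

Each of the three misspellings is one edit away from "computer", so all of them retrieve it. With a larger vocabulary a misspelling may be equally close to several words, and the system then has to rank or list the candidates.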
Query Expansion - Query expansion is a retrieval technique that allows the end user to improve
retrieval performance by revising search queries based on results already retrieved.
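A very simple form of query expansion adds the most frequent new words from the documents already retrieved to the original query. The sketch below only illustrates that idea under stated assumptions: the expand_query function, the small stopword list and the sample results are hypothetical, and real systems usually weight terms more carefully.

```python
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list)
STOPWORDS = {"a", "and", "for", "in", "of", "the"}

def expand_query(query, retrieved_docs, extra_terms=2):
    """Add the most frequent new words from the retrieved documents to the query."""
    query_terms = query.lower().split()
    counts = Counter(
        word
        for doc in retrieved_docs
        for word in doc.lower().split()
        if word not in STOPWORDS and word not in query_terms
    )
    added = [word for word, _ in counts.most_common(extra_terms)]
    return query_terms + added

# Hypothetical results already retrieved for the query "opac"
retrieved = [
    "opac library catalog",
    "online library catalog opac",
    "library opac records",
]
expanded = expand_query("opac", retrieved)
print(expanded)
```

The revised query can then be run again, and the user can repeat the step if the new results are still unsatisfactory.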
Information Retrieval Systems
1. Online Systems - Online information retrieval systems allow the user to search remotely located
databases with the help of computer and telecommunication technology.
2. CD-ROM Systems - CD-ROM systems are usually searched locally, so they work even when the systems are not
networked. Basic retrieval techniques are supported in CD-ROM systems, while advanced search facilities
are available only to a limited extent. Data stored on a compact disc (CD) can be read by any computer
operating system and any CD-ROM drive. Example: LISA
3. OPAC - Online public access catalogs (OPACs) are traditional catalogs implemented in a different
medium. The main features of OPACs are:
First, OPACs contain bibliographic information about library resources.
Second, OPACs can be considered an extension of MARC records.
Third, OPACs support at least field searching, keyword searching and Boolean searching.
4. Web Information Retrieval Systems - These deal with text as well as multimedia information resources
that are linked with other documents, and there is no specific target user community as such.
Basically, the web is a platform where anyone from anywhere can publish virtually any information, in any
language and in any format. Examples: Google, AltaVista
Thank You