Information Retrieval

Dr. Maheswari Karthikeyan

BITS Pilani
Pilani Campus
Course Outline
• To acquire a basic understanding of the components and
the different IR methods:
• Boolean
• Vector Space

• To understand the various application areas of IR:

• Text Mining
• Web Search
• Cross Lingual IR
• Multimedia IR

Lecture Outline

– Information vs. Data Retrieval
– IR task

Information Retrieval
• To efficiently retrieve documents relevant to an
information need from a large document set
[Figure: information need and collection → IR system → answer list]

Information Retrieval
• Search
• Filtering
• Organization
• Multiple languages
• Multiple media

Types of Information Needs

• Retrospective
• “Searching the past”
• Different queries posed against a static collection
• Time invariant
• Prospective
• “Searching the future”
• Static query posed against a dynamic collection
• Time dependent
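The two kinds of information need can be contrasted in a minimal Python sketch. The names `matches`, `search`, and `filter_stream` are hypothetical, and matching is simplified to "every query term appears in the document", which is an assumption made only for illustration.

```python
def matches(query_terms, document):
    """Return True if every query term occurs as a word in the document."""
    words = set(document.lower().split())
    return all(term in words for term in query_terms)


def search(collection, query_terms):
    """Retrospective: a new query is run against a fixed collection."""
    return [doc for doc in collection if matches(query_terms, doc)]


def filter_stream(standing_query, stream):
    """Prospective: a fixed query is applied to documents as they arrive."""
    for doc in stream:
        if matches(standing_query, doc):
            yield doc


# Retrospective use: the collection stays the same, the queries change.
archive = ["cricket world cup final", "election results announced"]
print(search(archive, ["cricket"]))
print(search(archive, ["election"]))

# Prospective use: the query stays the same, the documents keep arriving.
incoming = iter(["budget speech today", "cricket squad announced"])
print(list(filter_stream(["cricket"], incoming)))
```

Retrospective search corresponds to the usual ad hoc query, while the prospective case corresponds to filtering.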

IR Task

• Input:
• A corpus of textual natural-language documents.
• A user query in the form of a textual string.
• Output:
• A ranked set of documents that are
relevant to the query
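The input and output described above can be sketched in Python. This is only an illustration: the function names `tokenize` and `rank` are hypothetical, and scoring by raw query-term counts is an assumed stand-in for whatever ranking model a real IR system uses.

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def rank(corpus, query):
    """Return ids of matching documents, best match first.

    The score is the number of times the query terms occur in a document.
    Documents with a score of zero are left out of the result.
    """
    query_terms = tokenize(query)
    scored = []
    for doc_id, text in enumerate(corpus):
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, doc_id))
    # Sort by descending score, breaking ties by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


corpus = [
    "information retrieval finds relevant documents",
    "database systems store structured data",
    "retrieval of information from the web is information retrieval",
]
print(rank(corpus, "information retrieval"))  # [2, 0]
```

Document 2 is ranked first because the query terms occur four times in it, against twice in document 0. Document 1 contains neither term and is not returned.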

IR Task


[Figure: query string → IR system]

• Relevance is a subjective judgment and may include:
• Being on the proper subject.
• Being timely (recent information).
• Being authoritative (from a trusted source).
• Satisfying the goals of the user and intended
use of the information (information need).

Intelligent IR
• Meaning of the words used
• Order of words in the query
• Direct or indirect feedback
• Authority of the source

• IR: representation, storage, organization of, and
access to information items
• Focus on the user information need
• Emphasis is on the retrieval of information (not data)

IR vs. Data Retrieval
• Data retrieval
• Which documents contain a set of keywords?
• Well defined structure and semantics
• A single erroneous object implies failure
• Provides a solution to the user of a database system
• Information retrieval
• Information about a subject or topic
• Semantics is frequently loose
• Small errors are tolerated
• Deals with natural language text
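The contrast between the two can be sketched in Python. Both functions are hypothetical illustrations: `data_retrieval` assumes exact matching on every keyword, and `information_retrieval` assumes a simple score, the fraction of query terms a document contains.

```python
def data_retrieval(records, keywords):
    """Exact match: return only the records that contain every keyword."""
    return [
        record
        for record in records
        if all(keyword in record.lower().split() for keyword in keywords)
    ]


def information_retrieval(documents, query):
    """Partial match: rank documents by the fraction of query terms present."""
    terms = query.lower().split()
    scored = []
    for doc in documents:
        words = set(doc.lower().split())
        overlap = sum(1 for term in terms if term in words) / len(terms)
        if overlap > 0:
            scored.append((overlap, doc))
    # The sort is stable, so equally scored documents keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


docs = ["jaguar speed on land", "jaguar car top speed", "history of the car"]
print(data_retrieval(docs, ["jaguar", "car"]))
print(information_retrieval(docs, "jaguar car"))
```

The exact-match function returns only the one document containing both keywords. The ranked version puts that document first but also returns the two partial matches, which reflects the tolerance for small errors noted above.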

IR vs. Data Retrieval
            Data retrieval                          IR
Data        Structured                              Unstructured
Fields      Clear semantics (SSN, age)              No fields (other than text)
Queries     Defined (relational algebra, SQL)       Free text (“natural language”), Boolean
Matching    Exact (results are always “correct”)    Imprecise (need to measure effectiveness)
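Because IR matching is imprecise, its effectiveness has to be measured. The slides do not name a measure; precision and recall are two standard ones, sketched below in Python under that assumption. The function name `precision_recall` and the document ids are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# The system returned documents 1 to 4; documents 2, 4 and 5 are relevant.
precision, recall = precision_recall([1, 2, 3, 4], [2, 4, 5])
print(precision)  # 0.5
print(round(recall, 3))  # 0.667
```

Two of the four retrieved documents are relevant, so precision is 0.5. Two of the three relevant documents were found, so recall is about 0.667.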

IR system

• Interpret contents of information items

• Generate a ranking which reflects the degree of relevance
• Notion of relevance is most important
• A few non-relevant documents may also be retrieved

Thank you!
