
Shri Ramdeobaba College of Engineering & Management, Nagpur

Department of Computer Science & Engineering (AIML)


Session 2023-2024

SEMINAR ON:

Information Retrieval

Presented By:
Mr. Jasmit Singh Saggu - 40
Mr. Pratyush Shendre - 45
Mr. Parthesh Patel - 44
Mr. Sahil Khune - 52
Mr. Kushagra Selokar - 70

Guided By:
Dr. Amit Pimpalkar
Content
1) Introduction
2) Why IR?
3) Types of IR Model
4) Design Feature
5) Components of IR model
6) Advantages and disadvantages of IR
7) Summary
Introduction
❏ Information retrieval (IR) covers the intellectual aspects of describing information and
specifying it for search, together with the systems, techniques, and machines used to
carry out the search.
❏ The main objective of IR is to give users effective access to, and interaction with,
information resources.
❏ An IR process begins when a user enters a query into the system. Queries are formal
statements of information needs.
❏ User queries are matched against the information in the database. Depending on the
application, the data objects may be text documents, images, audio, mind maps, or videos.
❏ Rather than answering a question directly, the system points users to relevant documents
by indicating that they exist and where to find them.
❏ IR is at the heart of the search used in systems such as DIALOG, LexisNexis, and others.

Why IR?

❏ Efficient Information Retrieval: IR models streamline the process of searching for information within large document repositories, enabling users to quickly locate relevant documents.

❏ Organizing and Structuring Data: These models help in organizing and structuring textual information, making it easier to access and retrieve specific data based on user queries.

❏ Optimizing User Experience: By providing efficient search capabilities, IR models enhance the user experience by saving time and effort in finding relevant information, thus increasing productivity and effectiveness.


Types of IR Model
Algebraic Model

Vector Space Model: Represents documents and queries as vectors of term weights and ranks documents by their similarity to the query.

Generalized Vector Space Model: Extends the basic VSM by dropping the assumption that term vectors are pairwise orthogonal.

Enhanced Topic-based Vector Space Model: Integrates topic information into the VSM.

Extended Boolean Model: Combines Boolean queries with VSM-style term weights so that partial matches can be ranked.

Latent Semantic Indexing: Analyzes relationships between terms and documents using singular value decomposition (SVD).
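To make the Vector Space Model concrete, here is a minimal sketch that assumes whitespace tokenization, raw term frequency, and a plain log(N/df) inverse document frequency. The function names and the three sample documents are hypothetical, and real systems typically use more refined weighting and normalization.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Build a sparse TF-IDF vector (term -> weight) for each tokenized document."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / df[term]) for term in df}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy collection.
docs = [
    "information retrieval ranks documents".split(),
    "boolean retrieval uses set operations".split(),
    "cats chase mice".split(),
]
vectors, idf = tf_idf_vectors(docs)

# The query is weighted with the same IDF values as the documents.
query = Counter("ranks documents".split())
q_vec = {t: tf * idf.get(t, 0.0) for t, tf in query.items()}

scores = [cosine(q_vec, d) for d in vectors]
best = max(range(len(docs)), key=scores.__getitem__)
print(best, scores)
```

Only the first document shares terms with the query, so it is the only one with a non-zero score.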
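Probabilistic Model

Binary Independence Model: Represents documents as binary term vectors and assumes that terms occur independently of one another.

Probabilistic Relevance Model: Estimates the probability that a document is relevant to a query.

Uncertain Inference: Addresses uncertainty in retrieval tasks.

Divergence-from-Randomness Model: Weights a term by how far its distribution in a document diverges from a random distribution.

Language Models: Model each document as a probability distribution over terms and rank documents by how likely they are to generate the query.

Latent Dirichlet Allocation (LDA): Infers latent topics from document collections.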
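As one illustration of the language-model approach, the sketch below scores documents by query likelihood under a unigram model. The Jelinek-Mercer smoothing, the weight `lam=0.5`, and the sample documents are assumptions chosen for the example, not something the slides prescribe.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, collection_counts, collection_len, lam=0.5):
    """Log P(query | doc) for a unigram model, mixing document and collection statistics."""
    doc_counts = Counter(doc)
    score = 0.0
    for term in query:
        p_doc = doc_counts[term] / len(doc)
        p_coll = collection_counts[term] / collection_len
        p = lam * p_doc + (1 - lam) * p_coll  # Jelinek-Mercer smoothing (assumed)
        if p == 0.0:
            return float("-inf")  # term never seen anywhere in the collection
        score += math.log(p)
    return score

# Hypothetical toy collection.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "stock markets fell sharply".split(),
]
collection_counts = Counter(t for d in docs for t in d)
collection_len = sum(len(d) for d in docs)

query = "dog cat".split()
scores = [query_log_likelihood(query, d, collection_counts, collection_len) for d in docs]
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Smoothing keeps a document from scoring zero just because it lacks one query term, which is why the first document still outranks the third.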
Set Theoretic Model

Standard Boolean Model: Represents documents as sets of terms and queries as Boolean expressions (AND, OR, NOT); a document either matches or does not.

Extended Boolean Model: Extends the standard Boolean model with term weights, allowing partial matches and ranking of results.

Fuzzy Retrieval: Addresses imprecision in queries and documents by assigning degrees of membership, allowing partial matches and ranking of results by relevance.
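The standard Boolean model maps directly onto set operations. The following sketch assumes whitespace tokenization and a hypothetical three-document collection; `term` is a helper introduced here for illustration.

```python
# Hypothetical toy collection: document id -> text.
docs = {
    1: "information retrieval with boolean queries",
    2: "fuzzy retrieval allows partial matches",
    3: "boolean logic and set theory",
}

# Map each term to the set of document ids that contain it.
postings = {}
for doc_id, text in docs.items():
    for token in text.split():
        postings.setdefault(token, set()).add(doc_id)

all_ids = set(docs)

def term(t):
    """Return the set of documents containing term t (empty if unseen)."""
    return postings.get(t, set())

# boolean AND retrieval: set intersection
and_result = term("boolean") & term("retrieval")
# boolean OR fuzzy: set union
or_result = term("boolean") | term("fuzzy")
# retrieval AND NOT boolean: intersect with the complement
not_result = term("retrieval") & (all_ids - term("boolean"))

print(and_result, or_result, not_result)
```

Every result is an unordered set, which shows the model's main limitation: documents either match or they do not, with no ranking among the matches.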


Properties of Model
Models without term interdependencies:

● Treat each term in a query or document as independent of the others, for example through orthogonal term vectors or a probabilistic independence assumption.

● Examples include the Standard Boolean Model, the Vector Space Model (VSM), and the Binary Independence Model (BIM).

Models with immanent term interdependencies:

● Represent dependencies between terms, with the degree of dependency defined by the model itself, typically derived from term co-occurrence across the document collection.

● Examples include the Generalized Vector Space Model and Latent Semantic Indexing (LSI).

Models with transcendent term interdependencies:

● Represent dependencies between terms but rely on an external source, such as a human expert or a separate algorithm, to define their degree.

● Examples include the Topic-based Vector Space Model.
Design Feature

Inverted Index:

● Primary data structure of IR systems.

● Maps each term to the list of documents that contain it, usually along with the term's frequency in each document.

● Storing position information as well makes it possible to search for phrases.
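A positional inverted index and a phrase lookup over it might look like the sketch below. The nested-dict layout and the function names are assumptions made for illustration; production systems use compressed postings lists instead.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map term -> {doc_id: [positions]} for tokenized documents."""
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for pos, token in enumerate(tokens):
            index[token].setdefault(doc_id, []).append(pos)
    return index

def phrase_search(index, phrase):
    """Return ids of documents in which the phrase terms occur consecutively."""
    if not phrase or any(t not in index for t in phrase):
        return set()
    first, rest = phrase[0], phrase[1:]
    hits = set()
    for doc_id, starts in index[first].items():
        for start in starts:
            # Each following term must appear at the next position in the same document.
            if all(start + k + 1 in index[t].get(doc_id, ()) for k, t in enumerate(rest)):
                hits.add(doc_id)
                break
    return hits

# Hypothetical toy collection: document id -> tokens.
docs = {
    1: "new york city guide".split(),
    2: "york is not new".split(),
    3: "a new guide to york".split(),
}
index = build_positional_index(docs)
print(index["new"])                           # postings with positions
print(phrase_search(index, ["new", "york"]))  # only document 1 has the phrase
```

The length of each position list is the term's frequency in that document, so the same structure serves both ranking and phrase queries.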

Stop List (Function Words):

● Lists words unlikely to be useful for searching.

● Examples: the, from, to, ...

● Excluding these words reduces the size of the inverted index.
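Stop-word filtering is usually applied to tokens before they reach the index. In this sketch the `STOP_WORDS` set is a small illustrative list rather than a standard one, and `index_terms` is a hypothetical helper.

```python
# Illustrative stop list only; real systems use longer, language-specific lists.
STOP_WORDS = {"the", "from", "to", "a", "of", "in"}

def index_terms(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are worth adding to the inverted index."""
    return [t for t in tokens if t.lower() not in stop_words]

tokens = "the train from Nagpur to Mumbai".split()
kept = index_terms(tokens)
print(kept)
```

The trade-off is that queries made up mostly of function words, such as the phrase "to be or not to be", can no longer be matched once those words are dropped.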


Design Feature (Cont.)

Stemming:

● A simplified form of morphological analysis that truncates a word to its stem.

● For example, laughing, laughs, laugh, and laughed are all stemmed to laugh.

● The problem is that semantically different words such as gallery and gall may both be truncated to gall, which conflates unrelated words and can make the stems unintelligible to users.
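The truncation behaviour, including the gallery/gall collision, can be reproduced with a toy suffix stripper. The `SUFFIXES` list and the `min_stem` threshold are assumptions picked to mirror the examples above; established stemmers such as the Porter stemmer apply far more elaborate rules.

```python
# Toy suffix list, checked in order; not a real stemming algorithm.
SUFFIXES = ("ing", "ery", "ed", "s")

def naive_stem(word, min_stem=3):
    """Strip the first matching suffix, provided at least min_stem characters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word

laugh_forms = [naive_stem(w) for w in ["laughing", "laughs", "laugh", "laughed"]]
collision = (naive_stem("gallery"), naive_stem("gall"))
print(laugh_forms, collision)
```

All four forms of laugh collapse to one stem as intended, but gallery and gall also end up sharing a stem even though they are unrelated.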

Thesaurus:

● Widens the search to include documents that use related terms.
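Thesaurus-based widening is often implemented as query expansion. The sketch below assumes a tiny hand-written `THESAURUS` mapping, which is hypothetical; a real system would draw on a curated or automatically built resource.

```python
# Hypothetical thesaurus: term -> set of related terms.
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "film": {"movie"},
}

def expand_query(terms, thesaurus=THESAURUS):
    """Return the query terms followed by any related terms from the thesaurus."""
    expanded = []
    for t in terms:
        expanded.append(t)
        # Sort the related terms so the output order is deterministic.
        expanded.extend(sorted(thesaurus.get(t, ())))
    return expanded

expanded = expand_query(["used", "car"])
print(expanded)
```

Expansion tends to retrieve more of the relevant documents, at the risk of also pulling in documents that match only a loosely related term.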


Components of IR model
Advantages and Disadvantages of IR

❏ Advantages

1. Efficient Access: Information retrieval techniques let users easily locate and retrieve information from vast amounts of data.

2. Personalization of Results: User profiling and personalization techniques are used in information retrieval models to tailor search results to individual preferences and behaviors.

3. Scalability: Information retrieval models are capable of handling increasing data volumes.

4. Precision: These systems can provide highly accurate and relevant search results, reducing the likelihood of irrelevant information appearing in search results.


Advantages and Disadvantages of IR (Cont.)
❏ Disadvantages

1. Information Overload: When a lot of information is available, users often face information overload, making it difficult to find the most useful and relevant material.

2. Lack of Context: Information retrieval systems may fail to understand the context of a user's query, potentially leading to inaccurate results.

3. Maintenance Challenges: Keeping these systems up to date and effective requires ongoing effort, including regular updates, data cleaning, and algorithm adjustments.

4. Bias and Fairness: Ensuring that information retrieval systems are free of bias and return fair results is a crucial challenge, especially in contexts like web search engines and recommendation systems.
Thank you
