
Clustering in Information Retrieval System

Amit Sagu
saguamit98@gmail.com
Clustering in IRS

Clustering: grouping of similar objects.

Clustering in Information Retrieval (IR) systems refers to the process of grouping a set of
objects in such a way that objects in the same group (called a cluster) are more similar to each
other than to those in other groups (clusters). The two types covered here are:

• Document Clustering
• Term Clustering
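
The definition above can be illustrated with a small sketch. It is not taken from the slides: it assumes word-overlap (Jaccard) similarity and a simple single-pass grouping rule, and the names `jaccard`, `single_pass_cluster`, the threshold value, and the sample documents are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection / size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def single_pass_cluster(docs, threshold=0.2):
    """Group documents by similarity (assumed rule, for illustration only).

    Each document joins the first cluster whose seed (first) document is at
    least `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    token_sets = [set(d.lower().split()) for d in docs]
    clusters = []
    for i, tokens in enumerate(token_sets):
        for cluster in clusters:
            seed = token_sets[cluster[0]]
            if jaccard(tokens, seed) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append([i])
    return clusters


# Hypothetical sample documents.
docs = [
    "network connectivity solutions",
    "solutions for network connectivity problems",
    "chocolate cake recipe",
    "easy chocolate cake baking recipe",
]
clusters = single_pass_cluster(docs)
print(clusters)
```

With this sample data the two networking documents end up in one cluster and the two recipe documents in another, because documents inside each group share more words with each other than with the other group.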
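Searching in IRS (Basic Block Diagram of IRS)

[Figure: block diagram. The user sends a query to the IRS, which searches the database (documents). The results go back to the user for checking, ranked from highest (Doc-1, then Doc-2, Doc-3, ...) to lowest (Doc-n).]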
Document Clustering

[Figure: the documents are grouped into clusters (Cluster-1, Cluster-2, ..., Cluster-n), each holding several documents (Doc-1 to Doc-4). The query is run against the clustered documents, and the results are ranked from highest to lowest: Doc-2, Doc-1, Doc-4, Doc-3.]
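
As a rough sketch of how document clusters can widen a result list, the code below ranks direct matches first and then appends documents that share a cluster with a match. The scoring rule (count of shared words), the function names, the sample documents, and the cluster assignments are assumptions made for illustration; the slides do not specify any of them.

```python
def overlap_score(query, doc):
    """Number of distinct query words that also occur in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def search_with_cluster_expansion(query, docs, cluster_of):
    """Return direct matches (best first), then their unmatched cluster-mates.

    docs:       mapping of document id -> text
    cluster_of: mapping of document id -> cluster id (assumed precomputed)
    """
    scores = {doc_id: overlap_score(query, text) for doc_id, text in docs.items()}
    # Documents that contain at least one query word, highest score first.
    matched = sorted((d for d in docs if scores[d] > 0), key=lambda d: -scores[d])
    matched_clusters = {cluster_of[d] for d in matched}
    # Documents with no query word that sit in the same cluster as a match.
    expanded = [d for d in docs
                if d not in matched and cluster_of[d] in matched_clusters]
    return matched + expanded


# Hypothetical documents and cluster assignments.
docs = {
    "Doc-1": "wireless network setup guide",
    "Doc-2": "network connectivity solutions",
    "Doc-3": "chocolate cake recipe",
    "Doc-4": "router and switch configuration",
}
cluster_of = {"Doc-1": 1, "Doc-2": 1, "Doc-3": 2, "Doc-4": 1}

results = search_with_cluster_expansion("connectivity solutions", docs, cluster_of)
print(results)
```

Here only Doc-2 contains the query words, but Doc-1 and Doc-4 are also returned because they belong to the same cluster, while Doc-3 in the unrelated cluster is left out.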
Term Clustering

Query: "Connectivity solutions for networking"

Query processing splits the query into terms:

Term-1: Connectivity
Term-2: Solutions
Term-3: for
Term-4: networking

Cluster for each term:

• Connectivity: Connected, connectivity, connection, ……, Web, grid, net, circuit
• Solutions: Solution, Solutions, Solve, Solves, Solving, Solver, Solvable
• networking: Network, Networks, Networking, Networked, Networker

[Figure: the term clusters feed Thesaurus Generation, which connects to the Database (Documents).]

Thesaurus: A book or electronic resource that lists words in groups
of synonyms and related concepts.
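
A thesaurus built from term clusters can be applied to a query roughly as follows. This is a minimal sketch: the thesaurus entries reuse the terms from the example above, while the dictionary layout, the `expand_query` function, and the choice to drop "for" as a stop word are assumptions, not something the slides prescribe.

```python
# Term clusters from the example query, stored as a small thesaurus.
THESAURUS = {
    "connectivity": ["connected", "connection", "web", "grid", "net", "circuit"],
    "solutions": ["solution", "solve", "solves", "solving", "solver", "solvable"],
    "networking": ["network", "networks", "networked", "networker"],
}

# Assumption: common words such as "for" have no cluster and are skipped.
STOP_WORDS = {"for"}


def expand_query(query, thesaurus=THESAURUS, stop_words=STOP_WORDS):
    """Return the query terms plus the related terms from each term's cluster."""
    expanded = []
    for term in query.lower().split():
        if term in stop_words:
            continue
        # Keep the original term first, then add its cluster members.
        for candidate in [term] + thesaurus.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


terms = expand_query("Connectivity solutions for networking")
print(terms)
```

The expanded term list can then be matched against the documents, so that a document mentioning only "network" or "connection" can still be retrieved for the original query.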
• Document Clustering -> Result Expansion
  Query results, ranked from highest to lowest: Doc-2, Doc-1, Doc-4, Doc-3

• Term Clustering -> Query Expansion
  The query term "Connectivity" expands to: Connected, connectivity, connection, ……, Web, grid, net, circuit
