Chapter - Three
ISR
IR Models
What is a Retrieval Model?
• A retrieval model describes the human and computational processes involved in retrieval
• Example: A model of human information-seeking behavior
• Example: A model of how documents are ranked computationally
• Components: users, information needs, queries, documents, relevance assessments, …
• Retrieval models define relevance, explicitly or implicitly
Boolean IR Model
One of the earliest and simplest retrieval methods.
Uses exact matching to match documents to a user's query (information request): it finds the documents that are "relevant" in the sense that they match the words in the query.
Any number of logical statements can be combined using
the three Boolean operators.
Operators: AND, OR, NOT
AND: Finds only documents containing all of the specified words or phrases.
P      Q      NOT P  P AND Q  P OR Q
False  False  True   False    False
False  True   True   False    True
True   False  False  False    True
True   True   False  True     True
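The three operators map directly onto set operations over the documents that contain each term. The following is a minimal sketch, assuming a small made-up collection; the document texts and the helper names `build_index` and `postings` are illustrative and not part of the original material.

```python
# Hypothetical toy collection: document id -> text (illustrative only).
docs = {
    1: "dog bites man",
    2: "man bites dog",
    3: "cat chases frog",
    4: "dog chases cat",
}


def build_index(collection):
    """Map each term to the set of ids of the documents that contain it."""
    index = {}
    for doc_id, text in collection.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


index = build_index(docs)
all_ids = set(docs)


def postings(term):
    """Return the set of documents containing `term` (empty if unseen)."""
    return index.get(term, set())


# AND -> set intersection: documents containing both terms.
dog_and_cat = postings("dog") & postings("cat")

# OR -> set union: documents containing at least one of the terms.
dog_or_frog = postings("dog") | postings("frog")

# NOT -> set difference from the whole collection.
not_dog = all_ids - postings("dog")

print(dog_and_cat, dog_or_frog, not_dog)
```

Because matching is exact, a document is either in the result set or not; no ranking among the matching documents is produced.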
Advantages:
• Very efficient
• Structured queries
Disadvantages:
Vector Space Model
A way of representing documents through the words that they contain.
Term weighting
• Each document is broken down into a word frequency
table.
Example Vector Space
• Document A
• “A dog and a cat.”
• Document B
• “A frog.”
a frog
1 1
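The word frequency tables for these two example documents can be sketched as follows. This is a minimal illustration, assuming lowercasing and a simple letters-only tokenizer; the function name `word_frequencies` and the choice to align both documents over a shared, sorted vocabulary are assumptions made for the example.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Lowercase the text, keep alphabetic tokens, and count each word."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


doc_a = word_frequencies("A dog and a cat.")
doc_b = word_frequencies("A frog.")

# Shared vocabulary: every distinct word seen in either document.
vocabulary = sorted(set(doc_a) | set(doc_b))

# One vector per document, with one count per vocabulary term.
vec_a = [doc_a[term] for term in vocabulary]
vec_b = [doc_b[term] for term in vocabulary]

print(vocabulary)
print(vec_a)
print(vec_b)
```

With this tokenizer, Document B yields a count of 1 for "a" and 1 for "frog", matching the table above, while Document A counts "a" twice.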
Vector Example ….
• Document A: “A dog and a cat.”
For simplicity, let's assume three index terms: dog, bite, man (i.e., V = 3).
0 = the term does not appear
1 = the term appears at least once
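Under this 0/1 convention, each document becomes a vector with one entry per index term. A minimal sketch, assuming whitespace tokenization with no stemming; the sample documents and the function name `binary_vector` are hypothetical.

```python
# The three index terms from the text (V = 3).
index_terms = ["dog", "bite", "man"]


def binary_vector(text, terms):
    """Return 1 for each index term that appears at least once, else 0."""
    present = set(text.lower().split())
    return [1 if term in present else 0 for term in terms]


# Hypothetical documents, already reduced to index-term form.
vec_1 = binary_vector("dog bite man", index_terms)
vec_2 = binary_vector("dog bite dog", index_terms)
vec_3 = binary_vector("man", index_terms)

print(vec_1, vec_2, vec_3)
```

Note that "dog bite dog" still gets a 1 for "dog": the binary scheme records presence only, not how often a term occurs.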
Probabilistic Model
The probabilistic retrieval model is based on the Probability
Ranking Principle.
Documents are retrieved in decreasing order of their probability of being relevant to the query. Mathematically, the scoring function is given by
• P(R = 1|d, q)
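Once a value of P(R = 1|d, q) is available for each document, ranking reduces to sorting by that value. The sketch below assumes the probabilities have already been estimated; the document ids and numbers are made up for illustration, and how the estimates are obtained is outside the scope of this snippet.

```python
# Hypothetical estimates of P(R = 1 | d, q) for a single query q.
estimated_relevance = {
    "d1": 0.20,
    "d2": 0.75,
    "d3": 0.50,
}

# Probability Ranking Principle: present documents in decreasing
# order of their estimated probability of relevance.
ranking = sorted(estimated_relevance, key=estimated_relevance.get, reverse=True)

print(ranking)
```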
Basic Probability Principles
Let a, b be two events.
Bayesian formulas
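As a reminder, the standard identities for two events a and b are written out here in LaTeX. The notation (lowercase p, an overbar for the complement) is an assumption and may differ from the notation used elsewhere in the chapter.

```latex
% Chain rule: the joint probability factors in either order.
p(a, b) = p(a \mid b)\, p(b) = p(b \mid a)\, p(a)

% Bayes' rule, obtained by dividing the chain rule by p(b), with p(b) > 0.
p(a \mid b) = \frac{p(b \mid a)\, p(a)}{p(b)}

% Law of total probability, expanding the denominator over a and its complement.
p(b) = p(b \mid a)\, p(a) + p(b \mid \bar{a})\, p(\bar{a})
```

Reading a as "the document is relevant" and b as "the observed document and query" connects these identities to the scoring function P(R = 1|d, q).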
End of Chapter - Three
Questions