Professional Documents
Culture Documents
Modern Information Retrieval: Modeling
Modern Information Retrieval: Modeling
Chapter 3
Modeling
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 1
IR Models
Modeling in IR is a complex process aimed at
producing a ranking function
Ranking function: a function that assigns scores to documents
with regard to a given query
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 2
Modeling and Ranking
IR systems usually adopt index terms to index and
retrieve documents
Index term:
In a restricted sense: it is a keyword that has some meaning on
its own; usually plays the role of a noun
In a more general form: it is any word that appears in a document
Retrieval based on index terms can be implemented
efficiently
Also, index terms are simple to refer to in a query
Simplicity is important because it reduces the effort of
query formulation
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 3
Introduction
Information retrieval process
index terms
1
docs terms
documents match
2
query terms 3
information ...
need ranking
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 4
Introduction
A ranking is an ordering of the documents that
(hopefully) reflects their relevance to a user query
Thus, any IR system has to deal with the problem of
predicting which documents the users will find relevant
This problem naturally embodies a degree of
uncertainty, or vagueness
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 5
IR Models
An IR model is a quadruple [D, Q, F , R(qi , dj )] where
1. D is a set of logical views for the documents in the collection
2. Q is a set of logical views for the user queries
3. F is a framework for modeling documents and queries
4. R(qi , dj ) is a ranking function
D
d
R(d ,q )
Q
q
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 6
A Taxonomy of IR Models
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 7
Retrieval: Ad Hoc x Filtering
Ad Hoc Retrieval:
Q1 Q3
Q2 Collection
Q4
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 8
Retrieval: Ad Hoc x Filtering
Filtering
user 1
!
&
$
"
%
user 2
!
'
$
"
%
documents stream
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 9
Basic Concepts
Each document is represented by a set of
representative keywords or index terms
An index term is a word or group of consecutive words
in a document
A pre-selected set of index terms can be used to
summarize the document contents
However, it might be interesting to assume that all
words are index terms (full text representation)
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 10
Basic Concepts
Let,
t be the number of index terms in the document collection
ki be a generic index term
Then,
The vocabulary V = {k1 , . . . , kt } is the set of all distinct index
terms in the collection
V= k k k k
5
0
6
-.
/3
-
,
4
(
+
?
<
8
=>
;
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 11
Basic Concepts
Documents and queries can be represented by
patterns of term co-occurrences
V=
C
C
D
G
@
@
@
JN
P
HI
KL
OK
M
S
A
JN
JN
JN
Z
@
KL
KL
MQ
KL
S
[
^
^
^
JN
J
HI
KL
OK
TK
O
S
A
JN
J
I
KL
KL
O
S
]
@
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 12
The Term-Document Matrix
The occurrence of a term ki in a document dj
establishes a relation between ki and dj
A term-document relation between ki and dj can be
quantified by the frequency of the term in the document
In matrix form, this can written as
d1 d2
k1 f1,1 f1,2
k2 f f
2,1 2,2
k3 f3,1 f3,2
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 13
Basic Concepts
Logical view of a document: from full text to a set of
index terms
e
`
a
e
`
`
fgh
h
e
k
`
`
j
ab
cad
ia
ld
pb
il
`
`
v
h
m
`
b
fg
il
na
q
j
`
`
d
pd
`
d
pd
`
h
t
k
k
`
d
i
`
`
`
`
d
pd
ta
q
p
e
h
cad
_
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 14
The Boolean Model
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 15
The Boolean Model
Simple model based on set theory and boolean
algebra
Queries specified as boolean expressions
quite intuitive and precise semantics
neat formalism
example of query
q = ka ∧ (kb ∨ ¬kc )
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 16
The Boolean Model
A term conjunctive component that satisfies a query q is
called a query conjunctive component c(q)
A query q rewritten as a disjunction of those
components is called the disjunct normal form qDN F
To illustrate, consider
query q = ka ∧ (kb ∨ ¬kc )
vocabulary V = {ka , kb , kc }
Then
qDN F = (1, 1, 1) ∨ (1, 1, 0) ∨ (1, 0, 0)
c(q): a conjunctive component for q
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 17
The Boolean Model
The three conjunctive components for the query
q = ka ∧ (kb ∨ ¬kc )
Ka
Kb
(1,1,1)
Kc
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 18
The Boolean Model
This approach works even if the vocabulary of the
collection includes terms not in the query
Consider that the vocabulary is given by
V = {ka , kb , kc , kd }
Then, a document dj that contains only terms ka , kb ,
and kc is represented by c(dj ) = (1, 1, 1, 0)
The query [q = ka ∧ (kb ∨ ¬kc )] is represented in
disjunctive normal form as
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 19
The Boolean Model
The similarity of the document dj to the query q is
defined as
(
1 if ∃c(q) | c(q) = c(dj )
sim(dj , q) =
0 otherwise
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 20
Drawbacks of the Boolean Model
Retrieval based on binary decision criteria with no
notion of partial matching
No ranking of the documents is provided (absence of a
grading scale)
Information need has to be translated into a Boolean
expression, which most users find awkward
The Boolean queries formulated by the users are most
often too simplistic
The model frequently returns either too few or too many
documents in response to a user query
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 21
Term Weighting
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 22
Term Weighting
The terms of a document are not equally useful for
describing the document contents
In fact, there are index terms which are simply vaguer
than others
There are properties of an index term which are useful
for evaluating the importance of the term in a document
For instance, a word which appears in all documents of a
collection is completely useless for retrieval tasks
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 23
Term Weighting
To characterize term importance, we associate a weight
wi,j > 0 with each term ki that occurs in the document dj
If ki that does not appear in the document dj , then wi,j = 0.
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 24
Term Weighting
Let,
ki be an index term and dj be a document
V = {k1 , k2 , ..., kt } be the set of all index terms
wi,j > 0 be the weight associated with (ki , dj )
|
w
k w
x
|
x
k w
{
~
}
y
|
y
~
~
{
.. ..
k w
|
z
z
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 25
Term Weighting
The weights wi,j can be computed using the frequencies
of occurrence of the terms within documents
Let fi,j be the frequency of occurrence of index term ki in
the document dj
The total frequency of occurrence Fi of term ki in the
collection is defined as
N
X
Fi = fi,j
j=1
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 26
Term Weighting
The document frequency ni of a term ki is the number
of documents in which it occurs
Notice that ni ≤ Fi .
For instance, in the document collection below, the
values fi,j , Fi and ni associated with the term do are
f (do, d1 ) = 2 To do is to be.
f (do, d2 ) = 0 To be is to do. To be or not to be.
I am what I am.
d1
f (do, d3 ) = 3
f (do, d4 ) = 3 d2
I think therefore I am.
F (do) = 8 Do be do be do. Do do do, da da da.
Let it be, let it be.
d3
n(do) = 3
d4
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 27
Term-term correlation matrix
For classic information retrieval models, the index term
weights are assumed to be mutually independent
This means that wi,j tells us nothing about wi+1,j
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 28
Term-term correlation matrix
To take into account term-term correlations, we can
compute a correlation matrix
~ = (mij ) be a term-document matrix t × N where
Let M
mij = wi,j
~ =M
The matrix C ~M~ t is a term-term correlation matrix
Each element cu,v ∈ C expresses a correlation between
terms ku and kv , given by
X
cu,v = wu,j × wv,j
dj
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 29
Term-term correlation matrix
Term-term correlation matrix for a sample collection
d1 d2 k1 k2 k3
k1 w w1,2 " #
1,1 d1 w1,1 w2,1 w3,1
w2,1
k2 w2,2
d2 w1,2 w2,2 w3,2
k3 w3,1 w3,2
M × MT
| {z }
⇓
k1 k2 k3
k1 w w + w1,2 w1,2 w1,1 w2,1 + w1,2 w2,2 w1,1 w3,1 + w1,2 w3,2
1,1 1,1
k2
w2,1 w1,1 + w2,2 w1,2 w2,1 w2,1 + w2,2 w2,2 w2,1 w3,1 + w2,2 w3,2
k3 w3,1 w1,1 + w3,2 w1,2 w3,1 w2,1 + w3,2 w2,2 w3,1 w3,1 + w3,2 w3,2
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 30
TF-IDF Weights
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 31
TF-IDF Weights
TF-IDF term weighting scheme:
Term frequency (TF)
Inverse document frequency (IDF)
Foundations of the most popular term weighting scheme in IR
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 32
Term-term correlation matrix
Luhn Assumption. The value of wi,j is proportional to
the term frequency fi,j
That is, the more often a term occurs in the text of the document,
the higher its weight
tfi,j = fi,j
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 33
Term Frequency (TF) Weights
A variant of tf weight used in the literature is
(
1 + log fi,j if fi,j > 0
tfi,j =
0 otherwise
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 34
Term Frequency (TF) Weights
Log tf weights tfi,j for the example collection
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 35
Inverse Document Frequency
We call document exhaustivity the number of index
terms assigned to a document
The more index terms are assigned to a document, the
higher is the probability of retrieval for that document
If too many terms are assigned to a document, it will be retrieved
by queries for which it is not relevant
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 36
Inverse Document Frequency
Specificity is a property of the term semantics
A term is more or less specific depending on its meaning
To exemplify, the term beverage is less specific than the
terms tea and beer
We could expect that the term beverage occurs in more
documents than the terms tea and beer
Term specificity should be interpreted as a statistical
rather than semantic property of the term
Statistical term specificity. The inverse of the number
of documents in which the term occurs
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 37
Inverse Document Frequency
Terms are distributed in a text according to Zipf’s Law
Thus, if we sort the vocabulary terms in decreasing
order of document frequencies we have
n(r) ∼ r−α
n(r) = Cr−α
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 38
Inverse Document Frequency
Setting α = 1 (simple approximation for english
collections) and taking logs we have
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 39
Inverse Document Frequency
Let ki be the term with the rth largest document
frequency, i.e., n(r) = ni . Then,
N
idfi = log
ni
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 40
Inverse Document Frequency
Idf values for example collection
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 41
TF-IDF weighting scheme
The best known term weighting schemes use weights
that combine idf factors with term frequencies
Let wi,j be the term weight associated with the term ki
and the document dj
Then, we define
(
(1 + log fi,j ) × log nNi if fi,j > 0
wi,j =
0 otherwise
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 42
TF-IDF weighting scheme
Tf-idf weights of all terms present in our example
document collection
d1 d2 d3 d4
1 to 3 2 - -
2 do 0.830 - 1.073 1.073
3 is 4 - - -
4 be - - - -
5 or - 2 - -
6 not - 2 - -
7 I - 2 2 -
8 am - 2 1 -
9 what - 2 - -
10 think - - 2 -
11 therefore - - 2 -
12 da - - - 5.170
13 let - - - 4
14 it - - - 4
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 43
The Vector Model
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 54
The Vector Model
Boolean matching and binary weights is too limiting
The vector model proposes a framework in which
partial matching is possible
This is accomplished by assigning non-binary weights
to index terms in queries and in documents
Term weights are used to compute a degree of
similarity between a query and each document
The documents are ranked in decreasing order of their
degree of similarity
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 55
The Vector Model
For the vector model:
The weight wi,j associated with a pair (ki , dj ) is positive and
non-binary
The index terms are assumed to be all mutually independent
They are represented as unit vectors of a t-dimensionsal space (t
is the total number of index terms)
The representations of document dj and query q are
t-dimensional vectors given by
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 56
The Vector Model
Similarity between a document dj and a query q
j d
q
i cos(θ) = d~j •~
q
|d~j |×|~q|
Pt
wi,j ×wi,q
sim(dj , q) = qP
t
i=1
2
qP
t 2
i=1 wi,j × j=1 wi,q
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 57
The Vector Model
Weights in the Vector model are basically tf-idf weights
N
wi,q = (1 + log fi,q ) × log
ni
N
wi,j = (1 + log fi,j ) × log
ni
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 58
The Vector Model
Document ranks computed by the Vector model for the
query “to do” (see tf-idf weight values in Slide 43)
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 59
The Vector Model
Advantages:
term-weighting improves quality of the answer set
partial matching allows retrieval of docs that approximate the
query conditions
cosine ranking formula sorts documents according to a degree of
similarity to the query
document length normalization is naturally built-in into the ranking
Disadvantages:
It assumes independence of index terms
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 60
Probabilistic Model
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 61
Probabilistic Model
The probabilistic model captures the IR problem using a
probabilistic framework
Given a user query, there is an ideal answer set for
this query
Given a description of this ideal answer set, we could
retrieve the relevant documents
Querying is seen as a specification of the properties of
this ideal answer set
But, what are these properties?
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 62
Probabilistic Model
An initial set of documents is retrieved somehow
The user inspects these docs looking for the relevant
ones (in truth, only top 10-20 need to be inspected)
The IR system uses this information to refine the
description of the ideal answer set
By repeating this process, it is expected that the
description of the ideal answer set will improve
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 63
Probabilistic Ranking Principle
The probabilistic model
Tries to estimate the probability that a document will be relevant
to a user query
Assumes that this probability depends on the query and
document representations only
The ideal answer set, referred to as R, should maximize the
probability of relevance
But,
How to compute these probabilities?
What is the sample space?
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 64
The Ranking
Let,
R be the set of relevant documents to query q
R be the set of non-relevant documents to query q
P (R|d~j ) be the probability that dj is relevant to the query q
P (R|d~j ) be the probability that dj is non-relevant to q
P (R|d~j )
sim(dj , q) =
P (R|d~j )
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 65
The Ranking
Using Bayes’ rule,
where
P (d~j |R, q) : probability of randomly selecting the document
dj from the set R
P (R, q) : probability that a document randomly selected
from the entire collection is relevant to query q
P (d~j |R, q) and P (R, q) : analogous and complementary
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 66
The Ranking
Assuming that the weights wi,j are all binary and
assuming independence among the index terms:
Q Q
( ki |wi,j =1 P (ki |R, q)) × ( ki |wi,j =0 P (k i |R, q))
sim(dj , q) ∼ Q Q
( ki |wi,j =1 P (ki |R, q)) × ( ki |wi,j =0 P (k i |R, q))
where
P (ki |R, q): probability that the term ki is present in a
document randomly selected from the set R
P (k i |R, q): probability that ki is not present in a document
randomly selected from the set R
probabilities with R: analogous to the ones just described
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 67
The Ranking
To simplify our notation, let us adopt the following
conventions
piR = P (ki |R, q)
qiR = P (ki |R, q)
Since
P (ki |R, q) + P (k i |R, q) = 1
P (ki |R, q) + P (k i |R, q) = 1
we can write:
Q Q
( ki |wi,j =1 piR ) × ( ki |wi,j =0 (1 − piR ))
sim(dj , q) ∼ Q Q
( ki |wi,j =1 qiR ) × ( ki |wi,j =0 (1 − qiR ))
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 68
The Ranking
Taking logarithms, we write
Y Y
sim(dj , q) ∼ log piR + log (1 − piR )
ki |wi,j =1 ki |wi,j =0
Y Y
− log qiR − log (1 − qiR )
ki |wi,j =1 ki |wi,j =0
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 69
The Ranking
Summing up terms that cancel each other, we obtain
Y Y
sim(dj , q) ∼ log piR + log (1 − pir )
ki |wi,j =1 ki |wi,j =0
Y Y
− log (1 − pir ) + log (1 − pir )
ki |wi,j =1 ki |wi,j =1
Y Y
− log qiR − log (1 − qiR )
ki |wi,j =1 ki |wi,j =0
Y Y
+ log (1 − qiR ) − log (1 − qiR )
ki |wi,j =1 ki |wi,j =1
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 70
The Ranking
Using logarithm operations, we obtain
Y piR Y
sim(dj , q) ∼ log + log (1 − piR )
(1 − piR )
ki |wi,j =1 ki
Y (1 − qiR ) Y
+ log − log (1 − qiR )
qiR
ki |wi,j =1 ki
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 71
The Ranking
Further, assuming that
∀ ki 6∈ q, piR = qiR
and converting the log products into sums of logs, we
finally obtain
P
piR 1−qiR
sim(dj , q) ∼ ki ∈q∧ki ∈dj log 1−piR + log qiR
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 72
Term Incidence Contingency Table
Let,
N be the number of documents in the collection
ni be the number of documents that contain term ki
R be the total number of relevant documents to query q
ri be the number of relevant documents that contain term ki
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 73
Ranking Formula
If information on the contingency table were available
for a given query, we could write
ri
piR = R
ni −ri
qiR = N −R
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 74
Ranking Formula
In the previous formula, we are still dependent on an
estimation of the relevant dos for the query
For handling small values of ri , we add 0.5 to each of
the terms in the formula above, which changes
sim(dj , q) into
X ri + 0.5 N − ni − R + ri + 0.5
log ×
R − ri + 0.5 ni − ri + 0.5
ki [q,dj ]
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 75
Ranking Formula
The previous equation cannot be computed without
estimates of ri and R
One possibility is to assume R = ri = 0, as a way to
boostrap the ranking equation, which leads to
P
N −ni +0.5
sim(dj , q) ∼ ki [q,dj ] log ni +0.5
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 76
Ranking Example
Document ranks computed by the previous probabilistic
ranking equation for the query “to do”
d2 log 4−2+0.5
2+0.5 0
d3 log 4−3+0.5
3+0.5 - 1.222
d4 log 4−3+0.5
3+0.5 - 1.222
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 77
Ranking Example
The ranking computation led to negative weights
because of the term “do”
Actually, the probabilistic ranking equation produces
negative terms whenever ni > N/2
One possible artifact to contain the effect of negative
weights is to change the previous equation to:
X N + 0.5
sim(dj , q) ∼ log
ni + 0.5
ki [q,dj ]
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 78
Ranking Example
Using this latest formulation, we redo the ranking
computation for our example collection for the query “to
do” and obtain
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 79
Latent Semantic Indexing
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 145
Latent Semantic Indexing
Classic IR might lead to poor retrieval due to:
unrelated documents might be included in the
answer set
relevant documents that do not contain at least one
index term are not retrieved
Reasoning: retrieval based on index terms is vague
and noisy
The user information need is more related to concepts
and ideas than to index terms
A document that shares concepts with another
document known to be relevant might be of interest
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 146
Latent Semantic Indexing
The idea here is to map documents and queries into a
dimensional space composed of concepts
Let
t: total number of index terms
N : number of documents
M = [mij ]: term-document matrix t × N
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 147
Latent Semantic Indexing
The matrix M = [mij ] can be decomposed into three
components using singular value decomposition
M = K · S · DT
were
K is the matrix of eigenvectors derived from C = M · MT
DT is the matrix of eigenvectors derived from MT · M
S is an r × r diagonal matrix of singular values where
r = min(t, N ) is the rank of M
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 148
Computing an Example
Let MT = [mij ] be given by
K1 K2 K3 q • dj
d1 2 0 1 5
d2 1 0 0 1
d3 0 1 3 11
d4 2 0 0 2
d5 1 2 4 17
d6 1 2 0 5
d7 0 5 0 10
q 1 2 3
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 149
Latent Semantic Indexing
In the matrix S, consider that only the s largest singular
values are selected
Keep the corresponding columns in K and DT
The resultant matrix is called Ms and is given by
Ms = Ks · Ss · DTs
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 150
Latent Ranking
The relationship between any two documents in s can
be obtained from the MTs · Ms matrix given by
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 151
Latent Ranking
The user query can be modelled as a
pseudo-document in the original M matrix
Assume the query is modelled as the document
numbered 0 in the M matrix
The matrix MTs · Ms quantifies the relationship between
any two documents in the reduced concept space
The first row of this matrix provides the rank of all the
documents with regard to the user query
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 152
Conclusions
Latent semantic indexing provides an interesting
conceptualization of the IR problem
Thus, it has its value as a new theoretical framework
From a practical point of view, the latent semantic
indexing model has not yielded encouraging results
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 153
Neural Network Model
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 154
Neural Network Model
Classic IR:
Terms are used to index documents and queries
Retrieval is based on index term matching
Motivation:
Neural networks are known to be good pattern
matchers
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 155
Neural Network Model
The human brain is composed of billions of neurons
Each neuron can be viewed as a small processing unit
A neuron is stimulated by input signals and emits output
signals in reaction
A chain reaction of propagating signals is called a
spread activation process
As a result of spread activation, the brain might
command the body to take physical reactions
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 156
Neural Network Model
A neural network is an oversimplified representation of
the neuron interconnections in the human brain:
nodes are processing units
edges are synaptic connections
the strength of a propagating signal is modelled by a
weight assigned to each edge
the state of a node is defined by its activation level
depending on its activation level, a node might issue
an output signal
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 157
Neural Network for IR
A neural network model for information retrieval
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 158
Neural Network for IR
Three layers network: one for the query terms, one for
the document terms, and a third one for the documents
Signals propagate across the network
First level of propagation:
Query terms issue the first signals
These signals propagate across the network to
reach the document nodes
Second level of propagation:
Document nodes might themselves generate new
signals which affect the document term nodes
Document term nodes might respond with new
signals of their own
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 159
Quantifying Signal Propagation
Normalize signal strength (MAX = 1)
Query terms emit initial signal equal to 1
Weight associated with an edge from a query term
node ki to a document term node ki :
wi,q
wi,q = qP
t 2
i=1 wi,q
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 160
Quantifying Signal Propagation
After the first level of signal propagation, the activation
level of a document node dj is given by:
t Pt
X
i=1 wi,q wi,j
wi,q wi,j = qP qP
t 2 × t 2
i=1 i=1 wi,q i=1 w i,j
Chap 03: Modeling, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 161
Modern Information Retrieval
Chapter 4
Retrieval Evaluation
The Cranfield Paradigm
Retrieval Performance Evaluation
Evaluation Using Reference Collections
Interactive Systems Evaluation
Search Log Analysis using Clickthrough Data
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 1
Introduction
To evaluate an IR system is to measure how well the
system meets the information needs of the users
This is troublesome, given that a same result set might be
interpreted differently by distinct users
To deal with this problem, some metrics have been defined that,
on average, have a correlation with the preferences of a group of
users
Without proper retrieval evaluation, one cannot
determine how well the IR system is performing
compare the performance of the IR system with that of other
systems, objectively
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 2
Introduction
Systematic evaluation of the IR system allows
answering questions such as:
a modification to the ranking function is proposed, should we go
ahead and launch it?
a new probabilistic ranking function has just been devised, is it
superior to the vector model and BM25 rankings?
for which types of queries, such as business, product, and
geographic queries, a given ranking modification works best?
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 3
Introduction
Retrieval performance evaluation consists of
associating a quantitative metric to the results produced
by an IR system
This metric should be directly associated with the relevance of
the results to the user
Usually, its computation requires comparing the results produced
by the system with results suggested by humans for a same set
of queries
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 4
The Cranfield Paradigm
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 5
The Cranfield Paradigm
Evaluation of IR systems is the result of early
experimentation initiated in the 50’s by Cyril Cleverdon
The insights derived from these experiments provide a
foundation for the evaluation of IR systems
Back in 1952, Cleverdon took notice of a new indexing
system called Uniterm, proposed by Mortimer Taube
Cleverdon thought it appealing and with Bob Thorne, a colleague,
did a small test
He manually indexed 200 documents using Uniterm and asked
Thorne to run some queries
This experiment put Cleverdon on a life trajectory of reliance on
experimentation for evaluating indexing systems
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 6
The Cranfield Paradigm
Cleverdon obtained a grant from the National Science
Foundation to compare distinct indexing systems
These experiments provided interesting insights, that
culminated in the modern metrics of precision and recall
Recall ratio: the fraction of relevant documents retrieved
Precision ration: the fraction of documents retrieved that are
relevant
For instance, it became clear that, in practical situations,
the majority of searches does not require high recall
Instead, the vast majority of the users require just a few
relevant answers
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 7
The Cranfield Paradigm
The next step was to devise a set of experiments that
would allow evaluating each indexing system in
isolation more thoroughly
The result was a test reference collection composed
of documents, queries, and relevance judgements
It became known as the Cranfield-2 collection
The reference collection allows using the same set of
documents and queries to evaluate different ranking
systems
The uniformity of this setup allows quick evaluation of
new ranking functions
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 8
Reference Collections
Reference collections, which are based on the
foundations established by the Cranfield experiments,
constitute the most used evaluation method in IR
A reference collection is composed of:
A set D of pre-selected documents
A set I of information need descriptions used for testing
A set of relevance judgements associated with each pair [im , dj ],
im ∈ I and dj ∈ D
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 9
Precision and Recall
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 10
Precision and Recall
Consider,
I : an information request
R: the set of relevant documents for I
A: the answer set for I , generated by an IR system
R ∩ A: the intersection of the sets R and A
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 11
Precision and Recall
The recall and precision measures are defined as
follows
Recall is the fraction of the relevant documents (the set R)
which has been retrieved i.e.,
|R ∩ A|
Recall =
|R|
|R ∩ A|
P recision =
|A|
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 12
Precision and Recall
The definition of precision and recall assumes that all
docs in the set A have been examined
However, the user is not usually presented with all docs
in the answer set A at once
User sees a ranked set of documents and examines them
starting from the top
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 13
Precision and Recall
Consider a reference collection and a set of test queries
Let Rq1 be the set of relevant docs for a query q1 :
Rq1 = {d3 , d5 , d9 , d25 , d39 , d44 , d56 , d71 , d89 , d123 }
Consider a new IR algorithm that yields the following
answer to q1 (relevant docs are marked with a bullet):
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 14
Precision and Recall
If we examine this ranking, we observe that
The document d123 , ranked as number 1, is relevant
This document corresponds to 10% of all relevant documents
Thus, we say that we have a precision of 100% at 10% recall
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 15
Precision and Recall
If we proceed with our examination of the ranking
generated, we can plot a curve of precision versus
recall as follows:
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 16
Precision and Recall
Consider now a second query q2 whose set of relevant
answers is given by
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 17
Precision and Recall
If we examine this ranking, we observe
The first relevant document is d56
It provides a recall and precision levels equal to 33.3%
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 18
Precision and Recall
The precision figures at the 11 standard recall levels
are interpolated as follows
Let rj , j ∈ {0, 1, 2, . . . , 10}, be a reference to the j -th
standard recall level
Then, P (rj ) = max∀r | rj ≤r P (r)
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 19
Precision and Recall
In the examples above, the precision and recall figures
have been computed for single queries
Usually, however, retrieval algorithms are evaluated by
running them for several distinct test queries
To evaluate the retrieval performance for Nq queries, we
average the precision at each recall level as follows
Nq
X Pi (rj )
P (rj ) =
Nq
i=1
where
P (rj ) is the average precision at the recall level rj
Pi (rj ) is the precision at recall level rj for the i-th query
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 20
Precision and Recall
To illustrate, the figure below illustrates precision-recall
figures averaged over queries q1 and q2
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 21
Precision and Recall
Average precision-recall curves are normally used to
compare the performance of distinct IR algorithms
The figure below illustrates average precision-recall
curves for two distinct retrieval algorithms
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 22
Precision-Recall Appropriateness
Precision and recall have been extensively used to
evaluate the retrieval performance of IR algorithms
However, a more careful reflection reveals problems
with these two measures:
First, the proper estimation of maximum recall for a query
requires detailed knowledge of all the documents in the collection
Second, in many situations the use of a single measure could be
more appropriate
Third, recall and precision measure the effectiveness over a set
of queries processed in batch mode
Fourth, for systems which require a weak ordering though, recall
and precision might be inadequate
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 23
Single Value Summaries
Average precision-recall curves constitute standard
evaluation metrics for information retrieval systems
However, there are situations in which we would like to
evaluate retrieval performance over individual queries
The reasons are twofold:
First, averaging precision over many queries might disguise
important anomalies in the retrieval algorithms under study
Second, we might be interested in investigating whether a
algorithm outperforms the other for each query
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 24
P@5 and P@10
In the case of Web search engines, the majority of
searches does not require high recall
Higher the number of relevant documents at the top of
the ranking, more positive is the impression of the users
Precision at 5 (P@5) and at 10 (P@10) measure the
precision when 5 or 10 documents have been seen
These metrics assess whether the users are getting
relevant documents at the top of the ranking or not
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 25
P@5 and P@10
To exemplify, consider again the ranking for the
example query q1 we have been using:
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 26
MAP: Mean Average Precision
The idea here is to average the precision figures
obtained after each new relevant document is observed
For relevant documents not retrieved, the precision is set to 0
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 27
R-Precision
Let R be the total number of relevant docs for a given
query
The idea here is to compute the precision at the R-th
position in the ranking
For the query q1 , the R value is 10 and there are 4
relevants among the top 10 documents in the ranking
Thus, the R-Precision value for this query is 0.4
The R-precision measure is a useful for observing the
behavior of an algorithm for individual queries
Additionally, one can also compute an average
R-precision figure over a set of queries
However, using a single number to evaluate a algorithm over
several queries might be quite imprecise
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 28
Precision Histograms
The R-precision computed for several queries can be
used to compare two algorithms as follows
Let,
RPA (i) : R-precision for algorithm A for the i-th query
RPB (i) : R-precision for algorithm B for the i-th query
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 29
Precision Histograms
Figure below illustrates the RPA/B (i) values for two
retrieval algorithms over 10 example queries
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 30
MRR: Mean Reciprocal Rank
MRR is a good metric for those cases in which we are
interested in the first correct answer such as
Question-Answering (QA) systems
Search engine queries that look for specific sites
URL queries
Homepage queries
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 31
MRR: Mean Reciprocal Rank
Let,
Ri : ranking relative to a query qi
Scorrect (Ri ): position of the first correct answer in Ri
Sh : threshold for ranking position
PNq
M RR(Q) = i RR(Ri )
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 32
The E-Measure
A measure that combines recall and precision
The idea is to allow the user to specify whether he is
more interested in recall or in precision
The E measure is defined as follows
1 + b2
E(j) = 1 − b2 1
r(j) + P (j)
where
r(j) is the recall at the j-th position in the ranking
P (j) is the precision at the j-th position in the ranking
b ≥ 0 is a user specified parameter
E(j) is the E metric at the j-th position in the ranking
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 33
The E-Measure
The parameter b is specified by the user and reflects the
relative importance of recall and precision
If b = 0
E(j) = 1 − P (j)
low values of b make E(j) a function of precision
If b → ∞
limb→∞ E(j) = 1 − r(j)
high values of b make E(j) a function of recal
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 34
F-Measure: Harmonic Mean
The F-measure is also a single measure that combines
recall and precision
2
F (j) = 1 1
r(j) + P (j)
where
r(j) is the recall at the j-th position in the ranking
P (j) is the precision at the j-th position in the ranking
F (j) is the harmonic mean at the j-th position in the ranking
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 35
F-Measure: Harmonic Mean
The function F assumes values in the interval [0, 1]
It is 0 when no relevant documents have been retrieved
and is 1 when all ranked documents are relevant
Further, the harmonic mean F assumes a high value
only when both recall and precision are high
To maximize F requires finding the best possible
compromise between recall and precision
Notice that setting b = 1 in the formula of the E-measure
yields
F (j) = 1 − E(j)
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 36
Summary Table Statistics
Single value measures can also be stored in a table to
provide a statistical summary
For instance, these summary table statistics could
include
the number of queries used in the task
the total number of documents retrieved by all queries
the total number of relevant docs retrieved by all queries
the total number of relevant docs for all queries, as judged by the
specialists
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 37
User-Oriented Measures
Recall and precision assume that the set of relevant
docs for a query is independent of the users
However, different users might have different relevance
interpretations
To cope with this problem, user-oriented measures
have been proposed
As before,
consider a reference collection, an information request I, and a
retrieval algorithm to be evaluated
with regard to I, let R be the set of relevant documents and A be
the set of answers retrieved
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 38
User-Oriented Measures
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 39
User-Oriented Measures
The coverage ratio is the fraction of the documents
known and relevant that are in the answer set, that is
|K ∩ R ∩ A|
coverage =
|K ∩ R|
The novelty ratio is the fraction of the relevant docs in
the answer set that are not known to the user
|(R ∩ A) − K|
novelty =
|R ∩ A|
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 40
User-Oriented Measures
A high coverage indicates that the system has found
most of the relevant docs the user expected to see
A high novelty indicates that the system is revealing
many new relevant docs which were unknown
Additionally, two other measures can be defined
relative recall: ratio between the number of relevant docs found
and the number of relevant docs the user expected to find
recall effort: ratio between the number of relevant docs the user
expected to find and the number of documents examined in an
attempt to find the expected relevant documents
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 41
DCG — Discounted Cumulated Gain
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 42
Discounted Cumulated Gain
Precision and recall allow only binary relevance
assessments
As a result, there is no distinction between highly
relevant docs and mildly relevant docs
These limitations can be overcome by adopting graded
relevance assessments and metrics that combine them
The discounted cumulated gain (DCG) is a metric
that combines graded relevance assessments
effectively
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 43
Discounted Cumulated Gain
When examining the results of a query, two key
observations can be made:
highly relevant documents are preferable at the top of the ranking
than mildly relevant ones
relevant documents that appear at the end of the ranking are less
valuable
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 44
Discounted Cumulated Gain
Consider that the results of the queries are graded on a
scale 0–3 (0 for non-relevant, 3 for strong relevant docs)
For instance, for queries q1 and q2 , consider that the
graded relevance scores are as follows:
Rq1 = { [d3 , 3], [d5 , 3], [d9 , 3], [d25 , 2], [d39 , 2],
[d44 , 2], [d56 , 1], [d71 , 1], [d89 , 1], [d123 , 1] }
Rq2 = { [d3 , 3], [d56 , 2], [d129 , 1] }
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 45
Discounted Cumulated Gain
Given these assessments, the results of a new ranking
algorithm can be evaluated as follows
Specialists associate a graded relevance score to the
top 10-20 results produced for a given query q
This list of relevance scores is referred to as the gain vector G
G1 = (1, 0, 1, 0, 0, 3, 0, 0, 0, 2, 0, 0, 0, 0, 3)
G2 = (0, 0, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 3)
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 46
Discounted Cumulated Gain
By summing up the graded scores up to any point in the
ranking, we obtain the cumulated gain (CG)
For query q1 , for instance, the cumulated gain at the first
position is 1, at the second position is 1+0, and so on
Thus, the cumulated gain vectors for queries q1 and q2
are given by
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 47
Discounted Cumulated Gain
In formal terms, we define
Given the gain vector Gj for a test query qj , the CGj associated
with it is defined as
Gj [1] if i = 1;
CGj [i] =
Gj [i] + CGj [i − 1] otherwise
where CGj [i] refers to the cumulated gain at the ith position of
the ranking for query qj
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 48
Discounted Cumulated Gain
We also introduce a discount factor that reduces the
impact of the gain as we move upper in the ranking
A simple discount factor is the logarithm of the ranking
position
If we consider logs in base 2, this discount factor will be
log2 2 at position 2, log2 3 at position 3, and so on
By dividing a gain by the corresponding discount factor,
we obtain the discounted cumulated gain (DCG)
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 49
Discounted Cumulated Gain
More formally,
Given the gain vector Gj for a test query qj , the vector DCGj
associated with it is defined as
G [1] if i = 1;
j
DCGj [i] =
Gj [i] + DCGj [i − 1] otherwise
log2 i
where DCGj [i] refers to the discounted cumulated gain at the ith
position of the ranking for query qj
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 50
Discounted Cumulated Gain
For the example queries q1 and q2 , the DCG vectors are
given by
DCG1 = (1.0, 1.0, 1.6, 1.6, 1.6, 2.8, 2.8, 2.8, 2.8, 3.4, 3.4, 3.4, 3.4, 3.4, 4.2)
DCG2 = (0.0, 0.0, 1.3, 1.3, 1.3, 1.3, 1.3, 1.6, 1.6, 1.6, 1.6, 1.6, 1.6, 1.6, 2.4)
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 51
DCG Curves
To produce CG and DCG curves over a set of test
queries, we need to average them over all queries
Given a set of Nq queries, average CG[i] and DCG[i]
over all queries are computed as follows
PNq CGj [i] PNq DCGj [i]
CG[i] = j=1 Nq ; DCG[i] = j=1 Nq
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 52
DCG Curves
Then, average curves can be drawn by varying the rank
positions from 1 to a pre-established threshold
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 53
Ideal CG and DCG Metrics
Recall and precision figures are computed relatively to
the set of relevant documents
CG and DCG scores, as defined above, are not
computed relatively to any baseline
This implies that it might be confusing to use them
directly to compare two distinct retrieval algorithms
One solution to this problem is to define a baseline to
be used for normalization
This baseline are the ideal CG and DCG metrics, as we
now discuss
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 54
Ideal CG and DCG Metrics
For a given test query q , assume that the relevance
assessments made by the specialists produced:
n3 documents evaluated with a relevance score of 3
n2 documents evaluated with a relevance score of 2
n1 documents evaluated with a score of 1
n0 documents evaluated with a score of 0
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 55
Ideal CG and DCG Metrics
Ideal CG and ideal DCG vectors can be computed
analogously to the computations of CG and DCG
For the example queries q1 and q2 , we have
ICG1 = (3, 6, 9, 11, 13, 15, 16, 17, 18, 19, 19, 19, 19, 19, 19)
ICG2 = (3, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6)
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 56
Ideal CG and DCG Metrics
Further, average ICG and average IDCG scores can
be computed as follows
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 57
Normalized DCG
Precision and recall figures can be directly compared to
the ideal curve of 100% precision at all recall levels
DCG figures, however, are not build relative to any ideal
curve, which makes it difficult to compare directly DCG
curves for two distinct ranking algorithms
This can be corrected by normalizing the DCG metric
Given a set of Nq test queries, normalized CG and DCG
metrics are given by
CG[i] DCG[i]
N CG[i] = ICG[i]
; N DCG[i] = IDCG[i]
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 58
Normalized DCG
For instance, for the example queries q1 and q2 , NCG
and NDCG vectors are given by
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 59
Discussion on DCG Metrics
CG and DCG metrics aim at taking into account
multiple level relevance assessments
This has the advantage of distinguishing highly relevant
documents from mildly relevant ones
The inherent disadvantages are that multiple level
relevance assessments are harder and more time
consuming to generate
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 60
Discussion on DCG Metrics
Despite these inherent difficulties, the CG and DCG
metrics present benefits:
They allow systematically combining document ranks and
relevance scores
Cumulated gain provides a single metric of retrieval performance
at any position in the ranking
It also stresses the gain produced by relevant docs up to a
position in the ranking, which makes the metrics more imune to
outliers
Further, discounted cumulated gain allows down weighting the
impact of relevant documents found late in the ranking
Chap 04: Retrieval Evaluation, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 61
Modern Information Retrieval
Chapter 5
Relevance Feedback and
Query Expansion
Introduction
A Framework for Feedback Methods
Explicit Relevance Feedback
Explicit Feedback Through Clicks
Implicit Feedback Through Local Analysis
Implicit Feedback Through Global Analysis
Trends and Research Issues
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 1
Introduction
Most users find it difficult to formulate queries that are
well designed for retrieval purposes
Yet, most users often need to reformulate their queries
to obtain the results of their interest
Thus, the first query formulation should be treated as an initial
attempt to retrieve relevant information
Documents initially retrieved could be analyzed for relevance and
used to improve initial query
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 2
Introduction
The process of query modification is commonly referred
as
relevance feedback, when the user provides information on
relevant documents to a query, or
query expansion, when information related to the query is used
to expand it
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 3
A Framework for Feedback Methods
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 4
A Framework
Consider a set of documents Dr that are known to be
relevant to the current query q
In relevance feedback, the documents in Dr are used to
transform q into a modified query qm
However, obtaining information on documents relevant
to a query requires the direct interference of the user
Most users are unwilling to provide this information, particularly in
the Web
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 5
A Framework
Because of this high cost, the idea of relevance
feedback has been relaxed over the years
Instead of asking the users for the relevant documents,
we could:
Look at documents they have clicked on; or
Look at terms belonging to the top documents in the result set
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 6
A Framework
A feedback cycle is composed of two basic steps:
Determine feedback information that is either related or expected
to be related to the original query q and
Determine how to transform query q to take this information
effectively into account
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 7
A Framework
In an explicit relevance feedback cycle, the feedback
information is provided directly by the users
However, collecting feedback information is expensive
and time consuming
In the Web, user clicks on search results constitute a
new source of feedback information
A click indicate a document that is of interest to the user
in the context of the current query
Notice that a click does not necessarily indicate a document that
is relevant to the query
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 8
Explicit Feedback Information
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 9
A Framework
In an implicit relevance feedback cycle, the feedback
information is derived implicitly by the system
There are two basic approaches for compiling implicit
feedback information:
local analysis, which derives the feedback information from the
top ranked documents in the result set
global analysis, which derives the feedback information from
external sources such as a thesaurus
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 10
Implicit Feedback Information
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 11
Explicit Relevance Feedback
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 12
Explicit Relevance Feedback
In a classic relevance feedback cycle, the user is
presented with a list of the retrieved documents
Then, the user examines them and marks those that
are relevant
In practice, only the top 10 (or 20) ranked documents
need to be examined
The main idea consists of
selecting important terms from the documents that have been
identified as relevant, and
enhancing the importance of these terms in a new query
formulation
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 13
Explicit Relevance Feedback
Expected effect: the new query will be moved towards
the relevant docs and away from the non-relevant ones
Early experiments have shown good improvements in
precision for small test collections
Relevance feedback presents the following
characteristics:
it shields the user from the details of the query reformulation
process (all the user has to provide is a relevance judgement)
it breaks down the whole searching task into a sequence of small
steps which are easier to grasp
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 14
The Rocchio Method
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 15
The Rocchio Method
Documents identified as relevant (to a given query)
have similarities among themselves
Further, non-relevant docs have term-weight vectors
which are dissimilar from the relevant documents
The basic idea of the Rocchio Method is to reformulate
the query such that it gets:
closer to the neighborhood of the relevant documents in the
vector space, and
away from the neighborhood of the non-relevant documents
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 16
The Rocchio Method
Let us define terminology regarding the processing of a
given query q , as follows:
Dr : set of relevant documents among the documents retrieved
Nr : number of documents in set Dr
Dn : set of non-relevant docs among the documents retrieved
Nn : number of documents in set Dn
Cr : set of relevant docs among all documents in the collection
N : number of documents in the collection
α, β, γ: tuning constants
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 17
The Rocchio Method
Consider that the set Cr is known in advance
Then, the best query vector for distinguishing the
relevant from the non-relevant docs is given by
1 X ~ 1
d~j
X
~qopt = dj −
|Cr | N − |Cr |
∀d~j ∈Cr ∀d~j 6∈Cr
where
|Cr | refers to the cardinality of the set Cr
d~j is a weighted term vector associated with document dj , and
~qopt is the optimal weighted term vector for query q
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 18
The Rocchio Method
However, the set Cr is not known a priori
To solve this problem, we can formulate an initial query
and to incrementally change the initial query vector
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 19
The Rocchio Method
There are three classic and similar ways to calculate
the modified query ~qm as follows,
β X γ X
Standard_Rocchio : ~qm = α q~ + d~j − d~j
Nr Nn
∀d~j ∈Dr ∀d~j ∈Dn
X X
Ide_Regular : ~qm = α ~q + β d~j − γ d~j
∀d~j ∈Dr ∀d~j ∈Dn
X
Ide_Dec_Hi : ~qm = α ~q + β d~j − γ max_rank(Dn )
∀d~j ∈Dr
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 20
The Rocchio Method
Three different setups of the parameters in the Rocchio
formula are as follows:
α = 1, proposed by Rocchio
α = β = γ = 1, proposed by Ide
γ = 0, which yields a positive feedback strategy
The current understanding is that the three techniques yield
similar results
The main advantages of the above relevance feedback
techniques are simplicity and good results
Simplicity: modified term weights are computed directly from the
set of retrieved documents
Good results: the modified query vector does reflect a portion of
the intended query semantics (observed experimentally)
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 21
Relevance Feedback for the Probabilistic
Model
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 22
A Probabilistic Method
The probabilistic model ranks documents for a query q
according to the probabilistic ranking principle
The similarity of a document dj to a query q in the
probabilistic model can be expressed as
X P (ki |R) 1 − P (ki |R)
sim(dj , q) α log + log
1 − P (ki |R) P (ki |R)
k ∈q∧k ∈d
i i j
where
P (ki |R) stands for the probability of observing the term ki in the
set R of relevant documents
P (ki |R) stands for the probability of observing the term ki in the
set R of non-relevant docs
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 23
A Probabilistic Method
Initially, the equation above cannot be used because
P (ki |R) and P (ki |R) are unknown
Different methods for estimating these probabilities
automatically were discussed in Chapter 3
With user feedback information, these probabilities are
estimated in a slightly different way
For the initial search (when there are no retrieved
documents yet), assumptions often made include:
P (ki |R) is constant for all terms ki (typically 0.5)
the term probability distribution P (ki |R) can be approximated by
the distribution in the whole collection
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 24
A Probabilistic Method
These two assumptions yield:
ni
P (ki |R) = 0.5 P (ki |R) =
N
where ni stands for the number of documents in the
collection that contain the term ki
Substituting into similarity equation, we obtain
X N − ni
siminitial (dj , q) = log
ni
ki ∈q∧ki ∈dj
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 25
A Probabilistic Method
Let nr,i be the number of documents in set Dr that
contain the term ki
Then, the probabilities P (ki |R) and P (ki |R) can be
approximated by
nr,i ni − nr,i
P (ki |R) = P (ki |R) =
Nr N − Nr
Using these approximations, the similarity equation can
rewritten as
X nr,i N − Nr − (ni − nr,i )
sim(dj , q) = log + log
Nr − nr,i ni − nr,i
ki ∈q∧ki ∈dj
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 26
A Probabilistic Method
Notice that here, contrary to the Rocchio Method, no
query expansion occurs
The same query terms are reweighted using feedback
information provided by the user
The formula above poses problems for certain small
values of Nr and nr,i
For this reason, a 0.5 adjustment factor is often added
to the estimation of P (ki |R) and P (ki |R):
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 27
A Probabilistic Method
The main advantage of this feedback method is the
derivation of new weights for the query terms
The disadvantages include:
document term weights are not taken into account during the
feedback loop;
weights of terms in the previous query formulations are
disregarded; and
no query expansion is used (the same set of index terms in the
original query is reweighted over and over again)
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 28
Evaluation of Relevance Feedback
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 29
Evaluation of Relevance Feedback
Consider the modified query vector ~qm produced by
expanding ~q with relevant documents, according to the
Rocchio formula
Evaluation of ~qm :
Compare the documents retrieved by ~qm with the set of relevant
documents for ~q
In general, the results show spectacular improvements
However, a part of this improvement results from the higher ranks
assigned to the relevant docs used to expand ~q into q~m
Since the user has seen these docs already, such evaluation is
unrealistic
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 30
The Residual Collection
A more realistic approach is to evaluate ~qm considering
only the residual collection
We call residual collection the set of all docs minus the set of
feedback docs provided by the user
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 31
Explicit Feedback Through Clicks
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 32
Explicit Feedback Through Clicks
Web search engine users not only inspect the answers
to their queries, they also click on them
The clicks reflect preferences for particular answers in
the context of a given query
They can be collected in large numbers without
interfering with the user actions
The immediate question is whether they also reflect
relevance judgements on the answers
Under certain restrictions, the answer is affirmative as
we now discuss
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 33
Eye Tracking
Clickthrough data provides limited information on the
user behavior
One approach to complement information on user
behavior is to use eye tracking devices
Such commercially available devices can be used to
determine the area of the screen the user is focussed in
The approach allows correctly detecting the area of the
screen of interest to the user in 60-90% of the cases
Further, the cases for which the method does not work
can be determined
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 34
Eye Tracking
Eye movements can be classified in four types:
fixations, saccades, pupil dilation, and scan paths
Fixations are a gaze at a particular area of the screen
lasting for 200-300 milliseconds
This time interval is large enough to allow effective brain
capture and interpretation of the image displayed
Fixations are the ocular activity normally associated
with visual information acquisition and processing
That is, fixations are key to interpreting user behavior
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 35
Relevance Judgements
To evaluate the quality of the results, eye tracking is not
appropriate
This evaluation requires selecting a set of test queries
and determining relevance judgements for them
This is also the case if we intend to evaluate the quality
of the signal produced by clicks
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 36
User Behavior
Eye tracking experiments have shown that users scan
the query results from top to bottom
The users inspect the first and second results right
away, within the second or third fixation
Further, they tend to scan the top 5 or top 6 answers
thoroughly, before scrolling down to see other answers
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 37
User Behavior
Percentage of times each one of the top results was
viewed and clicked on by a user, for 10 test tasks and
29 subjects (Thorsten Joachims et al)
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 38
User Behavior
We notice that the users inspect the top 2 answers
almost equally, but they click three times more in the
first
This might be indicative of a user bias towards the
search engine
That is, that the users tend to trust the search engine in
recommending a top result that is relevant
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 39
User Behavior
This can be better understood by presenting test
subjects with two distinct result sets:
the normal ranking returned by the search engine and
a modified ranking in which the top 2 results have their positions
swapped
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 40
Clicks as a Metric of Preferences
Thus, it is clear that interpreting clicks as a direct
indicative of relevance is not the best approach
More promising is to interpret clicks as a metric of user
preferences
For instance, a user can look at a result and decide to
skip it to click on a result that appears lower
In this case, we say that the user prefers the result
clicked on to the result shown upper in the ranking
This type of preference relation takes into account:
the results clicked on by the user
the results that were inspected and not clicked on
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 41
Clicks within a Same Query
To interpret clicks as user preferences, we adopt the
following definitions
Given a ranking function R(qi , dj ), let rk be the kth ranked result
That is, r1 , r2 , r3 stand for the first, the second, and the third top
results, respectively
√
Further, let rk indicate that the user has clicked on the kth result
Define a preference function rk > rk−n , 0 < k − n < k, that states
that, according to the click actions of the user, the kth top result is
preferrable to the (k − n)th result
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 42
Clicks within a Same Query
To illustrate, consider the following example regarding
the click behavior of a user:
√ √ √
r1 r2 r3 r4 r5 r6 r7 r8 r9 r10
This behavior does not allow us to make definitive
statements about the relevance of results r3 , r5 , and r10
However, it does allow us to make statements on the
relative preferences of this user
Two distinct strategies to capture the preference
relations in this case are as follows.
√
Skip-Above: if rk then rk > rk−n , for all rk−n that was not clicked
√
Skip-Previous: if rk and rk−1 has not been clicked then rk > rk−1
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 43
Clicks within a Same Query
To illustrate, consider again the following example
regarding the click behavior of a user:
√ √ √
r1 r2 r3 r4 r5 r6 r7 r8 r9 r10
According to the Skip-Above strategy, we have:
r3 > r2 ; r3 > r1
And, according to the Skip-Previous strategy, we have:
r3 > r2
We notice that the Skip-Above strategy produces more
preference relations than the Skip-Previous strategy
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 44
Clicks within a Same Query
Empirical results indicate that user clicks are in
agreement with judgements on the relevance of results
in roughly 80% of the cases
Both the Skip-Above and the Skip-Previous strategies produce
preference relations
If we swap the first and second results, the clicks still reflect
preference relations, for both strategies
If we reverse the order of the top 10 results, the clicks still reflect
preference relations, for both strategies
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 45
Clicks within a Query Chain
The discussion above was restricted to the context of a
single query
However, in practice, users issue more than one query
in their search for answers to a same task
The set of queries associated with a same task can be
identified in live query streams
This set constitute what is referred to as a query chain
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 46
Clicks within a Query Chain
To illustrate, consider that two result sets in a same
query chain led to the following click actions:
r1 r√2 r3 r4 r5 √ r6 r7 r8 r9 r10
s1 s2 s3 s4 s5 s6 s7 s8 s9 s10
where
rj refers to an answer in the first result set
sj refers to an answer in the second result set
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 47
Clicks within a Query Chain
Two distinct strategies to capture the preference relations
in this case, are as follows
√
Top-One-No-Click-Earlier: if ∃ sk | sk then sj > r1 , for j ≤ 10.
√
Top-Two-No-Click-Earlier: if ∃ sk | sk then sj > r1 and sj > r2 , for
j ≤ 10
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 48
Clicks within a Query Chain
We notice that the second strategy produces twice
more preference relations than the first
These preference relations must be compared with the
relevance judgements of the human assessors
The following conclusions were derived:
Both strategies produce preference relations in agreement with
the relevance judgements in roughly 80% of the cases
Similar agreements are observed even if we swap the first and
second results
Similar agreements are observed even if we reverse the order of
the results
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 49
Clicks within a Query Chain
These results suggest:
The users provide negative feedback on whole result sets (by not
clicking on them)
The users learn with the process and reformulate better queries
on the subsequent iterations
Chap 05: Relevance Feedback and Query Expansion, Baeza-Yates & Ribeiro-Neto, Modern Information Retrieval, 2nd Edition – p. 50