
Article

Information Retrieval: Recent Advances and Beyond


Kailash Hambarde 1 *, and Hugo Proença 1,2

1 University of Beira Interior, Portugal; kailas.srt@gmail.com


2 IT: Instituto de Telecomunicações, Portugal; hugomcp@ubi.pt
* Correspondence: kailas.srt@gmail.com

Abstract: In this paper, we provide a detailed overview of the models used for information retrieval
in the first and second stages of the typical processing chain. We discuss the current state-of-the-art
models, including term-based, semantic retrieval, and neural methods. Additionally, we delve
into the key topics related to the learning process of these models. This survey thus offers a
comprehensive understanding of the field and is of interest to researchers and practitioners
entering or working in the information retrieval domain.

arXiv:2301.08801v1 [cs.IR] 20 Jan 2023

Keywords: information retrieval; dense representation; document retrieval; relevance ranking

Citation: Hambarde, K.; Proença, H. Information Retrieval: Recent Advances and Beyond. Journal Not
Specified 2021, 1, 0. https://doi.org/
Academic Editor: Amelia Carolina Sparavigna
Received: 21 November 2021
Accepted: 15 December 2021
Published:
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Copyright: © 2023 by the authors. Submitted to Journal Not Specified for possible open access
publication under the terms and conditions of the Creative Commons Attribution (CC BY) license
(https://creativecommons.org/licenses/by/4.0/).
Version January 24, 2023 submitted to Journal Not Specified https://www.mdpi.com/journal/notspecified

1. Introduction
Information Retrieval (IR) plays a vital role in daily life through its implementation in a
range of practical applications, including web search, question answering systems, personal
assistants, chatbots, and digital libraries, among others. The primary objective of IR is to
locate and retrieve information that is relevant to a user’s query. As multiple records may be
relevant, the results are often ranked according to their relevance score with respect to the
query.
Traditional text retrieval systems mainly rely on matching terms between the query and the
documents. However, term-based retrieval systems have several limitations, such as polysemy,
synonymy, and lexical gaps between the query and the documents [1]. In recent years,
advancements in computing power and the availability of large labeled datasets have greatly
impacted the field of Natural Language Processing (NLP) by enabling researchers to use deep
learning techniques for a variety of innovations. These techniques have been used to improve
traditional text retrieval systems and overcome the limitations of term-based retrieval.
However, these techniques require significant amounts of data and computing resources. As a
result, researchers are continually developing more advanced deep learning algorithms to meet
these demands and achieve better results in NLP tasks [2]. With the use of these algorithms, the
performance of IR systems has been greatly improved, leading to more accurate and efficient
retrieval of information for end users. Some of the advancements in deep learning that have been
applied to IR include neural network architectures such as convolutional neural networks (CNNs)
[3] and recurrent neural networks (RNNs) [4], as well as transfer learning and pre-training
techniques [5]. These methods have been used to improve the representation of text data and to
enhance the ability of IR systems to understand natural language queries. Additionally,
attention-based mechanisms such as the Transformer architecture [6] have been used to improve
the ability of IR systems to attend to important parts of the query and documents for matching.
Furthermore, pre-trained language models, such as BERT [5] and GPT-2 [7], have been shown to
improve the performance of IR systems by providing a better understanding of the semantics and
context of natural language queries and documents.
The term density plot in Figure 1 provides a visual representation of the relative frequency
of key terms in the surveyed literature, which can be used to gain insights into
current research trends in the field. Moreover, recent advancements in IR have also
focused on incorporating external knowledge to improve the relevance of the retrieved
information. For example, incorporating knowledge graph embeddings [8] into the IR
process can help link the query and documents to the relevant entities and concepts, thus
providing more accurate results. Furthermore, multi-modal information retrieval,
which combines text, image, and audio information, has also been shown to improve the
performance of IR systems [9].

Figure 1. Term map of information retrieval. Colors indicate the density of recent terms extracted
from survey papers.

Overall, advancements in deep learning techniques, large labeled datasets, and
high-performance computing have significantly improved the performance of IR systems and
made them more capable of handling the complexity of natural language queries. However,
there is still room for further improvements and research in the field. In this paper, we
aim to give a detailed overview of the models used for information retrieval in both the
first and second stages of the process. We will discuss current models, including those
based on terms, semantic retrieval and neural methods. We will also cover key topics
related to the learning of these models. Our survey paper aims to provide a comprehensive
understanding of the field and will be beneficial for researchers and practitioners in the
area of information retrieval.

2. Information Retrieval: Overview


The basic objective of modern Information Retrieval (IR) is to provide the most relevant
information to the end user in response to their query. This task can be divided into two parts: 1)
retrieval; and 2) ranking, as illustrated in Fig. 2. The first stage retrieves an initial set of
documents that are likely to be relevant to the query, and the second stage re-ranks the retrieved
documents based on their relevance scores.
In terms of retrieval, the goal is to retrieve a set of relevant documents from the
collection based on the user’s query. This is done by using various algorithms and models
such as the vector space model [10], Boolean model, Latent Semantic Indexing (LSI) [11],
Latent Dirichlet Allocation (LDA) [12] and, more recently, pre-trained models such as
BERT [5].

Figure 2. Overview of modern Information Retrieval system.

In the second stage, ranking, the main objective is to adjust the order of the initially
retrieved documents based on their relevance scores. The ranking stage typically employs different
models from those used in the retrieval stage, as its primary focus is on the effectiveness
of the results rather than on efficiency. Traditional models such as BM25
[13] are used as initial retrievers because they prioritize efficiency in recalling relevant
documents from a massive document pool. Conventional ranking models encompass a range of
techniques, including RankNet [14] and [15] for learning to rank, and DRMM
[16] and Duet [17] for neural ranking. These models leverage various techniques, including
reinforcement learning [18], contextual embeddings [19], and attention mechanisms [20], to
learn how to rank documents with respect to the user’s query.
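
The two-stage design described above can be illustrated with a minimal sketch. It assumes a
common BM25 variant with default parameters k1 = 1.2 and b = 0.75, whitespace tokenization, and
a three-document toy collection. The names BM25Retriever and rerank, and the scorer callable
standing in for a second-stage model, are hypothetical and not taken from the surveyed papers.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenization is a simplifying assumption of this sketch.
    return text.lower().split()


class BM25Retriever:
    """First stage: cheap lexical scoring over the whole collection."""

    def __init__(self, docs, k1=1.2, b=0.75):
        self.k1, self.b = k1, b
        self.tf = [Counter(tokenize(d)) for d in docs]
        self.length = [sum(tf.values()) for tf in self.tf]
        self.avg_length = sum(self.length) / len(docs)
        self.n_docs = len(docs)
        self.df = Counter()  # document frequency of each term
        for tf in self.tf:
            self.df.update(tf.keys())

    def score(self, query, doc_id):
        total = 0.0
        for term in tokenize(query):
            f = self.tf[doc_id].get(term, 0)
            if f == 0:
                continue
            idf = math.log(1.0 + (self.n_docs - self.df[term] + 0.5) / (self.df[term] + 0.5))
            norm = f + self.k1 * (1.0 - self.b + self.b * self.length[doc_id] / self.avg_length)
            total += idf * f * (self.k1 + 1.0) / norm
        return total

    def retrieve(self, query, k):
        scored = [(self.score(query, i), i) for i in range(self.n_docs)]
        matching = [(s, i) for s, i in scored if s > 0.0]
        matching.sort(key=lambda pair: (-pair[0], pair[1]))
        return [i for _, i in matching[:k]]


def rerank(query, candidates, scorer):
    """Second stage: reorder the small candidate set with a costlier scorer."""
    return sorted(candidates, key=lambda doc_id: -scorer(query, doc_id))


docs = [
    "neural ranking models for search",
    "cooking pasta at home",
    "search engines rank web pages",
]
retriever = BM25Retriever(docs)
candidates = retriever.retrieve("search ranking", k=2)
```

In practice the scorer passed to rerank would be a learned model. The point of the split is that
the expensive scorer is applied only to the k candidates rather than to the whole collection.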

3. Conventional Term-based Retrieval


Classical term-based retrieval, of which Boolean retrieval is the simplest form, is a traditional
approach to information retrieval that matches the terms in a query to the terms in a document
[21]. It is simple, fast, and easy to implement but has limitations such as the inability to
handle synonymy, polysemy, and context [22,23]. Despite these limitations, it is widely
used in practical applications, especially in small-scale information retrieval systems [21].
Other methods, such as the BIM [24] and the TDM [25], were introduced
to improve the accuracy and efficiency of retrieval systems. TF-IDF weighting, as
proposed by [26] and [27], is also an important concept in the field of IR.
Probabilistic and language modeling techniques have further been used
to improve retrieval performance, such as the Query Likelihood (QL) model proposed by
[28] and the divergence from randomness (DFR) approach introduced by [29].
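
As an illustration of the language modeling view, the sketch below ranks documents by the
log-likelihood of the query under a smoothed document language model. Dirichlet smoothing with
parameter mu is an assumption of this sketch (it is one common choice, not necessarily the
estimator used in [28]), and the function name and toy collection are hypothetical.

```python
import math
from collections import Counter


def query_log_likelihood(query_tokens, doc_tokens, collection_tf, collection_len, mu=2000.0):
    """Sum over query terms of log P(w | d), with P(w | d) Dirichlet-smoothed
    towards the collection model P(w | C) = collection_tf[w] / collection_len."""
    doc_tf = Counter(doc_tokens)
    score = 0.0
    for w in query_tokens:
        p_coll = collection_tf.get(w, 0) / collection_len
        if p_coll == 0.0:
            # Terms unseen in the whole collection are skipped in this sketch.
            continue
        score += math.log((doc_tf.get(w, 0) + mu * p_coll) / (len(doc_tokens) + mu))
    return score


collection = [
    ["information", "retrieval", "systems"],
    ["deep", "learning", "systems"],
]
collection_tf = Counter(token for doc in collection for token in doc)
collection_len = sum(len(doc) for doc in collection)
query = ["information", "systems"]
scores = [
    query_log_likelihood(query, doc, collection_tf, collection_len, mu=10.0)
    for doc in collection
]
```

The first document contains both query terms and therefore receives the higher score. The
second is not scored as impossible, because smoothing assigns the missing term a small
collection-based probability.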
3.1. Pioneering Methods for Semantic Retrieval


3.1.1. Query Augmentation
Early methods for semantic retrieval focused on using query expansion techniques
to improve information retrieval systems. One approach used global models to
identify and utilize concepts relevant to a query [30]. Another used lexical-semantic
relations to identify terms semantically related to the query [31]. The use of
query context to improve retrieval performance was also studied [32]. These methods
demonstrate the potential of query expansion based on concepts and lexical-semantic
relations. Another important approach to query
expansion is the use of local models, which incorporate feedback into the retrieval process.
A widely used local model is the Rocchio relevance feedback algorithm [33], which forms the
foundation for many modern relevance feedback techniques. [34] proposed incorporating
user feedback into information retrieval systems using a divergence minimization model.
[35] presented a method for improving the effectiveness of pseudo-relevance feedback (PRF)
in information retrieval systems. [36] compared various methods for
incorporating PRF into information retrieval systems, and [37] proposed a novel approach
for incorporating PRF using matrix factorization. These
studies show the potential of using feedback, whether explicit or pseudo, to improve retrieval
performance.
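
A minimal sketch of Rocchio-style feedback is given below. It assumes sparse term-weight
vectors stored as dictionaries, conventional but arbitrary values for alpha, beta, and gamma,
and the common practice of dropping terms whose weight becomes non-positive. The function name
and the toy vectors are hypothetical.

```python
from collections import defaultdict


def rocchio(query_vec, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector towards the centroid of relevant documents and
    away from the centroid of non-relevant ones."""
    updated = defaultdict(float)
    for term, weight in query_vec.items():
        updated[term] += alpha * weight
    for doc in relevant:
        for term, weight in doc.items():
            updated[term] += beta * weight / len(relevant)
    for doc in non_relevant:
        for term, weight in doc.items():
            updated[term] -= gamma * weight / len(non_relevant)
    # Assumption of this sketch: terms with non-positive weight are dropped.
    return {term: weight for term, weight in updated.items() if weight > 0.0}


query_vec = {"jaguar": 1.0}
relevant_docs = [{"jaguar": 1.0, "cat": 1.0}]
non_relevant_docs = [{"jaguar": 1.0, "car": 1.0}]
expanded = rocchio(query_vec, relevant_docs, non_relevant_docs)
```

In pseudo-relevance feedback, the relevant set would simply be the top-ranked documents of an
initial retrieval run and the non-relevant set is often left empty.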

3.1.2. Document Augmentation


Document expansion is a technique used in information retrieval (IR) systems to
improve the performance of retrieval by expanding the representation of each document.
The idea behind document expansion is that by including additional related terms in the
representation of each document, the IR system will be able to better match the query with
the relevant documents. [38] and [39] focus on the relationship between corpus structure,
language models, and ad-hoc information retrieval, with [39] proposing a novel approach
using clustering techniques. [40] compares the effectiveness of document expansion and
query expansion methods in ad-hoc information retrieval systems. [41] proposes a technique
for expanding the representation of each document using related terms to improve retrieval
performance. [42] expands documents using WordNet, a large lexical database of English.
[43] focuses on improving the retrieval of short texts by expanding the representation of
each document with semantically related terms. Lastly, [44] expands the representation of
each document using external collections like WordNet. These papers show a progression in
the field, with later papers building on the ideas presented in earlier papers and continuing
to explore new methods for document expansion in information retrieval systems.

3.1.3. Lexical Dependency Model


The field of information retrieval has seen a significant amount of research focused on
the use of lexical dependency models for document retrieval. One study that has contributed
to this area is [45], which compares the effectiveness of different automatic phrase indexing
methods. Another important contribution to this field is [26], which presents a thorough
analysis of various term weighting schemes and their impact on retrieval performance, and
proposes a novel term dependency weighting scheme that accounts for the relationship
between terms in a document.
Several other papers have built on the ideas presented in [26] and [45]. For example,
[46] proposes a new method for phrase-based retrieval using a vector space model (VSM)
and term dependency weighting scheme. Similarly, [47] presents a new approach to
information retrieval based on a general language model (LM) that incorporates term
dependency weighting. The use of probabilistic models for information retrieval has also
been explored in the literature, with [48] presenting a new probabilistic model that is
based on the vector space model (VSM) and incorporates term dependency weighting.
Another approach is the use of sentence trees to capture term
dependencies, as presented in [49]. [50] proposes a Dependence Language Model (DLM) that
extends the language modeling (LM) approach with term dependencies to improve
retrieval effectiveness. Furthermore, [51] presents an approach to relevance ranking
that combines kernels with the well-known BM25
retrieval function and describes how the kernel-based approach can model term
dependencies, similar to the schemes proposed in [26] and [46].

3.1.4. Topic Model


Topic models have been widely used in the field of information retrieval as a way to
improve the representation and indexing of text documents. One of the early contributions
to this area was [52], which introduced the Generalized Vector Space Model
(GVSM) for information retrieval. This model extends the traditional vector space model
(VSM) by relaxing the assumption that term vectors are pairwise orthogonal, so that
correlations between terms can be represented. Another important
contribution to this field is Latent Semantic Analysis (LSA) for information retrieval, which
was introduced in [11]. LSA can be used to analyze and represent the semantic structure
of a corpus of text documents for indexing and retrieval. In recent years, various topic
models have been proposed for information retrieval. For example, [38] describes how
corpus statistics can be used to improve ad-hoc information retrieval systems. [53] presents
a new smoothing technique for ad-hoc retrieval based on regularizing the retrieval scores
to account for the uncertainty in the language model. [54] introduces a new approach to
ad-hoc retrieval that utilizes Latent Dirichlet Allocation (LDA) for both the indexing and
the language modeling stages. [55] compares the performance of various topic models
for information retrieval, including LDA and PLSA. [56] provides a comparison of the
performance of PLSA and LDA on several benchmark datasets. However, not all studies
have found topic models to be effective for information retrieval. For example, [57] critiques
the performance of Latent Semantic Indexing (LSI) on TREC collections and argues that LSI
is not effective for this type of data.

3.1.5. Multilingual Retrieval Model


Multilingual retrieval models have been an area of active research in the field of infor-
mation retrieval. One of the key studies in this area is [58], which presents a new approach
to information retrieval that views the retrieval task as a statistical translation problem and
proposes a model to improve retrieval effectiveness. Another important contribution to this
field is [59], which presents a new approach to estimating statistical translation models for
ad-hoc information retrieval based on mutual information. [60] describes a new approach
to web search that utilizes click-through data to estimate translation models. In addition
to these studies, there have also been several papers that provide a theoretical analysis
of the translation language model for information retrieval. For example, [61] presents
an axiomatic analysis of the model, its properties and limitations. [62] describes a new
approach to query expansion that utilizes monolingual statistical machine translation (SMT)
to improve retrieval effectiveness. [63] describes a new approach to query expansion that
utilizes concept-based translation models and search logs to improve retrieval effectiveness.
These papers demonstrate the potential of using statistical translation models to
improve information retrieval systems.
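
The translation view of retrieval can be sketched as follows: each query term is generated by
first picking a document term from the document language model and then "translating" it into
the query term. The sketch assumes an unsmoothed maximum-likelihood document model and a
hand-written toy translation table. The function name, the table, and its probabilities are
hypothetical and serve only to illustrate the scoring rule.

```python
import math
from collections import Counter


def translation_lm_score(query_tokens, doc_tokens, trans_prob):
    """log P(q | d) with P(q_i | d) = sum_w t(q_i | w) * P(w | d), where
    trans_prob[(q_i, w)] holds t(q_i | w) and P(w | d) is the relative
    frequency of w in the document (no smoothing in this sketch)."""
    doc_tf = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    log_p = 0.0
    for q in query_tokens:
        p = sum(trans_prob.get((q, w), 0.0) * count / doc_len for w, count in doc_tf.items())
        if p == 0.0:
            return float("-inf")
        log_p += math.log(p)
    return log_p


# Toy translation table (assumed values): "automobile" can generate "car".
trans_prob = {("car", "car"): 0.8, ("car", "automobile"): 0.6}
score_related = translation_lm_score(["car"], ["automobile", "dealer"], trans_prob)
score_unrelated = translation_lm_score(["car"], ["fruit", "market"], trans_prob)
```

The first document never mentions "car" yet still receives a finite score, which is how
translation models bridge the lexical gap between queries and documents.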
4. First Stage: Retrieval


In this section, an extensive review of the literature pertaining to the first stage of infor-
mation retrieval is conducted. The review is divided into three categories, namely: sparse
techniques for semantic retrieval, state-of-the-art deep learning techniques for semantic
retrieval, and hybrid techniques.

4.1. Deep Learning Methods for Semantic Retrieval


Neural approaches for semantic retrieval, such as those utilizing CNNs and
RNNs, have been proposed as a means to improve the effectiveness of information retrieval
systems. These approaches represent text documents and queries in a
continuous vector space, so that similarity
measures such as cosine similarity or the dot product can be applied to the learned
representations to rank documents according to their
relevance to a given query [64,65]. Furthermore, these neural methods can
learn complex non-linear representations of text data, which can be utilized to improve
the performance of retrieval models [3,65]. Additionally, attention-based mechanisms such
as the Transformer architecture [6] have been used to improve the ability of IR systems to
attend to important parts of the query and documents for matching, and
pre-trained language models, such as BERT [5] and GPT-2 [7], have been shown to improve
the performance of IR systems by providing a better understanding of the semantics and
context of natural language queries and documents.
The use of neural methods in information retrieval has gained popularity in recent
years, as evidenced by the increasing number of publications in this field [65,66].
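
Ranking in a continuous vector space can be sketched in a few lines. The vectors below are
hand-written toy values standing in for the output of a learned encoder, and the function names
are hypothetical. The sketch only shows how a similarity measure turns embeddings into a
ranking.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    norm_u, norm_v = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot(u, v) / (norm_u * norm_v)


def rank_by_similarity(query_vec, doc_vecs, similarity=cosine):
    """Return document indices ordered from most to least similar."""
    return sorted(range(len(doc_vecs)), key=lambda i: -similarity(query_vec, doc_vecs[i]))


# Toy 2-dimensional embeddings (assumed values, not model output).
query_vec = [1.0, 0.0]
doc_vecs = [[0.0, 1.0], [2.0, 0.1], [1.0, 1.0]]
ranking_cosine = rank_by_similarity(query_vec, doc_vecs)
ranking_dot = rank_by_similarity(query_vec, doc_vecs, similarity=dot)
```

Cosine similarity ignores vector length whereas the dot product does not, so the two measures
can produce different rankings even though they agree on this toy example.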

4.1.1. Discrete Retrieval Methods


Sparse retrieval methods represent queries and documents as high-dimensional vectors in
which most entries are zero, typically with one dimension per vocabulary term. Because only a
small fraction of a collection is relevant to a given query, such representations can be stored
in an inverted index, and scoring can be restricted to the documents that share at least one
term with the query rather than the whole collection.
One approach that has been proposed is the use of term re-weighting methods, which
aim to assign higher weights to more relevant terms in the retrieval process. One such
method is "DeepTR" presented in [67], which learns to reweight terms based on their se-
mantic similarity using distributed representations. Another approach is the integration
of neural word embeddings into information retrieval systems, as described in [68]. The
authors propose a method for evaluating the effectiveness of different neural word embed-
dings in information retrieval and describe how these embeddings can be integrated into
retrieval models. Context-aware term weighting methods, such as "DeepCT" presented
in [69], aim to improve the effectiveness of the first-stage retrieval process by assigning
higher weights to more relevant terms based on their context. The authors propose a deep
learning-based method that learns to estimate the importance of terms in a sentence or
passage based on their context. "TDV" presented in [70] is another approach that focuses
on learning term discrimination in information retrieval. The authors propose a deep
learning-based method that learns to distinguish relevant terms from irrelevant terms based
on their context. Building on the concept of "DeepCT", "HDCT" proposed in [71] considers
term importance at multiple levels of granularity, from document level to sentence
level, to improve the effectiveness of the retrieval process. The efficiency of the "DeepCT"
approach is evaluated in [72] and compared with traditional term weighting methods in terms of
computational cost and retrieval performance.
Finally, [73] presents a conceptual framework for analyzing and comparing different
information retrieval techniques, particularly neural-based methods. They position
"DeepImpact" and "COIL" (contextualized inverted lists) along
two different dimensions of the framework, and propose "uniCOIL" as an approach that combines
the strengths of both.
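
What the term re-weighting methods above have in common is that a model assigns each document
term a weight, and these weights replace raw term frequencies in an inverted index. The sketch
below assumes the weights are already available (here they are hand-written toy values, not the
output of DeepCT or any other model) and that the score is the plain sum of matched weights. The
function names are hypothetical.

```python
from collections import defaultdict


def build_impact_index(doc_term_weights):
    """Map each term to a postings list of (doc_id, learned_weight)."""
    index = defaultdict(list)
    for doc_id, weights in enumerate(doc_term_weights):
        for term, weight in weights.items():
            if weight > 0.0:
                index[term].append((doc_id, weight))
    return index


def impact_search(index, query_terms, k=10):
    """Score documents by summing the stored weights of matching query terms."""
    scores = defaultdict(float)
    for term in query_terms:
        for doc_id, weight in index.get(term, []):
            scores[doc_id] += weight
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))[:k]


# Toy per-document term weights (assumed values standing in for model output).
doc_term_weights = [
    {"apple": 0.9, "pie": 0.4},
    {"apple": 0.2, "computer": 0.8},
]
index = build_impact_index(doc_term_weights)
results = impact_search(index, ["apple", "computer"])
```

Because scoring only touches the postings lists of the query terms, the learned weights improve
ranking quality without giving up the efficiency of inverted-index retrieval.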
Expansion
Information retrieval systems have been constantly evolving in order to
improve their effectiveness. One approach is to expand a given
document or query with predicted or generated text before retrieval.
[74] proposed a model called "Doc2Query",
which uses a neural sequence-to-sequence model to predict queries that a document is likely to
answer and appends them to the document before indexing. They showed that this approach
improves the effectiveness of document retrieval. [75] improved upon this
work by replacing the query prediction model with the pre-trained T5 model; the resulting
model, called "DocTTTTTQuery" (the five Ts allude to T5),
outperforms the previous model on the benchmark dataset. [76] proposed a model called
"GAR" (Generation-augmented Retrieval), which generates additional context for the given
question and retrieves relevant documents using the augmented query. They showed that this
approach improves the effectiveness of open-domain question answering. [77]
proposed a model called "UED" (Unified Expansion and Ranking model) that is trained on
a large corpus of text using a combination of supervised and unsupervised learning. They
showed that this model is able to effectively rank and expand passages. Taken together, [74–77]
show that expanding documents or queries with generated text
can improve the effectiveness of document retrieval, open-domain question
answering, and passage ranking.
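
The document expansion idea can be sketched independently of any particular generator. In the
sketch below, predict_queries stands for a trained sequence-to-sequence model. The rule-based
toy_query_predictor is a hypothetical stand-in used only to make the example runnable, and the
function names are not taken from the cited work.

```python
def expand_document(doc_text, predict_queries, num_queries=3):
    """Append predicted queries to the document text before it is indexed."""
    queries = predict_queries(doc_text)[:num_queries]
    if not queries:
        return doc_text
    return doc_text + " " + " ".join(queries)


def toy_query_predictor(doc_text):
    # Hypothetical stand-in for a learned query generation model.
    if "aspirin" in doc_text.lower():
        return ["how to treat a headache"]
    return []


original = "Aspirin relieves mild pain."
expanded = expand_document(original, toy_query_predictor)
```

After expansion, a term-based retriever can match the query term "headache" against this
document although the original text never uses that word.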
Expansion + Term Re-weighting
Recent research in information retrieval has combined expansion and term re-weighting
techniques to enhance the performance of retrieval systems. EPIC [78] uses a neural
network to predict the importance of terms in a document and then uses those terms to
retrieve additional information. SparTerm [79] learns a sparse term-based representation of a
document. SPLADE and SPLADE v2 [80] learn a sparse lexical representation of a document
and expand the query. DeepImpact [81] uses a neural network to learn
the impact of passages on the retrieval process. TILDEv2 [82] learns term-based
representations of passages and expands the query. SpaDE [83] learns a sparse representation
of a document and encodes the document using two encoders.
Sparse Representation Learning
Another approach that has been proposed to improve the effectiveness and efficiency of
information retrieval systems is the use of sparse
representations. [84] proposed semantic hashing, which uses a neural network
to map documents to compact binary codes that can be used for efficient approximate
nearest neighbor search. They showed that this approach achieves state-of-the-art
performance. [85] proposed SNRM, which learns a
sparse representation of a document, and showed that it improves the effectiveness
of information retrieval. [86] proposed UHD-BERT, which learns
ultra-high-dimensional sparse representations of documents, and showed
that it improves the effectiveness of full ranking. [87] proposed BPR,
which learns a semantic hash for each passage, and showed that it
improves the efficiency of open-domain question answering. [88] proposed
CCSA, which learns a composite code sparse representation
of a document, and showed that it improves the effectiveness of first-stage retrieval.
All these papers propose new methods that use sparse or binary representations to improve the
effectiveness and efficiency of information retrieval systems.
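
Why binary codes make search cheap can be seen in a short sketch. It assumes codes are obtained
by thresholding a real-valued vector at zero and are stored as Python integers, and that search
is an exhaustive Hamming-distance scan. The vectors are toy values rather than the output of a
trained hashing model, and all function names are hypothetical.

```python
def binarize(vec):
    """Turn a real-valued vector into a binary code, one bit per dimension."""
    code = 0
    for x in vec:
        code = (code << 1) | (1 if x > 0.0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")


def hamming_search(query_code, doc_codes, k=2):
    """Indices of the k documents closest to the query in Hamming distance."""
    order = sorted(range(len(doc_codes)), key=lambda i: (hamming(query_code, doc_codes[i]), i))
    return order[:k]


query_code = binarize([0.3, -1.2, 0.8, 0.1])
doc_codes = [0b1011, 0b0100, 0b1010]
nearest = hamming_search(query_code, doc_codes, k=2)
```

Comparing two codes takes one XOR and a bit count, and a 4-bit code occupies far less memory
than four floating-point values, which is the efficiency argument behind hashing-based
retrieval.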

4.2. Dense Retrieval Methods


In recent years, word embeddings have become a popular method for representing
documents and queries in information retrieval systems. These embeddings are dense,
continuous representations of words that capture their semantic meaning. Among
word-embedding-based methods, [89] uses word embeddings and a Fisher Vector aggregation
technique to represent
documents and queries, [90] uses bilingual word embeddings for cross-lingual information
retrieval, [91] uses word embeddings to represent short texts, [92] uses two different
embedding spaces to represent documents and queries, and [93] uses word embeddings to
represent context and potential responses for natural language generation. [94] and [95]
both use neural networks to map documents and queries to a continuous space and apply
a similarity measure to rank the documents, while [96] uses a neural network to extract
important phrases from documents and indexes them to improve retrieval efficiency.
[97] suggests using a retrieval component to gather relevant information and a generation
component to create a response, while [98] proposes using contextualized text embeddings
to improve first-stage retrieval. [99] suggests using multiple transformer-based models to
improve document retrieval.
One approach that has been gaining popularity is the use
of pre-trained transformer-based models for encoding questions and documents. [100]
presents "DC-BERT", a method for document retrieval that decouples the encoding of questions
and documents using a pre-trained transformer-based model. The authors
show that this approach improves the effectiveness of document retrieval. Similarly,
[101] presents a method for question answering that uses a neural retrieval component and
a cross-attention component to generate a response based on the retrieved passages. They
also propose a data augmentation method to improve the performance of the model.
Another approach that has been explored is the use of approximate nearest neighbor
search and negative contrastive learning for dense text retrieval. [102] presents a method
called ANCE, which encodes documents and queries with a neural network and
selects hard negatives for contrastive training from an approximate nearest neighbor index
of the corpus. [103] proposes a method for training dense retrieval models effectively and
efficiently, using a combination of hard and soft negative sampling, and optimizing the
balance between the retrieval and generation components of the model. [104] presents a
method for web search using a global weighted self-attention network, which improves the
effectiveness of web search. [105] presents a method for open-domain question answering
using dense passage retrieval and an optimized training approach, which improves the
effectiveness of open-domain question answering.
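
The role of negatives in contrastive training can be made concrete with a small sketch. It
assumes a softmax-style contrastive loss over one positive and several negative scores, and it
uses an exact dot-product scan as a stand-in for approximate nearest neighbor search. The
function names and toy vectors are hypothetical and do not reproduce the training procedure of
any cited paper.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def contrastive_loss(pos_score, neg_scores):
    """Negative log-probability of the positive under a softmax over all scores."""
    scores = [pos_score] + list(neg_scores)
    m = max(scores)  # subtract the maximum for numerical stability
    log_denominator = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_denominator - pos_score


def hardest_negatives(query_vec, doc_vecs, positive_ids, k=1):
    """Highest-scoring non-positive documents; exact search stands in for ANN here."""
    candidates = [i for i in range(len(doc_vecs)) if i not in positive_ids]
    candidates.sort(key=lambda i: -dot(query_vec, doc_vecs[i]))
    return candidates[:k]


query_vec = [1.0, 0.0]
doc_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
hard_ids = hardest_negatives(query_vec, doc_vecs, positive_ids={0}, k=1)
loss_easy = contrastive_loss(5.0, [0.0, 0.0])
loss_hard = contrastive_loss(5.0, [4.5, 0.0])
```

A negative that scores close to the positive yields a larger loss and hence a stronger training
signal, which is the motivation for mining hard negatives instead of sampling them at random.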
The papers [103], [104], [105], [106], [107], [108], [109], and [110] all propose different
methods for improving the performance of dense retrieval models. In addition to [103], [104],
and [105], discussed above, [106] uses
dense representations of phrases to improve performance, [107] uses balanced topic-aware
sampling to improve the quality of the training data, [108] uses hard negatives to improve
the model’s ability to handle difficult cases, [109] focuses on training dense retrieval
models for conversational scenarios with a limited number of examples, and [110] proposes
a method that is more robust to variations in the input data.
Recently, researchers have been exploring new methods to improve the performance
of dense retrieval models, which are used to retrieve relevant information from a large
collection of documents. In [111], the authors propose a new method for dense passage
retrieval by using a passage-centric similarity relation, which focuses on the relationship
between passages rather than individual words. [112] proposed a new
method for training ColBERT, a retrieval model built on a pre-trained transformer, for OpenQA
by using relevance-guided supervision. This approach focuses on training the model to learn the
relevance of passages to a given query, which is important for OpenQA tasks. In [113],
the authors propose a new method for training a multi-document reader and retriever for
open-domain question answering by using end-to-end training. This approach trains
the entire system, including the reader and retriever components, in an end-to-end
manner to improve performance. [114] proposes a new method for improving
query representations for dense retrieval by using pseudo-relevance feedback, which is a
technique to extract relevant information from a set of retrieved documents and use it to
improve the retrieval performance.
Following this, [115] proposes a new method for training dense retrieval
models by using pseudo-relevance feedback and multiple representations, which allows
the model to learn more robust representations of queries. To further improve performance,
[116] proposed a new method for training a discriminative semantic ranker for question
retrieval. This approach focuses on training the model to differentiate between relevant
and non-relevant documents, which is important for accurate retrieval.
Finally, in [117], the authors propose a new method for improving open-domain
passage retrieval by using representation decoupling, which separates the encoding of
passages and queries to improve the model’s ability to understand the relationships between
them. In [118], the authors introduce a new approach for training a dense passage retrieval
and re-ranking model. [119] presents a method for more efficiently training retrieval models
through the use of negative caching. [120] proposes a technique for training neural passage
retrieval models by utilizing multi-stage training and improved negative contrast. All these
methods were proposed to improve the effectiveness of dense retrieval models, and they
all showed promising results in their respective evaluations.
Knowledge Distillation is a technique for transferring knowledge from a larger, pre-trained
model (the teacher) to a smaller model (the student). This can be done to improve the
performance of the smaller model on a specific task, or to make the model more efficient in
terms of computational resources. Such methods have been proposed for tasks such as
document ranking [121], chat-bot systems [122], question answering [123] and large-scale
retrieval [124]. They include distilling from a pre-trained BERT model [121,124], and
transferring knowledge across different model architectures using a Margin-MSE
loss function [125]. Other papers explore the relationship between pre-trained models
and the effect of distilling knowledge from one model to another [126,127].
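To make the Margin-MSE idea concrete, the following minimal sketch computes the mean squared error between the student's and the teacher's score margins (score of the relevant passage minus score of the non-relevant passage) over a batch. The function and argument names are hypothetical, and the scores are plain Python lists rather than tensors.

```python
def margin_mse(student_pos, student_neg, teacher_pos, teacher_neg):
    """Mean squared error between student and teacher score margins."""
    errors = [
        ((sp - sn) - (tp - tn)) ** 2
        for sp, sn, tp, tn in zip(student_pos, student_neg, teacher_pos, teacher_neg)
    ]
    return sum(errors) / len(errors)


# Student margins are 1.0 and 0.0; teacher margins are 2.0 and 2.0.
loss = margin_mse(
    student_pos=[2.0, 1.0],
    student_neg=[1.0, 1.0],
    teacher_pos=[3.0, 2.0],
    teacher_neg=[1.0, 0.0],
)
```

Because only the margins are matched, the student is free to produce scores on a different scale from the teacher, which helps when the two architectures differ.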
Multi-vector Representation
In [128], the authors propose a method called MUPPET which uses multiple hops to
retrieve paragraphs that are relevant to a given question. In [129], the authors propose
a method called ME-BERT, which uses a combination of sparse, dense, and attentional
representations to improve text retrieval tasks. In [130], the authors propose a method
called ColBERT which uses contextualized late interaction to improve passage search. In
[131], the authors propose a method called COIL which uses contextualized inverted lists to
improve exact lexical match in information retrieval. In [132], the authors propose a method
that generates pseudo query embeddings to improve the representation of documents
for dense retrieval. In [133], the authors propose a method called DensePhrases, which
Version January 24, 2023 submitted to Journal Not Specified 10 of 26

uses phrase retrieval to learn passage retrieval. In [134], the authors propose a method
for pruning query embeddings to improve dense retrieval performance. In [135], the
authors propose a method for learning multi-view document representations to improve
open-domain dense retrieval. In [20], the authors propose a method for learning diverse
document representations by using deep query interactions. In [136], the authors propose a
method for using topic-grained text representations to improve document retrieval.
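As an illustration of multi-vector scoring, the sketch below implements a late-interaction score in the style of ColBERT [130]: each query token vector is matched to its most similar document token vector, and the maxima are summed. It is a simplified sketch with hand-written toy vectors; the function names are hypothetical and no learned encoder is involved.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def late_interaction_score(query_vecs, doc_vecs):
    """Sum, over query tokens, of the maximum similarity to any document token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.5, 0.5]]
doc_b = [[0.0, 1.0], [1.0, 0.0]]

score_a = late_interaction_score(query_vecs, doc_a)
score_b = late_interaction_score(query_vecs, doc_b)
```

Document token vectors can be computed and indexed offline, so only the query encoding and the cheap maximum-and-sum step remain at query time.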
Accelerate Interaction-based Models Accelerating interaction-based models for infor-
mation retrieval tasks is a crucial area of research in natural language processing. There
have been several recent studies that propose various methods for improving the efficiency
and accuracy of open-domain question answering and document retrieval. In [137,138], the
authors propose methods that incorporate the query term independence assumption to
improve retrieval and ranking using deep neural networks. [139] proposes a method for
efficient interaction-based neural ranking using locality sensitive hashing. [140] proposes a
method called Poly-encoders, which uses a combination of architectures and pre-training
strategies for fast and accurate multi-sentence scoring. [141] proposes a method called
MORES, which uses a modularized transformer-based ranking framework to improve doc-
ument retrieval. [142] proposes a method called PreTTR, which uses precomputing term
representations to improve document re-ranking for transformers. [143] proposes a method
called DeFormer, which decomposes pre-trained transformers for faster question answering.
[144] proposes a method called SPARTA, which uses sparse transformer matching retrieval
for efficient open-domain question answering.
Pre-training In recent years, there has been a growing interest in pre-training methods
for dense passage retrieval, which aims to improve the performance of retrieval systems
by leveraging large amounts of unannotated data. In this survey, we will discuss several
methods proposed in the literature that utilize pre-training to improve dense passage
retrieval. One line of research focuses on using latent retrieval for pre-training. For example,
in [145], the authors propose a method called ORQA, which uses latent retrieval for weakly
supervised open domain question answering. Similarly, in [146], the authors propose a
method called REALM, which uses retrieval-augmented language model pre-training to
improve performance. Another line of research focuses on using pre-training tasks to
improve embedding-based large-scale retrieval. For example, in [147], the authors propose
a method called BFS+WLP+MLM, which uses pre-training tasks to improve embedding-
based large-scale retrieval. Additionally, there are methods that use dense representation
fine-tuning for language models. For example, in [148], the authors propose a method called
Condenser, which uses dense representation fine-tuning for language models. Similarly, in
[149], the authors propose a method called coCondenser, which uses unsupervised corpus
aware language model pre-training for dense passage retrieval. Another line of research
focuses on using a weak decoder to pre-train a strong siamese encoder. For example, in
[150], the authors propose a method called SEED-Encoder, which uses a weak decoder to
pre-train a strong siamese encoder. Additionally, there are methods that use hyperlink
information for pre-training in ad-hoc retrieval. For example, in [151], the authors propose a
method called HARP, which uses hyperlink information for pre-training in ad-hoc retrieval.
Another line of research focuses on using contextual information to improve dense passage
retrieval. For example, in [152], the authors propose a method called ConTextual Mask
Auto-Encoder (CMAE), which uses contextual information to improve dense passage
retrieval. Additionally, there are methods that use a masked auto-encoder for pre-training
retrieval-oriented transformers. For example, in [153], the authors propose a method
called RetroMAE, which uses a masked auto-encoder for pre-training retrieval-oriented
transformers. Another line of research focuses on using a representation bottleneck for
pre-training dense passage retrieval models. For example, in [154], the authors propose a
method called SimLM, which uses representation bottleneck for pre-training dense passage
retrieval models. Lastly, there are methods that use lexicon-bottlenecked pre-training for
large-scale retrieval. For example, in [155], the authors propose a method called LexMAE,
which uses lexicon-bottlenecked pre-training for large-scale retrieval. Finally, [156] uses
a contrastive pre-training approach to learn a discriminative autoencoder
for dense retrieval. Overall, these pre-training methods demonstrate
promising results in improving dense passage retrieval performance. However, more
research is needed to fully understand
Zero-shot/Few-shot Learning Recent research in the field of zero-shot information
retrieval has been focused on finding ways to improve the performance and generalizability
of retrieval models. One common approach is the use of query generation, as seen in [157]
and QGen [158]. These methods aim to improve zero-shot retrieval by embedding queries
into a shared space, or by using synthetic question generation to improve passage retrieval.
Another approach is the use of synthetic pre-training, as suggested by [159], to improve
the robustness of neural retrieval models. Additionally, [160] presents a benchmark for
evaluating the performance of zero-shot retrieval models. Another approach is the use of
momentum adversarial domain-invariant representations as proposed by [161] for zero-
shot dense retrieval. [162] proposes DTR, which uses large dual encoders for generalizable
retrieval. [163] suggests utilizing out-of-domain semantics to improve zero-shot hybrid
retrieval models. [164] proposes InPars, a method that uses large language models for data
augmentation in information retrieval.
Moreover, contrastive learning has been proposed as a means of performing unsupervised
dense information retrieval in the Contriever method [165]. [166] proposes
GPL, a method that uses generative pseudo-labeling for unsupervised domain adaptation
of dense retrieval. [167] proposes a method called Spider, which enables unsupervised
passage retrieval. [168] proposes a method that utilizes contrastive pre-training to learn
embeddings for text and code. [169] proposes a method that disentangles the modeling of
domain and relevance for adaptable dense retrieval. Lastly, [170] proposes a method called
Promptagator, which performs few-shot dense retrieval from as few as 8 examples.
Overall, recent research in zero-shot information retrieval has been focused on finding
ways to improve the performance and generalizability of retrieval models by utilizing
various techniques such as query generation, synthetic pre-training, large language models,
and contrastive learning. These methods have been applied to tasks such as passage
retrieval, unsupervised domain adaptation, and dense retrieval.
Probing Analysis Several studies have been conducted to address the limitations of
dense low-dimensional retrieval, particularly when dealing with large index sizes. These
studies propose various methods such as redundancy elimination, benchmarking, incorporating
salient phrase information, entity-centric questions, and interpreting dense retrieval
as a mixture of topics as ways to improve the performance and interpretability of dense
retrieval models. For example, [171] delves into the challenges that arise from utilizing
dense low-dimensional retrieval for large index sizes and proposes solutions to mitigate
these limitations. Similarly, [172] proposes a method for unsupervised redundancy elimi-
nation to compress dense vectors for passage retrieval as a means of enhancing retrieval
performance. [160] proposed a benchmark to evaluate the performance of zero-shot infor-
mation retrieval models, while [173] explores the potential of incorporating salient phrase
information in dense retrieval to imitate the performance of sparse retrieval. [174] proposed
a benchmark of simple entity-centric questions to evaluate and challenge the performance
of dense retrievers, and [175] proposed a new approach to improve the effectiveness and
interpretability of dense retrieval by interpreting it as a mixture of topics.

4.3. Hybrid Retrieval Methods


Word-Embedding-based [90] proposed a method for linearly combining monolingual
and cross-lingual word embeddings to improve retrieval performance. [176] proposed
a GLM (Generalized Language Model) which utilizes word embeddings to improve the
retrieval performance. [177] proposed a method for combining word embeddings using
set operations to improve retrieval performance. [92] proposed a method for combining
word embeddings into a dual embedding space model (DESM) to improve retrieval per-
formance. [178] proposed a method for replacing term-based retrieval with k-NN search
and incorporating translation models and BM25 to improve retrieval performance. [179]
proposed a method for combining Bag-of-Words (BOW) and CNN (Convolutional Neural
Network) to improve question answering performance. [180] proposed a method called
DenSPI (Dense-Sparse Phrase Index) to improve question answering performance in real-
time. [181] proposed a method called SPARC (Sparse, Contextualized Representations)
to improve question answering performance in real-time. [99] proposed a method called
CoRT (Complementary Rankings from Transformers) which combines transformer-based
models with BM25 to improve retrieval performance. [129] proposed a method called ME-
Hybrid, which combines sparse and dense representations with attentional mechanisms
to improve retrieval performance. [182] proposed a method called CLEAR (Complement
Lexical Retrieval Model) which combines lexical and semantic residual embeddings to
improve retrieval performance. [183] proposed a hybrid approach that combines semantic
and lexical matching to improve recall of document retrieval systems. [73] proposed a
conceptual framework for information retrieval techniques, together with a model called
uniCOIL. [184] proposed a method called CORW (Contextualized
Offline Relevance Weighting) which improves the efficiency and effectiveness of
neural retrieval by utilizing context and relevance weighting. Overall, these papers show
that combining different methods, representations and architectures can lead to improved
performance of question answering and text retrieval systems. They propose different
methods such as hybrid representations, dense-sparse phrase index, sparse and dense
representations, attentional mechanisms, lexical and semantic embeddings, semantic and
lexical matching and offline relevance weighting. [185] proposed a method for predicting
efficiency and effectiveness trade-offs for dense vs. sparse retrieval strategies. [186] proposed
a method called Fast Forward Indexes which aims to improve the efficiency of document
ranking by using a forward index that stores pre-computed dense document representations
for fast look-up at ranking time. [187]
proposed a method called Representational Slicing which densifies sparse representations
for passage retrieval. [188] proposed a method called UnifieR which aims to improve the
efficiency and effectiveness of large-scale retrieval by unifying different retrieval strategies.
Overall, these papers show that different methods can be used to improve the effi-
ciency and effectiveness of text retrieval systems. They propose different methods such as
contextualized offline relevance weighting, predicting efficiency/effectiveness trade-offs,
fast forward indexes, densifying sparse representations, and unified retrieval for large-scale
retrieval.
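A simple way to see how sparse and dense signals can be combined is score interpolation. The sketch below is a minimal illustration, assuming min-max normalization of each score list and a single mixing weight; it is not the fusion scheme of any particular paper cited above. The names `min_max`, `hybrid_rank` and `weight`, and the toy scores, are hypothetical.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(sparse, dense, weight=0.5):
    """Rank documents by weight * sparse + (1 - weight) * dense (normalized scores)."""
    sparse_n, dense_n = min_max(sparse), min_max(dense)
    docs = set(sparse_n) | set(dense_n)
    fused = {
        d: weight * sparse_n.get(d, 0.0) + (1 - weight) * dense_n.get(d, 0.0)
        for d in docs
    }
    # Sort by descending fused score; break ties by document id for repeatability.
    return sorted(fused, key=lambda d: (-fused[d], d))


sparse = {"d1": 12.0, "d2": 8.0, "d3": 4.0}  # e.g. BM25-style scores
dense = {"d1": 0.2, "d2": 0.9, "d3": 0.6}    # e.g. embedding similarities

ranking = hybrid_rank(sparse, dense, weight=0.5)
```

Setting `weight` to 1.0 recovers the purely sparse ranking and 0.0 the purely dense one, so the weight can be tuned on held-out queries.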

5. Second Stage - Ranker


In recent years, there has been a growing interest in the use of neural networks for
IR ranking. These models, also known as deep ranking models, have been shown to
be highly effective in learning the relevance of documents from labeled training data.
One of the key advantages of using deep learning for IR ranking is the ability to learn
complex representations of the documents and queries, which can capture subtle semantic
relationships and handle large amounts of data [17]. The Vector Space Model (VSM) [10] is
a widely used algorithm in information retrieval (IR) that represents documents and queries
as vectors in a high-dimensional space and ranks documents based on their similarity to
the query. The basic idea is that the smaller the angle between the vectors, the more similar
the query and the document are. One of the first papers to propose the use of VSM for
IR was [26], which proposed a probabilistic model for IR and showed that it outperforms
the traditional Boolean model. Recently, there have been a number of papers that have
proposed using VSM in combination with other IR algorithms to improve performance,
such as [16,54,189–196].
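The angle-based ranking of the VSM can be sketched in a few lines. The example below assumes raw term-frequency vectors and whitespace tokenization, which is a simplification (practical systems typically use weighting such as tf-idf); the function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return document indices ordered by decreasing cosine similarity to the query."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(d.lower().split())), i) for i, d in enumerate(docs)]
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]


docs = [
    "neural ranking models",
    "cooking pasta at home",
    "ranking documents with neural models",
]
order = rank("neural ranking", docs)
```

The first document ranks above the third because it is shorter, so the same two matching terms make up a larger share of its vector and the angle to the query is smaller.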
Ad-hoc retrieval, where a user’s query is used to search a collection of documents
for relevance, is a fundamental task in the field of information retrieval [197,198]. The
task involves using a ranking algorithm to create an ordered list of documents from a
corpus, where the top-ranked documents are deemed the most pertinent to the user’s query.
However, the search request in ad-hoc retrieval is typically brief and may come from a user
with unclear intent, creating a vocabulary mismatch between the user’s query and the terms
present in the collection, hindering the re-ranker’s ability to match relevant documents
[199,200].
Recent research has focused on addressing these challenges by proposing novel ap-
proaches to ad-hoc retrieval. For example, [16] presents the deep relevance matching model
(DRMM) which addresses the challenges of relevance matching through a joint deep archi-
tecture at the query term level. Other papers propose incorporating topic models like LDA
[54] and graph neural networks [190] to leverage document-level word relationships for
ad-hoc retrieval. Additionally, pre-training methods such as PROP [191] and counterfactual
inference techniques [193] have been proposed to improve the performance of pre-trained
language models in IR tasks. Furthermore, [195] presents a new approach to improve the
performance of pre-trained language models in IR tasks by using established IR cues such
as exact term-matching.
Overall, the field of information retrieval is constantly evolving, with researchers
proposing new and innovative approaches to improve the performance of ad-hoc retrieval
and other IR tasks. The above-mentioned papers highlight the various approaches that
have been proposed in recent years to address the challenges of ad-hoc retrieval and
improve the performance of IR algorithms.

IR researchers have proposed a variety of approaches to optimize and improve
performance, with a particular focus on learning-to-rank (LTR) systems. One such approach is the
use of cascaded ranking models, as proposed in [201]. The authors propose integrating
feature costs into multi-stage LTR systems, which they demonstrate can increase both
efficiency and effectiveness. Another approach is the use of weak supervision, as proposed
in [193]. The authors present a method for training a neural ranking model using weak
supervision, where labels are obtained automatically without human annotators or external
resources.
[202] presents a novel approach for assessing relevance between a query and a docu-
ment in ad-hoc retrieval, which addresses the challenge of diverse relevance patterns. The
proposed method is a data-driven approach called Hierarchical Neural matching model
which consists of two stacked components: a local matching layer and a global decision
layer. [203] proposes a method for making transformer-based ranking models more
efficient by modularizing them into separate modules for text representation and interaction.
This makes the ranking process faster using offline pre-computed representations and
lightweight online interactions. Additionally, this modular design makes the model easier
to interpret and better understand the ranking process in transformer-based models.
Several studies have been conducted to make transformer-based models more efficient
and effective. One such study, by [141], proposed a modularized design for transformer-
based ranking models to improve efficiency by using offline pre-computed representations
and lightweight online interactions. This design also improved interpretability by pro-
viding insight into the ranking process. Another study, by [204], presented a method
for improving text retrieval performance using pre-trained deep language models (LMs)
through a technique called Localized Contrastive Estimation (LCE). The LCE approach aims
to address the issue that popular rerankers are not able to fully exploit improved retrieval
results when pre-trained deep LMs are used to improve search index. The experiments
showed that this approach significantly improves text retrieval performance. [125,205,206]
presented TK (Transformer-Kernel), a neural re-ranking model for ad-hoc search that uses a
small number of transformer layers to contextualize query and document word embeddings, and
a document-length enhanced kernel-pooling to score individual term interactions. The
model’s design aims to achieve an optimal balance between effectiveness and efficiency.
[207] proposed the Query-Directed Sparse attention method for document ranking tasks in
transformer models, called QDS-Transformer, which aims to improve efficiency while still
maintaining important properties for ranking such as local contextualization, hierarchical
representation, and query-oriented proximity matching.

[208] presented a
counterfactual inference framework for unbiased LTR using implicit feedback data. They
highlighted the issue of bias in using this type of data, specifically position bias in search
rankings, and its negative impact on LTR methods. To overcome this problem, the authors
proposed a Propensity-Weighted Ranking SVM which uses click models as the propensity
estimator. [209] discussed the issue of repeatability in document ranking experiments
using the open-source Lucene search engine. They pointed out that score ties during re-
trieval, which occur when multiple documents have the same ranking score, can lead to
non-deterministic rankings due to multi-threaded indexing. They suggested using external
collection document ids as a solution to ensure repeatability. [121] presented an approach
to improve the ranking performance of the late-interaction ColBERT model by applying
knowledge distillation. The authors used distillation to transfer knowledge from the
ColBERT model into a simpler dot product, which improves query latency and reduces storage
requirements, while still maintaining good effectiveness. [210] studied the cross-domain
transferability of deep neural ranking models in IR and found that they "catastrophically
forget" old knowledge when learning new knowledge from different domains, leading to
decreased performance. The study suggests that using a lifelong learning strategy with a
cross-domain regularizer can mitigate this problem and provides insights on how domain
characteristics impact the issue of catastrophic forgetting. [19] investigated the use of
contextualized language models (ELMo and BERT) for ad-hoc document ranking in information
retrieval. They proposed a new joint approach, CEDR, which incorporates BERT’s
classification vector into existing neural models and showed that it outperforms current
baselines. The paper also addressed practical challenges like maximum input length and
runtime performance. [211] suggested using weak supervision sources to train neural
ranking models. These sources provide pseudo query-document pairs that already exhibit
relevance, such as newswire headline-content pairs and encyclopedic heading-paragraph
pairs. The authors also proposed filters to eliminate irrelevant training samples. They
showed that these weak supervision sources are effective and that filtering can further
improve performance. This approach allows for more efficient and cost-effective training
of neural ranking models. [142] proposed PreTTR (Precomputing Transformer Term
Representations) which precomputes part of the document term representations at indexing
time and merges them with query representation at query time for faster ranking score
computation. Additionally, the authors propose a compression technique to reduce storage
requirements.

Another approach is using curriculum learning to train neural answer ranking models
[212]. The idea is to learn to identify easy correct answers before incorporating more
complex logic. Heuristics are proposed to estimate the difficulty of a sample and used to
build a training curriculum that gradually shifts to weighting all samples equally. This
approach has been shown to improve the performance of BERT and ConvKNRM models.
Standard neural rerankers have also been proposed to improve the efficiency of IR-based
QA systems [213]. These rerankers can reduce the number of sentence candidates for the
Transformer models without affecting accuracy, improving efficiency by up to 4 times.
This allows for the use of powerful Transformer models in real-world applications while
maintaining accuracy. Additionally, researchers have proposed new models for document
relevance ranking, such as models that build on the Deep Relevance Matching Model (DRMM) [214]
and use context-sensitive encodings throughout, inspired by another existing model, PACRR. These
encodings are used in multiple ways to match query and document inputs. Furthermore,
incorporating query term independence into neural IR models has also been proposed [137]
as a way to make them more efficient for retrieval from large collections. This approach
has been shown to result in little loss in result quality for Duet and CKNRM and a small
degradation in the case of BERT.
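The efficiency gain from the query term independence assumption comes from the fact that the query-document score decomposes into a sum of per-term scores, each of which can be precomputed and stored in an inverted index. The sketch below illustrates this under that assumption; the per-term scores are toy values standing in for the output of a neural model, and the function names are hypothetical.

```python
from collections import defaultdict


def build_index(term_scores):
    """Turn {doc_id: {term: score}} into an inverted index {term: {doc_id: score}}."""
    index = defaultdict(dict)
    for doc_id, scores in term_scores.items():
        for term, s in scores.items():
            index[term][doc_id] = s
    return index


def score_query(index, query_terms):
    """Score documents as the sum of precomputed per-term scores."""
    totals = defaultdict(float)
    for term in query_terms:
        for doc_id, s in index.get(term, {}).items():
            totals[doc_id] += s
    return dict(totals)


# Toy per-term scores; in practice these would be produced offline by the ranking model.
term_scores = {
    "d1": {"neural": 0.5, "ranking": 0.25},
    "d2": {"ranking": 0.5},
}
index = build_index(term_scores)
scores = score_query(index, ["neural", "ranking"])
```

At query time only look-ups and additions are needed, so the neural model does not have to be run on every query-document pair.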
[215] evaluates the transferability of BERT-based neural ranking models across five datasets
and finds that training on pseudo-labels can produce a competitive or better model than
transfer learning. However, it also highlights the need to improve the stability and effectiveness
of few-shot training, as it can degrade the performance of a pretrained model.
[216] proposes a new approach to document ranking called the context-aware Passage-
level Cumulative Gain (PCG) which aggregates relevance scores of passages and accounts
for context information. This work aims to improve ranking performance and provide more
explainability for document ranking.

6. Datasets
Table 2 presents a summary of the state-of-the-art datasets used in the survey paper.
The table includes datasets for various NLP tasks such as Passage-Retrieval [217], Bio-
Medical Information Retrieval [218–220], Question Answering [221–225], Tweet [226], News
Article [227,228], Argument Retrieval [229,230], Community QA [125], Entities [231], Fact
Checking [232–234] and Search Query [235]. The table lists the task, dataset name, domain,
and corpus size for each dataset. The datasets are organized into sections based on the task
they are used for and the table provides a brief description of each dataset, including its
source, as mentioned in the references, and the size of the corpus. This table can be used as
a reference for researchers to select appropriate datasets for their NLP tasks and to cite the
datasets used in their research.

Table 2. SOTA Datasets in IR. For each dataset, we provide the corresponding task, domain and
corpus.

Task                                Data-set                          Domain                   Corpus

Passage-Retrieval                   MS MARCO [217]                    Real Queries             8.84M
Bio-Medical Information Retrieval   CORD-19 [218]                     Scholarly Articles       171K
Bio-Medical Information Retrieval   BioASQ [219]                      Human-annotated QA       14.91M
Bio-Medical Information Retrieval   NFCorpus [220]                    Medical Information      3.6K
Question Answering                  Natural Questions [221]           Real Users QA            2.68M
Question Answering                  HotpotQA [222]                    Question Answering QA    5.23M
Question Answering                  FiQA-2018 [223]                   Financial Opinion QA     57K
Question Answering                  QASC [224]                        MCQ                      17M
Question Answering                  InsuranceQA [225]                 InsuranceQA              24K
Tweet                               Signal-1M(RT) [226]               Twitter QA               2.86M
News Article                        TREC-NEWS [227]                   Article                  595K
News Article                        TREC Robust [228]                 Article                  69.9
News Article                        MLSUM [236]                       News Articles            1.5M
Argument Retrieval                  Arg-Microtexts Synthesis [229]    Argument                 8.67K
Argument Retrieval                  Conversational Argument [230]     Conv. Argument           382K
Community QA                        CQADupstack [125]                 Community QA             457K
Entity                              DBPedia [231]                     Entity                   4.63M
Fact Checking                       Fever [232]                                                5.42M
Fact Checking                       Climate-FEVER [233]                                        5.42M
Fact Checking                       SciFact [234]                     claims climate-change    5K
Search Query                        MSLR-WEB10K [235]                 Search Queries           10K
Search Query                        MSLR-WEB30K [235]                 Search Queries           30K

7. Current Challenges and Further Directions


In this survey paper, we have provided an overview of the current state of the field of
information retrieval, including conventional term-based retrieval, pioneering methods for
semantic retrieval, query and document augmentation, lexical dependency model, topic
model, multilingual retrieval model, and deep learning methods for semantic retrieval. We
have also discussed the two stages of the retrieval process: retrieval and ranking.
Recent studies have delved into the utilization of pre-training objectives that are
specifically tailored to the task of information retrieval (IR). For instance, [145] proposed the
use of a large-scale document collection and the Inverse Cloze Task (ICT) for retrieval tasks,
as a method to improve performance. [147], on the other hand, proposed a method to capture
inner-page and inter-page semantic relations through the usage of Body First Selection
(BFS) and Wiki Link Prediction (WLP) for passage retrieval within question-answering (QA)
tasks. Additionally, [191] and [151] proposed the Representative Words Prediction (ROP)
objective for pre-training, which has shown to yield significant improvements. However,
it is still in the early stages of research to effectively design pre-training objectives that
are highly suited to the task of IR, thus providing ample room for further exploration and
experimentation.
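To illustrate how a retrieval-oriented pre-training objective can be built from unlabeled text, the sketch below constructs one Inverse Cloze Task example: a sentence is sampled from a passage and used as a pseudo-query, and the rest of the passage serves as the matching context. The `keep_prob` parameter (the probability of leaving the sentence inside its context) and the function name are assumptions made for this illustration.

```python
import random


def ict_example(sentences, rng, keep_prob=0.1):
    """Build one (pseudo-query, context) pair from a list of passage sentences."""
    i = rng.randrange(len(sentences))
    query = sentences[i]
    if rng.random() < keep_prob:
        # Occasionally keep the sentence so the model also sees exact-match evidence.
        context = list(sentences)
    else:
        context = sentences[:i] + sentences[i + 1:]
    return query, context


passage = [
    "Dense retrievers encode queries and passages as vectors.",
    "Relevance is estimated with a dot product.",
    "Pre-training can reduce the need for labeled data.",
]
query, context = ict_example(passage, random.Random(0), keep_prob=0.0)
```

A retriever is then trained to score the true context of each pseudo-query above the contexts of other pseudo-queries in the batch.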
Recent studies have highlighted some of the main challenges and future directions for
semantic retrieval that can be considered as a general view on the field.
Handling long-tail queries: A recent study has highlighted that handling long-tail
queries, which are queries that are infrequent or rare, is a major challenge for semantic
retrieval systems.
Leveraging pre-trained models: Pre-trained models, such as BERT [5], have been
shown to be effective in many natural language processing tasks and are being increasingly
used in semantic retrieval.
Handling multilingual retrieval: With the increasing amount of multilingual infor-
mation available on the web, handling multilingual retrieval has become an important
challenge for semantic retrieval.
As for the future directions, some recent papers have proposed: Developing more
sophisticated models for document representation, such as deep neural networks and
transformer-based models to improve the effectiveness of semantic retrieval [237].

8. Conclusions
This survey provided a comprehensive overview of the state-of-the-art for semantic
retrieval models, in the context of information retrieval. We covered a wide range of topics,
from the early semantic retrieval methods, to the most recent neural semantic retrieval
methods, discussing the connections between them. In terms of structure, our focus was on
the key IR topics: first-stage and second-stage retrieval, neural semantic retrieval model
learning. Additionally, the survey highlights the major difficulties and challenges in the
field, and points to promising directions for future research. Overall, this survey is expected
to be useful for researchers interested in this challenging topic, providing inspiration for
new ideas and further developments.

Author Contributions: Conceptualization, K.H. and H.P.; Writing – original draft, K.H.; Writing –
review & editing, K.H. and H.P.
Funding: This work is funded by FCT/MCTES through national funds and co-funded by EU funds
under the project UIDB/50008/2020.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data is contained within the article.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Croft, W.B.; Metzler, D.; Strohman, T. Search engines: Information retrieval in practice; Vol. 520, Addison-Wesley Reading, 2010.
2. Lin, J.; Nogueira, R.; Yates, A. Pretrained transformers for text ranking: Bert and beyond. Synthesis Lectures on Human Language
Technologies 2021, 14, 1–325.
3. Kim, Y. Convolutional Neural Networks for Sentence Classification. Proceedings of the 2014 Conference on Empirical Methods in
Natural Language Processing (EMNLP), 2014, pp. 1746–1751.
4. Liu, Y. Recurrent Convolutional Neural Networks for Text Classification. Proceedings of the 25th International Joint Conference on
Artificial Intelligence, 2016, pp. 2397–2403.
5. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human
Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4171–4186.
6. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need.
Advances in Neural Information Processing Systems, 2017, pp. 6000–6010.
7. Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language Models are Unsupervised Multitask Learners. OpenAI
2019.
8. Wang, Y.; Zhang, J.; Feng, J.; Chen, Z.; Liu, M. Knowledge Graph Embedding: A Survey of Approaches and Applications. IEEE
Transactions on Knowledge and Data Engineering 2017, 29, 2724–2743.
9. Wang, H.; Shi, X.; Chen, Z.; Yeung, D.Y.; Wong, W.k. Learning Multi-Modal Representations with Deep Generative Models.
Proceedings of the 27th International Joint Conference on Artificial Intelligence, 2018, pp. 4111–4118.
10. Salton, G.; Wong, A.; Yang, C.S. A vector space model for automatic indexing. Communications of the ACM 1975, 18, 613–620.
Version January 24, 2023 submitted to Journal Not Specified 18 of 26

11. Deerwester, S.; Dumais, S.T.; Furnas, G.W.; Landauer, T.K.; Harshman, R. Indexing by latent semantic analysis. Journal of the American
society for information science 1990, 41, 391–407.
12. Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent dirichlet allocation. Journal of machine Learning research 2003, 3, 993–1022.
13. Robertson, S.E.; Walker, S.; Jones, S.; Hancock-Beaulieu, M.M.; Gatford, M.; others. Okapi at TREC-3. NIST Special Publication SP 1995,
109, 109.
14. Burges, C.; Shaked, T.; Renshaw, E.; Lazier, A.; Deeds, M.; Hamilton, N.; Hullender, G. Learning to rank using gradient descent.
Proceedings of the 22nd international conference on Machine learning, 2005, pp. 89–96.
15. Burges, C.; Ragno, R.; Le, Q. Learning to rank with nonsmooth cost functions. Advances in neural information processing systems 2006,
19.
16. Guo, J.; Fan, Y.; Ai, Q.; Croft, W.B. A deep relevance matching model for ad-hoc retrieval. Proceedings of the 25th ACM international
on conference on information and knowledge management, 2016, pp. 55–64.
17. Mitra, B.; Diaz, F.; Craswell, N. Learning to match using local and distributed representations of text for web search. Proceedings of
the 26th international conference on world wide web, 2017, pp. 1291–1299.
18. Zhou, J.; Agichtein, E. Rlirank: Learning to rank with reinforcement learning for dynamic search. Proceedings of The Web Conference
2020, 2020, pp. 2842–2848.
19. MacAvaney, S.; Yates, A.; Cohan, A.; Goharian, N. CEDR: Contextualized embeddings for document ranking. Proceedings of the
42nd international ACM SIGIR conference on research and development in information retrieval, 2019, pp. 1101–1104.
20. Li, J.; Zeng, H.; Peng, L.; Zhu, J.; Liu, Z. Learning to rank method combining multi-head self-attention with conditional generative
adversarial nets. Array 2022, 15, 100205.
21. Manning, C.D.; Raghavan, P.; Schütze, H. Introduction to Information Retrieval; Cambridge University Press, 2008.
22. Baeza-Yates, R.; Ribeiro-Neto, B. Modern information retrieval; Pearson Education, 2011.
23. Ribeiro-Neto, B.; Baeza-Yates, R. Modern information retrieval; Pearson Education, 2011.
24. Robertson, S.E.; Jones, K.S. Relevance weighting of search terms. Journal of the American Society for Information science 1976, 27, 129–146.
25. Van Rijsbergen, C.J. A theoretical basis for the use of co-occurrence data in information retrieval. Journal of documentation 1977.
26. Salton, G.; Buckley, C. Term-weighting approaches in automatic text retrieval. Information processing & management 1988, 24, 513–523.
27. Salton, G. Developments in automatic text retrieval. Science 1991, pp. 974–980.
28. Ponte, J.M.; Croft, W.B. A language modeling approach to information retrieval. ACM SIGIR Forum. ACM New York, NY, USA, 1998,
Vol. 51, pp. 202–208.
29. Amati, G.; Van Rijsbergen, C.J. Probabilistic models of information retrieval based on measuring the divergence from randomness.
ACM Transactions on Information Systems (TOIS) 2002, 20, 357–389.
30. Qiu, Y.; Frei, H.P. Concept based query expansion. Proceedings of the 16th annual international ACM SIGIR conference on Research
and development in information retrieval, 1993, pp. 160–169.
31. Voorhees, E.M. Query expansion using lexical-semantic relations. SIGIR’94. Springer, 1994, pp. 61–69.
32. Bai, J.; Nie, J.Y.; Cao, G.; Bouchard, H. Using query contexts in information retrieval. Proceedings of the 30th annual international
ACM SIGIR conference on Research and development in information retrieval, 2007, pp. 15–22.
33. Rocchio, J. Relevance feedback in information retrieval. The Smart retrieval system-experiments in automatic document processing 1971,
pp. 313–323.
34. Zhai, C.; Lafferty, J. Model-based feedback in the language modeling approach to information retrieval. Proceedings of the tenth
international conference on Information and knowledge management, 2001, pp. 403–410.
35. Cao, G.; Nie, J.Y.; Gao, J.; Robertson, S. Selecting good expansion terms for pseudo-relevance feedback. Proceedings of the 31st
annual international ACM SIGIR conference on Research and development in information retrieval, 2008, pp. 243–250.
36. Lv, Y.; Zhai, C. A comparative study of methods for estimating query language models with pseudo feedback. Proceedings of the
18th ACM conference on Information and knowledge management, 2009, pp. 1895–1898.
37. Zamani, H.; Dadashkarimi, J.; Shakery, A.; Croft, W.B. Pseudo-relevance feedback based on matrix factorization. Proceedings of the
25th ACM international on conference on information and knowledge management, 2016, pp. 1483–1492.
38. Kurland, O.; Lee, L. Corpus structure, language models, and ad hoc information retrieval. Proceedings of the 27th annual international
ACM SIGIR conference on Research and development in information retrieval, 2004, pp. 194–201.
39. Liu, X.; Croft, W.B. Cluster-based retrieval using language models. Proceedings of the 27th annual international ACM SIGIR
conference on Research and development in information retrieval, 2004, pp. 186–193.
40. Billerbeck, B.; Zobel, J. Document expansion versus query expansion for ad-hoc retrieval. Proceedings of the 10th Australasian
Document Computing Symposium. Citeseer, 2005, pp. 34–41.
41. Tao, T.; Wang, X.; Mei, Q.; Zhai, C. Language model information retrieval with document expansion. Proceedings of the Human
Language Technology Conference of the NAACL, Main Conference, 2006, pp. 407–414.
42. Agirre, E.; Arregi, X.; Otegi, A. Document expansion based on WordNet for robust IR. Coling 2010: Posters, 2010, pp. 9–17.
43. Efron, M.; Organisciak, P.; Fenlon, K. Improving retrieval of short texts through document expansion. Proceedings of the 35th
international ACM SIGIR conference on Research and development in information retrieval, 2012, pp. 911–920.
44. Sherman, G.; Efron, M. Document expansion using external collections. Proceedings of the 40th International ACM SIGIR Conference
on Research and Development in Information Retrieval, 2017, pp. 1045–1048.
45. Fagan, J.L. Experiments in automatic phrase indexing for document retrieval: A comparison of syntactic and nonsyntactic methods; Cornell
University, 1988.
46. Mitra, M.; Buckley, C.; Singhal, A.; Cardie, C.; others. An analysis of statistical and syntactic phrases. RIAO, 1997, Vol. 97, pp.
200–214.
47. Song, F.; Croft, W.B. A general language model for information retrieval. Proceedings of the eighth international conference on
Information and knowledge management, 1999, pp. 316–321.
48. Jones, K.S.; Walker, S.; Robertson, S.E. A probabilistic model of information retrieval: development and comparative experiments:
Part 2. Information processing & management 2000, 36, 809–840.
49. Nallapati, R.; Allan, J. Capturing term dependencies using a language model based on sentence trees. Proceedings of the eleventh
international conference on Information and knowledge management, 2002, pp. 383–390.
50. Gao, J.; Nie, J.Y.; Wu, G.; Cao, G. Dependence language model for information retrieval. Proceedings of the 27th annual international
ACM SIGIR conference on Research and development in information retrieval, 2004, pp. 170–177.
51. Xu, J.; Li, H.; Zhong, C. Relevance ranking using kernels. Asia Information Retrieval Symposium. Springer, 2010, pp. 1–12.
52. Wong, S.M.; Ziarko, W.; Wong, P.C. Generalized vector spaces model in information retrieval. Proceedings of the 8th annual
international ACM SIGIR conference on Research and development in information retrieval, 1985, pp. 18–25.
53. Diaz, F. Regularizing ad hoc retrieval scores. Proceedings of the 14th ACM international conference on Information and knowledge
management, 2005, pp. 672–679.
54. Wei, X.; Croft, W.B. LDA-based document models for ad-hoc retrieval. Proceedings of the 29th annual international ACM SIGIR
conference on Research and development in information retrieval, 2006, pp. 178–185.
55. Yi, X.; Allan, J. A comparative study of utilizing topic models for information retrieval. European conference on information retrieval.
Springer, 2009, pp. 29–41.
56. Lu, Y.; Mei, Q.; Zhai, C. Investigating task performance of probabilistic topic models: an empirical study of PLSA and LDA. Information
Retrieval 2011, 14, 178–203.
57. Atreya, A.; Elkan, C. Latent semantic indexing (LSI) fails for TREC collections. ACM SIGKDD Explorations Newsletter 2011, 12, 5–10.
58. Berger, A.; Lafferty, J. Information retrieval as statistical translation. ACM SIGIR Forum. ACM New York, NY, USA, 1999, Vol. 51, pp.
219–226.
59. Karimzadehgan, M.; Zhai, C. Estimation of statistical translation models based on mutual information for ad hoc information
retrieval. Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval, 2010,
pp. 323–330.
60. Gao, J.; He, X.; Nie, J.Y. Clickthrough-based translation models for web search: from word models to phrase models. Proceedings of
the 19th ACM international conference on Information and knowledge management, 2010, pp. 1139–1148.
61. Karimzadehgan, M.; Zhai, C. Axiomatic analysis of translation language model for information retrieval. European Conference on
Information Retrieval. Springer, 2012, pp. 268–280.
62. Riezler, S.; Liu, Y. Query rewriting using monolingual statistical machine translation. Computational Linguistics 2010, 36, 569–582.
63. Gao, J.; Nie, J.Y. Towards concept-based translation models using search logs for query expansion. Proceedings of the 21st ACM
international conference on Information and knowledge management, 2012, pp. 1–10.
64. Shen, Y.; He, X.; Gao, J.; Deng, L. Latent semantic models with deep neural networks for information retrieval. Proceedings of the
37th international ACM SIGIR conference on Research & development in information retrieval. ACM, 2014, pp. 269–278.
65. Huang, P.S.; He, X.; Gao, J.; Deng, L.; Acero, A.; Heck, L. Learning deep structured semantic models for web search using
clickthrough data. Proceedings of the 22nd ACM international conference on Information & Knowledge Management. ACM, 2013,
pp. 2333–2338.
66. Wang, S.; Zhang, L.; Liu, T.Y. Neural information retrieval: A brief overview. ACM Computing Surveys (CSUR) 2018, 51, 1–32.
67. Zheng, G.; Callan, J. Learning to reweight terms with distributed representations. Proceedings of the 38th international ACM SIGIR
conference on research and development in information retrieval, 2015, pp. 575–584.
68. Zuccon, G.; Koopman, B.; Bruza, P.; Azzopardi, L. Integrating and evaluating neural word embeddings in information retrieval.
Proceedings of the 20th Australasian document computing symposium, 2015, pp. 1–8.
69. Dai, Z.; Callan, J. Context-aware sentence/passage term importance estimation for first stage retrieval. arXiv preprint arXiv:1910.10687
2019.
70. Frej, J.; Mulhem, P.; Schwab, D.; Chevallet, J.P. Learning term discrimination. Proceedings of the 43rd International ACM SIGIR
Conference on Research and Development in Information Retrieval, 2020, pp. 1993–1996.
71. Dai, Z.; Callan, J. Context-aware term weighting for first stage passage retrieval. Proceedings of the 43rd International ACM SIGIR
conference on research and development in Information Retrieval, 2020, pp. 1533–1536.
72. Mackenzie, J.; Dai, Z.; Gallagher, L.; Callan, J. Efficiency implications of term weighting for passage retrieval. Proceedings of the 43rd
International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020, pp. 1821–1824.
73. Lin, J.; Ma, X. A few brief notes on deepimpact, coil, and a conceptual framework for information retrieval techniques. arXiv preprint
arXiv:2106.14807 2021.
74. Nogueira, R.; Yang, W.; Lin, J.; Cho, K. Document expansion by query prediction. arXiv preprint arXiv:1904.08375 2019.
75. Nogueira, R.; Lin, J. From doc2query to docTTTTTquery. Online preprint 2019, 6.
76. Mao, Y.; He, P.; Liu, X.; Shen, Y.; Gao, J.; Han, J.; Chen, W. Generation-augmented retrieval for open-domain question answering.
arXiv preprint arXiv:2009.08553 2020.
77. Yan, M.; Li, C.; Bi, B.; Wang, W.; Huang, S. A unified pretraining framework for passage ranking and expansion. Proceedings of the
AAAI Conference on Artificial Intelligence, 2021, Vol. 35, pp. 4555–4563.
78. MacAvaney, S.; Nardini, F.M.; Perego, R.; Tonellotto, N.; Goharian, N.; Frieder, O. Expansion via prediction of importance with
contextualization. Proceedings of the 43rd International ACM SIGIR conference on research and development in Information
Retrieval, 2020, pp. 1573–1576.
79. Bai, Y.; Li, X.; Wang, G.; Zhang, C.; Shang, L.; Xu, J.; Wang, Z.; Wang, F.; Liu, Q. SparTerm: Learning term-based sparse representation
for fast text retrieval. arXiv preprint arXiv:2010.00768 2020.
80. Formal, T.; Lassance, C.; Piwowarski, B.; Clinchant, S. SPLADE v2: Sparse lexical and expansion model for information retrieval.
arXiv preprint arXiv:2109.10086 2021.
81. Mallia, A.; Khattab, O.; Suel, T.; Tonellotto, N. Learning passage impacts for inverted indexes. Proceedings of the 44th International
ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp. 1723–1727.
82. Zhuang, S.; Zuccon, G. Fast passage re-ranking with contextualized exact term matching and efficient passage expansion. arXiv
preprint arXiv:2108.08513 2021.
83. Choi, E.; Lee, S.; Choi, M.; Ko, H.; Song, Y.I.; Lee, J. SpaDE: Improving sparse representations using a dual document encoder for
first-stage retrieval. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022, pp.
272–282.
84. Salakhutdinov, R.; Hinton, G. Semantic hashing. International Journal of Approximate Reasoning 2009, 50, 969–978.
85. Zamani, H.; Dehghani, M.; Croft, W.B.; Learned-Miller, E.; Kamps, J. From neural re-ranking to neural ranking: Learning a sparse
representation for inverted indexing. Proceedings of the 27th ACM international conference on information and knowledge
management, 2018, pp. 497–506.
86. Jang, K.R.; Kang, J.; Hong, G.; Myaeng, S.H.; Park, J.; Yoon, T.; Seo, H. Ultra-High Dimensional Sparse Representations with
Binarization for Efficient Text Retrieval. arXiv preprint arXiv:2104.07198 2021.
87. Yamada, I.; Asai, A.; Hajishirzi, H. Efficient passage retrieval with hashing for open-domain question answering. arXiv preprint
arXiv:2106.00882 2021.
88. Lassance, C.; Formal, T.; Clinchant, S. Composite code sparse autoencoders for first stage retrieval. Proceedings of the 44th
International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp. 2136–2140.
89. Clinchant, S.; Perronnin, F. Aggregating continuous word embeddings for information retrieval. Proceedings of the workshop on
continuous vector space models and their compositionality, 2013, pp. 100–109.
90. Vulić, I.; Moens, M.F. Monolingual and cross-lingual information retrieval models based on (bilingual) word embeddings. Proceedings
of the 38th international ACM SIGIR conference on research and development in information retrieval, 2015, pp. 363–372.
91. Kenter, T.; De Rijke, M. Short text similarity with word embeddings. Proceedings of the 24th ACM international on conference on
information and knowledge management, 2015, pp. 1411–1420.
92. Mitra, B.; Nalisnick, E.; Craswell, N.; Caruana, R. A dual embedding space model for document ranking. arXiv preprint
arXiv:1602.01137 2016.
93. Henderson, M.; Al-Rfou, R.; Strope, B.; Sung, Y.H.; Lukács, L.; Guo, R.; Kumar, S.; Miklos, B.; Kurzweil, R. Efficient natural language
response suggestion for smart reply. arXiv preprint arXiv:1705.00652 2017.
94. Gillick, D.; Presta, A.; Tomar, G.S. End-to-end retrieval in continuous space. arXiv preprint arXiv:1811.08008 2018.
95. Karpukhin, V.; Oğuz, B.; Min, S.; Lewis, P.; Wu, L.; Edunov, S.; Chen, D.; Yih, W.t. Dense passage retrieval for open-domain question
answering. arXiv preprint arXiv:2004.04906 2020.
96. Seo, M.; Kwiatkowski, T.; Parikh, A.P.; Farhadi, A.; Hajishirzi, H. Phrase-indexed question answering: A new challenge for scalable
document comprehension. arXiv preprint arXiv:1804.07726 2018.
97. Lewis, P.; Perez, E.; Piktus, A.; Petroni, F.; Karpukhin, V.; Goyal, N.; Küttler, H.; Lewis, M.; Yih, W.t.; Rocktäschel, T.; others. Retrieval-
augmented generation for knowledge-intensive nlp tasks. Advances in Neural Information Processing Systems 2020, 33, 9459–9474.
98. Zhan, J.; Mao, J.; Liu, Y.; Zhang, M.; Ma, S. RepBERT: Contextualized text embeddings for first-stage retrieval. arXiv preprint
arXiv:2006.15498 2020.
99. Wrzalik, M.; Krechel, D. CoRT: Complementary Rankings from Transformers. arXiv preprint arXiv:2010.10252 2020.
100. Nie, P.; Zhang, Y.; Geng, X.; Ramamurthy, A.; Song, L.; Jiang, D. Dc-bert: Decoupling question and document for efficient contextual
encoding. Proceedings of the 43rd international ACM SIGIR conference on research and development in information retrieval, 2020,
pp. 1829–1832.
101. Yang, Y.; Jin, N.; Lin, K.; Guo, M.; Cer, D. Neural retrieval for question answering with cross-attention supervised data augmentation.
arXiv preprint arXiv:2009.13815 2020.
102. Xiong, L.; Xiong, C.; Li, Y.; Tang, K.F.; Liu, J.; Bennett, P.; Ahmed, J.; Overwijk, A. Approximate nearest neighbor negative contrastive
learning for dense text retrieval. arXiv preprint arXiv:2007.00808 2020.
103. Zhan, J.; Mao, J.; Liu, Y.; Zhang, M.; Ma, S. Learning to retrieve: How to train a dense retrieval model effectively and efficiently. arXiv
preprint arXiv:2010.10469 2020.
104. Shan, X.; Liu, C.; Xia, Y.; Chen, Q.; Zhang, Y.; Luo, A.; Luo, Y. BISON: BM25-weighted self-attention framework for multi-fields
document search. arXiv preprint arXiv:2007.05186 2020.
105. Qu, Y.; Ding, Y.; Liu, J.; Liu, K.; Ren, R.; Zhao, W.X.; Dong, D.; Wu, H.; Wang, H. RocketQA: An optimized training approach to dense
passage retrieval for open-domain question answering. arXiv preprint arXiv:2010.08191 2020.
106. Lee, J.; Sung, M.; Kang, J.; Chen, D. Learning dense representations of phrases at scale. arXiv preprint arXiv:2012.12624 2020.
107. Hofstätter, S.; Lin, S.C.; Yang, J.H.; Lin, J.; Hanbury, A. Efficiently teaching an effective dense retriever with balanced topic aware
sampling. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval,
2021, pp. 113–122.
108. Zhan, J.; Mao, J.; Liu, Y.; Guo, J.; Zhang, M.; Ma, S. Optimizing dense retrieval model training with hard negatives. Proceedings of the
44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp. 1503–1512.
109. Yu, S.; Liu, Z.; Xiong, C.; Feng, T.; Liu, Z. Few-shot conversational dense retrieval. Proceedings of the 44th International ACM SIGIR
Conference on Research and Development in Information Retrieval, 2021, pp. 829–838.
110. Li, Y.; Liu, Z.; Xiong, C.; Liu, Z. More robust dense retrieval with contrastive dual learning. Proceedings of the 2021 ACM SIGIR
International Conference on Theory of Information Retrieval, 2021, pp. 287–296.
111. Ren, R.; Lv, S.; Qu, Y.; Liu, J.; Zhao, W.X.; She, Q.; Wu, H.; Wang, H.; Wen, J.R. PAIR: Leveraging passage-centric similarity relation for
improving dense passage retrieval. arXiv preprint arXiv:2108.06027 2021.
112. Khattab, O.; Potts, C.; Zaharia, M. Relevance-guided supervision for openqa with colbert. Transactions of the Association for
Computational Linguistics 2021, 9, 929–944.
113. Singh, D.; Reddy, S.; Hamilton, W.; Dyer, C.; Yogatama, D. End-to-end training of multi-document reader and retriever for
open-domain question answering. Advances in Neural Information Processing Systems 2021, 34, 25968–25981.
114. Yu, H.; Xiong, C.; Callan, J. Improving query representations for dense retrieval with pseudo relevance feedback. arXiv preprint
arXiv:2108.13454 2021.
115. Wang, X.; Macdonald, C.; Tonellotto, N.; Ounis, I. Pseudo-relevance feedback for multiple representation dense retrieval. Proceedings
of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval, 2021, pp. 297–306.
116. Cai, Y.; Fan, Y.; Guo, J.; Zhang, R.; Lan, Y.; Cheng, X. A discriminative semantic ranker for question retrieval. Proceedings of the 2021
ACM SIGIR International Conference on Theory of Information Retrieval, 2021, pp. 251–260.
117. Wu, B.; Zhang, Z.; Wang, J.; Zhao, H. Representation Decoupling for Open-Domain Passage Retrieval. arXiv preprint arXiv:2110.07524
2021.
118. Ren, R.; Qu, Y.; Liu, J.; Zhao, W.X.; She, Q.; Wu, H.; Wang, H.; Wen, J.R. Rocketqav2: A joint training method for dense passage
retrieval and passage re-ranking. arXiv preprint arXiv:2110.07367 2021.
119. Lindgren, E.; Reddi, S.; Guo, R.; Kumar, S. Efficient Training of Retrieval Models using Negative Cache. Advances in Neural Information
Processing Systems 2021, 34, 4134–4146.
120. Lu, J.; Ábrego, G.H.; Ma, J.; Ni, J.; Yang, Y. Multi-stage training with improved negative contrast for neural passage retrieval.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021, pp. 6091–6103.
121. Lin, S.C.; Yang, J.H.; Lin, J. Distilling dense representations for ranking using tightly-coupled teachers. arXiv preprint arXiv:2010.11386
2020.
122. Vakili Tahami, A.; Ghajar, K.; Shakery, A. Distilling knowledge for fast retrieval-based chat-bots. Proceedings of the 43rd International
ACM SIGIR conference on research and development in information retrieval, 2020, pp. 2081–2084.
123. Izacard, G.; Grave, E. Distilling knowledge from reader to retriever for question answering. arXiv preprint arXiv:2012.04584 2020.
124. Lu, W.; Jiao, J.; Zhang, R. Twinbert: Distilling knowledge to twin-structured compressed bert models for large-scale retrieval.
Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020, pp. 2645–2652.
125. Hofstätter, S.; Althammer, S.; Schröder, M.; Sertkan, M.; Hanbury, A. Improving efficient neural ranking models with cross-architecture
knowledge distillation. arXiv preprint arXiv:2010.02666 2020.
126. Yang, S.; Seo, M. Is Retriever Merely an Approximator of Reader? arXiv preprint arXiv:2010.10999 2020.
127. Choi, J.; Jung, E.; Suh, J.; Rhee, W. Improving Bi-encoder Document Ranking Models with Two Rankers and Multi-teacher Distillation.
Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp.
2192–2196.
128. Feldman, Y.; El-Yaniv, R. Multi-hop paragraph retrieval for open-domain question answering. arXiv preprint arXiv:1906.06606 2019.
129. Luan, Y.; Eisenstein, J.; Toutanova, K.; Collins, M. Sparse, dense, and attentional representations for text retrieval. Transactions of the
Association for Computational Linguistics 2021, 9, 329–345.
130. Khattab, O.; Zaharia, M. Colbert: Efficient and effective passage search via contextualized late interaction over bert. Proceedings of
the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020, pp. 39–48.
131. Gao, L.; Dai, Z.; Callan, J. COIL: Revisit exact lexical match in information retrieval with contextualized inverted list. arXiv preprint
arXiv:2104.07186 2021.
132. Tang, H.; Sun, X.; Jin, B.; Wang, J.; Zhang, F.; Wu, W. Improving document representations by generating pseudo query embeddings
for dense retrieval. arXiv preprint arXiv:2105.03599 2021.
133. Lee, J.; Wettig, A.; Chen, D. Phrase retrieval learns passage retrieval, too. arXiv preprint arXiv:2109.08133 2021.
134. Tonellotto, N.; Macdonald, C. Query embedding pruning for dense retrieval. Proceedings of the 30th ACM International Conference
on Information & Knowledge Management, 2021, pp. 3453–3457.
135. Zhang, S.; Liang, Y.; Gong, M.; Jiang, D.; Duan, N. Multi-View Document Representation Learning for Open-Domain Dense Retrieval.
arXiv preprint arXiv:2203.08372 2022.
136. Du, M.; Li, S.; Yu, J.; Ma, J.; Ji, B.; Liu, H.; Lin, W.; Yi, Z. Topic-Grained Text Representation-Based Model for Document Retrieval.
International Conference on Artificial Neural Networks. Springer, 2022, pp. 776–788.
137. Mitra, B.; Rosset, C.; Hawking, D.; Craswell, N.; Diaz, F.; Yilmaz, E. Incorporating query term independence assumption for efficient
retrieval and ranking using deep neural networks. arXiv preprint arXiv:1907.03693 2019.
138. Mitra, B.; Hofstätter, S.; Zamani, H.; Craswell, N. Conformer-kernel with query term independence for document retrieval. arXiv
preprint arXiv:2007.10434 2020.
139. Ji, S.; Shao, J.; Yang, T. Efficient interaction-based neural ranking with locality sensitive hashing. The World Wide Web Conference,
2019, pp. 2858–2864.
140. Humeau, S.; Shuster, K.; Lachaux, M.A.; Weston, J. Poly-encoders: Transformer architectures and pre-training strategies for fast and
accurate multi-sentence scoring. arXiv preprint arXiv:1905.01969 2019.
141. Gao, L.; Dai, Z.; Callan, J. Modularized transfomer-based ranking framework. arXiv preprint arXiv:2004.13313 2020.
142. MacAvaney, S.; Nardini, F.M.; Perego, R.; Tonellotto, N.; Goharian, N.; Frieder, O. Efficient document re-ranking for transformers by
precomputing term representations. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in
Information Retrieval, 2020, pp. 49–58.
143. Cao, Q.; Trivedi, H.; Balasubramanian, A.; Balasubramanian, N. DeFormer: Decomposing pre-trained transformers for faster question
answering. arXiv preprint arXiv:2005.00697 2020.
144. Zhao, T.; Lu, X.; Lee, K. SPARTA: efficient open-domain question answering via sparse transformer matching retrieval. arXiv preprint
arXiv:2009.13013 2020.
145. Lee, K.; Chang, M.W.; Toutanova, K. Latent retrieval for weakly supervised open domain question answering. arXiv preprint
arXiv:1906.00300 2019.
146. Guu, K.; Lee, K.; Tung, Z.; Pasupat, P.; Chang, M. Retrieval augmented language model pre-training. International Conference on
Machine Learning. PMLR, 2020, pp. 3929–3938.
147. Chang, W.C.; Yu, F.X.; Chang, Y.W.; Yang, Y.; Kumar, S. Pre-training tasks for embedding-based large-scale retrieval. arXiv preprint
arXiv:2002.03932 2020.
148. Gao, L.; Callan, J. Is your language model ready for dense representation fine-tuning. arXiv preprint arXiv:2104.08253 2021.
149. Gao, L.; Callan, J. Unsupervised corpus aware language model pre-training for dense passage retrieval. arXiv preprint arXiv:2108.05540
2021.
150. Lu, S.; He, D.; Xiong, C.; Ke, G.; Malik, W.; Dou, Z.; Bennett, P.; Liu, T.Y.; Overwijk, A. Less is More: Pretrain a Strong Siamese Encoder
for Dense Text Retrieval Using a Weak Decoder. Proceedings of the 2021 Conference on Empirical Methods in Natural Language
Processing, 2021, pp. 2780–2791.
151. Ma, Z.; Dou, Z.; Xu, W.; Zhang, X.; Jiang, H.; Cao, Z.; Wen, J.R. Pre-training for ad-hoc retrieval: hyperlink is also you need.
Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021, pp. 1212–1221.
152. Wu, X.; Ma, G.; Lin, M.; Lin, Z.; Wang, Z.; Hu, S. Contextual mask auto-encoder for dense passage retrieval. arXiv preprint
arXiv:2208.07670 2022.
153. Liu, Z.; Shao, Y. RetroMAE: Pre-training Retrieval-oriented Transformers via Masked Auto-Encoder. arXiv preprint arXiv:2205.12035
2022.
154. Wang, L.; Yang, N.; Huang, X.; Jiao, B.; Yang, L.; Jiang, D.; Majumder, R.; Wei, F. Simlm: Pre-training with representation bottleneck
for dense passage retrieval. arXiv preprint arXiv:2207.02578 2022.
155. Shen, T.; Geng, X.; Tao, C.; Xu, C.; Huang, X.; Jiao, B.; Yang, L.; Jiang, D. LexMAE: Lexicon-Bottlenecked Pretraining for Large-Scale
Retrieval. arXiv preprint arXiv:2208.14754 2022.
156. Ma, X.; Zhang, R.; Guo, J.; Fan, Y.; Cheng, X. A Contrastive Pre-training Approach to Discriminative Autoencoder for Dense Retrieval.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022, Vol. 34, pp. 4314–4318.
157. Liang, D.; Xu, P.; Shakeri, S.; Santos, C.N.d.; Nallapati, R.; Huang, Z.; Xiang, B. Embedding-based zero-shot retrieval through query
generation. arXiv preprint arXiv:2009.10270 2020.
158. Ma, J.; Korotkov, I.; Yang, Y.; Hall, K.; McDonald, R. Zero-shot neural passage retrieval via domain-targeted synthetic question
generation. arXiv preprint arXiv:2004.14503 2020.
159. Reddy, R.G.; Yadav, V.; Sultan, M.A.; Franz, M.; Castelli, V.; Ji, H.; Sil, A. Towards robust neural retrieval models with synthetic
pre-training. arXiv preprint arXiv:2104.07800 2021.
160. Thakur, N.; Reimers, N.; Rücklé, A.; Srivastava, A.; Gurevych, I. BEIR: A heterogenous benchmark for zero-shot evaluation of
information retrieval models. arXiv preprint arXiv:2104.08663 2021.
161. Xin, J.; Xiong, C.; Srinivasan, A.; Sharma, A.; Jose, D.; Bennett, P.N. Zero-shot dense retrieval with momentum adversarial domain
invariant representations. arXiv preprint arXiv:2110.07581 2021.
162. Ni, J.; Qu, C.; Lu, J.; Dai, Z.; Ábrego, G.H.; Ma, J.; Zhao, V.Y.; Luan, Y.; Hall, K.B.; Chang, M.W.; others. Large dual encoders are
generalizable retrievers. arXiv preprint arXiv:2112.07899 2021.
163. Chen, T.; Zhang, M.; Lu, J.; Bendersky, M.; Najork, M. Out-of-Domain Semantics to the Rescue! Zero-Shot Hybrid Retrieval Models.
European Conference on Information Retrieval. Springer, 2022, pp. 95–110.
164. Bonifacio, L.; Abonizio, H.; Fadaee, M.; Nogueira, R. InPars: Data Augmentation for Information Retrieval using Large Language
Models. arXiv preprint arXiv:2202.05144 2022.
165. Izacard, G.; Caron, M.; Hosseini, L.; Riedel, S.; Bojanowski, P.; Joulin, A.; Grave, E. Towards unsupervised dense information retrieval
with contrastive learning. arXiv preprint arXiv:2112.09118 2021.
166. Wang, K.; Thakur, N.; Reimers, N.; Gurevych, I. Gpl: Generative pseudo labeling for unsupervised domain adaptation of dense
retrieval. arXiv preprint arXiv:2112.07577 2021.
167. Ram, O.; Shachaf, G.; Levy, O.; Berant, J.; Globerson, A. Learning to retrieve passages without supervision. arXiv preprint
arXiv:2112.07708 2021.
168. Neelakantan, A.; Xu, T.; Puri, R.; Radford, A.; Han, J.M.; Tworek, J.; Yuan, Q.; Tezak, N.; Kim, J.W.; Hallacy, C.; others. Text and code
embeddings by contrastive pre-training. arXiv preprint arXiv:2201.10005 2022.
169. Zhan, J.; Ai, Q.; Liu, Y.; Mao, J.; Xie, X.; Zhang, M.; Ma, S. Disentangled Modeling of Domain and Relevance for Adaptable Dense
Retrieval. arXiv preprint arXiv:2208.05753 2022.
170. Dai, Z.; Zhao, V.Y.; Ma, J.; Luan, Y.; Ni, J.; Lu, J.; Bakalov, A.; Guu, K.; Hall, K.B.; Chang, M.W. Promptagator: Few-shot dense retrieval
from 8 examples. arXiv preprint arXiv:2209.11755 2022.
171. Reimers, N.; Gurevych, I. The curse of dense low-dimensional information retrieval for large index sizes. arXiv preprint
arXiv:2012.14210 2020.
172. Ma, X.; Li, M.; Sun, K.; Xin, J.; Lin, J. Simple and effective unsupervised redundancy elimination to compress dense vectors for
passage retrieval. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021, pp. 2854–2859.
173. Chen, X.; Lakhotia, K.; Oğuz, B.; Gupta, A.; Lewis, P.; Peshterliev, S.; Mehdad, Y.; Gupta, S.; Yih, W.t. Salient Phrase Aware Dense
Retrieval: Can a Dense Retriever Imitate a Sparse One? arXiv preprint arXiv:2110.06918 2021.
174. Sciavolino, C.; Zhong, Z.; Lee, J.; Chen, D. Simple entity-centric questions challenge dense retrievers. arXiv preprint arXiv:2109.08535
2021.
175. Zhan, J.; Mao, J.; Liu, Y.; Guo, J.; Zhang, M.; Ma, S. Interpreting Dense Retrieval as Mixture of Topics. arXiv preprint arXiv:2111.13957
2021.
176. Ganguly, D.; Roy, D.; Mitra, M.; Jones, G.J. Word embedding based generalized language model for information retrieval. Proceedings
of the 38th international ACM SIGIR conference on research and development in information retrieval, 2015, pp. 795–798.
177. Roy, D.; Ganguly, D.; Mitra, M.; Jones, G.J. Representing documents and queries as sets of word embedded vectors for information
retrieval. arXiv preprint arXiv:1606.07869 2016.
178. Boytsov, L.; Novak, D.; Malkov, Y.; Nyberg, E. Off the beaten path: Let’s replace term-based retrieval with k-nn search. Proceedings
of the 25th ACM international on conference on information and knowledge management, 2016, pp. 1099–1108.
179. Dos Santos, C.; Barbosa, L.; Bogdanova, D.; Zadrozny, B. Learning hybrid representations to retrieve semantically equivalent
questions. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint
Conference on Natural Language Processing (Volume 2: Short Papers), 2015, pp. 694–699.
180. Seo, M.; Lee, J.; Kwiatkowski, T.; Parikh, A.P.; Farhadi, A.; Hajishirzi, H. Real-time open-domain question answering with dense-sparse
phrase index. arXiv preprint arXiv:1906.05807 2019.
181. Lee, J.; Seo, M.; Hajishirzi, H.; Kang, J. Contextualized sparse representations for real-time open-domain question answering. arXiv
preprint arXiv:1911.02896 2019.
182. Gao, L.; Dai, Z.; Chen, T.; Fan, Z.; Van Durme, B.; Callan, J. Complement lexical retrieval model with semantic residual embeddings.
European Conference on Information Retrieval. Springer, 2021, pp. 146–160.
183. Kuzi, S.; Zhang, M.; Li, C.; Bendersky, M.; Najork, M. Leveraging semantic and lexical matching to improve the recall of document
retrieval systems: A hybrid approach. arXiv preprint arXiv:2010.01195 2020.
184. Chen, X.; He, B.; Hui, K.; Wang, Y.; Sun, L.; Sun, Y. Contextualized Offline Relevance Weighting for Efficient and Effective Neural
Retrieval. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval,
2021, pp. 1617–1621.
185. Arabzadeh, N.; Yan, X.; Clarke, C.L. Predicting Efficiency/Effectiveness Trade-offs for Dense vs. Sparse Retrieval Strategy Selection.
Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021, pp. 2862–2866.
186. Leonhardt, J.; Rudra, K.; Khosla, M.; Anand, A.; Anand, A. Fast forward indexes for efficient document ranking. arXiv preprint
arXiv:2110.06051 2021.
187. Lin, S.C.; Lin, J. Densifying Sparse Representations for Passage Retrieval by Representational Slicing. arXiv preprint arXiv:2112.04666
2021.
188. Shen, T.; Geng, X.; Tao, C.; Xu, C.; Zhang, K.; Jiang, D. UnifieR: A Unified Retriever for Large-Scale Retrieval. arXiv preprint
arXiv:2205.11194 2022.
189. Desai, D.; Ghadge, A.; Wazare, R.; Bagade, J. A Comparative Study of Information Retrieval Models for Short Document Summaries.
In Computer Networks and Inventive Communication Technologies; Springer, 2022; pp. 547–562.
190. Zhang, Y.; Zhang, J.; Cui, Z.; Wu, S.; Wang, L. A graph-based relevance matching model for ad-hoc retrieval. Proceedings of the
AAAI Conference on Artificial Intelligence, 2021, Vol. 35, pp. 4688–4696.
191. Ma, X.; Guo, J.; Zhang, R.; Fan, Y.; Ji, X.; Cheng, X. PROP: pre-training with representative words prediction for ad-hoc retrieval.
Proceedings of the 14th ACM International Conference on Web Search and Data Mining, 2021, pp. 283–291.
192. Khodabakhsh, M.; Bagheri, E. Qualitative Measures for Ad hoc Table Retrieval. Information Sciences 2022.
193. Agarwal, A.; Takatsu, K.; Zaitsev, I.; Joachims, T. A general framework for counterfactual learning-to-rank. Proceedings of the 42nd
International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019, pp. 5–14.
194. Ai, Q.; Wang, X.; Bruch, S.; Golbandi, N.; Bendersky, M.; Najork, M. Learning groupwise multivariate scoring functions using deep
neural networks. Proceedings of the 2019 ACM SIGIR international conference on theory of information retrieval, 2019, pp. 85–92.
195. Boualili, L.; Moreno, J.G.; Boughanem, M. MarkedBERT: Integrating traditional IR cues in pre-trained language models for passage
retrieval. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020,
pp. 1977–1980.
196. Capannini, G.; Lucchese, C.; Nardini, F.M.; Orlando, S.; Perego, R.; Tonellotto, N. Quality versus efficiency in document scoring with
learning-to-rank models. Information Processing & Management 2016, 52, 1161–1177.
197. Baeza-Yates, R.; Ribeiro-Neto, B.; others. Modern information retrieval; Vol. 463, ACM press New York, 1999.
198. Mitra, B.; Craswell, N. Neural models for information retrieval. arXiv preprint arXiv:1705.01509 2017.
199. Furnas, G.W.; Landauer, T.K.; Gomez, L.M.; Dumais, S.T. The vocabulary problem in human-system communication. Communications
of the ACM 1987, 30, 964–971.
200. Zhao, L.; Callan, J. Term necessity prediction. Proceedings of the 19th ACM international conference on Information and knowledge
management, 2010, pp. 259–268.
201. Chen, R.C.; Gallagher, L.; Blanco, R.; Culpepper, J.S. Efficient cost-aware cascade ranking in multi-stage retrieval. Proceedings of the
40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017, pp. 445–454.
202. Fan, Y.; Guo, J.; Lan, Y.; Xu, J.; Zhai, C.; Cheng, X. Modeling diverse relevance patterns in ad-hoc retrieval. The 41st international
ACM SIGIR conference on research & development in information retrieval, 2018, pp. 375–384.
203. Gao, L.; Dai, Z.; Callan, J. EARL: Speedup transformer-based rankers with pre-computed representation. CoRR abs/2004.13313 2020.
204. Gao, L.; Dai, Z.; Callan, J. Rethink training of BERT rerankers in multi-stage retrieval pipeline. European Conference on Information
Retrieval. Springer, 2021, pp. 280–286.
205. Hofstätter, S.; Zlabinger, M.; Hanbury, A. TU Wien@ TREC Deep Learning’19–Simple Contextualization for Re-ranking. arXiv preprint
arXiv:1912.01385 2019.
206. Hofstätter, S.; Zlabinger, M.; Hanbury, A. Interpretable & time-budget-constrained contextualization for re-ranking. arXiv preprint
arXiv:2002.01854 2020.
207. Jiang, J.Y.; Xiong, C.; Lee, C.J.; Wang, W. Long document ranking with query-directed sparse transformer. arXiv preprint
arXiv:2010.12683 2020.
208. Joachims, T.; Swaminathan, A.; Schnabel, T. Unbiased learning-to-rank with biased feedback. Proceedings of the tenth ACM
international conference on web search and data mining, 2017, pp. 781–789.
209. Lin, J.; Yang, P. The impact of score ties on repeatability in document ranking. Proceedings of the 42nd International ACM SIGIR
Conference on Research and Development in Information Retrieval, 2019, pp. 1125–1128.
210. Lovón-Melgarejo, J.; Soulier, L.; Pinel-Sauvagnat, K.; Tamine, L. Studying catastrophic forgetting in neural ranking models. European
Conference on Information Retrieval. Springer, 2021, pp. 375–390.
211. MacAvaney, S.; Yates, A.; Hui, K.; Frieder, O. Content-based weak supervision for ad-hoc re-ranking. Proceedings of the 42nd
International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019, pp. 993–996.
212. MacAvaney, S.; Nardini, F.M.; Perego, R.; Tonellotto, N.; Goharian, N.; Frieder, O. Training curricula for open domain answer
re-ranking. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval,
2020, pp. 529–538.
213. Matsubara, Y.; Vu, T.; Moschitti, A. Reranking for efficient transformer-based answer selection. Proceedings of the 43rd international
ACM SIGIR conference on research and development in information retrieval, 2020, pp. 1577–1580.
214. McDonald, R.; Brokos, G.I.; Androutsopoulos, I. Deep relevance ranking using enhanced document-query interactions. arXiv preprint
arXiv:1809.01682 2018.
215. Mokrii, I.; Boytsov, L.; Braslavski, P. A systematic evaluation of transfer learning and pseudo-labeling with bert-based ranking models.
Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp.
2081–2085.
216. Wu, Z.; Mao, J.; Liu, Y.; Zhan, J.; Zheng, Y.; Zhang, M.; Ma, S. Leveraging passage-level cumulative gain for document ranking.
Proceedings of The Web Conference 2020, 2020, pp. 2421–2431.
217. Nguyen, T.; Rosenberg, M.; Song, X.; Gao, J.; Tiwary, S.; Majumder, R.; Deng, L. MS MARCO: A human generated machine reading
comprehension dataset. CoCo@NIPS, 2016.
218. Wang, L.L.; Lo, K.; Chandrasekhar, Y.; Reas, R.; Yang, J.; Eide, D.; Funk, K.; Kinney, R.; Liu, Z.; Merrill, W.; others. Cord-19: The
covid-19 open research dataset. ArXiv 2020.
219. Tsatsaronis, G.; Balikas, G.; Malakasiotis, P.; Partalas, I.; Zschunke, M.; Alvers, M.R.; Weissenborn, D.; Krithara, A.; Petridis, S.;
Polychronopoulos, D.; others. An overview of the BIOASQ large-scale biomedical semantic indexing and question answering
competition. BMC bioinformatics 2015, 16, 1–28.
220. Boteva, V.; Gholipour, D.; Sokolov, A.; Riezler, S. A Full-Text Learning to Rank Dataset for Medical Information Retrieval. European
Conference on Information Retrieval, 2016.
221. Kwiatkowski, T.; Palomaki, J.; Redfield, O.; Collins, M.; Parikh, A.; Alberti, C.; Epstein, D.; Polosukhin, I.; Devlin, J.; Lee, K.; others.
Natural questions: a benchmark for question answering research. Transactions of the Association for Computational Linguistics 2019,
7, 453–466.
222. Yang, Z.; Qi, P.; Zhang, S.; Bengio, Y.; Cohen, W.W.; Salakhutdinov, R.; Manning, C.D. HotpotQA: A dataset for diverse, explainable
multi-hop question answering. arXiv preprint arXiv:1809.09600 2018.
223. Maia, M.; Handschuh, S.; Freitas, A.; Davis, B.; McDermott, R.; Zarrouk, M.; Balahur, A. Www’18 open challenge: financial opinion
mining and question answering. Companion proceedings of the web conference 2018, 2018, pp. 1941–1942.
224. Khot, T.; Clark, P.; Guerquin, M.; Jansen, P.; Sabharwal, A. Qasc: A dataset for question answering via sentence composition.
Proceedings of the AAAI Conference on Artificial Intelligence, 2020, Vol. 34, pp. 8082–8090.
225. Feng, M.; Xiang, B.; Glass, M.R.; Wang, L.; Zhou, B. Applying deep learning to answer selection: A study and an open task. 2015
IEEE workshop on automatic speech recognition and understanding (ASRU). IEEE, 2015, pp. 813–820.
226. Suarez, A.; Albakour, D.; Corney, D.; Martinez, M.; Esquivel, J. A data collection for evaluating the retrieval of related tweets to news
articles. European Conference on Information Retrieval. Springer, 2018, pp. 780–786.
227. Soboroff, I.; Huang, S.; Harman, D. TREC 2018 News Track Overview. TREC, 2018.
228. Voorhees, E.M. The TREC robust retrieval track. ACM SIGIR Forum. ACM New York, NY, USA, 2005, Vol. 39, pp. 11–20.
229. Wachsmuth, H.; Syed, S.; Stein, B. Retrieval of the best counterargument without prior topic knowledge. Proceedings of the 56th
Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 241–251.
230. Ajjour, Y.; Panchenko, A.; Biemann, C.; Stein, B.; Wachsmuth, H.; Potthast, M.; Hagen, M. Overview of Touché 2021: Argument
Retrieval. Experimental IR Meets Multilinguality, Multimodality, and Interaction 2020, p. 384.
231. Hasibi, F.; Nikolaev, F.; Xiong, C.; Balog, K.; Bratsberg, S.E.; Kotov, A.; Callan, J. DBpedia-entity v2: a test collection for entity search.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017, pp.
1265–1268.
232. Thorne, J.; Vlachos, A.; Christodoulopoulos, C.; Mittal, A. Fever: a large-scale dataset for fact extraction and verification. arXiv
preprint arXiv:1803.05355 2018.
233. Diggelmann, T.; Boyd-Graber, J.; Bulian, J.; Ciaramita, M.; Leippold, M. Climate-fever: A dataset for verification of real-world climate
claims. arXiv preprint arXiv:2012.00614 2020.
234. Wadden, D.; Lin, S.; Lo, K.; Wang, L.L.; van Zuylen, M.; Cohan, A.; Hajishirzi, H. Fact or fiction: Verifying scientific claims. arXiv
preprint arXiv:2004.14974 2020.
235. Qin, T.; Liu, T.Y. Introducing LETOR 4.0 datasets. arXiv preprint arXiv:1306.2597 2013.
236. Scialom, T.; Dray, P.A.; Lamprier, S.; Piwowarski, B.; Staiano, J. MLSUM: The multilingual summarization corpus. arXiv preprint
arXiv:2004.14900 2020.
237. Huang, Z.; Bonab, H.; Sarwar, S.M.; Rahimi, R.; Allan, J. Mixed Attention Transformer for Leveraging Word-Level Knowledge to
Neural Cross-Lingual Information Retrieval. Proceedings of the 30th ACM International Conference on Information and Knowledge
Management, 2021, pp. 760–770.
