LLMs are simply instructed to use the retrieved information, instead of being explicitly trained to
use it; for example, Liu et al. (2023b) recently observed that LLMs struggle to handle long input
contexts when retrieved passages are naïvely appended. Despite its importance, improving retrieval-augmented
LLMs via prompting has been under-explored. Therefore, to improve ODQA via LLMs, we aim
to develop a simple yet effective framework based on prompting that is easily applicable to
various LLMs and retrieval methods.
Contribution. We propose a framework based on Summarized Retrieval (SuRe) to improve the
ODQA performance of retrieval-augmented LLMs. At a high level, SuRe helps LLMs predict
more grounded answers that are well-supported by summaries of the retrieved passages, which
can be viewed as explicit rationales extracted from those passages. To be specific, SuRe
first constructs multiple summaries of the retrieved passages, each conditioned on one of a few
possible answer candidates. This lets LLMs focus on the specific context relevant to the given
candidate, and hence provides a more discriminative viewpoint on the given question. Then, using
the generated summaries, SuRe confirms the most plausible answer among the candidates by
verifying each summary's validity in supporting its candidate and ranking the summaries by their
relative informativeness for answering the question. Remarkably, all procedures of SuRe are conducted
via zero-shot prompting. Consequently, SuRe is widely applicable even when LLMs are accessible
only through a black-box API, and no query-relevant few-shot examples are required.
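The candidate-generation, conditional-summarization, and verification/ranking steps described above can be sketched as a minimal zero-shot prompting pipeline. The `llm` callable below is a hypothetical stand-in for any black-box LLM API (e.g. a chat-completion endpoint), and the prompt wording is illustrative, not the paper's exact prompts.

```python
def sure_answer(question, passages, llm, num_candidates=2):
    """Sketch of the SuRe flow: candidates -> conditional summaries -> selection.

    `llm` is assumed to be a function mapping a prompt string to a response
    string (a black-box API call); all steps use zero-shot prompting only.
    """
    context = "\n".join(passages)

    # Step 1: generate a few answer candidates from the retrieved passages.
    cand_prompt = (f"Passages:\n{context}\n\nQuestion: {question}\n"
                   f"Give {num_candidates} short candidate answers, one per line.")
    candidates = [c.strip() for c in llm(cand_prompt).splitlines() if c.strip()]

    # Step 2: summarize the passages conditioned on each candidate, extracting
    # the specific context that supports that candidate (an explicit rationale).
    summaries = {
        cand: llm(f"Passages:\n{context}\n\nSummarize the evidence that the "
                  f"answer to '{question}' is '{cand}'.")
        for cand in candidates
    }

    # Step 3: verify each summary (does it actually support its candidate?) and
    # rank summaries by informativeness; return the best-supported candidate.
    def score(cand):
        verdict = llm(f"Summary: {summaries[cand]}\nDoes this summary support "
                      f"the answer '{cand}' to '{question}'? Answer True or False.")
        rank = llm(f"Summary: {summaries[cand]}\nRate 1-10 how informative this "
                   f"summary is for answering '{question}'. Answer a number.")
        return ("true" in verdict.lower(), float(rank))

    return max(candidates, key=score)
```

Because every step is a plain prompt-to-string call, the same pipeline works with any API-only model and any retriever that supplies the `passages` list.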
Through the experiments on four different QA
datasets, we demonstrate the effectiveness of SuRe
for improving the zero-shot ODQA performance of
[Figure: zero-shot ODQA performance of ChatGPT + Contriever and ChatGPT + BM25, each with and without SuRe.]
2 RELATED WORK