Open Domain Question Answering based on Text Enhanced Knowledge Graph with Hyperedge Infusion

Jiale Han, Bo Cheng, Xu Wang
State Key Laboratory of Networking and Switching Technology,
Beijing University of Posts and Telecommunications
{hanjl,chengbo,wxx}@bupt.edu.cn

Abstract

The incompleteness of the knowledge base (KB) is a vital factor limiting the performance of question answering (QA). This paper proposes a novel QA method that leverages text information to enhance an incomplete KB. The model enriches entity representations with the semantic information contained in the text and employs graph convolutional networks to update the entity states. Furthermore, to exploit the latent structural information of text, we treat each text as a hyperedge connecting the entities mentioned in it, complementing the deficient relations in the KB, and hypergraph convolutional networks are further applied to reason over the hypergraph-formed text. Extensive experiments on the WebQuestionsSP benchmark with different KB settings prove the effectiveness of our model.

[Figure 1: Example of a question with its related KB and text. Question: "Which university did Cleary graduate from?" Text: "Cleary obtained her master's degree in Library Science at the University of Washington." The KB is incomplete for answering the question: it lacks the relation "graduated university" and the entity "University of Washington". By completing the missing information, i.e., adding the text as a hyperedge, we can handle the question more effectively.]

1 Introduction

Open domain question answering (QA) is a challenging task that requires answering factual questions in natural language. According to the structure of the supporting information, QA systems can be divided into knowledge-based QA (KBQA) (Bordes et al., 2015; Zhang et al., 2018) and text-based QA (TBQA) (Welbl et al., 2018; Yang et al., 2018). KBQA obtains answers from a structured knowledge base, which is easy to query and reason with but limited by the incompleteness of its well-designed triples. TBQA's supporting information is plain text, which contains rich semantic and latent structural information but is difficult for a machine to understand. These complementary properties inspire us to fuse the two kinds of data to enhance the incomplete KB and further improve the QA system's performance.

Some work has already been proposed. Das et al. (2017) represent KB and text using universal schema and apply memory networks, but lack the association between KB and text. Sun et al. (2018) build a heterogeneous graph with entities and text as nodes and employ a graph-based method. Xiong et al. (2019) first encode entities in the KB by graph attention networks and then read the text with the help of the accumulated entity knowledge. Although good results have been achieved, the text information is not fully utilized, especially the relation information among the entities mentioned in the text. Figure 1 shows an example where the KB is insufficient to answer the question; the question can be adequately answered by using the structural information of the text to bring in high-level relationships.

In this paper, we propose a novel QA model based on a text enhanced knowledge graph, which enriches entity representations with the semantic information of the text and complements the relations in the KB with the structural information of the text. Specifically, the model first encodes the entities in the KB together with the text information and applies graph convolutional networks (GCN) (Wu et al., 2020) to reason across the KB. Since a document usually mentions multiple entities, we convert the unstructured text into a structured hypergraph by regarding each document
as a hyperedge connecting the entities mentioned in it, and then employ hypergraph convolutional networks (HGCN) (Feng et al., 2019; Yadati et al., 2019) to further update the entity states. Finally, the model predicts the answers.

Our highlights are summarized as follows: 1) We novelly treat documents as high-order relations (hyperedges) connecting the entities mentioned in them. 2) We apply hypergraph convolutional networks to reason and propose a dual-step attention to capture the importance of different entities and documents. 3) Extensive experiments on the widely used WebQuestionsSP (Yih et al., 2016) with different KB settings demonstrate that our model is effective.

2 Related Work

The combination of knowledge base and text in QA is a challenging task which has attracted many researchers' attention. Das et al. (2017) extend universal schema to question answering and employ Key-Value Memory networks to process text and KB. Sun et al. (2018) regard documents as heterogeneous nodes and combine them with the entities in the KB to form a uniform graph. The model proposed by Xiong et al. (2019) contains a graph-attention based KB reader and a knowledge-aware text reader. Other work focuses on retrieving a small graph that contains just the question-related information (Sun et al., 2019) and on the interpretability of QA over KB and text (Sydorova et al., 2019). These methods do not consider the high-order relationships among the entities mentioned in the text. This paper regards text as hyperedges and further employs hypergraph convolutional networks.

Hypergraph convolutional networks (Feng et al., 2019; Yadati et al., 2019) utilize a hypergraph structure rather than a general graph to fully represent the high-order correlation among data, and hypergraph attention (Bai et al., 2019) further enhances the representation learning ability with an attention module.

3 Model

3.1 Task Definition

To maintain consistency and fairness, we adopt the same setting as Sun et al. (2018), which builds a subgraph for each question. Specifically, given a question $q = (w_1, w_2, ..., w_{|q|})$, the related sub-knowledge graph $K = (V, E, T)$ is extracted by Personalized PageRank (Haveliwala, 2002), where $V$ is the entity set, $E$ is the relation set, and $T$ contains a set of triples $(v_h, r, v_t)$ indicating that there is a relation $r \in E$ between $v_h \in V$ and $v_t \in V$. Also, a relevant text corpus $D = \{d_1, d_2, ..., d_{|D|}\}$ is retrieved from Wikipedia by an off-the-shelf document retriever (Chen et al., 2017), where $d_i = (w_1, w_2, ..., w_{|d_i|})$ represents a document and the entities mentioned in the documents have been linked. The task requires extracting the answers from all KB and document entities. The overview of our model is shown in Figure 2.

[Figure 2: The overview of the model (Input Encoder over KB, query, and text; GCN over candidate entities; HGCN over documents; Answer Prediction). We utilize the semantic information mentioned in the text to enrich the entity representation, and novelly treat text as hyperedges to complement the relations in the incomplete KB.]
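To make the hyperedge view concrete, the sketch below (our illustration, with hypothetical entity and document names loosely following Figure 1) builds the entity-document incidence matrix that the hypergraph layers in Section 3.3 operate on: each document is one hyperedge, incident to every entity linked in it.

```python
import torch

# Toy subgraph in the spirit of Figure 1 (names are hypothetical).
entities = ["Cleary", "master's degree", "Library Science",
            "University of Washington", "writer", "female"]
docs = {
    # document id -> entities linked in that document
    "d0": ["Cleary", "master's degree", "Library Science",
           "University of Washington"],
}

ent2id = {e: i for i, e in enumerate(entities)}
doc2id = {d: j for j, d in enumerate(docs)}

# Incidence matrix A: A[i, j] = 1 iff entity i is mentioned in document j.
# Each column is a hyperedge and connects all entities of one document.
A = torch.zeros(len(entities), len(docs))
for d, ents in docs.items():
    for e in ents:
        A[ent2id[e], doc2id[d]] = 1.0

print(A.T)  # one row per hyperedge; here d0 links 4 of the 6 entities
```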
3.2 Input Encoder

Query and Text Encoder: Let $X_q \in \mathbb{R}^{|q| \times n}$ and $X_d \in \mathbb{R}^{|d| \times n}$ be the embedding matrices of query $q$ and document $d \in D$, where $n$ is the embedding dimension. Bi-LSTM networks (Hochreiter and Schmidhuber, 1997) are applied to encode the query and document separately, giving hidden states $H_q \in \mathbb{R}^{|q| \times h}$ and $H_d \in \mathbb{R}^{|d| \times h}$, where $h$ is the hidden dimension of the bi-LSTM. Then we compute the representations of query $h_q$ and document $h_d$ with an attention mechanism:

$$h_q = H_q^{\mathrm{T}} \, \mathrm{softmax}(f_q(H_q)) \in \mathbb{R}^{h \times 1}$$

$$h_d = H_d^{\mathrm{T}} \, \mathrm{softmax}(f_d(H_d H_q^{\mathrm{T}})) \in \mathbb{R}^{h \times 1}$$
where $\mathrm{T}$ denotes matrix transposition, $f_q$ is a linear network that converts the $h$ dimension to 1 dimension, and $f_d$ converts the $|q|$ dimension to 1 dimension.
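As a shape-level sanity check, here is a minimal PyTorch sketch of this pooling under our reading of the two equations; the class and variable names are ours, and sharing one bi-LSTM between query and document is our simplifying assumption (the paper only says they are encoded separately).

```python
import torch
import torch.nn as nn

class AttentivePooling(nn.Module):
    """Sketch of the query/document pooling in Section 3.2 (our reading)."""
    def __init__(self, n=300, h=100, q_len=10):
        super().__init__()
        # Bi-LSTM with h//2 units per direction gives h-dimensional states.
        self.lstm = nn.LSTM(n, h // 2, batch_first=True, bidirectional=True)
        self.f_q = nn.Linear(h, 1)      # f_q: scores each query position
        self.f_d = nn.Linear(q_len, 1)  # f_d: |q| -> 1 over match scores

    def forward(self, Xq, Xd):
        Hq, _ = self.lstm(Xq)                        # (B, |q|, h)
        Hd, _ = self.lstm(Xd)                        # (B, |d|, h)
        aq = torch.softmax(self.f_q(Hq), dim=1)      # (B, |q|, 1)
        hq = (Hq.transpose(1, 2) @ aq).squeeze(-1)   # h_q = H_q^T softmax(...)
        scores = Hd @ Hq.transpose(1, 2)             # (B, |d|, |q|) = H_d H_q^T
        ad = torch.softmax(self.f_d(scores), dim=1)  # (B, |d|, 1)
        hd = (Hd.transpose(1, 2) @ ad).squeeze(-1)   # h_d, document summary
        return hq, hd

enc = AttentivePooling()
hq, hd = enc(torch.randn(2, 10, 300), torch.randn(2, 50, 300))
print(hq.shape, hd.shape)  # torch.Size([2, 100]) torch.Size([2, 100])
```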
KB Encoder: Each entity $v \in V$ is initialized by a pre-trained knowledge graph embedding $x_v \in \mathbb{R}^{n \times 1}$. Each relation is initialized by its semantic vector and its KG embedding. Specifically, for a relation $r \in E$ with KG embedding $x_r \in \mathbb{R}^{n \times 1}$, we tokenize it as $r = (w_1, w_2, ..., w_{|r|})$ and feed the word embeddings into a bi-LSTM layer to get the hidden states $H_r \in \mathbb{R}^{|r| \times h}$, then calculate the representation $h_r$ as follows:

$$H_{rq} = \mathrm{softmax}(H_r H_q^{\mathrm{T}}) H_q \in \mathbb{R}^{|r| \times h}$$

$$H_r' = [H_r; H_{rq}] \in \mathbb{R}^{|r| \times 2h}$$

$$h_r' = H_r'^{\mathrm{T}} \, \mathrm{softmax}(f_{r1}(H_r')) \in \mathbb{R}^{2h \times 1}$$

$$h_r = f_{r2}([h_r'; x_r]) \in \mathbb{R}^{h \times 1}$$

where $[;]$ denotes column-wise concatenation, $f_{r1}$ is a linear network that converts the $2h$ dimension to 1 dimension, and $f_{r2}$ converts the $2h+n$ dimension to the $h$ dimension.
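The relation encoder admits a similar sketch. The following unbatched illustration is ours (single relation, names invented), tracing the four equations above step by step.

```python
import torch
import torch.nn as nn

class RelationEncoder(nn.Module):
    """Our unbatched sketch of the relation encoder in Section 3.2."""
    def __init__(self, n=100, h=100):
        super().__init__()
        self.lstm = nn.LSTM(n, h // 2, bidirectional=True)  # over relation words
        self.f_r1 = nn.Linear(2 * h, 1)      # scores each relation token
        self.f_r2 = nn.Linear(2 * h + n, h)  # fuses pooled text with KG embedding

    def forward(self, Xr, Hq, xr):
        # Xr: (|r|, n) relation-word embeddings; Hq: (|q|, h); xr: (n,) KG emb.
        Hr, _ = self.lstm(Xr.unsqueeze(1))
        Hr = Hr.squeeze(1)                            # H_r: (|r|, h)
        Hrq = torch.softmax(Hr @ Hq.T, dim=-1) @ Hq   # query-aware states
        Hr2 = torch.cat([Hr, Hrq], dim=-1)            # H_r': (|r|, 2h)
        a = torch.softmax(self.f_r1(Hr2), dim=0)      # (|r|, 1) token weights
        hr2 = (Hr2.T @ a).squeeze(-1)                 # h_r': (2h,)
        return self.f_r2(torch.cat([hr2, xr]))        # h_r: (h,)

enc = RelationEncoder()
hr = enc(torch.randn(3, 100), torch.randn(10, 100), torch.randn(100))
print(hr.shape)  # torch.Size([100])
```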
3.3 Reasoning over Text Enhanced Knowledge Graph

This component utilizes text information to improve the incomplete KB by enriching the entity representations and adding hyperedges, and applies GCN and HGCN for reasoning.

GCN for Entity-Enriched KB: To utilize the rich semantic information contained in the text, we construct a binary matrix $M$, where $M_v^d \in \mathbb{R}^{|d| \times 1}$ indicates the span of entity $v$ in document $d$, and pass information from documents to entities to form the text-aware entity representation $x_v'$, which is then concatenated with $x_v$ as the initial node state $h_v^{(0)}$:

$$x_v' = \sum_{d \in D_v} H_d^{\mathrm{T}} M_v^d \in \mathbb{R}^{h \times 1}$$

$$h_v^{(0)} = f_v([x_v'; x_v]) \in \mathbb{R}^{h \times 1}$$

where $D_v$ is the set of documents linked to entity $v$, and $f_v$ converts the $h+n$ dimension to the $h$ dimension. Then the model learns the entity representation by aggregating the features of connected entities:

$$h_v^{(l_1+1)} = W_1 h_v^{(l_1)} + \sum_{(v_i, r_i) \in N_v} \alpha_i W_2 [h_{v_i}^{(l_1)}; h_{r_i}] \in \mathbb{R}^{h \times 1}$$

$$\alpha_i = \sigma(h_q^{\mathrm{T}} f_a([h_{v_i}^{(l_1)}; h_{r_i}]))$$

where $W_1 \in \mathbb{R}^{h \times h}$ and $W_2 \in \mathbb{R}^{h \times 2h}$ are learnable parameters, $N_v$ represents the adjacent triple set of entity $v$, $f_a$ converts the $2h$ dimension to the $h$ dimension, $l_1$ indexes the current GCN layer out of a total of $L_1$ layers, and $\sigma$ is the sigmoid function.
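A non-batched sketch of this update, ours for illustration (a practical implementation would vectorize the neighborhood sum with sparse adjacency rather than a Python loop):

```python
import torch
import torch.nn as nn

class KBGCNLayer(nn.Module):
    """Our loop-based sketch of one GCN layer from Section 3.3."""
    def __init__(self, h=100):
        super().__init__()
        self.W1 = nn.Linear(h, h, bias=False)
        self.W2 = nn.Linear(2 * h, h, bias=False)
        self.f_a = nn.Linear(2 * h, h)   # f_a: 2h -> h inside the gate

    def forward(self, hv, hq, triples, h_ent, h_rel):
        # hv: (h,) current state of entity v; hq: (h,) query vector;
        # triples: list of (neighbor entity id, relation id) adjacent to v.
        out = self.W1(hv)
        for vi, ri in triples:
            msg = torch.cat([h_ent[vi], h_rel[ri]])    # [h_vi ; h_ri], (2h,)
            alpha = torch.sigmoid(hq @ self.f_a(msg))  # question-aware gate
            out = out + alpha * self.W2(msg)
        return out                                     # h_v^(l1+1), (h,)

layer = KBGCNLayer()
h_ent, h_rel, hq = torch.randn(6, 100), torch.randn(4, 100), torch.randn(100)
print(layer(h_ent[0], hq, [(1, 0), (3, 2)], h_ent, h_rel).shape)  # (100,)
```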

HGCN for Hypergraph-Formed Text: The model regards plain text as hyperedges connecting the entities mentioned in it to complement the lack of relations in the KB, and HGCN is employed to encode the hypergraph-formed text. Moreover, a dual-step attention captures the importance of different entities and documents. Formally, at layer $l_2$, the model first transfers the entity features to the connected hyperedges to form the document representation:

$$h_d'^{(l_2+1)} = W_3 h_d'^{(l_2)} + \sum_{v_i \in N_d} \beta_i W_4 h_{v_i}'^{(l_2)} \in \mathbb{R}^{h \times 1}$$

$$\beta_i = \sigma(h_q^{\mathrm{T}} h_{v_i}'^{(l_2)})$$

where $W_3, W_4 \in \mathbb{R}^{h \times h}$ are learnable parameters, $h_v'^{(0)} = h_v^{(L_1)}$, $h_d'^{(0)} = h_d$, and $N_d$ represents the set of entities connected to document $d$. Then the model gathers the documents' information to update the connected entity states:

$$h_v'^{(l_2+1)} = W_5 h_v'^{(l_2)} + \sum_{d_i \in D_v} \gamma_i W_6 h_{d_i}'^{(l_2+1)} \in \mathbb{R}^{h \times 1}$$

$$\gamma_i = \sigma(h_q^{\mathrm{T}} h_{d_i}'^{(l_2+1)})$$

where $W_5, W_6 \in \mathbb{R}^{h \times h}$ are learnable parameters.
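With the entity-document incidence matrix A from the earlier sketch (Section 3.1), the dual-step update can be written densely; the formulation below is our compact reading of the four equations above, not the authors' code.

```python
import torch
import torch.nn as nn

class DualStepHGCNLayer(nn.Module):
    """Our dense sketch of one HGCN layer: entities -> hyperedges
    (documents), then hyperedges -> entities, both gated by the query."""
    def __init__(self, h=100):
        super().__init__()
        self.W3 = nn.Linear(h, h, bias=False)
        self.W4 = nn.Linear(h, h, bias=False)
        self.W5 = nn.Linear(h, h, bias=False)
        self.W6 = nn.Linear(h, h, bias=False)

    def forward(self, Hv, Hd, hq, A):
        # Hv: (V, h) entity states; Hd: (D, h) document states;
        # hq: (h,) query vector; A: (V, D) entity-document incidence.
        beta = torch.sigmoid(Hv @ hq)                 # (V,) entity gates
        Hd = self.W3(Hd) + A.T @ (beta.unsqueeze(-1) * self.W4(Hv))
        gamma = torch.sigmoid(Hd @ hq)                # (D,) gates on updated docs
        Hv = self.W5(Hv) + A @ (gamma.unsqueeze(-1) * self.W6(Hd))
        return Hv, Hd

layer = DualStepHGCNLayer()
Hv, Hd = layer(torch.randn(6, 100), torch.randn(2, 100),
               torch.randn(100), torch.zeros(6, 2))
print(Hv.shape, Hd.shape)  # torch.Size([6, 100]) torch.Size([2, 100])
```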

3.4 Answer Prediction

After $L_1$ GCN layers and $L_2$ HGCN layers, the model finally predicts the probability of each entity being the answer:

$$p_v = \sigma(f_{out}(h_v'^{(L_2)}))$$

where $f_{out}$ converts the $h$ dimension to 1 dimension.
4 Experiments

4.1 Dataset

WebQuestionsSP (Yih et al., 2016) is a multi-answer QA dataset which contains 4737 questions. In our experiments we adopt the dataset preprocessed by Sun et al. (2018), available at https://github.com/OceanskySun/GraftNet.
Table 1 shows the statistics of the dataset and of the retrieved subgraphs for the questions, including KB and linked text. In particular, the average number of linked entities in the documents is 4.6, which illustrates the rationality of adopting hyperedges.

Dataset | questions (train / dev / test) | avg candidates (KB / Text / KB+Text) | avg linked documents | avg answers | avg entities in documents
WebQSP  | 2848 / 250 / 1639              | 384.6 / 141.6 / 515.1                | 43.6                 | 11.2        | 4.6

Table 1: Statistics of the WebQuestionsSP dataset.
4.2 Baseline Methods

We compare our method with the following models:

• KVMemNet (Miller et al., 2016) is an end-to-end memory network which stores KB facts and text as key-value pairs.

• GraftNet (Sun et al., 2018) combines KB and text with an early fusion strategy and applies a graph-based model.

• SG-KA Reader (Xiong et al., 2019) proposes two components to reason over the KB and incorporate entity information into the text.

• PullNet (Sun et al., 2019) is a QA framework that learns to retrieve a small subgraph related to answering the question.
4.3 Training Details

The model is implemented in PyTorch (Paszke et al., 2019) and trained on one Nvidia Tesla P40 GPU. We apply 100-dimensional TransE embeddings (Bordes et al., 2013) for entities and relations, and 300-dimensional GloVe embeddings (Pennington et al., 2014) for question and text words. The numbers of words in questions and documents are limited to 10 and 50, respectively. The hidden size is set to 100. We select the hyperparameter values by manual tuning for the best results on the validation set. The dropout is 0.2, and the batch size is 8. The numbers of GCN layers $L_1$ and HGCN layers $L_2$ are 1 and 2, respectively. The average runtime for one epoch is 5 minutes, and we set the maximum number of epochs to 200. The number of parameters is 69 million. The Adam optimizer (Kingma and Ba, 2015) is applied to minimize the binary cross-entropy loss with a learning rate of 0.0005. The threshold for F1 is set to 0.05.
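As a hedged sketch of this optimization setup (the network itself is stubbed out with a placeholder module; only the optimizer, loss, learning rate, and threshold follow the description above):

```python
import torch
import torch.nn as nn

model = nn.Linear(100, 1)   # placeholder standing in for the full model
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
loss_fn = nn.BCEWithLogitsLoss()   # binary cross-entropy over entities

# One hypothetical step: scores for 6 candidate entities, 2 gold answers.
logits = model(torch.randn(6, 100)).squeeze(-1)
labels = torch.tensor([1., 0., 0., 1., 0., 0.])
loss = loss_fn(logits, labels)
loss.backward()
optimizer.step()
optimizer.zero_grad()

# At inference, p_v = sigmoid(score); entities with p_v > 0.05 form the
# predicted answer set used for F1 (threshold from the section above).
answers = (torch.sigmoid(logits) > 0.05).nonzero().flatten()
print(answers)
```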

4.4 Results

Main Results: The metrics adopted in the experiments are Hits@1, the accuracy of the top answer predicted by the model, and F1, which represents the ability to predict all answers. As shown in Table 2, we evaluate our model under the KB-only, Text-only, and KB+Text settings and compare it with the baseline methods. Our model obtains competitive performance in the KB-only setting and achieves the best results in the other two settings. In particular, in the Text-only setting, Hits@1 and F1 are 1.9% and 1.8% higher than the second-best method, respectively, which shows the validity of treating documents as hyperedges. This promising performance may inspire similar tasks to build plain text into a hypergraph and apply efficient HGCN. In the KB+Text setting, our method also achieves the best performance, proving that the proposed enhancement strategy can effectively enhance an incomplete KB by fully introducing the semantic and structural information implied in the text. Moreover, our model improves more over its KB-only counterpart than the model of Sun et al. (2018), which demonstrates that treating documents as hyperedges is more productive than regarding them as heterogeneous nodes.

Model    | KB-only     | Text-only   | KB+Text
KVMem    | 46.7 / 38.6 | 23.2 / 13.0 | 40.5 / 30.9
GraftNet | 66.7 / 62.4 | 25.3 / 15.3 | 67.8 / 60.4
SG-KA    | 66.5 / 58.0 | - / -       | 67.2 / 57.3
PullNet  | 68.1 / -    | 24.8 / -    | - / -
Ours     | 66.9 / 60.1 | 27.2 / 17.1 | 68.4 / 60.6

Table 2: Hits@1 / F1 scores on WebQSP.
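Under our reading of these definitions, both metrics for a single question could be computed as follows (function and variable names are ours; the 0.05 answer threshold comes from Section 4.3):

```python
import torch

def hits_at_1(probs, gold):
    """Hits@1: 1.0 if the top-scored entity is a correct answer."""
    return float(int(torch.argmax(probs)) in gold)

def f1(probs, gold, threshold=0.05):
    """F1 over the predicted answer set {v : p_v > threshold}."""
    pred = {i for i, p in enumerate(probs.tolist()) if p > threshold}
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

probs = torch.tensor([0.90, 0.02, 0.40, 0.01])   # model outputs p_v
gold = {0, 2}                                    # indices of true answers
print(hits_at_1(probs, gold), f1(probs, gold))   # 1.0 1.0
```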

Different KB Settings: Following Sun et al. (2018), we downsample the KB to different extents and experiment with 10%, 30%, and 50% KB settings, where the percentage is the fraction of required evidence covered by the KB, to simulate an incomplete KB and analyze the impact of the text on model performance. As shown in Table 3, our model obtains promising performance in the KB-only setting; in particular, the F1 metric achieves the highest values, demonstrating the ability of our method for multi-answer prediction. After combining the text, our model achieves the best results compared with the baseline methods. What's more, the performance in the KB+Text setting is significantly improved over the KB-only setting: the more incomplete the KB, the more obvious the performance improvement, which shows that the model can effectively use the document information to complete and enhance the KB and thereby further improve QA performance. To visualize the improvement intuitively, Figure 3 displays the increment of all models after adding text under the different settings (KB+Text − KB-only). We observe that our method achieves the largest or almost the largest increment. We also notice that the text information improves performance markedly when the KB is incomplete, but may introduce extra interference when the KB is sufficient to answer the questions, which can even lead to performance degradation. This makes us think about how to effectively use text to further improve question answering under the full KB setting.

Model    | 10% KB-only | 10% KB+Text | 30% KB-only | 30% KB+Text | 50% KB-only | 50% KB+Text
KVMem    | 12.5 / 4.3  | 24.6 / 14.4 | 25.8 / 13.8 | 27.0 / 17.7 | 33.3 / 21.3 | 32.5 / 23.6
GraftNet | 15.5 / 6.5  | 31.5 / 17.7 | 34.9 / 20.4 | 40.7 / 25.2 | 47.7 / 34.3 | 49.9 / 34.7
SG-KA    | 17.1 / 7.0  | 33.6 / 18.9 | 35.9 / 20.2 | 42.6 / 27.1 | 49.2 / 33.5 | 52.7 / 36.1
PullNet  | - / -       | - / -       | - / -       | - / -       | 50.3 / -    | 51.9 / -
Ours     | 18.3 / 7.9  | 33.7 / 19.9 | 35.2 / 21.0 | 42.8 / 27.5 | 49.3 / 34.3 | 52.8 / 37.1

Table 3: Hits@1 / F1 scores under different KB settings.

[Figure 3: Improvement of KB+Text over KB-only (Hits@1) for KVMem, GraftNet, SG-KA, and Ours under different KB fraction settings.]
increases Hits@1 by 0.9%, proving its validity.
Ablation Study: An ablation study is conducted to evaluate the benefits of the different components of the model. Table 4 shows the results under the 10% KB + Text setting. From the second and third rows, the attention mechanism adopted by the model is effective, especially the dual-step attention proposed for the HGCN layer, which brings a 1.2% improvement in Hits@1. The strategy of entity-enriched KB also increases Hits@1 by 0.9%, proving its validity.

Model                 | Hits@1 | F1
Full Model            | 33.7   | 19.9
− GCN attention       | 33.3   | 19.3
− dual-step attention | 32.5   | 18.9
− entity-enriched KB  | 32.8   | 18.7

Table 4: Experimental results of the ablation study (10% KB + Text).

5 Conclusion

We propose a QA method that aims to enhance an incomplete KB with text information, fully exploiting the semantic and latent structural information in the text. In particular, the text is treated as hyperedges to complement the lack of relations in the KB. The model first applies GCN to encode the entity-enriched KB, then employs HGCN to further reason over the hypergraph-formed text, and finally predicts the answers. Experimental results on the WebQuestionsSP benchmark prove the effectiveness of our model and of each component.

Acknowledgements

This work is supported by the Fundamental Research Funds for the Central Universities under grant 2020XD-A07-1, the National Key Research and Development Program of China under grant 2018YFB1003804, and in part by the National Natural Science Foundation of China under grants 61921003 and 61972043. Jiale Han is supported by the BUPT Excellent Ph.D. Students Foundation (Grant No. CX2020102). We would like to thank all reviewers for their careful and valuable comments.
References

Song Bai, Feihu Zhang, and Philip H. S. Torr. 2019. Hypergraph convolution and hypergraph attention. CoRR, abs/1901.08150.

Antoine Bordes, Nicolas Usunier, Sumit Chopra, and Jason Weston. 2015. Large-scale simple question answering with memory networks. CoRR, abs/1506.02075.

Antoine Bordes, Nicolas Usunier, Alberto García-Durán, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems 26, pages 2787–2795.

Danqi Chen, Adam Fisch, Jason Weston, and Antoine Bordes. 2017. Reading Wikipedia to answer open-domain questions. In Proceedings of ACL 2017, Volume 1: Long Papers, pages 1870–1879. Association for Computational Linguistics.

Rajarshi Das, Manzil Zaheer, Siva Reddy, and Andrew McCallum. 2017. Question answering on knowledge bases and text using universal schema and memory networks. In Proceedings of ACL 2017, Volume 2: Short Papers, pages 358–365. Association for Computational Linguistics.

Yifan Feng, Haoxuan You, Zizhao Zhang, Rongrong Ji, and Yue Gao. 2019. Hypergraph neural networks. In Proceedings of AAAI 2019, pages 3558–3565. AAAI Press.

Taher H. Haveliwala. 2002. Topic-sensitive PageRank. In Proceedings of WWW 2002, pages 517–526. ACM.

Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780.

Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In Proceedings of ICLR 2015.

Alexander H. Miller, Adam Fisch, Jesse Dodge, Amir-Hossein Karimi, Antoine Bordes, and Jason Weston. 2016. Key-value memory networks for directly reading documents. In Proceedings of EMNLP 2016, pages 1400–1409. Association for Computational Linguistics.

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019), pages 8024–8035.

Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of EMNLP 2014, pages 1532–1543. ACL.

Haitian Sun, Tania Bedrax-Weiss, and William W. Cohen. 2019. PullNet: Open domain question answering with iterative retrieval on knowledge bases and text. In Proceedings of EMNLP-IJCNLP 2019, pages 2380–2390. Association for Computational Linguistics.

Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Kathryn Mazaitis, Ruslan Salakhutdinov, and William W. Cohen. 2018. Open domain question answering using early fusion of knowledge bases and text. In Proceedings of EMNLP 2018, pages 4231–4242. Association for Computational Linguistics.

Alona Sydorova, Nina Pörner, and Benjamin Roth. 2019. Interpretable question answering on knowledge bases and text. In Proceedings of ACL 2019, Volume 1: Long Papers, pages 4943–4951. Association for Computational Linguistics.

Johannes Welbl, Pontus Stenetorp, and Sebastian Riedel. 2018. Constructing datasets for multi-hop reading comprehension across documents. Transactions of the Association for Computational Linguistics, 6:287–302.

Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S. Yu. 2020. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, pages 1–21.

Wenhan Xiong, Mo Yu, Shiyu Chang, Xiaoxiao Guo, and William Yang Wang. 2019. Improving question answering over incomplete KBs with knowledge-aware reader. In Proceedings of ACL 2019, Volume 1: Long Papers, pages 4258–4264. Association for Computational Linguistics.

Naganand Yadati, Madhav Nimishakavi, Prateek Yadav, Vikram Nitin, Anand Louis, and Partha P. Talukdar. 2019. HyperGCN: A new method for training graph convolutional networks on hypergraphs. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019), pages 1509–1520.

Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William W. Cohen, Ruslan Salakhutdinov, and Christopher D. Manning. 2018. HotpotQA: A dataset for diverse, explainable multi-hop question answering. In Proceedings of EMNLP 2018, pages 2369–2380. Association for Computational Linguistics.

Wen-tau Yih, Matthew Richardson, Christopher Meek, Ming-Wei Chang, and Jina Suh. 2016. The value of semantic parse labeling for knowledge base question answering. In Proceedings of ACL 2016, Volume 2: Short Papers. The Association for Computer Linguistics.

Yuyu Zhang, Hanjun Dai, Zornitsa Kozareva, Alexander J. Smola, and Le Song. 2018. Variational reasoning for question answering with knowledge graph. In Proceedings of AAAI 2018, pages 6069–6076. AAAI Press.