Editorial Board
Joaquim Filipe
Polytechnic Institute of Setúbal, Setúbal, Portugal
Ashish Ghosh
Indian Statistical Institute, Kolkata, India
Lizhu Zhou
Tsinghua University, Beijing, China
Dongning Liu
Guangdong University of Technology, Guangzhou, China
Hao Liao
Shenzhen University, Shenzhen, China
Hongfei Fan
Tongji University, Shanghai, China
Liping Gao
University of Shanghai for Science and Technology, Shanghai, China
The publisher, the authors and the editors are safe to assume that the
advice and information in this book are believed to be true and accurate
at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the
material contained herein or for any errors or omissions that may have
been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
Steering Committee
Yong Tang South China Normal University, China
Weiqing Tang Chinese Academy of Sciences, China
Ning Gu Fudan University, China
Shaozi Li Xiamen University, China
Bin Hu Lanzhou University, China
Yuqing Sun Shandong University, China
Xiaoping Liu Hefei University of Technology, China
Zhiwen Yu Northwestern Polytechnical University, China
Xiangwei Zheng Shandong Normal University, China
Tun Lu Fudan University, China
General Chairs
Yong Tang South China Normal University, China
Dajun Zeng Institute of Automation, Chinese Academy of Sciences,
China
Publicity Chairs
Xiangwei Zheng Shandong Normal University, China
Jianguo Li South China Normal University, China
Publication Chairs
Bin Hu Lanzhou University, China
Hailong Sun Beihang University, China
Finance Chair
Xiaoyan Huang Institute of Automation, Chinese Academy of Sciences,
China
Program Committee
Tie Bao Jilin University, China
Hongming Cai Shanghai Jiao Tong University, China
Zhicheng Cai Nanjing University of Science and Technology, China
Buqing Cao Hunan University of Science and Technology, China
Donglin Cao Xiamen University, China
Jian Cao Shanghai Jiao Tong University, China
Chao Chen Chongqing University, China
Jianhui Chen Beijing University of Technology, China
Longbiao Chen Xiamen University, China
Liangyin Chen Sichuan University, China
Qingkui Chen University of Shanghai for Science and Technology,
China
Ningjiang Chen Guangxi University, China
Weineng Chen South China University of Technology, China
Yang Chen Fudan University, China
Shiwei Cheng Zhejiang University of Technology, China
Xiaohui Cheng Guilin University of Technology, China
Yuan Cheng Wuhan University, China
Lizhen Cui Shandong University, China
Weihui Dai Fudan University, China
Xianghua Ding Fudan University, China
Wanchun Dou Nanjing University, China
Bowen Du University of Warwick, UK
Hongfei Fan Tongji University, China
Shanshan Feng Shandong Normal University, China
Liping Gao University of Shanghai for Science and Technology, China
Ning Gu Fudan University, China
Kun Guo Fuzhou University, China
Wei Guo Shandong University, China
Yinzhang Guo Taiyuan University of Science and Technology, China
Tao Han Zhejiang Gongshang University, China
Fei Hao Shanxi Normal University, China
Fazhi He Wuhan University, China
Haiwu He Chinese Academy of Sciences, China
Bin Hu Lanzhou University, China
Daning Hu South University of Science and Technology, China
Wenting Hu Jiangsu Open University, China
Yanmei Hu Chengdu University of Technology, China
Changqin Huang South China Normal University, China
Bo Jiang Zhejiang Gongshang University, China
Bin Jiang Hunan University, China
Jiuchuan Jiang Nanjing University of Finance and Economics, China
Weijin Jiang Xiangtan University, China
Yichuan Jiang Southeast University, China
Lu Jia China Agricultural University, China
Yi Lai Xi'an University of Posts and Telecommunications, China
Dongsheng Li Microsoft Research, China
Guoliang Li Tsinghua University, China
Hengjie Li Lanzhou University of Arts and Science, China
Jianguo Li South China Normal University, China
Li Li Southwest University, China
Renfa Li Hunan University, China
Shaozi Li Xiamen University, China
Taoshen Li Guangxi University, China
Xiaoping Li Southeast University, China
Lu Liang Guangdong University of Technology, China
Bing Lin Fujian Normal University, China
Dazhen Lin Xiamen University, China
Dongning Liu Guangdong University of Technology, China
Hong Liu Shandong Normal University, China
Li Liu Chongqing University, China
Shijun Liu Shandong University, China
Shufen Liu Jilin University, China
Xiaoping Liu Hefei University of Technology, China
Tun Lu Fudan University, China
Huijuan Lu China Jiliang University, China
Dianjie Lu Shandong Normal University, China
Qiang Lu Hefei University of Technology, China
Pin Lv Guangxi University, China
Hui Ma University of Electronic Science and Technology of China and
Zhongshan Institute, China
KeJi Mao Zhejiang University of Technology, China
Haiwei Pan Harbin Engineering University, China
Li Pan Shandong University, China
Limin Shen Yanshan University, China
Yuliang Shi ShanDa Dareway Company Limited, China
Yanjun Shi Dalian University of Science and Technology, China
Xiaoxia Song Datong University, China
Kehua Su Wuhan University, China
Hailong Sun Beihang University, China
Ruizhi Sun China Agricultural University, China
Yuqing Sun Shandong University, China
Yuling Sun East China Normal University, China
Wen'an Tan Nanjing University of Aeronautics and Astronautics,
China
Lina Tan Hunan University of Technology and Business, China
Yong Tang South China Normal University, China
Shan Tang Shanghai Polytechnic University, China
Weiqing Tang Beijing Zhongke Fulong Computer Technology
Company Limited, China
Yan Tang Hohai University, China
Yiming Tang Hefei University of Technology, China
Yizheng Tao China Academy of Engineering Physics, China
Shaohua Teng Guangdong University of Technology, China
Dakuo Wang IBM Research, USA
Hongbin Wang Kunming University of Science and Technology, China
Hongjun Wang Southwest Jiaotong University, China
Hongbo Wang University of Science and Technology Beijing, China
Lei Wang Dalian University of Technology, China
Tianbo Wang Beihang University, China
Tong Wang Harbin Engineering University, China
Wanyuan Wang Southeast University, China
Yijie Wang National University of Defense Technology, China
Zhiwen Wang Guangxi University of Science and Technology, China
Yiping Wen Hunan University of Science and Technology, China
Chunhe Xia Beihang University, China
Fangxiong Xiao Jinling Institute of Technology, China
Zheng Xiao Hunan University, China
Xiaolan Xie Guilin University of Technology, China
Zhiqiang Xie Harbin University of Science and Technology, China
Yu Xin Harbin University of Science and Technology, China
Jianbo Xu Hunan University of Science and Technology, China
Jiuyun Xu China University of Petroleum, China
Meng Xu Shandong Technology and Business University, China
Heyang Xu Henan University of Technology, China
Jiaqi Yan Nanjing University, China
Bo Yang University of Electronic Science and Technology of China,
China
Chao Yang Hunan University, China
Dingyu Yang Shanghai DianJi University, China
Gang Yang Northwestern Polytechnical University, China
Jing Yang Harbin Engineering University, China
Lin Yang Shanghai Computer Software Technology Development
Center, China
Xiaochun Yang Northeastern University, China
Xu Yu Qingdao University of Science and Technology, China
Zhiwen Yu Northwestern Polytechnical University, China
Jianyong Yu Hunan University of Science and Technology, China
Yang Yu Zhongshan University, China
Zhengtao Yu Kunming University of Science and Technology, China
An Zeng Guangdong Polytechnical University, China
Dajun Zeng Chinese Academy of Sciences, China
Changyou Zhang Chinese Academy of Sciences, China
Jifu Zhang Taiyuan University of Science and Technology, China
Liang Zhang Fudan University, China
Miaohui Zhang Jiangxi Academy of Sciences, China
Peng Zhang Fudan University, China
Shaohua Zhang Shanghai Software Technology Development Center,
China
Wei Zhang Guangdong University of Technology, China
Zhiqiang Zhang Harbin Engineering University, China
Zili Zhang Southwest University, China
Xiangwei Zheng Shandong Normal University, China
Ning Zhong Beijing University of Technology, China
Yifeng Zhou Southeast University, China
Tingshao Zhu Chinese Academy of Sciences, China
Xia Zhu Southeast University, China
Xianjun Zhu Jinling University of Science and Technology, China
Yanhua Zhu The First Affiliated Hospital of Guangdong
Pharmaceutical University, China
Jianhua Zhu City University of Hong Kong, China
Qiaohong Zu Wuhan University of Technology, China
Contents
Crowdsourcing, Crowd Intelligence, and Crowd Cooperative
Computing
Selective Self-attention with BIO Embeddings for Distant
Supervision Relation Extraction
Mingjie Tang and Bo Yang
Service Clustering Method Based on Knowledge Graph
Representation Learning
Bo Jiang, Xuejun Xu, Junchen Yang and Tian Wang
An Efficient Framework of Convention Emergence Based on
Multiple-Local Information
Cui Chen, Chenxiang Luo and Wu Chen
WikiChain: A Blockchain-Based Decentralized Wiki Framework
Zheng Xu, Chaofan Liu, Peng Zhang, Tun Lu and Ning Gu
Dynamic Vehicle Distribution Path Optimization Based on
Collaboration of Cloud, Edge and End Devices
Tiancai Li, Yiping Wen, Zheng Tan, Hong Chen and Buqing Cao
CLENet: A Context and Location Enhanced Transformer Network
for Target-Oriented Sentiment Classification
Chao Yang, Hefeng Zhang, Jing Hou and Bin Jiang
Long-Tail Items Recommendation Framework Focusing on Low
Precision Rate Users
Jing Qin, Jing Sun, Rong Pu and Bin Wang
Learning from Crowd Labeling with Semi-crowdsourced Deep
Generative Models
Xuan Wei, Mingyue Zhang and Daniel Dajun Zeng
Multi-view Weighted Kernel Fuzzy Clustering Algorithm Based on
the Collaboration of Visible and Hidden Views
Yiming Tang, Bowen Xia, Fuji Ren, Xiaocheng Song, Hongmang Li
and Wenbin Wu
Impacts of Individuals’ Trust in Information Diffusion of the
Weighted Multiplex Networks
Jianyong Yu, Jie Luo and Pei Li
Intuitionistic Entropy-Induced Cooperative Symmetric
Implicational Inference
Yiming Tang, Jiajia Huang, Fuji Ren, Witold Pedrycz and
Guangqing Bao
Multi-task Allocation Strategy and Incentive Mechanism Based on
Spatial-Temporal Correlation
Zihui Jiang and Wenan Tan
Domain-Specific Collaborative Applications
Dynamic Key-Value Gated Recurrent Network for Knowledge
Tracing
Bo Xie, Lina Fu, Bo Jiang and Long Chen
Law Article Prediction via a Codex Enhanced Multi-task Learning
Framework
Bingjun Liu, Zhiming Luo, Dazhen Lin and Donglin Cao
MDA-Network: Mask and Dual Attention Network for Handwritten
Mathematical Expression Recognition
Jian Hu, Pan Zhou, Shirong Liao, Shaozi Li, Songjian Su and
Songzhi Su
A Study Based on P300 Component in Single-Trials for
Discriminating Depression from Normal Controls
Wei Zhang, Tao Gong, Jianxiu Li, Xiaowei Li and Bin Hu
Research on Ship Classification Method Based on AIS Data
Pu Luo, Jing Gao, Guiling Wang and Yanbo Han
A Low-Code Development Framework for Constructing Industrial
Apps
Jingyue Wang, Binhang Qi, Wentao Zhang and Hailong Sun
Research on Machine Learning Method for Equipment Health
Management in Industrial Internet of Things
Zheng Tan and Yiping Wen
A Transfer Learning Method for Handwritten Chinese Character
Recognition
Meixian Jiang, Baohua Zhang, Yan Sun, Feng Qiang and
Hongming Cai
An Aided Vehicle Diagnosis Method Based on the Fault Description
Xiaofeng Cai, Wenjia Wu, Wen Bo and Changyou Zhang
Blockchain Medical Asset Model Supporting Analysis of Transfer
Path Across Medical Institutions
Qingqing Yin, Lanju Kong, Xinping Min and Siqi Feng
Cooperative Anomaly Detection Model and Real-Time Update
Strategy for Industrial Stream Data
Tengjiang Wang, Pengyu Yuan, Cun Ji and Shijun Liu
An Analytical Model for Cooperative Control of Vehicle Platoon
Nianfu Jin, Fangyi Hu and Yanjun Shi
Collaborative Mechanisms, Models, Approaches, Algorithms, and
Systems
Research on Scientific Workflow Scheduling Based on Fuzzy
Theory Under Edge Environment
Chaowei Lin, Bing Lin and Xing Chen
Fast Shapelet Discovery with Trend Feature Symbolization
Shichao Zhang, Xiangwei Zheng and Cun Ji
A Distributed Storage Middleware Based on HBase and Redis
Lingling Xu, Yuzhong Chen and Kun Guo
Cooperative Full-Duplex Relay Selection Strategy Based on Power
Splitting
Taoshen Li, Anni Shi, Jichang Chen and Zhihui Ge
A Two-Stage Graph Computation Model with Communication
Equilibrium
Yanmei Dong, Rongwang Chen and Kun Guo
Video Pedestrian Re-identification Based on Human-Machine
Collaboration
Yanfei Wang, Zhiwen Yu, Fan Yang, Liang Wang and Bin Guo
Cooperative Scheduling Strategy of Container Resources Based on
Improved Adaptive Differential Evolution Algorithm
Chengyao Hua, Ningjiang Chen, Yongsheng Xie and Linming Lian
Social Media and Online Communities
Personalized Recommendation Based on Scholars’ Similarity and
Trust Degree
Lunjie Qiu, Chengzhe Yuan, Jianguo Li, Shanchun Lian and
Yong Tang
Academic Paper Recommendation Method Combining
Heterogeneous Network and Temporal Attributes
Weisheng Li, Chao Chang, Chaobo He, Zhengyang Wu,
Jiongsheng Guo and Bo Peng
An Analysis on User Behaviors in Online Question and Answering
Communities
Wanxin Wang, Shiming Li, Yuanxing Rao, Xiaoxue Shen and Lu Jia
An Overlapping Local Community Detection Algorithm Based on
Node Transitivity and Modularity Density
Xintong Huang, Ling Wu and Kun Guo
SONAS: A System to Obtain Insights on Web APIs from Stack
Overflow
Naixuan Wang, Jian Cao, Qing Qi, Qi Gu and Shiyou Qian
Anatomy of Networks Through Matrix Characteristics of Core/
Periphery
Chenxiang Luo, Cui Chen and Wu Chen
A Light Transfer Model for Chinese Named Entity Recognition for
Specialty Domain
Jiaqi Wu, Tianyuan Liu, Yuqing Sun and Bin Gong
Outlier Detection for Sensor Data Streams Based on Maximum
Frequent and Minimum Rare Patterns
Xiaochen Shi, Saihua Cai and Ruizhi Sun
Embedding Based Personalized New Paper Recommendation
Yi Xie, Shaoqing Wang, Wei Pan, Huaibin Tang and Yuqing Sun
Short Papers
Scholar Knowledge Graph and Its Application in Scholar
Encyclopedia
Kun Hu, Jianguo Li, Wande Chen, Chengzhe Yuan, Qiang Xu and
Yong Tang
MPNet: A Multiprocess Convolutional Neural Network for Animal
Classification
Bin Jiang, Wei Huang, Yun Huang, Chao Yang and Fangqiang Xu
An Analysis of the Streamer Behaviors in Social Live Video
Streaming
Yuanxing Rao, Wanxin Wang, Xiaoxue Shen and Lu Jia
Research on the Method of Emotion Contagion in Virtual Space
Based on SIRS
Heng Liu, Xiao Hong, Dianjie Lu, Guijuan Zhang and Hong Liu
A Comprehensive Evaluation Method for Container Auto-Scaling
Algorithms on Cloud
Jingxuan Xie, Shubo Zhang, Maolin Pan and Yang Yu
Well Begun Is Half Done: How First Responders Affect Issue
Resolution Process in Open Source Software Development?
Yiran Wang and Jian Cao
The Design and Implementation of KMeans Based on Unified Batch
and Streaming Processing
Hao Chen, Kun Guo, Yuzhong Chen and Hong Guo
Image Super-Resolution Using Deformable Convolutional Network
Chang Li, Lunke Fei, Jianyang Qin, Dongning Liu, Shaohua Teng and
Wei Zhang
Research on Cognitive Effects of Internet Advertisements Based on
Eye Tracking
Yanru Chen, Yushi Jiang, Miao Miao, Bofu Feng, Liangxiong Wei,
Yanbing Yang, Bing Guo, Youling Lin and Liangyin Chen
A Short Text Classification Method Based on Combining Label
Information and Self-attention Graph Convolutional Neural
Network
Hongbin Wang, Gaorong Luo and Runxin Li
Community-Based Propagation of Important Nodes in the
Blockchain Network
Xin Li, Weidong Zheng and Hao Liao
ScholatAna: Big Data-Based Academic Social Network User
Behavior Preference System
Wenjie Ma, Ronghua Lin, Jianguo Li, Chengjie Mao, Qing Xu and
Angjian Wen
Pedestrian Detection Algorithm in SSD Infrared Image Based on
Transfer Learning
Jing Feng and Zhiwen Wang
How to Make Smart Contract Smarter
Shanxuan Chen, Jia Zhu, Zhihao Lin, Jin Huang and Yong Tang
Population Mobility Driven COVID-19 Analysis in Shenzhen
Ziqiang Wu, Xin Li, Juyang Cao, Zhanning Zhang, Xiaozhi Lin,
Yunkun Zhu, Lijun Ma and Hao Liao
Author Index
Crowdsourcing, Crowd Intelligence, and
Crowd Cooperative Computing
© Springer Nature Singapore Pte Ltd. 2021
Y. Sun et al. (eds.), Computer Supported Cooperative Work and Social Computing, Communications in Computer and
Information Science 1330
https://doi.org/10.1007/978-981-16-2540-4_1
Abstract
The distant supervision method was proposed to label instances automatically, enabling relation
extraction without human annotation. However, the training data generated in this way intrinsically
contain massive noise. To alleviate this problem, most prior works employ an attention mechanism,
which achieves significant improvements but can still be incompetent for one-sentence bags, i.e.,
bags that contain only one sentence. To this end, in this paper we propose a novel neural relation
extraction method that employs BIO embeddings and a selective self-attention with a fusion gate
mechanism to fix the aforementioned defects of previous attention methods. First, in addition to the
embedding methods commonly adopted in the input layer, we propose to add BIO embeddings to enrich
the input representation. Second, a selective self-attention mechanism is proposed to capture context
dependency information; it is combined with PCNN via a Fusion Gate module to enhance the
representation of sentences. Experiments on the NYT dataset demonstrate the effectiveness of the
proposed methods, and our model achieves consistent improvements for relation extraction compared
to state-of-the-art methods.
Keywords Distant supervision relation extraction – One-sentence bag problem – Selective self-
attention – BIO embeddings
1 Introduction
Relation extraction, which aims at extracting the relations between entity pairs mentioned in given
sentences, is a crucial task in natural language processing (NLP) [1, 2]. Conventional methods with
the supervised learning paradigm [4, 5] require large-scale manually labeled data, which is time-
consuming and labor-intensive to produce. To this end, the distant supervision method [1] was
proposed to overcome the lack of labeled training data; it automatically generates data by aligning
entities with knowledge bases (KBs). The assumption of distant supervision is that if an entity pair
has a relation according to the KBs, then all sentences mentioning this entity pair express that relation.
However, it is inevitable that distant supervision suffers from the wrongly-labeled problem due to its
strong assumption, and the dataset created by distant supervision can be seriously noisy [6, 9]. To
alleviate this problem, a multi-instance learning (MIL) method [6] was proposed to be incorporated
with distant supervision. In the MIL method, the dataset generated by distant supervision is divided
into bags, where each bag consists of sentences involving the same entity pair.
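The bag construction under MIL can be sketched as follows. The entity-pair tuples and example sentences below are illustrative stand-ins, not actual NYT records; the second sentence also shows how the distant supervision assumption introduces a wrong label.

```python
from collections import defaultdict

def build_bags(labeled_sentences):
    """Group distantly supervised sentences into bags keyed by entity pair.

    Each input item is (head_entity, tail_entity, relation, sentence);
    every bag holds all sentences mentioning the same entity pair.
    """
    bags = defaultdict(list)
    for head, tail, relation, sentence in labeled_sentences:
        bags[(head, tail, relation)].append(sentence)
    return dict(bags)

data = [
    ("Obama", "Hawaii", "/people/person/place_of_birth",
     "Obama was born in Hawaii."),
    # wrongly labeled by distant supervision: mentions the pair, not the relation
    ("Obama", "Hawaii", "/people/person/place_of_birth",
     "Obama visited Hawaii last week."),
]
bags = build_bags(data)
# both sentences fall into a single bag for the entity pair (Obama, Hawaii)
```

With only the second sentence present, the result would be exactly the noisy one-sentence bag the paper targets.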
In recent years, many neural relation extraction methods under MIL with attention mechanisms have
been proposed to mitigate the influence of noisy data [2, 10] and have achieved significant
improvements. These methods mainly focus on the correctly labeled sentences in a bag and utilize
them to train a robust model. However, most of the bags in distant supervision datasets (e.g., the NYT
dataset [6]) are one-sentence bags, and more importantly, a one-sentence bag is highly likely to be
wrongly labeled. We investigated the NYT dataset and found that about 80% of its training examples
are one-sentence bags, while up to 40% of those are wrongly labeled; Table 1 shows labeled sentences
from the NYT dataset as examples. The vanilla selective attention mechanism is vulnerable to such
situations, which can hurt model performance. Furthermore, we argue that wrongly labeled sentences
in a bag also contain implicitly useful information for training the model and should be well exploited.
Table 1. Two examples of one-sentence bag with correct and wrong label in distant supervision respectively.
2 Related Work
Many conventional methods with the supervised learning paradigm [4, 5] designed hand-crafted
features to train feature-based and kernel-based models, which achieve high performance and good
practicability. However, these methods require large-scale manually labeled data, which is hard to
obtain. To address this issue, distant supervision was proposed [1], automatically aligning plain text
with Freebase relational facts to generate relation labels for entity pairs. However, its assumption is
so strong that it inevitably introduces the wrongly-labeled problem. Therefore, to alleviate this
problem, a line of works [6, 7] solves distant supervision relation extraction with the multi-instance
learning method, in which all sentences mentioning the same entity pair are gathered into a bag.
In recent years, with the development of deep learning [8], many deep neural networks have been
developed for distant supervision relation extraction. Zeng et al. [11] proposed piecewise
convolutional neural networks (PCNNs) to extract feature representations from sentences and select
the most reliable sentence as the bag representation. Lin et al. [2] adopted PCNNs as the sentence
encoder and proposed a selective attention mechanism to generate the bag representation over
sentences. Liu et al. [3] proposed a soft-label method to denoise the sentences. Ye et al. [9] proposed
an intra-bag and inter-bag attention mechanism to extend sentence-level attention to the bag level.
These methods all achieve significant improvements in relation extraction, but they cannot handle
the one-sentence bag problem because of the drawback of vanilla selective attention.
To address the one-sentence bag problem, we propose a selective self-attention mechanism with a
fusion gate, which captures rich context dependency information and enhances the representation of
sentences. Furthermore, a BIO embedding method is proposed to enrich the input representation.
3 Methodology
In this section, we introduce the proposed SS-Att framework for distant supervision relation
extraction. The framework of our method is shown in Fig. 1, which consists of the following neural
layers: input layer, encoder layer, fusion layer and output layer.
Fig. 1. The framework of our proposed SS-Att for distant supervision relation extraction
Input Layer. Given a bag containing a set of sentences, the input layer transforms each sentence into a
matrix S, which concatenates the word embeddings, position embeddings, and BIO embeddings of each
word. The three embeddings are described as follows.
– Word Embeddings. The word embeddings are pre-trained with the word2vec method [12] for each
word. The dimension of the word embeddings is $d_w$.
– Position Embeddings. We employ position embeddings [11] to represent the relative distance
between the current word and each entity of the target pair. Rich positional information is captured
by the position embeddings, which are mapped into two vectors of $d_p$ dimensions each.
– BIO Embeddings. We add BIO embeddings to our framework to enhance the representation of the
entity pairs. Specifically, if a word is the start of an entity, we label it B; the remaining words of the
entity are labeled I and all other words O. The BIO embedding of each word is then obtained by
looking up an embedding table according to its label. The dimension of the BIO embeddings is $d_b$.
Table 2 illustrates how to generate BIO labels.
These three embeddings are concatenated for each word, so a sentence of $n$ words is transformed
into a matrix $S \in \mathbb{R}^{n \times d}$ as the input, where each word representation
$w_i \in \mathbb{R}^{d}$ with $d = d_w + 2d_p + d_b$.
Table 2. An example of generating BIO labels.
Sentence: Possibilities for more conflict in Iran and elsewhere in the Middle East are adding to the surge .
Label: O O O O O B O O O O B I O O O O O O
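The BIO labeling and the concatenated input representation can be sketched as follows. The dimensions `d_w`, `d_p`, `d_b` and the random embedding tables are placeholders for the learned ones, and `bio_labels` is a hypothetical helper, not the paper's implementation.

```python
import numpy as np

def bio_labels(tokens, entities):
    """Assign B to the first token of each entity, I to the rest, O elsewhere."""
    labels = ["O"] * len(tokens)
    for entity in entities:                     # entity is a tuple of tokens
        for start in range(len(tokens) - len(entity) + 1):
            if tuple(tokens[start:start + len(entity)]) == entity:
                labels[start] = "B"
                for i in range(start + 1, start + len(entity)):
                    labels[i] = "I"
    return labels

tokens = "Possibilities for more conflict in Iran and elsewhere in the Middle East".split()
labels = bio_labels(tokens, [("Iran",), ("Middle", "East")])
# ['O', 'O', 'O', 'O', 'O', 'B', 'O', 'O', 'O', 'O', 'B', 'I']

# Each word vector concatenates word, two position, and BIO embeddings:
d_w, d_p, d_b, n = 50, 5, 5, len(tokens)
word_emb = np.random.randn(n, d_w)
pos_emb = np.random.randn(n, 2 * d_p)           # distances to both target entities
bio_emb = np.random.randn(n, d_b)
S = np.concatenate([word_emb, pos_emb, bio_emb], axis=1)
assert S.shape == (n, d_w + 2 * d_p + d_b)
```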
(2)
where the two projection matrices are learnable parameters and $K$ is the attention weight matrix.
After that, the selective self-attention output, Eq. (3), is combined with the PCNN features through
the Fusion Gate, Eqs. (5)–(6), to produce the final sentence representations that are scored against
the candidate relations.
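A fusion gate of this kind can take several forms; one common form, given purely as an illustrative sketch, interpolates the PCNN and self-attention features with a learned sigmoid gate. The parameters `W` and `b` are hypothetical, not the paper's notation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fusion_gate(pcnn_feat, attn_feat, W, b):
    """Gated interpolation of PCNN and self-attention feature vectors.

    The gate g decides, per dimension, how much of each representation to
    keep. W (shape (d, 2d)) and b (shape (d,)) are learnable parameters.
    """
    g = sigmoid(W @ np.concatenate([pcnn_feat, attn_feat]) + b)
    return g * pcnn_feat + (1.0 - g) * attn_feat

d = 4
rng = np.random.default_rng(0)
x, y = rng.normal(size=d), rng.normal(size=d)
fused = fusion_gate(x, y, rng.normal(size=(d, 2 * d)), np.zeros(d))
assert fused.shape == (d,)
```

Because the gate value lies in (0, 1), each fused coordinate is a convex combination of the two inputs, so neither representation is discarded outright.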
To capture complete information for bag representations and address the wrong-label issue, similar
to Ye et al. [9], our approach calculates the representation of a bag b with respect to all relations as

$b_k = \sum_{i} \alpha_{ik}\, s_i$  (7)

where $s_i$ is the representation of the $i$-th sentence produced by the encoder and fusion layers,
$k$ denotes the relation index, and $\alpha_{ik}$ denotes the attention weight between the $i$-th
sentence in the bag and the $k$-th relation. Further, $\alpha_{ik}$ is calculated as

$\alpha_{ik} = \frac{\exp(e_{ik})}{\sum_{j} \exp(e_{jk})}$  (8)

where $e_{ik}$ denotes the similarity between the $i$-th sentence in the bag and the $k$-th relation
query. Simply, we employ a dot product of the two vectors to calculate the similarity as

$e_{ik} = r_k \cdot s_i$  (9)

where $r_k$ denotes the $k$-th row of the relation embedding matrix $R$. As a result, we obtain the
relation representation matrix $B$ composed of the bag representations $b_k$, where each row
corresponds to one relation. The output over relations is then computed as

$o = \mathrm{FC}(\mathrm{compress}(B))$  (10)

where $\mathrm{compress}(\cdot)$ denotes a compressing operation and $\mathrm{FC}(\cdot)$ denotes a
fully-connected layer, and we also employ a dropout strategy in the fully-connected layer to prevent
overfitting.
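A minimal numpy sketch of this relation-aware bag attention follows the dot-product scoring and per-relation weighted sums described around Eqs. (7)–(9); shapes and names are illustrative, not the paper's code.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bag_representation(S, R):
    """Relation-aware attention over one bag.

    S: (m, d) sentence representations in the bag.
    R: (h, d) relation query embeddings, one row per relation.
    Returns B: (h, d), one attended bag vector per relation.
    """
    scores = R @ S.T                  # e_{ik}: dot-product similarity, shape (h, m)
    alpha = softmax(scores, axis=1)   # attention over sentences, per relation
    return alpha @ S                  # weighted sums: one bag vector per relation

rng = np.random.default_rng(1)
S = rng.normal(size=(3, 8))           # a bag of 3 sentences
R = rng.normal(size=(5, 8))           # 5 candidate relations
B = bag_representation(S, R)
assert B.shape == (5, 8)
```

For a one-sentence bag (m = 1) the softmax degenerates to weight 1.0, which is exactly why the quality of that single sentence's representation matters so much.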
The model is trained by minimizing the negative log-likelihood over bags,

$J(\theta) = -\sum_{i=1}^{|T|} \log p(l_i \mid b_i; \theta)$  (11)

where $|T|$ indicates the number of bags in the training set, $l_i$ indicates the label of the $i$-th
bag, and $\theta$ is the set of model parameters to be learned. We utilize a mini-batch stochastic
gradient descent (SGD) strategy to minimize the objective function and learn the model parameters.
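A bag-level negative log-likelihood of this form can be sketched as follows; the array shapes and values are illustrative.

```python
import numpy as np

def nll_loss(probs, labels):
    """Mean negative log-likelihood over bags.

    probs: (|T|, h) predicted relation distributions, one row per bag.
    labels: (|T|,) gold relation index for each bag.
    """
    picked = probs[np.arange(len(labels)), labels]   # p(l_i | b_i) per bag
    return -np.mean(np.log(picked + 1e-12))          # epsilon guards log(0)

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])
loss = nll_loss(probs, labels)   # small loss: both bags predicted correctly
```

In practice this quantity would be computed per mini-batch and minimized with SGD, as the text describes.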
4 Experiments
In this section, we first introduce the dataset and evaluation metrics. Then we specify the training
details and list our experimental parameter settings. Further, we compare the performance of our
method with previous baselines and competitive methods. Finally, ablation experiments and a case
study show that our SS-Att framework is effective at extracting context features and addressing the
one-sentence bag problem.
Table 4. AUC values of previous baselines and our model, where PCNN+HATT and PCNN+BAG-ATT are two models
proposed in Han et al. [10] and Ye et al. [9] respectively.
Model AUC
PCNN+HATT 0.42
PCNN+BAG-ATT 0.42
SS-Att 0.45
Top N Precision. Following Ye et al. [9], we compare our proposed SS-Att with the previous works
mentioned above on top-N precision (P@N). P@N denotes the precision of the relation classification
results among the top N highest-probability predictions on the test set. We randomly select one
sentence, two sentences, and all sentences for each test entity pair to generate three test sets. We
report the P@100, P@200, and P@300 values and their means on these three test sets in Table 5.
From the table we can see that: (1) our proposed method achieves higher P@N values than any other
work, which means SS-Att has a more powerful capability to capture semantic features; (2)
comparing the results of SS-Att w/o BIO and SS-Att w/o Self-Att, we can conclude that BIO
embeddings and the self-attention mechanism both contribute to model performance, while the
improvement from self-attention is more significant. One plausible reason is that the self-attention
mechanism is good at extracting context dependency information and semantic features.
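The P@N metric can be computed as follows; this is a small self-contained sketch with made-up prediction scores.

```python
def precision_at_n(scored_predictions, n):
    """P@N: fraction of correct predictions among the n most confident ones.

    scored_predictions: list of (probability, is_correct) pairs for the
    predicted (non-NA) relations on the test set.
    """
    top = sorted(scored_predictions, key=lambda p: p[0], reverse=True)[:n]
    return sum(correct for _, correct in top) / n

preds = [(0.95, True), (0.90, True), (0.80, False), (0.60, True)]
# P@2 looks only at the two most confident predictions, both correct here
assert precision_at_n(preds, 2) == 1.0
assert precision_at_n(preds, 3) == 2 / 3
```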
Table 5. P@N values in entity pairs for different number of test sentences.
Table 6. Results of training and testing on bags containing only one sentence from the NYT dataset,
reporting the AUC value and the accuracy on non-NA bags.
Model AUC
SS-Att w/o Self-Att 0.40
SS-Att w/o BIO 0.43
SS-Att w/o ALL 0.38
SS-Att (ours) 0.45
Fig. 3. Performance comparison for ablation study on PR curves
5 Conclusion
In this paper, we propose the SS-Att framework to cope with the noisy one-sentence bag problem in
distant supervision relation extraction. First, we propose adding BIO embeddings to enrich the input
representation. Second, rich context dependency information is captured by the proposed selective
self-attention, which is combined with the PCNN module via a gate mechanism named Fusion Gate to
enhance the feature representation of sentences and overcome the one-sentence bag problem.
Furthermore, we employ relation-aware attention that calculates weights for all relations dynamically
to alleviate the noise in wrongly-labeled bags. Experimental results on a widely used dataset show
that our model consistently outperforms models using only vanilla selective attention. In the future,
we would like to explore the multi-label problem of relation extraction and conduct further
experiments on multi-label loss functions.
Acknowledgements
This work is supported by National Natural Science Foundation of China (Project No. 61977013) and
Sichuan Science and Technology Program, China (Project No. 2019YJ0164).
References
1. Mintz, M., Bills, S., Snow, R., Jurafsky, D.: Distant supervision for relation extraction without labeled data. In:
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference
on Natural Language Processing of the AFNLP, pp. 1003–1011 (2009)
2. Lin, Y., Shen, S., Liu, Z., Luan, H., Sun, M.: Neural relation extraction with selective attention over instances. In:
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 2124–2133 (2016)
3. Liu, T., Wang, K., Chang, B., Sui, Z.: A soft-label method for noise-tolerant distantly supervised relation extraction. In:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1790–1795
(2017)
4. Zelenko, D., Aone, C., Richardella, A.: Kernel methods for relation extraction. J. Mach. Learn. Res. 3, 1083–1106 (2003)
5. Culotta, A., Sorensen, J.: Dependency tree kernels for relation extraction. In: Proceedings of the 42nd Annual Meeting
of the Association for Computational Linguistics (ACL), pp. 423–429 (2004)
6. Riedel, S., Yao, L., McCallum, A.: Modeling relations and their mentions without labeled text. In: Balcázar, J.L., Bonchi,
F., Gionis, A., Sebag, M. (eds.) ECML PKDD 2010. LNCS (LNAI), vol. 6323, pp. 148–163. Springer, Heidelberg (2010).
https://doi.org/10.1007/978-3-642-15939-8_10
7. Hoffmann, R., Zhang, C., Ling, X., Zettlemoyer, L., Weld, D.S.: Knowledge-based weak supervision for information
extraction of overlapping relations. In: Proceedings of the 49th Annual Meeting of the Association for Computational
Linguistics (ACL): Human Language Technologies, pp. 541–550 (2011)
8. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
9. Ye, Z.X., Ling, Z.H.: Distant supervision relation extraction with intra-bag and inter-bag attentions. In: Proceedings of
the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL):
Human Language Technologies, pp. 2810–2819 (2019)
10. Han, X., Yu, P., Liu, Z., Sun, M., Li, P.: Hierarchical relation extraction with coarse-to-fine grained attention. In:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 2236–2245
(2018)
11. Zeng, D., Liu, K., Chen, Y., Zhao, J.: Distant supervision for relation extraction via piecewise convolutional neural
networks. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP),
pp. 1753–1762 (2015)
12. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. In:
Proceedings of the 1st International Conference on Learning Representations (ICLR), pp. 1–12 (2013)
13. Vaswani, A., et al.: Attention is all you need. In: Proceedings of the 31st Annual Conference on Neural Information
Processing Systems (NIPS), pp. 5998–6008 (2017)
14. Lin, Z., et al.: A structured self-attentive sentence embedding. In: Proceedings of the 5th International Conference on
Learning Representations (ICLR), pp. 1–15 (2017)
15. Li, Y., et al.: Self-attention enhanced selective gate with entity-aware embedding for distantly supervised relation
extraction. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI), pp. 8269–8276 (2020)
16. Yuan, Y., et al.: Cross-relation cross-bag attention for distantly-supervised relation extraction. In: Proceedings of the
33rd AAAI Conference on Artificial Intelligence (AAAI), pp. 419–426 (2019)
17. Surdeanu, M., Tibshirani, J., Nallapati, R., Manning, C.D.: Multi-instance multi-label learning for relation extraction.
In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational
Natural Language Learning, pp. 455–465 (2012)
© Springer Nature Singapore Pte Ltd. 2021
Y. Sun et al. (eds.), Computer Supported Cooperative Work and Social Computing,
Communications in Computer and Information Science 1330
https://doi.org/10.1007/978-981-16-2540-4_2
Abstract
As users' requirements evolve, the number of Web services is growing rapidly, and discovering suitable services accurately and quickly has become a popular topic in service computing research. At present, most Web services published on the Internet are described in natural language, and this trend is becoming increasingly pronounced. Existing service clustering methods are not only limited to specific structured document formats but also rarely incorporate the relationships between services as semantic information. To address these problems, this paper proposes a Service Clustering method based on Knowledge Graph Representation Learning (SCKGRL). The method first crawls service data from ProgrammableWeb.com, processes the Web service description documents with natural language tools, and obtains a set of service function information. Second, it constructs a service knowledge graph from the service-related information; a knowledge representation learning method then converts the triples into vectors and reduces the dimension of the service feature vectors. Finally, the services are clustered with the Louvain algorithm. Experiments show that SCKGRL outperforms other methods such as LDA, VSM, WordNet, and Edit Distance, and thus supports more accurate on-demand service provision.
1 Introduction
Web services are applications or business logic that can be accessed
using standard Internet protocols. As mobile Internet, cloud
computing, and SOA (Service-Oriented Architecture) technologies
develop rapidly, the scale and variety of services are increasing quickly.
At the same time, service description languages are diversifying,
whether WSDL (Web Services Description Language), OWL-S
(Ontology Web Language for Services), or the now very popular RESTful
services described in natural-language text [1]. Accurately locating a
service that meets a user's specific business needs, from a large set of
services whose functional differences are hard to define, and
efficiently discovering services that meet the needs of large numbers
of users, is a challenge in the research field of SOC (Service-Oriented
Computing).
The knowledge graph is an open model of a knowledge domain
for semantic processing proposed by Google in 2012 [2]. A knowledge
graph usually stores data as triples (h, r, t) that represent the
different concepts of the physical world and the interrelationships
between them, where h denotes the head entity, r the relation, and t
the tail entity. Knowledge is thus presented in the form of a graph,
which serves as a knowledge description. Researchers have found that
by combining and analyzing the relation information between entities,
rich representation features can be learned, with good results in
search engines [3], recommendation systems [4], and relationship
mining [5]. Knowledge representation learning has therefore
received widespread attention [4–10]; among these approaches, the
TransX family of translation-based knowledge representation learning
models has become representative [5–10].
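The translation mechanism behind the TransX family can be sketched in a few lines. The following toy example (entity and relation names are illustrative, not taken from the paper's data) shows the core TransE idea: a plausible triple (h, r, t) should satisfy h + r ≈ t in the embedding space, so plausibility is scored as the negative distance between h + r and t.

```python
import math
import random

random.seed(0)
DIM = 8

# Toy vocabularies; in SCKGRL these would be services, tags, functions, etc.
entities = ["weather_api", "forecast", "maps_api"]
relations = ["provides"]

# Randomly initialized embeddings; training would adjust these so that
# observed triples score higher than corrupted (negative-sampled) ones.
emb_e = {e: [random.gauss(0, 1) for _ in range(DIM)] for e in entities}
emb_r = {r: [random.gauss(0, 1) for _ in range(DIM)] for r in relations}

def score(h, r, t):
    """TransE plausibility: negative L2 distance between (h + r) and t."""
    return -math.sqrt(sum((hv + rv - tv) ** 2
                          for hv, rv, tv in zip(emb_e[h], emb_r[r], emb_e[t])))

s = score("weather_api", "provides", "forecast")
```

A full training loop would minimize a margin ranking loss over observed versus corrupted triples; only the scoring function is shown here.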
Service clustering gathers similar services together so that, in the
service discovery or recommendation stage, the search scope is
narrowed and the number of service matches is reduced. Related
research has confirmed that clustering Web services into fixed
functional categories by computing functional similarity can improve
the performance of Web service search engines [11–13]. The work in
[14] extracts key features that reflect service functions from WSDL
documents, quantifies them, and clusters the services into sets of
similar functions. The work in [15] applies the LDA model to extract
the hidden functional topics of a Web service, reduces the
dimensionality of and encodes the topic information, and then
computes inter-service similarity for clustering. The work in [16]
classifies services by grouping those with similar functions into
homogeneous service communities. However, existing work generally
has the following two shortcomings:
(1) The types of service documents used for clustering are relatively
limited, e.g., WSDL and OWL-S documents; most such services
follow the SOAP protocol, and little attention is paid to the
currently popular RESTful services.
(2) In the real world, Web services do not exist independently. They
usually relate to and influence each other in certain ways (e.g.,
through shared tags). The semantic relationships between
services are rarely fully exploited.
2 Related Work
Service clustering can narrow the search scope of services, thereby
improving the efficiency of service discovery. Academia has carried
out extensive work in this area and achieved considerable results. The
work in [17] proposed MR-LDA, a probabilistic topic model that
considers multiple Web service relationships, modeling the
composition relations and shared tags that exist between Web
services; the added inter-service features improve clustering
accuracy. The work in [18] suggests extracting new attributes such as
context from structured WSDL text to aggregate similar services. The
work in [19] proposed clustering all words by semantics with the
Word2vec tool, and then, when learning a service's topic information,
exploiting words that fall into the same cluster as the words in the
service description document. The work in [11] proposed a domain
service clustering model, DSCM, and used it to cluster services by
subject, organizing services with similar functions in specific domains
into subject clusters. The work in [20], noting that sparse service
representations make traditional topic modeling unsatisfactory, uses
BTM to learn the hidden topics of the whole set of service description
texts, infers the topic distribution of each text, and then clusters with
the K-Means algorithm. The work in [21] uses a machine learning
model to acquire a synonym table from the description texts,
generates a domain feature word set to reduce the dimension of the
service feature vector, and clusters services in the same domain with
the S-LDA model.
In summary, most existing service clustering methods are limited to
specific service document types such as WSDL or OWL-S documents,
pay little attention to the RESTful style, and rarely exploit the rich
semantic relationships between services.
Based on the analysis of the related work above, this paper proposes
applying knowledge graph technology to describe the semantic
features of services as fully as possible in the form of triples. The
functional information set of each service is extracted from its
description by natural language processing and used as one type of
entity in the service knowledge graph; combined with other related
features of the service, this yields the service knowledge graph. After
comparing different dimensionality reduction techniques, we choose a
translation model to represent the service triples as vectors, which
greatly reduces the dimension of the service feature vectors; the
modularity-based Louvain algorithm then groups similar services into
clusters.
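The triple-extraction step described above can be illustrated with a minimal sketch. The paper uses natural language tooling to obtain a service's functional information set; here we approximate that with a simple stopword-filtered keyword pass over a hypothetical service description (the service name, tags, relation names, and stopword list are all illustrative assumptions, not the paper's actual pipeline).

```python
import re

# Hypothetical service record, standing in for data crawled from
# ProgrammableWeb.com.
description = ("The OpenWeather API provides current weather data and "
               "forecast information for any location.")
service = "OpenWeather API"
tags = ["weather", "forecast"]

# Crude stand-in for NLP preprocessing: lowercase, tokenize, drop stopwords.
STOPWORDS = {"the", "and", "for", "any", "data", "api", "provides"}
words = [w for w in re.findall(r"[a-z]+", description.lower())
         if w not in STOPWORDS]

# Each surviving keyword becomes a function-information entity; tags become
# tag entities. Both are linked to the service by (head, relation, tail) triples.
triples = [(service, "has_function", w) for w in sorted(set(words))]
triples += [(service, "has_tag", t) for t in tags]
```

A real pipeline would use part-of-speech tagging or dependency parsing rather than keyword filtering, but the output shape — a set of (h, r, t) triples per service — is the same.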
3 SCKGRL Method
To improve service clustering performance, the SCKGRL method first
preprocesses the collected service data set, extracts the functional
information set of each service from its description by natural
language processing as one type of entity in the service knowledge
graph, and combines it with other service attributes to construct the
service knowledge graph. Then, using a representation learning
method, the triples are translated into vectors to reduce the
dimension of the service feature vectors; the pairwise similarity
between services is computed, and the similarities are fed to the
Louvain algorithm to achieve service clustering. The data set for
constructing the service knowledge graph comes from real service
data on ProgrammableWeb. The following sections detail the main
parts of the framework, which is shown in Fig. 1.
Fig. 1. The framework of SCKGRL
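The last two stages of the framework — computing pairwise similarities over the learned service vectors and clustering the resulting graph — can be sketched as follows. The service names and embeddings are toy assumptions, and since the paper's Louvain implementation is not given, this dependency-free stand-in thresholds the cosine similarities and takes connected components, which behaves like community detection on well-separated data.

```python
import math
import random
from itertools import combinations

random.seed(1)
DIM = 8

# Hypothetical service embeddings, standing in for vectors learned by the
# translation model over the service knowledge graph.
services = ["weather_a", "weather_b", "map_a", "map_b"]
vec = {s: [random.gauss(0, 1) for _ in range(DIM)] for s in services}
# Perturb two pairs slightly so the toy data has clear clusters.
vec["weather_b"] = [v + random.gauss(0, 0.1) for v in vec["weather_a"]]
vec["map_b"] = [v + random.gauss(0, 0.1) for v in vec["map_a"]]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Build a similarity graph with an illustrative threshold.
THRESH = 0.8
adj = {s: set() for s in services}
for a, b in combinations(services, 2):
    if cosine(vec[a], vec[b]) >= THRESH:
        adj[a].add(b)
        adj[b].add(a)

def components(adj):
    """Connected components via depth-first search."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        stack, comp = [s], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps

clusters = components(adj)
```

In the full method, Louvain additionally optimizes modularity over the weighted similarity graph rather than relying on a hard threshold, which makes it robust when clusters are not cleanly separated.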